Oracle9i Data Guard Broker Release 2 (9.2) Part Number A96629-01
This chapter describes site objects and how the broker manages them during switchover and failover operations.
A site object is the middle level of the hierarchy of objects managed by the broker. A site object corresponds to a primary or standby site in a Data Guard configuration. Through site objects, you can centrally control the states and behavior of the primary and standby databases in the configuration, such as starting up and mounting the databases, starting and stopping log transport services and log apply services, performing a switchover or failover operation, dismounting and shutting down databases, and so on.
A site object may be enabled or disabled. When disabled, a site object is no longer managed and monitored by the broker. When enabled, a site object can be in an offline or an online state.
The state of a site object is dependent upon the state of the configuration containing the site, and the state of the database object is dependent upon that of the site. Thus, if a site is in an offline state, the database that is dependent on the site must also be in an offline state. Similarly, if the configuration is offline, all of the sites and resources in the configuration are also offline because all are logically dependent on the configuration object.
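The dependency rule above can be illustrated with a small model. This is a minimal sketch, not broker code: the `BrokerObject` class, its fields, and the object names are hypothetical and exist only to show that an object is effectively online only when it and every object it depends on are online.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrokerObject:
    """Hypothetical stand-in for a configuration, site, or database object."""
    name: str
    online: bool = True                       # the object's own requested state
    parent: Optional["BrokerObject"] = None   # the object it logically depends on

    def effective_state(self) -> str:
        # Walk up the hierarchy: one offline ancestor forces "offline".
        node = self
        while node is not None:
            if not node.online:
                return "offline"
            node = node.parent
        return "online"


# Hierarchy assumed for illustration: configuration -> site -> database.
config = BrokerObject("config")
site = BrokerObject("site_a", parent=config)
database = BrokerObject("db_a", parent=site)

site.online = False
print(database.effective_state())  # offline, because its site is offline
```

Taking the configuration object offline in this model has the same effect on every site and database beneath it.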
When the sites in a broker configuration are enabled and in an online state, the broker manages them in their mutually exclusive roles, primary or standby.
Thus, if a site is in a primary role, the database that is dependent on the site must also be in a primary role. With the broker, you can change these roles dynamically as a planned transition called a switchover operation, or you can change these roles as a result of a database failure through either a graceful failover or a forced failover operation. These are known as role transitions. The broker manages the steps involved in switchover and failover operations automatically for you by coordinating the role transitions for all of the affected sites and their dependent databases.
In configurations that include multiple standby sites, the standby sites that are not involved in the role transition are referred to as bystanders.
When the primary site fails, such as when a system or software failure occurs, you may need to transition one of its corresponding standby sites to take over the primary role by performing a failover operation. Even in the absence of a disaster, you may have reason to perform a switchover operation to direct one of the standby sites to assume the role of being the primary site, while the former primary site assumes the role of being a standby site.
Without the broker, failover and switchover operations are manual processes that can be automated only by using script-based solutions. For example, if a physical standby site is in read-only mode (log apply services are offline) when a failure occurs on the primary site, you must change the standby database to managed recovery mode, apply archived redo logs that have not yet been applied to the standby database, and fail over the standby database to the primary role.
The broker simplifies the switchover or failover operations by allowing you to invoke them through a single command and then coordinating role transitions on all sites in the configuration.
You can switch a site role from primary to standby, as well as from standby to primary, without resetting the online redo logs of the associated new primary database. This is known as a database switchover operation, because the standby database on the site that you specify becomes the primary database, and the original primary database becomes a standby database. There is no loss of application data, the data does not diverge between the original and the new primary database after the switchover operation completes, and there is no need to restart the bystander databases.
Whenever possible, you should always perform a switchover operation to a physical standby site:
Consider the following points before you begin a switchover operation:
[...] SYNC or ASYNC, then you need to set up standby redo logs on the primary site. If you pre-set database properties for the standby database role, note that these properties are not verified by the broker until you actually switch over the primary database to the standby role.
[...] (SYNC, ASYNC, or ARCH) of bystanders does not change after a switchover operation. Log apply services for all bystanders automatically begin applying archived redo logs from the new primary database.
The act of switching roles should be a well-planned activity. The primary and standby databases involved in the site switchover operation should have as small a transactional lag as possible. Oracle Corporation highly recommends that you consider performing a full, consistent backup of the primary database prior to starting the switchover operation. (Oracle9i Data Guard Concepts and Administration provides detailed information about setting up the sites and databases in preparation for a switchover operation.)
To start a switchover operation using Data Guard Manager, select the Data Guard broker configuration and select Switchover from the right-click menu to invoke the Switchover wizard. When using the CLI, you need to issue only one SWITCHOVER command to specify the name of the standby site that you want to change into the primary role.
The broker controls the rest of the switchover operation, as described in Section 3.2.1.3.
Once you start the switchover operation, the broker:
The broker allows the switchover operation to proceed as long as there are no errors for the primary site and standby site that you selected to participate in the switchover operation. However, errors occurring for any bystanders will not stop the switchover operation.
The broker first converts the original primary database to run in the standby role. Then, the broker transitions the target standby database to the primary role. If any errors occur during either conversion, the broker stops the switchover operation. See Section 3.2.1.4 for more information.
Because the configuration file describes all site and resource objects in the configuration, this ensures that each object will run in the correct role.
The broker verifies the state and status of the database resources on each site to ensure that the switchover operation has transitioned the sites to their new roles correctly. Bystanders will continue operations in the state they were in before the switchover operation. For example, if a bystander physical standby database was in read-only mode, it will remain in that mode after switchover completes. Log apply services for all bystanders automatically begin applying archived redo logs from the new primary database.
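The ordering and error rules described for a switchover can be summarized in a short model. This is only a sketch under stated assumptions: the `switchover` function, the role map, the `failed` set, and the site names are all hypothetical and do not correspond to any broker interface.

```python
def switchover(roles, target, failed=frozenset()):
    """Return a new role map after moving the primary role to `target`.

    roles  : dict mapping site name -> "primary" or "standby"
    failed : names of sites reporting errors (illustrative input only)
    """
    primary = next(name for name, role in roles.items() if role == "primary")
    if roles.get(target) != "standby":
        raise ValueError(f"{target!r} is not a standby site")

    # Only errors on the two participating sites stop the operation;
    # errors on bystanders do not.
    blocking = {primary, target} & set(failed)
    if blocking:
        raise RuntimeError(f"switchover stopped, errors on: {sorted(blocking)}")

    new_roles = dict(roles)
    new_roles[primary] = "standby"   # first: original primary becomes a standby
    new_roles[target] = "primary"    # then: target standby becomes the primary
    return new_roles                 # bystanders keep the standby role


roles = {"boston": "primary", "chicago": "standby", "denver": "standby"}
print(switchover(roles, "chicago", failed={"denver"}))
```

In this example the error on the bystander `denver` does not prevent the role change, whereas an error on `boston` or `chicago` would raise an exception and leave the original role map untouched.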
If the switchover operation fails due to problems with the configuration, the broker reports any problems it encounters. In general, you can choose another site for the switchover operation or fix the problem and then retry the switchover operation. The following subsections describe how to recover from the most common problems.
If the error messages returned indicate a problem when transitioning the original primary site and database to the standby role (including stopping log transport services and starting log apply services), use these general guidelines to fix the problem:
If the error messages that have been returned indicate that a problem occurred when transitioning the original standby database to the primary role (including stopping log apply services and starting log transport services), use these general guidelines to fix the problem:
[...] SHUTDOWN IMMEDIATE command on the original primary database instance to restart it.
Database failover transitions one of the standby sites to the role of primary site. You should perform a failover operation only when a catastrophic failure occurs on the primary site, and there is no possibility of recovering the primary site and database in a timely manner. The failed primary site is discarded and the target standby site and database assume the primary role.
The broker supports two grades of failover operations:
This is the recommended failover option. Graceful failover automatically recovers some or all of the original primary database application data and attempts to bring along any bystander sites and databases to continue serving as standby databases to the new primary database:
Do not perform a forced failover to a standby site except in an emergency. Forced failover may result in lost application data even when standby redo logs are configured on the (physical) standby database. A consequence of a forced failover operation is that you must re-create the original primary database and all bystanders before they can serve as standby sites to the new primary site. Another consequence is that there may be lost application data unless the standby and primary databases had been configured to run in maximum protection mode prior to the failover, and all logs have been successfully applied to the standby database.
Depending on the log transport services destination attributes, a graceful failover may provide no data loss or minimal data loss. A forced failover may result in data loss. Always try to perform a graceful failover operation; only when a graceful failover is unsuccessful should you perform a forced failover operation.
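The recommended order of attempts (graceful first, forced only as a fallback) can be expressed as a small decision routine. This is a sketch only: `fail_over` and the `attempt` callback are hypothetical names, and the callback stands in for whatever actually performs the operation and reports success.

```python
def fail_over(target, attempt):
    """Try a graceful failover first; fall back to forced only if it fails.

    attempt : callable(target, grade) -> bool, a hypothetical hook that
              performs the operation and reports whether it succeeded.
    Returns the grade that succeeded.
    """
    for grade in ("graceful", "forced"):
        if attempt(target, grade):
            return grade
    raise RuntimeError(f"failover to {target!r} did not complete")


# Example: a stub in which only the forced attempt succeeds.
grade_used = fail_over("chicago", lambda site, grade: grade == "forced")
print(grade_used)  # forced
```

Because the loop always tries `"graceful"` before `"forced"`, the forced path is reached only after a graceful attempt has been made and has failed.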
To start a failover operation using Data Guard Manager, select the Data Guard configuration in the navigator tree and then select Failover from the right-click menu to invoke the Failover wizard. The Failover wizard guides you through the steps necessary to transition one of the standby sites into the primary role. When using the CLI, you issue one FAILOVER command that specifies the name of the standby site that you want to change into the primary role, and the keyword GRACEFUL or FORCED to specify the type of failover operation.
The standby site that is the target of the failover operation should be a physical standby site in an enabled state. You can fail over to logical standby sites only if there are no enabled physical standby sites in the configuration.
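The target-selection rule can be sketched as a filter over the standby sites. The function name, the dictionary keys, and the site names below are hypothetical; the sketch also assumes that a logical standby site must itself be enabled to be a candidate, which the text does not state explicitly.

```python
def eligible_failover_targets(sites):
    """Return the names of standby sites that may be chosen as failover target.

    sites : list of dicts with the illustrative keys
            'name', 'type' ('physical' or 'logical'), and 'enabled' (bool)
    """
    enabled = [s for s in sites if s["enabled"]]
    physical = [s["name"] for s in enabled if s["type"] == "physical"]
    if physical:
        # Enabled physical standby sites take precedence.
        return physical
    # Logical standby sites qualify only when no enabled physical site exists.
    return [s["name"] for s in enabled if s["type"] == "logical"]


standbys = [
    {"name": "chicago", "type": "physical", "enabled": False},
    {"name": "denver", "type": "logical", "enabled": True},
]
print(eligible_failover_targets(standbys))  # ['denver']
```

Here the only physical standby site is disabled, so the logical standby site becomes eligible.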
After the failover operation, the overall protection mode of the new configuration (maximum protection, maximum availability, or maximum performance) is reset to the maximum performance mode, which is the default.
The broker controls the failover operation steps described in Section 3.2.2.2. However, you must perform the additional steps described in Section 3.2.2.4 after the failover operation completes.
Once you start the failover operation, the broker:
If a bystander was in an online state, then the bystander will be restarted in the state it was in before the failover operation. If a bystander was in the offline state, then it will be taken to its default online state during the failover operation. For example, if a physical standby database was operating in read-only mode, it will remain in read-only mode.
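The rule for bystander states after a graceful failover reduces to a two-way choice, sketched below. The function name and the state strings are hypothetical labels chosen for illustration, not broker state names.

```python
def bystander_state_after_graceful_failover(state_before, default_online_state):
    """Return the state a bystander ends up in after a graceful failover.

    state_before         : the bystander's state before the failover
    default_online_state : the state used when the bystander was offline
    """
    if state_before == "offline":
        # An offline bystander is taken to its default online state.
        return default_online_state
    # An online bystander is restarted in the state it was already in.
    return state_before


print(bystander_state_after_graceful_failover("read-only", "apply-on"))  # read-only
print(bystander_state_after_graceful_failover("offline", "apply-on"))    # apply-on
```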
The broker allows the failover operation to proceed as long as there are no errors for the standby site that you selected to participate in the failover operation. However, errors occurring for any bystanders will not stop the failover operation. If you initiated a graceful failover operation and it fails, you might need to restart it as a forced failover operation.
Once you start the failover operation, the broker:
Because a forced failover operation starts a new log stream from the new primary site, all bystanders are permanently disabled from the broker configuration. These standby sites are left in an online state, but they are no longer manageable by the broker.
The broker allows the failover operation to proceed as long as there are no errors for the standby site that you selected to participate in the failover operation.
You must perform recovery steps after the failover operation completes:
For instance, this could happen if a bystander finds that it has applied more logs than the new primary itself has applied, hence diverging from the new primary. The bystander must be reinstantiated before it may serve as a standby for the new primary database.
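The divergence condition in this example can be written as a simple comparison. This sketch assumes that "how much has been applied" is tracked as the sequence number of the last applied log; the function and parameter names are hypothetical.

```python
def needs_reinstantiation(bystander_last_applied, new_primary_last_applied):
    """Return True if the bystander has diverged from the new primary.

    Both arguments are the sequence number of the last log applied
    (an assumed representation, used here only for illustration).
    """
    # A bystander that applied more logs than the new primary holds changes
    # the new primary never saw, so it cannot simply continue as a standby.
    return bystander_last_applied > new_primary_last_applied


print(needs_reinstantiation(105, 102))  # True
print(needs_reinstantiation(100, 102))  # False
```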
A permanently disabled site is recovered for broker operation by:
Although it is possible for a failover operation to stop, it is very unlikely. If an error occurs, it is likely to happen when the standby site is transitioning to the primary role. If the error messages that have been returned indicate that this is when the problem occurred, use these general guidelines to fix the problem:
Copyright © 2000, 2002 Oracle Corporation. All Rights Reserved.