Oracle® Database High Availability Overview 11g Release 1 (11.1) Part Number B28281-01 |
|
|
View PDF |
Oracle Database offers an integrated suite of high availability solutions that increase availability and eliminate or minimize both planned and unplanned downtime. These solutions help enterprises maintain business continuity 24 hours a day, seven days a week. However, the Oracle high availability solutions go beyond reducing downtime by providing solutions to increase system utilization on the primary and secondary systems and helping to improve overall performance, scalability, and manageability.
The chapter contains the following sections that outline the key impacts of the Oracle high availability features on businesses and applications:
Oracle High Availability Features and Solutions for Unplanned Downtime
Oracle High Availability Features and Solutions for Planned Downtime
Oracle provides the following features for high availability:
Also, the Oracle HA Solutions and Recovery Times for Unplanned Downtime section provides a summary of the key HA solutions that address different types of unplanned downtime along with the recovery time that can be attained with each solution.
See Also:
The "High Availability" chapter in Oracle Database Concepts. for an overview of the high availability features
The "Availability" section in the Oracle Database New Features Guide.for a list of all the new high availability features introduced in Oracle Database 11g Release 1 (11.1)
Oracle provides fast and predictable recovery from system faults and database failures. The Fast-Start Fault Recovery technology included in Oracle Database automatically bounds database recovery time upon startup by using its self-tuned checkpoint processing. This makes recovery time fast and predictable, and improves the ability to meet service level objectives. Oracle's Fast-Start Fault Recovery can reduce recovery time on a heavily-laden database from tens of minutes to a few seconds.
Fast-Start Fault Recovery features include:
Predictable, bounded recovery from instance, database, and computer failures
Database checkpointing is self-tuning to maintain desired recovery time objective
Oracle Real Application Clusters (Oracle RAC) and Oracle Clusterware allow the Oracle Database to run any packaged or custom application across a set of clustered servers. This capability provides the highest levels of availability and the most flexible scalability. If a clustered server fails, Oracle Database continues running on the surviving servers. When more processing power is needed, you can add another server without interrupting access to data.
Oracle RAC enables multiple instances that are linked by an interconnect to share access to an Oracle database. In an Oracle RAC environment, the Oracle Database runs on two or more systems in a cluster while concurrently accessing a single shared database. The result is a single database system that spans multiple hardware systems. This enables Oracle RAC to provide high availability and redundancy during failures in the cluster. Oracle RAC accommodates all system types, from read-only data warehouse (DSS) systems to update-intensive online transaction processing (OLTP) systems.
Oracle Clusterware is software that, when installed on servers running the same operating system, enables the servers to be bound together to operate as if they are one server, and manages the availability of user applications and Oracle databases. Oracle Clusterware also provides all of the features required for cluster management, including node membership, group services, global resource management, and high availability functions:
For high availability, you can place Oracle databases (single-instance or Oracle RAC databases), and user applications (Oracle and non Oracle) under the management and protection of Oracle Clusterware so that they restart on process failure or fail over to another node on node failure.
For cluster management, Oracle Clusterware presents multiple independent servers as if they are a single-system image or one virtual server. This single virtual server is preserved across the cluster for all management operations, enabling administrators to perform installations, configurations, backups, upgrades, and monitoring functions once. Then, Oracle Clusterware automatically distributes the execution of these management functions to the appropriate nodes in the cluster.
Oracle Clusterware is a requirement for using Oracle RAC and it is the only clusterware that you need for most platforms on which Oracle RAC operates. Although the Oracle Database continues to support select third-party clusterware products on specified platforms, using Oracle Clusterware provides the benefit of dispensing with proprietary vendor clusterware and using an integrated software stack from Oracle that provides disk management with Oracle ASM to data management with the Oracle Database and Oracle RAC. In addition, Oracle Database features, such as Oracle Services, use the underlying Oracle Clusterware mechanisms to provide their capabilities.
Oracle Clusterware requires two clusterware components: a voting disk to record node membership information and the Oracle Cluster Registry (OCR) to record cluster configuration information. The voting disk and the OCR must reside on shared storage. The Oracle Clusterware requires that each node be connected to a private network by way of a private interconnect.
Together, Oracle RAC and Oracle Clusterware provide the following benefits:
Ability to tolerate and quickly recover from computer and instance failures
Ability to apply Oracle Clusterware upgrades, patch sets, and interim patches in a rolling fashion
For example, upgrading Oracle Clusterware from Oracle 10g to Oracle 11g, patching Oracle Clusterware from Oracle 10.2.0.3 to 10.2.0.4, and patching Oracle Clusterware from Oracle 10.2.0.2 Bundle 1 to Oracle 10.2.0.2 Bundle 2
Rolling upgrades for system and hardware changes
Rolling patch upgrades for some interim patches
Service relocation and, when you perform additional Fast Application Notification (FAN) and client configuration, applications can receive FAN events and react immediately to achieve fast, automatic, and intelligent connection and failover
Load balancing advisory
Runtime connection load balancing with Oracle JDBC, OCI, and ODP.NET
Flexibility to increase processing capacity using commodity hardware without downtime or changes to the application
Comprehensive manageability integrating database and cluster features
Scalability across database instances
Specifically, Oracle Clusterware provides the following benefits
Automatically restarts failed Oracle processes
Automatically manages and fails over Oracle Virtual IP (VIP) on another node in the cluster on node failures
Automatically restarts resources from failed nodes on surviving nodes
For Oracle RAC databases, all Oracle processes are under the control of Oracle Clusterware by default; for Oracle single-instance databases, you can configure the Oracle processes into a resource group that is under the control of Oracle Clusterware
For Oracle and non Oracle applications, Oracle Clusterware provides an application programming interface (API) that enables you to control other Oracle processes with Oracle Clusterware, such as restart or react to failures and certain rules
Manages node membership and prevents split brain syndrome in which two or more instances attempt to control the database
Provides rolling release upgrades of Oracle Clusterware, with no downtime for applications
Oracle Data Guard ensures high availability, data protection, and disaster recovery for enterprise data. Data Guard provides a comprehensive set of services that create, maintain, manage, and monitor one or more standby databases to enable Oracle databases to survive disasters and data corruptions. Data Guard maintains standby databases as transactionally consistent copies of the primary (production) database. Then, if the primary database becomes unavailable because of a planned or an unplanned outage, Data Guard can switch any standby database to the primary role, minimizing the downtime associated with the outage. Data Guard can be used with traditional backup, restoration, and cluster techniques to provide a high level of data protection and data availability.With Data Guard, administrators can optionally improve primary database performance by offloading resource-intensive backup and reporting operations to standby systems.
An Oracle Data Guard configuration consists of one primary database and one or more standby databases. Using a backup copy of the primary database, you can create up to nine standby databases and incorporate them in a Data Guard configuration. Once created, Data Guard automatically maintains each standby database by transmitting redo data from the primary database and then applying the redo to the standby database.
Similar to a primary database, a standby database can be either a single-instance Oracle database or an Oracle RAC database.
A standby database can be a physical standby database, a snapshot standby database, or a logical standby database, and a Data Guard configuration can include any combination of these types of standby databases:
Provides a physically identical copy of the primary database, with datafiles that are identical to the primary database. The database schema, including indexes, are the same. A physical standby database is kept synchronized with the primary database, though Redo Apply, which recovers the redo data received from the primary database and applies the redo to the physical standby database.
A physical standby database can be used for business purposes other than disaster recovery. Starting in Oracle Database 11g Release 1, the physical standby database can be opened with read-only access while redo is being applied to the standby database. This mode, referred to as real-time query standby, allows users to access an up-to-date physical standby database for queries at any time. Also, you can convert a physical standby database into a snapshot standby database.
A snapshot standby database is an updatable standby database that you create from a physical standby database. A snapshot standby database receives and archives redo data received from the primary database, but the snapshot standby database does not apply redo data from the primary database while the standby is open read/write. Because of this, the snapshot standby typically diverges from the primary database over time. Moreover, local updates to the snapshot standby database cause additional divergence.
Redo data is not applied until you convert the snapshot standby database back into a physical standby database, and after all local updates to the snapshot standby database are discarded. With a single command, you can revert a snapshot standby back to a physical standby database, at which time the changes made to the snapshot standby state are discarded, and Redo Apply automatically resynchronizes the physical standby database with the primary database using the redo that was archived.
Contains the same logical information as the primary database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database though SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executing the SQL statements on the standby database.
A key benefit of a logical standby database is that you can create significant auxiliary structures to optimize the reporting workload, including structures that could have a prohibitive effect on the primary database's transactional response time. A logical standby database can have its data physically reorganized into a different storage type with different partitioning, having many different indexes, having on-demand refresh materialized views created and maintained, and it can be used to drive the creation of data cubes and other OLAP data views.
A logical standby database can be used for other business purposes in addition to disaster recovery requirements. This allows users to access a logical standby database for queries and reporting purposes at any time. Also, using a logical standby database, you can upgrade Oracle Database software and patch sets with almost no downtime. Thus, a logical standby database can be used concurrently for data protection, reporting, and database upgrades.
Oracle Data Guard provides the following benefits:
Maintenance of real-time, transactionally consistent database copies to provide protection against unplanned downtime and disaster.
Data protection and fast repair against computer failures, human errors, data corruption, lost writes, and site failures.
Automatic failover with flexible data protection levels to support all network configurations and business requirements.
Faster redo application, redo transport, and role transitions with various enhancements.
Reduction of planned downtime for system changes, some platform migrations, hardware and system upgrades, and Oracle patch set and database upgrades (see also Table 2-1, "Outage Types and Oracle High Availability Solutions for Unplanned Downtime").
Multiple levels of data protection and performance to balance data availability against system performance requirements.
Support for the real-time query capability of physical standby databases provides better primary database performance and better use of standby resources, because real-time query:
Diverts read-only queries and transactions from the primary database to standby databases.
Diverts backup operations from the primary database to standby databases.
Support for both physical standby databases (real-time query) and logical standby databases to provide more efficient use of system resources by diverting more querying and reporting functions from the primary database to standby databases (with the logical standby databases providing greater flexibility for any activity that requires access to a standby database that is open read/write).
Support for snapshot standby database for reporting or testing (cloning) purposes and automatic resynchronization with the primary database once reporting or testing has completed.
Managed and automatic role transition and application notification to minimize planned and unplanned downtime.
Automatic or automated resynchronization of a failed primary database following a failover.
Management of all systems as a single configuration for simplified administration.
Increased flexibility for Data Guard configurations where the primary and standby systems may have different CPU architectures, operating systems (for example, Windows and Linux), operating system binaries (32-bit and 64-bit), and Oracle database binaries (32-bit and 64-bit); this is subject to restrictions that are defined in MetaLink note 413484.1.
See Also:
Metalink note 413484.1 athttps://metalink.oracle.com/
for the latest capabilities and restrictionsOracle Streams is a very flexible and powerful database feature to implement fine-grained replication, multimaster replication, many-to-one replication, data transformation, hub and spoke replication, and message queuing.
Comparing Streams and Data Guard
Streams is designed from the ground up for information sharing. It enables highly customized replication strategies to satisfy the many varied uses of data replicated to a target database. These same capabilities can also make Streams a useful technology for addressing HA/DR requirements and for minimizing planned downtime during upgrades to new database releases and patch sets.Data Guard is designed for simple, one-way replication of an entire database expressly for maintaining a synchronized copy that can assume the primary role in the event of a failure. Data Guard Redo Apply (physical standby) best exemplifies this notion of simplicity, as a disaster recovery solution that is both datatype and application agnostic, and able to scale to very high levels of performance. While Data Guard also provides capabilities that enable a standby database to offload the primary database of the overhead of performing backups, queries, and reports, these capabilities are ancillary to Data Guard's primary mission, and are provided to increase the ROI of customer HA/DR investments. You can use Data Guard SQL Apply (logical standby) to get additional value from a Data Guard configuration by minimizing planned downtime during upgrades to new database releases and patch sets.
Streams Messaging and Information Flow
Oracle Streams enables information sharing. Using Streams, each unit of shared information is called a message, and you can share these messages in a stream. The stream can propagate information within a database or from one database to another.
For example, Figure 2-1 shows a multimaster configuration where all sites are directly connected to all other sites participating in the replication environment. This enables data to be replicated between all locations in a near realtime manner.
Figure 2-1 Oracle Streams Multimaster Configuration
Another example is the Oracle Streams 1-N, or hub-and-spoke configuration in which changes made at the primary or hub location can be propagated to the remote or spoke locations in a near real-time manner.
Although it is possible to configure a hub-and-spoke configuration for bidirectional replication, you may prefer to restrict updates to a single location, the hub, as shown in Figure 2-2. In query intensive environments, you can still balance the load between multiple locations, with fast local access, while updates are restricted to the hub. By offloading reporting to the spoke locations, you improve performance at the hub, or primary OLTP location. This type of configuration is easier to implement than multimaster replication because it is not necessary to establish connectivity between all locations in the replication environment and it is not necessary to implement a conflict resolution strategy.
Figure 2-2 Information Dissemination with Oracle Streams (1-N configuration)
The stream routes specific information to specific destinations. The result is a feature that provides greater functionality and flexibility than traditional solutions for capturing and managing messages, and sharing the messages with other databases and applications. Streams provides the capabilities needed to build and operate distributed enterprises and applications, data warehouses, and high availability solutions. You can use all of the capabilities of Oracle Streams at the same time. If your business requirements change, then you can implement a new capability of Streams without sacrificing existing capabilities.
As with any Oracle Streams configuration, there are three phases: capture, stage (propagate), and consume (apply). Using Oracle Streams, you control what information is put into a stream, how the stream flows or is routed from database to database, what happens to messages in the stream as they flow into each database, and how the stream terminates. By configuring specific capabilities of Streams, you can address specific requirements. Based on your specifications, Streams can capture, stage, and manage messages in the database automatically, including, but not limited to, data manipulation language (DML) changes and data definition language (DDL) changes. You can also put user-defined messages into a stream, and Streams can propagate the information to other databases or applications automatically. When messages reach a destination, Streams can consume them based on your specifications.
The following figure shows the Oracle Streams information flow.
Figure 2-3 Oracle Streams Information Flow
With Streams, you can create a local or remote copy of a production database. In the event of human error or a catastrophe, the copy can be used to resume processing. You can use Streams to configure flexible high availability environments.You can use the features of Oracle Streams to achieve little or no database down time during database upgrade and maintenance operations. Maintenance operations include migrating a database to a different platform, migrating a database to a different character set, modifying database schema objects to support upgrades to user-created applications, and applying an Oracle software patch.
Figure 2-4 shows an application that explicitly enqueues and dequeues messages through Oracle Streams Advanced Queuing as a method of sharing information with business partners or customers with different messaging systems. Once enqueued, messages can be transformed and propagated as desired, before being dequeued to a business partner's application that is a non-database oriented messaging systems.
Figure 2-4 Oracle Streams Message Queuing
Benefits of Using Oracle Streams
Oracle Streams provides the following benefits:
Data protection by maintaining a full or partial remote copy of the database
Achieves little or no downtime during database upgrade or maintenance operations such as migrating a database to a different platform or character set, modifying database objects to support upgrades to applications, and applying an Oracle software patch
Data replication by capturing DML and DDL changes made to database objects and replicating these changes to one or more other databases; bidirectional replication environment, in which exactly two databases share the replicated database objects and data, is possible.
Event management and notification by enqueuing messages or capturing events, propagating the messages and events through queues, and dequeuing and applying or acting upon the message or event (as shown in Figure 2-4)
Supports heterogeneous platforms across databases within the configuration
Allows character sets to differ between replicas
Permits fine-grained control of data sharing
Note:
Although Oracle Streams requires some initial implementation investment, it is well worth the effort because of the high flexibility you get with Streams.Transportable technologies provides transportable database and transportable tablespace:
Transportable database is used to transport an entire database (user data and the Oracle dictionary) to a new platform with the same endian format. Transportable database permits a minimal downtime migration to a new platform by avoiding the time-consuming method of unloading all user data from the source database and loading it into the target database.
Transportable tablespaces is a feature designed to move a subset of one database into another, even among platforms that differ in endian format. The cross-platform capability of transportable tablespaces can be used to migrate all user data within a database to a new platform with a different endian format. Leveraging transportable tablespaces in this manner permits a minimal downtime migration to a new platform by avoiding the time-consuming method of unloading all user data from the source database and loading it into the target database.
See Also:
Oracle Database Administrator's Guide for details about how to move or copy tablespaces to another database, including details about transporting tablespaces across platformsFlashback technology provides a set of features to switch between views of the data as it existed at different points in time. Using flashback features you can query past versions of schema objects and historical data. You can also perform change analysis and self-service repair to recover from logical corruption while the database is online.
Flashback technology provides a SQL interface to quickly analyze and repair human errors. Flashback provides fine-grained analysis and repair for localized damage such as deleting the wrong customer order. Flashback technology also enables correction of more widespread damage, yet does it quickly to avoid long downtime. Flashback technology is unique to Oracle Database and supports recovery at all levels including row, transaction, table, tablespace, and database.
Most of the flashback features use undo data while other features, such as Flashback Database and Block Media Recovery, use flashback logs:
Undo tablespace—A dedicated tablespace that stores only undo information when the database is run in automatic undo management mode.
Flashback data archive—An archive that is stored in a tablespace and contains transactional changes to every record in a table for the duration of the record's lifetime. The archived data can be retained for much longer duration than the retention period offered by an undo tablespace.
Flashback logs—Oracle-generated logs used to perform flashback operations. The database can only write flashback logs to the flash recovery area. Flashback logs are written sequentially and are not archived. They cannot be backed up to disk.
The following list describes the Flashback features:
Oracle Flashback Query provides the ability to view the data as it existed in the past by using the Automatic Undo Management system to obtain metadata and historical data for transactions. Undo data is persistent and survives a database malfunction or shutdown. The unique features of Flashback Query not only provide the ability to query previous versions of tables, they also provide a powerful mechanism to recover from erroneous operations.
Uses of Flashback Query include:
Recovering lost data or undoing incorrect, committed changes. For example, rows that have been deleted or updated can be immediately repaired even after they have been committed.
Comparing current data with the corresponding data at some time in the past. For example, using a daily report that shows the changes in data from yesterday, it is possible to compare individual rows of table data, or find intersections or unions of sets of rows.
Checking the state of transactional data at a particular time, such as verifying the account balance on a certain day.
Simplifying application design by removing the need to store certain types of temporal data. Using a Flashback Query, it is possible to retrieve past data directly from the database.
Applying packaged applications, such as report generation tools, to past data.
Providing self-service error correction for an application, enabling users to undo and correct their errors.
Oracle Flashback Versions Query
Oracle Flashback Versions Query is an extension to SQL that you can use to retrieve the versions of rows in a given table that existed in a specific time interval. Oracle Flashback Versions Query returns a row for each version of the row that existed in the specified time interval. For any given table, a new row version is created each time the COMMIT
statement is executed.
Flashback Versions Query is a powerful tool for the DBA to run analysis to determine the sources of problems. Additionally, application developers can use Flashback Versions Query to build customized applications for auditing purposes.
Oracle Flashback Transaction
Oracle Flashback Transaction is a new feature in Oracle Database 11g Release 1 that can easily back out a transaction and its dependent transactions. The DBMS_FLASHBACK.TRANSACTION_BACKOUT()
procedure rolls back a transaction and its dependent transactions while the database remains online. This recovery operation uses undo data to create and execute the compensating transactions that return the affected data to its original state. You can query the DBA_FLASHBACK_TRANSACTION_STATE
view to see the current state of a transaction with respect to whether the transaction has been backed out using dependency rules or forced out by either:
Backing out nonconflicting rows
Applying undo SQL
Oracle Flashback Transaction increases availability during logical recovery by easily and quickly backing out a specific transaction or set of transactions and their dependent transactions, with one command while the database remains online.
Oracle Flashback Transaction Query
Oracle Flashback Transaction Query provides a mechanism to view all changes made to the database at the transaction level. When used in conjunction with Flashback Versions Query, it offers a fast and efficient means to recover from a human or application error. Flashback Transaction Query increases the ability to perform online diagnosis of problems in the database by returning the database user that changed the row, and performs analysis and audits on transactions.
Oracle Flashback Table
Oracle Flashback Table enables you to recover a table to a previous point in time. It provides a fast, online solution for recovering a table or set of tables that has been modified by a human or application error. In most cases, Flashback Table alleviates the need for administrators to perform more complicated point-in-time recovery operations. Even after a flashback, the data in the original table is not lost; it can later be reverted back to the original state.
Oracle Flashback Drop
Dropping objects by accident is a problem for database users and DBAs alike. While there is no easy way to recover dropped tables, indexes, constraints, or triggers, Oracle Flashback Drop provides a safety net when dropping objects. When you drop a table, Oracle automatically places it into the Recycle Bin. The Recycle Bin is a virtual container where all dropped objects reside. You can continue to query data in a dropped table.
Oracle Flashback Restore Points
When an Oracle Flashback recovery operation is performed on the database, the DBA must determine the point in time—identified by the System Change Number (SCN) or timestamp—to which the data can later be flashed back. Oracle Flashback restore points are labels you can define that can be substituted for the SCN or transaction time used in Flashback Database, Flashback Table, and Recovery Manager (RMAN) operations. Furthermore, a database can be flashed back through a previous database recovery and open resetlogs by using guaranteed restore points. Guaranteed restore points allow major database changes—such as database batch jobs, upgrade, or patch—to be quickly undone by ensuring that the undo required to rewind the database is retained.
Using the Oracle Flashback restore points feature provides the following benefits:
Provides the ability to quickly restore to a consistent state, to a point in time that was prior to a planned operation that has gone awry (for example, a failed batch job, an Oracle software upgrade, or an application upgrade.
Allows the snapshot standby to be resynchronized with the production database.
Allows for a quick mechanism to restore a test or cloned database back to its original state.
Oracle Flashback Database
Oracle Flashback Database provides a more efficient alternative to database point-in-time recovery. With Oracle Flashback Database, current datafiles can be reverted to their contents at a past time. The result is much like restoring data from data file backups and executing point-in-time database recovery. However, Flashback Database skips the data file restoration and most of the application of redo data.
Enabling Oracle Flashback Database provides the following benefits:
Eliminates the time to restore a backup when fixing human error that has a database-wide impact.
Because human errors can be quickly undone, it allows standby databases to use real-time apply to synchronize with the primary database.
Allows quick standby database reinstantiation after a database failover.
Block Recovery Using Flashback Logs
Starting with Oracle Database release 11.1, block recovery can optionally retrieve a more recent copy of a data block from the flashback logs to reduce recovery time. Furthermore, a corrupted block encountered during instance recovery does not result in instance recovery failing. The block is automatically marked as corrupt and added to the RMAN corruption list in the V$DATABASE_BLOCK_CORRUPTION
table. You can subsequently issue the RMAN RECOVER BLOCK
command to fix the associated block.
An archive that is stored in a tablespace and contains transactional changes to every record in a table for the duration of the record's lifetime. The archived data can be retained for much longer duration than the retention period offered by an undo tablespace.
Automatic Storage Management (ASM) provides a vertically integrated file system and volume manager directly in the Oracle kernel, resulting in:
Significantly less work to provision database storage
Higher level of availability
Elimination of the expense, installation, and maintenance of specialized storage products
Unique capabilities for database applications
For optimal performance, ASM spreads files across all available storage. To protect against data loss, ASM extends the concept of SAME (stripe and mirror everything) and adds more flexibility in that it can mirror at the database file level rather than the entire disk level.
More importantly, ASM simplifies the processes of setting up mirroring, adding disks, and removing disks. Instead of managing hundreds and possibly thousands of files (as in a large data warehouse), DBAs using ASM create and administer a larger-grained object called a disk group. The disk group identifies the set of disks that are managed as a logical unit. Automation of file naming and placement of the underlying database files save DBAs time and ensures adherence to standard best practices.
The ASM native mirroring mechanism (2-way or 3-way) is an option that protects against storage failures. With ASM mirroring, an additional level of data protection can be provided with the use of failure groups. A failure group is a set of disks sharing a common resource (disk controller or an entire disk array) whose failure can be tolerated. Once defined, an ASM failure group intelligently places redundant copies of the data in separate failure groups. This ensures that the data is available and transparently protected against the failure of any component in the storage subsystem.
ASM provides the following benefits:
Provides the ability to mirror and stripe across drives and storage arrays
Automatically re-mirrors from a failed drive to remaining drives
Automatically rebalances stored data when disks are added or removed while the database remains online
Allows for operational simplicity in managing database storage
Manages OCR and voting disks
Provides local read capability, which gives better performance in an extended cluster
Supports very large databases
Supports ASM rolling upgrades
Supports finer granularity in tuning and security
ASM Fast Mirror Resync, which provides fast repair after a temporary disk failure
Recovery Manager (RMAN) is an Oracle utility to manage the backup and, more importantly, the recovery of the database. It eliminates operational complexity while providing superior performance and availability of the database.
Recovery Manager determines the most efficient method of executing the requested backup, restoration, or recovery operation and then submits these operations to the Oracle Database server for processing. Recovery Manager and the server automatically identify modifications to the structure of the database and dynamically adjust the required operation to adapt to the changes.
RMAN provides the following benefits:
Automated channel failover on backup and restore operations
Automatic failover to a previous backup when the restore operation discovers a missing or corrupt backup
Automated creation of new database and temporary files during recovery
Automated recovery through a previous point-in-time recovery—recovery through resetlogs
Block media recovery enables the datafile to remain online while fixing the block corruption
Fast incremental backups using block change tracking
Fast backup and restore operations with intrafile and interfile parallelism
Enhanced security with Virtual Private Catalog
Lower space consumption when creating a database over the network by eliminating staging areas
Merge incremental backups into image copies in the background providing up-to-date recoverability
Optimized backup and restore of required files only
Retention policy ensures that relevant backups are retained
Ability to resume backup and restore of previously failed operations
Automatic backup of the control file and the server parameter file ensuring that backup metadata is available in times of database structural changes as well as media failure and disasters
Online backup does not require the database to be placed into hot backup mode
Oracle Secure Backup (OSB) is a centralized tape backup management solution providing performant, heterogeneous data protection in distributed UNIX, Linux, Windows, and Network Attached Storage (NAS) environments. By protecting file system and Oracle database data, OSB provides a complete tape backup solution for your IT environment.
OSB is tightly integrated with Recovery Manager (RMAN) to provide the media management layer for RMAN, supporting releases since Oracle9i. With optimized integration points, OSB and RMAN provide the fastest most efficient tape backup for the Oracle database.
You can backup distributed servers to local and remote tape devices from a central OSB administrative server using backup policies, calendar based scheduling for lights out operations, or on-demand backup for immediate requirements. With its highly scalable client/server architecture, OSB provides local and remote data protection, leveraging SSL for secure intradomain communication and two-way server authentication.
The following list describes the key benefits of OSB:
Optimized tape backup for the Oracle database by backing up only the currently used blocks and increasing backup performance by 10% to 25%.
Policy based management allows backup administrators to exercise precise control over the backup domain.
Dynamic drive sharing for increased tape resource use.
Heterogeneous storage area network (SAN) support allowing NAS, UNIX, Windows, and Linux to share tape drives and media.
File system backup at the file, directory, file system or raw partition level with full, incremental and offsite backup scheduling.
Integrated with Oracle Enterprise Manager, providing an intuitive, familiar interface.
Backup encryption to tape.
Broad tape-device support for new and legacy tape devices in SAN and SCSI environments.
Network Data Management Protocol (NDMP) support for highly efficient backup of NAS filers.
Scalable, low-cost licensing model reduces IT costs and operational considerations.
Data Recovery Advisor, a new feature in Oracle Database 11g release 1 (11.1), automatically diagnoses persistent (on disk) data failures, presents appropriate repair options, and runs repair operations at your request.
Note:
Note that the initial release of Data Recovery Advisor does not support Oracle RAC. In addition, while you can use Data Recovery Advisor when managing a primary database in a Data Guard configuration, you cannot use Data Recovery Advisor to troubleshoot a physical standby database. Data Recovery Advisor only takes the presence of a standby database into account when recommending repair strategies if you are using Enterprise Manager 11g Grid Control.Data Recovery Advisor includes the following functionality:
Failure Diagnosis
The first symptoms of database failure are usually error messages, alarms, trace files and dumps, and failed health checks. Assessing these symptoms can be complicated, error-prone and time-consuming. Data Recovery Advisor automatically diagnoses data failures and informs you about them.
Failure Impact Assessment
After a failure is diagnosed, you must understand its extent and assess its impact on applications before devising a repair strategy. Data Recovery Advisor automatically assesses the impact of a failure and displays it in a easily understood format.
Repair Generation
You typically have several repair options available, offering trade-offs in recovery time and potential data loss. If there are multiple failures present, you must also determine the best sequence of repair steps. In some situations it can be advantageous to consolidate repairs. Data Recovery Advisor does all this for you, automatically determining the best repair options.
Repair Feasibility Checks
Before presenting repair options, Data Recovery Advisor validates them with respect to the specific environment and availability of media components required to complete the proposed repair. The feasibility check is fast and validates if the required backups are available. The actual contents of these backups will be validated during repair.
Repair Automation
If you accept the suggested repair option, Data Recovery Advisor automatically performs the repairs, verifies that the repair was successful, and closes the appropriate failures.
Validation of Data Consistency and Database Recoverability
Data Recovery Advisor can validate the consistency of your data, as well as backups and redo stream, whenever you choose.
Early Detection of Corruption
Through Health Monitor, you can schedule periodic runs of Data Recovery Advisor diagnostic checks to detect, analyze, and repair data failures before a database process executing a transaction discovers the corruption and signals an error. Early warnings can limit the damage caused by corruption.
Integration of Data Validation and Repair
Data Recovery Advisor is a single tool for data validation and repair.
See Also:
ÒDiagnosing and Repairing Failures with the Data Recovery AdvisorÓ in Oracle Database Backup and Recovery User's GuideThe flash recovery area is a unified storage location for all recovery-related files and activities in Oracle Database. After this feature is enabled, all RMAN backups, archive logs, control file autobackups, and datafile copies are automatically written to a specified file system or automatic storage management disk group, and the management of this disk space is handled by RMAN and the database server.
Making a backup to disk is faster because using the flash recovery area eliminates the bottleneck of writing to tape. More importantly, if database media recovery is required, then datafile backups are readily available. Restoration and recovery time is reduced because you do not need to find a tape and a free tape device to restore the needed datafiles and archive logs.
The flash recovery area provides:
Unified storage location of related recovery files
Management of the disk space allocated for recovery files, which simplifies database administration tasks
Fast, reliable disk-based backup and restoration
Ability to backup and restore the entire flash recovery area
Ability to tolerate failures to the flash recovery area
The best protection against human errors is to prevent their occurrence. The best way to prevent human errors is to restrict user access to only those data and services truly needed to perform business functions. Oracle provides a wide range of security tools to control access to application data by authenticating database users and then enabling administrators to grant them only those privileges required to perform their duties.
In addition, the Oracle Database security model provides the ability to restrict data access at a row level using Virtual Private Database, thereby further isolating database users from data that they do not need to access.
Oracle security features include:
Authentication control to validate the identities of entities using networks, databases, and applications. Network sessions between databases, such as redo transport sessions, are also authenticated.
Authorization control to provide limits to access and actions linked by database user identities and roles.
Access control to objects, providing protection regardless of the entity seeking to access or alter them.
Auditing control to monitor and gather data about specific database activities, investigate suspicious activity, deter users (or others) from inappropriate activities, and detect problems with authorization or access control implementation.
Security policy management using profiles.
Encryption of data residing within the database and backups, or transferred to and from databases.
Oracle log files contain useful information about the activities and history of the Oracle database. Log files contain all data necessary to perform database recovery, and also record all changes made to the data and metadata within the database.
LogMiner is a fully relational tool that allows redo log files to be read, analyzed, and interpreted using SQL. Analysis of the log files with LogMiner can be used to:
Track or audit changes to data
Provide supplemental information for tuning and capacity planning
Retrieve critical information for debugging complex applications
Recover deleted data
Provide additional browser-based simplification to help troubleshoot and resolve logical failures
LogMiner features include:
Pinpoint when a logical corruption to the database—such as errors made at the application level—may have occurred
Determine the necessary actions to perform fine-grained recovery at the transaction level
Performance tuning and capacity planning through trend analysis
Perform post auditing
See Also:
Oracle Database UtilitiesThe Hardware Assisted Resilient Data (HARD) Initiative is a joint initiative between Oracle and hardware vendors to prevent data corruptions from being written out to disk. Data corruption is very rare, but when it happens, it can have a catastrophic effect on a database, and therefore a business.
Under the HARD Initiative, Oracle works with selected system and storage vendors to build operating system and storage components that can detect corruption early and prevent corrupted data from being written to disk. The key approach is block checking where the storage subsystem validates the Oracle block contents.
Any datafiles and log files located on HARD-compliant storage is protected. You must also enable the HARD validation feature on the storage, using the vendor-provided interface. When Oracle writes data to the storage, the storage system validates the data. If the data appears to be corrupted, then the write is either rejected with an error, or it is accepted with an error logged by the storage in the internal logs.
Storage vendors may choose to implement some or all of the checks in their implementation. Also, each vendor's implementation is unique and their control interfaces may have different features.
See Also:
The HARD initiative page for the latest vendor and implementation information athttp://www.oracle.com/technology/deploy/availability/htdocs/HARD.html
The Hang Manager, a new feature in Oracle Database 11g Release 1, simplifies database management by enabling you to more quickly resolve database hangs. It automatically detects, analyzes, and dumps diagnostic information for hangs in Oracle Database environments, including Oracle RAC and ASM databases.
The Hang Manager is enabled by default in Oracle RAC databases and ASM instances. Thus, you can use Hang Manager to follow a hang from the database instance to the underlying ASM instance.
Active entities that attempt to obtain restrictive access to shared resources or request services from other Oracle Database processes, sessions, and transactions are in danger of hanging. A hang chain is a chain of processes with each one waiting on a resource held by the next, with a single process serving as the root of the hang.
Hangs in Oracle Database can cost a great deal in terms of system unavailability. Specifically, hangs lead to the following problems:
Extended system outages. These outages may occur frequently before a fix is found, which adds to the total downtime.
Difficult analysis: Analyzing the hang to determine where the problem lies can be lengthy, complex, and prone to error.
See Also:
Oracle Database Reference for more information about the views used to determine hangsPrior to Oracle Database 11g, RMAN-detected block corruptions were recorded in V$DATABASE_BLOCK_CORRUPTION
. In Oracle Database 11g, several database components and utilities, including RMAN, can now detect a corrupt block and record it in that view. Oracle Database automatically updates this view when block corruptions are detected or repaired (for example, using block media recovery or data file recovery). The benefit is that the time it takes to discover block corruptions is shortened.
In addition, you can use the DB_ULTRA_SAFE
initialization parameter to automatically configure the appropriate data protection block checking level in the database. The performance impact may vary depending on the application and available system resources, but the effect can vary from 1% to 10%.
The DB_ULTRA_SAFE
initialization parameter:
Controls the setting of other related initialization parameters, including DB_BLOCK_CHECKING
, DB_BLOCK_CHECKSUM
, and DB_LOST_WRITE_PROTECT
Controls other data protection behavior in the Oracle Database, such as requiring ASM to perform sequential mirror writes
By making it possible to detect data corruptions in a timely manner, these features provide critical high availability benefits for the Oracle database.
See Also:
Oracle Database Reference for more information about these views and initialization parametersOracle provides high availability solutions to prevent, tolerate and reduce downtime for all types of unplanned failures.
Table 2-1 describes the various Oracle high availability solutions for unplanned downtime along with the recovery time that can be attained with each solution. The table shows how the features discussed in Section 2.1.1 through Section 2.1.16 can be used to address various causes of unplanned downtime. Also, see Table 4-4 for a summary of the attainable recovery times for all types of unplanned downtime for each Oracle high-availability architecture.
Table 2-1 Outage Types and Oracle High Availability Solutions for Unplanned Downtime
Outage Type | Oracle Solution | Benefits |
---|---|---|
Computer Failures |
Oracle Real Application Clusters and Oracle Clusterware |
|
Computer Failures |
|
|
Computer Failures |
|
|
Computer Failures |
|
|
Storage Failures |
|
|
Storage Failures |
|
|
Storage Failures |
|
|
Storage Failures |
|
|
Human Errors |
|
|
Human Errors |
|
|
Human Errors |
|
|
Data Corruption |
Hardware Assisted Resilient Data (HARD) Initiative |
|
Data Corruption |
Data Block Corruption Prevention and Detection Parameters Database initialization settings such as |
|
Data Corruption |
Data Recovery Advisor and Recovery Manager with Flash Recovery Area |
|
Data Corruption |
|
|
Data Corruption |
|
|
Lost writes |
Oracle Data Guard, Recovery Manager, and the Also, setting |
|
Lost writes |
Hardware Assisted Resilient Data (HARD) Initiative |
|
Hangs or slow down |
Hang Manager and Oracle Enterprise Manager |
|
Site Failures |
|
|
Site Failures |
|
|
Site Failures |
|
|
Planned downtime can be just as disruptive to operations as unplanned downtime. This holds especially true for global enterprises that need to support users in multiple time zones, or for those that need to provide Internet access to customers 24 hours a day, seven days a week.
In the past, planned downtime became necessary when performing periodic maintenance or when migrating to new deployments. Periodic maintenance—such as patching or reconfiguring the system—may be necessary to update the database, application, operating system, middleware, or network. New deployments include major upgrades or new rollouts of the hardware, database, application, operating system, middleware, or network.
Oracle provides the following high availability solutions to eliminate or reduce planned downtime for system and database changes, data changes, and application changes:
For system and database changes, see Dynamic Resource Provisioning
For data changes, see Online Reorganization and Redefinition
For application changes, see Online Application Maintenance and Upgrades
This section describes dynamic resource provisioning under the following topics:
Oracle continues to broaden support for dynamic reconfiguration of the database, enabling it to adapt to changes in hardware demands without any service interruptions. Oracle Database dynamically accommodates various changes to hardware and database configurations:
Add and remove processors from an SMP server
Add and remove nodes and instances in an Oracle RAC environment
Dynamically grow and shrink its shared memory allocation and automatically tune memory online using Automatic Shared Memory Management
Add and remove database disks online without disturbing database activities using Automatic Storage Management (ASM)
Add and remove storage arrays online without disturbing database activities using ASM
Automatically rebalance the I/O load across the database storage using ASM
Move datafiles online when adding or dropping disks using ASM, which automatically rebalances database storage whenever the storage configuration is changed
Change almost all initialization parameters without shutting down the instance by using the SQL*Plus ALTER SESSION
statement to change the value of a parameter during a session, or the ALTER SYSTEM
statement to change the value of a parameter in all sessions of an instance for the duration of the instance
These capabilities provide no-cost system changes and capacity on-demand provisioning, both of which are fundamental requirements of enterprise Grid computing.
Oracle Database 11g release 1 (11.1) introduces two new memory management initialization parameters, MEMORY_TARGET
and MEMORY_MAX_TARGET
that enable automatic management of the system global area (SGA), program global area (PGA), and other memory required to run Oracle Database.
Note:
MEMORY_MAX_TARGET
is the value up to which MEMORY_TARGET
can grow dynamically. If these initialization parameters are left at their default values (0
), then Oracle Database does not autotune memory. If one parameter is set to a nonzero value and other is not set, then Oracle Database internally sets both parameters to the nonzero value.Oracle Database uses a noncentralized policy to free and acquire memory in each subcomponent of the SGA and the PGA. Oracle Database autotunes memory by prompting the operating system to transfer granules of memory from less needy to more needy components. The granularity of the memory transfer is dependent on the current free memory and the amount of memory the operating system needs to maintain a basic level of service.
Note:
Automatic memory management with theMEMORY_TARGET
and MEMORY_MAX_TARGET
initialization parameters is supported on Linux, Windows, Solaris, HP-UX, and AIX. Check the Oracle Database Concepts and Oracle Database Administrator's Guide for more information about all supported platforms.ASM automates and simplifies the layout of datafiles, control files, and log files. Database files are automatically distributed across all available disks. Database storage is rebalanced whenever the storage configuration changes, including adding and removing disks or storage arrays. ASM provides redundancy through the mirroring of database files, and provides optimal performance by automatically striping database files across available disks.
See Also:
Oracle Database Concepts and Oracle Database Storage Administrator's Guide for more information about Automatic Storage ManagementOracle provides high availability solutions to prevent, tolerate and reduce downtime for all types of unplanned failures.
Table 2-2 describes the various Oracle high availability solutions for planned downtime along with the outage time that can be attained with each solution and their known considerations. In all cases, Oracle recommends you perform extensive testing prior to performing any rolling upgrade. Also, see Table 4-5 for a summary of the attainable recovery times for all types of planned downtime for each Oracle high-availability architecture.
Table 2-2 Oracle High Availability Solutions for Planned Downtime
Maintenance Type | Oracle Recommended Solution | Solution Description | Outage Time |
---|---|---|---|
Operating system and hardware upgrades |
Oracle Real Application Clusters and Oracle Clusterware |
|
No downtime |
Oracle interim patches |
Oracle Real Application Clusters (Oracle RAC) |
|
No downtimeFoot 1 |
Online Patches |
Online Patching |
|
No downtime |
Oracle Clusterware upgrades and patches |
Cluster Ready Services (CRS) |
|
No downtime |
ASM upgrades |
|
|
No downtime |
Storage migrationFoot 2 |
|
|
No downtime |
Migrating to ASM or migrating a single-instance database to Oracle RAC |
|
|
Seconds to minutes |
Patch set and database upgrades |
Oracle Data Guard using SQL Apply and logical standby databases |
|
Seconds to minutes |
Platform Migration Across Windows and Linux Platforms |
|
|
Seconds to minutes |
Platform Migration across same endian format platforms |
Transportable Database |
|
Minutes to hours |
Platform migration across different endian format platforms |
Transportable Tablespace |
|
Minutes to hours |
Application upgrades |
Online Application Maintenance and Upgrades |
|
Footnote 1 Patches that cannot be applied by performing a rolling upgrade can be applied with the MINIMIZE_DOWNTIME
option of the OPatch utility to reduce the availability impact of the patch application.
Footnote 2 An example is migration from traditional storage to low-cost storage
See Also:
Oracle Data Guard Concepts and Administration for more information about using Data Guard with SQL Apply to upgrade an Oracle database
Oracle Database Concepts and Oracle Database Administrator's Guide for more information about transportable tablespace
The best practices white papers on rolling upgrades at
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
Using Oracle RAC is the recommended solution for avoiding downtime during system and hardware upgrades.
If you cannot perform the upgrade using Oracle RAC, then Oracle Data Guard and physical standby databases are the recommended solution. See Section 2.2.2.2 for more information.
Oracle RAC Solution Description
Perform the following steps:
Stop the application service.
This implicitly redirects connections off of the target instance when using FAN.
Shut down target instance or instances with the IMMEDIATE
option.
Shut down and disable Oracle Clusterware.
Disable is used to prevent startup at boot time.
Perform maintenance.
Enable and start Oracle Clusterware.
This step implicitly starts the database instances.
Start the application service.
This step implicitly redirects connections back on to the target instance when using FAN.
Repeat all steps on the next node.
Additional Considerations
In addition, verify the following:
Ensure the planned maintenance can be done in a rolling fashion from an operating system perspective.
Ensure the database and clusterware versions are certified with the new system and hardware changes.
See Also:
Your operating system-specific Oracle Real Application Clusters Installation GuideOracle Data Guard and physical standby databases are the recommended solution for performing system and cluster upgrades that are not upgradeable using Oracle RAC rolling upgrades. Oracle Data Guard is also recommended for migrations to ASM, Oracle RAC, 64-bit systems, Windows to Linux or Linux to Windows, or same processor architecture platforms.
For example, Oracle Data Guard can be used for system upgrades that cannot be upgraded using Oracle RAC rolling upgrades due to system restrictions or if migrating to ASM, from a noncluster environment to Oracle RAC, to a different platform with the same endian format or to a different platform with the same processor architecture.
In general, you first upgrade the physical standby database and then perform a Data Guard switchover to the physical standby database, as follows:
Upgrade the system or change the physical standby database system to your target environment.
For example, you can convert the standby database from a single-instance database to an Oracle RAC database by using ASM, without any impact on the primary database. Then, restart the standby database, ensure that it matches your target environment, and wait for Redo Apply to finish applying all redo data to the standby database.
Perform a Data Guard switchover—optimally the switchover should take only seconds to minutes.
Shut down the original primary database (now the standby database).
Upgrade or make system changes to the original primary database.
Restart it as a standby database and allow recovery to synchronize the databases.
Optionally, perform a Data Guard switchover to return the standby database to the primary database role.
Additional Considerations
For fastest switchover, the standby database should be using real-time apply and synchronized prior to the switchover operation.
This is the best approach if Oracle RAC rolling upgrade or online patching is not possible.
The conversion from 32 to 64 bit is automatic if you are applying an Oracle Database patch set or doing an Oracle Database upgrade at the same time. If you are upgrading only the operating system, you may need to perform additional post-upgrade steps that are described in Metalink note 414043.1. Also, see the Oracle Database Upgrade Guide for more information about upgrades.
Using Oracle RAC is the recommended solution for avoiding downtime when applying Oracle interim database patches. If you cannot apply patches using Oracle RAC, then Oracle Data Guard and physical standby databases are the recommended solution. See Section 2.2.2.2 for more information.
Solution Description
Oracle interim (one-off) patches to database software are usually applied to implement known fixes for software problems, or to apply diagnostic patches to gather information about a problem. Applying patches is often performed during a schedule maintenance outage.
Oracle provides the capability to do rolling patch upgrades with Oracle RAC with little or no database downtime using the opatch
command-line utility.
An Oracle RAC rolling upgrade enables all but one of the instances of the Oracle RAC installation to be available during the scheduled outage. This means that the impact on the application downtime required for scheduled outages is further reduced. The Oracle opatch utility enables you to apply the patch successively to the different instances in an Oracle RAC installation.
Additional Considerations
Performing a rolling upgrade is possible only for patches that are certified for rolling upgrades. Typically, patches that can be installed in a rolling upgrade include:
Patches that do not affect the contents of the database, such as the data dictionary
Patches not related to Oracle RAC internode communication
Patches related to client-side tools such as SQL*Plus, Oracle utilities, development libraries, and Oracle Net
Patches that do not change shared database resources, such as datafile headers, control files, and common header definitions of kernel modules
Do not use Oracle RAC to perform rolling upgrades of patch sets.
See Also:
Your operating system-specific Oracle Real Application Clusters Installation GuideUsing Online Patching is the recommended solution for avoiding downtime when an online patch is available.
Solution Description
Online patches are a special type of interim patch that can be applied while the instance remains online.
Oracle provides the capability to do online patching with any Oracle database using the opatch command-line utility.
Additional Considerations
Oracle provides online patches when the changed code is small in scope and complexity, such as with diagnostic patches or small bug fixes.
Oracle provides online patches when the patch does not change shared memory structures in the System Global Area (SGA), or other critical internal code structures.
Applying an online patch increases memory consumption on the system because each Oracle process uses more memory from the Program Global Area (PGA) during the patch application. You need to take your memory requirements into consideration before you begin applying an online patch. Each online patch is unique and the memory requirements are patch specific. As is always the case, the best practice is to apply the patch on your test system first. Doing so also enables you to assess the effect of the online patch on your production system and estimate any additional memory usage.
See Also:
Oracle Universal Installer and OPatch User's Guide for information about online patching and OPatch, and see Oracle Database Upgrade Guide for an overview of rolling upgrades and rolling patchesPerforming rolling upgrades of the Oracle Clusterware using Cluster Ready Services (CRS) software is the recommended solution for avoiding downtime when upgrading Oracle Clusterware.
Solution Description
All upgrades to Oracle Clusterware can be performed in a rolling fashion.
See Also:
Your operating system-specific Oracle Clusterware installation guideUpgrades ASM in a rolling fashion is the recommended solution for upgrading ASM.
Solution Description
All upgrades starting with Oracle Database 11g (and later releases) can be performed in a rolling fashion.
Using ASM is the recommended solution for performing storage migrations.
Solution Description
ASM enables you to add all disks in one storage array and subsequently drop all disks from another array. ASM automatically rebalances and migrates data to the new storage while the database remains operational.
Additional Considerations
Before removing the source storage array, ensure that the rebalancing is complete.
See Also:
The chapter about performing ASM Data Migration in the Oracle Database Backup and Recovery User's GuideOracle Data Guard using SQL Apply is the recommended solution for performing patch set and database upgrades. Section 2.2.2.8.1 describes this solution.
Oracle Streams is the recommended solution for performing database upgrades when there are data type restrictions that prevent you from performing a rolling upgrade with SQL Apply. Section 2.2.2.8.2 describes this solution.
Follow these steps to leverage Data Guard using SQL Apply to upgrade an Oracle database:
Upgrade logical standby database to the new release and evaluate the change.
Ensure that SQL Apply has applied all redo data to the logical standby database
Disconnect applications.
Perform Data Guard switchover.
Reconnect applications to the new primary database.
Shut down the original primary database (now the logical standby database).
Execute database software upgrade steps on the new standby database.
Restart the standby database and allow recovery to synchronize.
Optionally perform a Data Guard Switchover to return to the original database.
Additional Considerations
Oracle Data Guard is the best approach if performing a Oracle RAC rolling upgrade is not possible and there are no data type restrictions.
See Also:
The ÒRolling Database Upgrades Using Data Guard SQL ApplyÓ white paper available athttp://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
Streams is similar in function to Data Guard SQL Apply but provides added flexibility if your database includes data types that SQL Apply does not support. Streams enables a method that can be used to work around the existence of unsupported data types and still effect a rolling database upgrade with minimal downtime.
The following high-level steps describe how to perform a database upgrade:
Before you begin the upgrade process, see Oracle Streams Concepts and Administration for information about how to perform a database upgrade on a database that has user-defined types.
Create a duplicate database. (Ideally the replica will start out as a physical standby database that is up to date.)
Activate and upgrade the database to the later version.
Enable Oracle Streams replication.
During the upgrade of the replica, the source database continues ahead. Once the replica is caught up, perform a switchover.
See Also:
Oracle Streams Concepts and Administration for complete information about online database upgrade with Oracle StreamsConsider that following approaches when performing platform migrations across same endian format platforms
Oracle Data Guard (physical standby database) is the recommended solution for performing platform migration across Linux and Windows platforms. This solution is described in Section 2.2.2.2.
If cross-platform physical standby database is not available for the platform combination to be migrated, then use transportable database. This solutions is described in Section 2.2.2.9.1, "Solution Description for Platform Migration Using Transportable Database".
If transportable database cannot perform the migration quickly enough, then use Oracle Streams as described in Section 2.2.2.9.2, "Solution Description for Platform Migration Using Oracle Streams"
Transportable database should be used for platform migration only when cross-platform physical standby database or logical standby database is not supported for the platform combination in questionFoot 1 .
For example, if you want to move from Windows x86-64 to Linux x86-64, it is best to use cross-platform standby database instead of transportable database. There is less downtime (simply the time it takes to switchover) and it is possible to run the standby database on the new platform for a period of time to ensure that everything is working as planned.
The high-level steps (with target system conversion) are as follows:
Place the source database in read/only mode
Run RMAN CONVERT DATABASE
command
Move files to the target system
Run RMAN generated script to convert datafiles to target platform format
Run RMAN generated script to complete the migration
When using transportable database, the downtime required for a platform migration is determined by the time needed to:
Place the source database in read-only mode
Convert all data files to the new platform format
Transfer all data files from the source system to the target system
You can significantly minimize this time by using a storage infrastructure that can make the data files available to the target system without the need to physically move the files.
Invalidate and recompile all PL/SQL using SQL scripts utlirp.sql
and utlrp.sql
.
See Also:
The ÒPlatform Migration using Transportable DatabaseÓ white paper available athttp://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
Oracle Streams enables updates on the multiple masters and provides support for heterogeneous platforms with different database releases. Therefore, Oracle Streams may provide the fastest approach for database upgrades and platform migration.
Oracle Streams has data type limitations and restrictions, such as for advanced queue and object types. But in some cases you can work around them by creating shadow tables on the source database. You can create a trigger on tables with unsupported data types to capture and propagate changes to tables with supported data types. Those changes are replicated by way of Streams to the target database. You can customize apply to apply the changes to the original tables in the target database.
Oracle Streams implementations require additional administrative effort for testing, setup, and configuration because Streams is designed to be a more flexible architecture.
The following high-level steps describe how to perform a platform migration with Oracle Streams:
Set up the Streams environment on the source database.
Instantiate the replica database (aka target database) using the new target version or on the target platform.
Setup the Streams environment on the target database.
Enable Streams to propagate all changes made on the source database to the target database to completely synchronize the target database with the source.
Connect users to target database and shutdown source database.
Remove Streams configuration.
Consider that following approaches when performing platform migrations on different endian format platforms:
Transportable tablespace is the recommended solution for performing platform migration across different endian format platforms and reduces downtime significantly. See the "Solution Description for Transportable Tablespace" section for more details.
Oracle Data Pump is the simplest of all the approaches. See the Oracle Database Utilities for complete information about using Oracle Data Pump.
For planned downtime of potentially seconds, consider using Oracle Streams as described in Section 2.2.2.9.2, "Solution Description for Platform Migration Using Oracle Streams".
Solution Description for Transportable Tablespace
Migrating a database to a new platform using a different endian format with transportable tablespaces requires the following high level steps:
Create a new, empty database on the target platform.
Import objects required for transport operations from the source database into the target database.
Export transportable metadata for all user tablespaces from the source database.
Transfer datafiles for user tablespaces to the target system.
Use RMAN to convert the datafiles to the target system's endian format.
Import transportable metadata for all user tablespaces into the target database.
Import the remaining database objects and metadata (that were not moved by the transport operation) from the source database into the target database.
If the target database is being moved to a new location (for example, to a new data center) during the migration, then create a physical standby database from the original primary database co-located with the target database. After a Data Guard switchover, transport the tablespaces from the source to the target without incurring the file transfer time as part of the downtime.
Additional Considerations
Transportable tablespace has limitations and restrictions in regard to character sets, opaque types, and system tablespace objects. Unlike previous solutions, the steps are not automated.
Perform a platform migration using transportable tablespaces if all of the following are true:
The source and target platforms have different endian formats.
The time required to perform a full Data Pump Export and Import does not fit in the maintenance window.
See Also:
The best practices white paper titled: ÒOracle Database 10g Release 2 Best Practices: Platform Migration using Transportable TablespacesÓ available athttp://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
One way to enhance availability and manageability is to allow customary user access to the database during a data reorganization operation. The Online Reorganization and Redefinition feature in Oracle Database offers administrators significant flexibility to modify the physical attributes of a table and transform both data and table structure while allowing customary user access to the database. This capability improves data availability, query performance, response time, and disk space utilization. All of these are important in a mission-critical environment and make the application upgrade process easier, safer, and faster.
This online architecture provides the following benefits:
Tables can be reorganized and redefined online
Any physical attribute of the table can be changed online. The table can be moved to a new location, partitioned, and converted from one organization (such as heap-organized) to another (such as index-organized).
Many logical attributes can also be changed. Column names, types, and sizes can be changed. Columns can be added, deleted, or merged. One restriction is that the primary key of the table cannot be modified.
All index operation can be performed online
Indexes can be created online and analyzed simultaneously. Online repair of the physical guess component of logical rowids (used in secondary indexes and in the mapping table for index-organized tables) can also be used.
An index-organized table and secondary indexes can be reorganized online to eliminate the reorganization maintenance window. Secondary indexes support efficient use of block hints (physical guesses). Invalid physical guesses of logical rowids stored in secondary indexes on index-organized table can also be repaired online.
An index-organized table or table partition can be reorganized without rebuilding its secondary indexes, resulting in a short reorganization maintenance window.
Online move of a partitioned table
Online reorganization support for advanced queues, clustered tables, materialized views, and abstract data types (objects)
Fast ADD COLUMN
with default value (does not need to update all rows to default value)
Invisible Indexes speed application migration and testing:
Speeds up migration with explicit hints, then drops when finished
Prevents premature use of newly created indexes
Tests effects of DROP INDEX
, making the index visible if needed, thus there is no need for an index rebuild
Online index build with no pause to DML (no exclusive DML locks are required)
No recompilation of dependent objects when online redefinition does not logically affect objects (for example, when columns are added to tables, or when procedures are added to packages)
Easier to execute table DDL operations online (there is an option to wait for active DML operations instead of aborting)
Support for redefinition of tables that have materialized views or materialized view logs
Depending on the type of online reorganization that is required, you can perform the following types of data reorganization using the DBMS_REDEFINITION
package or the SQL CREATE/ALTER TABLE
and INDEX
commands:
Modify table storage parameters
Move table to a different tablespace
Add support for parallel queries
Add or drop partitioning support
Re-create table to avoid fragmentation
Change from table to IOT or from IOT to table
Add or drop a column
Transform a column using a function
Create indexes online
Rebuild indexes online
Coalesce indexes online
Move index-organized tables online
Copy dependent objects (such as triggers, constraints, and indexes)
Convert LONG
and LONG RAW
columns to a LOB
Use a unique key as an alternative to a primary key or rowid
Specify columns to order data by
Change a table without recompiling stored procedures
Online segment shrink
Reorganize a single partition
Reorganize advanced queue and clustered tables
Reorganize a table containing an ADT
Retain and clone statistics
Copies check and not null constraints
Copies dependent objects for nested tables
See Also:
Oracle Database Administrator's GuideThe following sections describe features that can significantly reduce (or eliminate) the application downtime required to make changes to an application's database objects.
Consider using Oracle Streams for fast rolling upgrades. However, note that while Oracle Streams upgrades can achieve little or no database down time, your ability to configure this solution will require some operational investment. See Section 2.1.4, "Oracle Streams" and Oracle Streams Concepts and Administration for more information.
Data definition language (DDL) commands require exclusive locks on internal structures. If DDL commands are issued, these locks may not be available causing the statement to immediately fail even though the DDL could have possibly succeeded sub-seconds later. DDL specified with the WAIT
option resolves this issue.
DDL with the WAIT
option is the new DEFAULT
. The wait time is specified instance-wide (in the initialization parameter file) and can be modified on a session level.
DDL commands specified with the WAIT
option gives you more flexibility to define grace periods for such commands to succeed instead of raising an error right away, thus requiring additional application logic to handle such errors.
See Also:
Oracle Database Administrator's GuideNew states (ENABLE
and DISABLE
) and ordering (FOLLOWS
) are introduced for triggers to control the firing of triggers. With this feature, the CREATE TRIGGER
statement can be created in a disabled state to validate successful compilation before enabling. In addition, the trigger order can be controlled with the FOLLOWS
clause.
These additional states allow greater administrative control for triggers.
Default values of columns are maintained in the data dictionary for columns specified as NOT NULL
.
Adding new columns with DEFAULT
values and NOT NULL
constraint no longer requires the default value to be stored in all existing records. This not only enables a schema modification in sub-seconds and independent of the existing data volume, it also consumes no space.
See Also:
Oracle Database Administrator's GuideIn releases prior to Oracle Database 11g, metadata records mutual dependencies between objects with the granularity of the whole object. For example, PL/SQL unit P depends on PL/SQL unit Q or that view V depends on table T. This means that dependent objects were sometimes invalidated when there was no logical requirement to do so. For example, if view V depends only on columns C1, C2, and C3 in table T and a new column, C99, is added, the validity of view V is not logically affected. Nevertheless, in earlier releases, V was invalidated by the addition of column C99.
Oracle Database 11g records dependency metadata at a finer level of granularity so that the addition of C99 does not invalidate view V. Similarly, if procedure P depends only on elements E1 and E2 in package PKG, then if element E99 is added to PKG, procedure P is not invalidated. (In Oracle Database 10g, this change to PKG would invalidate procedure P.)
By reducing the consequential invalidation of dependent objects in response to changes in the objects they depend upon, application availability is increased. The benefit is felt both in the development environment and when a live application is parsed or upgraded. The benefit occurs when an Oracle Database patch set is applied because changes to schema objects are required to be compatible and, therefore, not cause consequential invalidations.
An invisible index is an alternative to making an index unusable or even to drop it. An invisible index is maintained for any DML operation but is not used by the optimizer unless you explicitly specify the index with a hint.
Applications often have to be modified without being able to bring the complete application offline. Invisible indexes enable you to leverage temporary index structures for certain operations or modules of an application without affecting the overall application. Furthermore, invisible indexes can be used to test the removal of an index without dropping it right away, thus enabling a grace period for testing in production environments.
See Also:
Oracle Database Administrator's GuideOracle Database 11g Release 1 has added session-level control for materialized view logs. The capture of changes for materialized views (materialized view logs) can be disabled for an individual session while logging continues for changes made by other sessions. This feature reduces Application patching downtime.
This feature minimizes the need to recompile dependent PL/SQL packages after an online table redefinition. If the redefinition does not logically affect the PL/SQL packages, recompilation is not needed. This optimization is turned on by default.
This feature reduces the time and effort to manually recompile dependent PL/SQL after an online table redefinition. This also includes views, synonyms, and other table dependent objects (with the exception of triggers) that are not logically affected by the redefinition.
With Grid Computing and new standby database capabilities, you can leverage and scale your existing system infrastructure. For the primary database, this implies that all hardware resources are leveraged for performance and scalability. For secondary or disaster recovery systems, system and database resources can be used for real-time query or reporting serving a production purpose while in standby database role. With Oracle Database 11g Release 1 (11.1), Oracle Data Guard can be integral part of your IT operations and application business.
This section covers the following topics:
Grid computing is a computing architecture that effectively pools large numbers of servers and storage into a flexible, on-demand computing resource for all enterprise computing needs.
Oracle Database captures the cost advantages of Grid enterprise computing without sacrificing performance, scalability, security, manageability, functionality, or system availability. A Database Server Grid is a collection of commodity servers connected together to run on one or more databases. A Database Storage Grid is a collection of low-cost modular storage arrays combined together and accessed by the computers in the Database Server Grid.
With the Database Server and Storage Grid, you can build standby and testing Hubs to leverage a pool of system resources. The system resources can be dynamically allocated and deallocated depending on various priorities. For example, if the production database fails over to one of the standby databases in the standby hub, it will acquire more system and storage resources while the testing resources may be temporarily starved. With Grid technologies, you can enable high level of utilization and low TCO without sacrificing business requirements.
Figure 2-6 illustrates the Database Server Grid and Database Storage Grid in a Grid enterprise computing environment.
The availability of low-cost and reliable blade servers, small multiprocessor servers, and inexpensive open-source operating systems such as Linux, has made it possible to build a Database Server Grid that is highly available, scalable, flexible, and manageable.
Oracle RAC is the technology that enables a Database Server Grid by providing a single database that spans multiple low-cost servers yet appears to the application as a single, unified database system. Oracle RAC provides flexibility to dynamically provision resources and services within the Grid as computing needs change, and to add systems to the Grid as capacity demands increase. In addition, Oracle RAC provides protection from system failures by automatically recovering the processing of a failed node by any of the surviving systems running the database, and facilitating the reconnection of clients and redistribution of load affected by the failed system.
The availability of low-cost ATA disk-based storage arrays and low-cost storage networks has made it possible to use a Database Storage Grid with Oracle Database at very low cost. A DBA can use the Automatic Storage Management (ASM) interface to specify the disks within the Database Storage Grid that ASM should manage across all server and storage platforms. ASM partitions the disk space and evenly distributes the data storage throughout the entire storage array. Additionally, ASM automatically redistributes the data storage as storage arrays are added or removed from the Database Storage Grid.
Beginning in Oracle Database 11g, standby databases can be used for dynamic IT and application requirements in addition to providing disaster recovery. The real-time query feature in Oracle Data Guard enables you to use physical standby databases for other useful work during normal operations, in addition to providing a disaster-recovery solution.
The following sections describe the Oracle Data Guard features that help you to leverage physical standby databases for additional business purposes:
Data Guard Redo Apply (physical standby database) has proven to be a popular solution for disaster recovery due to its relative simplicity, high performance, and superior level of data protection. Beginning with Oracle Database 11g, a physical standby database can be open read-only while redo apply is active. This means that you can run queries and reports against an up-to-date physical standby database without compromising data protection or extending recovery time in the event a failover is required. This makes every physical standby database able to support productive uses even while in standby role.
To enable real-time query, open the database in read-only mode and then issue the ALTER DATABASE RECOVER MANAGED STANDBY
statement. Note that the COMPATIBLE
parameter must be set to 11.0.0 on both the primary and physical standby databases. Using this feature is totally transparent to applications.
Real-time query provides an ultimate high availability solution because it:
Supports Oracle RAC on the primary and standby databases
Real-time query works on both single-instance and Oracle RAC physical standby databases. Although Redo Apply can be running on only one Oracle RAC instance, you can have all of the instances running in read-only mode while Redo Apply is running on one instance.
Queries return transactionally consistent results that are very close to being up-to-date wish the primary database
Depending on any delay settings or apply rates, the standby database can be lagging seconds behind the primary database. The queries will always be transactionally consistent and will represent a consistent view of the last committed transaction at that time.
Allows fast switchovers or failovers because the redo generated by the primary database while the standby database was open read-only has already been applied to the standby database, making it immediately available to assume the primary database role
Enables you to use fast-start failover to allow for automatic fast failover in the case the primary database fails
Note:
Transactions that attempt to modify a physical standby database running with real-time query enabled will fail with an error.See Also:
Oracle Data Guard Concepts and Administration for complete information about using real-time queryBeginning with 11gR1, you can use both physical standby databases (using real-time query) and logical standby databases to deploy a reader farm. An example of such a configuration is provided Figure 2-7, complete with the use of Data Guard fast-start failover to automatically fail over should the primary database fail. Note that all standby databases in the reader farm automatically recognize the new primary database after such a fast-start failover occurs.
Because a Data Guard configuration can support multiple standby databases, customers have used this capability to boost read performance of the most demanding web applications beyond what the underlying system and storage architecture can support. This provides a relatively low-cost method of scaling out using a Grid architecture where I/O is the driving factor.
The concept is straightforward—a single primary database that supports read/write transactions, and multiple standby databases that provide read-only access to web users. Such an approach scales read performance linearly as additional standby databases are added. It is also an effective way to isolate faults, because problems that impact one standby database are isolated from the other standby databases in the configuration.
The benefits of creating a reader farm of physical standby databases include the following:
Fault isolation
High performance with physical standby databases and Redo Apply
Seamless support for all DDL and data types using Redo Apply
All reader databases are kept up-to-date with changes made to the primary database
Automatic, zero or minimal data loss failover capability
Management as a unified configuration through Grid Control
Scale-out using single writer database and n reader databases.
Figure 2-7 shows a good example of how you can leverage Oracle Data Guard, physical standby databases and real-time query to provide the flexibility you need to grow your business quickly, while still providing disaster recovery. In the configuration, the primary database transmits redo data to multiple standby database, one of which is also enabled for fast-start failover for automatic, zero or minimal data loss failover.
If a fast-start failover is triggered in the Data Guard configuration in Figure 2-7:
Automatic failover occurs to the designated standby database
All standby databases accept data from new primary database
You can perform a switchover at a convenient time in the future to return all databases to their original roles
Complex environments demand coordinated configuration changes, system upgrades and new application roll-outs. Manageability in Oracle Database 11g has improved dramatically to automate and simplify operations in high availability architectures, and represents a major milestone in the drive toward self-managing Oracle databases.
This section contains these topics:
Oracle Database has a sophisticated self-management infrastructure that allows the database to learn about itself and use this information to adapt to workload variations or to automatically remedy any potential problem. The self-management infrastructure includes the following:
Automatic Workload Repository
The Automatic Workload Repository (AWR) is a built-in repository that contains performance statistics used by Oracle Database for problem detection and self-tuning purposes. At regular intervals, Oracle Database makes a snapshot of vital statistics and workload information and stores them in the AWR. The data contained in the snapshots is then analyzed by the Automatic Database Diagnostic Monitor (ADDM). See the Oracle Database Performance Tuning Guide for information about the AWR.
Automatic Maintenance Tasks
By analyzing the information stored in the AWR, the database can identify the need to perform routine maintenance tasks. The automated maintenance tasks infrastructure (known as ÒAutoTaskÓ) enables Oracle Database to automatically schedule such operations. AutoTask schedules automatic maintenance tasks to run in a set of Oracle Scheduler windows known as maintenance windows. Maintenance windows are those windows that are members of the Oracle Scheduler window group MAINTENANCE_WINDOW_GROUP
. See the Oracle Database Administrator's Guide and the Oracle Database 2 Day DBA for more information.
Fault diagnosability infrastructure
Oracle Database includes an advanced fault diagnosability infrastructure for preventing, detecting, diagnosing, and resolving problems. The problems that are targeted are critical errors such as those caused by database code bugs, metadata corruption, and customer data corruption. This includes:
The automatic diagnostic repository (ADR), which is a file-based repository for database diagnostic data such as traces, the alert log, health monitor reports, and more. It has a unified directory structure across multiple instances and multiple products.
The incident packaging services that a DBA can use to automatically and easily gather all diagnostic data (traces, health check reports, SQL test cases, and more) pertaining to a critical error and package the data into a zip file suitable for transmission to Oracle Support.
See the Oracle Database Administrator's Guide for more information about these components.
For problems that cannot be resolved automatically and require administrators to be notified (such as running out of space) the Oracle Database provides server-generated alerts. Oracle Database can monitor itself and send out alerts to notify you of any problem and provide recommendations on how the reported problem can be resolved. This ensures quick problem resolution and helps prevent potential failures.
Oracle Database includes a number of advisors for different subsystems in the database to automatically determine how the operation of the corresponding subcomponents could be further optimized. The SQL Tuning Advisor and the SQL Access Advisor, for example, provide recommendations for running SQL statements faster. Memory advisors help size the various memory components without resorting to trial-and-error techniques. The Segment Advisor handles space-related issues, such as recommending wasted-space reclamation and analyzing growth trends, while the Undo Advisor guides you in sizing the undo tablespace correctly. See the Oracle Database 2 Day DBA for more information about using advisors.
The Hang Manager is an Oracle Database infrastructure that can detect hangs, analyze them, and then obtain the required diagnostic data from Oracle. The Hang Manager is enabled by default in single-instance databases, Oracle RAC databases, and Automatic Storage Management (ASM) instances. Thus, you can use Hang Manager to follow a hang from the database instance to the underlying ASM instance.
Oracle Database 11g Release 1 introduces automatic capture and replay of workloads before and after changes so that you can analyze the impact of a database or a SQL change:
The Database Replay feature addresses enables you to perform real-world testing by capturing the actual database workload on the production system and replaying it on the test system. It also provides analysis and reporting to highlight potential problems (for example, errors encountered and divergence in performance) and recommend ways to remedy the problems.
SQL performance regression is always a concern during system changes such as database upgrades, initialization parameter changes, and addition or dropping of indexes. The SQL Performance Analyzer feature alleviates this concern by providing an easy way to assess the impact of a change on the performance of SQL statements by comparing and contrasting their response times before and after the change. SQL Replay enables you to capture the SQL workload from the source system, such as the production database, and to replay it on the test system where the change has been applied.
See Also:
Oracle Database Performance Tuning GuideBy reducing the amount of human intervention required to execute routine and repetitive tasks, services become more stable, reliable, and available. This is particularly important when administrators need to manage very large numbers of systems as efficiently as possible.
Oracle Enterprise Manager Grid Control is an HTML-based interface that provides the administrator with complete monitoring across the entire Oracle technology stack—business applications, application servers, databases, and the E-Business Suite—as well as non Oracle components. If a component within the fast application notification becomes unavailable or experiences performance problems, an alert is automatically generated to Grid Control to inform the administrator so appropriate action can be taken.
The components of Grid Control include:
Oracle Management Service (OMS)
The OMS is now a set of J2EE applications that renders the interface for Grid Control, works with all Management Agents to process monitoring information, and uses the Management Repository as its persistent data store.
These are processes deployed on each monitored host to monitor all targets on the host, communicate that information to OMS, and maintain the host and its targets.
This is a schema in Oracle Database that contains all available information about administrators, targets, and applications managed by Grid Control.
Communication between Grid Control, the OMS, and Oracle Management Agents is done through HTTP. Also, you can enable SSL to allow secure communications between tiers within firewall-protected environments. The Management Agent uploads collected monitoring data to the OMS, which in turn loads the data into the Management Repository. Changes in a target state (such as an availability state change) result in an alert being generated to Grid Control.
Using Grid Control, an administrator can:
Monitor architecture components and be alerted when a failure occurs
View overall system status, such as the number of nodes in the database cluster and their current status
View alerts aggregated across all instances
Set thresholds for alert generation on a database cluster-wide basis
Monitor performance metric across all instances
Perform database cluster-wide operations such as backup and recovery
Interconnect monitoring of cluster databases
See Also:
Oracle Enterprise Manager Grid Control Installation and Basic Configuration and Oracle Enterprise Manager Concepts for more information about Oracle Enterprise Manager Grid Control
The best practices white papers on configuring Enterprise Manager for high availability at
http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
Footnote Legend
Footnote 1: Beginning with Oracle Database release 11g, the primary and standby systems in a Data Guard configuration can have different CPU architectures, operating systems (for example, Windows and Linux), operating system binaries (32-bit and 64-bit), and Oracle database binaries (32-bit and 64-bit). For the latest capabilities and restrictions, see Metalink note 413484.1 athttps://metalink.oracle.com/
.