Skip Headers
Oracle® Database Administrator's Guide
11g Release 1 (11.1)

Part Number B28310-01
Go to Documentation Home
Home
Go to Book List
Book List
Go to Table of Contents
Contents
Go to Index
Index
Go to Master Index
Master Index
Go to Feedback page
Contact Us

Go to previous page
Previous
Go to next page
Next
View PDF

In-Doubt Transactions

The two-phase commit mechanism ensures that all nodes either commit or perform a rollback together. What happens if any of the three phases fails because of a system or network error? The transaction becomes in-doubt.

Distributed transactions can become in-doubt in the following ways:

The RECO process automatically resolves in-doubt transactions when the machine, network, or software problem is resolved. Until RECO can resolve the transaction, the data is locked for both reads and writes. The database blocks reads because it cannot determine which version of the data to display for a query.

This section contains the following topics:

Automatic Resolution of In-Doubt Transactions

In the majority of cases, the database resolves the in-doubt transaction automatically. Assume that there are two nodes, local and remote, in the following scenarios. The local node is the commit point site. User scott connects to local and executes and commits a distributed transaction that updates local and remote.

Failure During the Prepare Phase

Figure 32-5 illustrates the sequence of events when there is a failure during the prepare phase of a distributed transaction:

Figure 32-5 Failure During Prepare Phase

Description of Figure 32-5 follows
Description of "Figure 32-5 Failure During Prepare Phase"

The following steps occur:

  1. User SCOTT connects to Local and executes a distributed transaction.

  2. The global coordinator, which in this example is also the commit point site, requests all databases other than the commit point site to promise to commit or roll back when told to do so.

  3. The remote database crashes before issuing the prepare response back to local.

  4. The transaction is ultimately rolled back on each database by the RECO process when the remote site is restored.

Failure During the Commit Phase

Figure 32-6 illustrates the sequence of events when there is a failure during the commit phase of a distributed transaction:

Figure 32-6 Failure During Commit Phase

Description of Figure 32-6 follows
Description of "Figure 32-6 Failure During Commit Phase"

The following steps occur:

  1. User Scott connects to local and executes a distributed transaction.

  2. The global coordinator, which in this case is also the commit point site, requests all databases other than the commit point site to promise to commit or roll back when told to do so.

  3. The commit point site receives a prepared message from remote saying that it will commit.

  4. The commit point site commits the transaction locally, then sends a commit message to remote asking it to commit.

  5. The remote database receives the commit message, but cannot respond because of a network failure.

  6. The transaction is ultimately committed on the remote database by the RECO process after the network is restored.

    See Also:

    "Deciding How to Handle In-Doubt Transactions" for a description of failure situations and how the database resolves intervening failures during two-phase commit

Manual Resolution of In-Doubt Transactions

You should only need to resolve an in-doubt transaction in the following cases:

  • The in-doubt transaction has locks on critical data or undo segments.

  • The cause of the machine, network, or software failure cannot be repaired quickly.

Resolution of in-doubt transactions can be complicated. The procedure requires that you do the following:

  • Identify the transaction identification number for the in-doubt transaction.

  • Query the DBA_2PC_PENDING and DBA_2PC_NEIGHBORS views to determine whether the databases involved in the transaction have committed.

  • If necessary, force a commit using the COMMIT FORCE statement or a rollback using the ROLLBACK FORCE statement.

    See Also:

    The following sections explain how to resolve in-doubt transactions:

Relevance of System Change Numbers for In-Doubt Transactions

A system change number (SCN) is an internal timestamp for a committed version of the database. The Oracle Database server uses the SCN clock value to guarantee transaction consistency. For example, when a user commits a transaction, the database records an SCN for this commit in the redo log.

The database uses SCNs to coordinate distributed transactions among different databases. For example, the database uses SCNs in the following way:

  1. An application establishes a connection using a database link.

  2. The distributed transaction commits with the highest global SCN among all the databases involved.

  3. The commit global SCN is sent to all databases involved in the transaction.

SCNs are important for distributed transactions because they function as a synchronized commit timestamp of a transaction, even if the transaction fails. If a transaction becomes in-doubt, an administrator can use this SCN to coordinate changes made to the global database. The global SCN for the transaction commit can also be used to identify the transaction later, for example, in distributed recovery.