Two-phase commit (2PC) is the commit protocol used by XA, the distributed transaction specification published by X/Open (now The Open Group) and closely associated with the Tuxedo system, now Oracle Tuxedo.

There are two important roles in the XA protocol: the transaction coordinator and the transaction participant.

As the name suggests, the protocol is divided into two phases: in the first the participants prepare, and in the second they commit.

Atomikos can be used in the Java ecosystem to quickly implement two-phase commit distributed transactions.

The first phase

When everything goes smoothly

  1. The coordinator node first sends a Prepare request to all participant nodes.
  2. After receiving the Prepare request, each participant performs the data updates for its local transaction and writes the Undo Log and Redo Log.
  3. Each participant returns a "done" message to the coordinator node without committing the transaction.
  4. The protocol enters the second phase.
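
The participant's side of these steps can be sketched as follows. This is only a minimal in-memory illustration, not a real resource manager: the `Participant` class, its `prepare` method, and the dictionary-based undo and redo logs are hypothetical names invented for the example. As a simplification, the new value is staged in the redo log rather than written in place.

```python
class Participant:
    """Hypothetical in-memory participant (a real one would be a database)."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)  # committed state, visible to other transactions
        self.undo_log = {}      # old values, used if the transaction rolls back
        self.redo_log = {}      # new values, applied when the transaction commits
        self.locked = set()     # keys held until Commit or Rollback arrives

    def prepare(self, updates):
        """Phase 1: record the update and lock the keys, but do not commit."""
        for key, new_value in updates.items():
            self.undo_log[key] = self.data.get(key)
            self.redo_log[key] = new_value
            self.locked.add(key)
        return True  # the "done" message sent back to the coordinator


account = Participant("accounts-db", {"alice": 100})
vote = account.prepare({"alice": 70})
print(vote, account.data, account.redo_log, account.undo_log)
# True {'alice': 100} {'alice': 70} {'alice': 100}
```

Note that the committed data still shows the old balance after `prepare` returns: the change exists only in the logs until the second phase.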

When something goes wrong

In the first phase, if a participant returns a failure message, its local transaction did not execute successfully, and the distributed transaction as a whole must be rolled back.

  1. The coordinator node first sends a Prepare request to all participant nodes.
  2. After receiving the Prepare request, each participant performs the data updates for its local transaction and writes the Undo Log and Redo Log.
  3. A participant fails to execute its local transaction and returns a failure message.
  4. The coordinator aborts the transaction.

Aborting the transaction

If any participant gives the coordinator a No response, or the coordinator has not received responses from all participants when the wait times out, the transaction is aborted.

  1. Send a rollback request. The coordinator sends a Rollback request to all participant nodes.
  2. Roll back the transaction. After receiving the Rollback request, each participant uses the Undo information recorded in phase 1 to roll back its local transaction and, once the rollback completes, releases the resources it held for the duration of the transaction.
  3. Report the rollback result. Each participant sends an Ack message to the coordinator after completing the rollback.
  4. Abort the transaction. Once the coordinator has received an Ack from every participant, the transaction is aborted.
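
The abort path above, including the timeout case, can be sketched roughly as below. Everything here is an assumption made for illustration: the `Participant` class, the `prepare_or_abort` function, and the use of a thread pool with a per-vote timeout stand in for real network calls. A vote that does not arrive in time is simply treated as a No.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as VoteTimeout


class Participant:
    """Hypothetical participant whose vote and response delay are configurable."""

    def __init__(self, name, vote=True, delay=0.0):
        self.name = name
        self.vote = vote    # the answer this participant gives to Prepare
        self.delay = delay  # simulated network/processing delay in seconds
        self.state = "init"

    def prepare(self):
        time.sleep(self.delay)
        self.state = "prepared" if self.vote else "failed"
        return self.vote

    def rollback(self):
        # A real participant would apply its Undo Log and release locks here.
        self.state = "rolled_back"
        return "ack"


def prepare_or_abort(participants, timeout=0.2):
    """Return True if every vote is Yes and on time; otherwise roll back all."""
    with ThreadPoolExecutor(max_workers=len(participants)) as pool:
        futures = [pool.submit(p.prepare) for p in participants]
        votes = []
        for future in futures:
            try:
                votes.append(future.result(timeout=timeout))
            except VoteTimeout:
                votes.append(False)  # no answer in time counts as a No
    if all(votes):
        return True
    # Abort: send Rollback to everyone and wait for each Ack.
    acks = [p.rollback() for p in participants]
    return not all(ack == "ack" for ack in acks)  # False once all Acks are in


nodes = [
    Participant("orders"),
    Participant("stock", vote=False),   # reports a failure
    Participant("billing", delay=0.5),  # answers too late
]
ok = prepare_or_abort(nodes, timeout=0.1)
print(ok, [p.state for p in nodes])
# False ['rolled_back', 'rolled_back', 'rolled_back']
```

The sketch rolls back every participant, including those that voted Yes, because a prepared participant is still holding its Undo information and locks.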

The second phase

In the second phase of an XA distributed transaction, the coordinator node sends a Commit request to all participants, provided every response it received in the first phase was positive.

After receiving the Commit request, each participant node commits its transaction locally and releases its locks. Once the local transaction is committed, the participant returns a “done” message to the coordinator.

When the coordinator has received a “done” message from all participants, the entire distributed transaction is complete.
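
Putting the two phases together, the coordinator's logic might look like the following sketch. The names `Participant`, `two_phase_commit`, and `can_prepare` are hypothetical, calls are made in-process instead of over a network, and failures during the second phase are not handled here.

```python
class Participant:
    """Hypothetical participant that only tracks its transaction state."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "init"

    def prepare(self):
        self.state = "prepared" if self.can_prepare else "failed"
        return self.can_prepare

    def commit(self):
        self.state = "committed"  # commit locally and release locks
        return "done"

    def rollback(self):
        self.state = "rolled_back"
        return "ack"


def two_phase_commit(participants):
    # Phase 1: send Prepare to everyone and collect the responses.
    votes = [p.prepare() for p in participants]
    if not all(votes):
        for p in participants:
            p.rollback()
        return "aborted"
    # Phase 2: every response was positive, so send Commit to everyone.
    replies = [p.commit() for p in participants]
    return "committed" if all(r == "done" for r in replies) else "in doubt"


ok_nodes = [Participant("orders"), Participant("stock")]
bad_nodes = [Participant("orders"), Participant("stock", can_prepare=False)]
print(two_phase_commit(ok_nodes))   # committed
print(two_phase_commit(bad_nodes))  # aborted
```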

Disadvantages

  1. Blocking and latency. Two-phase commit involves several rounds of network communication between nodes, which takes a long time. Throughout the process, the participating nodes are blocked, and the resources they hold (such as database data and local files) stay locked.

  2. Single point of failure. The coordinator is critical: if it fails, the participants keep blocking. This is worst in phase 2, when a coordinator failure leaves all participants holding locks on transaction resources, unable to complete the transaction. (If the coordinator goes down, a new coordinator can be elected, but that does not unblock participants that were left waiting by the failed one.)

  3. Data inconsistency. In phase 2, if a local network failure occurs or the coordinator crashes while it is sending Commit requests, only a subset of the participants receive the request. Those participants commit, while the others, having received no Commit request, cannot. The distributed system as a whole is then left in an inconsistent state.
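
The inconsistency case can be reproduced with a small simulation. This is a sketch under stated assumptions: `Participant`, `commit_with_crash`, and the `crash_after` parameter are hypothetical, and the "crash" is simply the coordinator loop returning early.

```python
class Participant:
    """Hypothetical participant holding one balance."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance  # committed value
        self.pending = None     # value staged during Prepare, still locked

    def prepare(self, new_balance):
        self.pending = new_balance
        return True

    def commit(self):
        self.balance, self.pending = self.pending, None


def commit_with_crash(participants, crash_after):
    """Send Commit one by one; the coordinator crashes after crash_after sends."""
    for sent, p in enumerate(participants):
        if sent == crash_after:
            return  # coordinator is gone; the rest never receive Commit
        p.commit()


a = Participant("bank-a", 100)
b = Participant("bank-b", 50)
a.prepare(70)  # transfer 30 out of a ...
b.prepare(80)  # ... and into b
commit_with_crash([a, b], crash_after=1)
print(a.balance, b.balance, b.pending)
# 70 50 80
```

After the simulated crash, `bank-a` has committed the debit while `bank-b` still shows its old balance, so 30 units have vanished from the committed state. `bank-b` is also left with a pending value it can neither commit nor discard on its own, which is the blocking problem described in item 2.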