Two-phase commit (2PC)

We have already covered the basic theory of distributed transactions. Building on that theory, the industry has developed solutions for different distributed scenarios: 2PC, TCC, reliable-message eventual consistency, and best-effort notification. This article focuses on how two-phase commit is used in practice.

First, what is 2PC

The two-phase commit protocol divides the entire transaction process into two phases: the prepare phase and the commit phase. In the name 2PC, 2 refers to the two phases, P to the prepare phase, and C to the commit phase.

1. Prepare phase: The transaction manager sends a Prepare message to each participant. Each participant executes the transaction locally without committing it and writes to its Undo/Redo logs. (The Undo log records the data before modification and is used for rollback; the Redo log records the data after modification and is used to write to the data files after the transaction commits.)

2. Commit phase: If the transaction manager receives an execution failure from any participant, or a participant times out, it sends a rollback message to every participant; otherwise, it sends a commit message. Note: locks on resources must be released at the end of this final phase.
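The role of the two logs can be illustrated with a small sketch. It is a toy model, not how any particular database implements logging: the class `LoggedParticipant` and its `memory` and `disk` dictionaries are hypothetical names introduced only for this example.

```python
_MISSING = object()  # marks "key did not exist before the change"


class LoggedParticipant:
    """Toy participant (hypothetical) that keeps undo and redo logs."""

    def __init__(self, rows):
        self.disk = dict(rows)    # stands in for the data files
        self.memory = dict(rows)  # working copy changed during the prepare phase
        self.undo_log = {}        # key -> value before modification
        self.redo_log = {}        # key -> value after modification

    def prepare(self, changes):
        # Phase 1: execute locally and log both images, but do not touch the data files.
        for key, new_value in changes.items():
            self.undo_log.setdefault(key, self.memory.get(key, _MISSING))
            self.redo_log[key] = new_value
            self.memory[key] = new_value

    def commit(self):
        # Phase 2 (commit): the redo log is written to the data files.
        self.disk.update(self.redo_log)
        self._clear_logs()

    def rollback(self):
        # Phase 2 (rollback): the undo log restores the state before modification.
        for key, old_value in self.undo_log.items():
            if old_value is _MISSING:
                self.memory.pop(key, None)
            else:
                self.memory[key] = old_value
        self._clear_logs()

    def _clear_logs(self):
        self.undo_log.clear()
        self.redo_log.clear()
```

After `prepare`, the change is visible only in the working copy; `rollback` restores it from the undo log, and `commit` makes it durable from the redo log.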

There are two cases of using 2PC to complete distributed transactions:

  • Ideal case: If every participant replies with an agreement message, the transaction manager knows that all participants are ready and sends a commit message to each of them. On receiving the commit message, each participant commits its local transaction and releases the resources it holds. Once processing is finished, it returns a completion message, and the distributed transaction is complete.
  • Abnormal case: If any participant fails, it sends an abort message to the transaction manager. The transaction manager then sends a rollback request to all participants. On receiving the rollback request, each participant undoes all of its operations and restores the state from before the transaction started. The distributed transaction has failed.
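The two cases above can be sketched as a minimal in-process simulation. It assumes participants are plain objects rather than remote services, and the names `Voter` and `two_phase_commit` are made up for illustration.

```python
class Voter:
    """Hypothetical participant that votes in phase 1 and obeys in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Execute locally, hold locks, and vote yes or no.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"      # locks are released here

    def rollback(self):
        self.state = "rolled_back"    # locks are released here


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except Exception:
            votes.append(False)  # an error (or timeout) counts as a "no" vote

    # Phase 2: commit only if every vote was "yes", otherwise roll everyone back.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "rolled_back"
```

A single "no" vote is enough to roll back every participant, including those that had prepared successfully.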

This looks fine, but 2PC has clear disadvantages:

  • 1. What if the transaction manager goes down? The entire distributed transaction cannot continue to execute.
  • 2. Suppose the commit message to one participant is lost because the network is down: the request never arrives and no error message comes back. That participant never commits while the others may already have committed, so the transaction ends up inconsistent.

To address the shortcomings of 2PC, three-phase commit (3PC) was proposed. It introduces a timeout mechanism in both the coordinator and the participants, and splits the first phase of the two-phase commit protocol into two steps, so the protocol first asks, then locks resources, and finally commits.

  • 1. Preparation phase (CanCommit): The coordinator sends a commit request to the participants, and each participant replies yes if it is able to commit and no otherwise.
  • 2. Pre-commit phase (PreCommit): If every participant replied yes, the coordinator sends a pre-commit request; each participant then performs the transaction operations and records the information in its transaction log. If any participant replied no or did not respond, the transaction is aborted.
  • 3. Commit phase (DoCommit): The coordinator either tells the participants to perform the final commit or tells them to abort the transaction.
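A rough sketch of the three steps follows, again as an in-process simulation with hypothetical names (`Participant3PC`, `three_phase_commit`). The `on_timeout` method models the participant-side timeout under the usual 3PC assumption that a participant which has already pre-committed commits by default, while one that has not aborts.

```python
class Participant3PC:
    """Hypothetical participant for a three-phase commit simulation."""

    def __init__(self, healthy=True):
        self.healthy = healthy
        self.state = "init"

    def can_commit(self):
        # Phase 1: only answer yes/no; nothing is executed or locked yet.
        return self.healthy

    def pre_commit(self):
        # Phase 2: execute the operations, write the transaction log, lock resources.
        self.state = "pre_committed"
        return True

    def do_commit(self):
        # Phase 3: final commit, resources are released.
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

    def on_timeout(self):
        # Participant-side timeout: commit if already pre-committed, otherwise abort.
        if self.state == "pre_committed":
            self.do_commit()
        else:
            self.abort()


def three_phase_commit(participants):
    def abort_all():
        for p in participants:
            p.abort()
        return "aborted"

    if not all([p.can_commit() for p in participants]):   # phase 1: ask
        return abort_all()
    if not all([p.pre_commit() for p in participants]):   # phase 2: lock and log
        return abort_all()
    for p in participants:                                # phase 3: commit
        p.do_commit()
    return "committed"
```

Because phase 1 only asks, a "no" answer there costs nothing: no participant has executed work or taken locks yet.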

However, neither 2PC nor 3PC can completely solve the distributed consistency problem. The theory is not covered in more depth here; a general understanding is enough.

Second, solutions

2.1 The XA scheme

In the XA scheme, two-phase commit is implemented at the database level. For example, Oracle and MySQL both support the 2PC protocol.

XA is a variant of the multi-phase transaction, and MySQL’s XA transaction model works as follows:

1. The application (AP) initiates a transaction through the transaction manager (TM).

2. The TM notifies each resource manager (RM) to prepare. Each RM executes its work and notifies the TM that it is OK.

3. After receiving OK from all RMs, the TM notifies them to commit the transaction.

Summary: The whole 2PC transaction process involves three roles: AP, RM and TM. The AP is the application that uses the 2PC transaction. The RM is the resource manager. The TM is the transaction manager, which controls the entire global transaction.

  • In the preparation phase, each RM performs the actual business operations but does not commit the transaction, and the resources stay locked.
  • In the commit phase, the TM receives the replies that the RMs sent in the preparation phase. If any RM failed to execute, the TM notifies all RMs to roll back; otherwise, the TM notifies all RMs to commit. When the commit phase ends, the resource locks are released.
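To make the two phases concrete for MySQL, the sketch below builds the sequence of XA statements a TM would send to each RM. It only generates the statements as strings and does not connect to a database. The function `xa_plan`, the global transaction id, the RM names and the example `UPDATE` statements are all hypothetical; the `XA START`, `XA END`, `XA PREPARE`, `XA COMMIT` and `XA ROLLBACK` statements themselves are standard MySQL syntax.

```python
def xa_plan(gtrid, branches, commit=True):
    """Return an ordered list of (rm_name, sql) pairs for one global transaction.

    gtrid    -- global transaction id shared by all branches (hypothetical value)
    branches -- dict mapping an RM name to the business SQL it should run;
                the RM name doubles as the branch qualifier in this sketch
    commit   -- True if every RM prepared successfully, False to roll back
    """
    plan = []

    # Phase 1: every RM does its work and prepares; nothing is committed yet.
    for rm, work in branches.items():
        xid = f"'{gtrid}','{rm}'"
        plan.append((rm, f"XA START {xid}"))
        plan.extend((rm, sql) for sql in work)
        plan.append((rm, f"XA END {xid}"))
        plan.append((rm, f"XA PREPARE {xid}"))

    # Phase 2: the TM sends the same decision to every RM.
    verb = "COMMIT" if commit else "ROLLBACK"
    for rm in branches:
        plan.append((rm, f"XA {verb} '{gtrid}','{rm}'"))
    return plan


if __name__ == "__main__":
    example = {
        "orders_db": ["UPDATE orders SET status = 'PAID' WHERE id = 1"],
        "stock_db": ["UPDATE stock SET qty = qty - 1 WHERE item_id = 7"],
    }
    for rm, sql in xa_plan("tx1", example):
        print(f"{rm}: {sql}")
```

Every `XA PREPARE` appears before the first `XA COMMIT`, which is why the locks taken in phase one are held until phase two finishes.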

Problems of XA scheme:

  • The local database must support the XA protocol.
  • Resource locks are not released until both phases have finished, which results in poor performance.

2.2 Seata scheme

Seata is an open source distributed transaction solution from Alibaba, committed to providing high-performance and easy-to-use distributed transaction services. Seata offers the AT, TCC, SAGA and XA transaction modes, giving users a one-stop distributed transaction solution. Seata addresses the problems of traditional 2PC: it drives a global transaction to completion by coordinating the branch transactions of local relational databases. Seata is middleware that works at the application layer. Its main advantages are good performance and not occupying connection resources for a long time. It solves the distributed transaction problem in microservice scenarios efficiently and with zero intrusion into business code.

Seata has three modules: TM, RM and TC. The TM and RM are integrated into the business system as Seata clients, and the TC is deployed independently as the Seata server.

Role division:

TM: the transaction manager, which begins, commits, and rolls back distributed transactions.

RM: the resource manager, which registers branch transactions, reports their status, and executes operations on the resources.

TC: the transaction coordinator, which is the Seata server. It provides the transaction management service: it stores transaction logs, compensates abnormal transactions, and centrally manages the global transaction locks (global row locks).

Overall process of transaction execution:

  • The TM begins the distributed transaction (the TM registers a global transaction record with the TC).
  • Based on the business scenario, the resources inside the transaction, such as databases and services, are invoked (each RM reports its resource readiness status to the TC).
  • The TM ends the distributed transaction, and phase one of the transaction ends (the TM notifies the TC to commit or roll back the distributed transaction).
  • The TC collects the transaction information and decides whether the distributed transaction should be committed or rolled back.
  • The TC notifies all RMs to commit or roll back their resources, and phase two of the transaction ends.
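The interaction between the three roles can be sketched as follows. This is not Seata's API (Seata is a Java project); every class and method name below is hypothetical and only mirrors the steps listed above.

```python
import itertools


class TransactionCoordinator:
    """Stand-in for the TC: tracks global transactions and their branches."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.branches = {}  # xid -> list of (rm, ok) pairs

    def begin(self):
        # The TM registers a global transaction record with the TC.
        xid = f"xid-{next(self._ids)}"
        self.branches[xid] = []
        return xid

    def register_branch(self, xid, rm, ok):
        # An RM reports its branch and readiness status.
        self.branches[xid].append((rm, ok))

    def end(self, xid, tm_wants_commit):
        # The TC decides, then notifies every RM (phase two).
        commit = tm_wants_commit and all(ok for _, ok in self.branches[xid])
        for rm, _ in self.branches[xid]:
            if commit:
                rm.branch_commit(xid)
            else:
                rm.branch_rollback(xid)
        return "committed" if commit else "rolled_back"


class ResourceManager:
    """Stand-in for an RM embedded in a business service."""

    def __init__(self, name, tc, fail=False):
        self.name, self.tc, self.fail = name, tc, fail
        self.outcome = {}

    def do_work(self, xid):
        ok = not self.fail  # pretend to run the local business operation
        self.tc.register_branch(xid, self, ok)
        return ok

    def branch_commit(self, xid):
        self.outcome[xid] = "committed"

    def branch_rollback(self, xid):
        self.outcome[xid] = "rolled_back"


def run_global_transaction(tc, rms):
    """Plays the TM role: begin, invoke the resources, then end the transaction."""
    xid = tc.begin()
    all_ok = all([rm.do_work(xid) for rm in rms])
    return xid, tc.end(xid, tm_wants_commit=all_ok)
```

The TM only opens and closes the global transaction; the decision is applied to the branches by the TC, which is why the TC can be deployed as a separate server.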

2.3 Scheme Comparison

  • Architecture level: In the traditional 2PC scheme the RM sits in the database layer. The RM is essentially the database itself, and the scheme is implemented through the XA protocol. Seata’s RM is deployed on the application side as a middleware layer in the form of a JAR package.
  • Two-phase commit: In traditional 2PC, whether the second phase decides to commit or roll back, the locks on transaction resources are held until the second phase completes. Seata commits the local transaction in the first phase, which shortens the time locks are held and improves overall efficiency.
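The idea of committing locally in phase one and compensating later can be sketched with an in-memory SQLite database. This is a simplified illustration, not Seata's implementation: the `account` and `undo_log` table layouts and the function names are assumptions, only a before-image is stored, and the global lock managed by the TC is left out.

```python
import sqlite3


def new_db():
    # Hypothetical schema: one business table and one undo-log table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute(
        "CREATE TABLE undo_log (xid TEXT, account_id INTEGER, before_balance INTEGER)"
    )
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()
    return conn


def phase_one(conn, xid, account_id, delta):
    """Business update and undo record are committed together in one local transaction."""
    with conn:  # commits on success, rolls back on exception
        (before,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (delta, account_id),
        )
        conn.execute(
            "INSERT INTO undo_log VALUES (?, ?, ?)", (xid, account_id, before)
        )


def phase_two(conn, xid, commit):
    """Commit: just drop the undo records. Rollback: restore the before-images first."""
    with conn:
        if not commit:
            rows = conn.execute(
                "SELECT account_id, before_balance FROM undo_log WHERE xid = ?",
                (xid,),
            ).fetchall()
            for account_id, before in rows:
                conn.execute(
                    "UPDATE account SET balance = ? WHERE id = ?",
                    (before, account_id),
                )
        conn.execute("DELETE FROM undo_log WHERE xid = ?", (xid,))


def balance(conn, account_id=1):
    return conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()[0]
```

Since the local transaction is already committed when `phase_one` returns, the database locks are released at that point, and a global rollback becomes a compensating update instead of a database-level rollback.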
