Preface

Apache RocketMQ 4.3.0 officially released the transactional message feature, which has been a hot topic recently. In this first article, I’ll talk about distributed transactions, a perennial conundrum in software engineering.

This technology is widely implemented, used and optimized inside major companies such as Alibaba and Tencent. However, because the theory is difficult, distributed transactions have become an unspoken technical barrier separating large companies from small ones. Those of you reading this article have probably heard a lot about distributed transactions, such as two-phase commit, TCC, eventual consistency and so on, so I won’t go over those common concepts here.

Distributed transactions based on RocketMQ

Let’s get right down to business and design our own distributed transaction component using RocketMQ.

A hypothetical scenario to frame the problem

A user transfers 100 yuan from Agricultural Bank of China (ABC) to China Merchants Bank (CMB). The ABC and CMB systems are deployed in their own data centers and communicate with each other through messages to avoid tight coupling.

The whole model can be loosely described as follows: after ABC deducts 100 yuan, it sends a “payment has been deducted” message to CMB. CMB receives the message, knows that the deduction succeeded, and adds 100 yuan to the account on its side.

The question is what ABC should do first. Scheme 1: deduct 100 yuan first, then send the message. Scheme 2: send the message first, then deduct 100 yuan.

Let’s sort out the cases in which the overall transaction ends up inconsistent:

Scheme 1.

ABC deducts 100 yuan successfully, but the message fails to send, so CMB never adds 100 yuan.

Scheme 2.

The message is sent successfully, but ABC fails to deduct 100 yuan, while CMB receives the message and adds 100 yuan.

You may have noticed that switching the order of the debit and the message send does not give us “succeed together or fail together”. Whichever order is chosen, if the first step succeeds and the second fails, the two systems become inconsistent.
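The two failure modes can be reproduced with a toy model. The sketch below is purely illustrative: `OrderingDemo`, the `sendOk` and `debitOk` flags and the in-memory balances are hypothetical names, not part of any bank system or of RocketMQ. It assumes ABC starts with 100 yuan and CMB with 0.

```java
// Toy model of the two orderings; every name here is hypothetical.
class OrderingDemo {
    /** Scheme 1: debit first, then send. Returns {abcBalance, cmbBalance}. */
    static int[] debitThenSend(boolean sendOk) {
        int abc = 100, cmb = 0;
        abc -= 100;                  // the local debit succeeds
        if (sendOk) cmb += 100;      // CMB credits only if the message arrives
        return new int[] {abc, cmb};
    }

    /** Scheme 2: send first, then debit. Returns {abcBalance, cmbBalance}. */
    static int[] sendThenDebit(boolean debitOk) {
        int abc = 100, cmb = 0;
        cmb += 100;                  // the message is delivered, CMB credits
        if (debitOk) abc -= 100;     // the debit may still fail afterwards
        return new int[] {abc, cmb};
    }

    /** Consistent means money was neither lost nor created. */
    static boolean consistent(int[] balances) {
        return balances[0] + balances[1] == 100;
    }
}
```

With these assumptions, `consistent(debitThenSend(false))` and `consistent(sendThenDebit(false))` are both false: in the first case 100 yuan disappears, in the second it is created out of nothing.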

RocketMQ, hereinafter referred to as RMQ, introduces a new message type for transactional messaging: TransactionMsg

A complete transaction message is divided into two parts:

HalfMsg(Prepare) + Commit/RollbackMsg

After the Producer sends the HalfMsg, the Consumer cannot consume it immediately, because a HalfMsg alone is not a complete transaction message. The Producer then ends the transaction by sending either Commit or Rollback for the HalfMsg (the EndTransaction request). The message can be consumed by a Consumer only after the HalfMsg has been committed. For HalfMsgs that were never ended because of an error, RMQ periodically asks the Producer whether they should be committed or rolled back, so that every HalfMsg eventually finishes its life cycle and the transaction reaches eventual consistency.
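The visibility rule can be pictured as a small state machine on the broker side. The following is a minimal in-memory sketch, not RocketMQ’s actual broker code: `ToyBroker`, its `State` enum and all method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model of half-message visibility; all names are hypothetical.
class ToyBroker {
    enum State { PREPARED, COMMITTED, ROLLED_BACK }

    private final Map<String, String> bodies = new LinkedHashMap<>();
    private final Map<String, State> states = new LinkedHashMap<>();

    /** Store a HalfMsg; it stays invisible to consumers until committed. */
    void sendHalf(String txId, String body) {
        bodies.put(txId, body);
        states.put(txId, State.PREPARED);
    }

    /** End the transaction: commit makes the message consumable, rollback hides it for good. */
    void endTransaction(String txId, boolean commit) {
        if (states.get(txId) == State.PREPARED) {
            states.put(txId, commit ? State.COMMITTED : State.ROLLED_BACK);
        }
    }

    /** What a consumer can see: bodies of committed messages only. */
    List<String> consumable() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, State> e : states.entrySet()) {
            if (e.getValue() == State.COMMITTED) out.add(bodies.get(e.getKey()));
        }
        return out;
    }

    /** Transaction ids still waiting for Commit or Rollback; these are the ones to ask about. */
    List<String> pending() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, State> e : states.entrySet()) {
            if (e.getValue() == State.PREPARED) out.add(e.getKey());
        }
        return out;
    }
}
```

In this model a message that was sent with `sendHalf` shows up in `consumable()` only after `endTransaction(txId, true)`, and anything left in `pending()` is a candidate for the periodic question to the Producer.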

Back to the transfer scenario, we use RMQ transaction messages to improve the process:

  1. ABC synchronously sends a HalfMsg to RMQ, carrying the information that ABC is about to deduct 100 yuan

  2. Once the HalfMsg has been sent successfully, ABC executes the local database transaction and deducts 100 yuan in its own system

  3. ABC checks the result of the local transaction

  4. If the local transaction succeeded, ABC sends Commit for the HalfMsg to RMQ

  5. The CMB system, which subscribes to RMQ, receives the message that ABC has deducted 100 yuan

  6. CMB executes its own local transaction and adds 100 yuan in its system

Figure 1: RMQ transaction message principle
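Steps 1 to 4 on the ABC side can be sketched as a single method. This is an illustration under stated assumptions rather than the RocketMQ client API: `HalfMessageBroker` is a hypothetical interface standing in for the broker calls, and `localDebit` stands in for ABC’s database transaction.

```java
import java.util.function.BooleanSupplier;

// Producer-side flow (steps 1 to 4) against a hypothetical broker interface.
class TransferFlow {
    /** Hypothetical stand-in for the broker calls the producer needs. */
    interface HalfMessageBroker {
        boolean sendHalf(String txId, String body);   // synchronous, true if stored
        void commit(String txId);
        void rollback(String txId);
    }

    /** Returns which path was taken: NOT_STARTED, COMMITTED or ROLLED_BACK. */
    static String transfer(HalfMessageBroker broker, String txId, BooleanSupplier localDebit) {
        // Step 1: send the HalfMsg synchronously; stop if it cannot be stored.
        if (!broker.sendHalf(txId, "ABC is about to deduct 100 yuan")) {
            return "NOT_STARTED";
        }
        // Steps 2 and 3: run the local debit and inspect its result.
        boolean debited;
        try {
            debited = localDebit.getAsBoolean();
        } catch (RuntimeException e) {
            debited = false;          // treat an exception as a failed local transaction
        }
        // Step 4: commit on success, otherwise roll the HalfMsg back.
        if (debited) {
            broker.commit(txId);
            return "COMMITTED";
        }
        broker.rollback(txId);
        return "ROLLED_BACK";
    }
}
```

The important design point is the order: the local debit never runs unless the HalfMsg was stored first, so a send failure leaves both systems untouched.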

Similarly, let’s go through the process step by step and see whether any inconsistency can arise:

  1. If the HalfMsg is not sent successfully, the local transaction is never executed

  2. If the local transaction fails, ABC immediately sends Rollback to cancel the HalfMsg, as if nothing had happened

  3. If ABC’s local transaction succeeds but the Commit fails, the HalfMsg is still stored in RMQ, so RMQ’s scheduled check can ask ABC whether the local transaction succeeded, and the Commit is then issued again. The same applies when a Rollback fails

  4. If CMB consumes the message but its local transaction fails to add the money, the message is still persisted in MQ, and can even be persisted in CMB’s database, so the transaction can be retried
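Case 3 relies on the scheduled check. Below is a minimal sketch of one pass of such a check, assuming the broker keeps unresolved half messages in a map and the producer exposes a lookup function. `CheckBackDemo`, `TxState` and `checkOnce` are hypothetical names, and RocketMQ’s real check logic is more involved.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Model of the periodic check-back; names and structure are assumptions.
class CheckBackDemo {
    enum TxState { COMMIT, ROLLBACK, UNKNOWN }

    /** Half messages keyed by transaction id; UNKNOWN means not yet ended. */
    final Map<String, TxState> halfMessages = new HashMap<>();

    void storeHalf(String txId) {
        halfMessages.put(txId, TxState.UNKNOWN);
    }

    /**
     * One pass of the scheduled check: ask the producer about every unresolved
     * half message and apply the answer. Entries that stay UNKNOWN are asked
     * about again on the next pass. Returns how many were resolved.
     */
    int checkOnce(Function<String, TxState> producerCheck) {
        int resolved = 0;
        for (Map.Entry<String, TxState> e : halfMessages.entrySet()) {
            if (e.getValue() != TxState.UNKNOWN) continue;
            TxState answer = producerCheck.apply(e.getKey());
            if (answer != TxState.UNKNOWN) {
                e.setValue(answer);   // value update only, safe while iterating
                resolved++;
            }
        }
        return resolved;
    }
}
```

A lost Commit or Rollback therefore only delays the outcome: the half message stays unresolved until a later pass gets a definite answer from the producer.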

The case just discussed is idealized: the whole distributed transaction only changes an amount. In a real production system, the local transaction on the message sender’s side can be very complex and may touch dozens of different tables. When RMQ’s scheduled check asks about a HalfMsg, do we have to verify that every table involved in the transaction was committed successfully? Obviously such a solution is very intrusive to the business code and hard to turn into a reusable component. It is therefore necessary to add a Transaction table to the local database and to write the business tables and the Transaction table in the same local transaction. If ABC’s local transaction deducts the money successfully, the status of its TransactionId is recorded as “completed” in the Transaction table. When a check is needed later, it is enough to see whether the status of the corresponding TransactionId is “completed”, without looking at the specific business data.
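The idea of binding the business change and the transaction record can be sketched as follows. This is an in-memory stand-in, assuming a single account table and a status map. `LocalTxTable` and its methods are hypothetical, and a real system would perform both writes inside one database transaction instead of a synchronized method.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a database with a business table and a Transaction table.
class LocalTxTable {
    private final Map<String, Integer> balances = new HashMap<>();   // business table
    private final Map<String, String> txTable = new HashMap<>();     // txId -> status

    LocalTxTable(String account, int balance) {
        balances.put(account, balance);
    }

    /**
     * Debit the account and record the transaction as one atomic unit:
     * either both changes are applied or neither is.
     */
    synchronized boolean debit(String txId, String account, int amount) {
        int current = balances.getOrDefault(account, 0);
        if (current < amount) {
            return false;                       // nothing written, acts as a rollback
        }
        balances.put(account, current - amount);
        txTable.put(txId, "COMPLETED");
        return true;
    }

    /** The check only needs this lookup, never the business tables. */
    synchronized boolean isCompleted(String txId) {
        return "COMPLETED".equals(txTable.get(txId));
    }

    synchronized int balanceOf(String account) {
        return balances.getOrDefault(account, 0);
    }
}
```

However many business tables the debit touches, the answer to a later check stays a single lookup by TransactionId.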

One more small detail.

If you were reading carefully, you may have found that the discussion of case 3 above is a little loose. When the RMQ client sends Commit or Rollback, it uses a Oneway call. If you are familiar with the RMQ source code, you will know that this kind of network call only sends the Request in one direction and does not wait for a Response. This greatly improves the performance of sending messages, but if the send fails the Producer does not know about it, and the transaction can only be ended by the periodic check of the HalfMsg.

    public void endTransactionOneway(
        final String addr,
        final EndTransactionRequestHeader requestHeader,
        final String remark,
        final long timeoutMillis
    ) throws RemotingException, MQBrokerException, InterruptedException {
        RemotingCommand request = RemotingCommand.createRequestCommand(RequestCode.END_TRANSACTION, requestHeader);
        request.setRemark(remark);
        // Send the end-transaction request as a Oneway call
        this.remotingClient.invokeOneway(addr, request, timeoutMillis);
    }

Distributed transactions without RocketMQ

Not every MQ supports transactional messages. How can we use a generic MQ to build a distributed transaction component, or even abstract it into a transactional SOA service?

Taking a closer look at the RMQ transaction message, we can break it down into two parts:

Transaction manager + messages

The transaction manager manages the Prepare, Commit and Rollback of transactions. It also includes a scheduled checker for transactions that are still in the Prepare state.

The message is an ordinary synchronous message: once it is sent, the send result is known for certain. It is used to decouple the transaction system from the business systems. Almost every distributed MQ supports this kind of message.

Let’s design our own DistributedTransaction SOA, hereafter referred to as DT-SOA.

Figure 2: Distributed transaction servitization

The process remains the same, but the distributed transaction no longer depends heavily on RMQ and uses a generic MQ instead:

  1. When system A starts a transaction, it first calls the Prepare method of DT-SOA. Because the call is synchronous, system A obtains the SendResult directly. If Prepare succeeds, the result carries the ID of the global distributed transaction, the TID

  2. System A uses the obtained TID to execute its local transaction, which includes the transaction status table. When the local transaction succeeds, the status for that TID is set to “completed”

  3. System A calls DT-SOA to commit the transaction, and DT-SOA sends a synchronous message to system B through MQ

  4. System B listens on the corresponding Topic and executes its local transaction after receiving the message
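The DT-SOA side of these steps can be sketched as a small facade. Everything here is an assumption made for illustration: the class name `DtSoa`, its `Status` enum, the use of a `java.util.Queue` as the generic MQ, and the use of a random UUID as the TID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.UUID;

// Sketch of a DT-SOA facade over a generic queue; all names are hypothetical.
class DtSoa {
    enum Status { PREPARED, COMMITTED, ROLLED_BACK }

    private final Map<String, Status> transactions = new HashMap<>();
    private final Map<String, String> payloads = new HashMap<>();
    private final Queue<String> mq;     // stands in for any MQ with a synchronous send

    DtSoa(Queue<String> mq) {
        this.mq = mq;
    }

    /** Step 1: register the transaction and hand back its global TID. */
    String prepare(String payload) {
        String tid = UUID.randomUUID().toString();
        transactions.put(tid, Status.PREPARED);
        payloads.put(tid, payload);
        return tid;
    }

    /** Step 3: forward the message to system B, then mark the transaction committed. */
    boolean commit(String tid) {
        if (transactions.get(tid) != Status.PREPARED) return false;
        if (!mq.offer(payloads.get(tid))) return false;   // stays PREPARED for a later retry
        transactions.put(tid, Status.COMMITTED);
        return true;
    }

    /** Cancel a prepared transaction; no message is ever sent for it. */
    boolean rollback(String tid) {
        if (transactions.get(tid) != Status.PREPARED) return false;
        transactions.put(tid, Status.ROLLED_BACK);
        return true;
    }

    Status statusOf(String tid) {
        return transactions.get(tid);
    }
}
```

System A would call `prepare`, run its local transaction with the returned TID, and then call `commit` or `rollback`. Transactions that stay `PREPARED` are the ones a scheduled checker would pick up by asking system A for the status recorded under that TID.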