Background

At the beginning of April, I interviewed with a local company that had been working on unmanned shelves for offices. Although they are now going through a transformation, they were still attractive to someone like me who wants to move from a traditional enterprise into the Internet industry.

The issue of distributed transactions came up during the interview, and once again I fumbled it. This time I’m writing the problem down properly so it finally sticks!

Let’s look at the interview process first.

The interviewer first drew a picture on a piece of paper:

He asked me to look at it, walk through the process, and say whether I saw any problems. He didn’t mention distributed transactions directly; he wanted me to bring them up myself. That’s the interview routine.

I answered that the process could run into a distributed transaction problem: when system A calls system B in step 2, B may finish processing but time out on its response, so A thinks B failed and rolls back while B has committed, leaving the data in A and B inconsistent.

OK, at this point I had given the answer he was fishing for, or at least shown I was aware of the issue, and he nodded.

Then he asked, “Do you have a good solution?”

At that moment all I had in my head was the rough flow chart of two-phase commit, so I rattled off the theory: there is a coordinator in the middle, there is a pre-commit step, if anything fails everything rolls back, and if everything is OK the transactions are actually committed. The usual theory the online gurus recite.

The interviewer then asked: “What does your code do when the connection from A to B drops mid-call? How do you roll back? Tell me about your code.”

At this point, I was confused.

The final result, you can certainly guess: I blew it.

What is a transaction

The transaction here generally refers to a database transaction: a logical unit of work executed in a database management system, composed of a finite sequence of database operations. That’s how Wikipedia puts it.

For example, account A needs to transfer $100 to account B, which involves at least two operations:

  1. Minus $100 from account A
  2. Add $100 to account B

In a transaction-enabled DBMS, you need to make sure that both of the above operations (the entire “transaction”) complete, and that you can’t end up in a situation where A’s $100 is deducted but B’s account is never credited.

Database transactions contain four features, which are:

  • Atomicity: The transaction executes as a whole; either all of the database operations it contains are performed or none of them are. In the transfer example, account A’s debit and account B’s credit either both succeed or both fail.
  • Consistency: The transaction should move the database from one consistent state to another, where a consistent state means the data satisfies its integrity constraints.
  • Isolation: When multiple transactions execute concurrently, one transaction’s execution should not affect another’s. A concurrent transfer between other accounts must not interfere with the A-to-B transfer above.
  • Durability: Modifications to the database by committed transactions should be permanently saved in the database.
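To make atomicity concrete, here is the transfer as a single local transaction, sketched with Python’s built-in sqlite3 (the table layout and amounts are made up for illustration):

```python
# A minimal sketch of the transfer as one atomic local transaction,
# using Python's built-in sqlite3. Table name and amounts are invented.
import sqlite3

def transfer(conn, from_acct, to_acct, amount):
    """Debit one account and credit the other; all-or-nothing."""
    try:
        with conn:  # commits on success, rolls back on exception
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE name = ? AND balance >= ?",
                (amount, from_acct, amount))
            if cur.rowcount != 1:
                raise ValueError("insufficient funds or unknown account")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, to_acct))
    except Exception:
        return False  # the rollback has already happened
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("A", 1000), ("B", 1000)])
conn.commit()

assert transfer(conn, "A", "B", 100) is True
assert transfer(conn, "A", "B", 10_000) is False  # rejected, nothing changed
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
# balances == {"A": 900, "B": 1100}
```

The `with conn:` block is what gives atomicity here: if the second `UPDATE` raised, the first would be undone too.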

What are distributed transactions

As we know, the above transfer is a transaction operation in a database. We can use frameworks such as Spring’s transaction manager to do this for us.

However, once an operation spans multiple databases, or spans remote calls between applications that may crash mid-way, Spring’s local transaction management can no longer protect us.

Just like the interview question above: in step 2, system A calls B remotely and B’s response times out. A concludes the call failed and rolls back, while B has actually committed its transaction. The data may end up inconsistent, and that is the distributed transaction problem.

The existence of distributed transactions is to solve data inconsistency.

Why do we want consistency

Theory of CAP

There is a popular theory in distributed systems: the CAP theorem

The theorem stems from a conjecture made by Eric Brewer, a computer scientist at the University of California, Berkeley, at the Principles of Distributed Computing conference (PODC) in 2000. Then, in 2002, Seth Gilbert and Nancy Lynch of the Massachusetts Institute of Technology (MIT) published a proof of Brewer’s conjecture, making it a theorem. [From Wikipedia]

It states that it is impossible for a distributed computing system to satisfy all three of the following guarantees at the same time:

  • Consistency
  • Availability
  • Partition tolerance

A distributed system can satisfy at most two of these.

So, what are these three things? And why can we only have two of them?

Let’s start with a scenario: we deploy two nodes, web1 and web2, running the same business code, each maintaining the data generated on its own node. A user request may land on either node, so whatever data the user writes on one node must be synchronized to the other, ensuring that whichever node the user visits, the data is there. The diagram below:

Partition fault tolerance

Let’s start with partition tolerance: even if the network between the two nodes fails and data can no longer be synchronized, each node must still serve the users who reach it on its own. For an Internet company this is a must.

Suppose the network between web1 and web2 is faulty and data cannot be synchronized. A user writes data on web1, then issues a read that lands on web2, which has no data at this point. Do we return null to the user? Or do we show a hint that the system is unavailable and ask them to try again later?

Either way, we lose something.

Consistency

If we choose availability, the node with the data returns it and the node without the data returns null, so the user sometimes sees the data and sometimes doesn’t: a consistency problem.

Availability

If we choose consistency, then whenever the user visits, whether on web1 or web2, we may have to return a message like “the system is unavailable, please try again later” so that every response is consistent. The data exists, yet the system answers with an error prompt: an availability problem.

Since partition tolerance (P) must be guaranteed, a distributed system really trades off between consistency (CP) and availability (AP); only two of the three conditions can be met.

In fact, if you think about it, ZooKeeper strictly implements CP, while Eureka guarantees AP.

In fact, distributed transactions emphasize consistency.

Several distributed transaction solutions

2PC

Before we talk about 2PC, what is the XA specification?

The XA specification describes the interface between a global transaction manager and a local resource manager. The purpose of the XA specification is to allow multiple resources (such as databases, application servers, message queues, and so on) to be accessed in the same transaction so that ACID properties remain valid across applications.

XA uses two-phase commit (2PC) to guarantee that all resources commit or roll back any particular transaction at the same time.

Think of a monolithic-application scenario: have any of you connected to two databases and inserted data into both within a single operation? An ordinary local transaction cannot manage that.

Take a look at the following figure (it just illustrates the pattern of the operation and is not limited to this particular business):

One service operates on two databases; how do we ensure the whole thing succeeds or fails as a unit?

Here we introduce a framework, Atomikos, that implements this XA routine. Look at the code:

Github AtomikosJTATest: github.com/heyxyw/lear…

As the diagram above shows, Atomikos implements a transaction manager itself, and all the connections we obtain come from it.

  • The first step is to start the transaction and then pre-commit, which both DB1 and DB2 perform. Note that no transaction has been committed yet.
  • The second step is the actual commit transaction, which is initiated by Atomikos and rolled back if an exception occurs.

So this process has two roles: a transaction manager and resource managers (here the databases, though they can also be other components such as message queues).

The entire execution process is as follows:

Image from “XA Transaction Processing”: www.infoq.cn/article/xa-… — a detailed explanation of XA worth a careful read. The whole 2PC process:

Phase 1 (commit-request / voting phase):

  1. The coordinator node asks all participant nodes whether they can commit, and waits for each participant node’s response.
  2. Each participant node executes the transaction operations up to the point of commit, and writes the Undo and Redo information to its log.
  3. Each participant node responds to the coordinator’s query: if its transaction operations actually succeeded, it returns an “agree” message; if they failed, it returns an “abort” message.

Phase 2 (commit phase):

Success: when the coordinator node receives an “agree” message from every participant node:

  1. The coordinator node issues a “formally commit” request to all participant nodes.
  2. Each participant node formally completes the operation and releases the resources held during the transaction.
  3. Each participant node sends a “done” message to the coordinator node.
  4. The coordinator node completes the transaction after receiving a “done” message from all participant nodes.

Failure: if any participant node returns an “abort” response in phase 1, or the coordinator cannot obtain responses from all participant nodes before the phase 1 query times out:

  1. The coordinator node issues a “roll back” request to all participant nodes.
  2. Each participant node performs a rollback using the previously written Undo information and frees the resources occupied during the transaction.
  3. Each participant node sends a “rollback complete” message to the coordinator node.
  4. The coordinator node cancels the transaction after receiving a “rollback complete” message from all participant nodes.

The second phase is sometimes referred to as the completion phase because this is where the coordinator must end the current transaction, regardless of the outcome.
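The two phases above can be sketched as a toy coordinator (plain Python; this is a simulation of the protocol flow, not a real XA implementation, and all names are made up):

```python
# Toy 2PC: each participant votes in phase 1; the coordinator commits
# only if every vote is "agree", otherwise it tells everyone to roll back.

class Participant:
    def __init__(self, name, will_agree=True):
        self.name, self.will_agree = name, will_agree
        self.state = "init"

    def prepare(self):          # phase 1: do the work, write undo/redo logs
        self.state = "prepared" if self.will_agree else "aborted"
        return self.will_agree  # the "agree" / "abort" vote

    def commit(self):           # phase 2, success branch
        self.state = "committed"

    def rollback(self):         # phase 2, failure branch: apply undo info
        self.state = "rolled_back"

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:      # even participants that agreed must roll back
        p.rollback()
    return "rolled_back"

assert two_phase_commit([Participant("a"), Participant("b")]) == "committed"

db1, db2 = Participant("db1"), Participant("db2", will_agree=False)
assert two_phase_commit([db1, db2]) == "rolled_back"
assert db1.state == "rolled_back"   # db1 voted agree, but still rolls back
```

The key property the sketch shows: one “abort” vote forces every participant, including the ones that already prepared, to roll back.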

Reliable message final consistency scheme

Based on generic message-queue middleware

Now that we’ve talked about the two-phase commit scheme, we’ll talk about how to solve the distributed transaction problem based on the reliable message final consistency scheme.

In this scenario, the message service middleware role is involved. Let’s start with a flow chart:

Let’s illustrate the figure above with the process of placing an order and the subsequent shipping process.

In the ordering logic (the producer side), the key information of an order, such as the order number and quantity, is first packaged into a message whose state is set to init, then sent to the independent message service and stored in its database.

Next we move on to the rest of the local logic for the order.

Once that processing completes, the confirm step declares that my order was placed successfully: we send a confirm to the message service, which changes the message’s state to send and publishes it to the message queue.

Next, the consumer consumes the message, processes its own logic, and then reports the result back to the independent message service, which sets the message state to end, marking completion. Note that the consumer’s interface must be idempotent, to avoid problems caused by repeated consumption.
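The init → send → end flow just described can be sketched as a small state machine (a hypothetical sketch: the class and method names are invented, and a dict and a list stand in for the message table and the queue):

```python
# Hypothetical sketch of the independent message service's state machine.
INIT, SEND, END = "init", "send", "end"

class MessageService:
    def __init__(self, mq):
        self.store = {}   # msg_id -> (state, payload); stands in for the DB table
        self.mq = mq      # stands in for the downstream message queue

    def prepare(self, msg_id, payload):
        """Step 1: store the message as 'init' before the producer's local logic."""
        self.store[msg_id] = (INIT, payload)

    def confirm(self, msg_id):
        """Step 2: local logic succeeded; flip to 'send' and publish to the MQ."""
        _, payload = self.store[msg_id]
        self.store[msg_id] = (SEND, payload)
        self.mq.append((msg_id, payload))

    def ack(self, msg_id):
        """Step 3: consumer reports success; mark the message 'end'."""
        _, payload = self.store[msg_id]
        self.store[msg_id] = (END, payload)

mq = []
svc = MessageService(mq)
svc.prepare("order-1", {"order_no": "order-1", "qty": 2})
# ... the producer runs its local order logic here ...
svc.confirm("order-1")
svc.ack("order-1")
assert svc.store["order-1"][0] == END and len(mq) == 1
```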

The problems each step can run into, and how to solve them:

  1. If an exception occurs in the prepare stage, the order simply fails to be placed. But since the whole scheme rests on reliable messaging, we do need to keep the message service itself highly available.
  2. The confirm step fails: the order has been placed successfully, but the confirmation never reaches the message service. For this case, the message service runs a scheduled task that periodically queries messages still in the init state and checks with the order system whether the corresponding order number exists. If it does, the message is moved to the send state and published to the queue; if no order is found, the message is simply discarded. This means the order system must provide a batch order-query interface, and the downstream consumer must be idempotent so that repeated consumption stays consistent.
  3. The message is lost from the queue, or the downstream system keeps failing to process it, so no acknowledgment ever arrives. Here we also need a scheduled task for messages in the send state: keep re-sending them until the downstream system consumes them successfully.
  4. On the consumer side, consumption may fail because of an exception or a system bug. Log it; if the cause was network jitter rather than a code problem, then by case 3 above the message service will send the message again and we can process it then. If it keeps failing, you need to ask whether your code really has a bug.
  5. The final fallback: logs, and humans fixing the data. With today’s technology we cannot make machines handle every failure; we still have to rely on people. As far as I know, many big companies have people whose job is exactly this, fixing this class of problem by manually correcting the database. At the small companies I worked for earlier, we basically wrote the SQL to fix the data ourselves. Sounds familiar, right?

Here is the skeleton of the core logic code:

Scheduled task:
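A sketch of the two scheduled recovery tasks described in points 2 and 3 above (the function names and the order-lookup callback are assumptions, with a dict standing in for the message table):

```python
# Hypothetical recovery tasks for the init/send message states.
def recover_init_messages(store, mq, order_exists):
    """Case 2: the confirm was lost. For each 'init' message, ask the order
    system whether the order exists; if yes, send it, otherwise discard it."""
    for msg_id, (state, payload) in list(store.items()):
        if state != "init":
            continue
        if order_exists(payload["order_no"]):
            store[msg_id] = ("send", payload)
            mq.append((msg_id, payload))
        else:
            del store[msg_id]        # the order never went through: drop it

def resend_stuck_messages(store, mq):
    """Case 3: message lost or never acked. Re-send everything still in
    'send' state; this is why the consumer must be idempotent."""
    for msg_id, (state, payload) in store.items():
        if state == "send":
            mq.append((msg_id, payload))

store = {"m1": ("init", {"order_no": "A1"}),
         "m2": ("init", {"order_no": "A2"}),
         "m3": ("send", {"order_no": "A3"})}
mq = []
recover_init_messages(store, mq, order_exists=lambda no: no == "A1")
resend_stuck_messages(store, mq)
assert "m2" not in store                        # no matching order: discarded
assert store["m1"][0] == "send"                 # recovered and sent
assert [m for m, _ in mq] == ["m1", "m1", "m3"] # duplicate m1 is fine: idempotent consumer
```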

Implemented based on RocketMQ

This solution is the same as the independent-message-service scheme above, except the standalone service is removed and the message queue itself takes over its role, namely Alibaba’s RocketMQ with its transactional messages.

The flow chart is as follows:

The process here is the same as with the independent message service above. For more concrete code, see: www.jianshu.com/p/453c6e7ff… — it is very well written.

In terms of the reliable message ultimate consistency scheme, when we say reliable, we mean that the message is guaranteed to be sent to the message middleware.

For the downstream system, failed consumption is generally handled by retrying; if it fails several times, log it for later manual intervention. So, to emphasize again: the downstream system must handle idempotency, retries, and logging.
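The idempotency-plus-retry requirement can be sketched like this (an in-memory set stands in for a dedup table keyed by message ID; all names are invented):

```python
# Hypothetical idempotent consumer with bounded retries.
def make_consumer(handler, max_retries=3):
    processed = set()   # stands in for a dedup table in the database

    def consume(msg_id, payload):
        if msg_id in processed:          # already handled: redelivery is a no-op
            return "duplicate"
        for _attempt in range(max_retries):
            try:
                handler(payload)
                processed.add(msg_id)
                return "ok"
            except Exception:
                continue                  # transient failure: retry
        return "failed"                   # log it; manual intervention later
    return consume

calls = []
flaky = {"left": 2}
def handler(payload):
    calls.append(payload)
    if flaky["left"] > 0:                 # fail the first two attempts
        flaky["left"] -= 1
        raise RuntimeError("transient error")

consume = make_consumer(handler)
assert consume("m1", {"order_no": "A1"}) == "ok"         # succeeds on 3rd try
assert consume("m1", {"order_no": "A1"}) == "duplicate"  # redelivery ignored
assert len(calls) == 3
```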

For money-related business, if a downstream system has to roll back, we must find a way to notify the upstream systems to roll back as well, or raise an alarm and perform rollback and compensation manually.

TCC scheme

The whole TCC process is divided into three phases: Try, Confirm, and Cancel:

  1. Try phase: check the resources of each service and lock or reserve them
  2. Confirm phase: perform the actual operations in the various services
  3. Cancel phase: if the business method of any service fails, compensate by rolling back the business logic that has already executed successfully

Take the transfer example again. A cross-bank transfer involves a distributed transaction between two banks: transferring 1 yuan from bank A to bank B. Implemented with the TCC scheme:

Here’s the idea:

  1. Try stage: freeze 1 yuan in bank A’s account, and pre-add 1 yuan to the frozen funds of bank B’s account.
  2. Confirm stage: perform the actual transfer, deducting 1 yuan on bank A’s side and adding 1 yuan to bank B’s account.
  3. Cancel stage: if either bank’s operation fails, roll back and compensate. For example, if bank A’s account has already been debited but adding the funds at bank B failed, the money must be added back to bank A’s account.

This scheme is more complex: a single business operation needs multiple cooperating interfaces to complete.

To get a feel for the process, here is an example implemented with the ByteTCC framework: gitee.com/bytesoft/By…

At the beginning, both bank accounts of A and B were set as follows: Amount =1000, frozen = 0

Transfer 1 yuan from bank A’s account to bank B’s account:

In the try stage, bank A’s account amount is reduced by 1 and its frozen amount is increased by 1; bank B’s frozen amount is increased by 1.

At this time:

  • A bank account: amount = 1000 – 1 = 999, frozen = 0 + 1 = 1
  • B bank account: amount = 1000, frozen = 0 + 1 = 1

Confirm stage: bank A’s frozen amount is reduced by 1; bank B’s amount is increased by 1 and its frozen amount is reduced by 1.

At this time:

  • A bank account: amount = 999, frozen = 1 – 1 = 0
  • B bank account: amount = 1000 + 1 = 1001, frozen = 1 – 1 = 0

Cancel stage: bank A’s account amount + 1 and frozen – 1; bank B’s frozen – 1.

At this time:

  • A bank account: amount = 999 + 1 = 1000, frozen = 1 – 1 = 0
  • B bank account: amount = 1000, frozen = 1 – 1 = 0
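As a sanity check, the Try/Confirm/Cancel bookkeeping above can be run as a toy simulation (plain Python mirroring the amount/frozen fields; not how ByteTCC actually implements it):

```python
# Toy TCC bookkeeping for the 1-yuan cross-bank transfer.
class Account:
    def __init__(self, amount):
        self.amount, self.frozen = amount, 0

def tcc_try(a, b, n):      # freeze n on A, pre-add n to B's frozen funds
    a.amount -= n; a.frozen += n
    b.frozen += n

def tcc_confirm(a, b, n):  # make the transfer real
    a.frozen -= n
    b.amount += n; b.frozen -= n

def tcc_cancel(a, b, n):   # undo the try phase
    a.amount += n; a.frozen -= n
    b.frozen -= n

# Happy path: try + confirm.
a, b = Account(1000), Account(1000)
tcc_try(a, b, 1)
assert (a.amount, a.frozen) == (999, 1) and (b.amount, b.frozen) == (1000, 1)
tcc_confirm(a, b, 1)
assert (a.amount, a.frozen) == (999, 0) and (b.amount, b.frozen) == (1001, 0)

# Failure path: try + cancel puts both accounts back where they started.
a, b = Account(1000), Account(1000)
tcc_try(a, b, 1)
tcc_cancel(a, b, 1)
assert (a.amount, a.frozen) == (1000, 0) and (b.amount, b.frozen) == (1000, 0)
```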

That’s the whole flow; you should run through the code yourself. It really is quite complicated: many interfaces have to cooperate to finish one piece of business. Imagine writing every project with TCC. Could you bear it?

Again, BASE theory

BASE theory is an acronym for Basically Available, Soft state, and Eventual consistency.

  1. Basically Available: A distributed system that is allowed to lose some of its availability in the event of unpredictable failures.
  2. Soft State: data in the system is allowed to exist in an intermediate state without affecting the system’s overall availability; that is, synchronization between data replicas on different nodes may be delayed.
  3. Eventual Consistency: all data update operations reach a consistent state after a period of synchronization. The essence of eventual consistency is that the system guarantees the data is eventually consistent, rather than strongly consistent in real time.

Its core idea is:

Even if strong consistency cannot be achieved, each application can adopt an appropriate method to achieve eventual consistency according to its own business characteristics.

The account in the TCC scheme above was designed with a frozen field. Isn’t that exactly the intermediate soft state in BASE theory?

Finally

When introducing distributed transactions, weigh the complexity and development cost of the implementation; some places don’t need distributed transactions at all.

In fact, there is no need to do distributed transactions everywhere. For most business we don’t need them: just log and monitor, then handle any problems manually; there won’t be that many in a month. If you are hitting these problems every day, you should probably check your code for bugs.

Financial scenarios basically do use distributed transaction schemes to guarantee consistency; other things, such as membership, points, and product information, may not need them.