Concepts and implementation schemes of distributed transactions

Basic concepts

Transactions

A transaction is a unit of work with well-defined boundaries that consists of multiple computing tasks. A transaction may include interface calls, network communication, data acquisition, and processing. A strict transaction implementation has four properties: atomicity, consistency, isolation, and durability (ACID).

Atomicity: All tasks in a transaction either complete together or fail together. There are no in-between states.

Isolation: The operations of different transactions do not affect each other, and the intermediate state of a concurrent transaction is not visible to other transactions.

Durability: Once a transaction completes, its resulting state is permanent.

Consistency: The resources or data involved in a transaction adhere to certain constraints before and after the transaction. Whether the transaction completes or fails, these constraints continue to hold.

Distributed transaction

In a distributed system, the resources a transaction accesses and the nodes that take part in its computation are deployed on different nodes. Transactions of this kind are called distributed transactions.

From the perspective of overall system architecture, distributed transaction scenarios fall into two categories. In the first, the transaction involves only a single application but multiple data stores, and must access all of them to complete. In the second, the transaction involves multiple applications, each of which may be connected to one or more data stores; the transaction completes only when multiple independent applications cooperate to access multiple data stores.

Consistency

Strictly speaking, consistency is not a property of the transaction itself. The “constraints” that consistency refers to change as business scenarios change.

Ensuring consistency requires more than the corresponding mechanisms at the database level; the application level must be considered first. In a typical transfer between two accounts, for example, the application must make sure that the decrease on the transfer-out account and the increase on the transfer-in account are issued in the same transaction. If either is missing, the business consistency constraint is violated even though a transaction is used.

Once the business logic is right at the application level, look at the database level. In the transfer example, suppose a database's transaction support is faulty: after some failure during a transaction, the balance of the transfer-out account has decreased but the balance of the transfer-in account has not increased. The business consistency constraint is obviously violated. Careful analysis shows, however, that this scenario first of all violates atomicity, because only half of the work that should happen together was done. In this scenario, then, consistency is ultimately guaranteed by atomicity. As another example, consider two concurrent transfer transactions on the same account. If transaction A deducts $100 and transaction B deducts $50, and after both commit the account has been reduced by only $50 instead of $150, the business consistency constraint is again violated. Here consistency is guaranteed by isolation.

In general, then, consistency is about business constraints that sit on top of the data store. From the transaction's point of view, the properties it describes are guaranteed by atomicity and isolation in different scenarios.
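The transfer example can be sketched with a single local database. This is a minimal illustration, assuming an in-memory SQLite database; the table layout and the helper names (open_bank, transfer, balances) are hypothetical. A CHECK constraint stands in for the business rule that a balance may not go negative, and the credit is deliberately issued before the debit so that a failed debit shows atomicity undoing the credit.

```python
import sqlite3

def open_bank():
    # In-memory database; the CHECK constraint models the business rule
    # "a balance may never be negative" (an assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Credit dst and debit src inside ONE local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the constraint; the credit was rolled back with it.
        return False

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

If the debit fails, the already executed credit disappears as well, so the total amount of money across both accounts is unchanged. That is the business constraint being preserved by atomicity.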

Strong consistency and final consistency

Discussing the consistency of transactions, and of distributed transactions in particular, touches two problem domains: distributed systems and transactions. Each has its own definitions, and both are hard to understand. On careful study, the consistency problem discussed in the transaction domain turns out not to be the same problem as the one discussed in the distributed systems domain; their problem scenarios and solution models differ. Unfortunately, the two are often discussed side by side and compared by analogy, which creates confusion.

Because both problems are complex and existing discussions frequently mix them up, consistency is not discussed in depth here.

The following assumptions are made for the consistency problem involved in distributed transactions:

The consistency problems of distributed transactions and of distributed systems are two separate problems.

The consistency discussed for distributed transactions is usually divided into strong consistency and final consistency (more commonly called eventual consistency).

Looking at the system as a whole, strong consistency for distributed transactions is the same problem as transaction consistency: the system satisfies certain business constraints before and after the transaction executes.

Final consistency means that the business consistency constraints may not be satisfied immediately after a distributed transaction commits, but they are guaranteed to be satisfied at some point in the future.

Typical implementations

Strong consistency implementation

Two-phase and three-phase commit

The two-phase commit protocol (2PC) is an atomic commit protocol used to solve the consistency problem in distributed systems. Some distributed transaction solutions use this protocol.

In a two-phase commit protocol, there are two roles:

Coordinator: Usually there is only one coordinator in the system; it coordinates the decision.

Participants: There are many participants in the system, each able to execute local transactions.

The flow of the two-phase protocol can be described as follows:

Phase one, the voting phase

The coordinator asks all participants if they can commit a transaction

Each participant prepares for the transaction, such as locking resources, pre-allocating resources, and logging the results

Each participant returns the result of its preparation to the coordinator; if preparation succeeded, it replies that the transaction can be committed.

Phase two, the commit phase

The coordinator makes a decision based on the results returned by the participants: it commits the transaction if all of them report success, and rolls it back if any participant reports failure.

After receiving notification from the coordinator, the participant performs the corresponding action to complete the final commit or rollback of the transaction.
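The two phases above can be sketched as follows. This is a minimal in-memory illustration, not a production protocol: the Participant class, its can_prepare flag, and the two_phase_commit function are hypothetical names, and the persistent vote log, network calls, and timeouts are all left out.

```python
class Participant:
    """A node that can prepare, commit, and roll back a local transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # hypothetical switch to simulate a "no" vote
        self.state = "INIT"

    def prepare(self):
        # Phase one: lock or pre-allocate resources and record the vote.
        self.state = "PREPARED" if self.can_prepare else "PREPARE_FAILED"
        return self.can_prepare

    def commit(self):
        self.state = "COMMITTED"

    def rollback(self):
        self.state = "ROLLED_BACK"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase one: ask every participant, even if an earlier one already voted no.
    votes = [p.prepare() for p in participants]

    # Phase two: broadcast the single global decision.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

With two participants that both prepare successfully, the function returns True and both end in the COMMITTED state; if either one is created with can_prepare=False, both end in ROLLED_BACK.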

2PC makes the following assumptions:

Any two nodes can communicate with each other.

All participants guarantee that local transactions can be successfully committed or rolled back, that is, participants themselves have the ability to manage local transactions.

Participants write the results to the persistent log before the first phase returns, so that the results of the vote are not lost even if the node fails at this point.

No node fails permanently; that is, every node can recover from a failure.

As you can see from the description of 2PC above, the protocol itself has some drawbacks:

Because resources must be pre-allocated or locked in the first phase, they stay occupied until the entire transaction completes, resulting in low overall concurrency.

Network timeouts. Timeouts can occur in both phases. In the first phase, if a participant's reply times out, the coordinator can treat it as a failure and roll back the transaction. In the second phase, if a participant's final commit/rollback acknowledgement times out, the coordinator can retry, or it can remove that participant from the cluster, so that the final data is still consistent. The more troublesome case is when the coordinator cannot deliver its final decision to the participants in the second phase: lacking the decision, the participants do not know what to do next and are left blocked, unable to decide on their own.

To address these problems in two-phase commit, the three-phase commit protocol (3PC) was proposed. Its three phases are as follows:

Voting phase: The coordinator asks the participants whether the commit can be performed, and each participant returns its answer.

Pre-commit phase: If all participants return success, the coordinator issues a pre-commit instruction to all participants. Participants do not commit yet, but lock the corresponding resources and return the pre-commit result.

Commit phase: If all participants return success in the pre-commit phase, the coordinator issues the final commit instruction and the participants perform the final commit.

Unlike two-phase commit, participants do not lock resources during the voting phase, which avoids the performance degradation caused by locking in that phase. If a timeout occurs in the final commit phase, a participant that has already received the pre-commit instruction can assume that all other participants also agreed to commit, so it can change its status to committed directly.
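The timeout rule for a 3PC participant can be sketched as a small state function. This is an illustrative sketch only; the State enum and on_coordinator_timeout are hypothetical names, and a real implementation would also need persistent state and recovery logic.

```python
from enum import Enum

class State(Enum):
    INIT = "init"                    # nothing received yet
    VOTED_YES = "voted_yes"          # answered the voting phase, no pre-commit yet
    PRE_COMMITTED = "pre_committed"  # received pre-commit, resources locked
    COMMITTED = "committed"
    ABORTED = "aborted"

def on_coordinator_timeout(state):
    """Decide what a participant does when the coordinator goes silent."""
    if state is State.PRE_COMMITTED:
        # Pre-commit is only sent after every participant voted yes,
        # so committing by default is assumed to be safe.
        return State.COMMITTED
    if state in (State.INIT, State.VOTED_YES):
        # No evidence that everyone agreed: give up, nothing is locked yet.
        return State.ABORTED
    return state  # already finished, a timeout changes nothing
```

The key difference from 2PC is visible in the first branch: a participant that times out after pre-commit can decide on its own instead of blocking.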

DTP and XA

Distributed Transaction Processing (DTP) is a distributed transaction model defined by the X/Open standards organization. It specifies a logical architecture consisting of an Application Program (AP), multiple Resource Managers (RMs), and a Transaction Manager (TM). (Extending this appropriately, a distributed system can be composed of multiple TM domains, each consisting of its own independent AP, RMs, and TM. Together the TM domains can form a global transaction, with each TM domain containing one branch transaction.) The AP initiates the transaction, the TM coordinates the RMs to complete the distributed transaction, and each RM manages its local transactions independently. Below is a schematic diagram of the DTP model.

The transaction commit protocol of DTP is two-phase commit (more precisely, two-phase commit with presumed rollback). The XA protocol specifies the interface for bidirectional interaction between the TM and the RMs.

Final consistency implementation

Compensation for subtransactions

One-phase commit + compensation

Sharding-JDBC describes a pattern that combines one-phase commit with compensation. (Sharding-JDBC calls it best effort delivery; that name is not used here, to distinguish the pattern from the message queue pattern described later.) It is shown as follows:

In this mode, no transaction manager is needed, and the transaction participants each commit their local transactions without affecting one another. Before the transaction executes, the framework records the SQL statements of all participants and writes them to a transaction library that is independent of the business libraries. It then monitors the result of each local transaction. If one fails, the framework retries it a certain number of times; if it still fails, an independent asynchronous service later reads the record and retries again.

Obviously, this pattern assumes that the transaction will eventually succeed after some number of retries. An obvious weakness is that it introduces the transaction library into the whole process: if the transaction library itself fails while the transaction is executing, the retry mechanism can no longer be guaranteed.
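The record, execute, and retry flow can be sketched as follows. This is not Sharding-JDBC's actual API; run_with_retries and async_retry are hypothetical functions, a plain dictionary stands in for the transaction library, and each local transaction is modeled as a zero-argument callable.

```python
def run_with_retries(operations, max_retries=3):
    """operations maps a name to a callable that commits one local transaction."""
    # Record every participant BEFORE executing anything (the "transaction library").
    tx_log = {name: "PENDING" for name in operations}
    for name, op in operations.items():
        for _attempt in range(1 + max_retries):
            try:
                op()
            except Exception:
                continue  # synchronous retry
            tx_log[name] = "DONE"
            break
    return tx_log

def async_retry(operations, tx_log):
    """Later pass by an independent service over whatever is still pending."""
    for name, status in tx_log.items():
        if status == "PENDING":
            try:
                operations[name]()
                tx_log[name] = "DONE"
            except Exception:
                pass  # stays PENDING for the next pass
    return tx_log
```

Entries that remain PENDING after the synchronous retries are exactly the ones the asynchronous service picks up later. Because an operation may be retried after a partial failure, the sketch assumes each operation is safe to repeat.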

Message based: reliable message / best effort delivery

Based on the idea of final consistency, some distributed transaction solutions are built by introducing a messaging service. There are many variants, which are classified by whether the compensation responsibility belongs to the message producer or to the consumer. They are called reliable message and best effort delivery, respectively.

Reliable message

In the reliable message mode, transaction compensation is the responsibility of the message producer, who is also the active party of the business. Its mode is shown as follows:

Its steps are summarized as follows:

The active party commits business data and message data in the same local transaction.

After the local transaction commits, the active party's business service notifies the passive party through the real-time message service. After the notification succeeds, the real-time message service deletes the message data.

If the notification fails, the message recovery system scans for failed notifications and retries them through the message service.

After receiving the notification, the passive party executes its local transaction and commits it.

As you can see, this model assumes a few things:

The real-time message service and the message recovery system never fail unrecoverably.

The business processing of the active party does not depend on the passive party.

The passive party must keep its business processing idempotent.
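The first reliable message scheme can be sketched with a message table that lives in the same database as the business data. This is a minimal illustration, assuming an in-memory SQLite database; the table names (orders, outbox) and the helpers (place_order, relay, pending, IdempotentConsumer) are hypothetical, and a plain function call stands in for the message service.

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT)"
    )
    return conn

def place_order(conn, order_id):
    # Business data and message data are committed in the SAME local transaction.
    with conn:
        conn.execute("INSERT INTO orders (id) VALUES (?)", (order_id,))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)", (f"order_created:{order_id}",)
        )

def relay(conn, deliver):
    """Scan undelivered messages; delete each one only after delivery succeeds."""
    rows = conn.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for msg_id, payload in rows:
        try:
            deliver(payload)
        except Exception:
            continue  # message stays in the table and is retried on the next scan
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (msg_id,))

def pending(conn):
    return [row[0] for row in conn.execute("SELECT payload FROM outbox ORDER BY id")]

class IdempotentConsumer:
    """Passive party: applying the same message twice has no extra effect."""
    def __init__(self):
        self.applied = []
    def handle(self, payload):
        if payload not in self.applied:
            self.applied.append(payload)
```

Calling relay repeatedly plays the role of the recovery scan. Because a message may be delivered more than once (for example, if deletion fails after delivery), the consumer must be idempotent, which is the third assumption in the list above.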

In this mode, because the active party takes responsibility for transaction compensation, the implementation cost for the passive party is low. The active party, however, must maintain a compensation mechanism outside its main business flow, which couples that mechanism to the main business. The following reliable message implementation removes this coupling:

Its steps are summarized as follows:

Before committing its local transaction, the active party sends a request to the real-time message service, which records the message as pending.

After the request succeeds, the active party commits its local transaction and, on success, sends a confirmation to the real-time message service. If the local transaction fails to commit, the active party tells the service to cancel the message.

The real-time message service sends or discards the message according to the confirm or cancel instruction. The subsequent steps are the same as above.

If the active party sends neither a confirm nor a cancel instruction, the message status confirmation system queries the active party's business service for the status of the message.

Compared with the first scheme, this implementation introduces the following additional costs:

In one transaction, the active party must interact with the real-time message service twice.

The active party must implement a message status query interface.
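The prepare, confirm or cancel, and status check flow of this second scheme can be sketched as follows. The MessageService class, its method names, the status strings, and run_business are all hypothetical; delivery to the passive party is reduced to appending to a list.

```python
class MessageService:
    def __init__(self):
        self.pending = {}    # msg_id -> payload, recorded but not yet sendable
        self.delivered = []  # payloads handed over to the passive party

    def prepare(self, msg_id, payload):
        self.pending[msg_id] = payload

    def confirm(self, msg_id):
        payload = self.pending.pop(msg_id, None)
        if payload is not None:
            self.delivered.append(payload)

    def cancel(self, msg_id):
        self.pending.pop(msg_id, None)

    def check_unresolved(self, query_status):
        """Status confirmation: ask the active party about messages left pending."""
        for msg_id in list(self.pending):
            status = query_status(msg_id)
            if status == "COMMITTED":
                self.confirm(msg_id)
            elif status == "ROLLED_BACK":
                self.cancel(msg_id)
            # any other answer leaves the message pending for the next check

def run_business(service, msg_id, payload, local_tx):
    """Active party: two interactions with the message service per transaction."""
    service.prepare(msg_id, payload)   # interaction 1, before the local commit
    try:
        local_tx()
    except Exception:
        service.cancel(msg_id)
        return False
    service.confirm(msg_id)            # interaction 2, after the local commit
    return True
```

The query_status callback is the message status query interface that the active party has to provide; it is only used when the confirm or cancel instruction was lost.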

Best effort delivery

Unlike the reliable message mode, in the best effort delivery mode the active party only makes a limited number of delivery attempts (its best effort), and responsibility for guaranteeing final consistency lies with the passive party. In this mode, the modification cost for the passive party is higher.

As in the reliable message mode, the active party's business must not depend on the passive party. The passive party, however, needs to implement a periodic reconciliation system that regularly queries the active party, obtains the messages lost within a given period, and compensates for the affected transactions.
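The reconciliation duty of the passive party can be sketched as follows. This is an illustrative sketch under stated assumptions: ActiveParty, PassiveParty, events_since, and reconcile are hypothetical names, and monotonically increasing sequence numbers stand in for the time window that a real reconciliation job would query.

```python
class ActiveParty:
    def __init__(self):
        self.events = []  # list of (sequence number, payload)

    def record(self, payload):
        self.events.append((len(self.events) + 1, payload))

    def events_since(self, seq):
        """Query interface offered to the passive party for reconciliation."""
        return [event for event in self.events if event[0] > seq]

class PassiveParty:
    def __init__(self):
        self.applied = {}  # seq -> payload

    def on_message(self, seq, payload):
        # Idempotent: a message that was already applied is ignored.
        self.applied.setdefault(seq, payload)

    def reconcile(self, active, since_seq=0):
        """Periodic job: fetch the active party's events and apply the lost ones."""
        recovered = 0
        for seq, payload in active.events_since(since_seq):
            if seq not in self.applied:
                self.on_message(seq, payload)
                recovered += 1
        return recovered
```

If the active party recorded three events but only the first and third were delivered, one call to reconcile recovers the second, and a further call finds nothing left to do.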

As you can see, both the one-phase commit + compensation scheme and the message-based schemes share a problem. If the compensation mechanism itself fails (100% availability cannot be guaranteed in a real project), business inconsistencies will arise and affect the business. For this case, consider adding persistent logging to the system to record the status of all pending transactions, with manual intervention where necessary.

Compensating transaction pattern

One problem that final consistency implementations face is how to restore resources when a transaction fails for any reason. Because final consistency is adopted to improve system concurrency, resources are usually not locked. In a concurrent scenario, therefore, a resource cannot simply be rolled back to its earlier value, because other accesses have probably modified it in the meantime. In some scenarios a resource cannot even be restored exactly to its original state; for example, for some scarce resources, users have to pay a fee to cancel after purchase.

To solve these problems, the compensating transaction pattern is introduced: a separate operation is defined that performs compensation and completes the corresponding recovery after the transaction has failed.

Note that compensation operations can also fail, so they must be idempotent in order to be retried safely. Even after retries, compensation may still fail because of a serious fault. To ensure high reliability, the system can therefore record compensation failures in a log or similar facility and apply manual intervention or other measures after the fault has been repaired.

TCC

TCC, short for Try-Confirm-Cancel, is a distributed transaction implementation mode that was proposed by Alipay in 2008 and has since been widely adopted. Here is a detailed description of the TCC mode:

Similar to the two-phase commit protocol, TCC divides transaction execution into two phases. If every participant's Try operation in the first phase replies that the transaction can be executed, the Confirm operation is performed to commit the transaction. If any Try operation in the first phase fails, the Cancel operation is performed to cancel the transaction. The operations of the two phases are defined as follows:

Phase 1, the Try operation (attempt to execute the business):

Complete all business checks (consistency)

Reserve the necessary business resources

Phase 2, the Confirm operation (confirm execution of the business):

Actually execute the business

Perform no further business checks

Use only the business resources reserved in the Try phase

The Confirm operation must be idempotent

Phase 2, the Cancel operation (cancel execution of the business):

Release the business resources reserved in the Try phase

The Cancel operation must be idempotent

In the above description, “reserving the necessary business resources” and “actually executing the business” are both rather general descriptions. Judging from the available material, the concrete implementation behind each varies with the actual business scenario.

Consider a scenario in which a user completes some business operation and is then granted points, assuming that points are managed by a separate service. From a business point of view, granting points requires no database-level resource reservation to ensure that the subsequent Confirm operation succeeds. One approach is to record the points to be added during Try, for example in an extra field in the table. It is even possible to do nothing in Try and simply update or insert the data in the final Confirm phase.

Now consider a transfer scenario in a distributed application. In the Try phase, reserving the necessary business resources for the transfer-out account means ensuring, as far as possible, that the subsequent Confirm operation will succeed. If Try does nothing, or only records the amount to be deducted, then under concurrency the subsequent Confirm operation may well fail because of business rules (for example, the balance is no longer sufficient). It is therefore necessary to deduct the amount from the transfer-out account directly in Try, so that the Confirm phase needs no actual operation. If the transaction has to be cancelled because of some exception, the Cancel phase simply adds the corresponding amount back.

Unlike 2PC, where the TM directly coordinates the RMs to commit or roll back, the TCC mode defines the two phases of a distributed transaction in terms of application-layer operations, and what each phase does has to be decided by the actual business, so a unified implementation is basically impossible. Adopting TCC therefore requires more work at the application level, and retrofitting an existing project at that level brings relatively high development costs.

It can be seen that the TCC mode is a distributed transaction implementation that satisfies final consistency and draws on the ideas of both the compensating transaction pattern and two-phase commit. Its advantages are that resources are not locked in the first phase, which allows high concurrency, and that implementing both phases at the application level gives high business flexibility. Its disadvantages are that it does not satisfy strong consistency and that its development cost is high.
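The transfer scenario can be sketched as follows. This is a minimal in-memory illustration, not a TCC framework: TccAccount, its methods, and tcc_transfer are hypothetical names. The transfer-out side follows the text (deduct in Try, nothing to do in Confirm, add back in Cancel); the handling of the transfer-in side, which only records the amount in Try and adds it in Confirm, is an assumption modeled on the points example.

```python
class TccAccount:
    def __init__(self, balance):
        self.balance = balance
        self.holds = {}    # txid -> amount already deducted by Try
        self.credits = {}  # txid -> amount recorded by Try, added on Confirm

    def try_debit(self, txid, amount):
        if txid in self.holds:
            return True            # repeated Try for the same transaction
        if amount > self.balance:
            return False           # business check fails
        self.balance -= amount     # reserve by deducting immediately
        self.holds[txid] = amount
        return True

    def try_credit(self, txid, amount):
        self.credits[txid] = amount  # only record; nothing is visible yet
        return True

    def confirm(self, txid):
        # Idempotent: pop() makes a second call a no-op.
        self.holds.pop(txid, None)
        self.balance += self.credits.pop(txid, 0)

    def cancel(self, txid):
        # Idempotent: the reserved amount is returned at most once.
        self.balance += self.holds.pop(txid, 0)
        self.credits.pop(txid, None)

def tcc_transfer(txid, src, dst, amount):
    if src.try_debit(txid, amount) and dst.try_credit(txid, amount):
        src.confirm(txid)
        dst.confirm(txid)
        return True
    src.cancel(txid)
    dst.cancel(txid)
    return False
```

Keying every reservation by a transaction id is what makes Confirm and Cancel safe to retry: once the entry has been consumed, repeating the call changes nothing.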

Saga

The term Saga was first proposed at Princeton University in the late 1980s for long-lived transactions. It refers to a global transaction composed of many subtransactions that have no strong dependencies among them.

Saga completes the entire transaction by coordinating the calls to each subtransaction Ti, and completes rollback through a compensating transaction Ci defined for each Ti. Suppose a global transaction consists of T1 … Tn. When an exception occurs while executing Tm (m <= n), Saga coordinates the compensating transactions to roll back the whole transaction. Viewed as a whole, the execution sequence may be T1 … Tm, Cm … C1. (If there are no strong dependencies, the compensations from Cm to C1 need not run in the exact reverse order of the calls from T1 to Tm.)
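The execute-then-compensate sequence can be sketched as follows. This is a minimal in-process illustration that ignores messaging; run_saga is a hypothetical function, and each subtransaction and compensation is modeled as a zero-argument callable. One simplifying assumption differs slightly from the sequence above: a subtransaction that raises is assumed to have left no effects, so only the subtransactions that completed are compensated.

```python
def run_saga(steps):
    """steps is a list of (action, compensation) pairs of zero-argument callables."""
    completed = []  # compensations for subtransactions that have finished
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Roll back by compensating in reverse order of completion.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True
```

For three steps where the third one fails, the observable order of effects is T1, T2, C2, C1. In a real system each compensation must also be idempotent and retried until it succeeds.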

Saga has two implementation modes: choreography and orchestration. Both need a stable and reliable message service, and the reliability of the overall transaction depends on the availability of that message service.

In choreography mode, each transaction participant publishes the success or failure of its subtransaction to the message service. The other participants subscribe to the messages they are interested in, and the messages drive the whole transaction to completion or rollback.

In orchestration mode, an orchestrator is introduced to schedule the whole transaction. The orchestrator publishes a message at each stage of the transaction, and the participants consume the messages and return their execution results to the orchestrator. Based on the results, the orchestrator either schedules the next stage or rolls back the transaction.

Absence of isolation and its countermeasures

In all final consistency modes, there is no isolation between concurrent global transactions. Until a global transaction finally commits, other transactions can see its intermediate state (even though that state has already been committed by the local subtransactions).

Because isolation is not satisfied, the following problems arise:

Lost update (update overwrite): One transaction overwrites another transaction’s update, resulting in a business exception.

For example, placing an order requires creating the order, then checking the inventory, and finally completing order creation. Suppose the user places the order and then immediately goes back and cancels it. Because the timing is nondeterministic, the order of execution might look like this:

Generate the order with the status CREATING

Invoke the inventory service to check that the inventory is sufficient

Cancel the order; the status changes to CANCELED

The inventory check passes, and the status changes to CREATED

The user intended to cancel the order, but the order ends up successfully created.

Dirty read: a transaction reads the uncommitted state of another transaction.

For example, for a customer’s account, a refund service and a withdrawal service may be executed in the following order:

Refund transaction, increase customer account limit

Call the order service to modify the order status

Withdrawal transaction, check that the amount meets the withdrawal limit and return successfully

Modifying the order status fails, so the increase of the customer account limit is undone

In the end, the refund transaction did not take place, but a withdrawal beyond the limit was successfully completed, causing business risks.

Non-repeatable reads and phantom reads: a transaction gets different values from two reads of the same data, or reads data it should not see, because another transaction has modified the data in between. Both are relatively harmless to the business itself.

These problems can be addressed at design time with corresponding countermeasures. For example, lost updates (update overwrites) can be prevented with a semantic lock: in the example above, while the order status is CREATING, a cancel request is not allowed to update the status to CANCELED. For dirty reads, the order of execution can be adjusted, for example by moving the limit increase to the end of the refund transaction. Phantom reads can be addressed by materializing conflicts, and so on.
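The semantic lock for the order example can be sketched as follows. This is an illustrative sketch; the Order class, its method names, and the REJECTED status are hypothetical. The CREATING status itself acts as the lock: while it is set, a cancel request is refused instead of overwriting the status.

```python
class Order:
    def __init__(self):
        self.status = "CREATING"  # semantic lock held by the create-order transaction

    def cancel(self):
        if self.status == "CREATING":
            return False          # locked: the caller has to retry later
        if self.status == "CREATED":
            self.status = "CANCELED"
            return True
        return False              # already canceled or rejected

    def finish_creation(self, inventory_ok):
        if self.status != "CREATING":
            return False          # never overwrite a state set by someone else
        self.status = "CREATED" if inventory_ok else "REJECTED"
        return True
```

With this guard, the interleaving from the example can no longer end with a canceled order being silently turned into a created one: the early cancel is refused, and once the order is CREATED the cancel goes through normally.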

Comprehensive comparison

Based on the above discussion, each solution for global transactions has its own characteristics, and none is optimal in every case, so an appropriate mode must be chosen for the actual application scenario. In general:

Two-phase commit is a strong consistency implementation and suits scenarios that require strong consistency. Its problem is low concurrency. Because real business scenarios often demand high concurrency, it is rarely used.

All solutions except two-phase commit are final consistency implementations. Their global transactions do not meet isolation requirements, and some implementations do not meet atomicity either (when compensation of a subtransaction fails). In practice, countermeasures for these problems must be designed around the actual business.

The reliable message, best effort delivery, and one-phase commit + compensation schemes are relatively unintrusive to the business itself; final consistency is achieved by adding compensation services outside the main business. One-phase commit + compensation requires no additional middleware and is the simplest to implement. For reliable message and best effort delivery, the main modification cost falls on the active party and the passive party respectively, which suits scenarios where the interacting systems cannot all be modified together.

The TCC and Saga schemes are closer to real business scenarios and more complete, but they are highly intrusive to the business and are not well suited to retrofitting existing applications.
