In any era, people who know how to learn are never shortchanged.

Hello, I’m Yes.

Today I’d like to take a look at distributed transactions: I’ll introduce some common implementations, their advantages and disadvantages, their applicable scenarios, and some of their variants.

I’ll also look at the improved 2PC model used in distributed databases to see how they work.

Then I will analyze the implementation of Seata, a distributed transaction framework, to see how distributed transactions are actually put into practice; after all, a protocol is only useful once it lands in real code.

First, let’s clarify what a transaction and a distributed transaction are.

Transactions

The ACID properties of a transaction are well known: the strict definition requires a transaction to be atomic, consistent, isolated, and durable.

However, strict transactions are difficult to achieve. As we know, databases offer various isolation levels, and the higher the isolation level, the lower the performance, so in practice we find our own balance rather than strictly adhering to ACID.

And in everyday usage, “transaction” often simply means a series of operations that either all succeed or all fail, with no partial success.

Now that we have a clear definition of what a transaction is, let’s look at what a distributed transaction is.

Distributed transactions

Due to the rapid development of the Internet, the old monolithic architecture can no longer withstand so much demand, such complex business, and such heavy traffic.

The advantage of a monolithic architecture is that it can be built and launched quickly in the early stages, and methods and modules are called internally, which is efficient because there is no network overhead.

Deployment is also easy: it’s just one package, so you throw it on a server and you’re done.

However, as the business grows, its complexity keeps increasing and internal coupling becomes severe, which makes development and testing difficult.

Moreover, a monolith cannot be scaled dynamically based on hot services. For example, if the product service receives very heavy traffic, with a monolithic architecture we can only replicate the entire application across multiple instances, which wastes resources.

So the split is inevitable, and this is where the microservices architecture comes in.

After splitting, the boundaries between services are clear, and each service can run independently and deploy independently, so it can scale flexibly at the service level.

Local calls between services become remote calls, the call chains are longer, and a single call takes more time, but the overall throughput is greater.

However, the split introduces other complexity: service-link tracing, overall monitoring, fault tolerance, elastic scaling and other O&M concerns, as well as business-level issues such as distributed transactions and distributed locking.

Often solving one pain point introduces others, so architecture evolution is a matter of trade-offs: it depends on which pain points your system can better tolerate.

And today we’re talking about distributed transactions.

A distributed transaction is composed of multiple local transactions. It spans multiple machines connected by an unreliable network, so you can imagine that the road to implementing strict transactions is long and full of obstacles.

Even stand-alone transactions don’t strictly implement ACID, let alone distributed ones, so in reality we can only implement incomplete versions of transactions.

With transactions and distributed transactions defined, let’s look at the common distributed transaction solutions: 2PC, 3PC, TCC, the local message table, and transaction messages.

2PC

2PC: Two-phase commit protocol. It introduces a transaction coordinator role to manage each participant (that is, each database resource).

It is divided into two phases, the preparation phase and the commit/rollback phase.

Let’s look at the first stage, the preparation stage.

The transaction coordinator sends a prepare command to each participant, and after receiving the command, each participant performs the relevant transaction action. You can assume that everything has been done except the commit of the transaction.

Each participant then returns a response informing the coordinator whether it prepared successfully.

After collecting the responses, the coordinator moves to the second phase: if any participant reported failure, it sends a rollback command to all participants; otherwise it sends a commit command to all of them.

This protocol matches normal intuition, much like roll call in college: the teacher is the coordinator, and we are all participants.

The teacher calls the roll one by one, and we answer one by one. When the teacher has heard from all the students, today’s lecture begins.

Unlike roll call, the teacher can continue the class when certain students are absent, which our business does not allow.

If the coordinator does not receive a response from some participant in the first phase, it considers the transaction failed after a timeout and sends a rollback command; so the coordinator in 2PC has a timeout mechanism.
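
The flow above can be sketched as a toy coordinator. This is a minimal in-memory simulation, not a real implementation: the `Participant` class and its method names are made up for illustration, and a real coordinator would treat a timeout as a “no” vote.

```python
# Toy 2PC: prepare all participants, then commit or roll back everyone.

class Participant:
    def __init__(self, name, will_prepare_ok=True):
        self.name = name
        self.will_prepare_ok = will_prepare_ok
        self.state = "initial"

    def prepare(self):
        # Do everything except commit; local resources are now locked.
        self.state = "prepared" if self.will_prepare_ok else "aborted"
        return self.will_prepare_ok

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    # Phase 1: send prepare to every participant and collect the votes.
    votes = [p.prepare() for p in participants]
    all_prepared = all(votes)
    # Phase 2: commit only if every participant voted yes, else roll back.
    for p in participants:
        if all_prepared:
            p.commit()
        else:
            p.rollback()
    return all_prepared
```

If a single participant fails to prepare, every participant receives a rollback, which is exactly the all-or-nothing behavior described above.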

Let’s look at the pros and cons of 2PC again.

The advantage of 2PC is that it can use the database’s own ability to commit and roll back local transactions. That is, the actual commit and rollback operations do not need to be implemented by us, so the protocol does not intrude on the business logic. You’ll understand this better after the TCC section.

2PC has three major disadvantages: synchronous blocking, single point of failure and data inconsistency.

Synchronous blocking

You can see that after the prepare command in the first phase, each local resource is locked, because we have done everything except commit the transaction.

So if another request accesses the same resource, for example modifying the product row with ID 100, that request blocks until the previous transaction completes and the commit/rollback command releases the resource.

So if a distributed transaction involves many participants, and some of them are complex and slow, the fast ones have to wait, which is inefficient.

Single point of failure

As you can see, the single point is the coordinator, and if the coordinator dies, the entire transaction will not execute.

It’s ok if the coordinator hangs up before sending the prepare command, after all, each resource has not executed the command yet, so the resource is not locked.

But if the coordinator crashes after sending prepare, the resources stay locked until the transaction is resolved; if a hotspot resource is blocked this way, it’s GG (game over).

Data inconsistency

The coordinator and the participants communicate over the network, and the network can be erratic or suffer partial failures.

It is possible that some participants receive the coordinator’s request and some do not. For example, if the request is a commit, only the participants that received it commit their transactions, which creates data inconsistency.

2PC summary

To summarize: 2PC is a synchronous blocking, strongly consistent two-phase commit protocol, consisting of a prepare phase and a commit/rollback phase.

The advantage of 2PC is that it does not intrude on the business: it can use the database’s own mechanisms to commit and roll back transactions.

Disadvantages: it is a synchronous blocking protocol, resulting in high latency and performance degradation, as well as coordinator single point of failure, and in extreme cases, data inconsistency.

Of course, this is just a protocol; concrete implementations can be flexible. For the coordinator single-point problem, for example, the coordinator can be run as a primary with standbys.

An improved 2PC model from distributed databases

Some of you may not be familiar with distributed databases; that’s ok, we’re here mainly to learn the ideas and see how others think.

Let me briefly talk about the Percolator model, which is based on the distributed storage system BigTable. It doesn’t matter if you don’t know what BigTable is.

Let’s say I have $200 and you have $100. (I’ll draw this in a simplified way just to highlight the point.)

And I’m going to transfer you $100.

At this point the transaction manager initiates the prepare request, and then I have less money in my account, you have more money in your account, and the transaction manager logs the operation.

At this point, the data is still private, other transactions can not read, simple understanding of the Lock on the value is still private.

You can see that my record’s Lock is marked as the primary (PK), while your record’s lock is a pointer to my record; the primary is chosen randomly.

The transaction manager then issues a commit command for the record chosen as primary.

This removes the lock on my record, which means my record is no longer private and can be accessed by all other transactions.

But wait, your record still has a lock on it; doesn’t it need updating?

It doesn’t need to be updated immediately: when another transaction accesses your record, it follows the pointer to my record, finds that it has been committed, and so your record becomes accessible too.

Some will say this is inefficient, since every read requires an extra lookup. Don’t worry.

A background thread scans and cleans up these lock records.

Now that’s solid.
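
The primary/secondary lock trick can be sketched like this. This is a toy illustration of the idea only; the field names (`lock`, `committed`, `points_to`) are made up, and real Percolator stores locks and writes in separate BigTable columns with timestamps.

```python
# Percolator-style commit: one record is the primary; the other locks point
# at it. Commit clears only the primary's lock, so there is a single atomic
# commit point for the whole transaction.

def percolator_commit(records, primary_key):
    # Prepare phase: lock every record; secondaries point at the primary.
    for key, rec in records.items():
        rec["lock"] = "primary" if key == primary_key else ("points_to", primary_key)
    # Commit phase: this one write is the atomic commit point.
    records[primary_key]["lock"] = None
    records[primary_key]["committed"] = True

def is_visible(records, key):
    # A reader that finds a lock follows the pointer: if the primary has
    # committed, the secondary is readable (its lock is cleaned up lazily).
    rec = records[key]
    if rec.get("lock") is None:
        return True
    lock = rec["lock"]
    if isinstance(lock, tuple) and lock[0] == "points_to":
        return records[lock[1]].get("committed", False)
    return False
```

Note how the secondary record becomes visible without ever being touched by the commit: the reader resolves its status through the pointer, which is exactly the “follow the pointer to my record” step described above.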

Improvements over 2PC

First of all, Percolator does not need to interact with all participants during the commit phase: it only needs to commit the primary record, so the commit point is atomic! This solves the data inconsistency problem.

The transaction manager also logs its operations, so a newly elected transaction manager can learn the current state from the log after the old one crashes and continue the work, resolving the single point of failure.

Percolator also has a background thread that scans for transaction status and rolls back transactions on each participant after the transaction manager goes down.

You can see there are many clever improvements over 2PC.

Distributed databases have other transaction models too, but I’m not familiar enough with them to comment; interested readers can look into them.

It’s a good way to broaden the mind.

XA specification

Let’s go back to 2PC, and while we’re at it, briefly cover the XA specification, which is an implementation of the two-phase commit protocol.

Before the XA specification, we should mention the DTP model (Distributed Transaction Processing), which specifies the design of a distributed transaction model.

The XA specification limits the interaction between the transaction manager (TM) and the resource manager (RM) in the DTP model. Simply put, the two of you need to communicate according to certain format specifications!

Let’s first look at the DTP model with XA constraints.

  • AP (application program): our application, the initiator of the transaction.
  • RM (resource manager): simply put, the database; it has transaction commit and rollback capability, and corresponds to a participant in the 2PC above.
  • TM (transaction manager): the coordinator, which communicates with each RM.

In simple terms, the AP defines transaction operations through the TM, the TM and the RMs communicate via the XA specification to perform the two-phase commit, and the AP obtains its resources from the RMs.

The model has three roles, but in an actual implementation one component can play two of them: for example, the AP can implement the TM’s functions, so the TM does not need to be extracted and deployed separately.

MySQL XA

Now that we know about DTP, let’s look at how XA works in MySQL; note that only the InnoDB engine supports it.

In simple terms, you define a globally unique XID and then tell each transaction branch what to do.

You can see that the two operations in the figure are a name change and a log insert; you essentially register what to do first, wrapping the SQL to be executed between XA START xid and XA END xid.

Then you send the prepare command to perform the first phase: the phase that does everything except commit the transaction.

Then, depending on the preparation, you choose to execute the commit or rollback transaction command.
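
For one transaction branch, the statement sequence a TM would drive looks like this. The sketch below just assembles that sequence as strings; the xid and the business SQL in the middle are illustrative placeholders.

```python
# The MySQL XA command sequence for a single transaction branch.

def xa_branch_statements(xid, business_sql, commit=True):
    return [
        f"XA START '{xid}'",
        business_sql,                   # e.g. the name change or log insert
        f"XA END '{xid}'",
        f"XA PREPARE '{xid}'",          # phase 1: everything except commit
        f"XA COMMIT '{xid}'" if commit  # phase 2, decided by the TM
        else f"XA ROLLBACK '{xid}'",
    ]
```

Each branch registers the same globally unique xid, and the TM decides commit versus rollback for all branches based on the prepare results.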

That’s basically the process, but it’s important to note that MySQL XA doesn’t perform very well.

As you can see, although 2PC has drawbacks, there are still real-world implementations based on it. 3PC was introduced to address some of 2PC’s drawbacks, but it costs more overall and still cannot handle network partitions; I have not found any production implementation of 3PC.

But I’ll mention it a little bit, just so you know, purely theoretical.

3PC

3PC was introduced to solve 2PC synchronization blocking and reduce data inconsistency.

3PC adds an extra inquiry phase, so it has three phases: prepare, pre-commit and commit.

The prepare phase simply has the coordinator ask each participant: are you ok? Can you take this request?

Pre-commit is basically the 2PC prep phase, doing everything except committing transactions.

The commit phase is consistent with the 2PC commit.

The extra phase confirms whether the participants are healthy before the transaction executes, to prevent the situation where some participants are abnormal while the others have already executed the transaction and locked resources.

The intention is good, but most of the time everything is normal, so an extra round of interaction on every transaction is wasteful.

3PC also introduces a timeout on the participant side: if a participant that has already reached the commit phase hears nothing from the coordinator for a long time, it automatically commits the transaction.

But what if the coordinator had actually sent a rollback command? You see, the data is inconsistent again.

Wikipedia notes that after the 2PC prepare phase, if the coordinator dies, the participants don’t know the overall state, because only the coordinator has the global view; the participants don’t know how the others fared.

In 3PC, the first phase has already confirmed everyone, so even if the coordinator dies, a participant that reached the pre-commit phase knows that all participants gave their approval during the prepare phase.

Put simply, it’s like adding a fence to unify the states of the participants.

Summary of 2PC and 3PC

As we know from the above, 2PC is a strongly consistent synchronization blocking protocol, and its performance is already poor.

The 3PC’s starting point is to solve the shortcomings of 2PC, but each additional phase is an extra communication overhead, and in most cases, useless communication.

Although the participant timeout solves the blocking problem when the coordinator hangs, the data can still become inconsistent.

As you can see, 3PC is not a real breakthrough, and its performance is worse, so only 2PC has actual implementations.

Again, 2PC and 3PC are protocols, which can be considered guiding ideas; real implementations still differ.

TCC

I don’t know if you’ve noticed, but both 2PC and 3PC rely on the database for transaction commit and rollback.

Sometimes it’s more than just a database, maybe it’s sending a text message, maybe it’s uploading a picture.

Therefore, transaction commit and rollback should be promoted to the business level rather than the database level, and TCC is a two-phase commit at the business level or application level.

TCC consists of three methods, namely Try, Confirm and Cancel, which must be written at the business level. TCC is mainly used for data consistency in business operations that span databases and services.

TCC is divided into two phases. The first phase is to check and reserve resources, that is, to Try resources. The second phase is to commit or roll back resources.

For example, if I have a debit service, I need to write a Try method to freeze the withheld funds, a Confirm method to perform the actual deduction, and a Cancel method to roll back the frozen operation. All services for a corresponding transaction need to provide these three methods.

You can see that one method expands into three, so TCC is very intrusive to the business; for example, if your table has no frozen-amount field, you need to change the table structure.
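
A debit service under TCC might look like the sketch below. The class and method names are made up for illustration, and a real service would persist balance and frozen amounts in a database rather than in memory.

```python
# TCC sketch for a debit service: Try freezes funds, Confirm actually
# deducts, Cancel releases the freeze.

class DebitService:
    def __init__(self, balance):
        self.balance = balance
        self.frozen = 0      # the extra "frozen" field TCC forces on you

    def try_debit(self, amount):
        # Reserve: check available funds and freeze, but do not deduct yet.
        if self.balance - self.frozen < amount:
            return False
        self.frozen += amount
        return True

    def confirm(self, amount):
        # Actually deduct the previously frozen funds.
        self.frozen -= amount
        self.balance -= amount

    def cancel(self, amount):
        # Roll back the reservation.
        self.frozen -= amount
```

Every service participating in the transaction provides its own three methods, and the transaction manager drives Try on all of them first, then Confirm on all (or Cancel on all).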

Let’s look at the process.

Although it is intrusive to the business, TCC does not block resources: each method commits its own transaction directly. If an error occurs, it is compensated at the business level via Cancel, so it is also called a compensating transaction approach.

What if every Try succeeds, everyone executes Confirm, but some Confirm calls fail?

At that point the only option is to keep retrying the failed Confirms until they succeed; if they truly cannot succeed, record the failure for human intervention.

Notes on TCC

These are key points that must be paid attention to during implementation.

Idempotence: because a network call cannot be guaranteed to arrive, there is a retry mechanism, so the Try, Confirm and Cancel methods must all be implemented idempotently to avoid errors from repeated execution.

Null rollback: due to network problems the Try request times out without ever being executed, and the transaction manager then issues a Cancel command; therefore Cancel must be able to complete normally even when Try never ran.

Suspension: the Try request is delayed by network congestion and times out, the transaction manager issues Cancel, but after Cancel has executed, the delayed Try request finally arrives. You say, “Oh, no!”

That late Try reserves resources after the transaction has already ended, leaving them “suspended”; so after a null rollback you must record the transaction, to prevent Try from executing afterwards.
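
All three pitfalls can be guarded with a per-transaction status record, sketched below. The statuses and in-memory dict are simplifying assumptions; real code would persist this state in the database, inside the same transaction as the business operation.

```python
# Guarding TCC's pitfalls with a per-xid status record:
# idempotence, null rollback, and anti-suspension.

status = {}  # xid -> "tried" | "cancelled"

def try_phase(xid):
    if status.get(xid) == "cancelled":
        return False      # anti-suspension: Cancel already ran, refuse the late Try
    if status.get(xid) == "tried":
        return True       # idempotence: Try already executed, do nothing
    status[xid] = "tried"
    # ... reserve resources here ...
    return True

def cancel_phase(xid):
    if status.get(xid) == "cancelled":
        return            # idempotence: Cancel already executed
    if xid not in status:
        status[xid] = "cancelled"   # null rollback: Try never ran; just record it
        return
    status[xid] = "cancelled"
    # ... release the reserved resources here ...
```

The key point is that recording the null rollback is exactly what makes the anti-suspension check possible later.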

TCC variants

So far we have discussed generic TCC, which requires modifying existing implementations; but sometimes you cannot modify them, for example when calling another company’s interface.

TCC without Try

For example, suppose you need connecting flights on different airlines: you want to fly from A to B and then from B to C, and it only makes sense if you get tickets for both A-B and B-C.

In this case there is no Try; you directly call each airline’s booking interface. If both bookings succeed, the whole operation succeeds; if one fails, you call the other airline’s cancellation interface.

This means the entire business operation is performed in the first phase, so the focus shifts to the rollback operation; if rollback fails, you need alerts, manual intervention, and so on.

That’s actually the idea of TCC.

Asynchronous TCC

TCC can also be asynchronous. This is a trade-off: for example, some services are hard to adapt but do not affect the main business decision, i.e. they are not important enough to require timely execution.

At this time, reliable message service can be introduced to replace individual services to Try, Confirm, Cancel.

The Try step just writes a message that cannot yet be consumed; Confirm is the action that actually sends the message; Cancel discards the message.

This reliable messaging service is similar to the transaction messaging that I’ll talk about in a minute, which is a combination of transaction messaging and TCC.

TCC summary

It can be seen that TCC implements transaction commit and rollback through business code, which is a big intrusion on business. It is a two-phase commit at the business level.

Its performance is better than 2PC because there is no blocking of resources and its scope is larger than 2PC, with several considerations mentioned above in the implementation.

It is the most common way to implement distributed transactions in the industry, and as the variants show, it can flex with the business; it does not mean you must rigidly make every service implement the three methods before you can use TCC.

Local message table

The local message table approach leverages local transactions: a message table is stored in the database, and inserting a message is added to the local transaction, i.e. the business operation and the insertion into the message table are committed in the same transaction.

If the local transaction succeeds, the message is guaranteed to be inserted. Another service is then invoked, and if the invocation succeeds, the message’s status is updated.

If the call fails, it doesn’t matter: a background thread scans for messages still in the pending status and keeps calling the corresponding service. Usually a retry limit is set; if it still fails, the message is specially recorded for human intervention.

As you can see, it’s very simple, and it embodies the best-effort delivery idea.
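
The whole pattern fits in a few lines. Everything below is an in-memory stand-in: in a real system the business update and the message insert would be committed by the same database transaction, and the scan would run on a schedule.

```python
# Local message table sketch: write business data and the message together,
# then retry pending messages in the background, up to a limit.

MAX_RETRIES = 3
message_table = []  # rows: {"payload", "status", "retries"}

def do_business_in_one_tx(payload):
    # In reality: UPDATE business row + INSERT message, one local transaction.
    message_table.append({"payload": payload, "status": "pending", "retries": 0})

def background_scan(call_remote):
    # Retry every pending message; flag for humans after MAX_RETRIES.
    for msg in message_table:
        if msg["status"] != "pending":
            continue
        if call_remote(msg["payload"]):
            msg["status"] = "done"
        else:
            msg["retries"] += 1
            if msg["retries"] >= MAX_RETRIES:
                msg["status"] = "needs_human"
```

Because the remote call may be retried several times, the downstream service must be idempotent, the same requirement we saw with TCC.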

Transaction message

In fact, I have written an article on transaction messages, analyzing the implementation of RocketMQ and Kafka transaction messages from the source level, and the difference between the two.

I’m not going to go into detail here, because the previous article was very detailed, maybe four or five thousand words. I’ve attached the link: Transaction message

The implementation of Seata

First of all, what is Seata? Take a paragraph from the official website.

Seata is an open source distributed transaction solution dedicated to providing high performance and easy to use distributed transaction services. Seata will provide users with AT, TCC, SAGA and XA transaction modes to create a one-stop distributed solution for users.

As you can see, many patterns are provided, so let’s look AT the AT pattern first.

AT mode

AT mode is a two-phase commit. We mentioned earlier that the two-phase commit has the problem of synchronous blocking. Seata solves this problem.

In AT mode, phase 1 commits the transaction directly and releases the local lock. Just like that, so casually? Of course not: similar to the local message table, it uses a local transaction, inserting a rollback log alongside the actual business operation and committing both in the same transaction.

Where does this rollback log come from?

Through the framework’s proxy JDBC classes: the SQL is parsed as it executes to capture a before-image of the data, then executed to capture an after-image, and the two are assembled into a rollback log.

The rollback log is inserted into the database’s UNDO_LOG table in the same local transaction commit (so the database needs this table).

After this wave of operations, the transaction can be committed without any worries in one phase.

Then, if phase 1 succeeds, phase 2 simply deletes the rollback logs asynchronously; if it fails, the rollback logs are used to compensate in reverse and restore the data.
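
The before/after-image idea can be sketched as follows. This is only an illustration of the mechanism: the row format is a stand-in, and in Seata the undo log rows live in the UNDO_LOG table and are written by the proxied data source, not by hand.

```python
# Sketch of the AT rollback log: capture a before-image, apply the update,
# capture an after-image, and keep both for reverse compensation.

undo_log = []

def proxied_update(table, row, changes):
    before = dict(row)          # before-image, captured by parsing the SQL
    row.update(changes)         # execute the actual update
    after = dict(row)           # after-image
    # In reality this is inserted into UNDO_LOG in the same local transaction.
    undo_log.append({"table": table, "before": before, "after": after})

def compensate(table, row):
    # Phase 2 rollback: check the after-image still matches (the global lock
    # is what prevents anyone else from having changed it), then restore.
    entry = undo_log.pop()
    assert row == entry["after"], "dirty write detected"
    row.clear()
    row.update(entry["before"])
```

The after-image check at rollback time is exactly why the question in the next paragraph matters: if someone else could change the row in between, the images would no longer match.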

At this point a careful reader may ask: what if someone else modified the data in the meantime? Then the after-image wouldn’t match, would it?

That is what the global lock is for: a transaction must acquire the global lock before it can commit.

If not, then you need to roll back the local transaction.

The example from the Seata website is good. I won’t make it up myself. The following is a copy of the example from the Seata website:

There are two transactions, tx1 and tx2, which both update the m field of table A; the initial value of m is 1000.

Tx1 starts first: it opens a local transaction, acquires the local lock, and performs the update m = 1000 - 100 = 900. Before committing the local transaction, tx1 acquires the record’s global lock, then commits and releases the local lock.

Tx2 starts next: it opens a local transaction, acquires the local lock, and performs the update m = 900 - 100 = 800. Before committing its local transaction, tx2 tries to acquire the record’s global lock, but tx1 still holds it, so tx2 must wait and retry.

You can see that tx2’s commit is blocked; once tx1 finishes and tx2 finally acquires the global lock, tx2 can commit and release its local lock.

If tx1’s phase 2 is a global rollback, tx1 needs to reacquire the record’s local lock and perform the reverse compensating update to roll back the branch.

At this point, if tx2 is still waiting for the record’s global lock while holding the local lock, tx1’s branch rollback will fail. The branch rollback keeps retrying until tx2’s global-lock wait times out; tx2 then gives up on the global lock, rolls back its local transaction and releases the local lock, and tx1’s branch rollback succeeds.

Because the entire global lock is held by TX1 until the end of tx1, dirty writes do not occur.

AT mode’s global isolation level defaults to read uncommitted. If a specific scenario requires global read committed, this can be achieved through the SELECT FOR UPDATE statement proxy.

That is, provided your local transaction isolation level is read committed or above.

Summary of AT mode

You can see that the before and after images of the data are obtained non-intrusively through the proxy and assembled into a rollback log committed with the local transaction, which solves the two-phase synchronous blocking problem.

And use global locks to achieve write isolation.

For overall performance, the default global isolation level is read uncommitted, and only SELECT FOR UPDATE is proxied to provide read-committed isolation.

This is essentially a variant implementation of a two-phase commit.

TCC mode

Here we implement the three methods ourselves to manage custom branch transactions and bring them into global transaction management.

Let me post a picture from the website that should be pretty clear.

Saga mode

Saga is the long-transaction solution provided by Seata, suitable for business processes that are numerous and long, where implementing a proper TCC would require nesting multiple transactions.

Moreover, some systems, such as legacy projects or other companies’ services, cannot provide the three TCC interfaces, hence the Saga mode. Saga was proposed in a 1987 paper by Hector Garcia-Molina and Kenneth Salem.

So what does Saga do? So let’s look at this graph.

Assume there are N operations. Starting from T1, each transaction commits directly; T2 likewise commits directly; then T3 fails, and the compensating phase begins, compensating the completed steps one by one.

The idea is to charge ahead at the start without hesitation, and if something goes wrong, undo the changes one by one.

You can see that this approach does not guarantee transaction isolation, and Saga has the same concerns as TCC: it requires handling null compensation, anti-suspension, and idempotence.
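
The forward-then-compensate flow is just a loop. The sketch below is a toy orchestrator: steps are (name, forward, compensate) triples with made-up names, standing in for the Ti and Ci of the diagram.

```python
# Saga sketch: run each forward transaction Ti immediately; if one fails,
# run the compensations Ci of the completed steps in reverse order.

def run_saga(steps):
    done = []
    for name, forward, compensate in steps:
        if forward():
            done.append((name, compensate))
        else:
            # T_i failed: compensate T_{i-1} ... T_1 in reverse.
            for _, comp in reversed(done):
                comp()
            return False
    return True
```

Note that each Ti has already committed before Ci runs, which is exactly why isolation is not guaranteed: other transactions can observe the intermediate state before compensation happens.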

And in extreme cases you can’t roll back at all, because the data has already been consumed. For example, the first step transfers me 20,000 yuan and I withdraw and spend it; when you try to compensate, my balance is 0. What then? Should it go negative?

Such situations can only be handled in the business process design. In fact, I always write code this way: take the buy-a-skin scenario, I deduct the money first and then grant the skin.

Suppose we granted the skin first and the deduction then failed: a free skin? Who pays for that? Do you think users will come back and say the skin arrived but they were never charged?

Some smart aleck might say, “I’ll just take the skin back then.” Heh, that kind of thing has actually happened, and the team got scolded miserably for it.

So the correct order is to deduct the money first and then grant the skin: get the money into your pocket first. A user who paid but didn’t receive the skin will naturally come to you, and you can grant it then. You may still have written a bug, but at least it’s not a bug that gives things away for free.

So this is something you have to pay attention to when you’re coding.

Finally

It can be seen that distributed transactions will still have various problems, and the general implementation of distributed transactions can only reach the final consistency.

In extreme cases, human intervention is needed, so good logging is critical.

And when coding the business process, write it in the direction that favors the company: for example, take the user’s money first, then give the user the goods. Remember that.

Before you implement a distributed transaction, think about whether you can adapt the design to avoid needing one at all.

Even more radically: does your operation really need to be a transaction?

Finally, my personal ability is limited; if you spot any flaws, please contact me quickly. And if you think the article is good, a like and a follow would be much appreciated.

Shoulders of giants

Distributed Protocol and Algorithm practice, Han Jian

Distributed Database 30 lecture, Wang Lei

seata.io


I’m Yes, from a little bit to a billion points. See you next time.
