Transactions

ACID

  • Atomicity: within the same business process, a transaction guarantees that changes to multiple pieces of data either all succeed or are all undone together.
  • Isolation: in a concurrent environment, when different transactions operate on the same data at the same time, each transaction has its own complete data space (the different isolation levels ensure that the data each transaction reads and writes is independent of the others and is not affected by them).
  • Durability: a transaction guarantees that all successfully committed data modifications are correctly persisted and will not be lost (for example, if the system crashes, restarting the database restores it to the state at the moment the transaction successfully completed).
  • Consistency: before and after a transaction, the data in the database matches expectations and related data does not contradict itself. If the transaction completes successfully, all of its changes take effect; if an error occurs during execution, all of its changes are rolled back and the system returns to its original state. For example, if A and B transfer money to each other and each starts with 200 yuan, then whether or not the transfer succeeds, their total remains 400 yuan.

Together these four properties make up the “ACID” properties of a transaction, though I do not fully agree with presenting them as four parallel items: they are not orthogonal. A, I, and D are the means; C is the end.

Although the concept of transactions originated in database systems, it has since spread far beyond databases themselves. Any application that requires data consistency, including but not limited to databases, transactional memory, caches, message queues, and distributed storage, may use transactions. I will use the term “data source” to refer generally to the logical devices that provide and store data in all of these scenarios, but the meanings of “transaction” and “consistency” in these scenarios are not necessarily identical, as described below.

  • When a service uses only one data source, achieving consistency through A, I, and D is the classic case and relatively easy. The data source itself can perceive whether the reads and writes of multiple concurrent transactions conflict, so the final order of concurrent transactions on the timeline is determined by the data source. Consistency between transactions of this kind is called “internal consistency”.
  • When a service uses multiple different data sources, or when multiple different services together involve multiple different data sources, the problem becomes much harder. The order on the timeline of transactions executed concurrently, or even one after another, is no longer determined by any single data source. Consistency between transactions that involve multiple data sources is called “external consistency”.

External consistency problems are often difficult to solve with A, I, and D alone, because the cost of doing so is high or simply unrealistic. Yet external consistency is a problem that distributed systems are bound to encounter and must solve. We therefore have to change our mindset: consistency is no longer a “yes or no” binary attribute but something that can be discussed in several strengths, with the goal of obtaining the strongest consistency guarantee that an acceptable cost allows. For the same reason, transaction processing rises from a “programming problem” of concrete coding to an “architectural problem” requiring global trade-offs.

In the process of exploring these solutions, many new ideas and concepts have emerged, some of which are not very intuitive. In this chapter I will use the same scenario and examples to explain how these concepts are handled in different transaction schemes.

Example scenario

Fenix's Bookstore is an online bookstore. Every time a book is successfully sold, the following three things must be handled correctly:

  • The corresponding payment is deducted from the user's account.
  • The inventory in the warehouse is reduced and the book is marked as awaiting delivery.
  • The corresponding payment is added to the merchant's account.

Local transactions

The term “local transaction” is meant to stand in contrast to the later term “global transaction”; one could argue for other names, but “local transaction” has long since become the mainstream term, so there is no need to dwell on naming here. A local transaction is a transaction that operates on a single transactional resource and does not require coordination by a global transaction manager. It is hard to explain “local transactions” precisely without first introducing what a “global transaction manager” is, so I will set that concept aside for now and return to the comparison after the section on global transactions.

Local transactions are the most basic transaction solution and are only suitable for scenarios where a single service uses a single data source. From the application's point of view, they rely directly on the transaction capability of the data source itself. At the code level, the most an application can do is wrap the transaction interface in a standardized layer (such as the JDBC interface); it does not take a deep part in the transaction process. Opening, terminating, committing, rolling back, nesting, setting the isolation level, and even transaction propagation, which sits close to application code, all depend on the support of the underlying data source. This is quite different from XA, TCC, SAGA, and the other transaction mechanisms introduced later, which are implemented mainly in application code. For example, if your code calls Connection.rollback() in JDBC, the successful execution of that method does not necessarily mean the transaction was actually rolled back: if the table engine is MyISAM, rollback() is a meaningless no-op. Therefore, to discuss local transactions in any depth, we have to step beyond the application code and look at some of the database's own transaction implementation principles, i.e., how a traditional database management system implements ACID.
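To make this concrete, here is a minimal sketch of a local transaction driven through the standard JDBC interface, assuming a javax.sql.DataSource for the bookstore's single database; the table and column names are illustrative only. Everything the code does simply delegates to the transaction capability of the underlying engine (and, as noted above, would be meaningless on MyISAM).

```java
// A minimal sketch of a local transaction over a single JDBC data source.
// The DataSource, table and column names are illustrative assumptions.
import java.sql.Connection;
import java.sql.Statement;
import javax.sql.DataSource;

public class LocalTransactionSketch {

    public void buyBook(DataSource ds) throws Exception {
        try (Connection con = ds.getConnection()) {
            con.setAutoCommit(false);                    // open the local transaction
            try (Statement stmt = con.createStatement()) {
                stmt.executeUpdate("UPDATE account   SET balance = balance - 100 WHERE user_id = 1");
                stmt.executeUpdate("UPDATE warehouse SET stock   = stock   - 1   WHERE book_id = 1");
                stmt.executeUpdate("UPDATE merchant  SET balance = balance + 100 WHERE shop_id = 1");
                con.commit();                            // all three changes take effect together
            } catch (Exception e) {
                con.rollback();                          // undo everything; relies on the engine being transactional
                throw e;
            }
        }
    }
}
```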

Any study of how transactions are implemented today must trace back to ARIES theory, Algorithms for Recovery and Isolation Exploiting Semantics, literally “algorithms for recovery and isolation that exploit semantics”. One suspects a certain mischievous taste in forcing this unwieldy name to spell out the word “ARIES”, much like ACID.

ARIES is the foundational theory of modern databases. Even if not every database implements ARIES exactly, at least the databases that can be called modern mainstream relational databases (Oracle, MS SQL Server, MySQL/InnoDB, IBM DB2, PostgreSQL, and so on) are deeply influenced by it in their transaction implementations. In the 1990s, drawing on the experience of developing the prototype database system IBM System R, IBM's Almaden Research Center published the three major ARIES papers. Among them, ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging focuses on how two of ACID's properties, atomicity (A) and durability (D), should be implemented at the algorithmic level, while ARIES/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indexes is a foundational paper for how modern databases implement isolation (I). We will start with atomicity and durability.

Implementing atomicity and durability

Atomicity and durability are two closely related properties of a transaction: atomicity ensures that the multiple operations of a transaction either all take effect or none do, with no intermediate state; durability ensures that once a transaction takes effect, its modifications cannot be undone or lost for any reason.

As everyone knows, data must be successfully written to persistent storage such as disk or tape to be durable; data held only in memory is lost the moment the application crashes, the database or operating system crashes, or the machine suddenly loses power. All of these accidents are collectively called “crashes”. The biggest difficulty in achieving atomicity and durability is that “writing to disk” is not an atomic operation: besides the states “written” and “not written”, there objectively exists an intermediate state of “being written”. Because intermediate write states and crashes cannot be eliminated, writing data from memory to disk cannot by itself guarantee atomicity and durability without additional safeguards. The following concrete examples illustrate this.

Following the scenario set up earlier, buying a book from Fenix's Bookstore requires modifying three pieces of data: subtracting the payment from the user's account, adding the payment to the merchant's account, and marking one book in the warehouse as being shipped. Because of the intermediate state of writing, the following situations can occur.

  • The transaction has not committed, but a crash happens after some writes: the application has not finished modifying the three pieces of data, yet the database has already written one or two of the changes to disk. If a crash occurs at this moment, then after restarting, the database must have a way to know that an incomplete purchase was in progress before the crash, and must restore the data already modified on disk back to its unmodified state, to guarantee atomicity.
  • The transaction has committed, but a crash happens before all writes complete: the application has finished modifying all three pieces of data, but the database has not yet written all three changes to disk. If a crash occurs at this moment, then after restarting, the database must have a way to know that a complete purchase happened before the crash, and must rewrite the portion of the data that never made it to disk, to guarantee durability.

Since intermediate write states and crashes are both unavoidable, the only way to guarantee atomicity and durability is to take recovery measures after a crash. Such data recovery is called “Crash Recovery”; in some materials it is also referred to as Failure Recovery or Transaction Recovery.

To be able to complete crash recovery successfully, the database cannot write data to disk the way a program modifies a variable in memory, changing the value of some row and column in a table directly in place. Instead, it must first record to disk, in the form of a log (that is, sequential append-only file writes, the most efficient kind of write), all the information the modification needs: which data is changed, on which memory page and disk block the data physically lives, what the old value was and what the new value will be, and so on. Only after all the log records have safely reached disk and the database sees a “Commit Record” in the log marking the transaction as successful does it modify the real data according to the information in the log. Once the modification is complete, it appends an “End Record” to the log to indicate that the transaction has been persisted. This approach to committing transactions is called “Commit Logging”.
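The order of operations can be illustrated with a small sketch. This is not a real storage engine, only an in-memory illustration of the sequence described above, with invented record formats:

```java
// An illustrative (not real) sketch of the Commit Logging sequence: change
// records first, then the Commit Record, then the real data files, then the
// End Record. The log here is just an append-only in-memory list.
import java.util.ArrayList;
import java.util.List;

public class CommitLoggingSketch {

    private final List<String> log = new ArrayList<>();   // sequential, append-only writes

    public void buyBook() {
        log.add("CHANGE user.balance    300 -> 200");      // what changed, old value, new value
        log.add("CHANGE warehouse.stock 5   -> 4");
        log.add("CHANGE shop.balance    100 -> 200");
        log.add("COMMIT RECORD");                          // only now does the transaction count as committed
        applyChangesToDataFiles();                         // real data is modified only after the Commit Record
        log.add("END RECORD");                             // the modification has been fully persisted
    }

    private void applyChangesToDataFiles() {
        // replay the CHANGE records against the data files; omitted in this sketch
    }
}
```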

Bonus tip: achieving atomicity and durability through logging is the dominant approach today, but it is not the only choice. Besides logs, there is another transaction implementation mechanism called [Shadow Paging](https://en.wikipedia.org/wiki/Shadow_paging), used, for example, by the popular lightweight database SQLite Version 3. The general idea of Shadow Paging is that changes to data are still written to disk, but instead of modifying the original data in place, the data is first copied, the original is preserved, and the copy is modified. During the transaction the modified data therefore exists in two versions, one before and one after the modification, hence the name “shadow”. When the transaction commits successfully and all the changed data has been persisted, the final step is to switch the reference pointer from the original data to the newly modified copy. This last “switch the pointer” operation is considered atomic; modern disks guarantee in hardware that a single pointer-sized value will not end up half-written. Shadow Paging can therefore also guarantee atomicity and durability. Implementing transactions with Shadow Paging is simpler than Commit Logging, but when isolation and concurrent locking come into play its transaction concurrency is relatively limited, so it is not widely used in high-performance databases.

It is not hard to see how Commit Logging guarantees durability and atomicity. First, once the Commit Record has been successfully written to the log, the whole transaction is successful: even if a crash occurs while the real data is being modified, after a restart the database continues the modification according to the log information already on disk, which guarantees durability. Second, if a crash occurs before the Commit Record is successfully written to the log, the whole transaction fails: after the system restarts, it will see log entries without a Commit Record, mark them for rollback, and the whole transaction will look as if it never happened, which guarantees atomicity.

The principle of Commit Logging is clear, and some databases really do implement transactions with Commit Logging directly, for example Alibaba's OceanBase. However, Commit Logging carries a huge inherent defect: all real modifications to the data must happen after the transaction commits, that is, after the Commit Record has been written to the log. Before that point, even if the disk I/O is idle, even if a transaction modifies so much data that it occupies a large amount of memory buffer, for whatever reason the data on disk must not be modified before the transaction commits. This is a premise on which Commit Logging stands, and it is very unfavorable for improving database performance. To solve this, the ARIES theory mentioned earlier finally comes into play: ARIES proposes an improved logging scheme called “Write-Ahead Logging”, which allows changed data to be written to disk in advance, before the transaction commits.

Write-Ahead Logging classifies the timing of writing changed data to disk, relative to the transaction commit point, into the FORCE and STEAL cases.

  • FORCE: if, when a transaction commits, the changed data must be written to disk at the same time, the policy is called FORCE; if not, it is called NO-FORCE. In reality most databases use NO-FORCE: as long as the log exists, the changed data can be persisted at any time, and for disk I/O performance there is no need to force data out immediately.
  • STEAL: if, before a transaction commits, changed data is allowed to be written to disk in advance, the policy is called STEAL; if not, NO-STEAL. For disk I/O performance, allowing data to be written early helps make use of idle I/O and saves memory in the database cache.

Commit Logging allows NO-FORCE but not STEAL: if part of the changed data were written to disk before the transaction commits, and the transaction were then rolled back or a crash occurred, that early-written data would become wrong.

Write-Ahead Logging allows both NO-FORCE and STEAL. The solution it gives is to add another type of log called the Undo Log. Before changed data is written to disk, an Undo Log entry must be recorded, noting which data was changed, from what value to what value, and so on, so that the pre-written changes can be erased, based on the Undo Log, during transaction rollback or crash recovery. The Undo Log is now commonly translated as the “rollback log”, and accordingly the log used earlier to repeat data changes during crash recovery was named the Redo Log, generally translated as the “redo log”. With the Undo Log added, Write-Ahead Logging performs the following three phases during crash recovery.

  • Analysis phase: scan the log starting from the last checkpoint (the point up to which all changes that should have been persisted have safely reached disk), find all transactions that do not yet have an End Record, and form the set of transactions to be recovered. This set generally contains at least two parts: the Transaction Table and the Dirty Page Table.
  • Redo phase: based on the set of transactions produced in the analysis phase, repeat history: find all logs that contain a Commit Record, write their modified data to disk, append an End Record to the log, and remove them from the set of transactions to be recovered.
  • Undo (rollback) phase: handle the transactions that remain after the analysis and redo phases; these remaining transactions, called losers, need to be rolled back. Based on the information in the Undo Log, the data that was written to disk in advance is rewritten back, thereby rolling back the loser transactions.

Operations in both the redo phase and the undo phase are required to be idempotent. To achieve high I/O performance, the three phases above inevitably involve very tedious concepts and details (the data structures of the Redo Log and Undo Log, and so on). For reasons of space I do not intend to cover them here; the two papers cited at the beginning of this section are the best starting point. Write-Ahead Logging is only part of the ARIES theory; ARIES as a whole has many strengths, such as rigor and high performance, but these also come at the cost of high complexity. Depending on whether FORCE and STEAL are allowed, a database has four possible combinations. From the point of view of optimizing disk I/O, the NO-FORCE plus STEAL combination undoubtedly performs best; from the point of view of algorithm implementation and logging, the same combination is undoubtedly the most complex. Figure 3-1 shows how the four combinations relate to the Undo Log and Redo Log.

Figure 3-1 Four combinations of FORCE and STEAL
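To make the idempotence requirement from the recovery phases above concrete, here is a toy sketch (not ARIES itself): redo re-applies “after” values and undo re-applies “before” values, and because both write absolute values rather than deltas, replaying either of them more than once is harmless. The record format and the Map standing in for the data files are assumptions of this example.

```java
// A toy sketch of idempotent redo/undo replay over a key-value "disk".
// Illustrative only; not a real ARIES implementation.
import java.util.HashMap;
import java.util.Map;

public class RecoverySketch {

    record LogRecord(long txId, String key, String before, String after) {}

    private final Map<String, String> disk = new HashMap<>();

    void redo(LogRecord r) { disk.put(r.key(), r.after()); }    // repeat history for committed work
    void undo(LogRecord r) { disk.put(r.key(), r.before()); }   // roll back a "loser" transaction

    public static void main(String[] args) {
        RecoverySketch s = new RecoverySketch();
        s.disk.put("user.balance", "300");
        LogRecord r = new LogRecord(1L, "user.balance", "300", "200");
        s.redo(r);
        s.redo(r);                          // applying the same record twice changes nothing: idempotent
        System.out.println(s.disk);         // {user.balance=200}
        s.undo(r);                          // the loser's pre-written change is erased
        System.out.println(s.disk);         // {user.balance=300}
    }
}
```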

Implementing isolation

Isolation ensures that the data read and written by each transaction is independent of other transactions and not affected by them. From the definition it is easy to see that isolation is inevitably tied to concurrency: if there were no concurrency and all transactions ran serially, no isolation would be needed, or rather such access would be inherently isolated. But reality cannot do without concurrency, so how do we achieve serial-equivalent data access under concurrency? Almost every programmer will answer: locking! Indeed, modern databases provide the following three kinds of locks.

  • Write Lock (also called Exclusive Lock, X-Lock for short): if a write lock is placed on data, only the transaction holding the write lock may write the data; while the data holds a write lock, no other transaction may write it or place a read lock on it.

  • Read Lock (also called Shared Lock, S-Lock for short): multiple transactions may place read locks on the same piece of data at the same time. Once data holds a read lock it cannot be given a write lock, so other transactions cannot write it, but they can still read it. For a transaction holding a read lock, if the data is read-locked only by its own transaction, it is allowed to upgrade the lock directly to a write lock and then write the data.

  • Range Lock: an exclusive lock placed on a whole range, within which data cannot be written. The following statement is a typical example of adding a range lock:

    SELECT * FROM books WHERE price < 100 FOR UPDATE;

Note the difference between “a range cannot be written to” and “a batch of data cannot be written to”; in other words, do not think of a range lock as merely a set of exclusive locks on the rows currently in the range. With a range lock, not only can the existing data in the range not be modified, but no data can be added to or deleted from the range either, which a set of exclusive locks cannot achieve.

Serializable: serializable access provides the highest level of isolation; the highest isolation level defined in ANSI/ISO SQL-92 is Serializable. Serializability matches the ordinary programmer's intuition about locking contended data. If performance optimization is ignored, serializability can be achieved by placing read locks, write locks, and range locks on all the data a transaction reads and writes, and by determining when to acquire and release them according to Two-Phase Locking (2PL), with its expanding (lock-acquiring) and shrinking (lock-releasing) phases. However, concurrency control theory determines that isolation and concurrency are at odds: the higher the isolation, the lower the throughput of concurrent access. Modern databases therefore must provide other isolation levels besides serializability, so that users can adjust the isolation level, and with it the database's locking strategy, to strike a balance between isolation and throughput.

Repeatable Read: read and write locks are placed on the data the transaction touches and held until the end of the transaction, but range locks are no longer added. Repeatable read is weaker than serializable in that it allows Phantom Reads, where two identical range queries within one transaction return different result sets. For example, suppose we want to count how many books in Fenix's Bookstore sell for less than 100 yuan and execute the first of the following SQL statements:

SELECT count(1) FROM books WHERE price < 100					/* Time order: 1, transaction: T1 */
INSERT INTO books(name, price) VALUES ('Understanding the Java Virtual Machine in depth', 90)	/* Time order: 2, transaction: T2 */
SELECT count(1) FROM books WHERE price < 100					/* Time order: 3, transaction: T1 */

Going by the lock definitions above, if this SQL statement is executed twice in the same transaction and, between the two executions, another transaction happens to insert a book priced under 100 yuan, that insert will be allowed, and the two identical queries will return different results. The reason is that repeatable read has no range lock to forbid inserting new data into the queried range; this is a sign that one transaction has been affected by another and that isolation has been broken.

It should be stressed that this introduction is framed around ARIES theory; a specific database does not have to implement it exactly as the theory describes. For example, MySQL/InnoDB's default isolation level is repeatable read, yet it completely avoids phantom reads in read-only transactions (because InnoDB uses MVCC: at the repeatable read level it only reads versions created by transactions whose ID is no greater than the current transaction's ID; MVCC is described below). In the example above, transaction T1 contains only query statements and is a read-only transaction, so the problem in this example does not occur in MySQL. In read-write transactions, however, MySQL can still exhibit phantom reads: for example, if transaction T1, after the other transaction inserts new books, modifies the name of all books priced under 100 yuan, it will still be affected by the newly inserted books.

Read Committed: write locks on the data involved in a transaction are held until the end of the transaction, but read locks are released as soon as the query completes. Read committed is weaker than repeatable read in that it allows Non-Repeatable Reads, where two queries of the same data during one transaction return different results. For example, suppose I want to look up the price of "Understanding the Java Virtual Machine in depth" in Fenix's Bookstore and execute two SQL statements, and between them another transaction changes the book's price from 90 yuan to 110 yuan, as shown below:

SELECT * FROM books WHERE id = 1;   						/* Time order: 1, transaction: T1 */
UPDATE books SET price = 110 WHERE id = 1; COMMIT;			/* Time order: 2, transaction: T2 */
SELECT * FROM books WHERE id = 1; COMMIT;   				/* Time order: 3, transaction: T1 */

At the read committed isolation level, these two identical queries will return different results, because read committed lacks a read lock that spans the whole transaction and so cannot prevent the data it has read from changing; transaction T2's update statement is committed immediately and successfully. This is likewise a case of one transaction being affected by another and isolation being weakened. If the isolation level were repeatable read, transaction T2 would be unable to acquire the write lock, since the data is read-locked by T1 and that lock is not released immediately after reading, and the update would block until T1 commits or rolls back.

Read Uncommitted: only write locks are placed on the data involved in a transaction and held until the end of the transaction; no read locks are added at all. Read uncommitted is weaker than read committed in that it allows Dirty Reads, where one transaction reads data that another transaction has modified but not yet committed. For example, suppose I feel that raising the price of "Understanding the Java Virtual Machine in depth" from 90 yuan to 110 yuan harms readers' interests, and I execute an update statement to change the price back to 90 yuan. Before the transaction commits, a colleague tells me that the price was not raised on a whim but because printing costs went up, and selling at 90 yuan would be a loss, so I immediately roll the transaction back. The SQL scenario is as follows:

SELECT * FROM books WHERE id = 1;   						/* Time order: 1, transaction: T1 */
/* Note that there is no COMMIT */
UPDATE books SET price = 90 WHERE id = 1;					/* Time order: 2, transaction: T2 */
/* The price read by this SELECT is the price T1 uses to sell the book */
SELECT * FROM books WHERE id = 1;			  				/* Time order: 3, transaction: T1 */
ROLLBACK;			  										/* Time order: 4, transaction: T2 */

However, before the rollback completes, transaction T1 has already sold several copies at the uncommitted price of 90 yuan. The reason is that read uncommitted places no read lock at all when reading, which in turn allows it to read data on which another transaction holds a write lock; that is why the two query statements in transaction T1 get different results. If the "in turn" in that sentence is hard to follow, re-read the definition of a write lock: a write lock forbids other transactions from placing read locks, but it does not forbid transactions from reading without a lock. If transaction T1 simply takes no read lock when reading, transaction T2's uncommitted data can immediately be read by T1. This, too, is a sign of one transaction being affected by another and isolation being broken. If the isolation level were read committed, the second query in T1 could not obtain a read lock, because T2 holds a write lock on the data and read committed requires a read lock before reading, so the query in T1 would block until T2 commits or rolls back.

In fact, the different isolation levels, and problems such as phantom reads, non-repeatable reads, and dirty reads, are only surface phenomena: they are the result of combining different kinds of locks held for different lengths of time. The fundamental way databases implement the various isolation levels is through locking.
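From the application side, choosing between these locking strategies is just a matter of selecting an isolation level. For example, through JDBC it is done with constants defined on java.sql.Connection; this is only a sketch, and whether each level is honoured depends on the underlying engine.

```java
// Selecting an isolation level through JDBC. The constants are part of the
// standard java.sql.Connection interface.
import java.sql.Connection;

public class IsolationLevelSketch {

    public void configure(Connection con) throws Exception {
        con.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        // Other standard levels, from stronger to weaker:
        //   Connection.TRANSACTION_REPEATABLE_READ
        //   Connection.TRANSACTION_READ_COMMITTED
        //   Connection.TRANSACTION_READ_UNCOMMITTED
    }
}
```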

Looking across the four isolation levels, there is another common thread: phantom reads, non-repeatable reads, and dirty reads all arise because a transaction that is reading data has its isolation broken by another transaction writing that data. For this "one transaction reads + another transaction writes" isolation problem, a lock-free optimization called Multi-Version Concurrency Control (MVCC) has been widely adopted by mainstream commercial databases in recent years. MVCC is a read optimization strategy; its "lock-free" means that reads do not need to take locks. The basic idea of MVCC is that any modification to the database does not overwrite the previous data directly, but instead produces a new copy that coexists with the old version, so that reading requires no locking at all. "Version" is the key word in that sentence; you can think of it as two invisible fields on every row in the database, CREATE_VERSION and DELETE_VERSION, both of which record transaction IDs, where the transaction ID is a globally, strictly increasing value. Data is then written according to the following rules.

  • When inserting data: CREATE_VERSION records the transaction ID of the inserting transaction; DELETE_VERSION is empty.
  • When deleting data: DELETE_VERSION records the transaction ID of the deleting transaction; CREATE_VERSION is empty.
  • When modifying data: the modification is treated as a combination of "delete the old data, insert the new data". The original data is copied first; on the original data, DELETE_VERSION records the transaction ID of the modifying transaction and CREATE_VERSION is empty, while on the newly copied data, CREATE_VERSION records the transaction ID of the modifying transaction and DELETE_VERSION is empty.

At this point, if another transaction reads the changed data, the isolation level determines which version of the data should be read.

  • Repeatable read: always read the record whose CREATE_VERSION is less than or equal to the current transaction ID; if multiple versions of the data still qualify, take the latest one (the one with the largest transaction ID).
  • Read committed: always take the latest version, that is, the most recently committed version of the record.

MVCC is unnecessary for the other two isolation levels. Read uncommitted modifies the raw data directly without waiting for commit, so other transactions viewing the data have no need for a version field at all; and serializable is meant, by its very semantics, to block reads from other transactions, whereas MVCC is a lock-free optimization for reads, so the two are not meant to be used together.
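As a toy illustration of the repeatable-read rule above, the following sketch keeps a row's version history in memory and picks the newest version whose CREATE_VERSION does not exceed the reading transaction's ID. The class and field names are invented for this example, and the handling of DELETE_VERSION is omitted to keep it short.

```java
// A toy sketch of MVCC visibility under repeatable read: among the versions
// created by transactions no newer than the current one, take the latest.
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class MvccSketch {

    record RowVersion(long createVersion, String value) {}

    static Optional<RowVersion> readRepeatable(List<RowVersion> versions, long currentTxId) {
        return versions.stream()
                .filter(v -> v.createVersion() <= currentTxId)              // only versions visible to this transaction
                .max(Comparator.comparingLong(RowVersion::createVersion));  // the latest such version
    }

    public static void main(String[] args) {
        List<RowVersion> history = List.of(
                new RowVersion(3, "price=90"),
                new RowVersion(7, "price=110"));
        // A transaction with ID 5 keeps seeing the version created by transaction 3.
        System.out.println(readRepeatable(history, 5));
    }
}
```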

MVCC only optimizes "read + write" scenarios. If two transactions modify the same data at the same time, that is, "write + write", there is little room for optimization and locking is almost the only viable solution. There is some room for debate over whether the locking strategy should be "Pessimistic Locking" or "Optimistic Locking". The locking strategies introduced so far are all pessimistic: they assume that accessing data without locking first will cause problems. Optimistic locking, by contrast, assumes that contention between transactions is the exception and its absence is the rule, so locking should not be done up front; instead, remediation happens when contention actually occurs. This approach is called "Optimistic Concurrency Control" (OCC). However, I would caution readers against the belief that optimistic locks are inherently faster than pessimistic locks; it depends entirely on how heavy the contention is, and under heavy contention optimistic locks are the slower ones.
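A common way to apply optimistic concurrency control at the application level is a version column checked at update time. The sketch below assumes a hypothetical version column on the books table; if another transaction has already bumped the version, zero rows are updated and the caller decides whether to retry or report a conflict.

```java
// A minimal sketch of optimistic "write-write" control with a version column.
// The books.version column is an assumption made for this example.
import java.sql.Connection;
import java.sql.PreparedStatement;

public class OptimisticLockSketch {

    public boolean updatePrice(Connection con, long bookId, int newPrice, int expectedVersion) throws Exception {
        String sql = "UPDATE books SET price = ?, version = version + 1 WHERE id = ? AND version = ?";
        try (PreparedStatement stmt = con.prepareStatement(sql)) {
            stmt.setInt(1, newPrice);
            stmt.setLong(2, bookId);
            stmt.setInt(3, expectedVersion);
            return stmt.executeUpdate() == 1;   // false: another transaction won the race; retry or report a conflict
        }
    }
}
```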

Global transaction

The opposite of a local Transaction is a Global Transaction, also referred to in some materials as an External Transaction. In this section a global transaction is defined as a transaction solution for scenarios where a single service uses multiple data sources. Note that, in theory, a true global transaction has no "single service" constraint; it is a concept from the DTP (Distributed Transaction Processing) model. But what this section discusses is a transaction solution that still pursues strong consistency in a distributed environment, which is highly unsuitable for systems of multiple nodes whose services call one another (typically today's microservice systems); in practice it is almost only applied where a single service has multiple data sources. To avoid confusing it with the weak-consistency transaction approaches introduced later, the scope of "global transaction" is narrowed here, and transactions involving multiple services and multiple data sources will later be called "distributed transactions".

In 1991, the X/Open organization (later merged into The Open Group) proposed a transaction processing architecture called X/Open XA (XA stands for eXtended Architecture) to solve the consistency problem of distributed transactions. Its core is to define the communication interface between a global Transaction Manager (which coordinates global transactions) and local Resource Managers (which drive local transactions). The XA interface is bidirectional and acts as a communication bridge between one transaction manager and multiple resource managers; by coordinating the consistent actions of multiple data sources, it achieves unified commit or rollback of global transactions. The names XADataSource and XAResource that we still occasionally see in Java code come from here.

However, XA is not a Java-specific technical specification (Java did not even exist when XA was proposed) but a language-independent set of general specifications, so Java defines its own JSR 907 Java Transaction API, which implements a Java standard for global transaction processing based on the XA pattern. This is what we now know as JTA. The two main kinds of JTA interface are:

  • The transaction manager interfaces: javax.transaction.TransactionManager, used by Java EE servers to provide container-managed transactions where the container is automatically responsible for transaction management, and javax.transaction.UserTransaction, used to start, commit, and roll back transactions manually in program code.
  • The resource definition interface conforming to the XA specification: javax.transaction.xa.XAResource. Any resource (JDBC, JMS, and so on) that wants to support JTA simply implements the methods of the XAResource interface.

JTA was originally a Java EE technology, normally supported by Java EE containers such as JBoss, WebSphere, and WebLogic. But nowadays there are also standalone JTA implementations provided as plain JAR packages, such as JOTM (Java Open Transaction Manager), Bitronix, Atomikos, and JBossTM (formerly Arjuna), which let us use JTA in Java SE environments such as Tomcat and Jetty.
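For reference, code driven through the JTA interfaces has roughly the shape of the following sketch; how the UserTransaction instance is obtained (JNDI lookup in a Java EE container, or a standalone manager such as Atomikos or Bitronix in Java SE) is environment-specific and omitted here.

```java
// A minimal sketch of manual transaction demarcation through JTA's
// javax.transaction.UserTransaction; begin/commit/rollback are the
// standard methods of that interface.
import javax.transaction.UserTransaction;

public class JtaSketch {

    public void runInTransaction(UserTransaction tx, Runnable businessLogic) throws Exception {
        tx.begin();
        try {
            businessLogic.run();
            tx.commit();
        } catch (Exception e) {
            tx.rollback();
            throw e;
        }
    }
}
```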

Now, let us add another assumption to this chapter's scenario: what if the bookstore's users, merchants, and warehouse are each in a different database, while everything else stays the same? If you code this with declarative transactions, the @Transactional annotation may make it look no different from before, but if you implement it with programmatic transactions, the difference in how it must be written becomes visible. The code might look like this:

public void buyBook(PaymentBill bill) {
    userTransaction.begin();
    warehouseTransaction.begin();
    businessTransaction.begin();
    try {
        userAccountService.pay(bill.getMoney());
        warehouseService.deliver(bill.getItems());
        businessAccountService.receipt(bill.getMoney());
        userTransaction.commit();
        warehouseTransaction.commit();
        businessTransaction.commit();
    } catch (Exception e) {
        userTransaction.rollback();
        warehouseTransaction.rollback();
        businessTransaction.rollback();
    }
}

As the code shows, the program intends to commit three transactions, but it cannot actually be written this way. Just imagine: if an error occurs at businessTransaction.commit(), execution jumps to the catch block, but userTransaction and warehouseTransaction have already been committed, and calling their rollback() methods at that point achieves nothing. Some data ends up committed and some rolled back, and the consistency of the operation as a whole cannot be guaranteed. To solve this problem, XA splits transaction commit into a two-phase process:

  • Prepare phase: also called the voting phase, in which the coordinator asks all participants in the transaction whether they are ready to commit; a participant replies Prepared if it is, and Non-Prepared otherwise. "Prepare" here is not preparation in the everyday sense: for a database it means recording the entire content of the transaction commit in the redo log, the only difference from a real commit being that the final Commit Record is not yet written. This also means that isolation is not released right after the data is persisted; locks are still held, keeping the data isolated from observers outside the transaction.
  • Commit phase: also called the execution phase. If the coordinator received Prepared replies from all participants in the previous phase, it first persists the transaction state as Commit locally and then sends the Commit command to all participants, who commit immediately; otherwise, if any participant replied Non-Prepared or any participant timed out, the coordinator persists its own transaction state as Abort and sends the Abort command to all participants, who immediately roll back. The Commit operation in this phase should be lightweight for the database: it only needs to persist a Commit Record, which is usually fast. Only on receiving Abort does it have to clean up the already-prepared data according to the rollback log, which can be a relatively heavy operation. (A minimal coordinator sketch follows this list.)
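The following is a purely illustrative sketch of the control flow of these two phases. In real XA deployments, as noted below, the coordinator role is played by the transaction manager or the databases themselves, not by application code, and the Participant interface here is invented for this example; timeouts and persistence of the coordinator's own state are omitted.

```java
// An illustrative sketch of the two-phase commit control flow; not a real API.
import java.util.List;

public class TwoPhaseCommitSketch {

    interface Participant {
        boolean prepare();   // write the redo log, keep holding locks, but no Commit Record yet
        void commit();       // append the Commit Record and release isolation
        void rollback();     // undo whatever was prepared (a no-op if nothing was)
    }

    boolean execute(List<Participant> participants) {
        // Phase 1 (voting): every participant must answer "Prepared".
        boolean allPrepared = true;
        for (Participant p : participants) {
            allPrepared &= p.prepare();
        }
        // Phase 2 (commit or abort), decided by the collected votes.
        if (allPrepared) {
            participants.forEach(Participant::commit);
        } else {
            participants.forEach(Participant::rollback);
        }
        return allPrepared;
    }
}
```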

Together these two processes are known as the "Two-Phase Commit" (2PC) protocol. For it to ensure consistency correctly, several other prerequisites must also hold.

  • The network must be assumed to be reliable during the short commit window, that is, no messages are lost during commit; it is also assumed that the network never delivers corrupted messages at any point, that is, messages may be lost but no wrong message is ever delivered. XA's design goal is not to solve problems such as the Byzantine Generals problem. In two-phase commit, a failure in the voting phase can be remedied (rolled back), while a failure in the commit phase cannot (the outcome of commit or rollback cannot be changed; a crashed node can only be brought back). This phase should therefore be as short as possible, precisely to keep the window of network risk as small as possible.
  • It must be assumed that a node lost because of a network partition, machine crash, or other cause will eventually recover and will not stay lost forever. Since the complete redo log was written during the prepare phase, once the lost machine recovers it can find the prepared-but-uncommitted transaction data in its log, query the coordinator for the transaction's status, and then decide whether to commit or roll back.

Note that the coordinator and participants mentioned above are normally played by the databases themselves, without application intervention; the coordinator is generally elected among the participants, and the application acts only as a client of the databases. The interaction sequence of two-phase commit is shown in Figure 3-2.

Figure 3-2 Interaction sequence of two-phase commit

Two-phase commit is simple in principle and not difficult to implement, but it has several very significant disadvantages:

  • Single point of failure: the coordinator plays a pivotal role in two-phase commit. While the coordinator waits for participants' responses it can use a timeout mechanism, so a crashed participant can be tolerated; but while a participant waits for the coordinator's instruction, it cannot time out and act on its own. If it is the coordinator, not one of the participants, that goes down, all participants are affected: if the coordinator never recovers and never sends Commit or Rollback, all participants can only keep waiting.
  • Performance problems: in two-phase commit, all participants are in effect bound into a single scheduled whole. The process requires two rounds of remote service calls and three rounds of data persistence (writing the redo log in the prepare phase, the coordinator persisting its own state, and writing the Commit Record to the log in the commit phase), and it only finishes when the slowest participant in the cluster has completed its work, so the performance of two-phase commit is usually poor.
  • Consistency risk: as mentioned earlier, two-phase commit holds only under certain assumptions; when the assumptions of network stability and eventual crash recovery do not hold, consistency problems can arise. In 1985, Fischer, Lynch, and Paterson proposed the "FLP impossibility" result, proving that if crashed nodes may never recover, no distributed protocol can reliably reach a consistent outcome; in distributed computing this result is as famous as the "CAP cannot have it all" theorem. The consistency risk from network instability is this: although the commit-phase window is very short, it is still a clearly existing window of danger. If, after receiving Prepared replies from every participant, the coordinator decides the transaction can be committed, persists its state, and commits its own transaction, but the network then suddenly fails and it can no longer send the Commit command to all participants, some data (the coordinator's) is committed while some data (the participants') is neither committed nor able to be rolled back, producing data inconsistency.

To alleviate some of the shortcomings of the two-phase commit protocol, specifically the coordinator's single point of failure and the performance problem of the prepare phase, a "Three-Phase Commit" (3PC) protocol was later developed. Three-phase commit subdivides the prepare phase of two-phase commit into two phases, called CanCommit and PreCommit, and renames the commit phase the DoCommit phase. The new CanCommit phase is an inquiry phase in which the coordinator asks each participating database to evaluate, based on its own state, whether the transaction is likely to complete successfully. The reason for splitting the prepare phase in two is that preparing is a heavyweight operation: once the coordinator sends the prepare message, each participant immediately starts writing redo logs and the data resources involved are locked, so if some participant then declares it cannot complete the commit, everyone has done a round of heavy work for nothing. By adding a round of inquiry, if all replies are positive the transaction is more likely to commit successfully, which also means there is less risk of heavyweight work having to be rolled back because one participant crashed. Hence, in scenarios where the transaction ends up being rolled back, three-phase commit usually performs much better than two-phase commit; but in scenarios where the transaction commits normally both still perform poorly, and three-phase is even slightly worse because of the extra round of inquiry.

Also, because the probability of a rollback is smaller after a successful vote, in three-phase commit, if the coordinator goes down after the PreCommit phase, that is, a participant never receives the DoCommit message it is waiting for, the default strategy is for the participant to commit the transaction rather than roll back or keep waiting. This effectively avoids the coordinator's single point of failure. Figure 3-3 shows the operation sequence of three-phase commit.

Figure 3-3 Operation sequence of three-phase commit

As the process above shows, three-phase commit improves on the single point problem and on the performance of rollback scenarios, but it does not improve the consistency risk; in that respect it even slightly increases it. For example, after entering the PreCommit phase, suppose the instruction the coordinator issues is not an Ack but an Abort. If, because of network problems, some participants fail to receive the coordinator's Abort until they time out, those participants will mistakenly commit the transaction, producing inconsistent data between participants.

Distributed transaction

In this chapter, "Distributed Transaction" specifically refers to a transaction mechanism in which multiple services access multiple data sources at the same time. Note the difference from the DTP model of "distributed transactions": the DTP model's "distributed" is relative to data sources and does not involve services, as discussed in the section "Global transactions"; the "distributed" in this section is relative to services, and more rigorously should be called a "transaction mechanism in a distributed service environment".

CAP theory

  • Consistency: data is consistent with expectations at any time, on any node of the distributed system.

  • Availability: the system is able to provide service continuously, without interruption.

  • Partition Tolerance: when, in a distributed environment, some nodes lose contact with others for network reasons and a "network partition" forms, the system is still able to provide service correctly.

    Why can consistency, availability, and partition tolerance not all be satisfied at once?

    1. A small example

    First let’s look at a picture.

There are two nodes in the network, N1 and N2, which can communicate with each other. N1 runs an application A and a database V; N2 runs an application B and a database V. A and B are the two parts of the distributed system, and the two copies of V are its two sub-databases.

Now here is the question: two users, Xiao Ming and Xiao Hua, suddenly access N1 and N2 at the same time. Ideally the operation would look like this.

(1) Xiao Ming accesses node N1 and Xiao Hua accesses node N2, at the same time.

(2) Xiao Ming changes the data on node N1 from V0 to V1.

(3) As soon as N1 sees its data change, it immediately performs operation M and tells N2.

(4) What Xiao Hua reads is the latest data, the correct data.

This is the ideal situation; it satisfies all three properties of CAP theory.

2. Verifying CAP theory

Since there will always be errors in the system, let’s take a look at what can go wrong.

Suppose node N1 updates V0 to V1 and tries to tell node N2 through operation M, but a network failure occurs. Xiao Ming and Xiao Hua still need to access the data at the same time; what should the system do? If we still want the system to have all three CAP properties, let us analyze what happens.

(1) The network between A and B has failed, but A and B can each still be accessed, so the system has partition tolerance.

(2) Xiao Ming changed V0 to V1 on node N1. For Xiao Hua to read the correct value from node N2's database V, N2's data must be updated after the network failure is repaired; only then is consistency preserved.

(3) During that recovery period, however, the system cannot meet the availability requirement, because availability demands that the system be accessible correctly and promptly at any time, anywhere. Here lies the contradiction.

Now let us analyze the different impacts of giving up C, A, or P.

Giving up partition tolerance (CA without P) means that partitions are not allowed to happen and that communication between all nodes in the system is forever reliable; then consistency and availability could perhaps both be guaranteed. But perpetually reliable communication simply does not exist in a distributed system, and this is not a matter of choice: as long as the network is used to share data, partitions will always occur, so partition tolerance cannot be abandoned.

Giving up availability (CP without A) means accepting that, once a network partition occurs, the time needed to synchronize information between nodes may stretch indefinitely.

Giving up consistency (AP without C) means accepting that, once a partition occurs, the data provided by different nodes may be inconsistent.

At present, AP systems that give up consistency are the mainstream choice when designing distributed systems, because P is a natural property of distributed networks that can be neither wished for nor discarded, while A is usually the very purpose of building a distributed system: if availability fell as the number of nodes grew, many distributed systems would lose their reason to exist. Unless the business involves money, such as banking and securities services that would rather interrupt service than make mistakes, most systems cannot tolerate availability shrinking as nodes are added.

The original purpose of this chapter's topic, transactions, is to achieve "consistency", yet in a distributed environment consistency is precisely the property that often has to be sacrificed. Even so, we build information systems to ensure that the result is at least correct at the moment it is delivered, which means that data may be wrong (inconsistent) somewhere in the middle but should be corrected by the time the final output is produced. For this reason, consistency has been redefined: the consistency we discuss in CAP and ACID is called "strong consistency", sometimes also "linear consistency", while the practice of an AP system sacrificing C yet still trying to obtain a correct result is called the pursuit of "weak consistency". However, saying only "weak consistency" sounds rather like "no consistency guaranteed at all"; human language is a subtle thing. So a slightly stronger special case of weak consistency was given its own name, "Eventual Consistency": if the data has not been changed by further operations for a period of time, it will eventually reach the same result it would have under strong consistency. Algorithms for eventual consistency are sometimes called "optimistic replication algorithms".

In the topic of this section, distributed transactions, the goal likewise has to be lowered from the strong consistency pursued by the previously discussed transaction modes to the pursuit of "eventual consistency". With this change in the definition of consistency, the meaning of "transaction" has also been extended: transactions that follow ACID are called "rigid transactions", while the common approaches to distributed transactions introduced below are collectively called "flexible transactions".

Reliable event queue

The concept of eventual consistency comes from eBay systems architect Dan Pritchett's 2008 ACM paper "BASE: An ACID Alternative", which systematically summarizes an approach that uses BASE to achieve consistency as an alternative to the strong consistency obtained through ACID. BASE stands for Basically Available, Soft State, and Eventually Consistent. The name "BASE" indulges database scientists' fondness for acronym wordplay to the full, but given how catchy the ACID vs BASE pairing is, it certainly helped the paper's influence spread quickly. I will not say more about the BASE concepts here; although poking fun at the name is a guilty pleasure, the paper itself is very valuable, being both the conceptual origin of eventual consistency and a systematic summary of a technical approach to distributed transactions.

We continue with this chapter's scenario to explain "reliable event queues", a concrete approach given by Dan Pritchett. The goal is still to correctly modify the data in the account, warehouse, and merchant services within one transaction. Figure 3-7 shows the sequence diagram of the modification process.

Figure 3-7 Sequence diagram of the modification procedure

  1. An end user sends a purchase request to Fenix's Bookstore: one copy of Understanding the Java Virtual Machine in depth, paying 100 yuan.
  2. Fenix's Bookstore first makes a prior assessment of the error probability of the three operations, debiting the user's account, crediting the merchant's account, and shipping the goods from the warehouse, and orders them by that probability. Such an assessment is usually expressed directly in the program code, though some large systems may implement dynamic ordering. For example, statistics may show that the most likely anomaly is that the user buys a product but refuses the debit or has an insufficient balance; next, that the warehouse finds the stock insufficient to ship; and the lowest risk is the merchant's collection, since collection rarely goes wrong. The order should then put the most error-prone operation first, namely: debit account → ship from warehouse → credit merchant.
  3. The account service performs the debit. If the debit succeeds, it creates a message table in its own database and inserts a message: "transaction ID: some UUID; debit: 100 yuan (status: done); warehouse, ship one copy of Understanding the Java Virtual Machine in depth (status: in progress); credit merchant: 100 yuan (status: in progress)". Note that in this step the "debit" and the "write the message" are committed to the account service's own database in the same local transaction (a minimal sketch of this step follows the list).
  4. The system sets up a message service that polls the message table periodically and sends messages whose status is "in progress" to both the warehouse and the merchant service nodes (they could also be sent serially, the second only after the first succeeds, but that is unnecessary in the scenario we are discussing). The following situations may then occur.
    1. Both the merchant and the warehouse service complete the collection and the shipment successfully and return their results to the account service, and the account service updates the message status from "in progress" to "done". The whole transaction is declared successfully completed and eventual consistency is reached.
    2. At least one of the merchant or warehouse services fails to receive the message from the account service for network reasons. In that case, because the message stored in the account service stays in the "in progress" state, the message service keeps re-sending it to the unresponsive service on every poll. The repeatability of this step requires all messages sent by the message service to be idempotent; they are usually designed to carry a unique transaction ID so that the shipment and the collection in one transaction are each processed only once.
    3. Some or all of the merchant or warehouse services cannot complete their work, for example the warehouse finds that Understanding the Java Virtual Machine in depth is out of stock. The message then keeps being re-sent automatically until the operation succeeds (for example, new stock arrives) or a human intervenes. Thus, as long as the first step of a reliable event queue completes, there is no notion of a failure rollback afterwards: only success is possible, not failure.
    4. The merchant and warehouse services complete the collection and shipment successfully, but their reply messages are lost for network reasons. The account service will then send the message again, but because the operations are idempotent this does not cause duplicate collection or shipment; it only makes the merchant and warehouse services resend their replies. This repeats until network communication between the two sides recovers.
    5. Some message frameworks support distributed transactions natively, RocketMQ for example, in which cases 2 and 4 can also be handled by the message framework itself.
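The sketch below illustrates step 3, assuming a plain JDBC DataSource and a hypothetical message table (msg_queue, with invented column names). The only point it makes is that the debit and the "in progress" message are written in one and the same local transaction.

```java
// A minimal sketch of step 3 of the reliable event queue: the debit and the
// outgoing message either both commit or both roll back. Table and column
// names are invented for this example.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.UUID;
import javax.sql.DataSource;

public class ReliableEventSketch {

    public void payAndRecordMessage(DataSource ds, long userId, long bookId, int amount) throws Exception {
        try (Connection con = ds.getConnection()) {
            con.setAutoCommit(false);
            try (PreparedStatement debit = con.prepareStatement(
                         "UPDATE account SET balance = balance - ? WHERE user_id = ?");
                 PreparedStatement message = con.prepareStatement(
                         "INSERT INTO msg_queue (tx_id, book_id, amount, status) VALUES (?, ?, ?, 'IN_PROGRESS')")) {
                debit.setInt(1, amount);
                debit.setLong(2, userId);
                debit.executeUpdate();

                message.setString(1, UUID.randomUUID().toString());   // unique transaction ID for idempotence
                message.setLong(2, bookId);
                message.setInt(3, amount);
                message.executeUpdate();

                con.commit();      // debit and message succeed or fail together
            } catch (Exception e) {
                con.rollback();
                throw e;
            }
        }
    }
}
```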

Dan Pritchett was neither the first to achieve reliability through constant retrying, nor is the idea original to him; it has been used so often in other areas of computing that it has its own name, "Best-Effort Delivery". For example, in TCP, the reliability guarantee of automatically re-sending packets for which no ACK has been received is best-effort delivery. The reliable event queue also has a more common form, called "Best-Effort 1PC", which means completing the business most likely to fail as a local transaction and driving the other related operations in the same distributed transaction to completion by retrying (not necessarily through a message system).

TCC transaction

Another common distributed transaction mechanism is TCC, short for "Try-Confirm-Cancel", introduced by database expert Pat Helland in his 2007 paper Life Beyond Distributed Transactions: An Apostate's Opinion.

The reliable event queue just described guarantees that the final result is relatively reliable and the process is simple (compared with TCC), but it has no isolation whatsoever. In some businesses the lack of isolation does not matter; in others it causes a lot of trouble. In this chapter's scenario, for example, the lack of isolation leads to an obvious "overselling" problem: two customers could each, within a short window, buy the same goods, with neither buying more than the current stock on its own, yet together buying more than the total stock. If the business requires isolation, architects should usually focus on TCC, a scheme naturally suited to distributed transactions that need strong isolation.

In terms of concrete implementation, TCC is more complicated. It is a highly intrusive transaction scheme: it requires that the business process be split into two sub-processes, “reserving business resources” and “confirming/releasing the consumed resources”. As its name suggests, TCC is divided into the following three phases (a minimal participant interface is sketched after the list).

  • Try: the attempt phase. It completes all business feasibility checks (to ensure consistency) and reserves all the business resources that will be needed (to ensure isolation).
  • Confirm: the confirm phase. It performs no business checks and directly uses the resources reserved in the Try phase to complete the business work. The Confirm phase may be executed repeatedly, so its operations must be idempotent.
  • Cancel: the cancel phase. It releases the business resources reserved in the Try phase. The Cancel phase may also be executed repeatedly and must likewise be idempotent.
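These three phases map naturally onto a participant interface that every service taking part in the transaction implements and that a coordinator drives using the transaction ID from the activity log. A minimal sketch, with names that are illustrative assumptions rather than Seata’s actual API:

// Minimal sketch of a TCC participant. Confirm and cancel must be idempotent because the
// coordinator may retry them according to the activity log.
public interface TccParticipant {

    /** Try: check business feasibility and reserve (freeze) the required resources. */
    boolean tryReserve(String txId);

    /** Confirm: use the reserved resources to finish the business work; no checks, idempotent. */
    void confirm(String txId);

    /** Cancel: release the resources reserved in the Try phase; idempotent. */
    void cancel(String txId);
}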

For our scenario example, the TCC execution process should look like Figure 3-8.

Figure 3-8 TCC execution process

  1. The end user sends a transaction request to Fenix’s Bookstore: buy one copy of Understanding the Java Virtual Machine for 100 yuan.
  2. Create the transaction, generate a transaction ID, record it in the activity log, and enter the Try phase (a sketch of the user account service’s three phases is given after this list):
    • User account service: check business feasibility. If feasible, set 100 yuan of the user’s balance to the “frozen” state and notify the next step to enter the Confirm phase; if not, notify the next step to enter the Cancel phase.
    • Warehouse service: check business feasibility. If feasible, set one copy of Understanding the Java Virtual Machine in the warehouse to the “frozen” state and notify the next step to enter the Confirm phase; if not, notify the next step to enter the Cancel phase.
    • Merchant service: check business feasibility; no resources need to be frozen.
  3. If every service in Step 2 reports feasibility, record the status Confirm in the activity log and enter the Confirm phase:
    • User account service: complete the business operation (deduct the frozen 100 yuan).
    • Warehouse service: complete the business operation (mark the frozen copy as shipped and deduct the corresponding inventory).
    • Merchant service: complete the business operation (collect the 100 yuan).
  4. If all of Step 3 completes, the transaction is declared to have ended successfully. If any exception occurs in Step 3, the Confirm operations of the services are retried according to the records in the activity log, i.e. best-effort delivery.
  5. If any party in Step 2 reports that the business is not feasible, or any party times out, record the status Cancel in the activity log and enter the Cancel phase:
    • User account service: cancel the business operation (release the frozen 100 yuan).
    • Warehouse service: cancel the business operation (release the frozen copy).
    • Merchant service: cancel the business operation (and console the merchant that making a living is not easy).
  6. If all of Step 5 completes, the transaction is declared to have ended in failure and rollback. If any exception, whether a business exception or a network exception, occurs on any party in Step 5, the Cancel operations of the services are retried according to the records in the activity log, i.e. best-effort delivery.
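As a concrete illustration, the user account service’s side of the steps above could look roughly like the sketch below, implementing the TccParticipant interface sketched earlier with an in-memory balance and a per-transaction frozen amount (all of which are simplifying assumptions; a real implementation would use database rows and the activity log):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the user account participant: Try moves 100 yuan from the balance into a
// frozen amount keyed by transaction ID, Confirm consumes it, Cancel releases it.
// Remembering the transaction ID keeps all three operations idempotent under retries.
public class UserAccountTccParticipant implements TccParticipant {

    private static final long PRICE = 100_00;              // 100 yuan, in cents
    private long balance = 200_00;                          // example starting balance
    private final ConcurrentMap<String, Long> frozenByTx = new ConcurrentHashMap<>();

    @Override
    public synchronized boolean tryReserve(String txId) {
        if (frozenByTx.containsKey(txId)) {
            return true;                                    // repeated Try: already reserved
        }
        if (balance < PRICE) {
            return false;                                   // feasibility check fails, go to Cancel
        }
        balance -= PRICE;
        frozenByTx.put(txId, PRICE);                        // freeze 100 yuan for this transaction
        return true;
    }

    @Override
    public synchronized void confirm(String txId) {
        Long frozen = frozenByTx.remove(txId);              // null on a repeated Confirm: do nothing
        if (frozen != null) {
            System.out.printf("pay %d cents to the merchant for %s%n", frozen, txId);
        }
    }

    @Override
    public synchronized void cancel(String txId) {
        Long frozen = frozenByTx.remove(txId);              // null on a repeated Cancel: do nothing
        if (frozen != null) {
            balance += frozen;                              // release the frozen amount
        }
    }
}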

As you can see, TCC is similar to the prepare and commit phases of 2PC, but TCC lives in user code rather than in the infrastructure, which gives it the flexibility to design the granularity of resource locking as needed. During execution TCC only touches the reserved resources and involves almost no locking or resource contention, so it has high performance potential. But TCC is not all benefits: it also brings higher development costs and stronger intrusion into the business, which means both a higher cost to build it and a higher cost to replace it with another transaction scheme later. For this reason TCC is usually not implemented entirely by bare coding, but on top of distributed transaction middleware (such as Alibaba’s open-source Seata) to reduce some of the coding effort.

SAGA transaction

TCC transactions provide strong isolation, avoid the “overselling” problem, and generally have the highest performance among the flexible transaction patterns covered in this article, but they still cannot satisfy every scenario. The main limitation of TCC is its intrusiveness. This is not a repeat of the previous section’s point about the coding workload it requires; it refers to the constraints it places on technical controllability. For example, suppose our scenario changes as follows: because online payment is now so widespread in China, users and merchants in the bookstore system may choose not to open a top-up account inside the system at all, or at least are no longer forced to transfer money from the bank into the system before spending it; instead they can pay directly at purchase time with a bank U-key or by scanning a QR code, transferring the payment from their bank account. This requirement matches the prevailing state of online payment in China, but it adds an extra restriction to the system’s transaction design: if the account balances of users and merchants are managed by the bank, their operation permissions and data structures can no longer be defined arbitrarily, and operations such as freezing funds, unfreezing and deducting usually cannot be performed, because the bank generally will not cooperate with them. The first Try phase of TCC therefore often cannot work, and we have to consider another flexible transaction scheme: the SAGA transaction. “Saga” originally means a long story, a long narrative, a long series of events.

The SAGA transaction pattern has a long history, predating the concept of distributed transactions. It comes from a 1987 ACM paper by Hector Garcia-Molina and Kenneth Salem of Princeton University titled “SAGAS” (that is the paper’s full title). The paper proposes a way to improve the efficiency of “long lived transactions” by splitting a large transaction into a set of sub-transactions that can be run interleaved. SAGA was originally intended to avoid large transactions locking database resources for long periods, but it has since evolved into a design pattern for breaking a large transaction into a series of local transactions in a distributed environment. SAGA consists of two parts.

  • Split the large transaction into several small transactions: decompose the whole distributed transaction T into n sub-transactions, named T1, T2, …, Ti, …, Tn. Each sub-transaction should be, or can be treated as, an atomic behavior. If the distributed transaction can commit normally, its effect on the data (final consistency) should be equivalent to committing T1 through Tn sequentially and successfully.
  • Design a corresponding compensation action for each sub-transaction, named C1, C2, …, Ci, …, Cn. Ti and Ci must satisfy the following conditions:
    • Both Ti and Ci are idempotent.
    • Ti and Ci satisfy the commutative law (Commutative), i.e. the effect is the same whether Ti or Ci is executed first.
    • Ci must be guaranteed to commit successfully, that is, the possibility of Ci itself being rolled back is not considered; if Ci fails, it must be retried continuously until it succeeds, or a human must intervene.

If T1 through Tn all commit successfully, the transaction completes successfully; otherwise one of the following recovery strategies is adopted (a minimal orchestration sketch follows the list):

  • Forward Recovery: if sub-transaction Ti fails to commit, Ti is retried until it succeeds (best-effort delivery). This kind of recovery needs no compensation and suits transactions that must eventually succeed, such as shipping the goods after the payment has already been deducted from the buyer’s bank account. The execution pattern of forward recovery is: T1, T2, …, Ti (failed), Ti (retried), …, Ti+1, …, Tn.
  • Backward Recovery: if sub-transaction Ti fails to commit, Ci is executed to compensate Ti until it succeeds (best-effort delivery). The requirement here is that Ci must execute successfully (retrying continuously if necessary). The execution pattern of backward recovery is: T1, T2, …, Ti (failed), Ci (compensating Ti), …, C2, C1.
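A minimal coordinator sketch of these two strategies, under the assumption that every step exposes its action Ti and compensation Ci through a hypothetical Step interface; a real implementation would persist its progress in the SAGA Log described below rather than keep it in memory.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// Sketch of a SAGA coordinator: executes T1..Tn in order; on failure it either keeps
// retrying the failed step (forward recovery) or runs the compensations in reverse
// order, Ci..C1 (backward recovery). Compensations are retried until they succeed.
public class SagaCoordinator {

    public interface Step {
        boolean action();        // Ti: returns true when it commits successfully
        void compensate();       // Ci: idempotent, must eventually succeed
    }

    public void run(List<Step> steps, boolean forwardRecovery) {
        Deque<Step> committed = new ArrayDeque<>();
        for (Step step : steps) {
            if (step.action()) {
                committed.push(step);
            } else if (forwardRecovery) {
                retryUntilSuccess(step::action);           // T1, T2, ..., Ti(failed), Ti(retried), ..., Tn
                committed.push(step);
            } else {
                // Backward recovery: compensate the failed step, then every committed step in reverse.
                retryUntilSuccess(() -> { step.compensate(); return true; });
                while (!committed.isEmpty()) {
                    Step done = committed.pop();
                    retryUntilSuccess(() -> { done.compensate(); return true; });
                }
                return;
            }
        }
    }

    private void retryUntilSuccess(BooleanSupplier op) {
        while (!op.getAsBoolean()) {
            // best-effort delivery: keep retrying (a real system would back off, log, and alert)
        }
    }
}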

Compared with TCC, SAGA does not need to design frozen states and unfreezing operations for resources, and compensation operations are usually much easier to implement than freezing operations. Take the earlier example where account balances are kept directly at the bank and payment is transferred from the bank into Fenix’s Bookstore’s system: this step is a service provided by the bank, triggered by the user’s payment operation (scanning a code or using a U-key). If a later business operation fails, we cannot ask the bank to undo the user’s earlier transfer, but it is entirely feasible for Fenix’s Bookstore to transfer the money back to the user’s account as a compensation measure.

SAGA must ensure that all sub-transactions are either committed or compensated, but the SAGA system itself may also crash, so it must be designed with a database-like logging mechanism (called the SAGA Log) so that, after recovery, the system can track the progress of the sub-transactions, for example which steps have executed and which have been compensated. In addition, although compensation operations are usually easier to implement than freeze/cancel operations, ensuring that forward and backward recovery are carried out rigorously still takes considerable effort, for example by driving them through service orchestration or a reliable event queue. SAGA transactions are therefore generally not implemented by bare coding either, but on top of transaction middleware; Seata, mentioned earlier, also supports the SAGA transaction mode.

The idea of compensating data instead of rolling it back can also be applied to other transaction schemes. For example, the “AT transaction mode” proposed by Alibaba’s GTS (Global Transaction Service; Seata is derived from GTS) is one such application.

Overall, an AT transaction is an implementation modeled on the XA two-phase commit protocol, but with targeted remedies for XA 2PC’s weakness: in the prepare phase the coordinator must wait for every data source to report success before issuing the unified Commit command, which creates a bucket effect (all the locks and resources involved are held until the slowest transaction finishes and everything is released together). The general approach is: when business data is submitted, automatically intercept all SQL, save snapshots of the data before and after the modification, generate row locks, and commit them to the operated data source in the same local transaction as the business data, which is equivalent to automatically recording redo and undo logs. If the distributed transaction commits successfully, the corresponding log records in each data source can simply be cleaned up later; if it needs to roll back, “reverse SQL” is generated automatically from the log data for compensation. With this compensation approach, each data source involved in the distributed transaction can commit its local transaction individually, and locks and resources are released immediately. Compared with 2PC, this asynchronous commit mode greatly improves the system’s throughput. The price is a large loss of isolation, which even affects atomicity, because without isolation, compensation in place of rollback is not always guaranteed to succeed. For example, after a local transaction commits but before the distributed transaction completes, the data may be modified by other operations, which is a Dirty Write. Once the distributed transaction then needs to roll back, it is no longer possible to compensate via the automatically generated reverse SQL, and only human intervention can resolve it.
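A heavily simplified sketch of that idea: alongside each local commit, an undo-log record keeps the before- and after-images of the modified row, and a compensating “reverse SQL” is generated from the before-image if the global transaction later rolls back. The record layout and method here are illustrative only, not Seata’s actual undo-log format or API.

import java.util.Map;
import java.util.stream.Collectors;

// Sketch of the AT-style compensation idea: an undo-log record keeps the before-image
// of a modified row, committed in the same local transaction as the business change.
// If the global transaction must roll back, a reverse UPDATE is generated from it.
public record UndoLogRecord(String table,
                            String primaryKeyColumn,
                            Object primaryKeyValue,
                            Map<String, Object> beforeImage,
                            Map<String, Object> afterImage) {

    /** Builds the compensating SQL that restores the before-image of the row. */
    public String toReverseSql() {
        String assignments = beforeImage.entrySet().stream()
                .map(e -> e.getKey() + " = " + literal(e.getValue()))
                .collect(Collectors.joining(", "));
        return "UPDATE " + table + " SET " + assignments
                + " WHERE " + primaryKeyColumn + " = " + literal(primaryKeyValue);
    }

    private static String literal(Object value) {
        return value instanceof Number ? value.toString() : "'" + value + "'";
    }

    public static void main(String[] args) {
        UndoLogRecord record = new UndoLogRecord(
                "books", "id", 1,
                Map.of("quantity", 3),   // before-image captured when the SQL was intercepted
                Map.of("quantity", 2));  // after-image, usable to detect later dirty writes
        System.out.println(record.toReverseSql());
        // prints: UPDATE books SET quantity = 3 WHERE id = 1
    }
}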

In general, dirty writes must be avoided; even traditional relational databases at their lowest isolation level still use locks to prevent dirty writes, because dirty writes are very difficult to handle manually once they occur. GTS therefore adds a “Global Lock” mechanism to achieve write isolation: a local transaction may commit only after it has obtained the Global Lock for the records it modifies, and it must wait until the Global Lock is acquired. This design sacrifices some performance, but it prevents two distributed transactions from containing local transactions that modify the same data, and thus avoids dirty writes. As for read isolation, the default isolation level of AT transactions is Read Uncommitted, which means dirty reads may occur. The read isolation problem could also be solved with the Global Lock, but blocking reads directly is very expensive and is usually not done. In short, there is no all-in-one solution that solves every problem of distributed transactions; the only effective way is to choose the appropriate transaction processing scheme for the situation at hand.

Transaction FAQs

Overselling at the MVCC Read Committed level (type 2 lost update)

Type 2 lost update: a type 2 lost update occurs when two or more transactions query the same record and then each updates it based on the result of its own query. Each transaction is unaware of the others, and the changes made by the last transaction to commit overwrite the changes made by the earlier ones.

SELECT quantity FROM books WHERE id=1;    /* 1. transaction T1 */
SELECT quantity FROM books WHERE id=1;    /* 2. transaction T2 */
UPDATE books SET quantity=2 WHERE id=1;   /* 3. transaction T1 */
commit;
UPDATE books SET quantity=2 WHERE id=1;   /* 4. transaction T2 */
commit;

The problem in the overselling case above is that each transaction is unaware of the other transactions. The solution is:

  1. Add an exclusive lock to the SELECT
SELECT quantity FROM books WHERE id=1 FOR UPDATE;   /* 1. transaction T1 acquires an exclusive lock on the row */
SELECT quantity FROM books WHERE id=1 FOR UPDATE;   /* 2. transaction T2 blocks here until T1 commits */
UPDATE books SET quantity=2 WHERE id=1;             /* 3. transaction T1 */
commit;
UPDATE books SET quantity=2 WHERE id=1;             /* 4. transaction T2, now based on the committed value */
commit;

MySQL’s documentation describes SELECT … FOR UPDATE as follows:

For index records the search encounters, locks the rows and any associated index entries, the same as if you issued an UPDATE statement for those rows. Other transactions are blocked from updating those rows, from doing SELECT ... LOCK IN SHARE MODE, or from reading the data in certain transaction isolation levels. Consistent reads ignore any locks set on the records that exist in the read view. (Old versions of a record cannot be locked; they are reconstructed by applying undo logs on an in-memory copy of the record.)

SELECT … LOCK IN SHARE MODE is not used here to add a shared lock to the query, because if both transactions held a shared lock and then attempted the UPDATE, they would deadlock.