Welcome to github.com/hsfxuebao/j… , hope to help you, if you feel ok, please click on the Star

1, an overview of the

A transaction can consist of a very simple SQL statement or a complex set of SQL statements. A transaction is a program execution unit that accesses and updates various data items in a database. Operations in a transaction are either modified or not, which is the purpose of the transaction and one of the important characteristics that distinguishes the transaction model from the file system. In theory, a transaction is very strictly defined and must satisfy four properties, commonly known as the ACID property of a transaction. It is important to note that although strict transaction requirements are theoretically defined, database vendors do not strictly meet the ACID standard for transactions for various purposes. For example, for MySQL’s NDB Cluster engine, although it supports transactions, it does not meet the requirements of D, that is, persistence. For Oracle databases, the default transaction isolation level is READ COMMITTED, which does not meet the requirement of I (isolation). While in most cases this will not result in serious results and may even lead to performance gains, users first need to be aware of rigorous transaction standards and avoid potential problems in real-world production applications. For the InnoDB storage engine, the default transaction isolation level is READ REPEATABLE, which fully follows and satisfies the ACID characteristics of transactions. Here, the ACID properties of transactions are introduced and related concepts are given. A, Atomicity. In computer systems, everyone takes atomicity for granted. For example, if you call SQRT in C, it will either return the correct square root value or the wrong code, without changing any data structures or parameters in unexpected circumstances. If the SQRT function is called by many programs, the return value of one program will not be the square root computed by other programs. However, the atomicity of invoking operations in data transactions is not so obvious. For example, a user withdraws money from an ATM. Assume that the withdrawal process is as follows:

  1. Log in to the ATM platform and verify the password.
  2. Get account information from the remote bank’s database
  3. The user enters the amount to withdraw at the ATM.
  4. Update account information from the remote bank’s database.
  5. ATM machine out
  6. The user withdraws money.

The entire withdrawal process should be treated as an atomic operation, that is, do it all or do it none. The user’s money has not been obtained from the ATM, but the money on the bank card has been deducted, I believe that this is not acceptable to anyone. The atomicity of the operation can be guaranteed by the object model. Atomicity means that the entire database transaction is an indivisible unit of work. The entire transaction succeeds only if all database operations in the transaction succeed. If any SQL statement in the transaction fails to execute, the SQL statement that has successfully executed must also be destroyed, and the database state should be returned to the state before the transaction was executed. If the operations in a transaction are read-only, it is easy to maintain atomicity. Once any errors occur, either retry or return the error code. Because a read-only operation does not change any relevant part of the system, it may not be as simple as a read-only operation when an operation in a transaction needs to change the state of the system, such as inserting or updating records. If the operation fails, it is likely to cause a state change, so you must protect concurrent users of the system from accessing the affected part of the data. C(consistency). Consistency refers to the transaction that changes the database from one state to the next consistent state. The database integrity constraints are not broken before and after a transaction. For example, if a field in a table is a name, the constraint is unique, that is, the name cannot be repeated in the table. If a transaction modifies the name field, but the names in the table become non-unique after the transaction commits or the transaction operation is rolled back, this breaks the consistency requirement of the transaction, that is, the transaction changes the database from one state to an inconsistent state. Therefore, a transaction is a unit of consistency, and if an action in a transaction fails, the system can automatically undo the transaction – returning to its initialized state. L (isolation), isolation. There are other names for isolation, such as concurrency Control, serializability, locking, etc. Transaction isolation requires that each read-write transaction object be separated from the other transaction’s operation object, that is, the transaction is not visible to other transactions until committed. This is usually achieved using locks. The current database system provides a granular lock strategy, which allows transactions to lock only a subset of entity objects to improve the concurrency between transactions. D(durability). Once a transaction is committed, the result is permanent. Even in the event of a failure such as an outage, the database can recover data. It is important to note that the permanence of the result can only be guaranteed from the perspective of the transaction itself. For example, after a transaction is committed, all changes are permanent. Even when the database needs to be recovered due to a crash, it is guaranteed that no data committed after recovery will be lost. However, if the database is faulty due to external causes, such as RAID card damage or natural disasters, all submitted data may be lost. Therefore, persistence guarantees High Reliability, not High Availability, of the transaction system. For the implementation of high availability, the transaction itself is not guaranteed, and some systems need to cooperate to complete it.

2, classification,

From the perspective of transaction theory, transactions can be divided into the following types:

  • Flat Transactions
  • Flat Transactions with Savepoints
  • Chained Transactions
  • Nested Transactions
  • Distributed Transactions

Flat Transaction is one of the simplest Transaction types, but it is probably the most frequently used Transaction in a real production environment. In a flat transaction, all operations are at the same level, starting with BEGIN WORK and ending with COMMIT WORK or ROLLBACK WORK. The operations in between are atomic and are either executed or rolled back. So flat transactions are the basic building blocks for applications to become atomic operations. The following figure shows three different results for flat transactions.

The figure shows three cases of flat transactions and the approximate percentage of each result in a typical transaction processing application. Again, flat transactions, while simple, are most frequently used in real production environments. Because of its simplicity and frequency of use, flat transactions are supported by every database system. The main limitation of flat transactions is that you cannot commit or roll back a portion of the transaction, or commit in several steps. Here is an example where flat transactions are not enough to support. For example, users can make their own travel plans on travel websites. The user imagines a trip from Hangzhou to Florence, Italy, where there is no direct flight between the two cities, requiring the user to book and transfer to another flight, or to take a train and wait. The user’s transaction for booking travel and vacation is: S2: Take a plane from Hangzhou to Shanghai, book a flight to Milan S3: Transfer a train from Milan to Florence, book a train to Florence, but when the user performs S3, he finds that due to the late arrival time of the plane in Milan, there is no train available on the same day. The user wants to stay in Milan for one night and head to Florence the next day. If the transaction is flat, then the previous three operations S1, S2, and S3 need to be rolled back, which is a bit expensive. Because when the transaction is repeated, the execution plan for S1 and S2 is unchanged. That is, if planned rollback operations are supported, there is no need to terminate the entire transaction. Hence flat transactions with savepoints. Flat Transactions with Savepoint, in addition to supporting operations supported by Flat Transactions, allows rollback to an earlier state in the same transaction during the execution of the transaction. This is because some transaction may go wrong during execution without invalidating all operations, and it is not desirable or expensive to abandon the entire transaction. Savepoints are used to inform the system that it should remember the current state of the transaction so that if an error occurs later, the transaction can return to that state at the Savepoint.

For flat transactions, a savepoint is implicitly set. However, there is only this savepoint in the entire transaction, so rollback can only be rolled back to the state at the beginning of the transaction. A savepoint is established using the SAVE WORK function to inform the system to record the current processing status. When something goes wrong, savepoints can be used as an internal restart point, depending on the application logic, to decide whether to go back to the most recent savepoint or some other earlier savepoint. The following figure shows the use of savepoints in a transaction.

The figure above shows how to use savepoints in a transaction. The operations in gray background indicate the operations that have not been performed due to ROLLBACK WORK. When a transaction is started with BEGIN WORK, a savepoint is implicitly included. When the transaction issues a partial ROLLBACK command through ROLLBACK WORK:2, the transaction is rolled back to savepoint 2, which is then executed, and again to ROLLBACK WORK:7, until the final COMMIT The WORK operation, which indicates that the transaction has ended and all operations have been performed and committed except those in the gray shaded area.

Another point to note is that savepoints are incremented within a transaction, as you can see from Figure 72. One might think that after going back to savepoint 2, the next savepoint would be 3, since all the previous work has been terminated. The new savepoint number, however, is 5, which means that ROLLBACK does not affect the count of savepoints, and that monotonically increasing numbers maintain the entire history of transaction execution, including changes of mind during execution.

Also, when a transaction issues partial ROLLBACK commands through the ROLLBACK WORK:2 command, remember that the transaction is not completely rolled back, just to savepoint 2. This indicates that the transaction is still active. If you want to ROLLBACK the transaction completely, you need to execute the ROLLBACK WORK command.

Chained TransactionCan be regarded as a variation of savepoint pattern. Flat transactions with savepoints disappear in the event of a system crash because their savepoints are volatile rather than persistent. This means that when a restore occurs, the transaction needs to be re-executed from the beginning and cannot continue from the most recent savepoint.

The idea of chained transactions is that when a transaction commits, unneeded data objects are released, implicitly passing the necessary processing context to the next transaction to be started. Note that the commit transaction and the start of the next transaction are combined into one atomic operation. This means that the next transaction will see the result of the previous transaction as if it had been done in one transaction. The following diagram shows how chain transactions work:

Chained transactions differ from flat transactions with savepoints in that flat transactions with savepoints can be rolled back to any correct savepoint. Rollback in chained transactions is limited to the current transaction, that is, only to the most recent savepoint. Locks are also handled differently. Chained transactions release locks held by the current transaction upon a COMMIT, while flat transactions with savepoints do not affect locks held so far.

Nested transactions is a hierarchical framework. A top level transaction controls all levels of transactions. The nested transactions below the top level are called subtransactions, which control each local transformation. The hierarchy of nested transactions is shown in figure 1.

Moss’s definition of nested transactions is given below

  1. A nested transaction is a tree composed of several transactions, and the subtree can be either nested transactions or flat transactions.
  2. A transaction at a leaf node is a flat transaction. However, the distance from the root to the leaf node can be different for each subtransaction.
  3. Transactions at the root node are called top-level transactions, and other transactions are called subtransactions. Predecessor of transactions (parent transactions), the next layer of transactions is called son transactions (CHID)
  4. Subtransactions can be either committed or rolled back. But its commit operation does not take effect immediately unless its parent transaction has already committed. It follows, therefore, that any child is committed only after the top-level transaction is committed.
  5. The rollback of any transaction in the tree will cause all its sub-transactions to be rolled back together. Therefore, sub-transactions only retain the features of A, C, and I, but do not have the features of D.

In Moss’s theory, the actual work is done by the leaf node, that is, only the transactions of the leaf node can access the database, send messages, and obtain other types of resources. The higher-level transactions are only responsible for logical control, deciding when to invoke the related subtransactions. Even if a system does not support nested transactions, users can simulate nested transactions using savepoint techniques, as shown in the figure below.

As you can also see from Figure 7-5, using savepoint techniques for recovery provides more flexibility than nested queries. For example, when completing the Tk3 transaction, you can roll back to the state of savepoint S2. In a nested query hierarchy, this is not allowed. However, using savepoint techniques to simulate nested transactions is somewhat different from nested queries in terms of lock holding. When using savepoint techniques to simulate nested transactions, the user cannot choose which locks need to be inherited by the quilt transaction and which locks need to be retained by the parent transaction. This means that no matter how many savepoints there are, all locked objects can be retrieved and accessed. In nested queries, different subtransactions hold different locks on database objects. For example, if you have a parent transaction P1 that holds exclusive locks on objects X and Y, and you want to start a call to child transaction P1, then the parent transaction P1 can either pass no locks, pass all locks, or pass only one exclusive lock. When the child transaction P1 also holds the exclusive locks of object Z, the parent transaction P1 holds the exclusive locks of X, Y, and Z through counter-inheritance. If a subtransaction P2 is called again, it can choose to pass the lock already held there. However, if the system supports parallel execution of individual sub-transactions within nested transactions, in this case, flat transactions with savepoints to simulate nested transactions are impractical. This is another reflection of the need for truly supported nested transactions in order to achieve parallelism between transactions. Distributed Transactions are typically flat Transactions that run in a Distributed environment and therefore need to access different nodes in the network based on where the data is located. For example, the cardholder transfers 10,000 yuan from the savings card of China Merchants Bank to the savings card of Industrial and Commercial Bank of China. In this case, the ATM machine can be regarded as node A, the background database of China Merchants Bank as node B, and the background database of INDUSTRIAL and Commercial Bank of China as C. This transfer operation can be divided into the following steps:

  1. Node A sends the transfer command
  2. Node B subtracts 10000 from the balance in the savings card.
  3. Node C executes the balance in the savings card plus 10000
  4. Node A notifies the user that the operation is complete or node A notifies the user that the operation fails.

A distributed transaction is required because node A cannot call A single database to complete the task. It needs to access the databases of two nodes in the network, and the transactions performed in the databases of each node are flat. For distributed transactions, ACID properties are also required, and either all occur or all fail. For the example above, if any of the steps 2) or 3) fails, the entire distributed transaction will be rolled back. Otherwise, the results would have been terrible. For InnoDB storage engine, it supports flat transactions, transactions with savepoints, chained transactions, and distributed transactions. Nested transactions are not natively supported, so the MySQL database or InnoDB storage engine is helpless for users who need parallel transactions. However, users can still simulate serial nested transactions through transactions with savepoints.