Column series: MySQL column series

The transaction

A Transaction is a logical processing unit in the execution of a database system. It can consist of a single SQL statement or a group of complex SQL statements. Either all operations in a transaction are modified or none are, which is the purpose of a transaction.

Prepare the following account table for later testing:

CREATE TABLE `account` (
  `id` bigint(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `card` varchar(60) NOT NULL COMMENT 'number',
  `balance` int(11) NOT NULL DEFAULT '0' COMMENT 'the balance'.PRIMARY KEY (`id`),
  UNIQUE KEY `account_u1` (`card`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Account sheet';
Copy the code

Transaction characteristics

As a standard, a transaction must satisfy all four properties, namely the ACID property of the transaction. It is these features that ensure the security of database transactions. However, database vendors may not strictly meet the ACID standard for transactions for various purposes.

  • Atomicity
  • Consistency
  • Isolation
  • “Durability”

In MySQL, MyISAM storage engine does not support transactions, InnoDB storage engine fully follows and satisfies the ACID properties of transactions under the default READ REPEATABLE(RR) isolation level, so subsequent research on transactions is based on InnoDB storage engine.

Atomicity

Atomicity refers to the fact that all operations in a database transaction are indivisible units. Only when all database operations in a transaction are successfully executed, the whole transaction is considered to be successful. If any SQL statement in the transaction fails to execute, the SQL statement that has successfully executed must also be destroyed, and the database state should be returned to the state before the transaction was executed.

For example, if there is A transfer operation, A transfers 100 to B:

BEGIN;
UPDATE account SET balance = balance - 100 WHERE card = "A";
UPDATE account SET balance = balance + 100 WHERE card = "B";
COMMIT;
Copy the code

The transfer operation must be an atomic operation, A minus 100, B plus 100, either successful or rolled back, no intermediate state, any SQL failure, rolled back to the state before the transaction.

What we think of as two SQL update statements is that at the database level, these two SQL statements involve many operations. As you learned from Buffer pools, you need to load the data page from disk into the Buffer Pool, update the page in memory, add the page to the Flush list, and Flush the dirty page at some point. If any of these steps fail (for example, if the database is down), you need to roll back.

So atomicity is to ensure that the entire operation of the database is atomic, and any step that fails is undone. InnoDB has a complex underlying mechanism to ensure atomicity of database operations, restoring operations to the way they were before they were executed, which we will explain in a later article.

Consistency

Consistency refers to the transaction that changes the database from one state to the next consistent state. The database integrity constraints are not broken before and after a transaction.

For example, the CARD field in the Account table is unique, and regardless of how the field is modified, the data in the Card field remains unique after a transaction is committed or rolled back. If it becomes non-unique, it breaks the consistency requirement of the transaction.

mysql> INSERT INTO account(card, balance) values ("A", 1000);
1062 - Duplicate entry 'A' for key 'account_u1'
Copy the code

To ensure the consistency of data in the database, there are two main aspects:

  • The database itself guarantees some consistency: the MySQL database itself can establish some constraints, such as creating primary keys for tables, unique indexes, foreign keys, declaring a column NOT NULL, etc.

  • The business layer ensures consistency: In more cases, constraints in specific business scenarios are complex, and constraints on database establishment will have a certain loss in database performance. So we often do consistency checks on data at the business code level.

Isolation

Isolation has other names, such as concurrency control, serializability, locking, etc. Transaction isolation requires that each read-write transaction object be separated from the other transaction’s operation object, that is, the transaction is not visible to other transactions until committed. This is usually achieved using locks.

Dirty reads, unrepeatable reads, and phantom reads can occur when multiple transactions are executing on the database at the same time, which we’ll discuss later.

“Durability”

Persistence requires that once a transaction is committed, its results be permanent. Even in the event of a failure such as an outage, the database can recover data.

It is important to note that persistence ensures high reliability of a transactional system, not high availability. The transaction itself guarantees the permanence of the result, and after the transaction is committed, all changes are permanent. However, if the database fails due to external factors such as disk corruption or natural disasters, all committed data may be lost. For the implementation of high availability, the transaction itself is not guaranteed, and some systems need to cooperate to complete it.

Transaction class

From the perspective of transaction theory, transactions can be divided into the following types:

  • Flat Transactions
  • Flat Transactions with Savepoints
  • Chained Transactions
  • Nested Transactions
  • Distributed Transactions

For InnoDB storage engine, it supports flat transactions, transactions with savepoints, chained transactions, and distributed transactions. There is no native support for nested transactions.

1. Flat transactions

Flat transactions are the simplest of the transaction types and the most frequently used. In flat transactions, all operations are at the same level, starting with BEGIN/START TRANSACTION and ending with COMMIT or ROLLBACK, and the operations in between are atomic.

Flat transactions with savepoints

Flat transactions with savepoints allow rollback to an earlier state in the same transaction during the execution of the transaction. Savepoints can be set up during a transaction to inform the system that it should remember the current state of the transaction, so that if an error occurs later, the transaction can return to that state at the Savepoint.

For flat transactions, a savepoint is implicitly set at the beginning of the transaction. Flat transactions have only one savepoint, so rollback can only be rolled back to the state at the beginning of the transaction.

TO create a SAVEPOINT, ROLLBACK TO SAVEPOINT.

3. Chain transactions

A chained transaction is a transaction that automatically passes the context to the next transaction at commit time. That is to say, the commit of one transaction is atomic to the start of the next transaction, and the next transaction can see the result of the previous transaction as if it were done in one transaction.

Chained transactions can be considered a variant of the savepoint pattern, except that flat transactions with savepoints can roll back to any correct savepoint, whereas rollback in chained transactions is limited to the current transaction.

MySQL chained transactions can be opened with SET Completion_type = 1, as illustrated below.

4. Nested transactions

A nested transaction is a hierarchical framework in which a top-level transaction controls transactions at each level. The transactions nested below the top level are called subtransactions, which control each local transformation. It doesn’t commit after the child commits, it doesn’t commit until the parent commits, the parent rolls back, it rolls back all the child transactions.

MySQL does not support nested transactions, but we can simulate serial nested transactions with transactions with savepoints.

5. Distributed transactions

A distributed transaction is usually a flat transaction that runs in a distributed environment and requires access to different nodes in the network based on where the data is located. There will be a special series to learn about distributed transactions later.

Transaction control statement

Open the transaction

You can use BEGIN [WORK] or START TRANSACTION; Displays to start a transaction.

In stored procedures, the MySQL database parser automatically recognizes BEGIN as BEGIN… END, so only a START TRANSACTION statement can be used to START a TRANSACTION in a stored procedure.

START TRANSACTION can be followed by several modifiers:

  • READ ONLY: Identifies that the current transaction is a read-only transaction in which database operations can only read data, not modify it.
  • READ WRITE: Identifies the current transaction as a read-write transaction in which database operations can read or modify data.
  • WITH CONSISTENT SNAPSHOT: Starts consistent read.

If the access mode for a transaction is not explicitly specified, the access mode for the transaction defaults to READ WRITE.

For example, after read-only transactions are enabled, data cannot be modified:

mysql> START TRANSACTION READ ONLY;
Query OK, 0 rows affected (0.00 sec)

mysql> UPDATE account SET balance = balance - 100 WHERE id = 100;
1792 - Cannot execute statement in a READ ONLY transaction.
Copy the code

Commit the transaction

You can use COMMIT [WORK] to display committed transactions with or without WORK.

1. Automatically commit transactions

By default on the MySQL command line, transactions are committed automatically, that is, after an SQL statement is executed, the COMMIT operation is automatically performed. To explicitly START a TRANSACTION, use the commands BEGIN or START TRANSACTION, or run the command SET Autocommit =O to disable automatic commit.

Commit transactions implicitly

When a TRANSACTION is started using START TRANSACTION or BEGIN, or the system variable autoCOMMIT is set to OFF, the TRANSACTION will not commit automatically. However, some database operations commit transactions implicitly and do not need to start a transaction.

Common statements that implicitly commit transactions include:

  • The data Definition Language (DDL) for defining or modifying database objects,CREATE, ALTER, DROP, ALTERAnd so on.
  • withSTART TRANSACTIONBEGINWhen a transaction is started, the previous transaction is automatically committed.
  • Other such asANALYZE TABLE, FLUSH, OPTIMIZE TABLE, REPAIR TABLEEtc statements also commit transactions implicitly.

3. Transaction commit type

We can control post-commit behavior with the parameter Completion_type, which has three values:

  • 0/NO_CHAINDefault for:NO_CHAIN, indicates that no operation is performed after the COMMIT.
  • 1/CHAIN: set to1orCHAINWhen,COMMITIs equivalent toCOMMIT AND CHAINTo automatically start a new transaction with the same isolation level immediately after the transaction is committed.
  • 2/RELEASE: set to2orRELEASEWhen,COMMITIs equivalent toCOMMIT AND RELEASEIs automatically disconnected from the server after the transaction is committed.

As you can see, completion_type defaults to NO_CHAIN(0) :

mysql> SHOW VARIABLES LIKE 'completion_type';
+-----------------+----------+
| Variable_name   | Value    |
+-----------------+----------+
| completion_type | NO_CHAIN |
+-----------------+----------+
Copy the code

The Completion_type parameter only affects transactions at the beginning of BEGIN or START TRANSACTION and at the end of COMMIT or ROLLBACK. It does not affect transactions that are automatically committed (AUTOCOMMIT=1).

  • whencompletion_typeSet to1when

This becomes the chain transaction described earlier. For example, in the following operation, a new transaction is started immediately after the COMMIT, so that “B” can be rolled back. Zero does not roll back. In the end, the query only inserts “A”.

COMMIT [WORK] AND CHAIN; To do this without setting Completion_type =1.

mysql> TRUNCATE account;
Query OK, 0 rows affected (0.01 sec)

mysql> SET completion_type = 1;
Query OK, 0 rows affected (0.00 sec)

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO account(card) VALUES ("A");
Query OK, 1 row affected (0.01 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO account(card) VALUES ("B");
Query OK, 1 row affected (0.00 sec)

mysql> ROLLBACK;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT * FROM account;
+----+------+---------+
| id | card | balance |
+----+------+---------+
|  1 | A    |       0 |
+----+------+---------+
Copy the code
  • whencompletion_typeSet to2when

In this case, the connection will be disconnected after the COMMIT operation, and a disconnection error will be reported after the operation. Some clients will automatically try to reconnect.

mysql> SET completion_type = 2;
Query OK, 0 rows affected (0.00 sec)

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO account(card) VALUES ("C");
Query OK, 1 row affected (0.00 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM account WHERE card = "C";
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    3
Current database: test

+----+------+---------+
| id | card | balance |
+----+------+---------+
|  1 | C    |       0 |
+----+------+---------+
1 row in set (0.00 sec)
Copy the code

Terminate the transaction

You can use ROLLBACK [WORK] to terminate transactions and undo ongoing uncommitted changes.

Note that the ROLLBACK statement is used to ROLLBACK the transaction manually. If the transaction encounters some error during execution and cannot continue, the transaction will be rolled back automatically.

The savepoint

The operation method of savepoints is as follows:

  • Create savepoint:SAVEPOINT <identifier>
  • Delete savepoint:RELEASE SAVEPOINT <identifier>
  • Rollback to savepoint:ROLLBACK TO [SAVEPOINT] <identifier>

Note that ROLLBACK TO SAVEPOINT simply rolls back TO a specified SAVEPOINT. It does not actually end a transaction. You also need TO explicitly run COMMIT or ROLLBACK at the end.

For example, the following operation,

mysql> TRUNCATE account;
Query OK, 0 rows affected (0.01 sec)

mysql> BEGIN;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO account(card) VALUES("A");
Query OK, 1 row affected (0.00 sec)

mysql> SAVEPOINT P1;
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO account(card) VALUES("B");
Query OK, 1 row affected (0.00 sec)

mysql> ROLLBACK TO P1;
Query OK, 0 rows affected (0.00 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM account;
+----+------+---------+
| id | card | balance |
+----+------+---------+
|  1 | A    |       0 |
+----+------+---------+
1 row in set (0.00 sec)
Copy the code

Concurrent transactions

Problems with concurrent transactions

Our business system is often multi-threaded concurrent execution of multiple transactions, database level will also be concurrent execution of multiple transactions, so the same data may be queried and modified. Since it is concurrent, like multithreading in Java, there are thread safety issues.

Concurrent transactions involve four problems: dirty writes, dirty reads, unrepeatable reads, and phantom reads. The problems are listed in order of severity: dirty write > dirty read > Unrepeatable read > phantom read.

Dirty read

If one transaction A reads uncommitted data that has been modified by another transaction B, the read from transaction A is A dirty read because the data read by transaction A is non-persistent.

For example, according to the following timeline: Transaction B updates the data, transaction A reads the uncommitted data of transaction B, but at time T5, transaction B rolls back this operation, and the data queried by transaction A is dirty data. If you continue to use this dirty data for services, there will be problems.

Timeline Session A Session B
t1 BEGIN; BEGIN;
t2 Query balance is 100; Query balance is 100;
t3 The balance increases by 100;
t4 Query is about 200;
t5 ROLLBACK;
t6 COMMIT;
t7 Transaction A reads uncommitted data from transaction B, causing dirty data to be read

Unrepeatable read

In the case of no dirty read, if a transaction reads the same data repeatedly and the data is inconsistent, it indicates that a non-repeatable read problem occurs. That is, the same data cannot be read repeatedly, violating the requirements of database transaction consistency.

For example, the following timeline is executed: in transaction A, the first query is 100, at this time, transaction B changes the balance and commits the transaction, the second query of transaction A reads the data that has been committed by transaction B. In the same transaction, the results of the two queries are inconsistent, which means that the data cannot be read repeatedly.

Timeline Session A Session B
t1 BEGIN; BEGIN;
t2 Query balance is 100; Query balance is 100;
t3 The balance increases by 100;
t4 COMMIT;
t5 Query balance is 200;
t6 COMMIT;
t7 Transaction A reads the data submitted by transaction B, but the same data is read repeatedly

In fact, unrepeatable reads are not a problem in some situations. For example, I want to be able to read the data immediately in one transaction when another transaction changes the data. That is unrepeatable. If you want multiple reads to be the same in a transaction, it is a repeatable read.

Phantom read

Phantom reading is when a transaction queries with the same criteria, and another transaction adds data, causing it to see data that was not there before.

For example, according to the timeline below, transaction A changed the balance of all accounts to 100, and then transaction B added A new account. As A result, transaction A queried again and found that there was another account with A balance of 0. There was no such record in the data updated before, which is unreal reading.

Timeline Session A Session B
t1 BEGIN; BEGIN;
t2 Update all account balances to 100;
t3 Create a new account with a balance of 0;
t4 COMMIT;
t5 Query found another account balance 0;
t6 COMMIT;
t7 As transaction B adds new data, transaction A still has data that has not been seen before

Dirty write

Dirty write is also known as data loss or update loss. In simple terms, the update operation of one transaction is overwritten by the update operation of another transaction, resulting in data inconsistency.

There are two types of conditions that can cause dirty writing:

1. The rollback of transaction A overwrites the changes committed by transaction B, resulting in the loss of the changes made by transaction B.

For example, in the timeline below, transaction A and B both started with 0 query balances, transaction B modified the balance by 200, and transaction A rolled back the balance to 0. Transaction B appears to have lost the modified data.

Timeline Session A Session B
t1 BEGIN; BEGIN;
t2 Query balance is 0; Query balance is 0;
t3 The balance increases by 100;
t4 The balance increases by 200;
t5 COMMIT;
t6 ROLLBACK;
t7 Query balance is 0; Query balance is 0;
t8 Data updated by transaction B is lost due to transaction A rollback

The InnoDB storage engine does not have this problem because InnoDB uses an exclusive lock to update data, so that no other transaction can modify and commit the data involved in transaction A while transaction A is still incomplete. For example, in the diagram above, when the execution balance of transaction B increases by 200, as transaction A modifies the same data and does not commit it, this data has been locked exclusively. Therefore, the modification of transaction B will be blocked and can only be modified after locking.

2. Transaction A overwrites the changes committed by transaction B, causing the changes of transaction B to be lost.

For example, according to the timeline below, transaction A and B start the query balance with 0, transaction B first increases by 200, and commits the transaction. Transaction A then increments the balance of 0 by 100 and commits the transaction. Finally, the balance is only 100, and the modification of transaction B is lost.

Timeline Session A Session B
t1 BEGIN; BEGIN;
t2 Query balance is 0; Query balance is 0;
t3 The balance increases by 200;
t4 COMMIT;
t5 The balance increases by 100;
t6 COMMIT;
t7 Query balance is 100; Query balance is 100;
t8 Transaction A overwrites the changes committed by transaction B, causing the changes of transaction B to be lost

There are two ways to prevent dirty writing from happening:

One is based on database pessimistic locking, which implements an exclusive lock using for UPDATE at the time of the query, ensuring that the data cannot be updated by other transactions at the end of the transaction. However, this can lead to performance degradation of concurrent updates.

SELECT * FROM account WHERE id = 1 FOR UPDATE;
Copy the code

The other is based on optimistic lock, you can add a version number field in the table, the version number will be found in the query, update with the version number as a condition, the update success is the same record, otherwise the update failure. If the update fails, an error such as “record does not exist or version is inconsistent” is returned, allowing the user to query and update again.

UPDATE account SET balance=balance+100, version=version+1 where id = 1 and version = 1
Copy the code

Transaction isolation level

SQL standard transaction isolation level

There are four problems with concurrent transactions: dirty writes, dirty reads, non-repeatability, and phantom reads. Among them, dirty write can be solved by optimistic locking or pessimistic locking, the remaining three problems are actually caused by database read consistency, which requires the database to provide a certain transaction isolation mechanism to solve, that is, transaction isolation.

The SQL standard defines four isolation levels:

  • READ UNCOMMITTED: Read unsubmitted, abbreviatedRU.
  • READ COMMITTED: Read submitted for shortRC.
  • REPEATABLE READCan be repeated for shortRR.
  • SERIALIZABLE: serializable.

Different isolation levels can solve some of the transaction problems, as shown in the table below.

  • READ UNCOMMITTED: Dirty reads, unrepeatable reads, and phantom reads may occur.
  • READ COMMITTED: Can not repeat read, magic read problem, will not happen dirty read problem.
  • REPEATABLE READ: Will happen magic read problem, will not happen dirty read, unrepeatable read problem.
  • SERIALIZABLE: Dirty reads, unrepeatable reads, and phantom reads will not occur.

The lower the isolation level, the more serious the problem may be. The higher the isolation level, the lower the performance of concurrency. However, few database vendors follow these SQL standards. Oracle databases, for example, do not support the transaction isolation levels of READ UNCOMMITTED and REPEATABLE READ.

MySQL transaction isolation level

The InnoDB storage engine supports four isolation levels of the SQL standard, but InnoDB can avoid phantom READ problems with REPEATABLE READ isolation.

The default isolation level of MySQL is REPEATABLE READ. You can change the isolation level of a transaction by using the following statement:

SET [GLOBAL|SESSION] TRANSACTION ISOLATION LEVEL <level>; Level Optional value: READ UNCOMMITTED READ COMMITTED REPEATABLE READ SERIALIZABLECopy the code

If you want to change the default isolation level for transactions when the server starts, you can add the parameter transaction-Isolation under [mysqld].

[mysqld]
transaction-isolation = READ-COMMITTED
Copy the code

To view the transaction isolation level of the current session:

mysql> SELECT @@tx_isolation;
+-----------------+
| @@tx_isolation  |
+-----------------+
| REPEATABLE-READ |
+-----------------+
Copy the code

To view the transaction isolation level globally:

mysql> SELECT @@global.tx_isolation;
+-----------------------+
| @@global.tx_isolation |
+-----------------------+
| REPEATABLE-READ       |
+-----------------------+
Copy the code

Transaction feature Implementation

There is a complex set of mechanisms underlying the database to implement the ACID properties of transactions, which are described briefly in this section and will be covered in several separate articles.

1. Persistence (D)

Transaction persistence is achieved through the redo log of the database, called the redo log. When updating data pages in the Buffer Pool, the redo log is also recorded. In this way, when MySQL restarts, data can be recovered using the redo log even if the dirty pages are not flushed.

2. Atomicity (A)

Atomicity of transactions is achieved through the database’s Undo log, called the undo log or rollback log. When adding, deleting, or modifying a transaction, the corresponding Undo log is recorded.

  • When deleting a piece of data, it records the information about the data, and when rolling back, it inserts the old data
  • When a data item is updated, the old value is recorded, and when a data item is rolled back, the update operation is performed based on the old value
  • When an insert is performed, the primary key of the record is used; when a rollback is performed, the delete operation is performed based on the primary key

The undo log version chain is also used to implement multi-version concurrency control (MVCC). InnoDB’s RC and RR isolation levels are based on MVCC to achieve high performance transactions, and MVCC is used to avoid phantom reading.

3. Isolation (I)

The isolation of transactions is realized by locks. Different locking methods can realize different transaction isolation mechanisms.

4. Consistency (C)

Transaction consistency requires two layers:

  • Database level: The database must be implementedAIDThree features make consistency possible. For example, atomicity is not guaranteed, and obviously consistency is not guaranteed either.
  • Application level: If you deliberately write code in a transaction that violates the constraints, consistency is not guaranteed. The application layer should determine whether the database data is valid through the code and then decide whether to roll back or commit the data.

As you can see, atomicity, persistence, and isolation are the means of guaranteeing persistence at the database level. Therefore, we will study atomicity, persistence, and isolation in a separate article.