How to solve magic read in MySQL (MVCC)

1. What is MVCC

The transaction and isolation levels in MySQL have been covered in detail in previous articles, and MVCC is designed to efficiently avoid the problems (dirty reads, unrepeatable reads, phantom reads) caused by concurrent access to the database while minimizing the use of locks.

MySQL isolation levels

Isolation level	Dirty read	Unrepeatable read	Phantom read
READ UNCOMMITTED: Indicates that the READ is not committed	May occur	May occur	May occur
READ COMMITTED: Indicates that the READ is COMMITTED	To solve	May occur	May occur
REPEATABLE READ: REPEATABLE READ	To solve	To solve	May occur
SERIALIZABLE: SERIALIZABLE	To solve	To solve	To solve

Concurrency Control MVCC stands for Multi-version Concurrency Control. Our goal is to improve the Concurrency performance of a database.

When read/write requests occur on the same row, the lock is blocked. However, MVCC has a better way to handle read/write requests without locking them in case of read/write conflicts.

This read refers to snapshot read, not current read. Current read is a pessimistic lock operation.

So how does it do that? What do snapshot reads and current reads mean?

2. Snapshot read and current read

Read the snapshot

Snapshot read reads snapshot data. SELECT without lock belongs to snapshot read.
```
SELECT * FROM table WHERE ...
Copy the code
```
The current reading

The current read is used to read the latest data, not the historical data. Lock SELECT, or adding, deleting, or modifying data, is used to read the current data.
```
SELECT * FROM table LOCK IN SHARE MODE;
SELECT FROM table FOR UPDATE;
INSERT INTO table values ...
DELETE FROM table WHERE ...
UPDATE table SET ...
Copy the code
```

3. Why MVCC

In the database concurrency scenario, only read-read operations can be executed concurrently. Read-write, writ-read, and writ-write operations must be blocked, resulting in poor MySQL concurrency performance.

With the MVCC mechanism, only the write operation can block each other, and the other three operations can be parallel, which can improve the concurrency performance of MySQL.

In other words, MVCC specifically solves the following problems:

Concurrent read-write: The read operation does not block the write operation, and the write operation does not block the read operation.
To solveDirty read,Phantom read,Unrepeatable readTransaction isolation, but does not solve the aboveWrite -(need to lock) problem.

4. Principle of MVCC mechanism

Its realization principle is mainly version chain, undo log, Read View to achieve.

Version 4.1 of the chain

The InnoDB storage engine previously introduced the row format of data pages. For tables that use it, the clustered index in the table contains three hidden columns:

The column name	Whether must	instructions
row_id	no	A table created with a primary key or a non-null UNIQUE key will not contain the ROW_ID column
trx_id	is	Transaction ID, which is assigned to the trX_ID hidden column every time a transaction changes a clustered index record
roll_pointer	is	The rollback pointer writes the old version to the Undo log every time a change is made to a clustered index record, and the hidden column acts as a pointer to find information about the record before the change

Here’s a chart:

CREATE TABLE 'user' (' id 'bigint NOT NULL COMMENT '主键',' name 'varchar(20) DEFAULT NULL COMMENT '主键', 'sex' char(1) DEFAULT NULL COMMENT 'gender ',' age 'varchar(10) DEFAULT NULL COMMENT' age ', 'url' varchar(40) DEFAULT NULL, PRIMARY KEY (`id`), KEY `suf_index_url` (`name`(3)) USING BTREE ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;Copy the code

Insert the following data:

INSERT INTO `user` (`id`, `name`, `sex`, `age`, `url`) 
VALUES ('1', 'ayue', '1', '18', 'https://javatv.net');
Copy the code

Assuming the transaction ID of the record is 60, the diagram for the record at this point looks like this:

Suppose two transactions with id 80 and ID 120 UPDATE this record, and the operation flow is as follows:

Trx 80	Trx 120
BEGIN
	BEGIN
UPDATE user SET name = ‘a’ where id = ‘1’
UPDATE user SET name = ‘y’ where id = ‘1’
COMMIT
	UPDATE user SET name = ‘u’ where id = ‘1’
	UPDATE user SET name = ‘e’ where id = ‘1’
	COMMIT

An undo log is logged every time a change is made to the record, and each undo log has a roll_pointer attribute (the undo log for the INSERT operation does not have this attribute because the record does not have an older version). You can concatenate these undo logs. A linked list, so now it looks like this:

After each update to the record, the old value will be put into an undo log. As the number of updates increases, all versions of the record will be connected into a linked list by the roll_pointer attribute. This list is called the version chain, and the first node of the version chain is the latest value of the current record.

In addition, each version contains the transaction ID for which the version was generated. The version chain of this record can then be used to control the behavior of concurrent transactions accessing the same record. This mechanism is called multi-version concurrency control (MVCC).

4.2 the undo log

The undo log is used to record logs before data modification. Data is copied to the Undo log before table information modification. When a transaction is rolled back, data can be restored using the log in undo log. (MySQL log)

The use of Undo log:

ensureThe transactionforrollbackAt the time of theAtomicity and consistency, when the transaction is going onThe rollbackYou can use the undo log datarestore.
Used for MVCCRead the snapshotData, in MVCC multi-version control, by readingundo logtheHistorical version DataCan be implementedDifferent transaction version numbersAll have their ownIndependent snapshot data version.

Undo logs are classified into two types:

insert undo log

The undo log, which represents the transaction generated when a new record is inserted, is only needed when a transaction is rolled back and can be discarded immediately after a transaction is committed.
update undo log

Undo logs generated during update or DELETE transactions are required not only for transaction rollback but also for snapshot reads. The purge thread will purge the log only if the log is not involved in a quick read or transaction rollback.

4.3 ReadView

As mentioned above, changes are stored in the undo log. If a log needs to query the row record, which version of the row record should be read?

1️ for transactions using READ UNCOMMITTED isolation level, it is good to directly READ the latest version of the record since the modified records of UNCOMMITTED transactions can be READ.

2️ for transactions using SERIALIZABLE isolation level, InnoDB uses locking to access records without concurrency problems.

3️ For transactions using READ COMMITTED and REPEATABLE READ isolation levels, the modified records of the COMMITTED transactions must be guaranteed to be READ, that is to say, if another transaction has modified the records but has not yet COMMITTED, the latest version of the records cannot be READ directly.

The core question is: what is the difference between READ COMMITTED and REPEATABLE READ isolation levels on unrepeatable reads and phantom reads? The two isolation levels correspond to non-repeatable reads and phantom reads, which refer to the case where the same transaction reads the record twice. The key point of these two isolation levels is to determine which version in the version chain is visible to the current transaction.

ReadView is designed to solve this problem and help us solve the visibility problem. The Read View is generated when a transaction does a snapshot Read, which holds a list of all transactions that were active when the current transaction was started.

Note: Active here refers to uncommitted transactions.

When each transaction is started, a ReadView is generated to record some content. The ReadView contains four important properties:

attribute	instructions
m_ids	A list of transaction ids of the read and write transactions that are active on the system when the ReadView is generated
min_trx_id	The smallest transaction ID of the active read/write transaction in the system when the ReadView is generated is the minimum value in m_IDS
max_trx_id	The id value in the system that should be assigned to the next transaction when the ReadView is generated
creator_trx_id	The transaction ID of the transaction that generated the ReadView, specifying which transaction the current ReadView belongs to

Where max_trx_id does not refer to the maximum value in m_IDS, because transaction ids are incrementally allocated, if there are three transactions with IDS 1, 2, and 3, and then a transaction with ID 3 commits. When a new read transaction generates a ReadView, m_ids will include 1 and 2, min_trx_id will be 1, max_trx_id will be 4.

With ReadView, when accessing a record, just follow these steps to determine if a version of the record is visible:

Trx_id = creator_trx_id, accessible

If the value of the accessed version’s trx_id attribute is the same as the creator_trx_id value in ReadView, it means that the current transaction is accessing its own modified record, so the version can be accessed by the current transaction.
Trx_id < min_trx_id, accessible

If the value of the accessed version’s trx_id attribute is less than the value of min_trx_id in ReadView, the transaction that generated the version was committed before the current transaction generated the ReadView, so the version can be accessed by the current transaction.
Trx_id >= max_trx_id is inaccessible

If the value of the accessed version’s trx_id attribute is greater than or equal to the value of max_trx_id in ReadView, the transaction that generated the version was started after the current transaction generated the ReadView, so the version cannot be accessed by the current transaction.
Min_trx_id <= trx_id < max_trx_id, the m_IDS list is inaccessible

If the trx_id attribute value of the accessed version is between min_trx_id and max_trx_id of ReadView, then check whether the trx_id attribute value is in the m_IDS list. When the ReadView is created, the transaction that generated this version is still active and cannot be accessed. If not, the transaction that generated the version of the ReadView when it was created has been committed and the version can be accessed.
A version of data is not visible to the current transaction

If one version of the data is not visible to the current transaction, go down the version chain to the next version of the data, continue the above steps to determine the visibility, and so on, until the last version in the chain. If the last version is also not visible, it means that the record is not visible to the transaction at all and is not included in the query result.

One big difference between READ COMMITTED and REPEATABLE READ isolation levels in MySQL is when they generate readViews.

4.4 the RC and ReadView

A ReadView is generated each time data is READ at the READ COMMITTED level.

Let’s say two transactions with the COMMITTED isolation level of 80 and 120 are running on the system at the same time.

1 ️ ⃣ trx_id = 80

# SHOW VARIABLES LIKE 'transaction_isolation'; SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN UPDATE 'user' SET 'name' = 'a' WHERE id = 1; UPDATE `user` SET `name` = 'y' WHERE id = 1;Copy the code

At this point, the version linked list for the records with ID 1 in table user looks like this:

Without committing, use a transaction at the READ COMMITTED isolation level to query:

# SHOW VARIABLES LIKE 'transaction_isolation'; SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN SELECT * FROM 'user' WHERE id = 1;Copy the code

Obviously, the result is ayue.

The execution of this query statement is as follows:

The m_ids list of the ReadView is [80, 120], min_trx_id is 80, max_trx_id is 121. So creator_trx_id is not generated, so we can default it to 0.

In the m_IDS list, trx_id = 80, does not meet the visibility requirements, so according to roll_pointer jump to the next version.

In the next version, the value of name is a, trx_id = 80, which still does not meet the visibility requirements, so skip to the next version according to roll_pointer.

In this version, the value of name is ayue and trx_id = 60, which meets the visibility requirement trx_id < min_trx_id. Therefore, the version returned to the user is the record whose name is ayue.

Select * from user where id = 1 and trx_id = 120; select * from user where id = 1 and trx_id = 120;

2 ️ ⃣ trx_id = 120

# SHOW VARIABLES LIKE 'transaction_isolation'; SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN UPDATE 'user' SET 'name' = 'u' WHERE id = 1; UPDATE `user` SET `name` = 'e' WHERE id = 1;Copy the code

At this point, the version linked list for the records with ID 1 in table user looks like this:

Trx_id = 80 COMMITTED; trx_id = 120 uncommitted;

# SHOW VARIABLES LIKE 'transaction_isolation'; SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN SELECT * FROM 'user' WHERE id = 1;Copy the code

Obviously, this is y.

In this case, the SELECT statement will generate a separate ReadView with m_ids of [120]. Trx_id = 80 has already been committed, so it will not exist when the snapshot is generated again.

Similarly, according to visibility requirements, the read data is name = y when the transaction with TRx_id = 120 is not committed.

When the transaction trx_id = 120 is also committed, the query returns the correct result.

In summary, transactions using the READ COMMITTED isolation level generate a separate ReadView at the start of each query.

To sum up:

READ COMMITTED, can result in unrepeatable reads, as in the above test:

The transactiontrx_id = 80andtrx_id = 120At the same time,trx_id = 80submitted

Overall process:

sequence	trx_id = 80	select_80	trx_id = 120	select_120
1	BEGIN
2		BEGIN
3			BEGIN
4				BEGIN
5	UPDATE user SET name = ‘a’ where id = 1
6	UPDATE user SET name = ‘y’ where id = 1
7		SELECT * FROM user WHERE id = 1
8	COMMIT
9		SELECT * FROM user WHERE id = 1
10			UPDATE user SET name = ‘u’ where id = 1
11			UPDATE user SET name = ‘e’ where id = 1
12				SELECT * FROM user WHERE id = 1
13			COMMIT
14				SELECT * FROM user WHERE id = 1

Note: The above tests are at the READ COMMITTED level.

4.5 the RR and ReadView

Generate a ReadView at REPEATABLE READ level on the first READ of data.

Trx_id = 80; trx_id = 120; trx_id = 120; trx_id = 80; trx_id = 120

When the transaction trx_id = 80 performs the change operation and before committing, the version linked list looks like this:

The m_IDS list of the ReadView is [80,120], min_trx_id = 80, max_trx_id = 121, creator_trx_id = 0.

If trx_id is between min_trx_id and max_trx_id, trx_id is not in m_IDS list, so trx_id will be found down the version chain until trx_id = 60.

Then submit.

Update trx_id = 120 (trx_id = 120, trx_id = 120, trx_id = 120)

The m_IDS list of the ReadView is [80, 120], min_trx_id = 80, Max_trx_id = 121, creator_trx_id = 0.

The current transaction ID is 120, between min_trx_id and max_trx_id, and is in m_IDS list, so it does not meet the visibility principle, so it will continue to search down the version chain, when it finds the transaction ID is 80, it also does not meet the requirement, so continue to search down until trx_id = 60. If the condition is met, the value returned is ayue.

So RR solves the unrepeatable read problem.

5. Illusionary phenomena under MVCC

From the above analysis, we know that MVCC can solve the non-repeatable READ problem under REPEATABLE READ isolation level.

What about hallucinations? How does MVCC work?

A phantom is a transaction that reads a record many times under the same condition, but later reads a previously unread record from a new record added by another transaction. In other words, a phantom is a newly inserted row. Such as:

Bank A opened A transaction window to check whether there was A “ayue” user in the current system, but found there was not. Bank B also opened A transaction window to check whether there was A “ayue” user in the current system. Bank A first created the “ayue” user and submitted it. Bank B must ensure data consistency in a transaction, so “AYue” could not be queried. As a result, the window of bank B thought that “AYue” had not been registered, so it created “ayue” user to register “Ayue” user, only to find that the system indicated that “ayue” user had been registered. But “ayue” is not found in this transaction, just like an illusion.

Why does illusory reading occur at RR level?

Now let’s query the database:

Then we start two transactions, transaction A and transaction B.

A 1 ️ ⃣ affairs

BEGIN;

SELECT * FROM `user` WHERE id = 2
Copy the code

Obviously, there is no data in the table with id = 2.

B 2 ️ ⃣ affairs

Then, we insert a piece of data with ID = 2 into transaction B.

BEGIN;

INSERT INTO `user` (`id`, `name`, `sex`, `age`, `url`) 
VALUES ('2', 'ayue', '1', '18', 'https://javatv.net');

COMMIT;
Copy the code

Then go back to transaction A and execute:

SELECT * FROM `user` WHERE id = 2;
Copy the code

In this case, the RR data cannot be queried to ensure repeatable read.

However, when we perform the following operations under transaction A, we find that the modification can be successful:

UPDATE `user` SET name = 'a' WHERE id = 2;
Copy the code

Execute again under transaction A:

SELECT * FROM `user` WHERE id = 2;
Copy the code

How fat? Transaction A could not query the data with id=2 before, but now it can query the data with ID =2, obviously there is A phantom phenomenon.

Analyze:

At the REPEATABLE READ isolation level, transaction A generates A ReadView (and only one RV under RR) for the first time to execute A normal SELECT statement, and then transaction B inserts A new record into the User table and commits it.

ReadView does not prevent transaction A from performing an UPDATE or DELETE statement to alter the newly inserted record (since transaction B has already committed, changing the record will not block), but in doing so, The value of the trx_ID hidden column for this new record becomes the transaction ID for transaction A. After that, A can see the record and return it to the client by using the ordinary SELECT statement to query the record.

Because of this special phenomenon, we can also assume that MVCC does not completely prohibit phantom reading.

6. How to solve phantom reading

In 2 points, we know that the database read operation is divided into current read and snapshot, and under the RR isolation level, MVCC solved under the condition of the snapshot read the magic of reading, and in the actual scenario, we may need to read the real-time data, such as in the banking and other special scenarios, must be need to read to real-time data, snapshot cannot be read at this time.

So what’s the solution to this problem?

There is no doubt that in a concurrent scenario you can implement the current read by locking, whereas in MySQL you solve the phantom read problem by next-key Locks. For an introduction to locks in MySQL, see this article.

Next-key Locks consist of two parts: Record Locks (row Locks) and Gap Locks. A record lock is a lock placed on an index, and a gap lock is a lock placed between indexes.

6.1 Record the Lock

Record lock, a lock on a single index record.

Innodb creates a hidden aggregate primary key index in the background and locks the hidden aggregate primary key index even if there are no indexes on the table.

Record lock can be divided into S lock and X lock. After a transaction acquires the S-type record lock of a record, other transactions can also continue to acquire the S-type record lock of the record, but cannot continue to acquire the X-type record lock. Once a transaction acquires an X-lock for a record, other transactions cannot continue to acquire either the S-lock or the X-lock for that record.

6.2 Gap the Locks

Gap lock: locks the gaps before and after the index, not the index itself. Open in front and open in back.

In REPEATABLE READ isolation level, MySQL can solve the phantom problem. There are two solutions: MVCC solution and lock solution. However, there is a problem with the locking solution. The phantom records do not exist when the transaction performs the first read operation, so we cannot lock the phantom records. So we can use a gap lock to lock it.

If there is a table like this:

CREATE TABLE test (id INT (1) NOT NULL AUTO_INCREMENT, number INT (1) NOT NULL COMMENT 'numeral ', PRIMARY KEY (id), KEY number (number) USING BTREE ) ENGINE = INNODB AUTO_INCREMENT = 1 DEFAULT CHARSET = utf8; INSERT INTO test VALUES (1, 1); INSERT INTO test VALUES (5, 3); INSERT INTO test VALUES (7, 8); INSERT INTO test VALUES (11, 12);Copy the code

As follows:

Start A transaction A:

BEGIN;

SELECT * FROM test WHERE number = 3 FOR UPDATE;
Copy the code

(1,1),(5,3)) and (5,3),(7,8)) are locked.

If a transaction B is opened at this time to insert data, as follows:

BEGIN; INSERT INTO test (id, number) VALUES (2,2);Copy the code

The results are as follows:

Why not insert? Insert (2,2) is locked between (1,1),(5,3) and (5,3),(7,8). If it is outside the range, of course it can be inserted, as in:

INSERT INTO test (id, number) VALUES (8,8); 
Copy the code

Also, since indexes are involved, what impact does an index have on gap locking?

For primary keys or unique indexes, if all the WHERE conditions hit exactly (= or in) on the current read, the scenario itself will not have phantom reads, so only row record locks will be added, that is, gap locks will degenerate into row locks.
Non-unique indexed columns. If the WHERE condition is partially hit (>, <, like, etc.) or not hit at all, a nearby gap lock is added. For example, a table has the following data and is not a unique index2,6,9,9,11,15. The following statement will operate on a non-unique index column 9. The column (6,11) that will be locked by a gap lock cannot be inserted.
If a column does not have an index, a full table gap lock will be added to the current read operation.

6.3 Next-Key Locks

A next-key lock is a combination of a record lock on an index record and a gap lock on the gap before the index record, including the record itself. Each next-key lock is an interval between open and close (again, the lock is larger and affects concurrency). Next-key locks lock the right boundary row on a gap-lock basis.

By default, InnoDB runs at REPEATABLE READ isolation level. In this case, InnoDB uses next-key Locks for search and index scans, which prevents phantom reads from happening.

7. To summarize

We say that MVCC addresses the following issues at the repeatable read (RR) isolation level:

Concurrent read-write: The read operation does not block the write operation, and the write operation does not block the read operation.
To solveDirty read,Phantom read,Unrepeatable readAnd transaction isolation issues.

For the difference between RR and RC isolation levels:

RC generates a ReadView before each ordinary SELECT operation, while RR generates a ReadView before the first ordinary SELECT operation. The ReadView should be reused for subsequent queries, and illusions can be largely avoided.

However, for phantom reads, there are also current and snapshot reads:

Clearance locks are available only at RR isolation level, but not at RC isolation level.
To solve the phantom read problem at RR isolation level, snapshot reads are controlled by MVCC, and current reads are solved by gap locks.
A gap lock and a row lock are collectively called next-key Locks. Each next-key lock is a section that opens before closing.
The introduction of gap locking may cause the same statement to be locked in a larger range, affecting concurrency.