MVCC multi-version concurrency control is a method of database management system concurrency control. Under MVCC multi-version concurrency control, the data in the database will have multiple versions corresponding to different transactions, so as to achieve the isolation of concurrent data between transactions. The biggest advantage of MVCC is that read data is not locked and read/write data does not conflict. In the scenario of many read data and few write data, read/write data does not conflict, which greatly improves the concurrent performance of the database.

MVCC multi-version concurrency control

In MYSQL, MyISAM storage engine uses table locks and InnoDB storage engine uses row locks. And InnoDB transaction is divided into four isolation levels, including the default isolation level is repeatable read, repeatable read request two parallel transaction data change between each other, by adding a row lock way while it is possible to realize data between two transactions according to the change each other, but those two transactions between lock waits, affect the efficiency of database. So InnoDB uses the more powerful MVCC instead of row locking for repeatable reads.

MVCC only works at repeatable reads and read committed isolation levels. The other two isolation levels are incompatible with MVCC because uncommitted reads always read the latest row, regardless of the transaction version. Serialization locks all read rows. Because repeatable reads are complex and are MySQL’s default isolation level, this article uses repeatable reads to explain the principles of MVCC.

Repeatable read

Databases have four isolation levels: read uncommitted/Read Committed/repeatable read/serialized. Repeatability is MySQL’s default transaction isolation level, which ensures that multiple instances of the same transaction will see consistent rows when reading data concurrently.

Data row consistency consists of two parts:

  • Case 1: If the content of existing data is changed, the query result should be the same if the data is queried repeatedly in the same transaction. If the data is modified in the current transaction, the query result should be the same as the modified result in the current transaction.
  • Case 2: increase or decrease of data rows. The same transaction can only view the data in the database before the transaction starts or the result sets added/deleted by the transaction itself, but cannot see the result sets added or deleted by other transactions during the transaction start.

InnoDB’s default isolation level is repeatable reads, which addresses row consistency issues in both cases. The problem of data row consistency in case 1 is solved by MVCC multi-version concurrency control.

InnoDB uses the Gap lock implementation 2 data row consistency problem, but this article does not cover Gap locks.

The role of MVCC

MVCC ensures that a single data read from the beginning to the end of a transaction is consistent and that multiple transactions do not block each other. Let’s use a user table as an example to illustrate the power of MVCC version control.

Create a user table and insert user data into it.

create table user_info
(
    age int ,
    name  varchar(255));insert into user_info(age,name) value (23.'Joe');
Copy the code

Suppose there are three transactions A, B and C, in which the information of the inserted user is read at different times and the user information is modified, and the time line is as follows:

  1. At time T1, transaction A starts and transaction A readsage=23For the usernameforZhang SAN;
  2. At T2, transaction B starts and transaction B readsage=23For the usernameforZhang SAN;
  3. At time T3, transaction A is modifiedage=23Users of thenameModified toLi si;
  4. At T4, transaction A readsage=23For the usernameforLi si, transaction A commits the transaction;
  5. At time T5, transaction B readsage=23For the usernameforZhang SAN, transaction B commits a transaction;
  6. At time T6, transaction C starts and transaction C readsage=23For the usernameforLi si, transaction C commits a transaction;

The effect of MVCC can be reflected at T5, when transaction A has committed, and the name of the user whose age=23 is changed is Li Si, but transaction B cannot see the change, and the name of the user whose age=23 is seen by transaction B is Zhang Sam. This is because under the isolation level of repeatability, the data read by InnoDB transaction is snapshot read, i.e. transaction B starts with a snapshot of the data, and the data read by transaction B is always the same snapshot, corresponding to the snapshot read is the current read:

  • Current read: The latest version of the record is read. During reading, it must ensure that other concurrent transactions cannot modify the current record and lock the read record.
  • Snapshot read: MVCC uses snapshot read to generate snapshots for data when a transaction starts. Snapshot read avoids locking operations and improves database performance.

MVCC principle

InnoDB introduces MVCC to solve read/write conflicts. MVCC consists of three parts: 3 hidden fields in the database, UndoLog log, and ReadView. The functions of these three parts in MVCC are as follows:

  1. Hidden fields: Adding additional version information to data is a cornerstone of MVCC version control;
  2. UndoLog: Stores data of multiple versions, and the contents of hidden fields of different versions are different.
  3. ReadView: Determine which version of data should be read by the current transaction;

Hidden fields

Hidden fields mean that we cannot find them through SQL statements, but they actually exist in the database and take up storage space. To achieve MVCC versioning, InnoDB adds the following three hidden fields for each row of data:

  1. DB_TRX_ID: 6 bytes, the transaction ID of the record is modified at the end.
  2. DB_ROLL_PTR: 7 bytes, a Rollback pointer to the previous version of this record (stored in the Rollback Segment);
  3. DB_ROW_IDInnoDB uses DB_ROW_ID to build the cluster index if the table has no explicit primary key.

We use the following SQL to create a user table and insert a data entry into the table. By default, the new table will contain three hidden fields. The table structure is shown in the following table.

create table user_info
(
    age int,
    name  varchar(255));insert into user_info(age,name) value (23.'Joe');
Copy the code

| age | name | DB_TRX_ID | DB_ROLL_PTR | DB_ROW_ID | | — – | — – | — – | — – | | | | 23 zhang SAN | 1 | 0 x222333 | | 1

UndoLog log

I introduced UndoLog logs in another article, and as the name suggests, UndoLog logs are primarily used to roll back transactions. But InnoDB’s MVCC snapshot reads also use UndoLog. UndoLog can be divided into two broad categories:

  1. Insert UndoLog: the UndoLog corresponding to the Insert statement in the transaction, which is only needed when the transaction is rolled back, so it can be discarded immediately after the transaction is committed;
  2. Update UndoLog: UndoLog generated when a transaction is performing an Update or Delete; Not only when a transaction is rolled back, but also when a snapshot is read; Therefore, the Purge thread only Purge logs that are not involved in snapshot reads or transaction rollback.

Purge thread: In InnoDB, deleted data is not deleted directly, but marked as deleted first, and useless Update UndoLog is not immediately deleted. This data is all deleted through the Purge thread of the background task in InnoDB.

SQL > Update UndoLog user_info tablespace user_info; SQL > Update UndoLog tablespace user_info;

  1. At T1, transaction A starts, transaction Id is 2, and transaction A reads the user whose age=23, whose name is Zhang SAN. At this time, no database data is modified, no UndoLog is generated, and the table space is unchanged.

  2. At T2, transaction B starts, transaction Id is 3, and transaction B reads the user whose age=23, whose name is Zhang SAN; At this time, no database data is modified, no UndoLog is generated, and the table space is unchanged.

  3. At T3, transaction A changes the user whose age=23 and changes the name to Li Si; At this time, transaction A has not been committed, so A UndoLog will be generated for transaction A. UndoLog stores the data before transaction A changes, and the rollback pointer in the latest data in the tablespace points to this log.

  4. At T4 moment, transaction A reads the user whose age=23. Since the transaction ID recorded in the table data is consistent with that of transaction A, transaction A will read the records in the table data and read the user’s name as Li Si, and transaction A submits the transaction.

  5. At moment T5, transaction B reads the user whose age=23. Since the data in the table space does not meet the visibility condition (detailed in the next section), transaction B will search UndoLog of the table data. The data in UndoLog meets the visibility condition, so the user in UndoLog is queried. Transaction B commits the transaction;

  6. At time T6, transaction C starts, transaction ID is 3, and transaction C reads the user whose age=23. Since transaction A has been committed when transaction C starts, transaction C can query the submitted data, and transaction C reads the user whose name is Li Si.

  7. At time T7, transaction C starts, transaction ID is 3, transaction C changes user age=23, change name to king five; At this time, transaction C has not been committed, so a UndoLog will be generated for transaction C, which stores the data before the modification of transaction C.

As you can see from the above example, changes to the same record by different transactions or by the same transaction will result in the UndoLog of that record being a linear list of record versions, with the first UndoLog being the latest old record and the last UndoLog being the earliest old record. (The UndoLog nodes may be purged by the Purge thread.)

UndoLog is used for rollback by copying the pre-transaction database record rows to UndoBuffer and flushing the contents of UndoBuffer to disk at a suitable time. UndoBuffer, like RedoBuffer, is a circular buffer, but when the buffer is full, the contents of the UndoBuffer are flushed to disk. Unlike RedoLog, there is no single UndoLog file on disk. All UndoLog is stored in the main IBD data file (table space), even if the client has set up one data file per table.

ReadView read view

ReadView transaction is a snapshot of a read operation when the reading view of production, in the transaction execution at that moment, a snapshot of the read will generate a snapshot of the current database system, record and maintain system currently active transaction ID (when each transaction to open, will be assigned an ID, this ID is increasing, so the latest transaction, the greater the ID value)

So we know that ReadView is mainly used for visibility judgment, that is, when a transaction performs snapshot read, we create a ReadView of the record, which is compared to the condition to determine which version of the data can be seen by the current transaction. It could also be a version of UndoLog that the row records.

ReadView follows a visibility algorithm by taking the DB_TRX_ID (the current transaction ID) from the latest record of the data to be modified and comparing it with the ids of other currently active transactions in the system (maintained by ReadView). DB_TRX_ID = DB_TRX_ID = DB_TRX_ID = DB_TRX_ID = DB_TRX_ID = DB_TRX_ID = DB_TRX_ID = DB_TRX_ID Until a DB_TRX_ID is found, the old record where the DB_TRX_ID resides will be the latest old version visible to the current transaction.

ReadView determines visibility as follows. After a new transaction is created,InnoDB generates a ReadView for that transaction when the new transaction reads data. InnoDB creates a copy of the active transaction list in the system and saves it to ReadView. When a user reads a row in this transaction, InnoDB compares the current version number of the row with the ReadView. The specific algorithm is as follows:

  1. Set the current transaction ID of this row to cur_trx_id, the earliest transaction ID in ReadView to min_trx_id, and the latest transaction ID to max_trx_id.
  2. If cur_trx_id < min_trx_id, it indicates that the transaction in which this record belongs was committed before the new transaction was created, so the current value of this record is visible. Skip to Step 6.
  3. If cur_trx_id > max_trx_id, it indicates that the transaction in which this record is located was not started until the new transaction was created, so the current value of this record is not visible. Skip to Step 5;
  4. If min_trx_id<= cur_trx_id <= max_trx_id, then it indicates that the transaction is active at the time of the new transaction creation, traversal from min_trx_id to max_trx_id. Not visible if cur_trx_id is equal to one of their transaction ids. Skip to Step 5;
  5. Fetch the latest UndoLog version number from the rollback segment pointed to by the DB_ROLL_PTR pointer on this record, assign it the cur_trx_id value, and skip to Step 2.
  6. Returns the value of the visible row;

In MVCC version control, the first snapshot read of a transaction is used as the dividing line. After a transaction, only the first snapshot read and the previously committed data version can be found.

Read committed and repeatable

What is the difference between InnoDB snapshot reads at committed and repeatability isolation levels? ReadView is generated at different times, resulting in different read results for committed and repeatable snapshot reads:

  • At the repeatable read isolation level, when a ReadView is generated for the first snapshot read of a transaction, the ReadView takes snapshots of all other active transactions at that time. The changes made to these transactions are not visible to the current transaction. Transactions created prior to ReadView are visible;
  • Each snapshot read generates a new snapshot and ReadView. This is why we can see committed updates from other transactions in RC transactions.

Each snapshot read generates and gets the latest ReadView at the committed isolation level. Under repeatable read isolation, a ReadView is created for the first snapshot read in the same transaction, and all subsequent snapshot reads obtain the same ReadView.

Read MVCC and illusions

A phantom read is a problem where the same SQL statement is executed twice in a row in the same transaction. The second SQL statement may return a row that did not exist before. For example, at time T1, transaction A and transaction B are enabled at the same time, and A snapshot read is performed respectively. Then, transaction A inserts A new record into the database. If transaction B can read this record, A “magic read” occurs, because the data was not read by transaction B in the first snapshot read.

Can MVCC solve the phantom problem? The answer is that in some cases it can be solved and in some cases it can’t be solved. If the reads in transaction B are snapshot reads, then MVCC version control can solve the phantom read problem; MVCC cannot solve the phantom problem if transaction B is using current reads.

  • Snapshot read is based on MVCC and UndoLog, which is suitable for simple Select statements.
  • The current read is based on the Gap lock, suitable for Insert, Update, Delete, Select… For Update, Select… Lock In Share Mode statement, and Lock Select statement;

In fact, MVCC does not apply to all current reads. For example, after transaction A modifies the data, transaction B updates the corresponding data. The Update statement filters the current data in the database, not the snapshot data.

I am the god of the royal Fox. Welcome to follow my wechat official account: Wzm2ZSD

Reference documentation

Read COMMITTED && MVCC MySQL database transactions locked at each isolation level

This article was first published to wechat public account, all rights reserved, reprint prohibited!