Starting from the problems that arise when multiple threads read and write a database concurrently, this article introduces the transaction isolation levels, the corresponding algorithms, and how Innodb implements them, with occasional comparisons to visibility, atomicity, and ordering in Java multithreading. PS: if you are not sure how Java guarantees safe access under multiple threads, you can first read up on how synchronized and volatile guarantee atomic, visible, and ordered access.

If you are already familiar with transaction isolation, you can jump straight to the section on phantom problems under MVCC.

Isolation (visibility) issues

ACID specifies that transactions should have atomicity, consistency, isolation, and durability.

Atomicity

Atomicity in multithreaded semantics means that while one thread is performing an atomic operation, other threads cannot see the intermediate results of the operation, only the state before or after it.

The atomicity of a database transaction is defined differently from that of multithreading: if a failure occurs while an atomic transaction is executing and prevents it from committing, the transaction is aborted and all changes it has made are rolled back.

So atomicity in a database is better described as abortability.
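
Abortability can be demonstrated with a minimal sketch using Python's built-in sqlite3 module (not InnoDB, but the abort semantics illustrated are the same):

```python
import sqlite3

# In-memory stand-in for a transactional database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()

try:
    # The transaction debits the account, then "fails" before it can commit.
    conn.execute("UPDATE account SET balance = balance - 100 WHERE id = 1")
    raise RuntimeError("simulated failure before commit")
except RuntimeError:
    conn.rollback()  # abortability: every change of the transaction is undone

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)  # 100: the pre-transaction state survives intact
```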

Isolation

Isolation mainly addresses concurrent access by multiple transactions, so it resembles atomicity plus ordering under multithreading: other transactions cannot see the intermediate results of the current transaction (like multithreaded atomicity), and when multiple transactions read and write the same data in parallel, the interleaving must not corrupt the semantics of the data on disk (like multithreaded ordering).

Therefore, the concept of weak isolation levels is proposed; they are implemented through different locking algorithms and multi-version concurrency control (MVCC). Different isolation levels, however, allow different concurrency problems.

PS: our multithreaded operations on a database boil down to concurrent reads and concurrent writes, which raises three questions. What should concurrent read transactions observe of concurrent write transactions? What should concurrent write transactions see of each other? And what should be presented to concurrent read transactions? The answers to these three questions correspond to the different transaction isolation levels.

Read-read access by multiple transactions causes no isolation problems, so we focus on the read-write and write-write problems of multiple transactions on the same objects.

Write-read problems

Dirty read

A dirty read means that a transaction has written some data but not yet committed it, and another transaction can nevertheless see the uncommitted data.

The isolation level that guarantees this problem does not occur is read committed.

Non-repeatable read (read skew)

At the read committed isolation level, user A can see the value committed by user B as soon as B's transaction commits, which makes A's reads non-repeatable.

But doesn’t this guarantee that another transaction gets the latest value of the current row? Will it cause any problems?

This can leave the application seeing an inconsistent database state. For example, if a user has two accounts (stored in the same database) totaling $1,000 and transfers $100 from one account to the other (account 1 balance -100, account 2 balance +100), a query that sums the two accounts while the transfer is committing may read one account before the transfer and the other after it, and see a total of 1100 or 900.
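
A deterministic toy interleaving in plain Python (assuming starting balances of 500 each, which the article does not specify) shows how the sum comes out to 1100:

```python
# Toy model of read skew: the reading transaction sees account 1 before
# the transfer and account 2 after it. Balances of 500 each are assumed.
accounts = {1: 500, 2: 500}

total = accounts[1]        # reader: account 1 is read pre-transfer (500)

accounts[1] -= 100         # writer: the $100 transfer commits in between
accounts[2] += 100

total += accounts[2]       # reader: account 2 is read post-transfer (600)
print(total)               # 1100, although the true total never left 1000
```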

Therefore, we can provide each transaction with a snapshot of the database state, from which its subsequent SELECT queries read, so that they are not affected by the changes of other transactions.

The isolation level that guarantees this problem does not occur is repeatable read (snapshot isolation).

Phantom read

A phantom read is a special case of the non-repeatable read. A non-repeatable read emphasizes that rows already read by the same SQL within one transaction are changed or deleted; a phantom read emphasizes that new rows appear in the results of the same SQL within one transaction. In both cases, writes in one transaction change the query results of another.

Note: snapshot isolation avoids phantom reads only for read-only queries. For transactions that also perform UPDATE, DELETE, or INSERT operations, snapshot isolation does not prevent phantom reads, for reasons explained below in the analysis of Innodb's solution.

Write-write problems

Dirty write

What happens if two transactions update the same object at the same time? We do not know the order of the writes, but we do know that the later write overwrites the earlier ones.

However, if the earlier write is part of an uncommitted transaction and is overwritten by a later write, we call this a dirty write (note the distinction from the lost update described later).

The isolation level that guarantees this problem does not occur is read committed.

Lost update

A dirty write is the special case of write-write concurrency in which changes of an uncommitted transaction are overwritten. A lost update is usually caused by a read-modify-write cycle. For example, if two transactions both read the value A = 42 (under a read lock) and then each acquires a write lock to perform the increment, the final result of the two updates is 43 rather than 44, so one update is lost.

Updates can be lost in the JMM too: volatile does not solve the data dependency in i++. Atomicity can be guaranteed by locking, or by CAS plus loop retry (the atomic classes). The same is true in the database, as described below in Innodb's solution.
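
The lost update and its CAS-style fix can be simulated deterministically in plain Python (a sketch; `cas_increment` is an illustrative stand-in for an atomic compare-and-set, not a real API):

```python
# Deterministic simulation of a lost update: two transactions both read
# A = 42, then both write back A + 1.
A = 42
t1_read = A
t2_read = A
A = t1_read + 1   # transaction 1 commits 43
A = t2_read + 1   # transaction 2 commits 43: transaction 1's update is lost
print(A)          # 43, although two increments ran

def cas_increment(read, current):
    """Return the new value if `read` still matches `current`, else None."""
    return current + 1 if read == current else None

# The CAS + loop-retry fix: re-check the value at write time, retry on conflict.
A = 42
t1_read, t2_read = A, A
A = cas_increment(t1_read, A)       # transaction 1 succeeds: A becomes 43
retry = cas_increment(t2_read, A)   # transaction 2 fails: A changed since its read
if retry is None:
    A = cas_increment(A, A)         # re-read and retry: A becomes 44
print(A)                            # 44: no update is lost
```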

Write skew

Write skew is different from dirty writes and lost updates and is generally caused by a Check-then-Act sequence. For example, suppose the application requires that the hospital always has someone on duty in the evening. Two employees press the leave-request button at the same time; for each request the application first queries the database to check that the current on-duty count is greater than 1. Because the requests run in parallel, both queries return 2, so both transactions then take a write lock in turn and record the leave, eventually leaving nobody on duty.
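
A deterministic sketch of the on-call example (plain Python; names and data layout are invented for illustration):

```python
# Write skew as Check then Act: both transactions pass the check against
# the same snapshot of the database, then both apply their writes.
on_call = {"alice": True, "bob": True}

def request_leave(snapshot, name):
    # Check: require that more than one person is currently on duty.
    if sum(snapshot.values()) > 1:
        return name  # the leave request is approved
    return None

snapshot = dict(on_call)  # both concurrent checks read the same state (count 2)
approved = [request_leave(snapshot, "alice"), request_leave(snapshot, "bob")]

for name in approved:
    if name:
        on_call[name] = False  # Act: both writes are applied

print(sum(on_call.values()))   # 0: nobody is on duty, violating the invariant
```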

The isolation level that guarantees this problem does not occur is serializable.

Innodb’s solution

Innodb essentially implements different isolation levels through a combination of different uses of MVCC and different granularity of locks to avoid the concurrency problems that can occur at weaker isolation levels.

MVCC (Multi-version Concurrency Control)

At the commit point of a write transaction we can keep two coexisting versions of the affected rows in the database, the old and the new, so that queries on those rows run by other transactions before the commit return the old version, and queries after the commit return the new version. This ensures that a transaction started at a given point in time sees a consistent database state. (This is the idea behind the MVCC implementation.)

Innodb assigns each transaction a globally increasing unique identifier, trx_id, which determines the time order of the transactions' requests to the database, and stores it as an additional hidden column in each clustered-index row:

To guarantee transaction atomicity, Innodb provides the undo log, and it links the undo log records of each data row through the roll_pointer stored in the row, forming a version chain, so that one logical row can have multiple versions in the database:

Before a transaction starts, a row has one determined version in the database (the version of the most recently committed transaction) and possibly one or more versions from uncommitted transactions. Innodb uses a ReadView and a corresponding row-visibility algorithm to decide, based on trx_id, which version of the row the current transaction should see:

For each transaction, the database generates a ReadView object that stores a snapshot of the current transaction state: max_trx_id, the maximum trx_id in the database's current active transaction list; min_trx_id, the minimum trx_id in the current active transaction list; m_ids, the list of trx_ids of active transactions; and creator_trx_id, the id of the transaction that created this ReadView.

For read/write transactions, the trx_id values in their ReadView are compared with the trx_id of each row version in the version chain. There are several cases:

  1. The trx_id of the current row version is within [min_trx_id, max_trx_id]: (1) if it equals the current transaction's trx_id, the version was modified by the current transaction itself and is naturally visible to it; (2) if it is not in the m_ids list, the transaction that produced this version has committed, so the version is visible to the current transaction; (3) if it is in the m_ids list, the transaction that modified the row has not committed, so the version is not visible to the current transaction.

  2. The row version's trx_id > max_trx_id: the version was modified by a transaction started after this ReadView was generated, so it is not visible to the current transaction.

  3. The row version's trx_id < min_trx_id: the transaction that produced this version has committed, so the version is visible to the current transaction.
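
The three cases above can be sketched as a small visibility function. This is a simplified model in the article's own terms (function names and the data layout are my own, not Innodb's actual source):

```python
def visible(row_trx_id, creator_trx_id, min_trx_id, max_trx_id, m_ids):
    """Decide whether one row version is visible to a ReadView (cases 1-3)."""
    if row_trx_id < min_trx_id:
        return True                      # case 3: committed before the snapshot
    if row_trx_id > max_trx_id:
        return False                     # case 2: started after the ReadView
    if row_trx_id == creator_trx_id:
        return True                      # case 1(1): our own modification
    return row_trx_id not in m_ids       # case 1(2)/(3): committed vs active

def read_row(version_chain, view):
    """Walk the undo-log version chain (newest first) to the visible version."""
    for version in version_chain:
        if visible(version["trx_id"], **view):
            return version["value"]
    return None

view = {"creator_trx_id": 100, "min_trx_id": 90, "max_trx_id": 110,
        "m_ids": {90, 95, 100}}
chain = [{"trx_id": 95, "value": "uncommitted"},   # active transaction 95
         {"trx_id": 80, "value": "committed"}]     # older committed version
print(read_row(chain, view))  # committed: trx 95 is still active, so skip it
```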

This is how Innodb implements snapshot reads (consistent non-locking reads).

Phantom problems under MVCC

But with MVCC alone we can only avoid phantom reads in read-only queries. Consider the following non-read-only queries (ordered by timeline) in two concurrent transactions:

Start transaction A and query the blog table; there is one row in the table:

```sql
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from blog where id < 5;
+----+-----------------+--------------+
| id | secondary_index | normal_field |
+----+-----------------+--------------+
|  1 | index_text1     | normal_text1 |
+----+-----------------+--------------+
1 row in set (0.00 sec)
```

Meanwhile, start transaction B, insert a row of data (which gets id=2), and commit:

```sql
mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from blog;
+----+-----------------+--------------+
| id | secondary_index | normal_field |
+----+-----------------+--------------+
|  1 | index_text1     | normal_text1 |
+----+-----------------+--------------+
1 row in set (0.00 sec)

mysql> insert into blog (secondary_index, normal_field) values ("index_text2", "normal_text2");
Query OK, 1 row affected (0.28 sec)

mysql> select * from blog;
+----+-----------------+--------------+
| id | secondary_index | normal_field |
+----+-----------------+--------------+
|  1 | index_text1     | normal_text1 |
|  2 | index_text2     | normal_text2 |
+----+-----------------+--------------+
2 rows in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.06 sec)
```

Back in transaction A, update secondary_index to "mvcc" with the same condition id < 5. The update affects 2 rows, including the row with id=2 that transaction B inserted, and a subsequent select in transaction A now sees that row: a phantom read despite MVCC.

```sql
mysql> update blog set secondary_index = "mvcc" where id < 5;
Query OK, 2 rows affected (5.66 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select * from blog;
+----+-----------------+--------------+
| id | secondary_index | normal_field |
+----+-----------------+--------------+
|  1 | mvcc            | normal_text1 |
|  2 | mvcc            | normal_text2 |
+----+-----------------+--------------+
2 rows in set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.06 sec)
```

This happens because a transaction is only assigned its trx_id, and a ReadView is only regenerated, when it actually modifies records in the table (INSERT, DELETE, UPDATE, etc.); row visibility is then judged against that ReadView by the MVCC rules above. Since transaction A runs its UPDATE after transaction B commits, A is assigned a trx_id greater than B's, and in the regenerated ReadView the changes committed by B are visible to A.

Therefore, MVCC solves only part of the phantom problem; the root of the problem is addressed by the locking mechanism, with predicate locks and next-key locks, where the next-key lock is an optimization of the predicate lock.

Predicate Lock, Gap Lock, next-key Lock

A predicate lock locks the query condition itself rather than concrete rows. If the first query of transaction A in the example above had been a locking read, `select * from blog where id < 5 for update`, the condition id < 5 would be locked, transaction B's insert into that range would block, and no phantom read would occur in transaction A.

A gap lock locks the range of the SQL query result, here (-∞, 5); a next-key lock locks (-∞, 5], i.e. gap lock + record (row) lock, an optimization of the gap lock. Innodb uses next-key locks for range queries on the clustered index.
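
As a hypothetical sketch (the function and the interval representation are my own, not Innodb internals), the next-key intervals a range scan must hold can be computed from the keys present in the index:

```python
import math

def next_key_intervals(keys, upper):
    """Lock intervals for the range scan `col < upper` over sorted `keys`.

    Each returned pair (lo, hi) stands for the half-open interval (lo, hi]:
    the gap below an index record plus the record itself, so that no new
    row can be inserted anywhere inside the scanned range.
    """
    intervals, lo = [], -math.inf
    for k in keys:
        if k < upper:
            intervals.append((lo, k))   # next-key lock: gap (lo, k) + record k
            lo = k
    intervals.append((lo, upper))       # final gap guarding the rest of the range
    return intervals

print(next_key_intervals([1, 2], 5))  # [(-inf, 1), (1, 2), (2, 5)]
```

With the example table's keys 1 and 2, the scan for id < 5 holds locks covering all of (-∞, 5], which is exactly what blocks transaction B's insert.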

If we perform an equality query on the clustered index, then because the clustered index is unique on the primary key, the next-key lock can be downgraded to a record lock.

This is how Innodb clustered indexes are locked.

Note that if we perform an equality locking query on a secondary index, a record lock alone is not enough: secondary index values are not unique, so a range must also be locked. For `select * from blog where secondary_index = "index_text1" for update`, Innodb adds a next-key lock on (-∞, index_text1] and a gap lock on (index_text1, index_text2), preventing another transaction from inserting a new record with secondary_index = "index_text1" into this range and causing a phantom read.

After the secondary index is locked, the clustered-index rows referenced by the primary keys in the matched secondary-index records must also be locked, to keep the data consistent between the indexes.

So at the repeatable read level, Innodb solves the phantom problem with MVCC + next-key locks.

Of course, at this level we still cannot solve the Check-then-Act problem of write skew. The essence of the problem is that another transaction changes the state behind the current transaction's check condition, and neither transaction double-checks it.

So, on top of MVCC, we can reach the serializable isolation level in two ways:

  1. By explicit locking (`select ... for update`): ensure that only one transaction at a time can query the check condition, so that another transaction can query it only after the previous transaction has released its lock, guaranteeing that the value checked is always the latest (2PL, two-phase locking);
  2. On the basis of snapshot isolation, by optimistic concurrency plus conflict detection. Each transaction is checked at two points: before it executes and before it commits. Before executing, the transaction checks whether uncommitted changes by other transactions affect its check condition; before committing, it checks again whether any transaction affecting the condition that was uncommitted at the first check has since committed (the transaction's version is determined by its trx_id before it executes).
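
The second approach can be sketched with a version-validated store (a toy model of commit-time conflict detection; the class and its API are invented for illustration):

```python
class VersionedStore:
    """Optimistic concurrency: commits are validated against the read version."""

    def __init__(self, value):
        self.value, self.version = value, 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Validation: abort if another transaction committed since our read.
        if self.version != read_version:
            return False
        self.value, self.version = new_value, self.version + 1
        return True

store = VersionedStore(42)
v1, ver1 = store.read()                  # transaction 1 reads 42 @ version 0
v2, ver2 = store.read()                  # transaction 2 reads 42 @ version 0
assert store.commit(v1 + 1, ver1)        # transaction 1 commits 43
assert not store.commit(v2 + 1, ver2)    # transaction 2 fails validation
v2, ver2 = store.read()                  # transaction 2 re-reads and retries
assert store.commit(v2 + 1, ver2)        # retry succeeds
print(store.value)                       # 44: neither increment is lost
```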