Recently, I have read MySQL Technology Insider InnoDB Storage Engine, and all kinds of concepts mentioned in it are very new. I have heard about dirty reading, unreal reading and non-repeatable reading before, but I don’t know much about the concepts, so I looked them up and made a note here.

Database transaction characteristics

Database transaction characteristics, i.e. ACID:

A. Atomicity B

A transaction is an atomic unit of operations in which all or none of the operations on the database are performed.

C Consistent

Data must be in a consistent state before and after a transaction, and database integrity must be guaranteed. That is, the data must conform to the rules of the database.

I Isolation Isolation

The database allows multiple concurrent transactions to operate on the data. Isolation ensures that each transaction is independent of each other. The intermediate state of a transaction is invisible to other transactions to prevent data inconsistency. Can be set by transaction isolation level: including Read uncommitted, Read Committed, Repeatable Read and Serializable

D Durable

After a transaction completes, its changes to the database are permanent and will not be lost even if the system fails.

MySQL data isolation level

First, there are four isolation levels in MySQL: Read Uncommttied, Read Committed, Repeatable Read, Serializable.

In InnoDB, the default is Repeatable level and InnoDB uses a strategy called next-key locking to avoid phantom.

Use the select @ @ tx_isolation; You can view MySQL’s default transaction isolation level.

Different transaction isolation levels cause different problems:

The concepts of dirty reading, phantom reading and unrepeatable reading

Dirty read

Dirty read refers to a transaction accessing uncommitted data from another transaction, as shown in the following figure:



If session 2 updates the age to 10, but session 1 wants to get the age before commit, the value will be the one before the update. Or if session 2 updates the value but performs rollback, session 1 still gets 10. This is dirty reading.

Phantom read

A transaction is read twice, and the number of records obtained is inconsistent:



The figure above clearly shows this situation, because a new value is inserted between session 1, so the two results are different.

Unrepeatable read

A transaction reads the same record twice and gets inconsistent results:



Because the data is changed in the middle of the read, the results obtained during the session 1 transaction query are different.

The solution

The solution is the four levels of isolation mentioned above, which can minimize the occurrence of the above three situations:

Unauthorized reading

Also called Read Uncommitted: Dirty reads are allowed, but updates are not allowed to be lost. If one transaction has already written, the other transaction is not allowed to write simultaneously, but other transactions are allowed to read the row. This isolation level can be achieved through exclusive write locks.

Authorized to read

Also known as Read Committed: Unrepeatable reads are allowed, but dirty reads are not allowed. This can be done with “instant shared read locks” and “exclusive write locks”. A transaction reading data allows other transactions to continue to access the row, but an uncommitted write transaction prevents other transactions from accessing the row.

Repeatable Read

Repeatable Read: Non-repeatable and dirty reads are forbidden, but phantom data may occur sometimes. This can be done with “shared read locks” and “exclusive write locks”. A transaction that reads data will prohibit a write transaction (but allow a read transaction), and a write transaction will prohibit any other transaction.

Serializable

Serializable: Provides strict transaction isolation. It requires serialized execution of transactions, which can only be executed one after another, not concurrently. Transaction serialization cannot be achieved through row-level locking alone, and other mechanisms must be used to ensure that newly inserted data is not accessed by the transaction that just performed the query.

A higher isolation level ensures data integrity and consistency, but has a greater impact on concurrency performance. For most applications, setting the isolation level of the database system to Read Committed is a priority. It can avoid dirty reads and has good concurrency performance. Although it can lead to concurrent problems such as unrepeatable reads, phantom reads, and type ii missing updates, in the rare cases where such problems may occur, it can be controlled by the application using pessimistic or optimistic locking.

reference

  • Dirty, phantom, and unrepeatable reads of the database
  • Dirty read, unrepeatable read, phantom read
  • What is dirty reading, unrepeatable reading, magic reading