Original: Coding diary (WeChat official account: Codelogs). Welcome to share; please keep this attribution when reposting.

Introduction to transactions

The SQL standard defines four isolation levels, which are:

  1. READ UNCOMMITTED: at this level, dirty reads, non-repeatable reads, and phantom reads can all occur.
  2. READ COMMITTED: solves the dirty read problem.
  3. REPEATABLE READ: solves the dirty read and non-repeatable read problems.
  4. SERIALIZABLE: solves the dirty read, non-repeatable read, and phantom read problems.

I won't explain dirty reads, non-repeatable reads, and phantom reads in detail here; any introductory article or book on transactions covers them clearly. But note that these are phenomena that may occur when transactions run concurrently — they must not be understood as database bugs!

But thinking back on this knowledge after many years of work, I had a big question about the repeatable read behavior, namely:

  1. Why doesn't the programmer simply keep the data read the first time in memory and reuse it the second time? Why should the database guarantee that two reads return the same data (repeatable read), when reading the database twice wastes more database resources and hurts performance?
  2. Besides, what is wrong with reading the latest data every time? What is the use of reading historical data?

So what is repeatable read actually for?

An example

Consider the following scenario: a financial product has a reconciliation task that finds users whose account balance does not match their transaction flow records, and user transactions must continue normally and uninterrupted while the task runs. For example, an account with a balance of 100 yuan has two flow records (+200 and -100), so the account reconciles. But suppose the task first queries the balance of 100 yuan, then the user transfers out another 100 yuan, and only then does the task query the flow records. Different isolation levels now give different results, as follows:

| READ COMMITTED | REPEATABLE READ | Note |
| --- | --- | --- |
| | | Initially: balance 100, flow records (+200, -100) |
| Reads the balance of 100 yuan | Reads the balance of 100 yuan | |
| | | Another transaction pays out 100 yuan, the balance drops to 0, and commits |
| Reads flow records (+200, -100, -100) | Reads flow records (+200, -100) | |
| Reconciliation fails | Reconciliation succeeds | |

As you can see, under READ COMMITTED the reconciliation fails, while under REPEATABLE READ it succeeds — and in fact the account balance did always match the flow records. I verified this myself on MySQL 5.7, and it behaves exactly as described.
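The two sessions above can be sketched in SQL. The table and column names (`account`, `account_flow`, `amount`) are hypothetical — adapt them to your schema:

```sql
-- Session 1: the reconciliation task, under REPEATABLE READ
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT balance FROM account WHERE id = 1;                   -- reads 100

-- ... meanwhile Session 2 runs and commits:
--   START TRANSACTION;
--   UPDATE account SET balance = balance - 100 WHERE id = 1;
--   INSERT INTO account_flow (account_id, amount) VALUES (1, -100);
--   COMMIT;

SELECT SUM(amount) FROM account_flow WHERE account_id = 1;  -- still sums to 100 (+200, -100)
COMMIT;
-- Under READ COMMITTED, the second query would see the new -100 row,
-- sum to 0, and the reconciliation check would wrongly fail.
```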

So what exactly does repeatable read do?

Essentially, it guarantees that every table in the database is read at the same version — the version as of the moment the transaction started (in MySQL, the moment of the first query) — whether you read the same table twice or read different tables. So what the repeatable read level really provides is not merely the literal "read the same row twice and get the same result" that its name suggests.
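In MySQL/InnoDB, the snapshot is normally created by the transaction's first consistent read; you can also take it at transaction start explicitly, as sketched here:

```sql
-- Take the snapshot immediately at transaction start,
-- instead of at the first SELECT
START TRANSACTION WITH CONSISTENT SNAPSHOT;

-- All subsequent snapshot reads, on any table, see the database
-- as of this moment (table names are from the example above)
SELECT balance FROM account WHERE id = 1;
SELECT SUM(amount) FROM account_flow WHERE account_id = 1;
COMMIT;
```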

Repeatable read is also commonly used for database backups. Since data may keep changing while a backup runs, you want to back up one consistent version of the database — the version as of the moment the backup started — not a mixture of earlier and later versions.

This example also illustrates another point: when do you need a transaction? When writing code you are often told to put all related writes into one transaction, but in some special cases you also need to put multiple reads into one transaction.

A different way to look at it

Instead of understanding transaction isolation levels through the phenomena of dirty reads, non-repeatable reads, and phantom reads, we can understand them through read consistency, as follows:

  1. Read uncommitted solves no read consistency problem at all; it only guarantees the write consistency (also known as atomicity) of a transaction: after the transaction commits, either all of its modifications succeed or none of them do.
  2. Read committed guarantees that all changes made by another concurrent transaction are either fully visible or fully invisible — in other words, the write consistency of other transactions is carried through to reads. This is the most commonly used isolation level, and it guarantees consistency in the business meaning of the data.

     For example, in an ordering scenario, a transaction writes the main order row and the order_item sub-table rows. If a read uncommitted transaction reads order and order_item between the two writes, it may see the order but not its order_item rows. If a user saw that, they would be startled: "I paid and got an empty order!" Then the user refreshes and sees the full data again. Under read committed this problem does not exist: the user either finds no data at all or finds complete data. This also shows that logically related data changes must be wrapped in a transaction.
  3. Repeatable read guarantees that from the moment the transaction starts (or, in MySQL, the moment of its first query), all subsequent reads of all tables in the entire database see the version as of that moment, including repeated reads of the same table. It can be understood as "consistent version read".
  4. Serializable generally addresses logic errors under concurrency, because at this level you may logically assume that all transactions execute one after another (even though the database may actually execute them concurrently). Take an insert-if-absent scenario: two concurrent transactions both query, both find no data, and both insert, ending up with two rows. Under the serializable level this cannot happen. In concurrent programming this is called a race condition, so serializable solves read-write race conditions. Of course, this particular problem can also be solved by adding a unique index or by using explicit locking.
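The insert-if-absent race condition and its two fixes can be sketched as follows (a hypothetical table `t` with a key column `k`):

```sql
-- The race: below SERIALIZABLE, both sessions can pass the check and insert twice
-- Session 1: SELECT * FROM t WHERE k = 'x';            -- empty
-- Session 2: SELECT * FROM t WHERE k = 'x';            -- empty
-- Session 1: INSERT INTO t (k) VALUES ('x'); COMMIT;
-- Session 2: INSERT INTO t (k) VALUES ('x'); COMMIT;   -- duplicate row!

-- Fix 1: let the database reject the duplicate
ALTER TABLE t ADD UNIQUE KEY uk_k (k);
-- the second INSERT now fails with a duplicate-key error

-- Fix 2: take a lock during the check (explicit locking read)
START TRANSACTION;
SELECT * FROM t WHERE k = 'x' FOR UPDATE;  -- blocks the other session's check
INSERT INTO t (k) VALUES ('x');
COMMIT;
```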

Does MySQL's repeatable read solve phantom reads?

Online you often see two kinds of articles: some say MySQL's repeatable read solves the phantom read problem, and some say it does not. Both are right and both are wrong — the difference is whether the read in question is a snapshot read or a current read, as follows:

| Snapshot read | Current read | Note |
| --- | --- | --- |
| | | Initially, order 1 has two order_item rows, A and B |
| select * from order_item where oid=1 (reads A, B) | select * from order_item where oid=1 for update (reads A, B) | First read |
| | | Another transaction inserts C under order 1 and commits |
| select * from order_item where oid=1 (reads A, B) | select * from order_item where oid=1 for update (reads A, B, C) | Second read |

In MySQL, a plain select is a snapshot read, while insert, update, delete, and select ... for update are all current reads. A current read always reads the latest committed data, which is why the second current read above sees the phantom row C.

There is one more subtlety: if, after the snapshot read, the transaction itself runs update order_item set price=199 where oid=1 — a current read that touches the newly committed row C — then a subsequent select * from order_item where oid=1 will return C as well, because rows modified by the current transaction are always visible to its own snapshot reads.
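The effect of mixing a current read into a repeatable read transaction can be sketched as follows (same order_item example as above):

```sql
-- Inside a REPEATABLE READ transaction:
SELECT * FROM order_item WHERE oid = 1;           -- snapshot read: A, B

-- ... another transaction inserts C under order 1 and commits ...

UPDATE order_item SET price = 199 WHERE oid = 1;  -- current read: locks and updates A, B, and C
SELECT * FROM order_item WHERE oid = 1;           -- now returns A, B, C: the phantom appears
COMMIT;
```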

So MySQL's repeatable read does solve the phantom read problem for snapshot reads, but not for current reads — and whether or not it solves phantom reads, it is no substitute for the serializable isolation level.
