On MySQL InnoDB Engine Locking

Author: Fame

preface

After reading this article, you will learn:

1. Have a more comprehensive understanding of locks in MySQL.
2. Know what phantom reading is and how to avoid it.
3. How InnoDB Engine implements row-level locking.
4. The conditions, instances and how to avoid deadlock.
5. The SQL statements in this paper can be directly executed in MySQL, which is very important for you to do experiments by yourself. Only after you have done experiments by yourself can you remember them more deeply.

In addition, due to my level and limited time, if there is a mistake in the article, welcome to criticize and correct, thank you very much, of course, if there is any question also welcome comment area, study together to discuss common progress.

I believe that many students are not unfamiliar with the concept of MySQL lock, but feel that they do not understand it very thoroughly, always like a layer of veil. Before reading this article, it is recommended that you first understand the concepts of isolation level, current read, snapshot read, etc. Note: All the operations in this paper are based on MySQL 8.0.22 InnoDB engine, and the results may be different if they are executed in other large versions. (Note 📢 : This article is done under the (repeatable read) isolation level.) Next we will discuss which locks MySQL has and why.

1. What locks does MySQL have?

MySQL’s different storage engines support different locking mechanisms,All storage engines implement locking in their own way, the server layer does not need to know the specific implementation of the storage engine lock. Depending on the granularity of locks, MySQL locks can be divided into global locks, table-level locks, row-level locks (InnoDB), and others (self-increment ID locks). This article focuses on row-level locking. Others are just a brief description. Search for those you are interested in.

1.1 the global lock

Flush Tables with Read Lock: The typical use of a global lock is when a database is backed up and the global lock is used to maintain data consistency, but when the database tables are running on an engine called InnoDB it is used mysqldump… – single ws-transaction. If you use MyISAM for data backup, you can only add a global lock.

What kind of data inconsistencies can result if the data is backed up without a global lock? Here’s an example: the database contains a Wallet table and a stock table. When purchasing an item, both the Wallet and stock tables need to be deducted. The purchase happened during the database backup process. And the purchase occurred just after the wallet table was backed up and before the stock was backed up. Because the wallet table was backed up, this deduction was not recorded in the wallet backup file. However, the stock table had not been backed up when the inventory was deducted, so the inventory deduction was recorded in the backup file, which led to the inconsistent database data recovered by the backup file. The money was not deducted from the wallet, but the inventory was deducted.

Since the InnoDB engine supports snapshot reads, if you take a snapshot (–single-transaction) while the database is backed up, there will be no data inconsistencies even without global locking.

1.2 Table Lock:

Lock tables read: MyISAM and Memory storage engines typically use table locks for concurrency (they don’t support row-level locks), whereas InnoDB supports both table locks and row-level locks to improve concurrency (InnoDB’s row locks also degenerate to table locks if they are not used properly, which will be discussed later). In addition, the BDB storage engine uses page locks, and the granularity of page locks is between table level locks and row level locks.

MySQL > insert a read lock into a table; insert a read lock into a table; insert a read lock into a table; insert a read lock into a table The MDL write lock is applied when a table is to be used for a structure change operation DDL. To prevent table structure changes during data query or data update that may result in inconsistencies with the expected results of the query or update.

Imagine if there is no lock between the DDL operation and the CURD operation, and you can query the data simultaneously, and your query condition column is deleted, or the column you want to update is deleted, whether from MySQL statement execution aspect and the results of the business query data will not be strange. So MySQL introduces the concept of MDL locking.

InnoDB has two types of internally-used Intention Locks, both table level Locks, to allow row Locks and table Locks to coexist and to enable multi-granularity locking:

Intentional Shared Lock (IS) : A transaction must acquire the IS lock on a table before it can attach a read lock to a row.
Intentional exclusive lock (IX) : Before a transaction can write a lock on a row, it must acquire the IX lock on the table.

The compatibility relationship between intentional locks and locks is as follows:

Tips: X: eXclusive lock, S: Share lock, IX: Intention eXclusive Intention, IS: Intention Share

	X	IX	S	IS
IS	conflict	Compatible with	Compatible with	Compatible with
IX	conflict	Compatible with	conflict	Compatible with
S	conflict	conflict	Compatible with	Compatible with
X	conflict	conflict	conflict	conflict

In a word, the purpose of an intentional lock is to quickly detect the existence of a row lock in a table without traversing the entire index to find out that a row lock has been added to the table and cannot be added to the table. Here’s an example:

CREATE TABLE `wallet` (
  `user` VARCHAR(32) NOT NULL ,
  `balance` int(11) DEFAULT 0,
  PRIMARY KEY (`user`)
) ENGINE=InnoDB;
 
INSERT INTO wallet VALUES ("Tom",1000);

	session1	session2
T1	BEGIN; select from wallet where user = ‘tom’ for update; * Added row lock and IX lock	Unit 2 * *
T2		BEGIN; LOCK TABLES wallet READ; Block on table lock

Session 1 automatically adds iX to the wallet table when a row lock is added at T1. Sesion1 (IX) : Sesion1 (IX); Sesion1 (IX) : Sesion1 (IX); In this way, there is no need to traverse all indexes to see if there is a row-level lock, which greatly improves the efficiency of MySQL concurrency.

1.3 Row level lock:

In order to improve The concurrency, InnoDB storage engine introduces The concept of row level lock, row level lock involves exclusive lock, shared lock, row lock, interval lock, The next key lock and other concepts, which will be introduced in detail later. Note 📢 : The row locks mentioned later are actually row level locks. A lock that locks only one row is called a typical row lock.

1.4 Self-increasing ID lock

When a table contains an AUTO_INCREMENT field, MySQL uses a lock to prevent concurrent transactions from applying to the same AUTO_INCREMENT field. When a table contains an AUTO_INCREMENT field, MySQL uses a lock when a transaction applies to an AUTO_INCREMENT field. However, the self-incremending ID lock is not transactional (in other words, the lock is not held throughout the transaction) and is released as soon as the self-incremending ID is acquired.

2. Why lock them?

2.1 Ensure data consistency

Here is a scenario in which the data is inconsistent because the lock is not locked. Let’s assume we have a wallet table that stores the balance of a user’s wallet.

CREATE TABLE `wallet` ( `user` VARCHAR(32) NOT NULL , `balance` int(11) DEFAULT 0, PRIMARY KEY (`user`) ) ENGINE=InnoDB; INSERT INTO wallet VALUES ("Tom",1000);

Scenario: User Tom has 1000 yuan in his wallet. The server now wants to purchase an item for $1000, but for some reason (client, gateway retry, malicious attack, etc.), the server receives two Tom purchase requests at the same time. Assume that the execution logic of the business is:

1. Query database;
2. Verify that the balance is sufficient;
3. If sufficient, deductions shall be made;
4. If insufficient, return deduction fails;

(Note 📢 : this article is all done under the (Repeatable-Read) isolation level)

Assuming that the business logic is executed in the following timing sequence, what can go wrong? Can you think about the implementation of black results?

	session1	session2
T1	BEGIN; select * from wallet where user = ‘tom’;
T2	Whether the balance of business code check is sufficient A balance of 1000 is sufficient	BEGIN; select * from wallet where user = ‘tom’;
T3		Whether the balance of business code check is sufficient A balance of 1000 is sufficient
T4	UPDATE wallet set balance = balance – 1000 WHERE user = ‘tom’;
T5		select * from wallet where user = ‘tom’; balance = ?
T6		UPDATE wallet set balance = balance – 1000 WHERE user = ‘tom’; Does it block?
T7	select * from wallet where user = ‘tom’; balance = ?
T8	commit;
T9	select * from wallet where user = ‘tom’; balance = ?
T10		select * from wallet where user = ‘tom’; balance = ?
T11		commit

The following table is the result of execution:

	session1	session2
T1	BEGIN; select * from wallet where user = ‘tom’;
T2	Whether the balance of business code check is sufficient A balance of 1000 is sufficient	BEGIN; select * from wallet where user = ‘tom’;
T3		Whether the balance of business code check is sufficient A balance of 1000 is sufficient
T4	UPDATE wallet set balance = balance – 1000 WHERE user = ‘tom’;
T5		select * from wallet where user = ‘tom’; balance = 1000 If you don’t know why it’s 1000, can you search for the repeatable read isolation level
T6		UPDATE wallet set balance = balance – 1000 WHERE user = ‘tom’; Does it block? Yes, because the UPDATE statement defaults to write the lock, session 2 has to wait since session1 already has a lock
T7	select * from wallet where user = ‘tom’; balance = 0
T8	commit;	When session 1 commits, the UPDATE statement is no longer blocked and the execution succeeds, and balance becomes -1000
T9	select * from wallet where user = ‘tom’; balance = -1000
T10		select * from wallet where user = ‘tom’; balance = -1000 Even at the repeatable read isolation level, you can see your own commits, so it’s -1000, not 0
T11		commit

As you can see, if there is no lock, the balance will be negative, resulting in inconsistent data. In the same scenario, what if I put a lock on it?

	session1	session2
T1	BEGIN; select * from wallet where user = ‘tom’ for update;
T2	Whether the balance of business code check is sufficient A balance of 1000 is sufficient	BEGIN; select * from wallet where user = ‘tom’ for update; blocking
T3	UPDATE wallet set balance = balance – 1000 WHERE user = ‘tom’;
T4	commit;	Session 2 no longer blocks after session 1 commits.It reads that the balance is 0
T5		Whether the balance of business code check is sufficient Balance 0 returns failed

In the same scenario as above, locking can avoid data inconsistencies.

2.2 Avoid phantom reading

The existing table student. The table structure and the records in the table are as follows:

CREATE TABLE `student` (
  `student_id` int NOT NULL,
  `name` VARCHAR(32) NOT NULL ,
  `age` int(11) DEFAULT 0,
  PRIMARY  KEY (`student_id`),
    KEY `age`(`age`) USING BTREE
) ENGINE=InnoDB;
INSERT INTO student VALUES(6,'Tom',6),(11,'Jerry',11),(18,'Jamey',18),(25,'Marvin',25);

The following operations demonstrate what phantom reading is.

	session1	session2
T1	BEGIN; SELECT * from student where student_id > 6 and student_id < 20; rows 2 11 Jerry 18 Jamey
T2		INSERT into student VALUES(16,’Lucy’,16);
T3	SELECT * from student where student_id > 6 and student_id < 20; rows 2 11 Jerry 18 Jamey
T4	INSERT into student VALUES(16,’Lucy’,16); Duplicate entry ’16’ for key ‘student.PRIMARY’
T5	SELECT * from student where student_id > 6 and student_id < 20; rows 2 11 Jerry 18 Jamey
T6	commit;

Select * from T2 where student_id=16 select * from T2 where student_id=16 select * from T2 where student_id=16 select * from T2 where student_id=16 Student_id =16; student_id=16; It’s like having an illusion, where you can’t find the data, but when you insert it, it conflicts. This phenomenon is called phantom reading. Resolving phantom reads requires setting the Serializable isolation level or using a lock when reading data (current read). We will discuss how to solve phantom reading by locking.

3. Row-level locks

3.1 For Update lock

In this section, we will discuss why row-level locking can solve phantom reading. How is it solved? Let’s take a look at the results of the following statement. Still is to use the student table, there are four rows of data in the table, building table and insert statements are as follows.

CREATE TABLE `student` (
  `student_id` int NOT NULL,
  `name` VARCHAR(32) NOT NULL ,
  `age` int(11) DEFAULT 0,
  PRIMARY  KEY (`student_id`),
    KEY `age`(`age`) USING BTREE
) ENGINE=InnoDB;
INSERT INTO student VALUES(6,'Tom',6),(11,'Jerry',11),(18,'Jamey',18),(25,'Marvin',25);

	session1	session2
T1	BEGIN; SELECT * from student where student_id > 6 and student_id < 20 FOR UPDATE;
T2		SELECT * from student where age > 6 and age < 20; Block? no
T3		INSERT into student VALUES(17,’Lily’,17); block? yes
T4		INSERT into student VALUES(21,’Lily’,21); block? yes
T5		NSERT into student VALUES(26,’Lily’,26); block? No

Row locks are performed by locking index entries, not on real records. So The next key lock is used only on an index query, and The whole table is locked without an index (degenerates to a table level lock).

Why do T3 and T4 block, but T5 doesn’t? This leads to a new concept called Gap Lock (Note: Gap Lock is unique to the RR level. The RC level locks only rows, not gaps, as described later in this article). Gap lock, which locks the space between two values. SELECT * from student where student_id > 6 and student_id < 20 FOR UPDATE (student_id = 11 and student_id = 11 18) and would lock the space between them. As shown in the figure below:

Here are the two principles and two optimizations of InnoDB locking (here using the MySQL 45 teacher Dingqi). This involves the concept of Next-Key Lock, which consists of a gap lock and a typical row lock. As shown in the figure above, the next-key lock added at index 11 is composed of the gap lock of 6-11 and the typical row lock of 11. Note that the next-key lock is left open and right closed, which is similar with mathematical notation: (6, 11). Let’s look at the principle of InnoDB engine locking at RR level:

Principle 1: The basic unit of locking is the next-key lock. A next-key lock is a lock that opens left and closes right.
Principle 2: Locks are used only when objects are accessed during the lookup process.
Optimization 1: Equivalent queries on indexes. When locking a unique index, the next-key lock degenerates to a row lock (typical row lock).
Optimization 2: Equivalent query on the index. When traversing to the right and the last value does not meet the equivalence condition, the next-key lock degenerates into a gap lock.

There are two principles and two optimizations we can use to verify the locking logic in the figure below. Student_id = 6; Because the condition is >6, row 6 is not accessed, so row 6 is not locked. It also satisfies Principle 1, that is, next-key lock is left open and right closed. Student_id = 25; select * from student_id where student_id = 25; The next-key lock degenerates to a gap lock, so 25 is not locked. Student_id = 11 and student_id = 18

Now we can answer the question at the beginning of the section, why does row-level locking solve phantom reading?

1. For update or lock in share mode, the current read will read the latest committed record, so there is no illusion.
1. Because gap locks are placed between indexes, other transactions cannot insert new records, and there is no magic reading.

Row locks can only be used if the execution plan actually uses the index: Even if index fields are used in the condition, it is up to MySQL to decide whether or not to use index to retrieve data by judging the cost of different execution plans. If MySQL considers full table scans to be more efficient, such as for some very small tables, it will not use index. In this case InnoDB will use table locks instead of row locks. Therefore, when analyzing lock conflicts, don’t forget to check the SQL execution plan (which can be checked through EXPLAIN) to verify that the index is actually being used. Supplement 2: The update statement is locked by default and is the current read. In InnoDB transactions, row locks are added when they are needed, but they are not released as soon as they are not needed. Instead, they are released at the end of the transaction. This is the two-phase locking protocol.

3.2 Lock In Share Mode

Lock In share mode can also be used to read the current session, and other sessions cannot write to the session. But the current session can be written. The Lock In share mode is mutually exclusive with the for update locks of other transactions, but is shared with the Lock In share mode of other transactions. The result of the following statement explains the above conclusion.

	session1	session2
T1	BEGIN;
T2		INSERT into student VALUES(10,’Lily’,17);
T3	SELECT * from student where student_id > 6 and student_id < 20 Can you find the data you just inserted? Yes, because begin alone doesn’t actually take a snapshot, but instead generates the snapshot after executing the SQL. (Interested can understand the snapshot read)
T4		INSERT into student VALUES(12,’Lily’,17);
T5	SELECT * from student where student_id > 6 and student_id < 20 Can you find the data you just inserted? No, at the RR level, this is a snapshot read at the moment
T6	SELECT * from student where student_id > 6 and student_id < 20 LOCK IN SHARE MODE; Can you find the data you just inserted? Yes, because share mode is the current read
T7		INSERT into student VALUES(19,’Lucy’,21) Block? Yes, a lock conflict with Session 1.
T8	INSERT into student VALUES(15,’Lucy’,21); Block? No, when the same session applies for a write lock, it will not conflict with the read lock it applies for. However, it is generally not allowed to do so, which may cause deadlock. This will be described below

4. A deadlock

4.1 Generation and avoidance of deadlock

Since the lock is added, it is inevitable to create a deadlock. Now let’s review the four necessary conditions for creating a deadlock.

(1) Mutual exclusion: A resource can only be used by one process at a time.
(2) Request and hold condition: when a process is blocked by requesting resources, it will hold on to the acquired resources.
(3) No deprivation condition: the resources acquired by the process cannot be forcibly deprived before the end of use.
(4) Circulating waiting condition: a head-to-tail cyclic waiting resource relationship is formed among several processes.

As long as any of the four necessary conditions above are broken, the deadlock will not occur. When we use locks in everyday development, the way to avoid deadlocks is usually to break conditions 2 and 4. Break 2: After the failure of the lock application, the previously applied lock should be released. You cannot apply for a lock while waiting for a lock. Break # 4: An easy way to avoid loop waiting is to number the locks. For example: lock 1, lock 2, lock 3. For example, thread A needs lock 1 and lock 3, so it should first apply for 1 and then apply for 3, but cannot apply for 3 and then apply for 1. Thread B needs lock 1 and lock 2, so it should first apply for lock 1 then apply for lock 2. In this way, thread A and thread B do not apply for the lock to form A loop. How does MySQL solve the deadlock problem? In fact, MySQL has deadlock detection, that is to say, when a deadlock occurs, MySQL can detect it. When multiple transactions produce a deadlock, MySQL will choose a less costly transaction to roll back. Deadlock detection is a drain on MySQL’s performance and can be turned off by setting parameters, but it is not recommended. When deadlock detection is turned off, MySQL can reverse the deadlock problem through the lock timeout mechanism. When the lock wait time exceeds the set threshold, the transaction will automatically roll back. The lock timeout setting method is shown in Appendix.

Here are a few scenarios where MySQL can cause deadlocks.

4.2 Several scenarios of deadlock

4.2.1 Deadlocks caused by improper use of Share Mode

The following statement still uses the student table.

	session1	session2
T1	BEGIN; SELECT * from student where student_id > 6 and student_id < 20 LOCK IN SHARE MODE;	BEGIN;
T2		SELECT * from student where student_id > 6 and student_id < 20 LOCK IN SHARE MODE;
T3		INSERT into student VALUES(19,’Lucy’,21);
T4	INSERT into student VALUES(19,’Lucy’,21); dead lock

The cause of the deadlock: Session 1 applies a read lock at T1, and Session 2 applies a read lock at T2. When session 2 applies for a write lock at time T3, it waits for session 1’s read lock. Session 1 will wait for Session2’s read lock when requesting a write lock at time T4. Loop wait is created, leading to deadlocks.

Therefore, when using Lock in share mode, it is not recommended to update the record, which can easily cause deadlock and rollback.

4.2.2 Deadlocks caused by improper update sequence

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `v` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

	session1	session2
T1	BEGIN; update t set v = v+1 where id = 1;
T2		begin; update t set v = v+1 where id = 2;
T3	update t set v = v+1 where id = 2;
T4		update t set v = v+1 where id = 1; deadlock

Session 1 imposes a typical row lock on table t at T1 where id=1. Session 2 applies a typical row lock to table t at time T2 (id = 2). Session 1 is waiting for session 2 to release the write lock on line id=2. Session 1 is waiting for session 2 to release the write lock on line id=2. Session 1 is waiting for session 2 to release the write lock on line id=2. In T4, session 2 applies for the row lock with id=1, but the row lock with id=1 is held by session 1. So session 1 and session2 create a loop wait that leads to deadlocks.

4.2.3 Deadlocks arising from concurrency scramble

CREATE TABLE `t` (
  `c` int(11) DEFAULT NULL,
  `d` int(11) DEFAULT NULL,
  PRIMARY KEY (`c`)
) ENGINE=InnoDB;

	session1	session2	session3
T1	BEGIN; INSERT into t VALUES (5, 5);
T2		INSERT into t VALUES (5, 5); block
T3			INSERT into t VALUES (5, 5); block
T4	ROLLBACK;	Affected rows: 1	Deadlock found

As for the deadlock in this case, to be honest, I don’t have a reasonable explanation for it. If you have a plausible explanation, please feel free to answer it in the comments section.

5. To summarize

In this review, I first introduced the locks in MySQL and the basic concepts of these locks, so that you can have a global understanding of locks. Secondly, through data consistency and phantom scenarios, we introduce why locking is needed and how row level locking works in InnoDB. Finally, several scenarios for deadlock generation are introduced. The SQL statements in this article can be copied directly to MySQL to run, so it is recommended that you execute them yourself, so that you will be more impressed.

Finally, it is emphasized that the above experiments are rated RR. If the isolation level is changed to read committed RC (read committed), then for update, in share mode, etc., the row level will be reduced to typical row lock, i.e. no gap lock will be added. If you run the example in Section 3.1 at the RC isolation level, you will get different results. If you are interested, try it.

	session1	session2
T1	BEGIN; SELECT * from student where student_id > 6 and student_id < 20 FOR UPDATE;
T2		SELECT * from student where age > 6 and age < 20; Block? no
T3		INSERT into student VALUES(17,’Lily’,17); block? no
T4		INSERT into student VALUES(21,’Lily’,21); block? no
T5		NSERT into student VALUES(26,’Lily’,26); block? No

The appendix

View the isolation level

 show variables like 'transaction%';

Set the isolation level

Set Global Transaction Isolation Level Repeatable Read; Set global transaction isolation level read committed;

Mysql > check MySQL version

select version();

Set the lock timeout time

 SHOW GLOBAL VARIABLES LIKE 'innodb_lock_wait_timeout';
 SET GLOBAL innodb_lock_wait_timeout=3600;