This is the 17th day of my participation in the August Text Challenge.More challenges in August

One, foreword

A selectOrInsert scenario for MySQL was recently designed in a project. In order to improve performance, I adopted the idea of DCL (Double Check Lock). The system has been up and running for a long time with no problems, but occasionally it throws a deadlock exception. Here, I’ll take a look at this scenario and why deadlocks occur.

Ii. Scene Description

In a real project, we have a scenario where we need to convert some string ID supplied by a third party to a globally unique numeric ID. So we designed a table called T_ID_mapper. It contains two fields: ID of the increment type (numeric type) and O_ID(Other ID) of the string type, which is a unique index. The service scenarios are as follows:

  1. Get the tripartite string type ID: strId
  2. Determine whether the strId exists in the t_ID_mapper database
  3. If so, the corresponding numeric type ID is returned
  4. If not, the strId is inserted into the table T_ID_mapper and its corresponding increment field ID is returned

So I wrote something like this:

  1. Select * from t_ID_mapper where o_id = :strId
  2. If yes, the ID field is returned.
  3. Select * from t_id_mapper where o_id = :strId for update
  4. It then checks again if the record exists and returns the corresponding ID field if it does.
  5. If not, the INSERT into statement inserts the strId content into t_ID_mapper, and then commits the transaction. And return the ID of the increment sequence, which is the ID of the number type corresponding to strId.

The pseudocode for the above operation is as follows:

public int selectOrInsert(String strId) { id = doSelect(strId); If (null == id) {transaction {// Enable transaction ID = doSelectForUpdate(strId); If (id == null) {id = doInsert(strId); // Perform insert}}// commit transaction} return ID; }Copy the code

The DCL process for MySQL is as follows:

The whole process doesn’t look like a problem and is in full compliance with the DCL, the first lock-free judgment. The second lock judgment, followed by an INSERT during the lock holding period (within a transaction), prevents concurrent writes.

However, after running for some time, the program throws a deadlock exception at the insert location in the diagram.

There is something wrong with the DCL schema in MySQL. Let’s take a closer look at why.

3. Process analysis

1. Deadlock log analysis

The first thing we should think of when we encounter a deadlock exception is to look at the deadlock log. So we first got the following deadlock log from the DBA.

------------------------ LATEST DETECTED DEADLOCK ------------------------ 200526 17:49:17 *** (1) TRANSACTION: TRANSACTION 7892ECEC4, ACTIVE 50 sec inserting mysql tables in use 1, locked 1 LOCK WAIT 4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1 MySQL thread id 41920347, OS thread handle 0x7f8fe5598700, Query ID 26349859046 10.174.41.70 pandA_rw update INSERT into tm_***_center (id, template_id, template_type, modify_time, create_time ) values(null, '***', 2, now(), now()) *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1326 page no 4 n bits 256 index `UNIQ_IDX_TEMPLATE` of table `panda`.`tm_***_center` trx id 7892ECEC4 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 168 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 30; hex 393064383166343162336330343833343930353432383962313863663932; asc 90d81f41b3c048349054289b18cf92; (total 32 bytes); 1: len 4; hex 000000f5; asc ;; *** (2) TRANSACTION: TRANSACTION 7892F00C6, ACTIVE 35 sec inserting, thread declared inside InnoDB 500 mysql tables in use 1, locked 1 4 lock struct(s), heap size 1248, 2 row lock(s), undo log entries 1 MySQL thread id 41920345, OS Thread Handle 0x7F8FF7041700, query ID 26349859047 10.174.41.70 pandA_rw update insert into tm_***_center (id, template_id, template_type, modify_time, create_time ) values(null, '***', 2, now(), now()) *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 1326 page no 4 n bits 256 index `UNIQ_IDX_TEMPLATE` of table `panda`.`tm_***_center` trx id 7892F00C6 lock_mode X locks gap before rec Record lock, heap no 168 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 30; hex 393064383166343162336330343833343930353432383962313863663932; asc 90d81f41b3c048349054289b18cf92; (total 32 bytes); 1: len 4; hex 000000f5; asc ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 1326 page no 4 n bits 256 index `UNIQ_IDX_TEMPLATE` of table `panda`.`tm_***_center` trx id 7892F00C6 lock_mode X locks gap before rec insert intention waiting Record lock, heap no 168 PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 30; hex 393064383166343162336330343833343930353432383962313863663932; asc 90d81f41b3c048349054289b18cf92; (total 32 bytes); 1: len 4; hex 000000f5; asc ;; *** WE ROLL BACK TRANSACTION (2)Copy the code

From the deadlock log we can see that transaction 1 is waiting for an intent lock to be inserted on the primary key.

Transaction 2 holds a gap lock:

While session2 waits for the intent lock to be inserted:

Before analyzing the reasons for the above scenario, we should first clarify the following points:

MySQL lock correlation is only a point

A, MySQL locks are compatible with each other

As can be seen from the following table, after a gap lock is added to a position, gap lock and next-key lock can be added to the region, but intention lock cannot be inserted.

That is, gap locks and gap locks are compatible with next-key locks, but gap locks and insert intent locks are mutually exclusive.

B. Basic locking mechanism for update

Select * from UPDATE where id=3 select * from update where id=3; select * from update where id=3;

If id=3 does not exist, a gap lock will be added. For example, if only records with id=1 and ID =6 exist in the database, a gap lock (1, 6) is encountered.

PS: Here we have just listed some of the for update scenarios. There are many other scenarios for the for UPDATE statement.

Deadlock reappears

Next, we continue to analyze the above problems. With that in mind, if we query for a new strId (that is, the table does not exist), it will apply a gap lock as we discussed above in the for UPDATE query where the record does not exist. Therefore, we can organize the above problems into the following execution process.

From the table above, we can clearly see why deadlocks occur in our DCL scenario. Below we detail the execution at each point in time.

  1. First, both T1 and T2 time points are lockless queries (i.e. snapshot reads), so it must be possible to execute.
  2. At t3, session1 locks the query record, because the record does not exist, the gap lock will be added at this time.
  3. In t4, session2 locks the query, but also because the record does not exist, the gap lock. The gap lock has been added at t3, why can the lock be successfully added at this time? Because according to the above knowledge we know that gap lock and gap are compatible with each other.
  4. At t5, session1 performs insert operation, need to insert intent lock. Session1 must wait because the intent lock cannot be successfully locked at this time because it is mutually exclusive with an existing gap lock.
  5. In t6, session2 performs the insert operation and also needs to insert the intent lock. Also, because the lock is mutually exclusive with the existing gap lock, it must wait at this time. Session1 holds the gap lock, waits until the intention lock is inserted, session2 also holds the gap lock, waits until the intention lock is inserted. Hence the scenario where session1 and session2 wait for each other, a deadlock. In this scenario, MySQL will throw a deadlock exception and select a transaction to roll back.

At this point, we understand why the DCL schema in MySQL causes deadlocks.

Iv. Solutions

In fact, we can see from the above analysis that we cannot use DCL mode in this scenario. We can perform this selectOrInsert scenario without locking. That is, a lock free SELECT query is performed first. If no record is found, an INSERT operation is performed. Because the O_ID field is a unique index, the subsequent INSERT thread will throw a DuplicateKey exception if a conflict occurs. When we get the DuplicateKey exception, we re-execute the lockless query to find the ID of the numeric type for the existing o_id (inserted by another thread). The corresponding pseudocode is as follows:

public int selectOrInsert(String strId) { id = doSelect(strId); If (id == null) {try {id = doInsert(strId); } catch (DuplicateKey e) {id = doSelect(strId); }} return id; }Copy the code

Five, the convention

If you have any questions or comments about this article, please add an official account to discuss it. (Add an official account to get 10GB video and graphic materials on “Java Advanced Architecture”.)