An overview of the

Our database will generally execute multiple transactions concurrently, and multiple transactions may concurrently add, delete, change and check the same batch of data, which may cause problems such as dirty write, dirty read, unrepeatable read and phantom read.

The essence of these problems is the multi-transaction concurrency problem of database. In order to solve the multi-transaction concurrency problem, database designed transaction isolation mechanism, lock mechanism, MVCC multi-version concurrency control isolation mechanism. There are a whole set of mechanisms for dealing with the multi-transaction concurrency problem, and we’ll go through these mechanisms to give you a thorough understanding of the internal execution of the database.

Note: The concepts and examples mentioned in this article are based on mysqL-5.7.x

Transactions and their ACID properties

A transaction is a logical processing unit consisting of a set of SQL statements, and a transaction has four properties, often referred to simply as the ACID property of the transaction.

  • Atpmicity: A transaction is an atomic unit of operation whose changes to data are either all performed successfully or all failed and rolled back, mainly representing the level of transaction operation.
  • Consistent: Data must be Consistent at the beginning and completion of a transaction, which means that all relevant data rules must be applied to the modification of the transaction to maintain data integrity.
  • Isolation: The database system provides some Isolation mechanism to ensure that transactions are not executed in an “isolated” environment that is affected by external concurrent operations. This means that the intermediate state in a transaction is not visible to the outside world. And vice versa.
  • Durable: Data modification after transaction processing is Durable, even in case of system failure.

Problems with concurrent transaction processing

Lost Update or dirty write

When two or more transactions select the same row and then update it with the value originally selected, the problem of lost updates occurs because each transaction is unaware of the existence of the other transactions – the last update overwrites the updates made by the other transactions

Dirty Reads

A transaction is manipulating a piece of data, but the data is already in an inconsistent state before the transaction is completed and advanced. At this point, another transaction fetches the same record, and if unchecked, the second transaction reads the “dirty” data and performs further processing, resulting in uncommitted data relationships. This phenomenon is called “dirty” reading.

Simple: Transaction A reads data that transaction B has modified but has not yet committed, and acts on that data. If B rolls back, the data read by A is invalid, which is inconsistent with the consistency principle.

No-repeatable Reads

An event in which a transaction reads previously read data again after reading some data, only to discover that the read data has changed or some data has been deleted, is called “non-repeatable read”.

Simple understanding: the same query statement inside transaction A reads inconsistent results at different times, which does not comply with isolation.

Phantom Reads

The phenomenon of a transaction re-reading previously retrieved data at the same time query only to find that other transactions insert new data that meets the query criteria is called “phantom read.”

Simple understanding: Transaction A has read the new data submitted by transaction B, which does not comply with isolation.

Transaction isolation level

“Dirty reads”, “unrepeatable reads”, and “phantom reads” are all database consistency problems, which must be solved by the transaction isolation mechanism provided by the database.

The more stringent the transaction isolation of the database, the less problems concurrency presents, but the more costly it is, because transaction isolation essentially means that transactions are “serialized” to a certain extent, which is obviously contradictory to “concurrency.” At the same time, different applications have different requirements for read consistency and transaction isolation. For example, many applications are not sensitive to “unrepeatable reads” and “phantom reads” and may be more concerned with the ability to access data concurrently.

To query the current transaction isolation level of the database, run the following command:

mysql> show variables like 'tx_isolation';+---------------+-----------------+ | Variable_name | Value | +---------------+-----------------+ | tx_isolation | The REPEATABLE - READ | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + 1 row in the set (0.01 SEC)Copy the code

Set transaction isolation level:

set tx_isolation = 'REPEATABLE-READ';
Copy the code

MySQL’s default transaction isolation level is repeatable. When developing programs with Spring, if the isolation level is not set, the MySQL isolation level is used by default. If the isolation level is set in Spring configuration, the configured isolation level is used

MySQL lock details

A lock is a mechanism by which a computer coordinates the concurrent access of multiple processes or threads to a resource.

In addition to the contention for traditional computing resources (such as CPU, RAM, I/O), data is also a resource that needs to be shared by users in a database. How to ensure the consistency and validity of concurrent data access is a problem that all databases must solve. Lock conflict is also an important factor that affects the performance of concurrent data access.

The classification of the lock

  • Performance is divided into optimistic locking (implemented by comparing version numbers) and pessimistic locking

  • Data operation types can be divided into read locks and write locks (both pessimistic locks).

    Read lock (Shared lock) : Multiple read operations can be performed simultaneously for the same data without affecting each other.

    Write locks (eXclusive) : This lock blocks other write locks and read locks until the current write operation is complete.

  • From the granularity of data operations, it can be divided into table locks and row locks

Table locks

Each operation locks the entire table. Low overhead, fast lock; No deadlocks occur; When the lock granularity is struck, the probability of lock conflict is the highest and the concurrency is the lowest. This parameter is used in the scenario of data migration of the entire table.

Basic operation

Create table SQL
create table `mylock` (
	`id` int(11) not null auto_increment,
  `name` varchar(20) default null.primary key(`id`)
) engine = Myisam default charset = utf8;

Insert data
insert into `mylock` (`id`, `name`) values ('1' , 'a');
insert into `mylock` (`id`, `name`) values ('2' , 'b');
insert into `mylock` (`id`, `name`) values ('3' , 'c');
insert into `mylock` (`id`, `name`) values ('4' , 'd');
Copy the code

Manually add table locks

lock tableTable name Read (write) indicates the table name2 read(write)
Copy the code

View the locks added to the table

show open tables;
Copy the code

Delete table locks

unlock tables;
Copy the code

Add read lock

mysql> lock table mylock read;
Query OK, 0 rows affected (0.00 sec)
Copy the code

This table can be read by the current session and other sessions

An error is reported if a lock is inserted or renewed in the current session, and any other session is inserted or renewed.

mysql> insert into `mylock` (`id`, `name`) values ('5' , 'd');
ERROR 1099 (HY000): Table 'mylock' was locked with a READ lock and can't be updated
Copy the code

Add write lock

mysql> lock table mylock write;
Query OK, 0 rows affected (0.00 sec)
Copy the code

The current session does not add, delete, or query the current table, but all operations on the table of other sessions are blocked.

conclusion

  1. Read operations (lock reads) on MyISAM tables do not block other processes’ read requests to the same table, but block write requests to the same table. Only when the read key is released will other processes write.
  2. Write operations on MyISAM table (add write lock) will block other processes’ read and write operations on the same table. Only after the write lock is released, other processes will execute write operations on the same table.

Row locks

Each operation locks a row of data. High overhead (will involve the query of the table, so the overhead is high), slow lock; Deadlocks occur; The locking granularity is minimum, the probability of conflict is lowest, and the concurrency is highest.

The biggest difference between InnoDB and MySIAM

  • InnoDB supports Transaction
  • InnoDB supports row-level locking

Row locking scenario

If a session starts a transaction update and does not commit it, another session will block updating the same row of records. Updating different records will not block.

The following figure is a process of my two clients starting a transaction and waiting for another transaction.

Conclusion:

MyISAM will automatically lock all related tables before executing the SELECT statement, and will automatically lock all related tables after executing the UPDATE, INSERT, and DELETE operations.

InnoDB does not lock the SELECT query because of MVCC. However, update, INSERT, and DELETE operations add row locks.

In short, a read lock blocks writes, but not reads, whereas a write lock blocks both reads and writes.

Row lock and transaction isolation level analysis

Create table SQL
create table `account` (
	`id` int(11) not null auto_increment,
  `name` varchar(20) default null,
  `balance` int(11) default null.primary key(`id`)
) engine = InnoDB default charset = utf8;

Insert data
insert into `account` (`id`, `name`, `balance`) values ('1' , 'zhangsan'.100);
insert into `account` (`id`, `name`, `balance`) values ('2' , 'lisi'.200);
insert into `account` (`id`, `name`, `balance`) values ('3' , 'wangwu'.300);
insert into `account` (`id`, `name`, `balance`) values ('4' , 'zhaoliu'.400);
Copy the code

Read uncommitted

  1. Open client A, set the current transaction mode to Read uncommitted, and query the initial value of the table account:
- Client A
mysql> set tx_isolation ='read-uncommitted';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from account;
+----+----------+---------+
| id | name     | balance |
+----+----------+---------+
|  1 | zhangsan |     120 |
|  2 | lisi     |     200 |
|  3 | wangwu   |     300 |
|  4 | zhaoliu  |     400 |
+----+----------+---------+
4 rows in set (0.00 sec)

Copy the code
  1. Before client A commits the transaction, open another client B and modify the balance of Zhang SAN.
-- Client B
mysql> set tx_isolation ='read-uncommitted';
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> update account set balance = 100 where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from account;
+----+----------+---------+
| id | name     | balance |
+----+----------+---------+
|  1 | zhangsan |     100 |
|  2 | lisi     |     200 |
|  3 | wangwu   |     300 |
|  4 | zhaoliu  |     400 |
+----+----------+---------+
4 rows in set (0.00 sec)
Copy the code
  1. In this case, client B does not commit data, but client A can query the data that is not submitted by client B. This indicates that A dirty read occurs
- Client Amysql> set tx_isolation ='read-uncommitted';Query OK, 0 rows affected (0.00 SEC)
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from account;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 120 | | 2 | lisi | 200 | | 3 | wangwu | 300 | | | 4 zhaoliu | | + 400 - + -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- - + 4 rows in the set (0.00 SEC)
mysql> select * from account;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 100 | | 2 | lisi | 200 | | 3 | wangwu | 300 | | | 4 zhaoliu | | + 400 - + -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- - + 4 rows in the set (0.00 SEC)Copy the code
  1. However, the client transaction may roll for some reason, and all operations will be small. Then the data queried by client A is dirty data
-- Client Bmysql> rollback;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from account;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 120 | | 2 | lisi | 200 | | 3 | wangwu | 300 | | | 4 zhaoliu | | + 400 - + -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- - + 4 rows in the set (0.00 SEC)Copy the code
  1. Execute the update statement on client Aupdate account set balance = balance - 50 where id =1; Id = 1 becomes 70. No. 50, the data are inconsistent,In the application if we use. 100 minus 50 is 50. If you do not know of any other application rollback, you can solve the problem by reading the committed isolation level
- Client A
mysql> update account set balance = balance - 50 where id =1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from account;
+----+----------+---------+
| id | name     | balance |
+----+----------+---------+
|  1 | zhangsan |      70 |
|  2 | lisi     |     200 |
|  3 | wangwu   |     300 |
|  4 | zhaoliu  |     400 |
+----+----------+---------+
4 rows in set (0.00 sec)
Copy the code

Reading has been submitted

  1. Client A queries the account table and submits the data modification.

    set tx_isolation =’read-committed’;

    - Client Amysql> set tx_isolation ='read-committed';Query OK, 0 rows affected, 1 Warning (0.00 SEC) -- Start transactionmysql> begin;Query OK, 0 rows affected (0.00 SEC) -- Update account balance (id = 1)mysql> update account set balance = 700 where id = 1;Query OK, 1 row affected (0.01sec) Rows matched: 1 Changed: 1 Warnings: 0mysql> commit;
    Query OK, 0 rows affected (0.00 sec)
    Copy the code
  2. Client B queries the data before and after client A submits the data. The same query statement in the transaction reads inconsistent results at different times, which is inconsistent with isolation. An unrepeatable read problem has occurred

    set tx_isolation =’read-committed’;

    -- Client B -- starts the transactionmysql> begin;Query OK, 0 rows affected (0.00 SEC) -- Client A queries before committingmysql> select * from account;+----+-----------+---------+ | id | name | balance | +----+-----------+---------+ | 1 | zhangsan | 70 | | 2 | lisi | 200  | | 3 | wangwu | 300 | | 4 | huayi | 400 | | 6 | wangmazi | 600 | | 8 | zhaoer | 800 | | 9 | wangmazi9 | 600 | + - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + 7 rows in the set (0.00 SEC) - after the client submits A querymysql> select * from account;+----+-----------+---------+ | id | name | balance | +----+-----------+---------+ | 1 | zhangsan | 700 | | 2 | lisi | 200 | | 3 | wangwu | 300 | | 4 | huayi | 400 | | 6 | wangmazi | 600 | | 8 | zhaoer | 800 | | 9 | wangmazi9 | 600 | + - + -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + 7 rows in the set (0.00 SEC)Copy the code

Repeatable read

  1. Client A starts the transaction and client B also starts the transaction and inserts A piece of data
-- CLIENT B set TX_ISOLATION = 'REPEATABLE-READ';
mysql>  select * from account;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 70 | | 2 | lisi | 200 |  | 3 | wangwu | 300 | | 4 | huayi | 400 | | 6 | wangmazi | 600 | | 8 | zhaoer | 800 | +----+----------+---------+ 6 rows In the set (0.00 SEC)Copy the code
  1. Client A modifies the data added by B, and then it can be queried in the query. So there is illusory, which violates isolation.
- Client A
-- The client submits the query
mysql>  select * from account;
+----+----------+---------+
| id | name     | balance |
+----+----------+---------+
|  1 | zhangsan |      70 |
|  2 | lisi     |     200 |
|  3 | wangwu   |     300 |
|  4 | huayi    |     400 |
|  6 | wangmazi |     600 |
|  8 | zhaoer   |     800 |
+----+----------+---------+

-- Update data 9
mysql> update account set balance =   9000 where id = 9;
Query OK, 1 row affected (7.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

-- Requery visibility, violation of isolation
mysql>  select * from account;
+----+-----------+---------+
| id | name      | balance |
+----+-----------+---------+
|  1 | zhangsan  |      70 |
|  2 | lisi      |     200 |
|  3 | wangwu    |     300 |
|  4 | huayi     |     400 |
|  6 | wangmazi  |     600 |
|  8 | zhaoer    |     800 |
|  9 | wangmazi9 |    9000 |
+----+-----------+---------+
7 rows in set (0.00 sec)
Copy the code

Repeatable reads are MySQL’s default isolation level

serializable

  1. Open A client A, set the current transaction mode to SerialIZABLE, and query the initial value of the table account.

set tx_isolation = ‘serializable’;

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from account where id = 1;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 700 | + - + -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + 1 row in the set (0.00 SEC)Copy the code
  1. If you open a client B and set transaction mode to serialIZABLE, update innonDB with the same id 1 will be blocked, and update innonDB with the same ID 2 will be successful.

    If client A performs A range query, then all rows within that range, including the gap range of each row, will be locked even if the row has not been inserted. At this point, if client B inserts data in this range, it will be blocked, so phantom reading is avoided.

    This isolation level is extremely low concurrency and is rarely used in development

set tx_isolation = ‘serializable’;

mysql> set tx_isolation = 'serializable';Query OK, 0 rows affected (0.00 SEC)
mysql>  begin;Query OK, 0 rows affected (0.00 SEC) -- Client B will block because it is serializable until the lock times outmysql> update account set balance = balance - 100 where id = 1;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
Copy the code

Clearance lock

Gap lock, lock is to lock the gap between the values. The default isolation level of MySQL is REPEATABLE-READ. Gap locking can solve illusionary problems in some cases.

Suppose the data account has the following data:

mysql> select * from account;+----+----------+---------+ | id | name | balance | +----+----------+---------+ | 1 | zhangsan | 70 | | 2 | lisi | 200 |  | 3 | wangwu | 300 | | 4 | zhaoliu | 400 | | 6 | wangmazi | 600 | | 8 | zhaoer | 800 | +----+----------+---------+ 6 Rows in set (0.00 SEC)Copy the code

So the id of the gap is 4,6, and 6,7, and 8, plus infinity

Session_1 update account set name = ‘huayi’ where ID >=4 and ID < 6; . There is no way for any session to insert or modify any data in the gap between all rows locked in this range, i.e. ids [4, 6] cannot be modified, note that the last 6 is also included.

Gap locks occur at repeatable read isolation levels.

If the condition is WHERE ID < 25 then the table cannot insert 100, meaning that the gap between range hits cannot be inserted.

Next-ley Locks

Next-ley Locks are combinations of row gap Locks. As in the example above, the entire interval [4, 6] can be called a keylock.

Unindexed rows are upgraded to table locks

Locks are primarily placed on indexes and may become table locks if non-index fields are updated

Session1 run: Update account set balance = 800 where name = ‘zhangsan’

Session2 operations on any row in this table will be blocked

InnoDB rows are locked for indexes, not records. And the changed index must not fail, otherwise all row locks will be upgraded to table locks

conclusion

InnoDB storage engine implements row-level locking. Although the performance cost of InnoDB storage engine may be higher than table level locking, the overall concurrency capability is much better than MyISAM table level locking. When system concurrency is high, InnoDB’s overall performance has a significant advantage over MyISAM.

However, InnoDB’s row-level locking also has a vulnerable side. When used incorrectly, InnoDB’s overall performance can not only be as good as MyISAM’s, but even worse

Row lock analysis

Analyze row lock contention on the system by examining the InnoDB_row_lock state variable.

show status like 'innodb_row_lock%';

+-------------------------------+--------+
| Variable_name                 | Value  |
+-------------------------------+--------+
| Innodb_row_lock_current_waits | 0      |
| Innodb_row_lock_time          | 101256 |
| Innodb_row_lock_time_avg      | 50628  |
| Innodb_row_lock_time_max      | 51129  |
| Innodb_row_lock_waits         | 2      |
+-------------------------------+--------+
5 rows in set (0.00 sec)
Copy the code

A description of the quality of each state

Innodb_row_lock_time_avg Average wait time

Innodb_row_lock_waits Total waits

Innodb_row_lock_time Total wait duration

It should be noted that when the waiting times are very high and the waiting time is not small, we need to analyze why the system has so many waits, and then make optimization plans according to the analysis results.

Query INFORMRAMATION_SCHEMA database locks

-- View transaction
select * from information_schema.innodb_trx;

- check the lock
select * from information_schema.innodb_locks;

-- Check the lock wait
select * from information_schema.innodb_lock_waits;

Trx_mysql_thread_id can be viewed from the innodb_trx table
kill trx_mysql_thread_id
Copy the code

A deadlock

set tx_isolation = 'REPEATABLE-READ'; -- session1 run: select * from account where id = 1 for update; -- session2 run select * from account where id = 2 for update; -- session1 run select * from account where id = 2 for update; -- session2 run select * from account where id = 1 for update; Show engine innodb status\G;Copy the code

Most cases like mysql automatically detecting deadlocks and rolling back the transaction that caused the deadlock, but there are cases where mysql cannot automatically detect deadlocks

The optimization of the lock

  • As much as possible, make all data retrievals through indexes to avoid non-indexed row locks upgrading to table locks
  • Reasonable design epitome, reduce the scope of lock
  • Minimize the range of index conditions to avoid gap locking
  • The transaction size is well controlled, the time and length of locking resources are reduced, and the SQL related to transaction locking is put in the last execution of the transaction as far as possible
  • Reduce transaction isolation as much as possible

The resources

  • High performance MySQL version 3
  • MySQL transaction isolation level
  • MySQL official documentation
  • And other cases come from the Internet