preface


Recently, I encountered a problem related to mysql isolation level. Later, I decided to write an article to record mysql related knowledge. In the future, I do not have to search hard for information.

The storage engine


Storage engine is the biggest characteristic of MySQL apart from other database management system, we know that relational database data, there is a list of tables can be understand the form of rows and columns, similar to the form of Excel spreadsheet, each table is a data table is in the data is stored at the same time, organizing the data storage structure, The organization structure of these data is determined by the storage engine, that is, the function of the storage engine is to specify the storage structure of the data storage.

Advantages of storage engines

Within the same database, we can specify a storage engine for any table. For example, an application might use mainly InnoDB tables, with a CSV table for exporting data to spreadsheets and several MEMORY tables for temporary workspaces. This would require understanding the advantages and disadvantages of each storage engine, as shown in the following table, Mysql’s most popular storage engine is InnoDB, after all, it is the only one that supports transactions.

Characteristics of the MyISAM Memory InnoDB Archive NDB
B-tree indexes support support support Does not support Does not support
The transaction security Does not support Does not support support Does not support Does not support
Backup/point-in-time recovery support support support support support
Cluster Database Support Does not support Does not support Does not support Does not support support
Clustered index Does not support Does not support support Does not support Does not support
Compressed data support Does not support support support Does not support
Data cache Does not support Does not support support Does not support support
Encrypt the data support support support support support
Foreign key support Does not support Does not support support Does not support support
Full-text search index support Does not support support Does not support Does not support
Geospatial data type support support Does not support support support support
Geospatial index support support Does not support support Does not support Does not support
The hash index Does not support support Does not support Does not support support
The index buffer support Does not support support Does not support support
Locking granularity table table line line line
MVCC(Multi-version Concurrency Control) Does not support Does not support support Does not support Does not support
Replication support support support support support support
Storage limit 256TB memory 64TB unlimited 384EB
T tree index Does not support Does not support Does not support Does not support support
Transaction number Does not support Does not support support Does not support support
Update statistics for the data dictionary support support support support support

Storage engines supported by MySQL 8.0

InnoDB

The default storage engine in MySQL 8.0. InnoDB is a transaction-safe (ACID-compatible) storage engine for MySQL, with commit, rollback and crash recovery capabilities to protect user data. InnoDB row-level locking (no need to upgrade to coarse-grained locking) and Consistent, oracle-style unlocked reads improve multi-user concurrency and performance. InnoDB stores user data in clustered indexes to reduce I/O for common queries based on primary keys. To maintain data integrity, InnoDB also supports FOREIGN KEY reference integrity constraints.

MyISAM

These tables take up very little space. Table-level locking limits performance in read/write workloads, so it is commonly used in read-only or read-only workloads in Web and data warehouse configurations.

Memory

Store all data in RAM for quick access in environments where non-critical data needs to be found quickly. The engine was formerly known as the HEAP engine. Its use cases are shrinking; InnoDB makes use of its buffer pool storage area, which provides a common and persistent way to keep most or all data in memory, and NDBCLUSTER provides fast key lookup for large distributed datasets.

CSV

The tables are actually text files with comma-separated values. CSV tables allow you to import or dump data in CSV format to exchange data with scripts and applications that read and write in the same format. Because CSV tables are not indexed, InnoDB typically keeps data in tables during normal operation and uses CSV tables only during import or export phases.

Archive

These compact, unindexed tables are designed to store and retrieve large volumes of little-referenced historical, archived, or security audit information.

Blackhole

The Blackhole storage engine accepts but does not store data, similar to the Unix/dev/null device. A query always returns an empty set. These tables can be used to send DML statements to the slave’s replication configuration, but the master does not keep its own copy of the data.

NDB (also called NDBCLUSTER)

This clustered database engine is particularly suited for applications that require the highest possible uptime and availability.

Merge

Enables a MySQL DBA or developer to logically group a series of identical MyISAM tables and reference them as an object. Suitable for VLDB environments, such as data warehouses.

Federated

Provides the ability to link a single MySQL server to create a logical database from many physical servers. Ideal for distributed or data mart environments.

Example

This engine, as an example in the MySQL source code, shows how to start writing a new storage engine. It is primarily of interest to developers. A storage engine is a “stub” that does nothing. You can use this engine to create tables, but you cannot store data in or retrieve data from tables.

Storage engine setup and conversion

Create a table to specify the storage engine

When creating a new TABLE, you can specify the storage ENGINE to be used by adding TABLE options in the CREATE TABLE statement.

CREATE TABLE `user` (
  `id` bigint(20) unsigned zerofill NOT NULL AUTO_INCREMENT COMMENT 'on the ID'.`user_name` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'name'.`into_user` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL.`into_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time'.`upd_user` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL.`upd_time` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update Time'.`status` int(2) DEFAULT '1' COMMENT 'Soft Delete Status: Default: 1 (-1: deleted 1: normal)'.`version` int(2) DEFAULT '1' COMMENT 'Version number'.`remark` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_user_name` (`user_name`))ENGINE=MEMORY AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC COMMENT='User table';
Copy the code

Example Modify the default storage engine

When the ENGINE option is omitted, the default storage ENGINE is used. InnoDB is the default storage engine in MySQL 8.0. Default_storage_engine can be set to the default storage engine for the current session:

SET default_storage_engine = NDBCLUSTER;
Copy the code

Conversion between storage engines

To convert a TABLE from one storage engine to another, use ALTER TABLE to indicate the statement of the new engine (user is the TABLE name) :

ALTER TABLE `user` ENGINE = InnoDB;
Copy the code

Innodb locking mechanism

Locking is a key feature that differentiates database systems from file systems. The locking mechanism is used to manage concurrent access to shared resources. Different databases and different search engines may have different locking mechanisms. The lock of MyISAM engine is a table lock design. There is no problem with concurrent reading, but there may be some problems with concurrent writing. InnoDB storage engine locks are implemented very similarly to Oracle databases, providing consistent non-locked reads and row-level locking support. Row-level locking has no overhead associated with concurrency and consistency. InnoDB supports multiple granularity locking, allowing both row and table locking.

Row locks

InnoDB implements standard row-level locking, with two types of locks: shared (S) and exclusive (X).

A shared Lock (S Lock) allows a transaction to read a row of data. An exclusive Lock (X Lock) allows a transaction to delete or update a row of data

If one transaction T1 has acquired a shared lock on record R, another transaction T2 can also acquire a shared lock on record R, which is called lock compatibility. However, if T1 acquires an exclusive lock on record R, T2 cannot acquire a shared lock on record R, which is called lock incompatibility.

Note: Shared and exclusive locks are row locks, and compatibility refers to the compatibility of locks on the same record.

Table locks

InnoDB supports an additional table-level Lock type, Intention Locks. Intention Locks can be divided into Intention Shared Locks (IS) and Intention eXclusive Locks (IX). If a transaction T1 wants to modify row R1 in the user table, it will lock the object twice. If the transaction T1 wants to modify row R1 in user, it will lock the object twice:

Select * from user; select * from r1; select * from user

In the meantime, other transactions will still have access to the User table, and the lock on the user table is intended to add an exclusive lock to only a few rows in the User table.

Intent locking agreements are as follows:

1. Before a transaction can acquire a shared lock for a row in a table, it must first acquire an intended shared lock or a higher lock in the table. 2. Before a transaction can acquire an exclusive lock for a row in a table, it must first acquire a lock in the table that is intended to be exclusive.

If a lock is compatible with an existing lock, it is granted to the requested transaction, but if it conflicts with an existing lock, it is not granted. Transactions wait until conflicting existing locks are released. An error occurs if the lock request conflicts with an existing lock and cannot be granted permission because it could result in a deadlock.

Other lock

InnoDB also has a locking mechanism

1. Record lock: Record lock is a lock on index records. FOR example, SELECT c1 FROM t WHERE c1 = 10 FOR UPDATE; Can prevent any other transaction from inserting, updating or deleting rows where the value of T.c. 1 is 10.

2. Gap lock: A gap lock is a lock on the gap between index records, or the gap before the first or last index record. FOR example, SELECT c1 FROM t WHERE c1 BETWEEN 10 and 20 FOR UPDATE; Prevents other transactions from inserting value 15 into t.c1 in a column, whether or not such a value is already in the column, because the gap between all existing values in the range is locked.

3. Next key lock: Next key lock is a combination of record lock on index record and gap lock on gap before index record. InnoDB performs row-level locking to enable it to set shared or exclusive locks on index records encountered when searching or scanning table indexes. Therefore, row-level locks are actually indexed record locks. A next key lock on an index record also affects the “gap” that precedes that index record. That is, the next key lock is the index record lock plus the gap lock on the gap before the index record. If one session R has a shared or exclusive lock on a record in the index, the other session cannot insert a new index record in the gap before the index order.

4. INSERT intent lock: INSERT intent lock is a gap lock set by an INSERT row operation prior to insertion. This lock signals insert intent that there is no need to wait for multiple transactions to be inserted into the same index gap if multiple transactions are not inserted at the same place in the gap. Suppose there is an index record with values of 4 and 7. Individual transactions attempt to insert values 5 and 6, respectively, and each uses the insert intent lock to lock the gap between 4 and 7 until an exclusive lock is obtained for the inserted row, but do not block each other because the rows are conflict-free.

5, automatic lock: an auto-inc lock is a special table level lock that will be acquired with the table AUTO_INCREMENT column. In the simplest case, if one transaction is inserting a value into the table, any other transaction must wait for its own insertion into the table so that the row inserted by the first transaction receives consecutive primary key values. The Innodb_autoINC_lock_mode configuration option controls the algorithm used to automatically increase locking. It allows you to choose how to trade off a predictable sequence of automatic incremental values against the maximum concurrency of an insert operation.

Details are available on the mysql website, which can be viewed here.

Transaction-related issues


What is a transaction

Transactions, consisting of a limited sequence of database operations that are either all or none executed, are an indivisible unit of work.

Such as: Zhang SAN want to transfer 100 dollars to the bill, you will need to deduct 100 from zhang SAN's wallet, then add 100 to li si's wallet, in the process, if there is no abnormal zhang SAN buckles money, abnormal when Li Sijia money, then a rollback operation, return the buckle the money to zhang, if zhang SAN Li Sijia money and money were all without exception, so executed successfully, And the one responsible for monitoring these behaviors is the transaction!Copy the code

Characteristics of transactions

The properties of transactions are ACID

Atomicity: The transaction is executed as a whole, and all or none of the operations on the database contained within it are executed.

Consistency: data will not be destroyed before and after A transaction. If account A transfers 10 YUAN to account B, the total amount of A and B will remain the same regardless of success.

Isolation: When multiple transactions are accessed concurrently, transactions are isolated from each other. A transaction should not be disturbed by other transactions, and multiple concurrent transactions should be isolated from each other.

Durability: Specifies that changes made to the database by a transaction persist in the database after the transaction completes committing.

The impact of transaction concurrency

Let’s create a table in the database, insert a record, and then test it to see what impact it might have.

Create a user table
CREATE TABLE `user_info` (
  `id` bigint(20) unsigned zerofill NOT NULL AUTO_INCREMENT COMMENT 'on the ID'.`user_name` varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT 'name'.`user_money_amount` decimal(15.0) DEFAULT '0' COMMENT 'User Sum Summary (Cent)',
  PRIMARY KEY (`id`),
  KEY `idx_user_name` (`user_name`))ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC COMMENT='User Information Table';

-- Insert data
INSERT INTO user_info ( user_name, user_money_amount )
VALUES
	( 'wx'.2000 );
Copy the code

Dirty read

Transaction A reads something that has not yet been committed by transaction B, and generates dirty read data. Now, the right-hand side has changed but has not committed the transaction, and the left-hand side has already read the data. If the right-hand side has A subsequent exception, the change will be rolled back, and the left-hand side will read the wrong data.

Unrepeatable read

Transaction B interferes with transaction A. Although transaction A cannot read the content that has not been committed by transaction B this time, transaction A reads it three times in the same transaction, and the results of the three times are not consistent, which is called unrepeatable read.

Phantom read

Transaction A queries the result set of A range, another concurrent transaction B inserts/deletes data into the range and commits it silently, then transaction A queries the same range again and gets A different result set from the two reads. This is A phantom read.

Isolation level

What is the isolation level

Transactions specify an isolation level that defines the degree to which a transaction must be isolated from resource or data changes made by other transactions. Isolation levels are described in terms of allowed concurrent side effects (for example, dirty reads or phantom reads).

View the system-level and session-level isolation levels

To view the local system and session isolation level, you can use the following SQL query:

select @@global.transaction_isolation,@@transaction_isolation;
Copy the code

Modify SQL as follows:

-- Change the system level
set global transaction_isolation='read-uncommitted';  
-- Changes the session level
set transaction_isolation='read-uncommitted';  
Copy the code

The type of isolation level

Read Uncommitted

Alter mysql session-level isolation to read uncommitted:

What are the problems with testing read uncommitted isolation levels:

Dirty read:

Unrepeatable read:

Phantom reads:

As can be seen from the above figure, the read uncommitted isolation level may cause dirty reads, unrepeatable reads, and phantom reads. The read uncommitted isolation level is rarely used in practical applications.

Read Committed

Alter mysql session-level isolation to read uncommitted:

What are the problems with testing read committed isolation levels:

Dirty read:

Unrepeatable read:

Phantom reads:

As can be seen from the above figure, read committed isolation level will not have dirty read problems, but will have unrepeatable read, phantom read problems.

Repeatable Read

Alter mysql session-level isolation to repeatable reads:

What are the issues with testing repeatable read isolation levels:

Dirty read:

Unrepeatable read:

Phantom reads:

As you can see from the above figure, read committed isolation levels do not have dirty and unrepeatable read problems, but phantom read problems do occur. Read committed isolation level is the default isolation level of mysql. Although there may be illusory reads, repeatable reads are the most commonly used isolation level in development, as you can see below.

Serializable

Alter mysql session-level isolation level to serialization:

As can be seen from the above figure, when the isolation level is set to Serializable, there are no dirty reads, unrepeatable reads, and phantom reads, and the write to the table from transaction B is waiting for the read from transaction A. In fact, this is the strictest isolation level, and concurrent reads and writes are not allowed. It provides the best security, but not the best performance, which is why read Committed is the default isolation level and is actually the most used.

How isolation levels are implemented


Read Uncommitted

The read uncommitted isolation level is the least commonly used isolation level. At the read uncommitted isolation level, SELECT statements are executed unlocked, but earlier versions of rows may be used. Therefore, such reads are inconsistent using this isolation level. This is also called dirty reading. Otherwise, this isolation level works like Read Committed

Read Committed

With the READ COMMITTED isolation level, each consistent READ in a transaction is set and its own new snapshot is READ. When FOR SHARE is used, a lock read occurs, and SELECT blocks until the transaction containing the latest row ends.

Snapshot read: select select * from table.... ; Current read: Special read operations, such as insert/update/delete operations, are current read operations. They process current data and need to be locked. select * from tablewhere ? lock in share mode;

select * from table where ? for update;

insert;

update ;

delete;
Copy the code

Therefore, a read at the committed level takes the latest snapshot of the current data through the MVCC, without any locks, and without any locks (because historical data is constructed, there can be no locks on it).

However, unrepeatable reads and phantom reads remain at this level: MVCC version generation timing: every SELECT. This means that if we perform multiple select operations in transaction A, and other transactions update and commit the data we read in between each select, we will have unrepeatable reads, i.e., data inconsistencies on repeated reads.

Concurrency Control (Multi-version Concurrency Control) Multiversion concurrency control refers to the process of accessing the version chain of records while performing normal SEELCT operations using transactions with READ COMMITTD and REPEATABLE READ isolation levels. The read-write and read-read operations of different transactions can be executed concurrently, thus improving system performance. READ COMMITTD, REPEATABLE READ One big difference between the two isolation levels is: REPEATABLE READ generates a ReadView only before the first normal SELECT operation. Repeat the ReadView for all subsequent queries.Copy the code

Read the View:

1. Read View is the Read View generated when a transaction performs a snapshot Read. 2. When a transaction executes a snapshot read, a snapshot of the current database system is generated to record the active read and write transactions in the current system and put them into a list. 3. The Read View is used to determine which version of the data is visible to the current transaction

To facilitate discussion of the Read View visibility rule, let’s define a few variables:

  1. M_ids: represents a list of transaction ids of the read and write transactions currently active in the system at the time the ReadView was generated.
  2. Min_trx_id: Indicates the smallest transaction ID of the active read/write transaction in the system at the time the ReadView was generated, that is, the minimum value in m_IDS.
  3. Max_trx_id: indicates the id value in the system that should be assigned to the next transaction when the ReadView is generated.
  4. Creator_trx_id: indicates the transaction ID of the transaction that generated this ReadView.
  5. Trx_id: Each time a change is made to a record, the corresponding transaction ID is assigned to the trx_ID hidden column.

1) If the value of the accessed version’s trx_id attribute is the same as the creator_trx_id value in ReadView, it means that the current transaction is accessing its own modified record, so the version can be accessed by the current transaction. 2) If the value of the accessed version’s trx_id attribute is less than the value of min_trx_id in ReadView, the transaction that generated the version was committed before the current transaction generated the ReadView, so the version can be accessed by the current transaction. 3) If the value of the accessed version’s trx_id attribute is greater than the value of max_trx_id in ReadView, it indicates that the transaction that generated the version was started after the current transaction generated the ReadView, so the version cannot be accessed by the current transaction. If the value of the trx_id attribute is between the min_trx_id and max_trx_id of the ReadView, then check whether the trx_id attribute is in the m_IDS list. If yes, the transaction that generated the ReadView was active when the ReadView was created. The version is not accessible; If not, the transaction that generated the version of the ReadView when it was created has been committed and the version can be accessed

Note that repeatable reads generate a ReadView for each read committed, while repeatable reads generate a ReadView for the first read.

As shown in the figure above, in operation 1, we set the transaction ID on the left as 1, and the transaction ID on the right as 2, and then combined with the graph I drew below and the words above, you have a look, if you do not understand, read several times, you will know why the problem of unrepeatable read has been read. If you think there is a problem, please leave a message, I have a look.

Many people tend to confuse unrepeatable reading with phantom reading, and there are similarities. But non-repeatable reads focus on UPDATE and DELETE, while phantom reads focus on INSERT. If these two isolation levels are implemented using a locking mechanism, in repeatable reads, the SQL locks the data after the first read, and the data cannot be modified by other transactions to achieve repeatable reads. However, this method does not lock the insert data, so if transaction A has previously read the data, or modified the whole data, transaction B can still insert the data commit, then transaction A will find that there is no reason for the extra data, this is A phantom read, cannot be avoided by row locking. Serializable isolation level is required, read with read lock, write with write lock, read lock and write lock mutually exclusive, this can effectively avoid magic read, unrepeatable read, dirty read and other problems, but will greatly reduce the concurrency of the database. So the biggest difference between unrepeatable reads and phantom reads is how to solve the problem they create through the locking mechanism. Mature databases such as MySQL, ORACLE, and PostgreSQL, for performance reasons, use MVCC (Multi-version Concurrency Control) based on optimistic locking to avoid both of these problems.Copy the code

Repeatable Read

The biggest difference between a repeatable read and a read submitted is that the policies for generating readViews are different. A repeatable read generates a ReadView only when it reads data for the first time, while a read submitted generates a ReadView every time it reads data.

The ReadView is created for the first time and does not change after that, so the original data is read, which avoids the problem of unrepeatable reads.

Serializable

Official:

InnoDB implicitly converts all plain SELECT statements to SELECT ... FOR SHARE if autocommit is disabled. If autocommit is enabled, the SELECT is its own transaction. It therefore is known to be read only and can be serialized if performed as a consistent (nonlocking) read and need not block for other transactions. (To force a plain SELECT to block if other transactions have modified the selected rows, disable autocommit.)
Copy the code

All SELECT statements are implicitly converted to SELECT… FOR SHARE, that is, add the SHARE lock. Read and shared lock, write and exclusive lock, read and write mutually exclusive lock. If rows are being modified by uncommitted transactions, all statements that select those rows will block.

Simply put, at this isolation level, all normal SELECT is automatically converted to SELECT… Lock in share mode execution, that is, all reads and writes to the same data become mutually exclusive, greatly improved reliability, greatly reduced concurrency.

The Serializable isolation level will automatically read snapshots (select * from table….). Select * from table where? Lock in share mode), that is, automatic locking.

Reference and thanks

This article thoroughly reads the four isolation levels of MySQL transactions

MySQL 8.0 Reference Manual

Mysql version chain and readView principle

At the end


I learned to draw this picture by reading the above articles, and I’m not sure if there is any problem. If you think there is any problem, please bring it up and discuss it, and THEN I will correct it. Thank you! (Now that you’ve seen it, give it a thumbs up!)

If there is a need you can pay attention to my public number, will immediately update the Java related technical articles, public number there are some practical information, such as Java second kill system video tutorial, dark horse 2019 teaching materials (IDEA version), BAT interview summary (classification complete), MAC commonly used installation package (some are taobao to buy, Has PJ’s).

See you in the next article!