MVCC

The definition of MVCC

Multiversion concurrency Control (MVCC) : Concurrency control. When a database is concurrently accessed (read or write), the concurrency control is to manage multiple versions of the data that is being processed within a transaction. In order to avoid write operation clogging, thus causing concurrent problems of read operation.

MVCC logical flow

insert

MySQL adds hidden columns DB_TRX_ID and DB_ROLL_PT to each row by default.

The execution steps in the figure above are as follows:

  1. Start transactions manually and get a global transaction ID from InnoDB engine (1)

  2. Insert two rows into teacher table, set the version number of the row to the current transaction ID, delete the version number to NULL

Consider: if the transaction was committed automatically (SET AUTOCOMMIT = NO) and the transaction was not manually started, what would the data insert look like if the following two SQL statements were executed?

INSERT INTO teacher (NAME, age) VALUE ('seven', 18) ;

INSERT INTO teacher (NAME, age) VALUE ('qingshan', 19) ;
Copy the code

Since the transaction is committed automatically, the two insert statements get the transaction ID respectively, so the versions of the inserted rows are 1 and 2.

delete

The execution steps in the figure above are as follows:

  1. Start transactions manually and get a global transaction ID from InnoDB engine (22)

  2. InnoDB then executes a delete statement and finds the record and sets its delete version number to the current transaction ID

Modify the

The execution steps in the figure above are as follows:

  1. Start transactions manually and get a global transaction ID from InnoDB engine (33)

  2. InnoDB will find the record, copy a copy of the original data into the table, set the value of the version number of the new row to the current transaction ID, and set the value of the deleted version number of the original row to the current transaction ID

The query

The execution steps in the figure above are as follows:

  1. Start transactions manually and get a global transaction ID from InnoDB engine (44)

  2. According to the description of the data query rule

  3. Look for rows whose version is earlier than the current transaction version and find that all three rows in the table meet the condition

  4. Search for deleted records whose version number is either NULL or greater than the current transaction version number and find that only the last data meets the condition (1, Seven, 19)

Case analysis

Data preparation:

CREATE TABLE `teacher` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(32) NOT NULL,
  `age` int(11) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4;

INSERT  INTO teacher(id,NAME,age) VALUES (1,'seven',18);
INSERT  INTO teacher(id,NAME,age) VALUES (2,'qingshan',20);
Copy the code

Case a

-- transaction A executes BEGIN; -- 1 SELECT * FROM teacher; -- 2 COMMIT; -- transaction B executes BEGIN; -- 3 UPDATE teacher SET age =28 WHERE id=1; -- 4 COMMIT;Copy the code

The execution steps of case 1 are as follows: 1,2,3,4,2. The execution effect is shown in the figure below:

Although the data with ID =1 is updated when performing step 3 and step 4, the same data is obtained when performing step 2 again according to the MVCC query logic flow.

Case 2

-- transaction A executes BEGIN; -- 1 SELECT * FROM teacher; -- 2 COMMIT; -- transaction B executes BEGIN; -- 3 UPDATE teacher SET age =28 WHERE id=1; -- 4 COMMIT;Copy the code

The execution steps of case 2 are as follows: 3,4,1,2. The execution effect is shown in the figure below:

According to the MVCC query logic flow, execute 1,2, and get the uncommitted data from transaction B, which is problematic.

After analyzing case 1 and case 2, it is found that MVCC cannot solve the problem of case 2. InnoDB will use Undo log to solve the problem of case 2.

Undo Log

Definition of Undo Log

Undo: indicates to cancel. For the purpose of undoing the operation, the operation of a specified state is returned.

Undo Log: before a database transaction is committed, the mirror of the modified transaction data (the old version before the modification) is stored in the Undo Log. When the transaction is rolled back or the database crashes, the Undo Log, the old version data, can be used to cancel the impact of the uncommitted transaction on the database.

  • For insert, the undo log records the PK(ROW_ID) of the new data, which is deleted during rollback.

  • For delete/ UPDATE operations, undo logs old data row, which is directly restored during rollback.

  • They are stored in separate buffers.

Undo Log is a product of the atomicity of transactions.

Undo Log implements atomicity of transactions: If an error occurs during a transaction or the user performs a ROLLBACK statement, MySQL can use the backup in Undo Log to restore data to the state before the transaction began.

InnoDB found it possible to implement multi-version concurrency control based on Undo Log.

Undo Log is used in MySQL InnoDB storage engine for multi-version concurrency control.

**Undo Log implements multi-version concurrency control: ** Before a transaction is committed, Undo Log stores the data before the transaction is committed. The data in Undo Log can be used as snapshots of the old version of the data for snapshot reading by other concurrent transactions.

Analyze the SQL execution process in the following figure.

  1. Transaction A manually starts the transaction and performs the update operation. The data hit by the update is first copied to the Undo Buffer

  2. Transaction B manually starts the transaction and performs the query operation. The data returned in the Undo Log is read to perform quick illumination

Current read and snapshot read

Read the snapshot

The data read by SQL is the snapshot version, that is, the historical version. The common SELECT is the snapshot read.

InnoDB snapshot read, data read by cache (original data) + Undo Log (transaction modified data) two parts.

The current reading

The data read by SQL is the latest version, and the lock mechanism ensures that the data read cannot be modified by other transactions.

UPDATE, DELETE, INSERT, SELECT… LOCK IN SHARE MODE, SELECT… MySQL InnoDB locks FOR transaction A, which blocks transaction B’s SQL execution.

In the InnoDB engine, snapshot reads handle phantom reads through MVCC, and current reads handle phantom reads through next-key Locks.

Redo Log

Redo Log definition

Redo: As the name suggests, Redo. To reproduce the operation for the purpose of recovering the operation.

Redo Log: Any data performed during a transaction is backed up to a single location (Redo Log).

Redo Log persistence: The Redo Log is not written as a transaction is committed. It is written to the Redo Log during the transaction execution. The specific drop-log policy can be configured.

Redo logs are an artifact of transaction persistence.

Redo Log implements transaction persistence: It prevents the unlogged data of a transaction from being persisted by redoing the Redo Log upon restarting the MySQL service at the time of a failure.

Analyze the Redo Log execution process according to the following figure

InnoDB does not persist data from cache to hard disk every commit transaction, because every commit transaction persists data to hard disk, which is very inefficient, and every persistence requires IO operation.

After a transaction is committed, InnoDB writes new data to the Redo Log according to the Redo Log policy. InnoDB considers the transaction to have been committed successfully. Data does not necessarily immediately enter the IBD file of the table.

Redo logs are IO operations, as are IBD files that persist to a table.

Redo Log is a sequential I/O operation, and Redo Log is a random I/O operation. For example, we need to update data with id=1 and ID =8. You simply store the updated data sequence in the Redo Log; However, if the IBD file is a table, two discontinuous disk file addresses with ID =1 and ID =8 must be found before persistence, which affects the concurrent performance of database services.

Persistent configuration of Redo Log

Redo logs are stored in {datadir}/ib_logfile1 and ib_logfile2. Innodb_log_group_home_dir can be configured to store the Redo logs in the specified directory.

Once a transaction is committed and the data is persisted to the IBD file of the table, the Redo Log records are meaningless. The Redo Log writes are cyclic writes to the Log file, that is, overwrites.

  • Innodb_log_files_in_group Specifies the number of Redo Log file groups. The default value is 2
  • Log file innodb_log_file_size The default value is 48 MB
  • Innodb_log_buffer_size Specifies the buffer pool size of the Redo Log in cache/buffer. The default value is 16 MB

Redo Buffer persists to Redo Log by setting Innodb_flush_log_at_trx_commit:

  • Value: 0: Redo buffer -> Redo Log OS cache -> Flush cache to disk may lose transaction data within 1 second.
  • Value 1 (default) : Redo Buffer -> Redo Log OS cache -> Flush cache to disk for each transaction
  • Value 2: Redo Buffer -> Redo log OS cache and flush cache to disk every second

The value of 2 is recommended, because MySQL loses a maximum of one transaction commit data, and loses one second of transaction commit data during the entire service period.