
This article assumes that the storage engine is InnoDB.

MySQL table operations fall into four kinds, commonly known as CRUD: the DML statements (insert, delete, update) and the DQL statement (query). This article looks at how query and update statements are executed in MySQL.

Both DML and DQL statements reach the storage engine's API by passing through the connector, query cache, analyzer, optimizer, and executor. The first four stages are the same for both. For details, refer to the article "Learn about MySQL infrastructure" in this series, which introduces the function of each module; this article does not expand on them.

Background knowledge:

Sector: the smallest unit of disk storage. The sector size is 512 bytes.

Disk block: the smallest unit of interaction between a file system and a disk (the smallest unit for a computer system to read and write disks). A disk block consists of several consecutive (2^n) sectors. The block size is generally 4KB.

MySQL page: the smallest unit of interaction between MySQL (InnoDB) and the disk is called a page. The default page size is 16 KB, i.e. 4 disk blocks, and it can be changed. This means that even if a query needs only 1 KB of data, MySQL still reads a full 16 KB page.
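The relationship between the three units is simple arithmetic. The sketch below uses the sizes quoted above (512-byte sectors, 4 KB blocks, 16 KB pages); the function names are made up for illustration, and it assumes the requested bytes start at a page boundary.

```python
# Sizes taken from the text above; real systems may be configured differently.
SECTOR_BYTES = 512        # smallest unit of disk storage
BLOCK_BYTES = 4 * 1024    # typical file system block
PAGE_BYTES = 16 * 1024    # InnoDB default page size

def pages_needed(requested_bytes: int) -> int:
    """Whole pages required to cover requested_bytes (ceiling division)."""
    return -(-requested_bytes // PAGE_BYTES)

def bytes_read(requested_bytes: int) -> int:
    """Bytes actually read from disk, since reads happen in whole pages."""
    return pages_needed(requested_bytes) * PAGE_BYTES

sectors_per_block = BLOCK_BYTES // SECTOR_BYTES
blocks_per_page = PAGE_BYTES // BLOCK_BYTES

# A 1 KB request still costs one full 16 KB page.
print(sectors_per_block, blocks_per_page, bytes_read(1024))  # 8 4 16384
```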

How is a query executed

A SELECT query is relatively simple: data is loaded from disk into memory one page (16 KB) at a time, and the rows are then matched against the WHERE condition to find the data that satisfies it.

  1. If the WHERE condition cannot use an index, a full table scan is performed: data pages are loaded into memory one after another and the rows are matched one by one.
  2. If the WHERE condition uses the primary key index, the data page of the primary key index is loaded into memory and the matching row is returned.
  3. If the WHERE condition uses an ordinary (secondary) index, the pages of that index are loaded into memory and the matching leaf nodes (B+ tree leaf nodes) are found. If a covering index can be used, the result is returned directly; otherwise a back-to-table lookup (a search of the primary key index) is needed. If covering indexes and back-to-table lookups are unclear, refer to the earlier article: You need to know the basics of indexing.
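The three access paths above can be illustrated with a toy model. This is only a sketch: plain dictionaries stand in for B+ trees, and the table, column names, and functions are hypothetical.

```python
# Primary key (clustered) index: id -> full row.
primary_index = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 30},
}
# Secondary index on `age`: leaf entries hold only the indexed value
# and the primary key of each matching row.
age_index = {25: [2], 30: [1, 3]}

def select_by_name_full_scan(name):
    """No index on `name`: every row has to be examined."""
    return [row for row in primary_index.values() if row["name"] == name]

def select_by_age(age, columns):
    """Use the secondary index; return (rows, back_to_table_lookups)."""
    covered = set(columns) <= {"id", "age"}   # can the index alone answer?
    rows, lookups = [], 0
    for pk in age_index.get(age, []):
        if covered:
            entry = {"id": pk, "age": age}    # covering index: no extra lookup
        else:
            entry = primary_index[pk]         # back-to-table lookup by primary key
            lookups += 1
        rows.append({c: entry[c] for c in columns})
    return rows, lookups

print(select_by_age(30, ["id", "age"]))  # answered from the index, 0 lookups
print(select_by_age(30, ["name"]))       # needs 2 primary key lookups
```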

How is an update statement executed

An update statement must first go through the query flow, because it has to locate the record before it can update it. Three logs are involved: the undo log (InnoDB's rollback log), the redo log (InnoDB's data persistence log), and the binlog (the MySQL Server's archive log). Let's take a look at how they cooperate to complete a data update.

  1. First, the executor asks the engine for the data that matches the conditions. If the data is already in memory, it is returned directly; if not, it is loaded from disk into memory and then returned.
  2. The old version of the record is first written to the undo log so that the change can be rolled back if the update fails. For details, refer to the earlier article on multi-version concurrency control (MVCC).
  3. The executor calls the storage engine's write interface, which modifies the record in the buffer pool.
  4. The change is recorded in the redo log, first in the redo log buffer.
  5. When the transaction is about to commit, the redo log buffer is flushed to disk. The redo log record is now in the prepare state (a record has two states, prepare and commit), and the executor is notified that this step is done.
  6. The executor writes the binlog to disk.
  7. The redo log record is changed from the prepare state to the commit state, and the update is complete.
  8. Background I/O threads later write the modified pages to disk page by page as random I/O, gradually flushing the updated data to disk.
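The eight steps can be traced with a small in-memory model. This is a sketch under heavy simplification, not InnoDB's actual implementation: the `ToyEngine` class and all of its fields are hypothetical, and Python lists and dicts stand in for log files and data pages.

```python
from dataclasses import dataclass, field

@dataclass
class ToyEngine:
    disk: dict = field(default_factory=dict)         # data pages on disk
    buffer_pool: dict = field(default_factory=dict)  # in-memory copies
    undo_log: list = field(default_factory=list)
    redo_buffer: list = field(default_factory=list)  # redo records not yet flushed
    redo_log: list = field(default_factory=list)     # redo records "on disk"
    binlog: list = field(default_factory=list)
    dirty: set = field(default_factory=set)

    def update(self, txid, key, new_value):
        # 1. Locate the record, loading it into memory if necessary.
        if key not in self.buffer_pool:
            self.buffer_pool[key] = self.disk[key]
        old_value = self.buffer_pool[key]
        # 2. Save the old version so the change can be rolled back.
        self.undo_log.append((txid, key, old_value))
        # 3. Modify the record in the buffer pool only.
        self.buffer_pool[key] = new_value
        self.dirty.add(key)
        # 4. Record the change in the redo log buffer.
        self.redo_buffer.append(
            {"txid": txid, "key": key, "value": new_value, "state": "prepare"})
        # 5. Flush the redo log buffer: the record is now in the prepare state.
        self.redo_log.extend(self.redo_buffer)
        self.redo_buffer.clear()
        # 6. Write the binlog.
        self.binlog.append({"txid": txid, "key": key, "value": new_value})
        # 7. Switch the redo record from prepare to commit.
        for record in self.redo_log:
            if record["txid"] == txid:
                record["state"] = "commit"

    def flush_dirty_pages(self):
        # 8. Later, background work writes the modified pages to disk.
        for key in self.dirty:
            self.disk[key] = self.buffer_pool[key]
        self.dirty.clear()

engine = ToyEngine(disk={"row1": "old"})
engine.update(txid=1, key="row1", new_value="new")
print(engine.disk["row1"], engine.buffer_pool["row1"])  # old new
engine.flush_dirty_pages()
print(engine.disk["row1"])                              # new
```

Note that after step 7 the transaction counts as committed even though the data page on disk still holds the old value; the logs are what make that safe.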

The redo log and the binlog are written separately, so it is hard to guarantee that their contents stay consistent. During recovery, MySQL therefore consults both logs, and a transaction is committed only if the two logs agree.

If the update fails before step 8 completes, there are two scenarios:

  • The SQL statement fails during execution. In this case, the historical data recorded in the undo log is used to roll the change back.
  • MySQL crashes or the process is restarted. On restart, the system checks the redo log. If the record is already in the commit state, the transaction is committed. If it is only in the prepare state, the system checks whether the corresponding binlog is complete (this is how two-phase commit ensures data consistency). If the binlog is complete, the data is valid and the transaction is committed; if the binlog is incomplete, or only one of the two logs exists, the transaction is discarded.
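The restart check in the second scenario boils down to a small decision function. The sketch below follows the rule as it is commonly described for InnoDB's two-phase commit; the function name and its string arguments are hypothetical.

```python
def recover(redo_state, binlog_complete):
    """Decide the fate of one transaction found during crash recovery.

    redo_state: "commit", "prepare", or None (no redo record reached disk).
    binlog_complete: True if the transaction's binlog entry is fully written.
    """
    if redo_state == "commit":
        return "commit"            # both phases finished before the crash
    if redo_state == "prepare" and binlog_complete:
        return "commit"            # both logs contain the change
    return "rollback"              # only one log, or neither, has the change

print(recover("prepare", True))    # commit
print(recover("prepare", False))   # rollback
```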

Why log instead of writing directly to the disk where the database resides

The data to be updated may be stored anywhere on the disk, so writing it in place is random I/O, which is very inefficient and would limit MySQL's concurrency. The redo log and the binlog, by contrast, are written by appending, which is sequential I/O and much more efficient. Because the redo log is written before the data pages are flushed, the data can be recovered from the log even if MySQL crashes before the pages reach the disk.
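The gap between random and sequential writes can be made concrete with a toy cost model that counts how far a disk head would travel. This is only an illustration: the page numbers are invented, and real disks and SSDs behave in more complicated ways.

```python
def seek_distance(offsets, start=0):
    """Total head movement when writing at the given offsets in order."""
    total, position = 0, start
    for offset in offsets:
        total += abs(offset - position)
        position = offset
    return total

# Hypothetical page numbers touched by four updates, scattered over the disk.
updated_pages = [900, 12, 530, 77]

in_place = seek_distance(updated_pages)                 # write each page where it lives
log_appends = seek_distance(range(len(updated_pages)))  # append four log records in a row

print(in_place, log_appends)  # 2759 3
```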

Why two-phase commit is designed:

Newly written data in memory is dirty, and the write is complete only once it has been flushed to disk.

Proof by contradiction:

If the redo log is written first and then the binlog: MySQL crashes after writing the redo log but before writing the binlog. After the restart, the data can be restored from the redo log, but the binlog does not contain it. A backup restored from the binlog, or a replica synchronized from it, will therefore be missing the data and be inconsistent with the original database.

If the binlog is written first and then the redo log: MySQL crashes after writing the binlog but before writing the redo log. After the restart, the redo log cannot restore the change, so the transaction is lost on this instance, yet the binlog already contains it. A backup restored from the binlog, or a replica synchronized from it, will therefore contain data that the original database does not have.
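Both orderings can be simulated in a few lines. This sketch assumes a single value that a transaction changes from 0 to 1, a primary that recovers from the redo log only, and a replica rebuilt from the binlog only; the function name is hypothetical.

```python
def crash_between_writes(first_log):
    """Write only `first_log` ("redo" or "binlog"), then crash.

    Returns (value on the recovered primary, value on a replica built from the binlog).
    """
    old, new = 0, 1
    redo_written = first_log == "redo"
    binlog_written = first_log == "binlog"
    primary = new if redo_written else old    # primary replays the redo log
    replica = new if binlog_written else old  # replica replays the binlog
    return primary, replica

for order in ("redo", "binlog"):
    primary, replica = crash_between_writes(order)
    verdict = "consistent" if primary == replica else "inconsistent"
    print(order, primary, replica, verdict)
# redo 1 0 inconsistent
# binlog 0 1 inconsistent
```

Whichever log is written first, a crash between the two writes leaves the primary and the replica disagreeing, which is the problem two-phase commit is designed to avoid.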

I came across a very good diagram of this process online; it is worth saving.

To summarize

  1. Both query statements and update statements pass through the connector, the query cache (removed in 8.0), the analyzer, the optimizer, and the executor.
  2. A query statement loads whole pages of data into memory according to the query conditions.
  3. An update statement builds on the query flow and uses the undo log, redo log, and binlog to complete the data update.
  4. The undo log is used to roll back data after an SQL execution failure, preserving transaction atomicity.
  5. The redo log is used for MySQL crash recovery and guarantees the durability of committed transactions (the D in ACID).
  6. The binlog is a log file used for data backup and primary/secondary synchronization.
  7. Updating records in place and flushing them to disk is random I/O and inefficient, so the sequentially written redo log is used to improve efficiency.
  8. Two-phase commit enables crash recovery and ensures data consistency.

Feel free to discuss in the comments section.

I am Mr. Ji. I use output to drive input and keep learning, and I keep sharing this series of technical articles along with good articles from around the web that are worth saving. Follow my official account and become a technical person who keeps growing.

MySQL Advanced series, previous articles (see also the other articles in the Nuggets column)

1. MySQL Advanced series: Learn about MySQL infrastructure;

2. MySQL Advanced series: Learn about MySQL storage engine;

3. MySQL Advanced series: What is the difference between MySQL MyISAM and InnoDB?

4. MySQL Advanced series: how to better select the data type in MySQL table design;

5. MySQL Advanced series: How to use paradigm in database design;

6. MySQL Advanced series: explain the meaning of each field;

7. MySQL Advanced series: why MySQL uses B+ as its index data structure;

8. MySQL Advanced series: You need to know the basics of indexing;

9. MySQL Advanced series: create index;

10. MySQL Advanced series: master/slave replication principle and configuration;

11. MySQL Advanced series: join principle -3 algorithms;

12. MySQL Advanced series: transaction and transaction isolation levels;

13. MySQL Advanced series: Implementation of multi-version concurrency control MVCC;