A few words up front

At the beginning, I planned to post one or two articles a week, but I haven't posted anything because I spent the first two weeks preparing for daily internship interviews. Now that I'm free, I'm going to resume and try to finish MySQL, the JVM, and RocketMQ by the end of the summer. For MySQL I plan to write four articles: the MySQL kernel architecture, MySQL transaction implementation, the MySQL index model, and MySQL tuning cases.

Take a macro look at the architectural model

I drew a picture of the overall process.

Summary:

At a macro level, we have a Tomcat web server on top that keeps sending SQL requests to our database server for execution. The database server sounds high-end, but it is really just a computer running MySQL. The Server layer inside it contains many components, and each statement flows through them in a fixed order: it passes through the parser, the optimizer, and the executor of the Server layer, and is then handed over to the corresponding storage engine for execution. The storage engine is pluggable, so we can specify which one to use; in this series we will dissect the InnoDB storage engine and learn how it supports CRUD.

MySQL connections

To access MySQL, we need a MySQL driver, such as mysql-connector-java, which is usually pulled in through Maven. Our Tomcat server receives many requests that involve database operations, and each request is typically handled by a thread working on one connection. To perform these tasks efficiently, we use the idea of pooling: we create connections in advance, keep them in a pool, and let each request's thread borrow one. The well-known Druid and C3P0 are implementations of this idea. On the MySQL side, a thread listens for incoming requests and forwards them to the SQL interface for processing.
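
The pooling idea can be sketched in a few lines. This is not how Druid or C3P0 are actually implemented (both are Java libraries with far more machinery: validation, eviction, statistics); the `Connection` class here is a hypothetical stand-in for a real driver connection, used only to show the borrow/return pattern.

```python
import queue

class Connection:
    """Hypothetical stand-in for a real driver connection."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

    def execute(self, sql):
        return f"conn#{self.conn_id} executed: {sql}"

class ConnectionPool:
    """Pre-creates connections so each request borrows one instead of
    paying the TCP handshake + authentication cost per query."""
    def __init__(self, size):
        self._pool = queue.Queue(maxsize=size)
        for i in range(size):
            self._pool.put(Connection(i))

    def acquire(self):
        return self._pool.get()   # blocks if the pool is exhausted

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=3)
conn = pool.acquire()
result = conn.execute("SELECT 1")
pool.release(conn)                # connection goes back, not closed
```

The key design point is that `release` returns the connection to the queue instead of closing it, which is exactly what makes pooling cheaper than connect-per-request.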

How does the Server layer process an SQL statement

  1. Parser: MySQL first parses the SQL statement. The parser performs lexical analysis to identify keywords, then uses the result for syntax analysis to check whether the statement satisfies MySQL's grammar.

  2. Optimizer: Before execution begins, the statement is processed by the optimizer. The optimizer decides which index to use when a table has multiple indexes, and the join order when a statement joins multiple tables.

  3. Executor: First, it checks whether you have permission to query the table. If you do, it opens the table and continues. Then, according to the table's engine definition, it calls the interface provided by that engine to carry out the generated execution plan.
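
The three steps above can be sketched as a toy pipeline. Everything here is a deliberately simplified assumption: real MySQL parsing, cost-based optimization, and the storage-engine handler API are vastly more involved, and the `innodb` lambda is just a hypothetical stand-in for the pluggable engine interface.

```python
def parse(sql):
    """Toy lexer + parser: split into tokens and verify the leading keyword."""
    tokens = sql.strip().rstrip(";").split()
    if tokens[0].upper() not in {"SELECT", "INSERT", "UPDATE", "DELETE"}:
        raise SyntaxError("unsupported statement: " + tokens[0])
    return tokens

def optimize(tokens, indexes):
    """Toy optimizer: choose an index if the WHERE column has one."""
    upper = [t.upper() for t in tokens]
    plan = {"tokens": tokens, "index": None}
    if "WHERE" in upper:
        col = tokens[upper.index("WHERE") + 1].split("=")[0]
        if col in indexes:
            plan["index"] = col
    return plan

def execute(plan, engine):
    """Executor: hand the finished plan to the storage engine's interface."""
    return engine(plan)

# Hypothetical stand-in for a pluggable storage engine.
innodb = lambda plan: "scan via " + (plan["index"] or "full table")

result = execute(optimize(parse("SELECT * FROM t WHERE id=1"), {"id"}), innodb)
# result == "scan via id"
```

The point of the sketch is the separation of stages: the executor never touches disk itself; it only drives the engine through a narrow interface, which is what makes engines swappable.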

What InnoDB does as a whole

First let's take a look at what InnoDB does. InnoDB is the executor of a concrete execution plan, so what does it need to do to successfully carry out a SQL command?

  1. For queries, we need to retrieve data from disk and return it to Tomcat. For writes, we need to turn insert, delete, and update operations into actual physical changes on disk. Both are InnoDB's job.

  2. InnoDB also supports transactions, which means every CRUD operation leaves a trace in the transaction logs (transactions rely not only on logging but also on locking and MVCC, which will be covered in detail in the second article). InnoDB transactions involve the undo log, the redo log, and the binlog, and we will talk about how these three logs work together.

Now that we know what InnoDB does, we also need to look at an important part of it: the buffer pool. Buffer pools appear in operating systems too; the idea is to exploit the principle of locality by keeping frequently used data in faster storage. We will study the buffer pool in more depth later in this article.

The logging operation involved in an UPDATE statement

Let’s take a closer look at the flow of an UPDATE statement and what logging operations are involved. There are three logging mechanisms we need to understand before going into detail.

  • Undo log: a logical log that records the logical information of SQL statements so changes can be rolled back; implemented by InnoDB itself.

  • Redo log: consists of two parts, the redo log buffer in memory and the redo log file on disk. The redo log is a physical log that records which modifications were made to which data pages. The innodb_flush_log_at_trx_commit parameter controls when the redo log buffer is flushed to the redo log file; there are three strategies, and the parameter is usually set to 1, meaning the redo log buffer is flushed to disk before a transaction commits. The redo log is also implemented by InnoDB.

  • Binlog: different from the previous two, the binlog is not implemented by InnoDB itself; it is the MySQL Server layer's own log. Unlike the redo log, the binlog is a logical log. It is written to the binlog file before the transaction commits.

Because both the binlog and the redo log are written around transaction commit, there is a consistency issue between them. To keep the two logs logically consistent, MySQL adopts a two-phase commit:

  1. redo log prepare
  2. binlog write
  3. redo log commit
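
The three steps can be made concrete with a small sketch. This is a toy model of the protocol's ordering only; the real implementation coordinates the InnoDB redo log and the Server-layer binlog through an internal XA mechanism, and the list-based "logs" here are assumptions for illustration.

```python
def two_phase_commit(redo_log, binlog, txn):
    """Sketch of the ordering in MySQL's internal two-phase commit.
    If a crash happens after step 2, recovery finds the binlog record
    and can commit the prepared redo entry, keeping the logs consistent;
    if it happens after step 1, both sides roll back."""
    redo_log.append(("prepare", txn))   # 1. redo log enters prepare state
    binlog.append(("commit", txn))      # 2. binlog is written and flushed
    redo_log.append(("commit", txn))    # 3. redo log is marked committed

redo, binlog = [], []
two_phase_commit(redo, binlog, txn="update t set a=1")
```

The ordering is the whole point: the binlog write sits between the two redo phases, so crash recovery can always tell whether a transaction should survive by checking whether its binlog record exists.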

With that in mind, we can walk through how an UPDATE executes; reading the steps below alongside the picture helps.

  1. To perform an update, we first need to load the relevant data page from the disk file into our buffer pool, and then operate on the cached page there.

  2. Before making any changes to the database, we first write the undo log to the undo log file, so that even if an intermediate step fails we can roll back.

  3. After writing undo log, we update the buffer pool.

  4. We write redo logs to the redo log buffer.

  5. When the transaction commits, the redo log and binlog are written using the two-phase commit described above.
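
The five steps above can be sketched end to end. The dict-and-list data structures are simplifying assumptions (one row per page, no page descriptors, no binlog); the goal is only to show the ordering: undo before modify, modify before redo, and disk untouched until flush.

```python
def execute_update(disk, buffer_pool, undo_log, redo_buffer, page_no, new_row):
    """Walks the five UPDATE steps (simplified: one row per page)."""
    # 1. Load the data page from disk into the buffer pool if absent.
    if page_no not in buffer_pool:
        buffer_pool[page_no] = disk[page_no]
    # 2. Record the old value in the undo log so the change can roll back.
    undo_log.append((page_no, buffer_pool[page_no]))
    # 3. Modify the cached page; the disk copy is untouched (page is dirty).
    buffer_pool[page_no] = new_row
    # 4. Record the physical change in the redo log buffer.
    redo_buffer.append((page_no, new_row))
    # 5. At commit, redo log and binlog are flushed via two-phase commit.

disk = {7: "old row"}
bp, undo, redo = {}, [], []
execute_update(disk, bp, undo, redo, page_no=7, new_row="new row")
```

Notice that after the call the buffer pool holds the new row while the disk still holds the old one; that gap is exactly what the flush list (discussed below) exists to track.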

Buffer Pool: a second look

When we talked about loading disk data into the buffer pool, did that step feel abstract? Next, let's look at how data is actually laid out on disk and in the buffer pool.

This is what the disk file looks like. We can see that a disk file is made up of data pages, each data page is 16 KB, and a data page holds rows of data.

This is what a buffer pool looks like: the buffer pool holds cache pages that are the same size as the data pages on disk, 16 KB. Another important element is the description information: each description corresponds to one cache page and contains metadata about it, such as the address of the cache page in the buffer pool, the tablespace the data page belongs to, the number of the data page, and so on.
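
The description information can be pictured as a small record next to each cache page. The field names below are assumptions chosen to mirror the prose, not InnoDB's real control-block layout (which is a C struct with many more fields).

```python
from dataclasses import dataclass

PAGE_SIZE = 16 * 1024   # InnoDB's default data-page size: 16 KB

@dataclass
class PageDescriptor:
    """Description info kept alongside each cache page (simplified)."""
    tablespace_id: int   # which tablespace the data page belongs to
    page_no: int         # number of the data page within the tablespace
    cache_addr: int      # address of the cached page inside the buffer pool

desc = PageDescriptor(tablespace_id=1, page_no=42, cache_addr=0)
```

It is these lightweight descriptors, not the 16 KB pages themselves, that get linked into the free, flush, and LRU lists described next.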

Once we know what the buffer pool and the disk file hold, we focus on how to load the disk’s data page into the cache page. First we ask ourselves a few questions:

  1. How do I know which cached pages are empty and can be replaced?

  2. Dirty pages: Pages that have changed in the cache but have not changed on disk are called dirty pages. How do we deal with dirty pages?

  3. When we run out of cached pages, we still need to load the disk data pages into the buffer. How do we do this?

With these three problems solved, it seems that we can implement the operation of loading data into memory, right?

Three linked lists in a Buffer Pool

In this section, we mainly answer the three questions raised above. Take a look at how the Innodb engine solves these three problems.

Free list: solving the empty-page problem

InnoDB uses a free list to find free cache pages when loading data into the buffer pool. The descriptions of all free pages are linked into the free list. When we need to load a page, we simply take one description off the free list, use its cache page, and then remove that node from the list.

Before doing this, we also need to check whether the data page is already cached; if it is, none of the above is necessary. InnoDB uses a K-V container (a hash table) for this: the key is the tablespace number plus the data page number, and the value is the address of the cache page in the buffer pool. So before touching the free list, we query this hash table to see whether the page is already cached.
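
The free list plus the page hash can be sketched together. Real InnoDB uses intrusive linked lists of control blocks and a partitioned hash protected by latches; the plain Python list and dict below are simplifying assumptions that preserve only the lookup-then-allocate logic.

```python
class BufferPool:
    """Free-list + page-hash sketch: descriptors of free cache pages sit
    on the free list; (tablespace, page_no) -> slot is the page hash."""
    def __init__(self, n_pages):
        self.free_list = list(range(n_pages))   # descriptors of empty pages
        self.page_hash = {}                     # (space_id, page_no) -> slot
        self.frames = [None] * n_pages          # the cache pages themselves

    def load_page(self, space_id, page_no, data):
        key = (space_id, page_no)
        if key in self.page_hash:               # already cached: done
            return self.page_hash[key]
        slot = self.free_list.pop()             # take one free descriptor
        self.frames[slot] = data
        self.page_hash[key] = slot
        return slot

bp = BufferPool(n_pages=4)
slot = bp.load_page(space_id=1, page_no=9, data="row data")
```

A second `load_page` for the same `(space_id, page_no)` hits the hash and returns the same slot without consuming another free-list node, which is exactly the check the paragraph describes.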

Flush list: resolving dirty pages

Dirty pages are essentially cache pages that are out of sync with their data pages on disk. InnoDB uses a flush list to collect dirty pages into a linked list, which records exactly which cache pages are dirty. A background task then periodically flushes these dirty pages to disk, resolving the inconsistency.

LRU list: solving the problem of running out of cache pages

When loading a page, the free list may have no nodes left, meaning there are no empty cache pages. We then use the LRU list to decide which cached page to evict. This is the same idea as page replacement algorithms in operating systems; InnoDB uses an improved LRU (other systems use alternatives such as LFU).

The idea of LRU (Least Recently Used) is to evict the pages that have not been used recently. A naive LRU, however, cannot handle the following two problems: read-ahead and full table scans.

  • Prefetch (read-ahead) mechanism: when we load a page into memory, its adjacent pages are loaded as well. A pre-loaded page may therefore sit at the head of the LRU list while being rarely, if ever, accessed.

  • Full table scan: a statement like select * from table1 causes a full table scan, which can flood the LRU list with pages that will never be accessed again.

To solve these two problems, InnoDB improves the LRU list with a hot/cold separation. LRU nodes are divided into a cold region and a hot region; the ratio is configurable, and by default the cold region takes 37%. When a data page is first loaded into the buffer pool, it is placed at the head of the cold region. If the node is accessed again more than 1 s after loading, it is promoted from the cold region to the head of the hot region. So whenever we need to evict a page, we simply evict the last node of the cold region.
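
The cold/hot split can be sketched as follows. This is a toy model of the policy only: real InnoDB uses a single linked list with a midpoint pointer and the innodb_old_blocks_pct / innodb_old_blocks_time parameters, whereas the two Python lists and explicit timestamps here are illustrative assumptions.

```python
class ColdHotLRU:
    """Sketch of InnoDB's split LRU: new pages enter the head of the cold
    sublist; a page re-accessed more than `promote_after` seconds after
    loading moves to the head of the hot sublist. Eviction always takes
    the tail of the cold sublist."""
    def __init__(self, promote_after=1.0):
        self.cold, self.hot = [], []   # index 0 is the head of each sublist
        self.load_time = {}
        self.promote_after = promote_after

    def load(self, page, now):
        self.cold.insert(0, page)      # new pages go to the cold head
        self.load_time[page] = now

    def access(self, page, now):
        if page in self.cold and now - self.load_time[page] > self.promote_after:
            self.cold.remove(page)
            self.hot.insert(0, page)   # promoted to the hot head

    def evict(self):
        return self.cold.pop()         # victim = tail of the cold sublist

lru = ColdHotLRU(promote_after=1.0)
lru.load("p1", now=0.0)
lru.load("p2", now=0.1)
lru.access("p1", now=2.0)   # accessed >1s after load: promoted to hot
victim = lru.evict()        # the cold tail is evicted
```

A read-ahead page or a full-table-scan page that is never touched again simply ages out of the cold tail without ever displacing anything in the hot region, which is the whole point of the design.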

Three linked lists work together

These lists cooperate in almost every operation. For example, loading a page removes a node from the free list and links it into the cold head of the LRU list; a later access may move its LRU node to the hot region; modifying a page adds it to the flush list and moves its LRU node. You can visualize these movements as animations in your mind.

Reference: Ru Ape technical column