
Preface

This article is part of my personal MySQL column, which is dedicated to summarizing details of MySQL, related case studies, and some of MySQL's underlying implementation. In the following chapters we will look at the internals of the storage engine most commonly used in MySQL, with the help of MySQL Tech Insider: InnoDB Storage Engine (2nd edition).

What is a database driver?

Every developer is familiar with the MySQL driver: in the usual case, we only need to add a MySQL dependency and write some JDBC template code to establish a MySQL connection.
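As a minimal sketch of that template code (assuming a local MySQL instance, the Connector/J driver on the classpath, and the users table used as an example later in this article; the URL and credentials are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials -- adjust host, database, user, and password.
        String url = "jdbc:mysql://localhost:3306/test";
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT id, name, age FROM users WHERE id = 1")) {
            while (rs.next()) {
                System.out.println(rs.getLong("id") + " " + rs.getString("name") + " " + rs.getInt("age"));
            }
        }
    }
}
```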

A network connection must be handled by a thread

There is no doubt that in both Tomcat and MySQL, a network connection is managed by a separate thread; of course, Tomcat's threads have nothing to do with MySQL's threads.

We usually develop directly against Tomcat's Servlet API, and each request to a servlet is handled by a separate thread, which means that each user's access is isolated from the others. However, if our database driver were single-threaded and could hold only one connection, then while one thread is processing data over that connection, every other thread would have to wait, or the connection would be broken mid-operation, which is clearly unacceptable. Discarding the connection after each thread finishes is no better: it wastes resources, and in some situations many idle connections pile up but cannot be reclaimed, eventually leaving threads stuck waiting for database connections. These problems are why databases have the concept of a connection pool.

Database connection pool

The connection pool maintains multiple database connections to the database. When a thread accesses the database, it is allocated a connection from the pool. Incidentally, the first important architectural concept in the MySQL architecture is the connection pool. It is worth noting that connection pooling is not one-sided: there is a connection pool both in the connecting party's database driver and inside the MySQL server.

There is a connection pool both inside Tomcat and inside the MySQL server. With a database connection pool, each user's database request, once processed, returns its connection to the pool. This not only improves connection reuse but also ensures that each user's database connection and operations stay independent.
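As a sketch of what client-side pooling looks like, here is HikariCP, one popular Java connection pool (the URL and credentials are placeholders; the pool size is an arbitrary example):

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import java.sql.Connection;

public class PoolDemo {
    public static void main(String[] args) throws Exception {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/test"); // placeholder URL
        config.setUsername("root");
        config.setPassword("password");
        config.setMaximumPoolSize(10); // at most 10 pooled connections on the client side

        try (HikariDataSource pool = new HikariDataSource(config)) {
            // Borrow a connection; close() returns it to the pool
            // instead of tearing down the physical connection.
            try (Connection conn = pool.getConnection()) {
                System.out.println("connection valid: " + conn.isValid(1));
            }
        }
    }
}
```

Closing a pooled connection hands it back for reuse, which is exactly the reuse described above.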

Question:
1. How many connections can be opened in the MySQL database connection pool?
2. How do you control the number of MySQL connections?

Answer:
1. Check the maximum number of connections with SHOW VARIABLES LIKE '%max_connections%';. There is also a parameter called max_user_connections. How does it relate to max_connections? max_connections specifies the maximum number of connections to the MySQL server as a whole, while max_user_connections specifies the maximum number of connections per MySQL user.
2. SET GLOBAL max_connections = 200;

Basic structure of MySQL

The MySQL server consists of the following components:

SQL parser: parses the SQL sent by users, analyzes its syntax, and disassembles the SQL statement.

Query optimizer: responsible for logically optimizing the parsed SQL to ensure that the SQL is executed along the shortest path.

Storage engine: the component that actually executes SQL. The storage engine updates data in memory and on disk according to the SQL statement, and exposes the interfaces through which these operations are performed.

Executor: the executor is the component responsible for actually calling the interfaces of the storage engine.

Here is how MySQL works internally, in the order of the components shown above:

SQL parser:

When we send a request through Tomcat, what does the database connection do with it? The database server itself does not know at first what the statement is for. The request is received through the SQL interface, which then hands it to the SQL parser to perform the actual parsing.

```sql
-- The statement sent by the client:
SELECT id, name, age FROM users WHERE id = 1;
-- From the parser's point of view, this decomposes into:
--   columns to return:  id, name, age
--   table to read:      users
--   filter condition:   id = 1
```

From the example above, we can see that SQL parsing is really the decomposition of SQL syntax: by analyzing the syntax, the server works out exactly what the statement asks it to do.

Query optimizer

Now that you know how parsing works, it's time to learn how queries are optimized. The query optimizer, as its name implies, optimizes the query and selects its execution path. For example, the SQL statement above gives two choices:

  • Look up the rows where id = ? first, then extract the corresponding fields

  • Fetch all the data for the three required fields first, then select the rows where id = ?

The query optimizer is used to optimize the query logic, choosing the shortest query path for the query.

Of course, this optimization is limited; more often, performance depends on whether the SQL statement's query logic is complex and whether the SQL itself is well written.
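To see which path the optimizer actually chose for a statement, you can prefix it with EXPLAIN. A minimal sketch over JDBC (placeholder URL and credentials, and the hypothetical users table from earlier):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class ExplainDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             Statement stmt = conn.createStatement();
             // EXPLAIN reports the access path the optimizer chose,
             // e.g. a primary-key lookup versus a full table scan.
             ResultSet rs = stmt.executeQuery("EXPLAIN SELECT id, name, age FROM users WHERE id = 1")) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                for (int i = 1; i <= meta.getColumnCount(); i++) {
                    System.out.print(meta.getColumnLabel(i) + "=" + rs.getString(i) + "  ");
                }
                System.out.println();
            }
        }
    }
}
```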

The storage engine

Since the query optimizer is not where the query gets executed, let's take a look at what happens once the optimizer is done.

As we all know, a database is ultimately just files organized by special data structures. It only provides specific interfaces, and data has to be created, read, updated, and deleted according to MySQL's syntax. Since operating on data can never escape the disk and memory, the storage engine is the component that deals with these data structures directly. The query produced by the query optimizer must therefore be handed over to the storage engine, and since executing the statement is the storage engine's job, the storage engine has the final authority to decide what to do with the SQL.

Executor

The storage engine can execute SQL statements, but who calls the storage engine's interfaces? In fact, it is the executor that calls the storage engine's interfaces, following a certain logic. So how does it call them? The executor produces an execution plan based on the current storage engine, then invokes the storage engine to complete the SQL statement's inserts, deletes, updates, and queries.
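As a purely conceptual sketch of this division of labor (this is NOT MySQL's real internal API; the interface, record, and method names are all invented for illustration):

```java
// Conceptual illustration only -- not MySQL's actual internals.
// The executor walks an execution plan and calls storage-engine interfaces;
// the engine is what actually touches memory and disk.
interface StorageEngine {
    Row readRowByPrimaryKey(String table, long id); // hypothetical handler-style call
}

record Row(long id, String name, int age) {}

class Executor {
    private final StorageEngine engine;

    Executor(StorageEngine engine) {
        this.engine = engine;
    }

    // "Plan" for SELECT id, name, age FROM users WHERE id = 1:
    // a primary-key lookup delegated to the engine.
    Row execute() {
        return engine.readRowByPrimaryKey("users", 1L);
    }
}
```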

From the introduction above, we can see that the real core of MySQL is the storage engine, the one component that does the actual work. So here is the structure of the InnoDB storage engine:

InnoDB storage engine and architecture

We'll skip InnoDB's introduction and go straight to its internal structure:

The buffer pool

InnoDB uses the buffer pool to relieve the pressure of disk operations: it tries to perform data operations in memory, and it synchronizes memory with disk by flushing dirty data back to disk. Like most caches, when a query hits the buffer pool, the cached data is used; otherwise the data is read from disk. To prevent other threads from dirtying the data being operated on in the buffer pool, the data is locked.
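As a sketch of how to observe the buffer pool from the outside (same placeholder connection as before; the variable and status names are real MySQL ones):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BufferPoolDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             Statement stmt = conn.createStatement()) {
            // Size of the buffer pool in bytes.
            try (ResultSet rs = stmt.executeQuery(
                    "SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")) {
                while (rs.next()) System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
            // read_requests = logical reads; reads = pages that had to come from disk.
            // A high requests-to-reads ratio means most queries hit the buffer pool.
            try (ResultSet rs = stmt.executeQuery(
                    "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")) {
                while (rs.next()) System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
        }
    }
}
```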

The undo log

The undo log works much like the familiar Ctrl+Z. Before updating data, InnoDB must put the value being changed into the undo log, and only then execute the subsequent operation. The undo log is also the component that implements transaction rollback: if the transaction fails, or you manually call rollback, MySQL rolls the data back according to the contents of the undo log.
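From the client's point of view, this is what a rollback looks like over JDBC (a minimal sketch, reusing the placeholder connection and the hypothetical users table):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RollbackDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "root", "password")) {
            conn.setAutoCommit(false); // open an explicit transaction
            try (Statement stmt = conn.createStatement()) {
                stmt.executeUpdate("UPDATE users SET name = 'temp' WHERE id = 1");
                // Rolling back makes InnoDB use the undo log
                // to restore the old value of the row.
                conn.rollback();
            }
        }
    }
}
```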

Redo log buffer and redo log file

Redo log translates as "redo log". Structurally it consists of a buffer in memory and files on the hard disk, so it is best understood as two parts: the redo log buffer and the redo log files. The redo log's whole job is to record which row of data was changed, and to flush that record from the cache to the log file when the transaction commits (specifically, via disk I/O).

Now that we know the basics of redo, let's consider what happens if the MySQL service loses its in-memory data while it is operating on data.

To solve this problem, InnoDB records every change in the redo log buffer while executing a transaction. The redo log buffer stores which row was changed, which field was changed, where the data is stored, and what the data was changed to.

The transaction is not committed: how is an outage handled?

Every SQL statement in the database executes as part of a transaction that must be committed. If the database crashes before the commit and the redo log buffer's contents are lost, how does MySQL handle this?

The answer is simple. If no transaction was committed, the data has not really changed at all: even though the data in the buffer pool has changed, the contents of the rows on disk have not, so losing the contents of the redo log buffer has no effect. After the shutdown and restart, the data is restored through the undo log.

Writing the redo log to disk when the transaction commits

After a transaction commits, redo uses a policy to flush the redo_log_buffer data to the disk file. This policy can be changed with the innodb_flush_log_at_trx_commit setting.

When this value is 0, committing a transaction does not flush the contents of the redo_log_buffer to disk, so if MySQL crashes, everything that was still only in memory is lost.

If you set this value to 1, the redo cache data is flushed to the log file as soon as the transaction commits successfully, and from then on the file holds a record saying that you changed such-and-such a field on row XXX.

Suppose the contents of the buffer pool have been updated and the redo log file records the change, but the data file on disk still holds the old rows. If MySQL crashes at this point, will the data in the disk file be inconsistent?

Of course not. If there is an outage, on restart MySQL uses the redo log to restore the committed changes to the buffer pool.

What happens if innodb_flush_log_at_trx_commit is set to 2?

When a transaction is committed with this setting, the redo log is written to the OS cache instead of straight to disk, and it may take up to a second before the OS cache data is actually written to disk. Note that the log content is sitting in the OS cache, which means that if the machine holding the OS cache goes down, the log content is lost as well.

Which of the three redo log flushing strategies should be selected?

Which flush strategy should we use for the redo log when committing transactions? What are the pros and cons of each strategy, and why?

Unless you have special performance requirements, 1 is generally the right choice. If you select 0, a MySQL crash loses the unflushed commits; if you select 2, then even though the data has been written to the system cache, it is still in memory, so a power outage will also lose it.
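As a concrete sketch, here is how you might inspect and change this parameter over JDBC (placeholder connection; setting a global variable requires a privileged account):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RedoFlushPolicyDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             Statement stmt = conn.createStatement()) {
            // 0: flush roughly once per second; fastest, can lose ~1s of commits on a crash
            // 1: flush and sync on every commit; safest (the usual default)
            // 2: write to the OS cache on commit; lost only if the whole machine goes down
            try (ResultSet rs = stmt.executeQuery(
                    "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'")) {
                while (rs.next()) System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
            // Takes effect for new transactions immediately.
            stmt.execute("SET GLOBAL innodb_flush_log_at_trx_commit = 1");
        }
    }
}
```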

The binlog

Next, let's talk about the binlog in MySQL.

What is a binlog?

The redo log introduced above is the core component that ensures transaction consistency and data correctness: changes are written to the redo log before a transaction is committed, in case MySQL goes down and data is lost.

But all of this happens inside the storage engine, InnoDB. In fact, the layer one step above the storage engine also has a log: the binlog.

Conclusion: the binlog is an important part of executing transactions in the MySQL server.

Binlog workflow

Once the redo log has been written in the prepare state and flushed to disk, the data is then written to the binlog file.

The executor is a very core component here: it cooperates with the storage engine to perform all of an SQL statement's operations on disk and in memory.

Disk flushing policy of binlog

Key parameter: sync_binlog controls the disk flushing policy. Its default value is 0 in older MySQL versions (newer versions default to 1), and with 0 the binlog is written only to the OS cache on commit, just as with innodb_flush_log_at_trx_commit = 2.

Steps 5 and 6 in the diagram referred to above are the point at which you commit the transaction. If an outage occurs at this point, the in-memory log content is lost, just as with the redo log before.

Setting the sync_binlog parameter to 1 forces the buffered data to be flushed to the binlog file on disk at transaction commit time.
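A short sketch of checking and changing this parameter (placeholder connection; again, a privileged account is needed for SET GLOBAL):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BinlogSyncDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder
        try (Connection conn = DriverManager.getConnection(url, "root", "password");
             Statement stmt = conn.createStatement()) {
            // sync_binlog = 0: let the OS decide when to flush the binlog to disk
            // sync_binlog = 1: sync the binlog on every transaction commit (safest)
            stmt.execute("SET GLOBAL sync_binlog = 1");
            try (ResultSet rs = stmt.executeQuery("SHOW VARIABLES LIKE 'sync_binlog'")) {
                while (rs.next()) System.out.println(rs.getString(1) + " = " + rs.getString(2));
            }
        }
    }
}
```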

Writing the commit flag

Why do you need a binlog when you have redo_log?

The entire transaction is complete only after the binlog file has been written, its position recorded, and the commit flag written into the redo log. With the binlog in place, the final consistency and integrity of the whole transaction can be guaranteed.

What's the point of writing a commit flag in the redo log?

It is essentially there to keep the redo log and the binlog consistent: a transaction counts as committed only once both logs hold its complete record.
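Here is a simplified sketch of the commit ordering this section describes (illustrative only, NOT MySQL source code; the recovery rule in the comments is what the commit flag enables):

```java
// Illustrative sketch of the commit ordering described above.
public class CommitSketch {
    public static void main(String[] args) {
        // Step 1: redo log written in "prepare" state,
        //         flushed per innodb_flush_log_at_trx_commit.
        System.out.println("1. redo log: prepare");
        // Step 2: binlog written and flushed per sync_binlog,
        //         with its position recorded.
        System.out.println("2. binlog: write + record position");
        // Step 3: commit flag written into the redo log;
        //         only now is the transaction complete.
        System.out.println("3. redo log: commit flag");
        // Crash-recovery rule: a prepared redo record with a complete
        // matching binlog entry is committed; without one, the change
        // is rolled back via the undo log.
    }
}
```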

Related interview questions

How many mysql connections can be established within 5s? : blog.51cto.com/u_15127515/…

Conclusion

We started from an introduction to the database driver, walked through the whole workflow of MySQL, then introduced the workflow of the InnoDB storage engine, and finally looked at the role of the binlog, a key log written with the executor's cooperation, and how it behaves in practice.

Final words

The next section will look at some production examples of how to optimize MySQL.