How is an SQL query executed?

We often say that you should never get caught up in the details of a situation. You should have a bird’s eye view, which will help you understand the problem from a higher perspective. The same is true for MySQL learning. When we use a database, we usually see the whole thing. For example, if you have the simplest table with a single ID field, execute the following query:

Generally speaking, MySQL can be divided into Server layer and storage engine layer two parts.

The storage engine layer is responsible for data storage and extraction. Its architecture mode is plug-in, supporting InnoDB, MyISAM, Memory and other storage engines. The most commonly used storage engine is InnoDB, which has been the default storage engine since MySQL 5.5.5.

In other words, if you do not specify the engine type when executing create Table, InnoDB is used by default. However, you can also select another engine by specifying the type of storage engine, such as using engine=memory in the CREATE Table statement to specify the use of memory engines to create tables. Different storage engines store table data in different ways and support different functions, and we will discuss engine choices in a later article.

As you can see from the figure, the different storage engines share a Server layer, from the connector to the actuator. To give you an idea of the name of each component, I’m going to walk you through the execution process and see what each component does in turn.

The connector

As a first step, you will connect to the database, and your connector will receive you. Connectors establish connections to clients, obtain permissions, and maintain and manage connections. The join command is written like this:

mysql -h$ip -P$port -u$user -p

After typing the command, you need to enter the password in the interactive dialogue. Although the password can also be written directly after -p on the command line, this can cause your password to be compromised. If you are connected to a production server, you are strongly advised not to do this.

Mysql is the client tool used to establish a connection with the server. After the classic TCP handshake, the connector authenticates your identity using the username and password you enter.

1. If the user name or password is denied, you will receive an error saying “Access denied for user” and the client will stop executing.

2. If the user name and password are authenticated, the connector will check the permission table for you. After that, the permission determination logic in the connection will depend on the permissions read at that time.

This means that once a user successfully establishes a connection, even if you change the user’s permissions using the administrator account, the existing connection permissions will not be affected. After the modification is complete, only newly created connections will use the new permission Settings.

Once the connection is complete, if you have no subsequent action, the connection is idle, as you can see in the show ProcessList command. The graph in the text is the result of show ProcessList, where Co

The mmAND column is displayed as “Sleep”, indicating that there is now a free connection in the system.

If the client is inactive for too long, the connector automatically disconnects it. This time is controlled by the wait_timeout parameter, and the default is 8 hours.

In the database, a persistent connection means that the same connection is used all the time if the client continues to receive requests after a successful connection. A short connection is one that is disconnected after a few queries are executed and then re-established the next time.

However, with full use of long connections, you may find that sometimes MySQL memory usage increases very quickly, because the memory temporarily used by MySQL during execution is managed in connection objects. These resources are released when the connection is disconnected. If long connections are accumulated, the system may use too much memory and kill it forcibly (OOM). In this case, MySQL restarts abnormally.

How to solve this problem? There are two options you can consider.

1. Disconnect the long-term connection periodically. Use for a period of time, or in the program to determine the execution of a large memory – consuming query, disconnect, then query and reconnect.

2. If you are using MySQL 5.7 or later, you can re-initialize the connection resource by executing mysql_reset_connection after each large operation. This process does not require reconnection and re-authentication of permissions, but restores the connection to the state it was in when it was created.

The query cache

After the connection is established, you can execute the SELECT statement. The execution logic leads to the second step: query cache.

When MySQL gets a query request, it goes to the query cache to see if the query has been executed before. Previously executed statements and their results may be cached directly in memory as key-value pairs. Key is the query statement and value is the query result. If your query can find the key directly in the cache, the value will be returned directly to the client.

If the statement is not in the query cache, subsequent execution phases continue. After the execution is complete, the results are stored in the query cache. As you can see, if the query hits the cache, MySQL can return the result directly without having to perform further complicated operations, which can be very efficient.

But in most cases I would advise you not to use query caching, why? Because query caching often does more harm than good.

The query cache invalidates so frequently that whenever a table is updated, all the query cache on that table is cleared. So it’s very likely that you struggled to save the results, and an update wiped them out before you even used them. For databases under pressure to update, the hit ratio of the query cache can be very low. Unless your business has a static table, it only updates once in a while. For example, a system configuration table, the query on that table is suitable for query caching.

Fortunately, MySQL also provides this “on demand” approach. You can set the query_cache_type parameter to DEMAND so that query caching is not used for default SQL statements. For statements where you are sure to use the query cache, you can specify it explicitly with SQL_CACHE, as in the following statement:

Mysql > select SQL_CACHE * from T where ID=10;

Note that MySQL 8.0 directly removed the entire query cache function, meaning that this function has been completely removed from MySQL 8.0.

analyzer

If the query cache is not hit, it is time to actually execute the statement. First, MySQL needs to know what you are going to do, so it needs to parse SQL statements.

The parser first does a “lexical analysis”. You are entering an SQL statement consisting of multiple strings and Spaces. MySQL needs to figure out what the strings are and what they represent.

MySQL knows from the keyword “select” that you typed in. This is a query. It also recognizes the string “T” as “table name T” and the string “ID” as “column ID.”

Once these identifications are done, “parsing” is done. Based on the results of the lexical analysis, the parser will determine whether the SQL statement you entered meets the MySQL syntax based on the syntax rules.

If your syntax is not correct, You will receive an error warning “You have an error in your SQL syntax”, such as the following statement select missing the starting letter “s”.

mysql> elect * from t where ID=1;

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘elect * from t where ID=1’ at line 1

A common grammar error will indicate where the first error occurred, so focus on the words immediately following “use near”.

The optimizer

After a profiler, MySQL knows what you need to do. Before execution can begin, it needs to be processed by the optimizer.

The optimizer determines which index to use when there are multiple indexes in the table. Or determine the order in which tables are joined when a statement has multiple joins. For example, if you execute a statement like the following, this statement performs a join on two tables:

mysql> select * from t1 join t2 using(ID) where t1.c=10 and t2.d=20;

It can first fetch the ID value of the record c=10 from table T1, then associate it with table T2 according to the ID value, and then judge whether the value of D in T2 is equal to 20.

It can also take the ID value of the record d=20 from table T2, and then associate it with T1 according to the ID value, and then judge whether the value of C in T1 is equal to 10.

The logical result of the two execution methods is the same, but the efficiency of the execution will be different, and the role of the optimizer is to decide which one to use.

After the optimizer phase is complete, the execution plan of the statement is determined, and then the executor phase is entered. If you have any questions about how the optimizer chose the index, or whether it might have chosen the index incorrectly, I’ll explain the optimizer separately in a later article.

actuator

MySQL knows what you want to do from the parser, it knows how to do it from the optimizer, so it goes to the executor phase and starts executing statements.

If you do not have permission to perform a query on T, an error will be returned, as shown below.

mysql> select * from T where ID=10;

ERROR 1142 (42000): SELECT command denied to user ‘b’@’localhost’ for table ‘T’

If you have permission, open the table and continue. When a table is opened, the executor uses the interface provided by the table engine according to its definition.

For example, in table T of our example, the ID field has no index, then the execution flow of the executor is like this:

1. Call InnoDB engine interface to get the first row of this table, check whether the ID value is 10, if not, skip, if yes, save this row in the result set;

2. Invoke the engine interface to fetch “Next row” and repeat the same logic until the last row of the table is fetched.

3. The executor returns the recordset consisting of all rows that meet the preceding conditions as a result set to the client.

At this point, the statement is complete.

You’ll see a ROWS_EXAMINED field in the database’s slow query log that shows how many rows were scanned during this statement execution. This value is accumulated each time the executor calls the engine to retrieve the row.

In some scenarios, the executor is called once and multiple rows are scanned inside the engine, so
The number of lines scanned by the engine is not exactly the same as ROws_EXAMINED. The internal mechanics of the storage engine will be explained in detail in a later article.

The same logic is performed for indexed tables. The first call is the “fetch the first row that meets the condition” interface, and then the “next row that meets the condition” interface is iterated, all of which are already defined in the engine.

summary

Today I have introduced you to the logical architecture of MySQL, hopefully giving you an idea of the stages of a complete SQL statement execution process. Due to space constraints, I just used a query example to go through each link. Don’t worry if you still have questions about the details of how each section unfolded, I’ll cover them later in the field section.

0d2070e8f84c4801adbfa03bda1f98d9.png

(608.85KB, download times: 4)

f2da4aa3a672d48ec05df97b9f992fed.png

(15.15KB, download times: 3)

Learn more about Python at gzitcast

Related Posts

Methodology + practice, comprehensive analysis of data acquisition solutions, must see!

Leetcode_832 flips the image

The worst experience of starting a code farming company