Tell the interviewer: I can optimize GROUP BY, and I know it inside out!

Introduction

After our online dating platform has been running for a while, we want to surface friends a user is likely to be interested in within that user’s search results. To do so, we analyze each user’s behavior and recommend friends based on the results of the analysis.

Here I use the simplest possible SQL analysis: take the gender and age of the friends the user has viewed in the past, group them by age, and count each group. Based on the result, recommend to the user the friends whose gender and age appear most often.

Suppose we have a table t_user_view that records which friends each user has viewed. The table structure is as follows:

CREATE TABLE `t_user_view` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment id',
  `user_id` bigint(20) DEFAULT NULL COMMENT 'user id',
  `viewed_user_id` bigint(20) DEFAULT NULL COMMENT 'Id of user being viewed',
  `viewed_user_sex` tinyint(1) DEFAULT NULL COMMENT 'Gender of user viewed',
  `viewed_user_age` int(5) DEFAULT NULL COMMENT 'Age of user being viewed',
  `create_time` datetime(3) DEFAULT CURRENT_TIMESTAMP(3),
  `update_time` datetime(3) DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
  PRIMARY KEY (`id`),
  UNIQUE KEY `idx_user_viewed_user` (`user_id`,`viewed_user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

To make the SQL statistics easier, I store the gender and age of the viewed user redundantly in this table, as the structure above shows.

Let’s take a look at the records in the table:

Taking user_id=1 as an example, the following SQL counts, for each age from 18 to 22, how many female users this user has viewed:

SELECT viewed_user_age as age, count(*) as num FROM t_user_view WHERE user_id = 1 AND viewed_user_age BETWEEN 18 AND 22 AND viewed_user_sex = 1 GROUP BY viewed_user_age

The statistical results are as follows:

As we can see:

This user viewed 2 female users aged 18
This user viewed 1 female user aged 19
This user viewed 3 female users aged 20

Therefore, the user with user_id=1 is most interested in female users aged 20, so we can recommend more female users aged 20 to him.
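To make the grouping and the recommendation choice concrete, here is a minimal Python sketch of the same logic. The sample rows are hypothetical and were picked only to reproduce the counts listed above; the function name count_by_age and the assumption that viewed_user_sex = 1 means female are mine, not part of the platform.

```python
from collections import Counter

# Hypothetical rows (viewed_user_age, viewed_user_sex) for user_id = 1.
# Assumption: viewed_user_sex = 1 means female.
rows = [(18, 1), (18, 1), (19, 1), (20, 1), (20, 1), (20, 1), (25, 1), (20, 0)]

def count_by_age(rows, min_age=18, max_age=22, sex=1):
    """Mimic: WHERE age BETWEEN min_age AND max_age AND sex = ... GROUP BY age."""
    return Counter(
        age for age, row_sex in rows
        if row_sex == sex and min_age <= age <= max_age
    )

counts = count_by_age(rows)
# The age with the highest count is the one to recommend more of.
best_age, best_num = counts.most_common(1)[0]

print(dict(sorted(counts.items())))  # {18: 2, 19: 1, 20: 3}
print(best_age, best_num)            # 20 3
```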

If the number of records in t_user_view reaches 10 million, the efficiency of this SQL query will plummet. Why? What are some ways to optimize it?

To understand why, you have to look at the execution process of this SQL.

Explain

Let’s look at this SQL with explain:

EXPLAIN SELECT viewed_user_age as age, count(*) as num FROM t_user_view WHERE user_id = 1 AND viewed_user_age BETWEEN 18 AND 22 AND viewed_user_sex = 1 GROUP BY viewed_user_age

After executing the explain statement above, we get the following result:

In the “Extra” column, three Using entries appear. They correspond to the three phases of executing the GROUP BY statement from the Introduction:

Using where: the idx_user_viewed_user index tree is searched to locate the viewed_user_id entries that satisfy part of the conditions, and then MySQL goes back to the table to look for records that satisfy the remaining conditions
Using temporary: a temporary table is used to hold the GROUP BY grouping and statistics field information
Using filesort: the sort_buffer is used to sort the grouping field

One term appears in these three phases: the temporary table. As I mentioned in the article “MySQL sub-table timing: 100W? 300W? 500W?”, this is a memory area that a MySQL connection thread can access and process independently. So what does this temporary table look like?

Below, I will first explain MySQL temporary tables, and then use them to walk through how this SQL is executed.

Temporary table

Let’s look again at the GROUP BY SQL from the Introduction. It contains a grouping field, viewed_user_age, and a statistics field, count(*); both are needed to compute the result. To group and count like this and keep the results, there must be an area in memory or on disk that holds the result of one round of counting so that the next round can build on it. An area like this, which stores intermediate results and supports further processing of them, is called a temporary table.

As mentioned earlier, intermediate results can be kept in memory or on disk, so MySQL has two kinds of temporary tables: in-memory temporary tables and disk temporary tables.

Memory temporary table

What is an in-memory temporary table? In the early stage, when the amount of data is not very large, memory can generally hold all the values of the grouping and statistics fields; how much it may hold is determined by the tmp_table_size parameter. The memory area that stores these values is what MySQL calls an in-memory temporary table.

At this point, you might think that MySQL already guarantees good performance by keeping intermediate results in an in-memory temporary table. However, in “MySQL sub-table timing: 100W? 300W? 500W?” I mentioned that frequent memory requests produce fragmentation. MySQL therefore designed its own memory allocation and release mechanism, which reduces or even avoids fragmentation in in-memory temporary tables and improves their memory utilization.

You may also recall that in “Why did query sorting slow to a crawl under high concurrency after I increased sort_buffer_size?” I discussed the user-mode memory allocators ptmalloc and tcmalloc. Both exist to keep a user process from frequently requesting memory from the Linux kernel, which would force the CPU to switch between user mode and kernel mode again and again and hurt memory access efficiency. If they can already solve the memory utilization problem, why would MySQL build its own?

Perhaps the authors of MySQL felt that a general-purpose memory allocator was too complex, and that this complexity would hurt MySQL’s memory handling performance, so MySQL implemented its own memory allocation mechanism: MEM_ROOT. It handles memory in a relatively simple way, and in-memory temporary tables are allocated through it.

Below, taking the SQL in the Introduction as an example, I will explain in detail how the group statistics use the MEM_ROOT allocation and release mechanism.

MEM_ROOT

Let’s first look at the structure of MEM_ROOT. Its design is relatively simple and mainly consists of the following parts, as shown in the figure below:

free: a singly linked list in which each unit is called a block. These blocks hold free memory. Each block contains three elements:

left: the remaining memory size in the block
size: the memory size of the block
next: a pointer to the next block

As shown in the figure above, the row labeled free is the free list. Each part connected by an arrow is a block, and each block shows its left and size

used: a singly linked list in which each unit is called a block. These blocks hold memory that is in use. Each block again contains the three elements above

min_malloc: the threshold of remaining space below which a block is removed from the free list and added to the used list

block_size: the memory size of a block

block_num: the number of blocks managed by MEM_ROOT

first_block_usage: the number of times the first block in the free list failed to satisfy the requested size

pre_alloc: when the entire MEM_ROOT is released, a parameter can be used to keep the block that pre_alloc points to

Below, again taking the group statistics SQL from the Introduction as an example, let’s see how MEM_ROOT allocates memory.

Allocation

Initialize MEM_ROOT, as shown in the figure above:

min_malloc = 32

block_num = 4

first_block_usage = 0

pre_alloc = 0

block_size = 1000

err_handler = 0

free = 0

used = 0

Request memory, as shown in the figure above:

When MEM_ROOT is initialized, free = 0, meaning the free list does not exist yet. So 4 blocks of size 1000/4 = 250 are requested from the Linux kernel to build the free list. As shown in the figure above, the list contains 4 blocks and, following the free list structure described earlier, each block has size 250 and left 250

Allocate memory, as shown in the figure above:

(1) Traverse the free list and take the first block from its head, as shown by the downward arrow in the figure above

(2) Carve a memory area of size 220 out of that block, shown as -220 above the right-pointing arrow in the figure above; left in the block changes from 250 to 30

(3) Assign this memory area of size 220 to the GROUP BY field viewed_user_age and the statistics field count(*) in the SQL, so that the grouped statistics can be collected in it

(4) After step (2), left in the block is 30, and 30 < 32, that is, less than the min_malloc set at initialization, so the block is appended to the tail of the used list, as shown at the bottom of the figure above. Because the used list was empty (used = 0) at initialization, appending to the tail is the same as inserting at the head
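The initialization, request, and allocation steps above can be modeled in a few lines of Python. This is only a sketch of the behavior described in this article, not MySQL’s actual C code: the class names Block and MemRoot and the method alloc are hypothetical, Python lists stand in for the linked lists, and the model simply takes the first block with enough room instead of tracking first_block_usage.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    left: int  # remaining free space in the block
    size: int  # total size of the block

@dataclass
class MemRoot:
    # Initial values taken from the walkthrough above
    min_malloc: int = 32
    block_num: int = 4
    block_size: int = 1000
    free: list = field(default_factory=list)  # stands in for the free list
    used: list = field(default_factory=list)  # stands in for the used list

    def alloc(self, length: int) -> Block:
        # Request memory: the free list is empty, so build block_num blocks.
        if not self.free:
            per_block = self.block_size // self.block_num
            self.free = [Block(per_block, per_block) for _ in range(self.block_num)]
        # Allocate: carve `length` out of the first block that can hold it.
        for i, block in enumerate(self.free):
            if block.left >= length:
                block.left -= length
                # Too little room left: move the block to the tail of used.
                if block.left < self.min_malloc:
                    self.used.append(self.free.pop(i))
                return block
        raise MemoryError(f"no block has {length} units left")

root = MemRoot()
block = root.alloc(220)
print(block)                            # Block(left=30, size=250)
print(len(root.free), len(root.used))   # 3 1
```

With these numbers the block ends up with left = 30, which is below min_malloc = 32, so it leaves the free list exactly as in step (4).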

Release

Using the same example, let’s take a look at how MEM_ROOT releases memory.

As shown in the figure above, the MEM_ROOT release process is as follows:

Traverse the used list and find the block to release. In the figure above, block(30, 250) is the block that was previously allocated to the group statistics
Add 220 to left in block(30, 250), that is, 30 + 220 = 250, which releases the memory area of size 220 that was in use in the block. After the release it becomes block(250, 250)
Insert block(250, 250) at the tail of the free list, as shown by the curved arrow in the figure above
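The release steps can be sketched the same way. Again this is only a model of the description above, with a hypothetical function release and plain dictionaries and lists standing in for blocks and linked lists; the starting state is the one the allocation example ends in.

```python
def release(free, used, block, length):
    """Give `length` units back to `block`, then move it from used to the tail of free."""
    block["left"] += length
    # Remove by identity so that an equal-looking block is never removed by mistake.
    used[:] = [b for b in used if b is not block]
    free.append(block)

# State after the allocation example: three untouched blocks, one used block.
free = [{"left": 250, "size": 250} for _ in range(3)]
block = {"left": 30, "size": 250}
used = [block]

release(free, used, block, 220)
print(block)                 # {'left': 250, 'size': 250}
print(len(free), len(used))  # 4 0
```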

From this walkthrough of MEM_ROOT allocation and release, we can see that the value of min_malloc is hard-coded, which is a little inflexible. For a block whose left has dropped below min_malloc, the larger the memory allocated from it, the smaller its left, the higher its memory utilization, and the less fragmentation; conversely, the smaller the allocation, the more fragmentation. This hard-coding is a flaw in MySQL’s memory allocation.

Disk temporary table

When the total size of the values of the grouping and statistics fields exceeds the limit set by tmp_table_size, MySQL stores them on disk instead. MySQL calls this disk area a disk temporary table.

Disk access is much slower than memory access because it incurs disk I/O, so performance drops noticeably once the grouping and statistics fields have to be written to disk. For this reason we try to increase tmp_table_size so that these fields can be processed in an in-memory temporary table.

Execution process

Whether an in-memory or a disk temporary table is used, the temporary table processes the grouping and statistics fields in the same way. In the Introduction I said that to optimize this SQL you need to understand how it is executed. So, building on the temporary table concept explained above, I will now walk through the execution of this SQL in detail, as shown in the following figure:

Create a temporary table with two fields, viewed_user_age and count(*), where viewed_user_age is the primary key. In the figure above, the temporary box shows the values of these two fields
Scan the secondary index tree idx_user_viewed_user of the table and take the primary key ids from its leaf nodes one by one. In the figure above, the idx_user_viewed_user box is the index tree, and the arrow to its right represents fetching the primary key id
Use each primary key id to look up the record in the leaf nodes of cluster_index:

(1) Fetch the record and read the value of its viewed_user_age field. In the figure above, the right-most column of the cluster_index box holds the viewed_user_age values

(2) If the temporary table has no row whose primary key equals this viewed_user_age, insert a record (viewed_user_age, 1). In the figure above, the arrow to the left of the temporary box represents writing the viewed_user_age value from the cluster_index box into the temporary table

(3) If the temporary table already has a row whose primary key equals this viewed_user_age, add 1 to that row’s count(*) value, as shown in the temporary box in the figure above

After the traversal is complete, the result set is sorted by viewed_user_age in the sort_buffer and returned to the client. In the figure above, the right-most arrow represents writing the viewed_user_age and count(*) values from the temporary box into the sort_buffer, where they are sorted by viewed_user_age

From this walkthrough we can see that the execution goes through four parts: idx_user_viewed_user, cluster_index, temporary, and sort_buffer. Comparing them with the EXPLAIN result above, the first two correspond to Using where, temporary corresponds to Using temporary, and sort_buffer corresponds to Using filesort.
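The four parts can be traced in a small Python model. It is a sketch under simplifying assumptions: the rows are hypothetical, a dict keyed by id stands in for cluster_index, a sorted list stands in for idx_user_viewed_user, a dict keyed by viewed_user_age plays the temporary table, and sorted() plays the sort_buffer. The function name group_count is mine.

```python
# Hypothetical rows keyed by primary key id:
# id -> (user_id, viewed_user_id, viewed_user_sex, viewed_user_age)
cluster_index = {
    1: (1, 101, 1, 20),
    2: (1, 102, 1, 18),
    3: (1, 103, 1, 20),
    4: (1, 104, 0, 21),
    5: (2, 101, 1, 19),
    6: (1, 105, 1, 19),
    7: (1, 106, 1, 18),
    8: (1, 107, 1, 20),
    9: (1, 108, 1, 30),
}

# Secondary index: (user_id, viewed_user_id, id), kept in key order.
idx_user_viewed_user = sorted((r[0], r[1], pk) for pk, r in cluster_index.items())

def group_count(user_id, min_age, max_age, sex):
    temporary = {}  # temporary table, primary key = viewed_user_age
    for uid, _viewed_user_id, pk in idx_user_viewed_user:  # scan secondary index
        if uid != user_id:
            continue
        _, _, row_sex, age = cluster_index[pk]  # back to the table by primary key
        if row_sex != sex or not (min_age <= age <= max_age):
            continue
        # Insert (age, 1) if the key is missing, otherwise add 1 to its count.
        temporary[age] = temporary.get(age, 0) + 1
    return sorted(temporary.items())  # sort by viewed_user_age before returning

result = group_count(1, 18, 22, 1)
print(result)  # [(18, 2), (19, 1), (20, 3)]
```

Note that rows arrive in viewed_user_id order, not age order, which is why the model needs both the keyed temporary table and the final sort.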

Optimization scheme

At this point, what can we do to optimize this SQL?

Since there are four parts to this SQL execution, can we remove the last two parts, temporary and sort_buffer?

The answer is yes. We simply add the following index to the t_user_view table:

ALTER TABLE `t_user_view` ADD INDEX `idx_user_age_sex` (`user_id`, `viewed_user_age`, `viewed_user_sex`);

You can try it yourself, and take a look at what has changed in the EXPLAIN output!

Summary

This article takes the group statistics SQL from the Introduction, uses EXPLAIN to identify the phases of its execution, and then, building on the structure of temporary tables, walks through the execution process in detail. This leads to the optimization: add an index so that no temporary table is needed to count the grouping field and no sort_buffer is needed to sort the grouping and statistics fields.

Of course, if a temporary table cannot be avoided, try to increase tmp_table_size so that the grouping statistics do not spill to a disk temporary table.

Food for thought

Why does the new index idx_user_age_sex avoid the temporary table for counting the grouping field and the sort_buffer for sorting the grouping and statistics fields?

Hint: think about how index lookup works.
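As a nudge in that direction, here is a small Python sketch of counting over input that is already ordered. It assumes that, for a fixed user_id, the entries of idx_user_age_sex can be read in viewed_user_age order and that the index alone contains every column the query needs; the sample entries and the function name group_count_ordered are hypothetical, and a real engine would seek directly to the matching range rather than filter the whole list.

```python
from itertools import groupby

# Hypothetical index entries (user_id, viewed_user_age, viewed_user_sex),
# stored in key order like the leaf level of idx_user_age_sex.
idx_user_age_sex = sorted([
    (1, 18, 1), (1, 18, 1), (1, 19, 1), (1, 20, 1), (1, 20, 1), (1, 20, 1),
    (1, 21, 0), (1, 30, 1), (2, 19, 1),
])

def group_count_ordered(user_id, min_age, max_age, sex):
    matching = (
        age for uid, age, row_sex in idx_user_age_sex
        if uid == user_id and min_age <= age <= max_age and row_sex == sex
    )
    # Ages arrive already sorted, so every group is one contiguous run:
    # a running counter is enough, with no keyed table and no final sort.
    return [(age, sum(1 for _ in run)) for age, run in groupby(matching)]

ordered_result = group_count_ordered(1, 18, 22, 1)
print(ordered_result)  # [(18, 2), (19, 1), (20, 3)]
```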