Heartfelt content and hard-core material: search for [Santaizi Aobing] on WeChat and follow this programmer who does things a little differently.

This article is included on GitHub at github.com/JavaFamily, which collects complete interview topics from first-tier big tech companies, study material, and my series of articles.

Preface

The other day during my lunch break, the company DBA woke me up saying one of our databases was being hit by a flood of slow SQL. Before I could even react, the database went down. I thought: users these days really don't play fair, how did a pile of requests come in while I was napping?

This is a very common scenario: many businesses start with very little data, so nobody pays attention to performance when writing the SQL, and later a lot of that SQL needs tuning. Over my years at an e-commerce company I have accumulated quite a few lessons on this, so let me share them with you.

During development we follow SQL coding standards to write high-quality SQL and improve the response time (RT) of our interfaces. Some core interfaces require an RT below 100ms or even lower.

In the early stage of a service the data volume is small, so this requirement is easy to meet. But as the business grows, so does the data; the SQL behind the interface slows down and the user experience suffers directly. At that point the SQL needs to be optimized.

The optimization work falls into three areas: SQL specification checks, table structure and index checks, and SQL optimization case analysis. Let's walk through each of them with real cases.

SQL specification check

Each company has its own MySQL development specifications, which are basically the same with minor differences. Here are some of the more important ones that I have come into contact with during my work.

Select check

UDF user-defined functions

If a user-defined function (UDF) appears in the select list, it will be called once for every row the statement returns, which has a significant impact on performance.

#getOrderNo is a user-defined function for the user to get the order number based on order_sn
select id, payment_id, order_sn, getOrderNo(order_sn) from payment_transaction where status = 1 and create_time between '2020-10-01 10:00:00' and '2020-10-02 10:00:00';

Text type check

If a text-type column appears in the select list, it consumes a lot of network and IO bandwidth. If the returned content is too large and exceeds the max_allowed_packet setting, the program will throw an error, so this needs to be evaluated and used with caution.

The content in the request_log table is of type text.
select user_id, content, status, url, type from request_log where user_id = 32121;

Use group_concat with caution

group_concat is a string aggregation function that affects the response time of the SQL. If the returned value is too large and exceeds the max_allowed_packet setting, the program will throw an error.

select batch_id, group_concat(name) from buffer_batch where status = 0 and create_time between '2020-10-01 10:00:00' and '2020-10-02 10:00:00';

Inline subqueries

A subquery that appears in the select list is called an inline subquery. It is executed once for every row the outer query returns, which seriously hurts SQL performance.

select id, (select rule_name from member_rule limit 1) as rule_name, member_id, member_type, member_name, status from member_info m where status = 1 and create_time between '2020-09-02 10:00:00' and '2020-10-01 10:00:00';
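A rough sketch of the fix, assuming every row needs the same single rule_name (as the limit 1 subquery above implies): pull the rule out into a one-row derived table and join it once instead of re-running the subquery per row.

# Sketch only: the member_rule derived table is evaluated once, not once per returned row.
select m.id, r.rule_name, m.member_id, m.member_type, m.member_name, m.status
from member_info m
cross join (select rule_name from member_rule limit 1) r
where m.status = 1
  and m.create_time between '2020-09-02 10:00:00' and '2020-10-01 10:00:00';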

From check

How tables are joined

In MySQL, LEFT JOIN is not recommended. Even when the ON condition filters on an indexed column, in some cases the index is not used, a large number of rows get scanned, and SQL performance is poor.

SELECT a.member_id,a.create_time,b.active_time FROM operation_log a LEFT JOIN member_info b ON a.member_id = b.member_id where  b.`status` = 1
and a.create_time between '2020-10-01 00:00:00' and '2020-10-30 00:00:00' limit 100;
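Since the where clause already filters on b.`status` = 1, any NULL-extended rows produced by the left join would be thrown away anyway, so the statement can usually be expressed as an inner join. A sketch of that rewrite:

SELECT a.member_id, a.create_time, b.active_time
FROM operation_log a
INNER JOIN member_info b ON a.member_id = b.member_id
WHERE b.`status` = 1
  and a.create_time between '2020-10-01 00:00:00' and '2020-10-30 00:00:00'
limit 100;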

Subquery

Because MySQL's cost-based optimizer (CBO) handles subqueries poorly, subqueries are not recommended; they can usually be rewritten as an inner join.

select b.member_id,b.member_type, a.create_time,a.device_model from member_operation_log a inner join (select member_id,member_type from member_base_info where `status` = 1
and create_time between '2020-10-01 00:00:00' and '2020-10-30 00:00:00') as b on a.member_id = b.member_id;

Where check

Operations on indexed columns

If a field has an index and appears in the WHERE condition, do not apply any function or arithmetic operation to it, otherwise the index cannot be used.

#device_no has an index; the ltrim function makes it unusable
select id, name, phone, address, device_no from users where ltrim(device_no) = 'Hfs1212121';
#balance has an index; the arithmetic on the column makes it unusable
select account_no, balance from accounts where balance + 100 = 10000 and status = 1;
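A rough sketch of the index-friendly rewrites. The first assumes device_no values are already stored trimmed (or can be cleaned up beforehand), so no ltrim is needed at query time; the second moves the arithmetic to the constant side.

# Keep the indexed column bare; do any computation on the constant side instead.
select id, name, phone, address, device_no from users where device_no = 'Hfs1212121';
select account_no, balance from accounts where balance = 10000 - 100 and status = 1;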

Type conversion

MySQL handles implicit type conversion differently depending on the column type: for an int column, passing in a varchar value is converted once and the index can still be used; conversely, for a varchar column, passing in an int value forces a conversion on every row and the index cannot be used. The safe habit is to always pass a value of the same type as the field.

#user_id is bigint; the incoming varchar value is implicitly converted and the index can still be used
select id, name, phone, address, device_no from users where user_id = '23126';
#card_no is varchar(20); the incoming int value causes an implicit conversion and the index cannot be used
select id, name, phone, address, device_no from users where card_no = 2312612121;
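The simplest rule is to make the literal match the column type exactly; a minimal sketch of the corrected statements:

# user_id is bigint: pass a number, no conversion needed at all.
select id, name, phone, address, device_no from users where user_id = 23126;
# card_no is varchar(20): quote the value so the column itself is not cast row by row.
select id, name, phone, address, device_no from users where card_no = '2312612121';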

Column character set

Starting with MySQL 5.6, it is recommended that all objects use the utf8mb4 character set — instance, database, table, and column — to avoid index failures caused by character set mismatches in join queries. At present only utf8mb4 can store emoji.

character_set_server  =  utf8mb4    # database instance character set
character_set_connection = utf8mb4  # connect character set
character_set_database = utf8mb4    # database character set
character_set_results = utf8mb4     # result set character set

Group by check

Prefix index

If the column after GROUP BY is indexed, the CPU overhead of sorting can be eliminated; with a prefix index, however, it cannot.

#device_no field type varchar(200), create prefix index.
mysql> alter table users add index idx_device_no(device_no(64));

mysql> select device_no, count(*) from users where create_time between '2020-10-01 00:00:00' and '2020-10-30 00:00:00' group by device_no;

Function operations

Suppose you need to count the number of new users per day for a given month, as in the SQL below. The create_time index can still be used, but the sort cannot be eliminated. To solve this, you can add a redundant stats_date column of DATE type and index it, as sketched after the example.

select DATE_FORMAT(create_time, '%Y-%m-%d'), count(*) from users where create_time between '2020-09-01 00:00:00' and '2020-09-30 23:59:59' group by DATE_FORMAT(create_time, '%Y-%m-%d');
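A hedged sketch of the redundant-column approach — the column name stats_date, the index name, and the backfill are illustrative, not taken from the original schema:

# Add a redundant DATE column, backfill it, index it, then group on it directly.
alter table users add column stats_date date default null comment 'date part of create_time';
update users set stats_date = date(create_time);
alter table users add index idx_stats_date(stats_date);

select stats_date, count(*) from users
where stats_date between '2020-09-01' and '2020-09-30'
group by stats_date;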

Order by check

Prefix index

If the column after ORDER BY is indexed, the CPU overhead of sorting can be eliminated, but not if the index is a prefix index.

Field order

Keep the ASC/DESC direction of ORDER BY consistent with the index, so the index's ordering is fully used and the CPU overhead of sorting is eliminated.
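A small sketch, assuming a hypothetical orders table on MySQL 8.0+ (where descending index columns are supported): the index direction matches the order by direction, so no filesort is needed.

# Hypothetical example: declare the index with the same direction the query sorts in.
alter table orders add index idx_status_create_time(status asc, create_time desc);

select order_sn, status, create_time from orders
where status = 1
order by create_time desc;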

Limit check

Deep paging

For limit M, N pagination, the deeper the page (i.e. the larger M is), the longer the SQL takes. In that case, fetch the primary key ids first, then join back to the original table on those ids, as sketched below.
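A minimal sketch of the deferred-join rewrite, reusing the users table from the earlier examples with a made-up page offset:

# Naive deep paging: the server still walks past the first 1,000,000 rows of full-width data.
select id, name, phone from users order by id limit 1000000, 10;

# Deferred join: page through the slim primary-key index first, then join back for the wide columns.
select u.id, u.name, u.phone
from users u
inner join (select id from users order by id limit 1000000, 10) t on u.id = t.id;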

Table structure check

Table & column name keyword

During database design and modeling, choose table and field names carefully and avoid MySQL keywords or easily conflicting words such as desc, order, status, and group. It is also recommended to set lower_case_table_names = 1 so that table names are case-insensitive.

Table storage engine

For OLTP service systems, the InnoDB engine is recommended for better performance; the default can be controlled with default_storage_engine.
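A quick sketch for checking the default engine and spotting stray tables (the legacy_log table name is just an illustration):

# Instance-level default engine, and any tables in the current schema that are not InnoDB.
select @@default_storage_engine;
select table_name, engine from information_schema.tables
where table_schema = database() and engine <> 'InnoDB';

# Convert a legacy table if one turns up.
alter table legacy_log engine = InnoDB;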

AUTO_INCREMENT attribute

When creating an InnoDB table, the primary key id should carry the AUTO_INCREMENT attribute, with AUTO_INCREMENT = 1. Note that if a table has no explicit primary key, InnoDB generates a hidden row_id for which only 6 bytes are reserved, so row_id ranges from 0 to 2^48 - 1; once the maximum is reached, the next value wraps around and starts from 0 again.

Not recommended: with auto_increment = 10000, new id values will start from 10000 rather than from 1.
create table booking( `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key id',...)engine = InnoDB auto_increment = 10000;

If an id value is specified explicitly, the next auto-increment value continues from that value + 1.
insert into booking(id, book_sn) values(1234551121, 'N12121');

NOT NULL attribute

Give every field a NOT NULL DEFAULT value according to business requirements. If a column stores a large number of NULL values, the usefulness and stability of its index can suffer.

DEFAULT attribute

When creating a table, give each field a DEFAULT value wherever possible; avoid DEFAULT NULL and instead use a default appropriate to the field's type.

COMMENT attribute

Field comments must make the purpose of the field clear. For state fields in particular, list every possible state value and what it means.

TEXT type

The Text data type is not recommended. On the one hand, transferring large packets may exceed the max_allowed_packet setting and cause program errors; on the other hand, DML operations on the table become very slow.

Index check

The index attribute

Index cardinality is the number of unique values in the indexed column. The closer it is to count(*), the higher the index's selectivity, the fewer rows the index scan touches, and the better the performance. The primary key id, for example, has 100% selectivity; and because id is the clustered index storing the whole row, no look-up back to the table is needed, so its performance is the highest.

mysql> select count(*) from member_info;
+----------+
| count(*) |
+----------+
|   148416 |
+----------+
1 row in set (0.35 sec)

mysql> show index from member_info;
+-------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table       | Non_unique | Key_name        | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| member_info |          0 | PRIMARY         |            1 | id          | A         |      131088 | NULL     | NULL   |      | BTREE      |         |               |
| member_info |          0 | uk_member_id    |            1 | member_id   | A         |      131824 | NULL     | NULL   |      | BTREE      |         |               |
| member_info |          1 | idx_create_time |            1 | create_time | A         |        6770 | NULL     | NULL   |      | BTREE      |         |               |
+-------------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
# Table: table name
#Non_unique: whether duplicates are allowed. 0 - unique index, 1 - non-unique index
#Key_name: index name
#Seq_in_index: position of the column in the index. Single-column index - always 1; composite index - increments from 1 following the order of the index columns
#Column_name: name of the indexed column
#Collation: sort order; if asc/desc is not specified, the default is ascending
#Cardinality: index cardinality - the number of unique values in the index column
#Sub_part: prefix index length; for example index(member_name(10)) has length 10
#Packed: how the index is packed; NULL by default
#Null: YES - the index column may contain NULL values; empty - it does not
#Index_type: BTREE by default; other values are FULLTEXT, HASH, RTREE
#Comment: information about the index not covered by the other columns, such as whether the index is disabled
#Index_comment: comment supplied when the index was created
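Selectivity can also be estimated directly: the closer count(distinct col) / count(*) is to 1, the better the column suits an index. A rough sketch against member_info:

# Values close to 1.0 indicate highly selective columns; values near 0 suggest a poor index candidate.
select count(distinct member_id)   / count(*) as member_id_selectivity,
       count(distinct create_time) / count(*) as create_time_selectivity
from member_info;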

Prefix index

For a variable-length string column varchar(m), a prefix index can be created to reduce key_len, but a prefix index cannot eliminate the sorting overhead of group by and order by. If the actual maximum length of the data is much smaller than m, consider shortening the column instead.

alter table member_info add index idx_member_name_part(member_name(10));
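Before settling on the prefix length, it helps to compare how much selectivity survives at a few candidate lengths — a sketch:

# Pick the shortest prefix whose selectivity is close to the full column's.
select count(distinct left(member_name, 5))  / count(*) as sel_5,
       count(distinct left(member_name, 10)) / count(*) as sel_10,
       count(distinct left(member_name, 20)) / count(*) as sel_20
from member_info;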

Compound index order

For the SQL below, if the composite index is created as idx_create_time_status(create_time, status), the range condition on create_time comes first, the estimated cost ends up higher than a full table scan, and the CBO finally chooses the full table scan.

MySQL follows the leftmost-prefix matching principle for composite indexes: columns are matched from left to right until the first range condition (>=, >, <, <=, between ... and ...) is encountered. The correct index order is therefore idx_status_create_time(status, create_time).

select account_no, balance from accounts where status = 1 and create_time between '2020-09-01 00:00:00' and '2020-09-30 23:59:59';
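A sketch of fixing the index order on the accounts table, assuming the misordered index already exists:

# Equality column first, range column last, in line with the leftmost-prefix rule.
alter table accounts drop index idx_create_time_status;
alter table accounts add index idx_status_create_time(status, create_time);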

Time column index

Indexes should be created on the created_at (create_time) and updated_at (update_time) columns by default; this is generally a standing rule.

SQL Optimization cases

Slow-query alerts often reveal SQL statements whose where-clause columns do have indexes, yet the indexes are not used because of how the SQL is written. The case below shows how to fix this by rewriting the SQL. The following query finds slow queries from the last 5 minutes and can be used to drive the alert.

select CONCAT('# Time: ', DATE_FORMAT(start_time, '%y%m%d %H%i%s'), '\n', '# User@Host: ', user_host, '\n', '# Query_time: ', TIME_TO_SEC(query_time), ' Lock_time: ', TIME_TO_SEC(lock_time), ' Rows_sent: ', rows_sent, ' Rows_examined: ', rows_examined, '\n', sql_text, ';') FROM mysql.slow_log where start_time between date_add(CURRENT_TIMESTAMP, INTERVAL -5 MINUTE) and CURRENT_TIMESTAMP;

Slow query SQL

| 2020-10-02 19:17:23 | w_mini_user[w_mini_user] @ [10.200.20.11] | 00:00:02 | 00:00:00 | 9 | 443117 | mini_user | 0 | 0 | 168387936 | select id,team_id,reason,status,type,created_time,invite_id,falg_admin,file_id from t_user_msg where 1 and (team_id in (3212) and app_id is not null) or (invite_id=12395 or app_id=12395) order by created_time desc limit 0,10; | 1219921665 |

As you can see from slow_log, the statement took 2s, scanned 443117 rows, and returned only 9 rows, which is clearly unreasonable.

SQL analysis

# raw SQL, frequently accessed interface, current execution time 2s.
select id,team_id,reason,status,type,created_time,invite_id,falg_admin,file_id from t_user_msg where 1 and (team_id in (3212) and app_id is not null) or (invite_id=12395 or app_id=12395) order by created_time desc limit 0,10;

# Execution plan
+----+-------------+------------+-------+--------------------------+--------------+---------+------+------+-------------+
| id | select_type | table      | type  | possible_keys            | key          | key_len | ref  | rows | Extra       |
+----+-------------+------------+-------+--------------------------+--------------+---------+------+------+-------------+
|  1 | SIMPLE      | t_user_msg | index | invite_id,app_id,team_id | created_time | 5       | NULL |   10 | Using where |
+----+-------------+------------+-------+--------------------------+--------------+---------+------+------+-------------+
1 row in set (0.00 sec)

From the execution plan you can see that the table has single-column indexes on invite_id, app_id, team_id, and created_time. The optimizer chose the created_time index with type=index (a full index scan), because created_time appears only in the order by and not in the where condition. What we would expect instead is type=index_merge over the three single-column indexes invite_id, app_id, and team_id.

Following the usual approach, split the OR condition into two parts and analyze them separately.

select id, ... from t_user_msg where 1 and **(team_id in (3212) and app_id is not null)** order by created_time desc limit 0,10;

The execution plan shows the team_id index is used — no problem here.

+----+-------------+------------+------+----------------+---------+---------+-------+------+-----------------------------+
| id | select_type | table      | type | possible_keys  | key     | key_len | ref   | rows | Extra                       |
+----+-------------+------------+------+----------------+---------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | t_user_msg | ref  | app_id,team_id | team_id | 8       | const |   30 | Using where; Using filesort |
+----+-------------+------------+------+----------------+---------+---------+-------+------+-----------------------------+

Look at another SQL statement:

select id, ... from t_user_msg where 1 and **(invite_id=12395 or app_id=12395)** order by created_time desc limit 0,10;

The execution plan shows type=index_merge, merging the results of the invite_id and app_id indexes, which is exactly what we expect.

+----+-------------+------------+-------------+-------------------+-------------------+---------+------+------+------------------------------------------------------------+
| id | select_type | table      | type        | possible_keys     | key               | key_len | ref  | rows | Extra                                                      |
+----+-------------+------------+-------------+-------------------+-------------------+---------+------+------+------------------------------------------------------------+
|  1 | SIMPLE      | t_user_msg | index_merge | invite_id,app_id  | invite_id,app_id  | 9,9     | NULL |    2 | Using union(invite_id,app_id); Using where; Using filesort |
+----+-------------+------------+-------------+-------------------+-------------------+---------+------+------+------------------------------------------------------------+

Analyzed separately, the two parts can use the team_id, invite_id, and app_id indexes respectively. So why, when they are combined with OR, does the whole statement not use an index_merge over the three single-column indexes?

index_merge is enabled by default in the optimizer options. It merges the results of multiple range scans on a single table into one result set, and can be inspected through the optimizer_switch variable.

mysql> select @@optimizer_switch;
| index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,

After repeated analysis and testing, removing the condition app_id is not null and running EXPLAIN again shows the statement using the index_merge of the three single-column indexes, so it is this condition that influences the CBO's choice.

+----+-------------+------------+-------------+---------------------------+---------------------------+---------+------+------+--------------------------------------------------------------------+
| id | select_type | table      | type        | possible_keys             | key                       | key_len | ref  | rows | Extra                                                              |
+----+-------------+------------+-------------+---------------------------+---------------------------+---------+------+------+--------------------------------------------------------------------+
|  1 | SIMPLE      | t_user_msg | index_merge | invite_id,app_id,team_id  | team_id,invite_id,app_id  | 8,9,9   | NULL |   32 | Using union(team_id,invite_id,app_id); Using where; Using filesort |
+----+-------------+------------+-------------+---------------------------+---------------------------+---------+------+------+--------------------------------------------------------------------+

Rewrite the SQL

According to the above analysis, the condition app_id is not NULL affects the selection of CBO.

Rewriting optimization 1

Following the SQL development specification, rewrite the OR into UNION ALL; the final SQL is as follows:

select id, ... from (
        select id, ... from t_user_msg where **1 and (team_id in (3212) and app_id is not null)**
        **union all**
        select id, ... from t_user_msg where **1 and invite_id=12395**
        **union all**
        select id, ... from t_user_msg where **1 and app_id=12395**
       ) as a order by created_time desc limit 0,10;

In most projects, the Java code is kept separate from the SQL, which lives in XML configuration. In this business, team_id is a required condition while the other two are optional, so although this rewrite improves SQL execution efficiency, it does not fit the business scenario.

Rewriting optimization 2

Rewrite the condition app_id is not null as IFNULL(app_id, 0) > 0.

select id,team_id,reason,status,type,created_time,invite_id,falg_admin,file_id from t_user_msg where 1 and (team_id in (3212) and **IFNULL(app_id, 0) > 0**) or (invite_id=12395 or app_id=12395) order by created_time desc limit 0,10;

Rewriting optimization 3

Change the column definition from app_id bigint(20) DEFAULT NULL to app_id bigint(20) NOT NULL DEFAULT 0, and rewrite the condition app_id is not null as app_id > 0.

select id,team_id,reason,status,type,created_time,invite_id,falg_admin,file_id from t_user_msg where 1 and (team_id in (3212) and **app_id > 0**) or (invite_id=12395 or app_id=12395) order by created_time desc limit 0,10;

Judging by the execution plans, both rewrites use the three single-column indexes via index_merge, and the execution time drops from 2s to 10ms. The fix deployed online was optimization 1. Had the MySQL development specification been followed from the start, the problem would have been avoided entirely.

Conclusion

While writing code, if you run these specification checks in advance, sketch the execution plan you expect, then EXPLAIN the statement to see what MySQL's CBO actually chose and compare the two, the differences will not only help you write high-quality SQL but also deepen your understanding of how the CBO works.

This article is continuously updated. Search for "Santaizi Aobing" on WeChat to read it as soon as it's out; reply with [Information] to get the big-tech interview material and resume templates I have prepared.