I recently read a blog post by the Ele.me team: "Wait! We have already stepped on these two spring-rabbitmq pits for you." It left a deep impression: a title should catch the reader's attention, but beyond a catchy title the content has to deliver. Why choose a title like that? Because today we look at slow-query optimization, in theory and, at last, in real production practice!

(1) Slow SQL One

Finding the problem

After the application was released to the production environment, a front-end page request to the back-end API took at least 6 seconds to return data. The culprit was a slow SQL statement.

Reproducing the slow SQL

Execute SQL:

select count(*) from sync_block_data
where unix_timestamp(sync_dt) >= 1539101010
AND unix_timestamp(sync_dt) <= 1539705810

Viewing time: the query took 2658 ms in total.

View the execution plan:

explain select count(*) from sync_block_data
where unix_timestamp(sync_dt) >= 1539101010
AND unix_timestamp(sync_dt) <= 1539705810

Execution plan results:

Optimize slow SQL one

The sync_dt column is of type datetime. Rewrite the SQL so the comparison is done directly on dates instead of on timestamps: convert the two unix timestamps into the datetime literals 2018-10-10 00:03:30 and 2018-10-17 00:03:30 and compare sync_dt against them.

select count(*) from sync_block_data
where sync_dt >= "2018-10-10 00:03:30"
AND sync_dt <= "2018-10-17 00:03:30"

Viewing time: it took 419 ms, more than six times faster than the slow query.

View the execution plan:

explain select count(*) from sync_block_data
where sync_dt >= "2018-10-10 00:03:30"
AND sync_dt <= "2018-10-17 00:03:30"

Execution plan results: the average request time after this optimization was about 900 ms.
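The timestamps in the slow query can be turned into the datetime literals used above in application code. A minimal sketch in Python (assuming the server runs in UTC+8, since 1539101010 corresponds to 2018-10-10 00:03:30 in that zone):

```python
from datetime import datetime, timezone, timedelta

CST = timezone(timedelta(hours=8))  # assumed server timezone (UTC+8)

def ts_to_datetime_literal(ts: int) -> str:
    """Convert a unix timestamp in seconds to a MySQL datetime literal."""
    return datetime.fromtimestamp(ts, tz=CST).strftime("%Y-%m-%d %H:%M:%S")

start = ts_to_datetime_literal(1539101010)  # "2018-10-10 00:03:30"
end = ts_to_datetime_literal(1539705810)    # "2018-10-17 00:03:30"

# Pass the literals as bind parameters instead of calling unix_timestamp() in SQL
sql = "select count(*) from sync_block_data where sync_dt >= %s AND sync_dt <= %s"
```

The two timestamps are exactly 604800 seconds (7 days) apart, which is how the date range above comes out to a clean week.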

The only difference between the two execution plans is the type column: index for the slow query versus range for the fast one.
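The same effect can be reproduced outside MySQL. Here is a sketch using Python's built-in sqlite3 module; SQLite's plan wording differs from MySQL's (SCAN/SEARCH rather than index/range), but the underlying rule is the same: wrapping the indexed column in a function forces a scan, while a bare comparison lets the index narrow the range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sync_block_data (id INTEGER PRIMARY KEY, sync_dt TEXT)")
conn.execute("CREATE INDEX idx_sync_dt ON sync_block_data (sync_dt)")

def plan(sql: str) -> str:
    """Return the query plan details as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

# Function applied to the indexed column: the index cannot bound the range
slow = plan("SELECT count(*) FROM sync_block_data "
            "WHERE CAST(strftime('%s', sync_dt) AS INTEGER) >= 1539101010")

# Bare comparison on the column: the index is searched, not scanned
fast = plan("SELECT count(*) FROM sync_block_data "
            "WHERE sync_dt >= '2018-10-10 00:03:30'")
```

Printing the two plans shows SCAN for the function-wrapped predicate and SEARCH with idx_sync_dt for the direct comparison.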

Optimize slow query two

Since the business only needs the data volume of the last seven days, the upper-bound condition can be dropped, leaving a single comparison:

select count(*) from sync_block_data
where sync_dt >= "2018-10-10 00:03:30"

Viewing time: it took 275 ms, cutting the query time roughly in half.

View the execution plan:

explain select count(*) from sync_block_data
where sync_dt >= "2018-10-10 00:03:30"

Execution plan results: type is still range, but doing one less comparison roughly doubles the query speed.

Optimize slow query three

Create a bigint column sync_dt_long that stores the millisecond value of sync_dt, and create an index on sync_dt_long. For comparison, first the datetime query against the copied table:

select count(*) from copy_sync_block_data
where sync_dt >="2018-10-10 13:15:02"

The datetime version takes 34 ms in the test environment. Now the bigint version:

select count(*) from copy_sync_block_data
where sync_dt_long >= 1539148502916

Duration: 22 ms. In the test environment the bigint version is about 10 ms faster.

Under the InnoDB storage engine, comparing bigint values is more efficient than comparing datetime values.
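The migration itself is straightforward: add the column, backfill it, and index it. A runnable sketch using sqlite3 as a stand-in for MySQL (in MySQL this would be an ALTER TABLE ... ADD COLUMN sync_dt_long BIGINT plus a backfill with unix_timestamp(sync_dt) * 1000; note that strftime('%s', ...) here interprets the string as UTC, so the absolute numbers differ from the blog's UTC+8 values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE copy_sync_block_data (id INTEGER PRIMARY KEY, sync_dt TEXT)")
conn.execute("INSERT INTO copy_sync_block_data (sync_dt) VALUES ('2018-10-10 13:15:02')")

# Add the bigint column, backfill the millisecond value, and index it
conn.execute("ALTER TABLE copy_sync_block_data ADD COLUMN sync_dt_long INTEGER")
conn.execute("UPDATE copy_sync_block_data "
             "SET sync_dt_long = CAST(strftime('%s', sync_dt) AS INTEGER) * 1000")
conn.execute("CREATE INDEX idx_sync_dt_long ON copy_sync_block_data (sync_dt_long)")

ms = conn.execute("SELECT sync_dt_long FROM copy_sync_block_data").fetchone()[0]
```

After the backfill, range queries can use the indexed sync_dt_long column exactly as the optimized SQL above does.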

Request time in the production environment after the three optimization steps: about 200 ms faster again. After additionally caching the query results for 10 s, the fastest average response time is 20 ms.
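The 10-second cache mentioned above can be as simple as an in-process time-based memo. A minimal sketch (the function names are illustrative; a real deployment might use Redis or a framework-level cache instead):

```python
import time

def ttl_cache(ttl_seconds=10.0):
    """Cache a function's results, re-computing after ttl_seconds."""
    def decorate(fn):
        store = {}  # args -> (timestamp, value)
        def wrapper(*args):
            hit = store.get(args)
            if hit is not None and time.monotonic() - hit[0] < ttl_seconds:
                return hit[1]  # fresh enough: skip the query
            value = fn(*args)
            store[args] = (time.monotonic(), value)
            return value
        return wrapper
    return decorate

calls = 0

@ttl_cache(ttl_seconds=10.0)
def count_blocks(start, end):
    # Stand-in for the count(*) query against the database
    global calls
    calls += 1
    return 42

count_blocks("2018-10-10", "2018-10-17")
count_blocks("2018-10-10", "2018-10-17")  # served from the cache
```

Within the 10-second window, repeated identical requests never hit the database.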

Explain Usage

The table below describes the output columns of MySQL's explain query plan.

Column  Description
id  Execution number; identifies which select the row belongs to. If the statement has no subqueries or joins and contains a single select, every row shows 1; otherwise the inner select statements are numbered sequentially according to their position in the original statement.
select_type  Whether the row is a simple or complex select. If the query contains complex subqueries, the outermost layer is marked PRIMARY; others may be DERIVED, UNION, or UNION RESULT.
table  The table the row accesses (possibly a derived table, such as "derived3").
type  The data access/read operation type (ALL, index, range, ref, eq_ref, const/system, NULL).
possible_keys  Indexes that might be useful for an efficient lookup.
key  The index MySQL actually decided to use for the query.
key_len  The number of bytes of the index MySQL uses.
ref  The columns or constants from the preceding table used to look up values in the index named in key.
rows  The estimated (not exact) number of rows that must be read to find the desired rows. Multiplying the rows values across all rows gives a rough estimate of the rows examined by the entire query.
Extra  Additional information, such as Using index or Using filesort.

Focus on type: the difference in type alone accounted for a six-fold performance gap!

type shows the access type and is one of the more important indicators. From best to worst, the values are: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL.

type  Description
ALL  Worst case: a full table scan.
index  Same as a full table scan, except the table is scanned in index order rather than row order. The main advantage is that sorting is avoided, but the cost is still high. If Using index appears in the Extra column, a covering index is in use: only the index data is scanned, which is much cheaper than a full table scan in index order.
range  A range scan, i.e. a limited index scan; the key column shows which index is used. range applies when an indexed column is compared to a constant with =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, or IN.
ref  An index access that returns all rows matching a single value. It occurs with a non-unique index or a non-unique prefix of a unique index. It differs from eq_ref in that only the left-most prefix of the index is used in the join, or the index is not UNIQUE or the PRIMARY KEY. ref works for indexed columns compared with = or <=>.
eq_ref  Returns at most one matching record. Occurs with a unique index or primary-key lookup (efficient).
const  When MySQL determines there is at most one matching row, the optimizer reads it once before the query, so it is very fast. A primary key in the WHERE clause lets MySQL convert the query into a constant (efficient).
system  A special case of const where the table has exactly one row.
NULL  MySQL can resolve the query during optimization without accessing any table or index at execution time (efficient).

Causes of slow query

The unix_timestamp function (which returns the number of seconds since '1970-01-01 00:00:00') was applied to the sync_dt column inside the WHERE clause. Running a function on an indexed column defeats range access, forcing a full index scan just to count the last seven days of data.

The solution

Avoid applying functions to columns in the WHERE clause, since that makes the storage engine abandon the index in favor of a full scan. Values that need computation are better computed in the application and passed in as parameters, as in this SQL: the current date and the date seven days ago are computed up front and passed in.
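Computing the boundary in application code rather than in SQL might look like this minimal sketch (the "now" value is fixed here for illustration):

```python
from datetime import datetime, timedelta

def last_seven_days_query(now: datetime):
    """Build the parameterized WHERE bound in application code,
    instead of calling unix_timestamp() inside the SQL."""
    week_ago = now - timedelta(days=7)
    sql = "select count(*) from sync_block_data where sync_dt >= %s"
    return sql, week_ago.strftime("%Y-%m-%d %H:%M:%S")

sql, bound = last_seven_days_query(datetime(2018, 10, 17, 0, 3, 30))
```

The SQL stays function-free on the indexed column, so the range scan from the optimized plan above remains available.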

Afterword

This problem was not detected in the test environment at the time; request speed there was fine. The reason it slipped through is data volume: production holds millions of rows while the test environment holds tens of thousands, a difference of roughly 50 times. As data volume grows, slow-query problems are magnified with it.

(2) Slow SQL Two

Because the slow response was so obvious online, I also went through the other SQL in the project and found another statement with poor execution efficiency.

Reproducing the slow SQL

Execute SQL:

select FROM_UNIXTIME(copyright_apply_time/1000,'%Y-%m-%d') point,count(1) nums
from resource_info where copyright_apply_time >= 1539336488355 and copyright_apply_time <= 1539941288355 group by point

Viewing time: the query took 1123 ms.

View the execution plan:

explain select FROM_UNIXTIME(copyright_apply_time/1000,'%Y-%m-%d') point,count(1) nums
from resource_info where copyright_apply_time >= 1539336488355 and copyright_apply_time <= 1539941288355 group by point

Execution plan results: Using temporary and Using filesort appear in the Extra column.

Optimize slow SQL one

group by implicitly sorts the result by the grouping columns before returning. The sort is not needed here, so disable it by appending order by null:

select FROM_UNIXTIME(copyright_apply_time/1000,'%Y-%m-%d') point,count(1) nums
from resource_info where copyright_apply_time >= 1539336488355 and copyright_apply_time <= 1539941288355 group by point order by null

Viewing time: it took 1068 ms, saving about 100 ms; the effect was not particularly significant.

View the execution plan: the Extra column no longer shows Using filesort. Using filesort means MySQL must sort the returned data; it appears when the result cannot be returned in index order. Using temporary means the query uses a temporary table, which MySQL creates for sorting, grouping, and multi-table joins; it signals low query efficiency and a need for optimization. Here the temporary table is caused by grouping over a large amount of data.

Optimize slow SQL two

The business logic of this slow SQL is to count records per time segment within a condition range. For example, given the range 2018-10-20 to 2018-10-27, the SQL counts the daily data increments between those dates. The optimization: query one day at a time, seven queries in total, removing the grouping operation entirely.

select FROM_UNIXTIME(copyright_apply_time/1000,'%Y-%m-%d') point,count(1) nums
from resource_info where copyright_apply_time >= 1539855067355 and copyright_apply_time <= 1539941467355 

Viewing time: 38 ms in total; even seven such queries add up to far less than 1123 ms.

View the execution plan: Using temporary and Using filesort are gone from Extra. Perfect.
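The "query one day at a time" split can be sketched in application code, producing millisecond bounds like those used in the SQL above (UTC is assumed here purely for determinism; the function name is illustrative):

```python
from datetime import datetime, timedelta, timezone

def day_ranges(start: datetime, days: int = 7):
    """Split [start, start + days) into per-day millisecond bounds,
    one (begin_ms, end_ms) pair per daily count(*) query."""
    ranges = []
    for i in range(days):
        a = start + timedelta(days=i)
        b = a + timedelta(days=1)
        ranges.append((int(a.timestamp() * 1000), int(b.timestamp() * 1000)))
    return ranges

bounds = day_ranges(datetime(2018, 10, 20, tzinfo=timezone.utc))
```

Each pair feeds one simple range query with no group by, which is what lets the temporary table and filesort disappear.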

And so, for the first time, I experienced a real slow query and slow-query optimization; theory and practice finally came together.