preface

In the early stage of application development, the amount of data is small, and developers pay more attention to the realization of functions when developing functions. With the growth of production data, many SQL statements begin to expose performance problems, and the impact on production is also more and more large. Sometimes, these problematic SQL statements may be the bottleneck of the whole system performance.

General steps for SQL optimization

1. Locate SQL statements that are inefficient by slowly checking logs

2, Explain the SQL execution plan

Focus on Type, Rows, Filtered, and extra.

Type is more and more efficient from top to bottom

  • ALL Indicates the full table scan
  • Index Indicates full index scan
  • Range Range scan, commonly used operations such as <,<=,>=,between, and in
  • Ref uses a non-unique index scan or a unique index prefix scan to return a single record, often in associated queries
  • Eq_ref is similar to ref, except that it uses a unique index and an associative query with a primary key
  • Const /system Single record. Other columns in the matching row are treated as constants, such as primary key or unique index queries
  • Null MySQL does not access any tables or indexes and returns results directly

Select * from t where a = 1 and b in (1, 2) order by c; select * from t where a = 1 and b in (1, 2) order by c; If you go IDx1, then type is range, if you go IDx2, then type is ref; If the number of rows to be scanned is more than 5 times that of IDX1, idX1 is used; otherwise, IDX2 is used

Extra

  • Using Filesort: MySQL requires an additional pass to figure out how to retrieve rows in sort order. Sorting is done by browsing all rows based on the join type and saving the sort key and pointer to all rows that match the WHERE clause. The keywords are then sorted and rows are retrieved in sorted order.
  • Using temporary: A temporary table is used to save intermediate results. The performance is very poor and needs to be optimized
  • Using index: The Coveing index is used in the select operation, avoiding access to rows of the table. Efficient! If using WHERE is present at the same time, it means that the qualified data cannot be queried directly through index lookups.
  • Using index condition: an ICP is added after MySQL5.6. Using index condtion uses an ICP to filter data in the storage engine layer, rather than in the service layer.

3. Show profile analysis

Know the state and elapsed time of the thread executing the SQL. This is off by default, enabling the statement “Set profiling = 1;”

SHOW PROFILES ;
SHOW PROFILE FOR QUERY  #{id};
Copy the code

4, the trace

Trace analyzes how the optimizer selects execution plans, and the trace file provides further insight into why the coupon chooses execution plan A over execution plan B.

set optimizer_trace="enabled=on";
set optimizer_trace_max_mem_size=1000000;
select * from information_schema.optimizer_trace;
Copy the code

5. Identify problems and take appropriate actions

  • Optimize the index
  • Optimize SQL statement: modify SQL, IN query segmentation, time query segmentation, based on the last data filtering
  • Use other implementation methods: ES, data warehouse, etc
  • Data fragmentation

Scenario analysis

Case 1. Leftmost matching

The index

KEY `idx_shopid_orderno` (`shop_id`,`order_no`)
Copy the code

The SQL statement

select * from _t where orderno=''
Copy the code

The query matches from left to right. To use order_NO to go to the index, the query condition must carry shop_ID or index (shop_id, order_NO) in reverse order

Case 2: Implicit conversion

The index

KEY `idx_mobile` (`mobile`)
Copy the code

The SQL statement

select * from _user where mobile=12345678901
Copy the code

An implicit conversion is equivalent to an operation on an index that invalidates it. Mobile is a character type and uses a number. You should use string matching, otherwise MySQL will use implicit substitution and invalidate the index.

Case 3: Large paging

The index

KEY `idx_a_b_c` (`a`, `b`, `c`)
Copy the code

The SQL statement

select * from _t where a = 1 and b = 2 order by c desc limit 10000, 10;
Copy the code

If there is no optimization, there are two optimization methods as follows: one is to pass in the last data, namely c above, and then do “C < XXX” processing. However, such optimization generally requires changing the interface protocol, which is not always feasible. The other option is to use deferred association to reduce SQL back to the table, but remember that the index needs to be completely overwritten to be effective. SQL changes are as follows

select t1.* from _t t1, (select id from _t where a = 1 and b = 2 order by c desc limit 10000, 10) t2 where t1.id = t2.id;
Copy the code

例4, In order by

The index

KEY `idx_shopid_status_created` (`shop_id`, `order_status`, `created_at`)
Copy the code

The SQL statement

select * from _order where shop_id = 1 and order_status in (1, 2, 3) order by created_at desc limit 10
Copy the code

In query is searched by n* M at the bottom of MySQL, which is similar to union but more efficient than union. When the cost of in query is calculated (cost = number of tuples * average value of IO), the value contained in is queried one by one to obtain the number of tuples. Therefore, the calculation process is relatively slow, so MySQL sets a critical value (eq_range_index_dive_limit). After 5.6, when the critical value is exceeded, the cost of this column will not be involved in the calculation. Therefore, the execution plan selection may be inaccurate. The default value is 200, that is, if the in condition exceeds 200, there will be problems in the calculation of the in cost, and Mysql may select inaccurate indexes.

Processing, you can (order_status, created_AT) swap the order and adjust the SQL for deferred association.

Case 5. Range query is blocked, and subsequent fields cannot go to the index

The index

KEY `idx_shopid_created_status` (`shop_id`, `created_at`, `order_status`)
Copy the code

The SQL statement

select * from _order where shop_id = 1 and created_at > '2021-01-01 00:00:00' and order_status = 10
Copy the code

Range query also has “IN”, “between”

Case 6, does not equal, does not contain a quick search that cannot use an index. (You can use ICP)

Select * from _order where shop_id=1 and order_status not in (1,2) select * from _order where shop_id=1 and order_status ! = 1Copy the code

On indexes, avoid NOT,! =, <>! <,! >, NOT EXISTS, NOT IN, NOT LIKE, etc

Case 7, when the optimizer chooses not to use indexes

If the amount of data required to access is small, the optimizer will still choose the secondary index, but when a large portion of the data in the table is accessed (typically around 20%), the optimizer will choose to find the data by clustering the index.

select * from _order where  order_status = 1
Copy the code

Query all outstanding orders, generally such orders are rare, even if the index is built, can not use the index.

Case 8. Complex query

select sum(amt) from _t where a = 1 and b in (1, 2, 3) and c > '2020-01-01';
select * from _t where a = 1 and b in (1, 2, 3) and c > '2020-01-01' limit 10;
Copy the code

If it is the statistics of some data, may use the number of warehouse to solve; If the business has such a complex query, it may not be recommended to continue to use SQL, but to use other ways to solve the problem, such as using ES.

Use asC and DESC interchangeably

select * from _t where a=1 order by b desc, c asc
Copy the code

The combination of desc and ASC invalidates the index

Case 10 big data

For the data store of push service, there may be a large amount of data. If the solution is selected, it is stored in MySQL and the validity period is saved for 7 days. Therefore, it should be noted that frequent data cleaning will generate data fragmentation, and you need to contact the DBA for data fragmentation processing.

data

  • MySQL: Database Development, Optimization and Management maintenance (Tang Hanming/Zhai Zhenxing/GUAN Baojun/Wang Hongquan)
  • MySQL Technology Insider — InnoDB Storage Engine (Jiang Chengyao)
  • Dev.mysql.com/doc/refman/…
  • Dev.mysql.com/doc/refman/…
  • www.yuque.com/docs/share/…