Preface

A periodic purge job was found to be running slowly. The statement it runs:

mysql> delete from testtable WHERE biz_date <= '2017-08-21 00:00:00' AND status = 2 LIMIT 500;

The table has a composite index on the two columns in the WHERE clause:

KEY idx_bizdate_st (biz_date, status)
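For reference, here is a minimal table definition consistent with the statement above and with the key_len values in the execution plans later (key_len 5 for biz_date alone and 6 with status suggests a NOT NULL DATETIME plus a NOT NULL TINYINT). Only biz_date, status, and the composite index are known from the original case; every other column and name here is an assumption:

```sql
-- Hypothetical reconstruction for reproducing the case; the id column
-- and all types other than the indexed pair are assumptions.
CREATE TABLE testtable (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  biz_date DATETIME        NOT NULL,  -- business date used for purging
  status   TINYINT         NOT NULL,  -- observed values: 1, 2, 4
  PRIMARY KEY (id),
  KEY idx_bizdate_st (biz_date, status)
) ENGINE=InnoDB;
```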

Analysis

Both columns in the WHERE clause are covered by the composite index idx_bizdate_st.

Since the query can use the index, and the table is only about 200 MB in total, why is it slow?

We checked the execution plan with the LIMIT removed, and found that MySQL chose a full table scan:

mysql> desc select * from testtable WHERE biz_date <= '2017-08-21 00:00:00';
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
| id | select_type | table     | type | possible_keys  | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | testtable | ALL  | idx_bizdate_st | NULL | NULL    | NULL | 980626 | Using where |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

-- query on biz_date only -- key points: rows: 980626; type: ALL

mysql> desc select * from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
| id | select_type | table     | type | possible_keys  | key  | key_len | ref  | rows   | Extra       |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | testtable | ALL  | idx_bizdate_st | NULL | NULL    | NULL | 980632 | Using where |
+----+-------------+-----------+------+----------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)

-- query on biz_date + status -- key points: rows: 980632; type: ALL

mysql> desc select * from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2 limit 100;
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
| id | select_type | table     | type  | possible_keys  | key            | key_len | ref  | rows   | Extra                 |
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
|  1 | SIMPLE      | testtable | range | idx_bizdate_st | idx_bizdate_st | 6       | NULL | 490319 | Using index condition |
+----+-------------+-----------+-------+----------------+----------------+---------+------+--------+-----------------------+
1 row in set (0.00 sec)

-- query on biz_date + status + limit -- key points: rows: 490319; type: range

mysql> select count(*) from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.34 sec)

mysql> select count(*) from testtable WHERE biz_date <= '2017-08-21 00:00:00';
+----------+
| count(*) |
+----------+
|   970183 |
+----------+
1 row in set (0.33 sec)

mysql> select count(*) from testtable;
+----------+
| count(*) |
+----------+
|   991421 |
+----------+
1 row in set (0.19 sec)

mysql> select distinct biz_status from testtable;
+------------+
| biz_status |
+------------+
|          1 |
|          2 |
|          4 |
+------------+

Through the above query, we can find the following problems:

The estimated row count for biz_date alone is almost the same as for biz_date + status = 2: about 980,000 rows. Yet the actual query with biz_date + status = 2 returns no rows at all. In other words, MySQL estimated that an index scan would have to read roughly 980,000 rows.

So when MySQL estimated the cost from its statistics, it found that the number of index rows to be scanned covered almost the entire table, and it therefore chose a full table scan instead of the index.
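If you want to see the cost comparison behind that decision, MySQL 5.6+ can expose it through the optimizer trace. A minimal session sketch (the query is the one from this case; interpreting the trace JSON is left to the reader):

```sql
-- Enable tracing for this session, run the query, then read the trace.
SET optimizer_trace = 'enabled=on';
SELECT * FROM testtable
 WHERE biz_date <= '2017-08-21 00:00:00' AND status = 2;
-- The JSON shows the estimated range-scan vs. table-scan costs under
-- sections such as "rows_estimation" and "best_access_path".
SELECT trace FROM information_schema.OPTIMIZER_TRACE;
SET optimizer_trace = 'enabled=off';
```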

Is there something wrong with its statistics? We re-collected the table's statistics and found that the estimated row count in the execution plan did not change. Our guess is that the estimate is derived only from the first column of the composite index (to be confirmed).
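Re-collecting the statistics amounts to the following; SHOW INDEX is an optional extra step to inspect the cardinality the optimizer sees:

```sql
-- Rebuild index statistics; for InnoDB this re-samples index pages.
ANALYZE TABLE testtable;
-- Check the per-column Cardinality values the optimizer will use.
SHOW INDEX FROM testtable;
```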

What if we simply force it to use the index?

mysql> select * from testtable WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.79 sec)

mysql> select * from testtable force index(idx_bizdate_st) WHERE biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.16 sec)

Comparing the two, the forced-index query is much faster than the one that scans the whole table. But it is still slow!

So what else can we do to make what should be a fast query actually fast?

You have probably all heard the advice to put the more selective column first in a composite index, right?

Putting the more selective column first is not a universal rule for all scenarios, but in this scenario putting status first would indeed make the SQL faster.
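The reason is that status is an equality predicate: if it comes first, all status = 2 entries are contiguous in the index and the biz_date range only scans that slice. A sketch of such an index (idx_st_bizdate is a name chosen here for illustration, not from the original case):

```sql
-- Equality column first: the range on biz_date is confined to the
-- contiguous status = 2 slice of the index.
ALTER TABLE testtable ADD KEY idx_st_bizdate (status, biz_date);
```

Adding an index has its own write-amplification cost, though, which is why the article pursues a SQL rewrite instead.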

Well, can we get it to scan less of the index? MySQL uses the index to determine a scan range; if we can make that range as small as possible, the query gets faster.

The business logic is to periodically delete data older than a certain date, so logically each run only deletes one day's worth of data. We can let the SQL scan just a one-day range. In other words, we can rewrite the SQL!

mysql> select * from testtable WHERE biz_date >= '2017-08-20 00:00:00' and biz_date <= '2017-08-21 00:00:00' and status = 2;
Empty set (0.00 sec)

mysql> desc select * from testtable WHERE biz_date >= '2017-08-20 00:00:00' and biz_date <= '2017-08-21 00:00:00' and status = 2;
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
| id | select_type | table     | type  | possible_keys  | key            | key_len | ref  | rows | Extra                 |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
|  1 | SIMPLE      | testtable | range | idx_bizdate_st | idx_bizdate_st | 6       | NULL | 789  | Using index condition |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
1 row in set (0.00 sec)

"rows" drops dramatically, and the optimizer now uses the index on its own.
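Applied back to the purge job, the DELETE gets the same bounded range. The one-day window below assumes the job runs daily; a job with a different schedule would use its own lower bound:

```sql
-- Bounded-range purge: scans at most one day's slice of the index,
-- so the optimizer picks idx_bizdate_st without a hint.
DELETE FROM testtable
 WHERE biz_date >= '2017-08-20 00:00:00'
   AND biz_date <= '2017-08-21 00:00:00'
   AND status = 2
 LIMIT 500;
```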

mysql> desc select * from testtable WHERE biz_date >= '2017-08-20 00:00:00' and biz_date <= '2017-08-21 00:00:00';
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
| id | select_type | table     | type  | possible_keys  | key            | key_len | ref  | rows | Extra                 |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
|  1 | SIMPLE      | testtable | range | idx_bizdate_st | idx_bizdate_st | 5       | NULL | 1318 | Using index condition |
+----+-------------+-----------+-------+----------------+----------------+---------+------+------+-----------------------+
1 row in set (0.00 sec)

Even without the status condition, the index is still used.

Summary

I originally planned to use a hint to force it onto the index, but the execution time with the forced index was still unsatisfactory. Using business logic to optimize the SQL is one of the best and most powerful tools we have. A DBA who does not understand the business is not a good DBA...