10 minutes to understand how MySQL uses indexes

One, foreword

When doing SQL optimization in MySQL, there are often situations where there is some confusion about whether MySQL can take advantage of indexes.

Such as:

MySQL stopped matching range queries when it encountered a range query condition.
How does MySQL use indexes in fuzzy matching LIKE?
Under what circumstances can MySQL use indexes to sort?

Today, I’m going to use a model to answer all of these questions so you don’t have to worry about using MySQL indexes.

Second, knowledge supplement

key_len

A key_len column in the EXPLAIN execution plan is used to indicate the length of the selected index in bytes for this query, which is usually used to determine how many columns of the federated index were selected.

Here the key_len size is computed as follows:

Generally, key_len is equal to the length of the index column type in bytes, for example, int is 4 bytes and bigint is 8 bytes.
If the value is a string, consider the character set. For example, CHAR(30) UTF8. Key_len must be at least 90 bytes.
If the column type definition allows NULL, its key_len needs an additional bytes.
If the column type is variable length, such as VARCHAR (TEXTBLOB does not allow whole columns to be indexed, and creating partial indexes is also considered a dynamic column type), its key_len needs an additional 2 bytes.

What criteria can be used for indexes

First of all, THANK You very much to Dembo, who gave me a good inspiration. I made this picture based on his article _ and my own understanding:

At first glance, is not very dizzy, not urgent, we take a look slowly

The figure is divided into three parts:

Index Key: the MySQL Index is used to determine the range of data to be scanned.
Index Filter: MySQL determines which data can be indexed. After ICP is enabled, the Index can be used.
Table Filter: the MySQL database cannot use indexes to Filter data. After retrieving row data from the Table, the MySQL database filters data from the server layer.

Now let’s expand it out.

Index Key

Index Key is used to determine a scan range for MySQL, divided into upper and lower boundaries.

MySQL > select first key from where (‘ = ‘, ‘>’); MySQL > select first key from where (‘ = ‘, ‘>’); If it exists and is (>), the key value is added to the lower boundary to stop matching the next index key. If not, stop matching the lower boundary.

_exp: Idx_c1_c2_c3 (c1,c2,c3) where c1>=1 and c2>2 and c3=1 and c2>2 and c1 =1 –> c1 Add lower bounds to stop matching _

The last key is similar to the first key. First, check whether it is one of (=,<=). If yes, add delimit and continue with the next key value matching

_exp: Idx_c1_c2_c3 (c1,c2,c3) where c1<=1 and c2=2 and c3<3 –> last key (c1,c2,c3) Continue to match the next –> c2 for ‘=’ and join the upper bound, continue to match the next –> c3 for ‘<‘ and join the upper bound and stop matching _

Note: The simple memory here is that if the comparison symbol contains the ‘=’ sign, ‘>=’ also contains the ‘=’, then the index key can be used to continue matching the following index key value; If there is no ‘=’, that is,’ >’,'<‘, the following key will not match. At the same time, upper and lower boundaries can not be mixed, which boundary can use the index of the key value is the number of final use index key value.

Index Filter

You can filter by index. That is, the part of the field in the Index Key value that cannot be used to determine the Index Key.

_exp:

idex_c1_c2_c3

where c1>=1 and c2<=2 and c3 =1

index key –> c1

index filter–> c2 c3_

Why is the index key just c1? C2 is used to determine the upper boundary, but c1 does not appear on the upper boundary (<=,=), while c1 is >= on the lower boundary,c2 does not appear, so the index key only has c1 field. C2 and C3 both appear in the index and are treated as index filters.

Table Filter

If the index cannot be used for filtering, only table filter can be used. At this point, the engine layer will return the row data to the Server layer, and then the Server layer performs the Table filter.

4. Handling Between and Like

If there is a “between” and “like” in the query, how does MySQL handle it?

Between

Select * from ‘a’ where c1>=’a’ and ‘b’ where c1>=’a’ and c1 <=’b

Like

Select * from c1 where c1 like ‘%a’; select * from c1 where c1 like ‘%a’; select * from c1 where c1 like ‘%a’

Where c1 =’a %’ and c1<‘b’ is the same as where c1>=’a’ and c1<‘b’.

5. Sort the index

In a database, if the index cannot be used to complete the sorting, the cost of sorting will increase with the increase of the data volume of filtered data. Even if limit is adopted, the database will choose to sort all the result sets and then take the sorted limit records. MySQL has also been optimized for limits that can be sorted using indexes to reduce costs.

Make sure it uses index It is very important to have ORDER BY with LIMIT executed without scanning and sorting full result set, So it is important for it to use index — in this case index range scan will be started and query execution stopped as soon as soon as required amount of rows generated.

CREATE TABLE `t1` ( `id` int(11) NOT NULL AUTO_INCREMENT, `c1` int(11) NOT NULL DEFAULT '0', `c2` int(11) NOT NULL DEFAULT '0', `c3` int(11) NOT NULL DEFAULT '0', `c4` int(11) NOT NULL DEFAULT '0', `c5` int(11) NOT NULL DEFAULT '0', PRIMARY KEY (`id`), KEY `idx_c1_c2_c3` (`c1`,`c2`,`c3`) ) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8mb4 select * from t1; +----+----+----+----+----+----+ | id | c1 | c2 | c3 | c4 | c5 | +----+----+----+----+----+----+ | 1 | 3 | 3 | 2 | 0 | 0 | | 2 | 2 | | | | 0 5 4 0 | | | | 2 | 3 | 3 | 4 0 0 | | | 1 | 3 | 4 2 | | 0 0 | | | 1 | 3 | 3 | 0 5 | 0 | | 2 | 3 | | 6 5 | | 0 0 | | 7 | 3 | | | | | 0 0 6 + 2 - + - + - + - + - + - + 7 rows in the set (0.00 SEC) select c1, c2 and c3 from t1; +----+----+----+ | c1 | c2 | c3 | +----+----+----+ | 1 | 3 | 2 | | 1 | 3 | 3 | | 2 | 3 | 5 | | 2 | 4 | 5 | | 3 | 2 | 4 | 6 | | 2 | 3 | | 2 | 3 | 3 | + - + - + - + 7 rows in the set (0.00 SEC)Copy the code

Select c1,c2,c3 from T1; Select c1,c2,c3 from T1 order by c1,c2,c3; select c1,c2,c3 from T1 order by c1,c2,c3; Results.

So what are the order rules for indexes?

C1 =3 — > C2 ordered, c3 disordered

C1 =3,c2=2 — > c3 ordered

C1 in(1,2) — > c2 disordered, c3 disordered

There is a small rule, idx_C1_C2_C3, so how do you determine that a field is ordered? C1 is first in the index, and c2 is second. C2 is ordered only if C1 uniquely determines a value. If C1 has multiple values, then C2 is not necessarily ordered, and so is C3

Six, the summary

As for MySQL indexes, I have only mentioned the model in the case of single table queries. From this article, you should get an idea of how MySQL uses indexes in most cases.

Source: www.fordba.com/spend-10-mi…

10 minutes to understand how MySQL uses indexes

One, foreword

Second, knowledge supplement

What criteria can be used for indexes

4. Handling Between and Like

5. Sort the index

Six, the summary

Related Posts

Leetcode-63- Different paths II

Skywalking uses ClickHouse to store the action

Sword finger Offer series is coming! | 01. Find out the repeat of the array of Numbers