First, prepare 4 versions of the database, 5.5/5.6/5.7/8.0

Then, each library has exactly the same table, the same amount of data, the same index, InnoDB engine.

Take a look at the index statistics in each version (Cardinality represents how many different values the column has in the total data, which is the estimated value)

  1. MySQL5.5

  2. MySQL5.6

  3. MySQL5.7

  4. MySQL8.0



    The comparison shows that the amount of statistical data is basically similar.

<, <,>=,<= does not necessarily go to the index, because the optimizer will calculate the ratio of the sum of the eligible values to the total amount of data, the ratio <=0.30, will go to the index.

The official website reads as follows: Each table index is queried, and the best index is used unless the optimizer believes that it is more efficient to use a table scan. At one time, a scan was used based on whether the best index spanned more than 30% of the table, but a fixed percentage no longer determines the choice between using an index or a scan. The optimizer now is more complex and bases its estimate on additional factors such as table size, number of rows, and I/O block size.

Query each table index and use the best index unless the optimizer decides that a table scan is more efficient.
A single use scan is based on whether the ** best index ** spans more than 30% ** of the table, but a fixed percentage no longer determines whether to use an index or scan. Now, the optimizer is more complex and estimates are based on ** other factors **, ** such as table size, number of rows, and I/O block size **.

Note: There are several important points in this passage

  1. If multiple indexes are involved in a WHERE condition, MySQL selects the best index. The best index is the most selective index because it filters out many useless rows.
  2. 0.30 ratio value. In the past, the index may be >0.30, but the lowest version of the document is 5.6 has not been verified. Now it depends not only on the ratio value, but also on other factors. So, use > or >= or < or <= to test the following.

Next, get ready to dry.

  1. In MySQL5.5, through constant testing, the key number 8 was found.

    explain select user_name,user_age from user1 where user_age <= 8;
    explain select user_id,user_name,user_age from user1 where user_age < 8;
    Copy the code



    In the second figure, although “Extra” does not explicitly specify Using index, type is displayed as a range query, and the number of rows scanned is 35042. However, in the first figure, type:ALL and rows:210463 are consistent with the primary key ID data amount in the full table statistics at the beginning of the article. The primary key id is the number of rows in the full table.

    The difference only lies in whether the value 8 is included or not, so let’s count the amount of data <8 and =8. (Note that the value 8 is a trial, so it is impossible to find it all at once.)



    User_age < 8 = 35042/210463 = 0.1664

    User_age <= 8 = (35042 + 2221)/210463 = 0.1770 (without indexing)

    The optimizer considers full table scanning to be faster, so it does not move the index.

    Test greater than (89 is also tested repeatedly)

    explain select user_name,user_age from user1 where user_age >= 89;
    explain select user_id,user_name,user_age from user1 where user_age > 89;
    Copy the code



    Again, count the amount of data >89 and =89



    User_age > 89:35894/210463 = 0.1705

    User_age >= 89 ratio value :(35894 + 2088)/210463 = 0.1804

    User_age < 8 and user_age > 89 show that for version 5.5, the user_age index column has a ratio between 0.1705 and 0.1770.

    In special case 1, FORCE INDEX can be used to FORCE the INDEX to be removed regardless of the ratio value

    explain select user_name,user_age from user1 force index(idx_user_age) where user_age < 50;
    explain select user_name,user_age from user1 force index(idx_user_age) where user_age > 20;
    Copy the code



    Special case 2: When only indexed columns are returned, the index will be moved regardless of the ratio value

    explain select user_age from user1 where user_age < 80;
    Copy the code

  2. MySQL5.6 only tests the case of < (pass the test key value is also 8)

    explain select user_name,user_age from user1 where user_age <= 8;
    explain select user_id,user_name,user_age from user1 where user_age < 8;
    Copy the code



    Also calculate the scale value:

    User_age < 8:35042/208534 = 0.1680

    User_age <= 8 :(35042+2221)/208534 = 0.1786

    It can be roughly inferred that under version 5.6, the ratio is between 0.1680 and 0.1786.

    In the second image, the using index condition (ICP) is optimized by MySQL5.6.It can be compared to version 5.5, as shown in ExtraUsing where, indicating that data is ultimately filtered at the MySQL service layer, as shown in version 5.6Using index conditionICP can reduce the number of times the storage engine has to access the base table and the number of times the MySQL server has to access the storage engine.

    The official website is as follows:Dev.mysql.com/doc/refman/…

    Similarly, if FORCE INDEX is used, it will definitely go through the INDEX, only the INDEX column will be returned, regardless of the scale value, the INDEX will go through the same, there is no map

  3. MySQL5.7 tests only the < case (testing to 20 is critical)

    explain select user_name,user_age from user1 where user_age <= 20;
    explain select user_id,user_name,user_age from user1 where user_age < 20;
    Copy the code



    Calculate the ratio as follows:

    User_age < 20:87966/208303 = 0.422

    User_age <= 20 :(87966 + 2093)/208303 = 0.4323

    Similarly, if FORCE INDEX is used, it will definitely go through the INDEX, only the INDEX column will be returned, regardless of the scale value, the INDEX will go through the same, there is no map

    Special note: MRR appears here, which will not be discussed here, and will be discussed in more detail later.

  4. MySQL8.0 only tests < (test to key value 12)

    explain select user_name,user_age from user1 where user_age <= 12;
    explain select user_id,user_name,user_age from user1 where user_age < 12;
    Copy the code



    Calculate the ratio:

    User_age < 12:54798/208611 = 0.2626

    User_age <= 12 :(54798 + 2051)/208611 = 0.2725

    It can be inferred that the ratio is between 0.2626 and 0.2725

    Similarly, if FORCE INDEX is used, it will definitely go through the INDEX, only the INDEX column will be returned, regardless of the scale value, the INDEX will go through the same, there is no map

Conclusion:

  1. Instead of indexing a column, when a range query is performed on the column, the index will be removed, it will have a scale value. The ratio value varies by version, server, IO, data volume, and data duplication. That is to say, the same version, the same library table, at this time and the next time, the ratio may be different. I encountered this problem during the test.
  2. MySQL5.6 has been optimized for BOTH ICP and MRR to greatly improve performance. We’ll see examples later.
  3. FORCE INDEX. In special cases, only INDEX columns can be returned.