How to define cost

  • IO cost: When we want to query records in a table, we need to load the data or index into memory before we operate. The time taken to load from disk to memory is called I/O cost.
  • CPU cost: The time spent reading and detecting whether records meet the corresponding search criteria, sorting the result set, and so on is called the CPU cost.

For the InnoDB storage engine, a page is the basic unit of interaction between disk and memory. The MySQL designer sets the default cost of reading a page at 1.0, and the default cost of reading and detecting whether a record matches the search criteria at 0.2.

We named 1.0 and 0.2 as cost constants

CREATE TABLE single_table (
    id INT NOT NULL AUTO_INCREMENT,
    key1 VARCHAR(100),
    key2 INT,
    key3 VARCHAR(100),
    key_part1 VARCHAR(100),
    key_part2 VARCHAR(100),
    key_part3 VARCHAR(100),
    common_field VARCHAR(100),
    PRIMARY KEY (id),
    KEY idx_key1 (key1),
    UNIQUE KEY idx_key2 (key2),
    KEY idx_key3 (key3),
    KEY idx_key_part(key_part1, key_part2, key_part3)
) Engine=InnoDB CHARSET=utf8;
Copy the code

Cost-based optimization steps

  1. Based on the search criteria, find all possible indexes to use

  2. Calculate the cost of a full table scan

  3. Calculates the cost of executing a query with different indexes

  4. Compare the cost of various implementations and find the one with the lowest cost

SELECT * FROM single_table WHERE 
    key1 IN ('a'.'b'.'c') AND 
    key2 > 10 AND key2 < 1000 AND 
    key3 > key2 AND 
    key_part1 LIKE '%hello%' AND
    common_field = '123';
Copy the code

1. Locate all possible indexes based on the search criteria

For B+ tree indexes, as long as the index column and constant use =, <=>, IN, NOT IN, IS NULL, IS NOT NULL, >, <, >=, <=, BETWEEN,! The = (or <>) and LIKE operators are concatenated to produce a so-called range (LIKE matches a string prefix). This means that these search criteria can be indexed, and if so, they are written to the Possible Keys.

Obviously, key_part1 is compared to a string beginning with a wildcard using the LIKE operator. Indexes cannot be applied. Possible keys include IDx_key1 and IDx_key2.

2. Calculate the cost of full table scan

Full table index: Full table scan means that all the records in the cluster index (B+ tree can traverse the leaves directly) are compared with the given search criteria, and the records that match the search criteria are added to the result set

So the cost is the number of IO + the number of comparisons (the number of records)

How do you know that

mysql> SHOW TABLE STATUS LIKE 'single_table'\G
Copy the code

3. Calculate the cost of executing queries using different indexes

Take, for example, the cost analysis of performing a query using IDX_KEY2

Backtable: The essence of the backtable operation is to find the complete user record in the cluster index by using the primary key value of the secondary index record

For queries using the secondary index + back table approach, the cost depends

  • How many pages does the range cover?
  • Number of records to be returned to the table
  1. Select * from idX_key2; select * from idX_key2; select * from IDX_key2; select * from IDX_key2;
  2. Then according to the key2 < 1000 this condition continues from idX_key2 corresponding B+ tree index to find the last record to meet this condition, we call this record interval rightmost record
  3. If the left-most record AND right-most record are not too far apart (in MySQL 5.7.21, as long as they are no more than 10 pages apart), you can accurately count the number of secondary index records that meet the key2 > 10 AND key2 < 1000 conditions. Otherwise, just read 10 pages to the right along the left-most record of the interval, calculate the average number of records per page, and then multiply that average by the number of pages between the left-most record and the right-most record of the interval.
  • Select * from table where key = primary key; select * from table where key = primary key; select * from table where key = primary;
  • Return the complete user record obtained after the table operation, and then check whether the other search criteria are valid

At the end of the article, please include the following text and link: This article is participating in the “Gold Nugget Booklet free learning!” Event, click to view the event details