Moment For Technology

Type column and extra column in MySQL execution plan

Posted on Aug. 9, 2023, 5:24 a.m. by 李冠宇
Category: The back-end Tag: The back-end mysql

A, read the type

The type of execution plan represents the access data type, and there are many access types.

1. System indicates that this step returns only one row of data. If the execution object of this step is a driver table or primary table, the driven table or subquery is accessed only once.

2, const

Indicates that this execution step returns at most one row of data. Const usually occurs in equivalent queries on primary keys or unique indexes, such as the primary key ID of table T:

3, eq_ref

The eQ_REF type generally means that the associated column on the associated table moves the primary key or unique index when the table is associated. For example, table Jiang associates lock_test with the primary key columns of the two tables:

When SQL is executed above, jiang is the driver table, lock_test is the driven table, and the associated column of the driven table is the primary key ID and type is eq_ref.

Therefore, an important feature of the EQ_REF type is that the tables involved in this step are driven tables; This step uses a unique index or primary key. After system and const, this is the most effective type of association.

4, ref

In contrast, if type is ref for a step in the execution plan, the associated column for that step is not a unique index. Alter table jiang alter table lock_test alter table JIANG alter table LOCK_test alter table JIANG alter table JIANG alter table LOCK_test

When the above SQL is executed, table Jiang is the driven table, lock_test is the driven table, and the driven table runs on a non-unique index. Type is ref.

So ref has the following characteristics: indicates that the index used in this step to access the data is non-unique index.

Ref_or_null For example:

Indicates that an index is crossed (the num column has an index), but a null value is also accessed.

6, index_merge

Represents an index merge, which occurs when an OR operation is performed on multiple secondary index columns. Mysql explain select * from lock_test where id=3 or num=4;

Id is the primary key, num has a normal index, and the statement is executed using two single-column indexes to process the OR operation.

7. Unique_subquery indicates a unique subquery. For example: value in(select primary_key from single_table where...) For the IN clause, when the subquery in the IN clause returns the primary key of a table, type is displayed as unique subQuery.

Index_subquery Value in(select key_column from single_table where...) Similar to the above, for the IN clause, type is index_subQuery when the subquery in the IN clause returns the secondary index column (not the primary key column) of a table.

Range: Fetch a portion of data from an indexed column. It is common to perform between and on index columns.

10. Index: Full scan of an index. This occurs when an index is overwritten, that is, a full scan of an indexed column is performed.

11. All: full table scan without index. A special case:

Explain select * from stu limit 1,1; Reading extra

1, using where: general meaning of two: through the index access, need to go back to the table to access the required data; Filtering conditions occur in the server layer rather than the storage engine layer; If the execution plan shows an index missing, but rows is high and extra shows using WHERE, then the execution will not work well. Because the cost of index access is mostly on the back table, overwrite indexes can be used for optimization. Overwriting indexes also allows filtering conditions to be pushed down to perform filtering operations at the storage engine layer, which works best. Therefore, overwriting an index is the most effective way to solve using WHERE.

2. Using index condition indicates that the filter is pressed down to the storage layer to prevent the server layer from filtering too much data.

3, using temporary indicates that a temporary table is used during statement execution. Temporary tables may be used: Data such as order by group by DISTINCT Union cannot be returned to the user directly, so the data needs to be cached in the user workspace as a temporary table. Note that temporary disk tables may appear, and you need to focus on rows for data that needs to be cached. You can use indexes to eliminate temporary tables corresponding to the above four operations.

Using sort_union(indexs)

Using sort_UNION (i_sname,i_spone) is displayed in the plan extra column, indicating index merge. Often accompanied by index_merge.

5, using MRR: generally through the secondary index access table data process is: first access secondary index column, find the corresponding secondary index data to get the corresponding primary key value, and then take the primary key value to access the table, take out the row data. The retrieved data is sorted by secondary index. MRR stands for: after obtaining the corresponding primary key through the secondary index, the table is not directly accessed, but stored first. After obtaining all the primary key values, the primary key values are sorted, and then the table is accessed. This can significantly reduce the number of table accesses, or at least sequential table accesses. One of the advantages of MRR is to improve the efficiency of index access to the table, that is, to reduce the cost of table back. But there is a big problem: the retrieved data is not sorted by secondary indexes.

Using join buffer(Block Nested Loop) BNL occurs when two tables are associated and there is no index on the associated table. BNL stands for: A is associated with B, and A's associated column has an index, but B's does not. In this case, 10 rows of data will be taken from table A and put into the user's Join buffer space, and then the data on B will be associated with the associated column of A in the Join buffer. In this case, only one visit is required for table B, that is, one full table scan is performed for table B. If 10 rows in the join buffer are associated, 10 more rows will be associated with table B until all rows in table A are associated. As you can see from the above, this approach improves efficiency by about 90%.

Using join buffer(Batched Key Access) When a table is associated, the driven table has an index, but the number of rows returned by the drive table is too large. When this happens, the return result set of the driven table is put into the join Buffer of the user workspace, and a record of the result set is taken to associate the index associated columns of the driven table. After obtaining the corresponding primary key column, data is not immediately fetched from the driven table through the primary key column, but is first stored in the workspace. After all the data in the result set is associated, sort all the primary key columns in the workspace by association, and then uniformly access the driven table to fetch data from it. This has the advantage of greatly reducing the number of visits. As can be seen from the above: BKA uses MRR technology; The BKA is suitable for the case where the driven table returns a large number of rows and the driven table accesses through indexes. Set optimizer_switch= 'MRR =on,batched_key_access=on';

8, using index for group by indicates that group by is completed by a composite index. For example, compound index (a,b), execute statement: select a from TB group by b; Using index for group by.

9, using the index

Indicates that an overwrite index scan is implemented. That is, the data that needs to be accessed is in the index and does not need to be returned to the table. In general, reducing unnecessary data access can improve efficiency.

Select * from lock_test where num = 1 and num = 1;

Use filesort to specify a sort behavior, but not necessarily a disk sort.

11. Materialize scan Is a full scan of a materialized table because the materialized table is a temporary table with no indexes.

Stack is a cloud native - site-based data central-platform PaaS, and we have an interesting open source project on Github and Gitee: FlinkX is a batch data synchronization tool based on Flink. It can collect static data and real-time data. It is a data synchronization engine with all-domain, heterogeneous and batch integration. If you like, please give us a star! Star! Star!

Github open Source project:

Gitee Open Source project:

About (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.