Discussion on the classification of MySQL index

What is an index?

An index is a data structure that assists storage engines to obtain data efficiently. Many people think of an index as a directory of data that can be quickly located by a storage engine.

Classification of indexes

We often categorize indexes in the following ways: we categorize indexes in “data structure perspective” and texts Index Classifies indexes in terms of physical storage Cluster index Secondary index (secondary index) in terms of index field characteristics Category Primary key index Unique index Common index Prefix index in terms of number of fields that make up the index Category Single column index Joint index (compound index)

Data structure view index

In practice, InnoDB is the default storage engine for MySQL when building tables. If you look at the table above, you can see that InnoDB is the default storage engine for MySQL when building tables. B+tree is the most popular index type used by storage engines in MySQL. Here’s a quick look at the differences between B+ trees and Hash and red-black trees.

B + tree and b-tree

In 1970, R.Bayer and E. McCreight put forward a balanced multi-fork Tree for external search — B-tree. B-tree is a data structure used in directory management in disk management system and index organization in database system. Data structure C language version 2 Yan Weimin B+tree is a variant of B-tree. (Oh, yes, b-tree is a b-tree. It’s not a b-minus tree.) B+tree only stores data on leaf nodes, while B-tree non-leaf nodes also store data. If you have any questions about this, you can insert data into the following connection to test it. B-tree :www.cs.usfca.edu/~galles/vis… : www.cs.usfca.edu/~galles/vis…If the number of nodes is smaller, more nodes can be queried using the same DISK I/O. In addition, the single-linked list of B+tree leaf nodes is suitable for the common range-based sequential retrieval scenario in MySQL, while B-tree cannot do this.

B+tree and red black tree

For B+tree with N leaf nodes, the search complexity is “O(logdN),d refers to degree refers to the degree of B+tree”, indicating that the maximum number of child nodes allowed by the node is D. In practical application,d value is greater than 100. Even when the data level reaches 10 million, the height of B+tree is 3-4, ensuring that the target data can be found after 3-4 disk I/ OS. As shown in the figure above, a red-black tree is a binary tree with a maximum of two child nodes, which means its search complexity is O(logN), much higher than that of a B+ tree. Therefore, a red-black tree requires more disk I/O times from the manager to retrieve target data.

B+tree Index and Hash table

Range query is a common scenario in MySQL databases, but Hash tables are not suitable for range query. Hash tables are more suitable for equivalent query. Besides, there are problems such as Hash function selection and Hash value conflict. For these reasons, B+ Tree indexes have a wider range of applications than Hash indexes.

Physical storage view index

The two common storage engines in MySQL treat indexes differently.

The index of the InnoDB

First, take a look at the indexes in the InnoDB storage engine. The indexes of InnoDB tables are divided into clustered indexes and secondary indexes according to whether the leaf nodes store complete table data. Full table data is stored in a clustered index. Indexes other than clustered indexes are called secondary indexes. Here are some examples of these two types of indexes. Create table workers (id int(11) not null auto_increment comment ’employee id ‘, name varchar(16) not null comment’ employee id ‘, Sales int(11) default null comment ‘Employee sales performance ‘, primary key (id) ) engine InnoDB AUTO_INCREMENT = 10 default charset = utf8; Insert into workers(id, name, sales) values (1, ‘gangnan ‘, 12744); Insert into workers(id, name, sales) values (3, ‘新有 ‘, 14082); Insert into workers(id, name, sales) values (7, ‘lumingfai ‘, 14738); Insert into workers(id, name, sales) values (8, ‘l ‘, 7087); Insert into workers(id, name, sales) values (11, ‘kil ‘, 8565); Insert into workers(id, name, sales) values (15, ‘cesar ‘, 8501); Insert into workers(id, name, sales) values (20, ‘e ‘, 7890); Now we create a test table (Workers) containing salesman information in our own test database, including ID (primary key),name,sales three fields, and specify the storage engine of the table as InnoDB. In this example, the cluster index of workers table is based on the field ID. To simulate accurately, we first insert the primary key ID into b+tree to get the following figure. Based on this figure, I draw the hd version. As can be seen from the figure, each leaf node of the cluster index stores a complete row of table data. One-way linked list is used to connect leaf nodes incrementally according to the ID column to facilitate sequential retrieval. InnoDB tables are required to have a clustered index. By default, a clustered index is created on a primary key field. If there is no primary key field, the first unique index of a table that is NOT NULL is created as a clustered index. InnoDB automatically generates an implicit increment ID column and creates a clustered index on this column. Next, look at secondary indexes. Index_namealter table workers add index index_name(name); Similarly, we draw the B+tree schematic diagram of secondary index Index_name. It can be seen from the diagram that the leaf node of secondary index does not store a complete row of table data, but stores the value of the column where the cluster index resides, that is, the value of the ID column in workers table. The degree of B+tree in these two diagrams is set to 3, which is also mainly for the convenience of demonstration. In a real B+tree index, the tree degree is usually greater than 100. When we talk about clustered indexes and secondary indexes, we must mention “back table query”. Since the leaf node of the secondary index does not store complete table data, when the column value of the clustered index is queried through the secondary index, it is necessary to go back to the narrow index, that is, the table data itself to obtain further data. Select * from workers where name=’ workers ‘; select * from workers where name=’ workers ‘; Select * from table where id=8; select * from table where id=8; select * from table where id=8; Inevitably, the query time increases. If all the Query fields are found in the secondary index, there is no need to return to the table. MySQL calls the secondary index overwrite or trigger “index overwrite”. Select id,name from workers where id = 1; Select * from table_name where id and name are present; select * from table_name where id and name are present; Explain select id,name from workers where name=’ l ‘; SQL > execute plan Extra Using WHERE; Using index indicates that the query triggers index overwriting of index index_name, and does a WHERE filter on the index. There is no need to return the table. Explain select id,name,sales from workers where name=’ luguicen ‘; Extra is Using Index Condition, which means that the Index is filtered conditionally. After the Index is filtered, all rows that meet the Index Condition are found, and then the rows are filtered Using other conditions in the WHERE clause. Index Condition Pushdown (ICP) is a new feature in MySQL 5.6 and above. It is an optimized way to filter data using indexes at the storage engine layer. The execution plan when ICP is enabled contains a Using Index condition flag indicating that the optimizer uses ICP to optimize data access. Check out the official documentation and technical blogs if you’re interested. When ICP is turned off, Index is only a method of data access. The storage engine obtains data from Index back tables and passes it to MySQL Server layer for WHERE filtering. Extra for Using WHERE just reminds us that MySQL will use the WHERE clause to filter the result set. This typically happens at the MySQL server, not the storage engine layer. Generally, it occurs when the index scan cannot be performed or when the index scan is performed but some query conditions are not included in the index. Index overwrite is not triggered.

The index of the MyISAM

InnoDB index, MyISAM index tables stored in MyISAM storage engine do not have clustered indexes. The structure of primary key index in MyISAM table is the same as that of non-primary key index. From the figure above, we can see that the leaf node of MyISAM table does not store table data. The node stores table data address, so MyISAM table can not have primary key. MyISAM table data and index are separate, is stored separately. The difference between a primary key index on a MyISAM table and a non-primary key index on a B+tree table is that the key on a B+tree table must conform to the primary key constraint, and the key on a non-primary key index on a B+tree table must conform to the characteristics of the corresponding field.

Index field features view index

A table can have at most one primary key index. Index column values are not allowed to be NULL. When creating a table, create a UNIQUE index together. Create table persons (id int(11) not null auto_increment comment ‘主库 iD ‘, eno int(11) comment ‘主库 iD ‘, eno int(11) comment ‘主库 iD ‘, Eid int(11) comment ‘iD ‘, veid int(11) comment’ id ‘, name varchar(16) comment ‘iD ‘, Primary key (id) comment ‘eno ‘, key (eno) comment ‘eno ‘, InnoDB auto_increment = 1000 default charset = utf8; Alter table persons add unique index index_veid (veid) comment ‘veid ‘; Run the show index from persons command; Command we see that three unique indexes have been successfully created.

Normal index

A primary key index or unique index requires a field to be a primary key or unique. An index built on a normal field is called a normal index and requires neither a primary key nor a unique field.

The prefix index

A prefix index is an index that indexes the first few characters of a character type field or the first bytes of a binary type field, rather than the entire field. For example, you can index the first 5 characters of name in persons (varchar(16)). Create index index_name on persons (name(5)) comment ‘index ‘; show index from persons; Prefix index can be established on the type of charvarcharbinaryvarbinary columns, can greatly reduce index occupy storage space, also can improve the index of the query efficiency.

Number of columns indexed by index

Create index index_id_name on workers(ID,name) comment ‘workers(id,name) comment ‘; Workers (id, name); workers (workers); Run the show index command to view detailed information about the index. The result is as follows: Although the details list two entries for a combined index, it does not mean that a combined index has multiple indexes. A combined index is an index structure, and these two entries represent the specific information of the fields in the combined index, sorted in the order in which the index was created. The non-leaf node of the combined index saves the values of two fields as the key value of B+tree. When data is inserted into B+tree, it is compared by field ID first. In case of the same ID, it is compared by name field.

Source: www.xysycx.cn/articles/20…

Some interview questions for 2020 are summarized. The interview questions are divided into 19 modules, which are: Java Basics, Containers, multithreading, reflection, object copy, Java Web, exceptions, Networking, Design patterns, Spring/Spring MVC, Spring Boot/Spring Cloud, Hibernate, MyBatis, RabbitMQ, Kafka, Zookeeper, MySQL, Redis, JVM.

Access to information above information: concern public number: programmers with stories, access to learning materials. Remember to click follow + comment oh ~