Indexes in Mysql

This is the first day of my August challenge.

Index is a kind of data structure, which can help us find the corresponding records quickly. In mysql, a good index can improve the performance of the database, especially when the amount of data is large.

What is the index?

One of the most common metaphors about indexes is to compare them to the contents of a dictionary. Indexes are a storage structure that helps mysql get data efficiently.

Index in mysql

  • According to the storage structure, it is divided into B tree index, full text index and hash index.
  • According to the physical order of data and the logical order of key values: clustered index, non-clustered index.

B-tree indexes

The B tree index includes the B+ tree index. Mysql officially defines B+ tree as an improvement over B tree. In addition, b-tree index can avoid full table scan and speed up data access. All nodes of a B tree are used to store data, and a node can hold multiple data. But when it comes to scoping, sometimes it’s not very efficient. The specific structure is as follows:

B+ tree index is an improvement of B tree index. The STRUCTURE of B+ tree in mysql is similar to skip table. The leaf node is used to save data, while the non-leaf node is used to save a pointer to the corresponding node. The specific structure is as follows:

The hash index

A hash index is an index based on a hash table that requires an exact match to be valid. A hash index can only be used for ‘=’,’IN’,'<=>’ queries and cannot be used for range queries. In addition, for the combination index, the hash index is calculated by combining the combination keys, rather than a single calculation, so only the first few digits of the combination index can be satisfied, and the hash index cannot go through. In addition, because hash indexes may have the same hash value, full table scans are unavoidable. The hash index structure is as follows:

Clustered indexes and non-clustered indexes

The data structure used by clustering index and non-clustering index is B + tree, but the clustering index is the leaf node to save the data, while the non-clustering index is the leaf node to save the address, and the data needs to be read in another file.

Innodb aggregates data by primary key. If no primary key is defined, Innodb will select a non-empty unique index instead. If there is no such index, InnoDB implicitly defines a primary key as the cluster index.

Clustering index data access to faster, because the cluster index will index and data stored in the same B + tree, so the data was obtained from the clustering index than the clustering index quickly, but the insertion speed is heavily dependent on the insertion order, in accordance with the order of the primary key insert is the fastest way, otherwise the page will be divided, seriously affect the performance. In addition, the cost of updating the primary key is relatively high, and the secondary index needs two index lookups, first to find the primary key and then to find the corresponding value.

An index created on top of a clustered index is called a non-clustered index. A non-clustered index always needs a secondary search to access data. Instead of storing the physical location of the row, the leaf node stores the primary key value. It first finds the primary key value, and then finds the data Page of the data row through the primary key value, and then finds the data row through the Page Directory in the data Page.

Conclusion: This is the end of the introduction of indexes in mysql, see you tomorrow!