MySQL knowledge comb graph, a graph to see the complete article:

The last article briefly introduced the MySQL implementation process, this article would like to introduce the index in detail, always want to fully understand the index, try to write a blog about it.

1. What is the index?

What’s the index? I checked the official documents. The official document writes the function of the index and no index will bring full table scan, very time consuming. Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows. Simply put, indexing improves query speed. This is very easy to understand, just like the English dictionary before, if there is no front table of words, the efficiency is very low, you have to look for the whole text.

2. Index implementation principle

To understand the implementation principle of index, first look at the underlying implementation of index, MySQL index mostly uses b-tree implementation, b-tree and B+ Tree. Others use Hash indexes. This document describes the B-tree (Balance Tree).

2.1 Binary search tree

Before we talk about b-trees, let’s take a look at Binary Search Trees.

2.2 B – tree

A balanced binary Tree is an empty Tree or the absolute value of the height difference between the left and right subtrees is not more than 1, and the left and right subtrees are both a balanced binary Tree. Multipath means that there are at least 2 non-leaf nodes in the child. B-tree features are also very brain-burning, check the introduction to the algorithm book, is also pondering for a long time. The figure below is an image from the introduction to the algorithm book. The light-shaded part is the node checked when searching for the letter R.

  1. Each node X has the following attributes:
    1. X.n. It represents the number of keywords stored in x;
    2. x.key1,x.key2,… , x.k eyn. They represent n keywords of x, stored in non-descending order, i.e. X.key1 ≤x.key2≤… Or less x.k eyn;
    3. X.l eaf. It’s a Boolean, TRUE if x is a leaf; Otherwise FALSE;
  2. x.c1,x.c2,… , x.c n + 1. They are Pointers to their children. If the node is a leaf node, it does not have these attributes.
  3. The keyword x. keyyi divides the keyword range stored in each subtree, that is, k1≤ X. key1≤ K2 ≤x.key2≤… Or less x.k eyn kn + 1 or less. Among them, ki (I = 1, 2,… ,n+1) represents any keyword stored in the subtree root x.ci.
  4. Each leaf node has the same depth, that is, the height of the leaf h.
  5. There is an upper and lower bound on the number of keys each node contains. These bounds are represented by a fixed integer t(t≥2) called the minimum degree:
    1. Lower bound: Every node except the root node must have at least T −1 keywords. Therefore, every internal node except the root node has at least T children.
    2. Upper bound: Each node contains at most 2T −1 keywords. Therefore, an internal node may have at most 2T children. A node is said to be full if it has exactly 2T −1 keywords.

If you still don’t know what to do, let’s explain each of these features according to the picture above.

The first point refers to the information contained in each node: N represents the number of keywords stored in the node. For example, in the figure above, the left child of M stores two keywords, D and H. X. Casey refers to specific keyword information, such as D. D is actually composed of two parts, which can be understood as a map, {key: data}. X. Casey refers to the map in a broad sense, including the specific key and the stored data, which is usually a record. X.leaf means whether the entire node is a leaf.

The second point indicates that if it is not a leaf node, each node has one property, which is a pointer to its n children. For example, the DH node in the figure above, which has three children, has three Pointers to its own children.

The third point indicates that the keywords of each node are arranged in order from small to large. At the same time, each node also meets the characteristics of the binary search tree mentioned above: the value of the left child < the value of the parent node < the value of the right child.

The fourth point is that each leaf node has the same height, which is also the origin of the word balance.

5. Limit the number of keywords per node. It is impossible for each node to store unlimited keywords. T is the minimum degree. To understand this, you can Google the definition of degree and order. The figure above is a B-tree of order 4. In the figure above, t=2, then each internal node can have 2, 3, 4 children. Range of children [t, 2t], keyword range of each node [T-1, 2T-1]. This is a distinction.

Let’s visualize the 4-order B-tree.

Due to the characteristics of B-tree, the algorithm for retrieving data by key in B-tree is very efficient. First, binary search is performed from the root node, and data of the corresponding node is returned if it is found. Otherwise, recursive search is performed on the node pointed to by the pointer of the corresponding interval until the node is found or the null pointer is found.

2.3 B + tree

Finally, we’re done with the B-tree, and the B-plus tree is actually a b-tree variant. The biggest difference from b-tree is that internal nodes do not store data, only keys. The diagram below:

The B+ tree used in the general database system is optimized on the basis of the classical above, and becomes a B+ tree with sequential access Pointers, as shown in the following figure. In this way, the performance of interval access is improved. For example, if you want to query all data records with keys from 18 to 49, when 18 is found, you can access all data nodes at one time by traversing the sequence of nodes and Pointers, which greatly improves the interval query efficiency.

3. Why is b-tree (B+) used to implement database index

3.1 Disk Access Principle

A B-tree is a balanced search tree designed for disks or other direct-access secondary storage devices. Secondary storage devices mentioned above, let’s take a look at the principle, what is the origin? Computer systems have main memory and disk-based secondary memory, and main memory is usually what we call RAM, or memory, which we won’t expand here. Index files themselves are large and usually do not exist in memory, so indexes are often stored on disk as files, so index retrieval requires disk I/O operations. The following figure shows a typical disk drive.

In order to shorten the disk read time, the computer has made some optimization: disk preread. Disk prefetch is based on locality: when a piece of data is used, nearby data is usually used immediately. Therefore, disk I/O operation not only the current disk address of the data, but also adjacent data are read into the memory buffer, because the principle of locality tells us that when the computer access to the data of an address, its adjacent data will also be accessed quickly.

The prefetch length is an integer multiple of a page. A page is a logical block of computer-managed memory. Hardware and operating systems often divide the main memory and disk storage areas into contiguous, equal-sized blocks. Each block is called a page (in many operating systems, the page size is usually 4k).

Having said that, to sum up:

  • The files are too large to be stored in memory, so they are stored on disk.
  • Indexes should be structured to minimize the number of disk I/O accesses during lookups, since each disk I/O takes a lot of time.
  • Locality principle and disk prefetch. The length of prefetch is an integer multiple of a page.

3.2 Search performance of B-/B+

The database system takes advantage of disk prefetch by setting the size of a node to be equal to one page, so that each node can be fully loaded with only one I/O. B-trees also take advantage of this. Each time a new node is created, a page of space is requested directly. This ensures that a node is physically stored on a page, and computer storage allocation is page-aligned, enabling a single disk I/O to read a page of data. Here is an example of a B-tree:

A direct view of the B+ tree is attached below:

dmax=floor(pagesize/(keysize+datasize+pointsize))

In general, 3 layer B+ trees can store millions of data, that is, read millions of data, only need 3 disk I/ OS, so this efficiency is greatly improved. Without indexes, each query for a data item would require one I/O, millions of times.

4. Index implementation principle of different engines

4.1 MyISAM index implementation

The index of MyISAM is realized by B+ tree. The index and data of MyISAM are separated, and the leaf node data accesses the address of data. An example diagram of a primary key index is as follows:

As can be seen from the figure, to find data according to the index, first find the leaf node according to the index, then find the data address according to the leaf node, and then take out the data according to the data address.

MyIASM’s secondary index is implemented in the same way as the primary key index, as shown below:

I’m going to separate it from InnoDB later.

4.2 InnoDB index implementation

InnoDB, in the actual project contact is very much, index implementation is also using B+ tree, but the implementation principle is different from MyISAM. The first difference is that InnoDB’s data files are themselves index files. The MyISAM index file and the data file are separate, the index file only holds the address of the data record. In InnoDB, the table data file itself is an index structure organized by B+Tree, and the data field of the leaf node of this Tree holds complete data records. The key of this index is the primary key of the table, so the InnoDB table data file itself is the primary index. The diagram below:

This kind of index is called a clustered index. InnoDB requires tables to have primary keys (MyISAM does not have primary keys) because InnoDB data files themselves are aggregated by primary keys. If this is not explicitly specified, the MySQL system automatically selects a column that uniquely identifies the data record as the primary key. MySQL automatically generates an implied field for the InnoDB table as the primary key. This field is 6 bytes long and of type long integer.

The second difference is that InnoDB’s secondary index data field stores the value of the corresponding primary key rather than the address. In other words, all InnoDB secondary indexes refer to the primary key as the data field. As shown below:

4.3 significance

Understanding the realization principle of different engines is very helpful for our daily database design.

  1. InnoDB auxiliary index search need to retrieve two times index: first retrieve the auxiliary index for the primary key, and then use the primary key to the main index to retrieve the records, so that they can understand why not suggest using long field as the primary key, because all the auxiliary index reference primary index, long primary index will make auxiliary index becomes too large.
  2. It is not recommended to use non-monotonic fields as the primary key of InnoDB, because InnoDB data file itself is a B+Tree. The non-monotonic primary key will cause the data file to be frequently split and adjusted to maintain the B+Tree feature when inserting new records, which is very inefficient. Therefore, auto-increment fields are generally used as the primary key.

I will stop here for this article and summarize optimization strategies for indexes in the next one.

Reference:

1. http://blog.codinglabs.org/articles/theory-of-mysql-index.html 2. Introduction to Algorithms (3rd edition)Copy the code

For more exciting articles, please pay attention to the public account: “Discussion on Tiancheng Technology”