preface

Speaking of indexes, the first reaction must be to improve query efficiency. For example, in the table of contents of a book, if you want to find a certain chapter, you will first locate it in the table of contents. If you don’t have a table of contents, you need to look through everything to find it.

Index design is very important to program performance, if too few indexes, the query performance will be affected; If there are too many indexes, add/change/delete performance will be affected.

knowledge

MySQL generally supports the following common indexes:

  • B + tree index
  • The full text indexing
  • The hash index

Today we are going to focus on B+ tree indexes and why B+ trees are used as data structures for indexes.

The B+ tree index does not find the specific row directly, but only the page where the row is searched, and DB then looks it up in memory by reading the whole page into memory.

Revisit data structures

1.1 Hash structure

If there are eight data types (3, 1, 2, 10, 9, 0, 4, and 6), create hash indexes as shown in Figure 1-1.

  • Direct query: Now to find the record 6 from 8 numbers, only need to calculate the hash value of 6, can quickly locate the record, time complexity is O(1).

  • Range queries: If you want to do range queries (data greater than 4), this index is completely useless.

1.2 Binary Tree Search tree

Binary tree is a classical data structure requiring that the left subtree be smaller than the root node and the right subtree be larger than the root node.

If there are eight data types (3, 1, 2, 10, 9, 0, 4, and 6), create a binary search tree as shown in Figure 1-2.

  • Direct query: if the key value is 6, first find root 4, 4<6, so find the right subtree of 4, find 9; 9 is greater than 6, so find the left subtree of 9; Three times in total. But if you look up sequentially, you need to look up eight times (last).

  • Range query: If you need to find data greater than 4, you simply traverse the right subtree of 4.

1.3 Balanced Binary Tree (AVL Tree)

According to the definition of binary search tree, it can be arbitrarily constructed, the same numbers can be built as shown in Figure 1-3-1. If you look for data 6, you have to look for it 5 times.

Therefore, in order to construct a binary search tree with maximum performance, it needs to be balanced, i.e., balanced binary tree.

Balanced binary tree definition: first meets the definition of binary search tree, and the maximum difference in height between the two subtrees of any node is 1.

Balanced binary tree queries are fast, but have disadvantages:

  1. The cost of maintaining the tree is very high, and it often requires multiple left or right turns to maintain balance during inserts or updates. Such asFigure 1-3-2As shown in
  2. With a large amount of data, the tree is tall and requires multiple I/O operations.
  3. When doing a range lookup, assume that the lookup >=3, find 3 first, then need to find the parent of 3, and then traverse the right subtree of the parent.

1.4 B + tree

In a B+ tree, all record nodes are stored on leaf nodes in sequence, connected by Pointers to leaf nodes. If you walk sequentially from the leftmost leaf node, you get the order of all the keys.

If there are 8 data (3, 1, 2, 10, 9, 0, 4, and 6), you can create a B+ tree with height 2 as shown in Figure 1-4-1.

B+ trees also require binary tree rotation when updated. For example, if a new 7 is added, it can be filled directly after the 4 and 6. If you add 8, then you need to rotate, and feel the B+ tree rotation below.

Advantages of using B+ tree index structure:

  1. The height of B+ trees is usually 2-4 levels, so it only takes 2-4 IO to find records at most, which is much lower than that of binary balanced trees.
  2. Range lookups allow data to be retrieved through Pointers to leaf nodes. For example, when searching for data greater than or equal to 3, when 3 is found in the leaf node, all data can be obtained through the tail pointer of 3, without obtaining the parent node of 3 as in the binary tree.

To be continued…

If you want to follow the updated articles and shared dry goods in real time, you can follow my official account