Thresh

Indexes are physical data page storage, stored in data files (InnoDB, IBD files) using data pages (pages). Indexes can speed up the search speed, but at the same time reduce the speed of add, delete and modify operations, index maintenance costs.Copy the code

Theoretical knowledge of indexing: binary lookup, Hash, and B+Tree.

Binary search method

Binary search is also called split – half search, it is in the ordered array to find the specified data search algorithm.

The advantages of equivalent query and range query are excellent performance. The disadvantages of updating data, adding data and deleting data are high maintenance costs.Copy the code
  • First locate the left and right Pointers
  • To calculate(left+right)/2
  • Determine the size ratio of the index position value divided by 2 to the target value
  • Index position value greater than the target value -1, right move; If it is less than the target value +1, the left moves
 public static int binarySearch(int key, int... array) {
        int low = 0;
        int high = array.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (key < array[mid]) {
                high = mid - 1;
            } else if (key > array[mid]) {
                low = mid + 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
Copy the code

Hash structure

The underlying implementation of Hash is implemented by Hash tables, which store data based on key values <key,value>. This is great for finding values by key, which is a single key query, or equivalent query.

Hash indexes can easily provide equivalent queries, but full table scans are required for range queries.

Hash index in MySQL Hash structure is mainly used in Memory(Memory based storage engine rarely used) native Hash index, InnoDB adaptive Hash index.

InnoDB’s adaptive hash index is designed to improve query efficiency. InnoDB’s storage engine monitors queries on each index page of the table. When InnoDB notices that some index values are accessed frequently, it creates a hash index in memory based on the B+Tree index. Enable the B+Tree index in memory to have the function of hash index, that is, to quickly and fixed-value access to frequently accessed index pages.

InnoDB adaptive Hash index: When using Hash index access, data can be located in a single search, equivalent query efficiency is better than B+Tree.

The creation of adaptive hash indexes enables InnoDB storage engine to automatically create hash indexes for some hot pages based on the frequency and pattern of page access to speed up access. In addition, users can only choose to turn on or off the function of InnoDB’s adaptive hash index, without manual intervention.

show engine innodb status \G; 
show variables like '%innodb_adaptive%';
Copy the code

B + Tree structure

MySQL database index uses B+Tree structure, optimized in the B-Tree structure

B – Tree structure

  • Index values and data are distributed throughout the tree structure
  • Each node can store multiple index values and corresponding data data
  • Multiple index values in a tree node are arranged in ascending order from left to right

B tree search: starting from the root node, the index value sequence in the node using dichotomy search, if hit the end of the search. No hit will enter the child node to repeat the search process, until the corresponding node pointer is empty, or is already a leaf node.

B + Tree structure

  • Non-leaf nodes do not store data, but only index values, which is convenient for storing more index values
  • The leaf node contains all the indexes and data
  • Leaf nodes are connected with Pointers to improve the access performance of the interval

Compared with B tree, when B+ tree conducts range search, it only needs to search the index values of two nodes and then traverse by using the Pointers of leaf nodes. However, B Tree requires traversal of all nodes and data within the range. Obviously, B+Tree has high efficiency.

Cluster index and secondary index

Clustered index and non-clustered index: leaf nodes of B+Tree that store primary key index values and row records belong to clustered index. If the index value is stored separately from the row record, it is a non-clustered index.

Primary key index and secondary key index: the leaf node of B+Tree stores the primary key index. If the value is not a primary key, it is a secondary index.

In The InnoDB engine, primary key indexes are stored in a clustered index structure.

A clustered index is a data storage method. InnoDB’s clustered index builds a B+Tree structure by primary key order. The leaf nodes of a B+Tree are row records, which are tightly stored with primary key values. This also means that the primary key index of InnoDB is the table itself. It stores the data of the entire table in primary key order. The space occupied by InnoDB is the size of the data volume of the entire table. A primary key index is a clustered index.

Secondary Index InnoDB secondary index, also known as secondary index, builds a B+Tree structure based on index columns. However, only the index column and primary key information are stored in the leaf node of B+Tree. Secondary indexes take up much less space than clustered indexes, and secondary indexes are usually created to improve query efficiency. InnoDB can only create one clustered index per table, but can create multiple secondary indexes.

Unlike InnoDB table storage, MyISAM data tables have separate index files and data files, known as non-clustered index structures.