The article directories

  • One, foreword
  • Two, from B tree to B+ tree
    • 2.1 B tree
      • 2.1.1 B-tree properties
      • 2.1.2 B tree Search
      • 2.1.3 B tree Insertion
      • 2.1.4 Deleting a B tree
    • 2.2 B+ tree (nature, insert, delete, find, range find)
    • 2.3 summary
  • B+ tree index
    • 3.1 What are the data structures of the index
    • 3.2 Advantages of B+ tree data structures (hash table, binary tree, balanced binary tree, B tree, B+ tree)?
      • 3.2.1 Hash index
      • 3.2.2 Ordered arrays
      • 3.2.3 Binary Tree Indexing
      • 3.2.4 Balancing binary tree indexes
      • 3.2.5 Using B-tree as Index
      • 3.2.6 Using B+ tree as index
    • 3.3 How large is a node in a B+ tree?
    • 3.4 B+ tree as an index, features determine advantages
  • Hash index, full – text index
    • 4.1 Hash Index
    • 4.2 Full-text Index
  • Five, the end

One, foreword

When people talk about indexes, if they don’t specify the type, they usually talk about B+Tree indexes (InnoDB storage engine is the default MySQL storage engine, InnoDB storage engine uses B+Tree indexes). It uses Tree data structure to store data.

Although most MySQL engines support B+ tree indexes, different storage engines implement B+ tree indexes differently: (1) MyISAM uses prefix compression to make indexes smaller, while InnoDB stores them in raw data format; (2) MyISAM index refers to indexed rows by physical location of data, while InnoDB refers to indexed rows by primary key.

Note: Only B trees and B+ trees, no B- trees.

Two, from B tree to B+ tree

2.1 B tree

A B tree is a balanced multi-path search tree. It is mainly for dynamic search and is usually used in file systems.

2.1.1 B-tree properties

A B-tree of order M can be either an empty tree or an M-fork tree that meets the following characteristics:

(1) Leaf node: All leaf nodes appear in the same layer without information. The parents of leaf nodes are called terminal nodes;

(2) Each node: each node has at most M subtrees; (Goldfinger: M-fork tree)

(3) Root node: if the root node is not a terminal node, there are at least two subtrees;

(4) Non-root node: all non-terminal nodes except the root node have at least ⌈m/2⌉ subtrees;

(5) Non-terminal node (non-leaf node) : All non-terminal nodes contain the following data: (N,A0,K1,A1,K2… ,Kn,An)

Where n(⌈m/2⌉-1≤n≤m-1) is the number of key codes,Ki(1≤ I ≤ N) is the key code,Ki <K(I +1)(1≤ I ≤n-1),Ai(0≤ I ≤ N) is the pointer to the sub-root node, and the key codes of all nodes in the sub-tree indicated by pointer Ai are less than K(I +1) and greater than Ki.

In general, the leaf node of a B-tree can be regarded as the node of the external node (that is, the search failure), which is usually called the external node. In fact, these nodes don’t exist. Pointers to these nodes are null. So, the leaves of B don’t have to be drawn. Because the leaves all appear on the same layer, the B tree is also highly balanced. In addition, the number of key codes in each node is the number of subtrees minus 1.

A B-tree is a generalization of a 2-3 tree, which is a third-order B-tree. Typically, a node in a B-tree is large enough to fill a disk page. Pointers stored in a B-tree are actually block numbers containing its children, and each node generally allows 100 or more children.

2.1.2 B tree Search

The search of B tree is similar to the search of 2-3 trees, but the difference is that each node of the B tree is an ordered table of multi-key codes. When reaching a node, the search is first conducted in the ordered table. If it is found, the search succeeds. Otherwise, follow the pointer information to the appropriate subtree search. When the leaf node is reached, it indicates that there is no corresponding key code in the tree, and the search fails.

The process of searching in B tree is a cross process of searching nodes with pointer and searching key codes in nodes. For example, find the record with key code 53 in the figure above. First, we start at the root point, a There is only one key code in, and 53 is larger than it. Therefore, the search should be conducted from the pointer domain A1 of root node A to node C, and node C has two keys (43 and 78), while 53 is larger than 43 and less than 78. The search should be conducted from pointer domain A1 of node C to node G, and the key codes are compared in sequence in node G to find the key code 53.

So, searching a B tree involves two basic operations:

(1) Locating nodes: find nodes in B tree;

(2) Locate the key code in the node: find the key code in the node.

Because B tree is usually stored on disk, the former a lookup operation (point to look for in the B tree node) was conducted on the disk, and then a lookup operation (refers to find the key in the node) is in memory, namely on the disk to find after a node, node information read into memory first, and then find out the key code is equal to k. Obviously, a search on disk takes much more time than a search in memory. Therefore, the number of searches on disk, that is, the number of b-tree layers where the search key is located, is the primary factor determining the efficiency of b-tree search.

2.1.3 B tree Insertion

B tree insertion is a generalization of 2-3 tree insertion

It is assumed that key codes are to be inserted into m-order B-tree, and n=m-1, that is, n is the maximum number of key codes in nodes. The insertion process of B-tree is as follows:

(1) Positioning: find the insertion position. Since you’re inserting into an endpoint, you need to determine which endpoint it belongs to. The result of the location is to return the pointer P to the terminal node where the key belongs. If the number of key codes in P is less than N, the key is inserted directly. Otherwise, the number of key codes of node P overflows, and the process of “splitting and promotion” is performed.

(2) Splitting and promotion: node P is “split” into two nodes, namely P1 and P2, and the middle key code K is “promoted” to the parent node, and the left pointer of K points to P1 and the right pointer points to P2. If the number of key codes of the parent node also overflows, the process of “split-one promotion” is continued. Obviously, this split can be uploaded all the time, and if the root also splits, then the height of the tree increases by one level.

The insertion process ensures that all nodes are at least half full. For example, a fourth-order B-tree with full internal nodes will have five children. This node is split into two nodes, each containing two keys, thus preserving the b-tree properties. The following figure shows an example of an insert in a third-order B-tree:

2.1.4 Deleting a B tree

B tree deletion is a generalization of 2-3 tree deletion.

Delete the key in the M-order B tree. The first step is to find the location of the key. The result of positioning is that the pointer Q of the node where key is located is returned. Assume that key is the ith key code K in node Q. If node Q is not the endpoint, Ki is “replaced” by the minimum key value X in the subtree indicated by Ai. Since the node where X resides must be the terminal node, the deletion problem boils down to deleting the key in the terminal node. If the number of keys in terminal nodes is greater than ⌈m/2⌉-1, you can delete the key as shown in the following figure:

If the number of key codes is less than ⌈m/2⌉-2 after a key is deleted from a terminal node, it does not meet the requirements of m-order B-tree. You need to borrow key codes from sibling nodes or merge nodes to ensure the characteristics of b-tree. There are two specific cases:

(1) Check the neighboring sibling nodes. If the sibling node has enough records (more than ⌈m/2⌉), it borrows a record from the sibling node, moves the borrowed key codes up to the parent node of the deleted node, and moves the corresponding key codes in the parent node down to the deleted node. The purpose of this is to delay as much as possible the overflow of key codes in nodes caused by deletion.

(2) There are not enough brothers to borrow. If no sibling can lend records to the deleted node with too few records, the deleted node must give its keys to a sibling, perform the merge operation, and remove the null node from the tree. Of course, there is space for sibling nodes, because sibling nodes are mostly full, and the parent node of the deleted node is missing one node after merging, so a key code of the parent node should be “moved down” to the merged node. If the number of key codes in the parent node of the deleted node does not overflow, the merging process ends. Otherwise, the parent node also needs to borrow key codes or merge nodes. Obviously, the merging process can be propagated to the root, and if the two children of the root are merged, the B-tree will have one less layer.

The following figure shows the situation that the number of key codes in the deleted nodes overflows when the key codes are deleted in the third-order B-tree and the processing schematic diagram.

2.2 B+ tree (nature, insert, delete, find, range find)

B+ tree is a variant of B tree and evolved from B tree and Indexed Sequential Access (ISAM). B+ tree is a balanced lookup tree designed for disks or other direct Access assistive devices. In B+ tree, all records are stored in the order of key values on leaf nodes of the same layer, and are spliced by Pointers of each leaf node.

An m-order B+ tree is structurally identical to an M-order B tree, but differs in the internal arrangement of key codes. Details are as follows:

(1) The node with M subtrees contains M key codes, that is, each key code corresponds to a subtree;

(2) the key code K is the maximum (or minimum) key code in the root node of its corresponding subtree;

(3) All terminal nodes contain all key code information and Pointers to key code records;

(4) Each terminal node is linked together according to the size order of the key codes to form a single linked list, and the head pointer is similar to the B_ tree. In the B tree, the key codes in the node are still orderly arranged, and for any two key codes K and K in the same node, if K,<K, then K, less than K, all the keys in the corresponding sub-tree.

The most significant difference from binary sorting and 2-3 trees is that B+ trees only store records in terminal nodes, while internal nodes store keys, but these keys are only used to guide lookups. This means that internal nodes are structurally significantly different from terminal nodes. Internal nodes store key codes to guide the search, and each key code is associated with a pointer to the child node. The end node stores the actual record, and in the case of the B+ tree purely as an index stores keys and Pointers to the actual record. The end points of a B+ tree are generally linked together to form a linked list, so that by accessing all of the end points in the linked list, all records can be traversed in sorted order.

For example, as shown in the figure below, a third-order B+ tree usually has two head Pointers on the B+ tree, one pointing to the root node and the other to the terminal node with the smallest key code. Therefore, two kinds of lookups can be performed on B+ trees: one is a sequential lookups starting at the minimum key and the other is a random lookups starting at the root.

The process of random lookups, inserts, and deletions in a B+ tree is basically similar to that in a B tree. Searching in a B-tree is almost exactly the same as searching in a B-tree, except that the search must go all the way to the end node. Even if a key value is found in an internal node, it is only used to guide the index and does not provide access to the actual record, so

B+ tree search: When searching in B+ tree, the terminal node containing the key value must be reached.

B+ tree insertion: The insertion of B+ tree is only carried out on the terminal node. When the number of key codes in the node is greater than M, the node should be split into two nodes, and their parent nodes should contain the maximum key codes in both nodes.

B+ tree deletion: The deletion of B+ tree is also carried out only on the terminal node. When the maximum key in the terminal node is deleted, its value in the non-terminal node can exist as a “boundary key”. If the number of key codes in the node is less due to deletion, the merging process of the node is similar to that of the b-tree.

B+ tree range lookup: B+ trees are particularly good for range lookup. Once the first record in the range is found, you can find all the records in the range by sequentially processing the rest of the records in the nodes, and then continuing as far into the end node as possible.

B + tree search

Let’s start with a B+ tree with a height of 2, 4 records per page, and 5 fan out, as shown below. All records are stored in order on the leaf node. If the user traverses sequentially from the leftmost leaf node, he can get the order of all key values: 5, 10, 15, 20, 25, 30, 50, 55, 60, 65, 70, 75, 80, 85, 90.


2.3 summary

B-trees and B+ trees, collectively referred to as B-trees, are standard organization methods for applications that require insertion, deletion, and keycode range retrieval, addressing all of the following problems encountered in implementing disk-based retrieval:

(1)B tree is always highly balanced, and all leaf nodes are in the same layer;

(2) Search, insert, and delete operations only affect the minimum number of nodes (i.e. disk pages), so the performance is very good;

(3) B-tree makes use of the principle of access locality by placing related records on the same disk page;

(4)B tree Ensure that at least a certain proportion of nodes in the tree are full, so as to improve the utilization of space and reduce the number of disk reads during search and update operations.

Advantages of B+ tree over B tree: 1, 2, 3,

B+ tree index

3.1 What are the data structures of the index

When designing an index, you will find that the index type can be selected, including Hash, B+, as follows:

3.2 Advantages of B+ tree data structures (hash table, binary tree, balanced binary tree, B tree, B+ tree)?

Q: What is the basic content of data structure from mysql index? Answer: It was a continuous process from binary tree, balanced binary tree, B tree, B+ tree, and finally B+ tree was a reasonable choice, but B+ tree: search for non-leaf nodes + dichotomy of ordered arrays in leaf nodes plus hash table is: search for non-leaf nodes + hash array in leaf nodes

3.2.1 Hash index

MySQL default index B+ tree index, not Hash index Two disadvantages of Hash index 1: The database uses Hash index, and the array subscripts corresponding to all field values are randomly calculated by Hash algorithm, so Hash conflicts may occur. Disadvantage 2: The database uses hash indexes, which can only be used to query = accurately quickly, but does not support range queries.

So for such an index structure, now execute the following SQL statement:

select * from table1 where name='csdn'
Copy the code

SQL > select ‘CSDN’, ‘CSDN’, ‘CSDN’, ‘CSDN’, ‘CSDN’, ‘CSDN’;

select * from table1 where name < 'csdn'
Copy the code

Can’t help, because the characteristics of hash table can be fast and accurate query, but does not support range query.

If made index, that speed is also very slow, to scan all.

Which scenarios are Hash tables suitable for? Equivalent query scenarios only include KV (Key, Value), such as Redis, Memcached and other NoSQL middleware.

3.2.2 Ordered arrays

You’re talking about unordered Hash tables, but is there an ordered data structure? Ordered arrays, it’s Nice, it’s Nice for equivalence and range queries. Can be used to do static storage engine ah, used to save static data, such as your 2019 Alipay bill, 2019 Taobao shopping records and so on are very appropriate, are not changing historical data. Disadvantages of ordered arrays: Ordered arrays are good for static data because if we add, delete, or modify data, it changes its structure. An ordered array is an array structure at the bottom (there are only two types of underlying data structures: arrays and lists, and trees and graphs are linked list variants). If you add a new one, all the nodes behind your new location will move back, which is expensive.

3.2.3 Binary Tree Indexing

First, binary trees are ordered, so range queries are supported.

3.2.4 Balancing binary tree indexes

First, binary trees are ordered, so range queries are supported. Second, the time complexity is O(log(N)), and in order to maintain this time complexity, the update time complexity should also be O(log(N)).

Because balance binary tree index, because the index is not only stored in memory, or to drop disk persistence, if the data is too much, the tree height will be very high, the cost of query will increase with the increase of tree height (Goldfinger: the tree has to read the disk once).

In order to save the cost of a lot of companies or the use of mechanical hard disk, such a ten million level of inquiry about 10 seconds, this who can stand ah?

3.2.5 Using B-tree as Index

The representation of the same element in a B tree is “shorter” than a fully balanced binary tree because a node in a B tree can store multiple elements.

B tree is already a good data structure, used for indexing is good.

3.2.6 Using B+ tree as index

For storing the same elements, a B+ tree and a B tree have the same height. However, the representation of a B+ tree is fatter than that of a B+ tree, because the non-leaf nodes in a B+ tree have a redundant copy in the leaf node, and the leaf nodes are connected with each other by Pointers.

Question: Why are B+ trees (binary tree, balanced binary tree, B tree, B+ tree) chosen? Include the pros and cons of various data structures?

Answer: Hash does not support range queries; Binary trees solve the problem of hash range query, but the tree height is very high. Balanced binary trees solve the problem of binary tree height, but each node can only put one element. B-tree solves the problem that each node of a balanced binary tree can only store one element, and a node can store multiple elements that meet the definition of b-tree, further solving the problem of tree height, improving disk I/O efficiency and reducing the average number of disk accesses. The B+ tree has no improvement over the B tree, but there are two structural changes:

(1) Non-leaf nodes in B+ tree will have a redundant copy in leaf nodes, which improves the efficiency of range search. The reason for the improvement is that there will be Pointers to the leaf node of the next node.

(2) Leaf nodes are connected with Pointers.

Therefore, Mysql selects the data structure B+ tree as the index, which can improve the disk I/O efficiency of index query, and can improve the efficiency of range query, and the elements in B+ tree are also ordered.

3.3 How large is a node in a B+ tree?

Question: How big is a node in a B+ tree? Answer: A B+ tree in which a node is a page or multiple of a page is most appropriate. If the size of a node is less than 1 page, such as 0.8 pages, then reading this node will actually read 1 page, resulting in a waste of resources. If the size of a node is greater than 1 page, such as 1.2 pages, two pages will be read when the node is read, resulting in a waste of resources. Therefore, in order to avoid waste, it is most appropriate to control the size of a node in multiples of 1, 2, 3, 4 pages.

Question: Operating system page concept?

On an operating system, a page size is 4KB. In InnoDB, a page size is 16KB, which is the size of four operating system pages. The basic storage structure of Mysql is a page (where records are stored) :



1. Inter-page + in-page: each data page can form a two-way linked list, and the records in each data page can form a one-way linked list;

2. Each data page will generate a page directory for the records stored in it. When searching for a record through the primary key, dichotomy can be used in the page directory to quickly locate the corresponding slot, and then traverse the records in the corresponding group of the slot to quickly find the specified record;

3. Use other columns (non-primary keys) as search criteria: each record in a single linked list can only be traversed from the smallest record.

If we write the SQL statement select * from table1 where website=’ CSDN ‘without any optimization, the default is:

First, locate the page where the record is located (note: it is necessary to traverse the bidirectional linked list to find the page, the time complexity is O(n), because each data page forms a bidirectional linked list, after using B+ tree index, the time complexity is O(lgN) to locate the specified page)

Second, find the corresponding record from the page (note: since the query is not based on the primary key, only the single linked list of the page can be traversed, because the records in the data page form a single linked list).

Obviously, in the case of a large amount of data this lookup is very slow! It looks a little bit like a return watch.

3.4 B+ tree as an index, features determine advantages

Feature 1: B+Tree usually means that all values are stored sequentially and that each leaf page is the same distance from the root.

Feature 2: THE B+Tree index can speed up data access.

The B+Tree index speeds up data access because the storage engine does not need to perform a full table scan to obtain the desired data. Instead, the storage engine searches from the root node of the index, which holds Pointers to child nodes. The appropriate Pointers to the lower-level child nodes can be found by comparing the value of the node page with the value to be searched. These Pointers actually define the upper and lower bounds. Eventually the storage engine either finds the corresponding value or the record does not exist.

Feature 3: Leaf nodes are special in that their Pointers point to indexed data, not node pages (ps: different engines have different types of “Pointers”), and the depth of the tree is directly related to the size of the table. B+ trees are stored sequentially on index columns, so they are good for looking up range data. For example, in a textfield-based index tree, passing consecutive values in alphabetical order for lookups is quite appropriate, so lookups like “find all names starting with I through K” can be very efficient.

Note that the index sorts multiple values based on the order of the columns when the index is defined in the CREATE TABLE statement. For example, in a user table, two people with the same first and last name are sorted according to their birth date.

The query type that can be used for the B-tree index. The B-tree index is applicable to the full key value search, key value range search, and key prefix search. The key prefix lookup applies only to searches based on the leftmost prefix. The indexes described above are valid for the following types of queries.

Full key-value lookup Full value matching refers to matching all columns in an index, for example, the aforementioned index can be dry lookup for Cuba Allen, born in 1960-01-01.

Key value range lookups such as the aforementioned index can be used to find people whose last names are between Aen and Barrymore, and here only the first column of the index is used.

Matching the index mentioned before the left-most prefix can be used to find all people with the last name Alen, that is, only the first column of the index is used.

Matching column prefixes can also match only the beginning of a column’s value. For example, the index mentioned earlier can be used to find all people with the last name starting with J, and only the first column of the index is used here.

The previously mentioned index can also be used to find all people whose last name is Allen and whose name starts with a letter K (e.g. Kim, Karl, etc.), i.e. first column last_name all matches and second column first_nane range matches.

Index-only queries B-trees can usually support index-only queries, which require access only to the index, not to the rows. We will discuss this “overwrite index” optimization separately later.

Because the nodes in the index tree are ordered. So in addition to searching BY value, indexes can also be used for ORDER BY operations in queries. In general, if a B-tree can find a value in a certain way, it can also be used for sorting in that way. So, if the ORDER BY clause satisfies any of the query types listed above, then the index also satisfies the corresponding sorting requirements.

(1) You cannot use a b-tree index unless you start the search in the left-most column of the index. For example, the index in the above example cannot be used to find a person named Bill, or a person of a particular birthday, because neither column is the left-most data column. Similarly, you can’t find people whose surnames end with certain letters; MySQL can only use the first column of the index if the first name is not specified (first_name). (3) If there is a range query for a column in the query, all columns to the right of it cannot be found using index optimization. WHERE last name= ‘Smith’ AND first_name = ‘J%’ AND dob=1976-12-23 This query can only use the first two columns of the index, where LIKE is a scope condition (but the server can use the remaining columns for other purposes). If you have a limited number of range query column values, you can replace a range condition by using multiple equal conditions.

The reader should understand how important the order of the index columns mentioned above is: these restrictions are all related to the order of the index columns.

Four features of B+ Tree Index (Emphasis 003)



First, lower the tree height:

(1) N-fork tree: B+ tree exists in the form of n-fork tree. The tree becomes fatter and the height of the tree decreases.

(2) Balanced tree: B+ tree uses balanced binary tree principle to reduce tree height to the longest degree;

(3) B tree: A node of B+ tree stores multiple data, reducing the tree height.

(4) B+ tree: From B tree to B+ tree, the tree height does not decrease, but the tree becomes fatter. The reason is that non-leaf nodes in THE B+ tree will have a redundant copy in the leaf nodes, and the leaf nodes are connected by Pointers.

Second, order: search data does not need to scan the full table, along the root node layer down to find our target data quickly;

Third, a node is a disk block size, program locality principle of disk prefetch: The size of each node is the size of a disk block. One IO will read the data of a page (each page contains multiple disk blocks). (Disk prefetch.

Fourth, the leaf node pointer linked list joins, will be random IO into sequential IO, will not form a temporary table in memory: Leaf nodes are connected by Pointers to each other, which can effectively reduce the random IO of sequential traversal. Moreover, we can also see that leaf nodes are sorted by index, which means that search or sorting by index is sorted, and no temporary table will be formed in memory.

Hash index, full – text index

4.1 Hash Index

Hash index basic hash table implementation, only valid queries that exactly match all columns of the index. Hash table (also known as hash table) is a data structure accessed directly according to the Key value. It allows the code value to be mapped to the corresponding position of the hash table through the transformation of the hash function, and the search efficiency is very high.

In MySQL, only the Memory engine explicitly supports hash indexes, which is the default index type of the Memory engine tables. The Memory engine also supports B+Tree indexes. It is worth remembering that the Memory engine supports non-unique hash indexes, which is quite unusual in the database world. If multiple columns have the same hash value, the index stores multiple record Pointers in a linked list into the same hash entry.

Assuming that we have created hash indexes for names, the search process is as follows:

Hash index working process: As you can see from the left side of the image above, for each row of Mysql data, the storage engine calculates a hash code (the position of the hash table in the image above) for all index columns (the name column on the left side of the image above) through a hash algorithm (the rectangle in the middle of the image above). Each element in the hash table points to a pointer to a row of data (again reflecting logical adjacencies rather than physical storage adjacencies, as is the case with B+ tree indexes and hash indexes).

In terms of the underlying data structure, B+ tree index: the data structure used is B+ tree. Hash index: Data structures used are hash tables. In theory, the hash index is faster than the B+ tree index. The time complexity is O(1) and the index itself only stores the corresponding hash value. The index structure is very compact, so why InnoDB uses B+ tree as the default index? Because hash indexes have two fatal weaknesses:

<= <= between… <= <= between… <= <= between… The and… This range lookup, because the hash evaluates to an out-of-order hash, results in an out-of-order ROWID.

(2) It is easy to cause hash conflict. If the hash index of name is used, the calculated hash values of two records with the same name will be the same, and hash conflict will occur. When a hash conflict occurs, the storage engine must traverse all the row Pointers in the linked list, comparing them row by row until it finds all the rows that match the criteria.

(3) Hash indexes also do not support partial index column matching queries, because hash indexes always use the entire contents of the index column to calculate the hash value. For example, if A hash index is created on data column (A,B), the index cannot be used if the query has only data column A.

More often, the hash table is used with B+ Tree, etc., B+ Tree range lookup + hash table accurate lookup. There is a special index called “adaptive hash index” in the InnoDB engine. This is an adaptive hash index. When innoDB notices that some index values are used very frequently, it creates a hash index in memory based on the B-tree index. In this way, the B+ Tree index also has the advantages of a quick hash index lookup. However, you can turn this feature off if necessary.

The adaptive hash index changes the original “B+ tree range search + binary search within leaf node” into “B+ tree range search + hash search within leaf node”.

4.2 Full-text Index

A full-text index is a special type of index that looks for keywords in text rather than directly comparing values in the index. When using large Text types like Blob or Text, if you use like to search for some characters in the field without indexing, the use of the full-text index will be applied to the field.

Full-text search is not matched in the same way as other types of indexes. It has a lot of details to pay attention to, such as stop words, stem and plural, Boolean search, etc. Full-text indexes are more similar to what search engines do than simple WHERE condition matching. There is no conflict between creating a full-text index and a value-based B-tree index on the same column, and full-text indexes are suitable for MATCH AGAINST operations rather than normal WHERE conditions.

It’s worth noting that MySQL’s default full-text index is only good for natural languages like English, where words are separated by Spaces and no word segmentation is required. However, for such unnatural languages as Chinese, it is not easy to use and often cannot be found. For local operations, it is better to use like. In project development, ES or Solr, such an in-site index engine, is generally used. As a result, MySQL full-text indexing is as useless as its cache layer. In use, the cache is implemented using Redis or Mongodb, and the query of large text characters is implemented using ES or Solr.

Five, the end

B+ tree index, hash index, full text index, complete.

Play code every day, progress every day!