If this article helps you, feel free to follow github.com/hsfxuebao, and a Star would be much appreciated!

1. Why use indexes

An index is a data structure used by the storage engine to find data records quickly, much like the table of contents of a textbook: by looking up the page number of a chapter in the table of contents, you can jump straight to the material you want. If the query condition matches an index, the related data is found through that index; if it does not, the engine falls back to a full table scan, examining records one by one until it finds the ones that match.

As shown in the figure above, when there is no index, the data is scattered across different locations on disk, and the disk arm has to swing back and forth to read it, which costs a lot of time. Even if the data were laid out sequentially, reading rows 1 through 6 would still take six I/O operations, which is also time-consuming. Without any index structure to help locate data quickly, a search for Col 2 = 89 must compare row by row: start at Col 2 = 34, see that it does not match, move to the next row, and so on. Our example table has fewer than 10 rows, but if a table holds tens of millions of rows, that means a great many disk I/O operations. For each comparison, the CPU must read the record from disk into memory before it can process it, and the most time-consuming part of this process is the disk I/O (which involves rotational delay, which is fast, and head seek time, which is slow). Suppose instead the data were organized in a data structure such as a binary search tree, as shown in the figure below.

Adding an index on column Col 2 amounts to maintaining an index data structure for Col 2 on disk: a binary search tree. Each node of the tree stores a (K, V) pair, where the key is the Col 2 value and the value is the file pointer (address) of the row containing that key. For example, the root node of the tree might be (34, 0x07). With Col 2 indexed, a search for Col 2 = 89 walks the tree: read 34 into memory, see that 89 > 34, and continue to the right; read 89 into memory, see that 89 = 89; the data is found and returned. The value stored in that node then gives the address of the record we are looking for. Notice that only two lookups were needed to locate the record's address, which greatly speeds up the query.
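The two-comparison lookup just described can be sketched with a toy binary search tree that maps Col 2 values to record addresses. The keys and hex "addresses" below are illustrative, not real on-disk data:

```python
# Toy binary search tree: key = Col 2 value, value = record "address" on disk.
# Both the keys and the hex addresses are invented for illustration.

class Node:
    def __init__(self, key, addr):
        self.key, self.addr = key, addr
        self.left = self.right = None

def insert(root, key, addr):
    if root is None:
        return Node(key, addr)
    if key < root.key:
        root.left = insert(root.left, key, addr)
    else:
        root.right = insert(root.right, key, addr)
    return root

def search(root, key):
    """Return (address, comparisons made) or (None, comparisons made)."""
    comparisons = 0
    while root is not None:
        comparisons += 1
        if key == root.key:
            return root.addr, comparisons
        root = root.left if key < root.key else root.right
    return None, comparisons

root = None
for k, addr in [(34, "0x07"), (89, "0x79"), (5, "0x1D"), (77, "0xF3")]:
    root = insert(root, k, addr)

addr, n = search(root, 89)   # 89 > 34, go right; 89 == 89: found in 2 comparisons
print(addr, n)               # 0x79 2
```

Each comparison here stands in for one disk read, which is why cutting the number of comparisons matters so much.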

This is why we build indexes: to reduce the number of disk I/O operations and speed up queries.

2. Indexes and their advantages and disadvantages

2.1 Index Overview

An Index is a data structure that helps MySQL obtain data efficiently.

The essence of an index: An index is a data structure. You can think of it simply as a “sorted fast lookup data structure” that satisfies a particular lookup algorithm. These data structures point to the data in a way that allows you to implement advanced lookup algorithms on top of these data structures.

Indexes are implemented in storage engines, so each storage engine may not have exactly the same index, and each storage engine may not support all index types. At the same time, the storage engine can define the maximum number of indexes and the maximum index length for each table. All storage engines support at least 16 indexes per table, and the total index length is at least 256 bytes. Some storage engines support a larger number of indexes and a larger index length.

2.2 Advantages

  1. Like the catalog index of a university library, an index improves the efficiency of data retrieval and reduces the database's I/O cost. This is the main reason for creating indexes.
  2. A unique index guarantees the uniqueness of each row of data in a table.
  3. Indexes help enforce the referential integrity of data by accelerating joins between tables; in other words, they speed up queries that join child and parent tables.
  4. When a query uses grouping and sorting clauses, an index can significantly reduce the time spent grouping and sorting, lowering CPU consumption.

2.3 Disadvantages

  1. Creating and maintaining indexes costs time, and that cost grows as the volume of data grows.
  2. Indexes take up disk space. Besides the space occupied by the data itself, every index occupies a certain amount of physical space of its own, stored on disk. With a large number of indexes, the index files may hit the maximum file size sooner than the data files do.
  3. While indexes greatly improve query speed, they slow down table updates. Whenever data in a table is added, deleted, or modified, the indexes must be maintained dynamically, which slows down data maintenance.

Therefore, the advantages and disadvantages of an index should be considered when selecting an index.

Tip: indexes improve query speed but slow down record insertion. In bulk-load scenarios, the best approach is often to drop the indexes on the table, insert the data, and then rebuild the indexes afterwards.

3. Deriving the index structure in InnoDB

3.1 Lookup before index

Let’s start with an example of an exact match:

SELECT * FROM table_name WHERE column_name = xxx;
  1. Search within a page

Assuming that there are few records in the table at present, all records can be stored in one page, and records can be searched in two cases according to different search conditions:

  • Searching by primary key: use binary search on the page directory to quickly locate the corresponding slot, then traverse the records in that slot's group to find the specified record.
  • Searching by any other column: the data page has no page directory for non-primary-key columns, so we cannot use binary search to locate a slot. Instead we must start from the minimum record and traverse the singly linked list, comparing each record against the search condition. Obviously, this kind of search is very inefficient.
  2. Search across many pages

Most of the time, the records in our tables are very numerous and need many data pages to store them. Finding a record across many pages can be divided into two steps:

  • Navigate to the page where the record resides.
  • Find the corresponding record from the page you are on.

Without an index, whether we search by the primary key column or by another column, we cannot quickly locate the page that holds the record. We can only start from the first page and follow the doubly linked list of pages, searching within each page as described above. This is obviously extremely time-consuming because it traverses all the data pages. What if the table had 100 million records? This is where indexes come in.
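The contrast between the two in-page cases can be sketched like this: a sorted page directory allows binary search by primary key, while a lookup on any other column has to scan the record list from the start. The rows below are invented for illustration:

```python
import bisect

# One toy "page": records sorted by primary key, plus a page-directory-like
# sorted list of primary keys. Column values are invented.
records = [(1, 34), (3, 77), (5, 89), (7, 5), (9, 91)]  # (primary key, col2)
page_directory = [r[0] for r in records]                 # sorted primary keys

def find_by_pk(pk):
    """Binary search on the page directory: O(log n) comparisons."""
    i = bisect.bisect_left(page_directory, pk)
    if i < len(records) and records[i][0] == pk:
        return records[i]
    return None

def find_by_col2(value):
    """No directory for col2: scan every record from the start."""
    return [r for r in records if r[1] == value]

print(find_by_pk(5))      # (5, 89)
print(find_by_col2(89))   # [(5, 89)]
```

With only five rows the difference is invisible; with millions of rows, the full scan is what makes non-indexed lookups so slow.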

3.2 Design Index

Create a table:

mysql> CREATE TABLE index_demo(
 ->   c1 INT,
 ->   c2 INT,
 ->   c3 CHAR(1),
 ->   PRIMARY KEY(c1)
 -> ) ROW_FORMAT = Compact;

The newly created index_demo table has two INT columns, one CHAR(1) column, and c1 as its primary key. The table uses the Compact row format to store its records. Here we simplify the row-format diagram of the index_demo table:

We only show these parts of the record in the diagram:

  • record_type: an attribute of the record header that indicates the record's type: 0 for an ordinary record, 2 for the minimum record, 3 for the maximum record, and 1 for a type not used yet (more on it below).
  • next_record: an attribute of the record header that stores the offset of the next record relative to this one. In the diagrams we use arrows to show which record comes next.
  • The values of the individual columns: only the three columns of the index_demo table are shown, c1, c2, and c3.
  • Other information: everything apart from the above three, including the values of hidden columns and the record's extra information.

If we temporarily remove the "other information" part from the record-format diagram and stand the records upright, they look like this:

Here’s a diagram of putting some records on a page:

3.2.1 A simple index design scheme

Why do we have to traverse all the data pages when searching for records by some condition? Because the records across pages follow no overall rule, we have no way of knowing which pages contain records matching our condition, so we must check every page in turn. So how can we quickly determine which data pages hold the records we need? We can build a directory for quickly locating the data page that contains a record. To work, this directory must satisfy the following:

  • The primary key value recorded by the user on the next data page must be greater than the primary key value recorded by the user on the previous page.

Assumption: Each data page can hold up to three records (in fact, a data page is large enough to hold many records). With this assumption in mind, we insert three records into the index_demo table:

mysql> INSERT INTO index_demo VALUES(1, 4, 'u'), (3, 9, 'd'), (5, 3, 'y');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

The records are then linked into a singly linked list in order of primary key value, as shown:

As you can see from the figure, all three records of the index_demo table have been inserted into data page 10. Now let's insert another record:

 mysql> INSERT INTO index_demo VALUES(4, 4, 'a');

Since page 10 can hold a maximum of three records, we have to assign a new page:

Note that newly allocated data pages need not have consecutive page numbers; they simply form a linked list by each page recording the numbers of its previous and next pages. Also, the largest primary key among the user records in page 10 is 5, while page 28 contains a record with primary key 4. Since 5 > 4, this violates the requirement that every primary key in the next data page must be greater than every primary key in the previous page. So inserting the record with primary key 4 has to be accompanied by a record move: the record with primary key 5 is moved to page 28, and the record with primary key 4 is then inserted into page 10, as shown in the following diagram:

This process shows that while adding, deleting, and modifying records, we may have to move records between pages to keep the invariant true at all times: the primary key of every user record on the next data page must be greater than the primary key of every user record on the previous page. This process is called a page split.
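The invariant and the split above can be sketched as follows: each page holds at most three records sorted by primary key, and inserting into a full page moves a record so that every key on the next page stays larger. The capacity of 3 is the toy assumption from the text; cascading splits are omitted:

```python
PAGE_CAPACITY = 3  # toy assumption from the text; real pages hold many records

def insert_record(pages, key):
    """pages: list of sorted key lists, kept so that
    max(pages[i]) < min(pages[i+1]) always holds (a simplified sketch)."""
    # Find the first page whose largest key is >= key, else use the last page.
    for page in pages:
        if key <= page[-1]:
            target = page
            break
    else:
        target = pages[-1]
    target.append(key)
    target.sort()
    if len(target) > PAGE_CAPACITY:          # page split: spill the largest key
        idx = pages.index(target)
        if idx + 1 == len(pages):
            pages.append([])                 # allocate a new page
        pages[idx + 1].insert(0, target.pop())
    return pages

pages = [[1, 3, 5]]        # "page 10" from the text, already full
insert_record(pages, 4)    # 4 < 5, so 5 must move to the new page
print(pages)               # [[1, 3, 4], [5]]
```

This mirrors the text's example: inserting primary key 4 forces the record with key 5 onto the new page so the ordering invariant survives.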

  • Create a directory entry for all pages

So our table of contents for the previous pages looked something like this:

For example, directory entry 2 corresponds to page 28: it holds the page number 28 and the minimum primary key value of the user records on that page, which is 5. If we store these directory entries contiguously on physical storage (for example, in an array), we can find a record by its primary key value. For example, to search for the record with primary key 20, the lookup takes two steps:

    1. Use binary search on the directory entries to quickly determine that the record with primary key 20 belongs to directory entry 3 (because 12 < 20 < 209), whose corresponding page is page 9.
    2. Search within page 9 as described earlier to locate the specific record.

With that, a simple directory for the data pages is done. This directory has another name: an index.
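The two-step lookup can be sketched with a sorted array of directory entries: each entry pairs the minimum primary key on a page with its page number, and binary search picks the page. The entries follow the example in the text (directory entry 3 covers keys 12 up to 208 and points to page 9):

```python
import bisect

# (minimum primary key on the page, page number) -- from the text's example
directory = [(1, 10), (5, 28), (12, 9), (209, 20)]

def locate_page(pk):
    """Binary search the directory for the page that may contain pk."""
    keys = [min_key for min_key, _ in directory]
    i = bisect.bisect_right(keys, pk) - 1   # last entry with min_key <= pk
    return directory[i][1]

print(locate_page(20))   # 12 < 20 < 209 -> page 9
```

`bisect_right` finds the last directory entry whose minimum key does not exceed the search key, which is exactly the "dichotomy" step described above.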

3.2.2 Index scheme in InnoDB

Iteration 1: a page for directory entry records

We called the scheme above a simple indexing scheme because it assumes all directory entries can be stored contiguously on physical storage, so that binary search can quickly locate an entry during primary key lookups. But this assumption has several problems:

  • InnoDB uses pages as its basic unit for managing storage, so it can only guarantee 16KB of contiguous space. As the number of records in a table grows, a very large amount of contiguous storage would be needed to hold all the directory entries, which is impractical for tables with very many records.
  • We also add and delete records all the time. Suppose all records in page 28 were deleted, so page 28 no longer needed to exist; then directory entry 2 would no longer need to exist either, and all entries after it would have to shift forward. That kind of mass movement is not a good idea.

So InnoDB's designers needed a way to manage all the directory entries flexibly. They observed that directory entries look a lot like user records, except that a directory entry has only two columns: the primary key and the page number. So they reused the data-page format that stores user records to store directory entries as well. To distinguish them from user records, we call these records directory entry records. How does InnoDB tell a user record from a directory entry record? Recall the record_type attribute in the record header; its values mean the following:

  • 0: Common user records
  • 1: Records of directory entries
  • 2: Minimum record
  • 3: Maximum record

The figure below shows a newly allocated page filled with directory entry records, that is, records whose record_type is 1:

As you can see from the figure, a new page numbered 30 has been allocated to store directory entry records. Note the differences between directory entry records and ordinary user records:

  • A directory entry record's record_type is 1, while an ordinary user record's record_type is 0.
  • A directory entry record has only two columns, the primary key and the page number, whereas the columns of an ordinary user record are user-defined and may be numerous, plus the hidden columns InnoDB adds itself.
  • Also worth understanding: the record header has an attribute called min_rec_mask. In a page that stores directory entry records, only the directory entry record with the smallest primary key value has min_rec_mask = 1; all other records have min_rec_mask = 0.

Similarities: both use the same kind of data page, and both generate a Page Directory over the primary key values, so lookups by primary key within a page can use binary search.

Now, taking the search for records with primary key 20 as an example, the search for records based on a primary key value can be roughly divided into the following two steps:

  1. First go to the page that stores directory entry records, page 30, and use binary search to quickly locate the corresponding directory entry. Because 12 < 20 < 209, the page holding the record is page 9.

  2. Then go to page 9, which stores user records, and use binary search to quickly locate the user record whose primary key value is 20.
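The two steps can be sketched with two levels of the same structure: one binary search in the directory page, then a lookup in the chosen user-record page. Page numbers follow the text's example; the row contents are invented:

```python
import bisect

# Level 1: directory page 30 -- (min primary key, user-record page number)
directory_page_30 = [(1, 10), (5, 28), (12, 9), (209, 20)]

# Level 0: toy user-record pages -- pk -> row; contents are invented
user_pages = {9: {12: "row-12", 20: "row-20", 100: "row-100"}}

def lookup(pk):
    keys = [k for k, _ in directory_page_30]
    page_no = directory_page_30[bisect.bisect_right(keys, pk) - 1][1]
    page = user_pages.get(page_no, {})
    return page.get(pk)   # within the page, InnoDB would binary search too

print(lookup(20))   # "row-20", found via page 9
```

Step 1 is the binary search over the directory page; step 2 is the in-page lookup, shown here as a dict access for brevity.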

Iteration 2: multiple pages of directory entry records

Although a directory entry record stores only a primary key and a page number and thus needs much less space than a user record, a page is still only 16KB, so the number of directory entry records it can hold is limited. What should we do if the table holds so much data that one page cannot store all the directory entry records?

For a better understanding of how a new directory entry page is allocated, let's assume a page can hold at most 4 directory entries. Then if we insert a user record with primary key 320, a new page must be allocated to store its directory entry record:

As you can see from the figure, we need two new data pages after we insert a user record with a primary key of 320:

  • Page 31 was created to store the new user record.
  • Because page 30, which originally stored the directory entry records, was full (we assumed it can hold only 4 entries), a new page 32 had to be allocated to hold the directory entry pointing to page 31.

Now that there is more than one page storing directory entry records, finding a user record by primary key value takes roughly three steps. Take the record with primary key 20 as an example:

  • Determine the page of directory entry records.
    • There are currently two pages of directory entry records, page 30 and page 32. Page 30 represents directory entries whose primary keys lie in [1, 320), and page 32 represents entries whose primary keys are not less than 320, so the directory entry for the record with primary key 20 is in page 30.
  • Use that directory entry page to determine which page actually holds the user record.
    • That is, locate the directory entry record by primary key value within the page that stores it.
  • Locate the specific record within the page that actually holds the user record.

Iteration 3: a directory page over the directory entry pages

Now a new question arises: in step 1 of the lookup we need to locate the right page of directory entry records, but those pages may not be contiguous in storage. If the table holds a great deal of data, there will be many pages storing directory entry records, so how do we quickly locate the right one by primary key? The answer is actually simple: build a higher-level directory over the pages that store directory entries, like a multi-level table of contents, with smaller directories nested inside larger ones until we reach the pages holding the actual data. The diagram of the pages now looks like this:

As shown in the figure, a new, higher-level page 33 has been generated to store two records representing page 30 and page 32. If a user record's primary key lies in [1, 320), the lookup proceeds to page 30 for more detailed directory entry records; if the primary key is not less than 320, it proceeds to page 32.

We can describe it as follows:

This data structure has a name: the B+ tree.

B+Tree

Whether the data pages store user records or directory entry records, they are all organized in the B+ tree data structure, so we also call these pages nodes. As the figure shows, actual user records are stored in the bottom-level nodes of the B+ tree, which are called leaf nodes. The other nodes, which store directory entries, are called non-leaf nodes or internal nodes, and the single node at the very top of the B+ tree is called the root node.

The nodes of a B+ tree are divided into levels: the bottom level, where user records are stored, is designated level 0, counting upward from there. Earlier we made very extreme assumptions: a page for user records holds at most 3 records, and a page for directory entries holds at most 4. In a real environment a page holds a great many records. Suppose every leaf-node page can store 100 user records and every internal-node page can store 1000 directory entries. Then:

  • If the B+ tree has only 1 level, i.e., a single node storing user records, it can hold at most 100 records.
  • If the B+ tree has 2 levels, it can hold at most 1000 × 100 = 100,000 records.
  • If the B+ tree has 3 levels, it can hold at most 1000 × 1000 × 100 = 100,000,000 records.
  • If the B+ tree has 4 levels, it can hold at most 1000 × 1000 × 1000 × 100 = 100,000,000,000 records. That is a lot of records!

Could your table really store 100,000,000,000 records? In practice, then, a B+ tree rarely exceeds 4 levels, so finding a record by primary key needs at most 4 page lookups (3 directory entry pages plus 1 user record page). And because every page contains a Page Directory, lookups within a page can also use binary search to locate records quickly.
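Under the assumptions above (100 user records per leaf page, 1000 directory entries per internal page), the maximum capacity of an h-level tree is simply 100 × 1000^(h−1):

```python
LEAF_RECORDS = 100   # user records per leaf page (assumption from the text)
DIR_ENTRIES = 1000   # directory entries per internal page (assumption)

def capacity(levels):
    """Maximum records an InnoDB-style B+ tree of this height can hold."""
    return LEAF_RECORDS * DIR_ENTRIES ** (levels - 1)

for h in range(1, 5):
    print(h, capacity(h))
# 1 100
# 2 100000
# 3 100000000
# 4 100000000000
```

The exponential fan-out is what keeps the tree so shallow: each extra level multiplies capacity by 1000 while adding only one more page read.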

3.3 Common Index Concepts

By physical implementation, indexes can be divided into two types: clustered and non-clustered indexes. Non-clustered indexes are also called secondary indexes or auxiliary indexes.

3.3.1 Clustered Index

Features:

  1. Records and pages are sorted by primary key value, which means three things:
    • Within a page, records are arranged in a singly linked list ordered by primary key.
    • The pages storing user records are arranged in a doubly linked list, ordered by the primary keys of the records they contain.
    • The pages storing directory entry records form multiple levels, and pages within the same level are arranged in a doubly linked list ordered by the primary keys of their directory entries.
  2. The leaf nodes of the B+ tree store complete user records.
    • A complete user record is one that stores the values of all columns, including the hidden columns.

Advantages:

  • Faster data access: a clustered index stores the index and the data in the same B+ tree, so fetching data through it is faster than through a non-clustered index.
  • For primary key lookups, both ordered point queries and range queries on a clustered index are very fast.
  • When a query returns a range of data in clustered index order, the data is physically adjacent, so the database does not need to fetch it from many separate blocks, saving a lot of I/O operations.

Disadvantages:

  • Insert speed depends heavily on insert order. Inserting in primary key order is fastest; otherwise pages split and performance suffers badly. So for InnoDB tables we usually define an auto-increment ID column as the primary key.
  • Updating the primary key is expensive, because it forces the updated row to move. So for InnoDB tables we generally treat the primary key as not updatable.
  • A secondary index access requires two index lookups: first find the primary key value, then find the row data by that primary key value.

3.3.2 Secondary Index (auxiliary index, non-clustered index)

Notice that the clustered index described above works only when the search condition is the primary key, because the data in the B+ tree is sorted by primary key. What if we want to search by a different column? Do we have to traverse the linked lists from beginning to end?

No. We can build more B+ trees, each sorting the data by a different rule. For example, we can take the value of column c2 as the sorting rule for pages and for the records within pages, and build another B+ tree, as shown in the figure below:

This B+ tree differs from the cluster index introduced above in several ways:

  • Records and pages are sorted by the value of column c2, which means three things:

    • Within a page, records are arranged in a singly linked list ordered by c2.
    • The pages storing user records are arranged in a doubly linked list, ordered by the c2 values of the records they contain.
    • The pages storing directory entry records form multiple levels, and pages within the same level are arranged in a doubly linked list ordered by the c2 values of their directory entries.
  • The leaf nodes of the B+ tree do not store complete user records; they store only the values of c2 plus the primary key.

  • The directory entry records contain c2 + page number rather than primary key + page number.

So now if we want to find records by the value of column c2, we can use this newly built B+ tree. For example, to find the records whose c2 value is 4, the search proceeds as follows:

  1. Determine the page of directory entry records.

     Starting from the root page, page 44, we can quickly locate the directory entry record in page 42 (because 2 < 4 < 9).

  2. Use the directory entry page to determine which pages actually hold the user records.

     From page 42 we can quickly locate the pages that actually store the user records. But since column c2 has no uniqueness constraint, records with c2 = 4 may be spread across multiple data pages. Because 2 < 4 ≤ 4, the pages actually holding those user records are pages 34 and 35.

  3. Locate the specific records within the pages that actually hold the user records.

     Go to pages 34 and 35 and locate the specific records.

  4. But the records in the leaf nodes of this B+ tree store only the c2 and c1 (primary key) columns, so we must use the primary key values to look up the complete user records in the clustered index once more.

Do you see what step 4 does? The B+ tree sorted by c2 values can only determine the primary key of the record we are looking for. To obtain the complete user record from a c2 value, we still have to search the clustered index again; this extra step is called going back to the table (a "back-table" lookup). In total, querying a complete user record by a c2 value uses 2 B+ trees.

Concept: back table. From the B+ tree sorted by c2 values we can only determine the primary key of the record we want; to get the full user record we must search the clustered index again. This process is called going back to the table. Querying a complete user record by a c2 value therefore uses 2 B+ trees!

Why is the back-table operation necessary? Wouldn't it be fine to put the complete user records directly into the leaf nodes?

If every secondary index stored complete user records, then with, say, 1,000,000 rows of user data, each additional secondary index would duplicate the full data, multiplying the storage requirement several times over and greatly increasing the space overhead.
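The back-table lookup can be sketched with two toy maps standing in for the two B+ trees: a secondary index from c2 values to primary keys, and the clustered index from primary key to the full row. The rows follow the sample data used earlier in the text:

```python
# Clustered index: primary key c1 -> complete user record (sample data)
clustered = {1: (1, 4, "u"), 3: (3, 9, "d"), 4: (4, 4, "a"), 5: (5, 3, "y")}

# Secondary index on c2: stores only (c2 value -> primary keys), never full rows
secondary_c2 = {}
for c1, row in clustered.items():
    secondary_c2.setdefault(row[1], []).append(c1)

def select_by_c2(value):
    """First B+ tree: find primary keys; second B+ tree: fetch full rows."""
    pks = secondary_c2.get(value, [])
    return [clustered[pk] for pk in pks]   # <- the "back table" step

print(select_by_c2(4))   # [(1, 4, 'u'), (4, 4, 'a')]
```

The second dict access per matching key is the back-table cost: one extra clustered-index lookup for every row the secondary index finds.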

3.3.3 Joint (Composite) Indexes

If we want the B+ tree sorted by the values of columns c2 and c3, that means two things:

  • Records and pages are first sorted by column c2.
  • When records have equal c2 values, they are sorted by column c3.

As shown in the figure, a few points need to be noted:

  • Each directory entry record consists of three parts: c2, c3, and a page number. Entries are sorted first by the value of c2; if the c2 values are equal, they are sorted by the value of c3.
  • The user records in the leaf nodes of the B+ tree consist of the c2 and c3 columns plus the primary key c1.

Note that a B+ tree sorted by the values of columns c2 and c3 is called a joint index, and it is in essence a secondary index. This is different from building separate indexes on c2 and on c3, in the following ways:

  • A joint index builds only one B+ tree, like the one above.
  • Indexing c2 and c3 separately builds two B+ trees, one sorted by c2 and one sorted by c3.
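The difference in sort order can be sketched by sorting the same rows with different keys: a joint index on (c2, c3) is one ordering with ties on c2 broken by c3, whereas separate indexes are two independent orderings. The rows are invented:

```python
rows = [(1, 4, "u"), (3, 9, "d"), (5, 4, "a"), (7, 4, "m")]  # (c1, c2, c3)

# Joint index on (c2, c3): ONE ordering; sort by c2 first, then c3 on ties.
joint = [(r[1], r[2], r[0]) for r in sorted(rows, key=lambda r: (r[1], r[2]))]

# Separate indexes on c2 and on c3: TWO independent orderings.
idx_c2 = [(r[1], r[0]) for r in sorted(rows, key=lambda r: r[1])]
idx_c3 = [(r[2], r[0]) for r in sorted(rows, key=lambda r: r[2])]

print(joint)   # [(4, 'a', 5), (4, 'm', 7), (4, 'u', 1), (9, 'd', 3)]
```

Notice that the joint ordering answers queries on c2 alone or on (c2, c3) together, but not on c3 alone, because c3 is only sorted within each run of equal c2 values.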

3.4 InnoDB B+ tree index considerations

3.4.1 The root page never moves

When we introduced B+ tree indexes, for ease of understanding we drew the leaf nodes storing user records first and then the internal nodes storing directory entry records. The actual formation of a B+ tree goes like this:

  • Whenever a B+ tree index is created for a table (the clustered index is not created manually; it exists by default), a root node page is created for that index. Initially, when the table has no data, the root node of each B+ tree index contains neither user records nor directory entry records.
  • When user records are subsequently inserted into the table, they are first stored in this root node.
  • When the root node's free space runs out and another record is inserted, all records in the root node are copied into a newly allocated page, say page a, which is then page-split into another new page, say page b. The newly inserted record goes to page a or page b depending on its key value (the primary key in a clustered index, the indexed column value in a secondary index), and the root node is upgraded to a page storing directory entry records.

Pay special attention to this: the root node of a B+ tree index never moves from the day it is born. Whenever an index is created for a table, the page number of its root node is recorded in a fixed place, and whenever the InnoDB storage engine needs to use the index, it fetches the root node's page number from that place to access the index.
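The growth process can be sketched as follows: the root's page number stays fixed; when the root fills up, its records are copied out into new pages and the root turns into a directory page. The page numbers and the capacity of 3 are toy assumptions; further splits are omitted:

```python
CAPACITY = 3  # toy page capacity; real InnoDB pages hold far more records

class BPlusSketch:
    """Sketch: page 30 is the root forever; growth copies the root's records
    into new pages and the root becomes a directory over them."""
    def __init__(self):
        self.root = 30                  # fixed root page number
        self.pages = {30: []}
        self.next_page_no = 31
        self.root_is_leaf = True

    def insert(self, key):
        if self.root_is_leaf and len(self.pages[self.root]) == CAPACITY:
            # Copy the root's records into page a, split into pages a and b,
            # then turn the root into a directory page of (min key, page no).
            records = self.pages[self.root]
            a, b = self.next_page_no, self.next_page_no + 1
            self.next_page_no += 2
            self.pages[a], self.pages[b] = records[:2], records[2:]
            self.pages[self.root] = [(records[0], a), (records[2], b)]
            self.root_is_leaf = False
        if self.root_is_leaf:
            self.pages[self.root].append(key)
            self.pages[self.root].sort()
        else:
            # Walk the directory: last entry whose min key <= new key.
            target = self.pages[self.root][0][1]
            for min_key, page_no in self.pages[self.root]:
                if min_key <= key:
                    target = page_no
            self.pages[target].append(key)
            self.pages[target].sort()   # further splits omitted in this sketch

t = BPlusSketch()
for k in (1, 3, 5, 4):
    t.insert(k)
print(t.root, t.pages[t.root])   # 30 [(1, 31), (5, 32)] -- the root never moved
```

Records move out of page 30; page 30 itself never does, which is exactly why InnoDB can record the root's page number once in a fixed place.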

3.4.2 Uniqueness of directory entry records in inner nodes

We know that a directory entry in an internal node of a B+ tree index pairs an indexed column value with a page number, but for a secondary index this pairing is a little too loose. Taking the index_demo table as an example again, suppose the table's data looks like this:

c1 c2 c3
1 1 ‘u’
3 1 ‘d’
5 1 ‘y’
7 1 ‘a’

If a secondary index's directory entries contained only the indexed column + page number, the B+ tree built on column c2 would look like this:

If we now want to insert a new row with c1, c2, c3 = 9, 1, 'c', we run into a big problem modifying the B+ tree of the c2 secondary index: the two directory entry records stored in page 3 both have a c2 value of 1, and the new record's c2 value is also 1, so should the new record go into page 4 or page 5? The answer: sorry, there is no way to tell.

For a newly inserted record to find its page, we must ensure that the directory entry records within the same level of the B+ tree are unique apart from the page-number field. So the directory entry record of a secondary index's internal node actually consists of three parts:

  • The value of the index column
  • The primary key
  • Page number

In other words, the primary key value is added to the directory entry records of the secondary index's internal nodes, which guarantees that within each level of the B+ tree every directory entry record is unique apart from the page-number field. The diagram of the secondary index on column c2 should therefore actually look like this:

Now, when inserting the record (9, 1, 'c'), we can first compare the new record's c2 value with the c2 values of the directory entries in page 3; if the c2 values are equal, we then compare the primary key values. Because the combination of c2 value and primary key is guaranteed to differ between directory entries in the same level of the B+ tree, a unique directory entry record is always found, and in this case the new record should be inserted into page 5.
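The uniqueness fix can be sketched with tuple comparison: directory entries of a secondary index compare as (c2 value, primary key), so a new record always has exactly one target page. The entries follow the text's example, with pages 4 and 5 under directory page 3:

```python
import bisect

# Directory entry records in page 3 of the c2 secondary index:
# (c2 value, primary key, page number) -- following the text's example.
entries = [(1, 1, 4), (1, 7, 5)]   # page 4 starts at (1, 1); page 5 at (1, 7)

def target_page(c2, pk):
    """Compare (c2, pk) tuples: unique apart from the page number field."""
    keys = [(e_c2, e_pk) for e_c2, e_pk, _ in entries]
    i = bisect.bisect_right(keys, (c2, pk)) - 1
    return entries[max(i, 0)][2]

print(target_page(1, 9))   # record (9, 1, 'c'): (1, 9) > (1, 7) -> page 5
```

Without the primary key in the tuple, both entries would compare equal on c2 = 1 and the choice of page would be ambiguous, which is exactly the problem described above.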

3.4.3 A page stores at least two records

As we said earlier, a B+ tree only a few levels deep can easily store hundreds of millions of records, and queries on it are extremely fast. This works because a B+ tree is essentially a big multi-level directory that filters out many irrelevant sub-directories at each level until it finally reaches the directory holding the real data. But what if each big directory contained only one sub-directory? The directory hierarchy would become enormous, and the final directory holding the real data could store only a single record. One real user record per page? Surely not. That is why InnoDB requires a data page to store at least two records. This is also the rule we used when introducing row formats (we used it to deduce the maximum number of bytes a single-column table can store without row overflow; go back and review it if you have forgotten).

4. Index scheme in MyISAM

The b-tree index applicable to storage engines is shown in the following table:

Even if multiple storage engines support the same type of index, their implementations differ. The default index for both InnoDB and MyISAM is the B+Tree index, while Memory’s default index is a Hash index.

MyISAM engine uses B+Tree as the index structure, and the data field of the leaf node stores the address of the data record.

4.2 Principle of MyISAM index

Here’s how MyISAM works:

We know that InnoDB index (data) already contains all the complete user records in the leaves of the B+ tree of clustered indexes, while MyISAM index scheme also uses tree structure, but stores indexes and data separately:

  • The records in a table are stored in a separate file, called the data file, in the order they were inserted. The file is not divided into data pages; records are simply appended one after another. Because records are not deliberately sorted by primary key at insertion time, we cannot run a binary search over this data.
  • Tables using the MyISAM storage engine store index information in another separate file, called the index file. MyISAM creates a separate index for the table’s primary key, but instead of storing the complete user record in the index’s leaf nodes, it stores the combination of primary key value and data record address.
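A minimal sketch of this separation, under the stated assumption that the data file is just an insertion-ordered list and the index maps each key to a record address (`MyIsamSketch` and its methods are made up for illustration; real MyISAM works with files and pages, not Java collections):

```java
import java.util.*;

// Toy model of "index file + data file": rows live in insertion order in the
// data file, and the index maps a key to the row's position (its "address").
public class MyIsamSketch {
    private final List<String> dataFile = new ArrayList<>();             // rows in insertion order
    private final TreeMap<Integer, Integer> indexFile = new TreeMap<>(); // key -> row address

    public void insert(int key, String row) {
        dataFile.add(row);                       // rows are NOT sorted by key
        indexFile.put(key, dataFile.size() - 1); // index stores the address only
    }

    // a lookup walks the index first, then does a second read in the data file
    public String find(int key) {
        Integer addr = indexFile.get(key);
        return addr == null ? null : dataFile.get(addr);
    }
}
```

Note that the data file alone cannot be binary-searched, exactly as the bullet above says: order lives only in the index.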

As the figure shows, MyISAM’s index file holds only the addresses of data records. In MyISAM there is no structural difference between a primary key index and a secondary index, except that the key of the primary key index must be unique while secondary keys may repeat. If we create a secondary index on Col2, the structure of that index is as follows:

4.3 Comparison between MyISAM and InnoDB

MyISAM indexes are “non-clustered”, unlike InnoDB which has one clustered index. Summary of index differences between the two engines:

  • In the InnoDB storage engine, a single lookup on the clustered index with a primary key value finds the corresponding record, whereas MyISAM always requires a back-to-table operation; in other words, every index created in MyISAM is effectively a secondary index.
  • InnoDB’s data file is itself the index file, while MyISAM keeps the index file and the data file separate: the index file holds only the addresses of data records.
  • The data field of an InnoDB non-clustered index stores the corresponding record’s primary key value, while a MyISAM index records the address. In other words, all of InnoDB’s non-clustered indexes use the primary key as the data field.
  • MyISAM’s back-to-table operation is very fast, because it retrieves data directly from the file using the address offset. InnoDB, by contrast, first obtains the primary key and then searches the clustered index for the record; this is not slow, but it is still not as fast as accessing data directly by address.
  • InnoDB requires that a table must have a primary key (MyISAM does not). If none is explicitly specified, MySQL automatically selects a non-null column that uniquely identifies the records as the primary key. If no such column exists, MySQL automatically generates a hidden field for the InnoDB table to serve as the primary key; this field is 6 bytes long and of long integer type.
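The first and fourth bullets can be sketched with two toy lookup paths (names are hypothetical; real engines work on pages, not Java maps): InnoDB resolves a secondary key to a primary key and then searches the clustered index, while MyISAM resolves a key directly to a record address:

```java
import java.util.*;

// Illustrative contrast of the two lookup paths described above.
public class BackToTable {
    // InnoDB: secondary index maps c2 -> primary key; clustered index maps pk -> row.
    static String innodbLookup(NavigableMap<Integer, Integer> secondary,
                               NavigableMap<Integer, String> clustered, int c2) {
        Integer pk = secondary.get(c2);               // 1st B+ tree search
        return pk == null ? null : clustered.get(pk); // 2nd search: "back to the table"
    }

    // MyISAM: every index maps key -> record address; one index search + one seek.
    static String myisamLookup(Map<Integer, Integer> index, List<String> dataFile, int c2) {
        Integer addr = index.get(c2);
        return addr == null ? null : dataFile.get(addr);
    }
}
```

The MyISAM path ends with a direct positional read, while the InnoDB path always pays for a second tree descent — which is exactly the trade-off the bullets describe.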

Summary:

Understanding how indexes are implemented by different storage engines is very helpful for proper use and optimization of indexes. Such as:

  • Example 1: InnoDB’s index implementation makes it easy to understand why overly long fields are not recommended as primary keys: all secondary indexes reference the primary key, and an excessively long primary key makes every secondary index too large.
  • Example 2: Using a non-monotonic field as the primary key is not a good idea in InnoDB, because the data file is itself a B+Tree; a non-monotonic primary key forces the data file to frequently adjust itself to maintain the B+Tree properties when new records are inserted, which is very inefficient. An auto-increment field is a good choice for a primary key.

5. The cost of indexing

An index is a good thing to keep in mind, because it is expensive in space and time:

  • The cost in space
    • A B+ tree is created for each index. Each node of the B+ tree is a data page. By default, a page occupies 16KB of storage space.
  • The cost in time
    • Each B+ tree index must be modified every time data in the table is added, deleted, or updated. As we have said, the nodes at each level of a B+ tree are sorted in ascending order of index column values and form a doubly linked list, and the records within both leaf nodes and inner nodes (that is, user records and directory entry records) form a singly linked list in ascending order of index column values. Insertions, deletions, and updates may break this ordering, so the storage engine needs extra time on operations such as record shifting, page splitting, and page reclamation to maintain it. If we build many indexes, each index’s B+ tree must perform these maintenance operations, which can drag down performance.

The more indexes a table has, the more storage space it occupies, and the worse performance it performs when adding, deleting, or modifying records. In order to build good and few indexes, we need to learn the conditions under which these indexes work.

6. Rationality of MySQL data structure selection

6.1 Table Traversal

6.2 the Hash structure

Hash itself is a function, also known as a Hash function, that can make retrieving data much more efficient.

Hash algorithms convert input to output using deterministic algorithms such as MD5, SHA1, SHA2, and SHA3. The same input can always produce the same output, and given a slight deviation in the input content, there will often be different results in the output.

For example: to verify that two files are identical, you do not need to compare them byte by byte. If someone tells you the Hash of one file, you can run the same Hash function on your local file and compare the two results; if they match, the two files are (almost certainly) the same.
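A small Java sketch of this idea using the JDK’s `MessageDigest` (MD5 here only because the text mentions it; for real integrity checks a stronger hash such as SHA-256 is preferable):

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Compare two byte arrays ("files") by comparing their MD5 digests
// instead of their full contents.
public class HashCompare {
    public static boolean sameDigest(byte[] a, byte[] b) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] da = md.digest(a);
            md.reset();
            byte[] db = md.digest(b);
            return Arrays.equals(da, db); // equal digests => almost certainly equal inputs
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The digests are fixed-size (16 bytes for MD5) no matter how large the inputs are, which is what makes this comparison cheap.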

There are two common types of data structures that speed up lookups

  • Trees, such as balanced binary search trees: the average time complexity of query/insert/update/delete is O(log2N).
  • Hashes, such as HashMap: the average time complexity of query/insert/update/delete of (key, value) pairs is O(1).

Hash is usually more efficient than B+ tree. Basically, data can be found in a single search, but B+ tree needs to search from top to bottom and visit nodes for multiple times to find data. In the process, multiple I/O operations are required.

With hashing, an element with keyword k is stored in slot h(k); that is, the hash function h computes the slot position from the keyword k. The function h maps the keyword universe into the slots of the hash table T[0...m-1].

The hash function in the figure above has the potential to map two different keywords to the same location. This is called a collision and is usually solved by chaining in databases. In chaining, the elements hashed to the same slot are placed in a linked list, as shown below:
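A minimal chained hash table in Java, assuming integer keys and the simple slot function h(k) = k mod m (all names here are illustrative, not how a database implements its buckets):

```java
import java.util.*;

// Minimal chained hash table: keys that hash to the same slot
// are kept in that slot's linked list.
public class ChainedHash {
    private final LinkedList<int[]>[] slots; // each entry is {key, value}

    @SuppressWarnings("unchecked")
    public ChainedHash(int m) {
        slots = new LinkedList[m];
        for (int i = 0; i < m; i++) slots[i] = new LinkedList<>();
    }

    private int h(int key) { return Math.floorMod(key, slots.length); } // maps keys to T[0..m-1]

    public void put(int key, int value) {
        for (int[] e : slots[h(key)]) if (e[0] == key) { e[1] = value; return; }
        slots[h(key)].add(new int[]{key, value}); // collisions simply extend the chain
    }

    public Integer get(int key) {
        for (int[] e : slots[h(key)]) if (e[0] == key) return e[1]; // walk the chain
        return null;
    }
}
```

With m = 4, the keys 1 and 5 both hash to slot 1 and coexist in the same chain — exactly the collision situation the figure describes.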

See the efficiency difference between an array and a hash table lookup

```java
@Test
public void test1() {
    int[] arr = new int[100000];
    for (int i = 0; i < arr.length; i++) {
        arr[i] = i + 1;
    }
    long start = System.currentTimeMillis();
    for (int j = 1; j <= 100000; j++) {
        int temp = j;
        for (int i = 0; i < arr.length; i++) { // linear scan: O(n) per lookup
            if (temp == arr[i]) {
                break;
            }
        }
    }
    long end = System.currentTimeMillis();
    System.out.println("time: " + (end - start)); // time: 823
}
```

The HashSet version does the same 100,000 lookups in O(1) each:

```java
@Test
public void test2() {
    HashSet<Integer> set = new HashSet<>(100000);
    for (int i = 0; i < 100000; i++) {
        set.add(i + 1);
    }
    long start = System.currentTimeMillis();
    for (int j = 1; j <= 100000; j++) {
        int temp = j;
        boolean contains = set.contains(temp); // hash lookup: O(1) on average
    }
    long end = System.currentTimeMillis();
    System.out.println("time: " + (end - start)); // time: 5
}
```

Hash structures are efficient, so why should index structures be trees?

  • Possible reason 1: Hash indexes can only serve (=), (<>), and IN queries. For range queries, a hash index degrades to O(n) time complexity, while the “ordered” nature of a tree still maintains O(log2N) efficiency.
  • Reason 2: Hash indexes have another defect: the data is stored unordered. For ORDER BY queries, a Hash index would still require the data to be re-sorted.
  • Reason 3: For a composite (joint) index, the Hash value is computed from all the index keys combined, so a Hash index cannot serve queries on a single key or a subset of the keys.
  • Reason 4: Hash indexes are usually more efficient for equality queries, but there is one exception: if the index column has many duplicate values, efficiency drops. When Hash collisions occur, traversing the row pointers in a bucket to compare and find the query key is time-consuming. Hash indexes are therefore usually avoided on columns with many duplicate values, such as gender or age.
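Reason 1 can be demonstrated with the JDK’s own structures: a `TreeMap` (ordered, standing in for a tree index) answers a range query from a sorted sub-view, while a `HashMap` forces a scan of every key:

```java
import java.util.*;

// Sketch of reason 1: an ordered index answers a range query by taking a
// contiguous slice of the sorted structure; a hash index must examine every entry.
public class RangeQuery {
    // "tree style": ordered structure, range is a sub-view, O(log n + k)
    static List<Integer> treeRange(TreeMap<Integer, String> idx, int lo, int hi) {
        return new ArrayList<>(idx.subMap(lo, true, hi, true).keySet());
    }

    // "hash style": unordered buckets force a full O(n) scan
    static List<Integer> hashRange(HashMap<Integer, String> idx, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (int k : idx.keySet()) if (k >= lo && k <= hi) out.add(k);
        Collections.sort(out); // hash iteration order is arbitrary, so sort afterwards
        return out;
    }
}
```

Both return the same answer, but the hash version touched every key and then had to sort — which is the O(n) degradation (plus re-sorting, reason 2) described above.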

Hash indexes for storage engines are shown in the following table:

Applicability of Hash indexes:

Hash indexes have many limitations. B+ tree indexes are more widely used in databases, but there are some scenarios where Hash indexes are more efficient. For example, in key-value databases, the core of Redis storage is Hash tables.

MySQL’s Memory storage engine supports Hash indexes. If you need to query a temporary table, you can use the Memory engine and put a Hash index on some field; for example, a long string field can be reduced to a few bytes after the Hash calculation. A Hash index is a good choice when a field’s values have low repetition and equality queries on it are frequent.

In addition, InnoDB itself does not support user-created Hash indexes, but it does provide an Adaptive Hash Index. When does the adaptive Hash index come into play? If certain data is accessed frequently and certain conditions are met, the address of its data page is stored in a Hash table, so that the next query can locate the page directly. This gives the B+ tree the benefit of a Hash index.

The adaptive Hash index is used to speed up the locating of leaf nodes according to SQL query conditions, especially when the B+ tree is deep. The adaptive Hash index can significantly improve the data retrieval efficiency.

We can check whether adaptive Hash is enabled by using the innodb_adaptive_hash_index variable, for example:

show variables like '%adaptive_hash_index';

6.3 Binary search tree

If we use a binary tree as an index structure, the number of DISK I/OS is related to the height of the index tree.

  1. Characteristics of a binary search tree
  • A node can have at most two children; that is, the degree of a node cannot exceed 2.
  • Left child < current node; right child >= current node (larger values go to the right, smaller values go to the left).
  2. Search rules

In a Binary Search Tree, searching for a node follows the same rules as inserting one. Assuming the value to search for or insert is key:

  • If the key is larger than the root node, the search is performed in the right subtree.
  • If the key is smaller than the root node, the search is performed in the left subtree.
  • If the key is equal to the root node, which means the node is found, return the root node.
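The three rules above translate directly into code; here is a bare-bones binary search tree (illustrative only, with no balancing):

```java
// Plain binary search tree: the same comparisons drive both insert and search.
public class Bst {
    final int key;
    Bst left, right;

    Bst(int key) { this.key = key; }

    public Bst insert(int k) {
        if (k < key) { if (left == null) left = new Bst(k); else left.insert(k); }
        else         { if (right == null) right = new Bst(k); else right.insert(k); }
        return this;
    }

    public boolean contains(int k) {
        if (k == key) return true;                              // equal: found
        if (k > key) return right != null && right.contains(k); // larger: go right
        return left != null && left.contains(k);                // smaller: go left
    }
}
```

Inserting the text’s sequence (34, 22, 89, 5, 23, 77, 91) starting from 34 as the root reproduces the balanced-looking tree in the first figure; inserting an already-sorted sequence produces the degenerate “linked list” tree discussed next.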

For example, the binary search tree we created for the sequence (34, 22, 89, 5, 23, 77, 91) is shown below:

But there are special cases where the depth of the binary tree is very large. For example, if the data is (5,22,23,34,77,89,91), the binary search tree is created as shown below:

The second tree above is also a binary search tree, but has been reduced to a linked list in terms of performance, with a time complexity of O(N) to find data.

To improve query efficiency, you need to reduce disk I/OS. To reduce the number of DISK I/OS, you need to reduce the tree height as much as possible. You need to change the original “tall and thin” tree structure into “short and fat”. The more branches in each layer of the tree, the better.

6.4 the AVL tree

In order to solve the problem of Binary search Tree degenerated into a linked list, people put forward Balanced Binary search Tree, also called AVL Tree (different from AVL algorithm), which adds constraints on the basis of Binary search Tree and has the following properties:

  • It is an empty tree, or the absolute difference in height between its left and right subtrees is no more than 1, and both subtrees are themselves balanced binary trees.

There are many kinds of common balanced binary trees, including the balanced binary search tree, the red-black tree, the heap, and the splay tree. The balanced binary search tree was the earliest self-balancing binary search tree proposed, and when we say “balanced binary tree” we generally mean the balanced binary search tree. The first tree in the figure above is in fact a balanced binary search tree, with a search time complexity of O(log2n).

The data query time depends on disk I/O. Even if we refine the binary tree into a balanced binary search tree, the depth of the tree is still O(log2n); when n is relatively large, the depth is relatively high, as in the situation below. Each node access requires one disk I/O operation, so for the tree above we need five I/Os. Although the balanced binary tree is efficient in comparisons, its depth is still high, which means the number of disk I/O operations affects overall query efficiency.

For the same data, what if we change the binary tree to an m-tree (M>2)? When M=3, the same 31 nodes can be stored by the following trigeminal tree:

You can see that when N is large and the tree has more branches (M>2), the height of the tree is much smaller than that of a binary tree. So we need to change the tree from “tall and thin” to “short and fat”.
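A quick back-of-envelope check of this claim: the function below computes how many levels a fully branching M-way tree needs before it can reach at least n keys, i.e. roughly ⌈log_M n⌉ (a simplification that ignores keys stored in inner nodes):

```java
// Minimum number of levels h such that an M-way tree with full fan-out
// can reach at least n keys, i.e. the smallest h with M^h >= n.
public class TreeHeight {
    public static int levels(long n, int m) {
        int h = 0;
        long reach = 1;
        while (reach < n) { // multiply fan-out until we can cover n keys
            reach *= m;
            h++;
        }
        return h;
    }
}
```

For 31 keys, a binary tree needs 5 levels but a 3-way tree only 4; for a million keys, binary needs 20 levels while fan-out 100 needs only 3 — the “short and fat” effect, and the same arithmetic behind the 100-order B-tree claim in the next section.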

6.5 B – Tree

A B-tree is a Balance Tree, that is, a multi-way balanced search tree, abbreviated B-Tree (note that the bar means the two words are connected; it is not a minus sign). Its height is much smaller than that of a balanced binary tree. The structure of a B-tree is shown in the figure below. As a multi-way balanced search tree, each node of a B-tree can contain at most M children, where M is called the order of the B-tree. Each disk block contains keywords and pointers to child nodes; if a disk block contains x keywords, the number of pointers is x+1. A 100-order B-tree with three levels can store up to about 1 million index entries. For large amounts of index data, the B-tree structure is very suitable, because the height of the tree is much smaller than that of a binary tree.

An m-order B-tree (M>2) has the following properties:

  • The range of sons of the root node is [2,M].
  • Each intermediate node contains K-1 keywords and K children. The number of children is equal to the number of keywords +1. The value of K is [ceil(M/2), M].
  • The leaf node contains k-1 keywords (the leaf node has no children), and the value range of k is [ceil(M/2), M].
  • Assume the keywords of an intermediate node are Key[1], Key[2], …, Key[k-1], sorted in ascending order, that is, Key[i] < Key[i+1]. The node also holds k pointers P[1], P[2], …, P[k], where P[1] points to the subtree whose keys are less than Key[1], P[i] points to the subtree whose keys lie between Key[i-1] and Key[i], and P[k] points to the subtree whose keys are greater than Key[k-1].
  • All leaf nodes are in the same layer.

The B-tree in the diagram above is a third-order B-tree. Looking at disk block 2, its keywords are (8, 12) and it has three children: (3, 5), (9, 10), and (13, 15). Note that (3, 5) are less than 8, (9, 10) lie between 8 and 12, and (13, 15) are greater than 12, which matches exactly the properties just listed.

Now let’s see how a search works in a B-tree. Suppose the keyword we want to find is 9; the search proceeds in the following steps:

  • We compare it with the keyword (17, 35) of the root node. If 9 is less than 17, we get pointer P1.
  • Find disk block 2 according to pointer P1, keyword is (8, 12), because 9 is between 8 and 12, so we get pointer P2;
  • Follow pointer P2 to find disk block 6 with keyword (9, 10), and then we find keyword 9.
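The three-step descent above can be sketched as follows, modeling each “disk block” as a node with sorted keys and keys+1 child pointers (each recursive call stands in for one disk I/O; the tree is hand-built for illustration, not a full B-tree implementation):

```java
// Sketch of B-tree search: at each node, scan the sorted keys to pick which
// pointer to follow; a node with x keys has x+1 child pointers.
public class BTreeNode {
    final int[] keys;           // sorted, e.g. {17, 35}
    final BTreeNode[] children; // length keys.length + 1; null marks a leaf-level block

    BTreeNode(int[] keys, BTreeNode[] children) {
        this.keys = keys;
        this.children = children;
    }

    public boolean search(int target) {
        int i = 0;
        while (i < keys.length && target > keys[i]) i++;       // choose pointer P[i+1]
        if (i < keys.length && keys[i] == target) return true; // found in this block
        if (children == null) return false;                    // reached a leaf: absent
        return children[i].search(target);                     // descend: one more "disk I/O"
    }
}
```

Reproducing the figure’s layout (root (17, 35); disk block 2 = (8, 12) with children (3, 5), (9, 10), (13, 15)), searching for 9 visits exactly three blocks, as in the three steps above.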

As you can see, a B-tree search does not involve many comparisons, and comparing data in memory after it has been read in takes negligible time. Reading a disk block itself requires an I/O operation, which consumes far more time than in-memory comparison and is the dominant factor in data lookup time. A B-tree requires fewer disk I/O operations than a balanced binary tree, so it is more efficient for data queries. As long as the tree is short enough and the number of I/Os is low enough, query performance improves.

Summary:

  • If the tree is unbalanced when nodes are inserted or deleted, the self-balance of the tree can be maintained by automatically adjusting the position of nodes.
  • Keyword sets are distributed in the whole tree, that is, both leaf nodes and non-leaf nodes store data. The search may end at a non-leaf node
  • Its search performance is equivalent to a binary search in the whole set of keywords.

Example 1:

6.6 B + Tree

B+Tree is also a multi-way search tree, improved on the basis of the B-tree. Mainstream DBMSs all support the B+Tree index, MySQL among them. Compared with the B-tree, the B+Tree is better suited to file index systems, which is why MySQL uses it.

Differences between B+ trees and B trees:

  • A node with k children has k keywords; that is, the number of children equals the number of keywords, whereas in a B-tree the number of children equals the number of keywords plus 1.
  • Keywords of non-leaf nodes also exist in child nodes and are the maximum (or minimum) of all keywords in child nodes.
  • Non-leaf nodes are used only for indexing and do not hold data records; record-related information is stored in the leaf nodes. In a B-tree, by contrast, non-leaf nodes hold both indexes and data records.
  • All keywords appear in the leaf node, which forms an ordered linked list, and the leaf node itself is linked in order of keyword size from smallest to largest.
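The last bullet is what makes range scans cheap in a B+ tree. A sketch with hand-linked leaves (illustrative names; inner nodes are omitted since only the leaf chain matters here):

```java
import java.util.*;

// B+ tree leaves form an ordered linked list: a range scan finds the first
// qualifying entry and then just follows sibling links.
public class BPlusLeaf {
    final int[] keys;  // sorted keys in this leaf
    BPlusLeaf next;    // link to the right sibling

    BPlusLeaf(int... keys) { this.keys = keys; }

    public static List<Integer> rangeScan(BPlusLeaf first, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (BPlusLeaf leaf = first; leaf != null; leaf = leaf.next) {
            for (int k : leaf.keys) {
                if (k > hi) return out;  // keys are globally ordered: stop early
                if (k >= lo) out.add(k); // inside [lo, hi]: collect
            }
        }
        return out;
    }
}
```

A B-tree has no such leaf chain, so an equivalent range query must repeatedly re-descend the tree; this is one concrete reason B+ trees suit database range queries.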

Both B trees and B+ trees can be used as data structures for indexes. In MySQL, B+ trees are used. However, both B trees and B+ trees have their own application scenarios. It is not true that B+ trees are better than B trees and vice versa.

Question 1: To reduce IO, will the index tree be loaded at once?

  1. A database index is stored on disk; when the amount of data is large, the index itself can be several gigabytes in size.
  2. When we query through the index, we cannot load gigabytes of index into memory at once. Instead, we load one disk page at a time, because each disk page corresponds to a node of the index tree.

Question 2: What is the storage capacity of B+ trees? Why does it take only 1 to 3 disk I/OS to find row records

The storage capacity is very strong. If the root page can hold 100 entries and each directory page below it can hold 1,000 entries, then the second level can address 100 × 1,000 records, the third level 100 × 1,000 × 1,000, and the fourth level 100 × 1,000 × 1,000 × 1,000. Why are at most 3 loads needed? Because the root page is normally already resident in memory, a tree of at most 4 levels requires loading at most 3 pages from disk.
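The arithmetic above, written out (the fan-out numbers 100 and 1,000 are the text’s assumptions, not fixed InnoDB constants):

```java
// Records addressable by a B+ tree whose root holds `rootFanout` entries and
// whose lower directory pages each hold `pageFanout` entries.
public class BPlusCapacity {
    public static long capacity(int rootFanout, int pageFanout, int levels) {
        long total = rootFanout;            // level 1: the root page
        for (int i = 2; i <= levels; i++) { // every extra level multiplies fan-out
            total *= pageFanout;
        }
        return total;
    }
}
```

With 4 levels this reaches 100 × 1,000³ = 100 billion records, while a lookup still touches only levels − 1 = 3 pages beyond the memory-resident root.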

Question 3: Why are B+ trees better than B- trees for file and database indexes in practical operating systems?

1. The disk read and write cost of B+ tree is lower

  • The internal nodes of a B+ tree do not carry pointers to the keywords’ record data, so the internal nodes are smaller than those of a B-tree. If all the keywords of one internal node are stored in the same disk block, the block can hold more keywords, more keywords can be read into memory in a single read, and the number of disk reads and writes is correspondingly reduced.

2. The query efficiency of B+ tree is more stable

  • Because non-leaf nodes do not point to the final record contents but are only indexes over the keywords in the leaf nodes, every keyword lookup must follow a path from the root to a leaf. All keyword lookup paths have the same length, so every query has the same, stable efficiency.

Because of this, B+ tree queries are more stable, and the ordered leaf chain also makes them well suited to fast range lookups.

Question 4: Difference between Hash index and B+ tree index

We have discussed the structure of the B+ tree index. A Hash index is structured differently, so the two are used differently:

  1. Hash indexes cannot be used for range queries, whereas B+ trees can. The data a Hash index points to is unordered, while the leaf nodes of a B+ tree form an ordered linked list.
  2. Hash indexes do not support the leftmost-prefix principle of a composite index (that is, part of a composite index cannot be used on its own), whereas B+ trees do. With a Hash index, the Hash value is computed from all the composite index keys combined rather than separately for each key, so a query using only one or some of the composite keys cannot use the index.
  3. Hash indexes do not support ORDER BY sorting, because the data they point to is unordered and therefore cannot help sort optimization, whereas a B+ tree index is ordered and can be used to optimize ORDER BY on the indexed field. For the same reason, Hash indexes cannot serve fuzzy LIKE queries, whereas a B+ tree index can optimize a LIKE prefix query (for example, a pattern ending with %).
  4. InnoDB does not support user-created Hash indexes.

Question 5: Are Hash indexes and B+ tree indexes manually specified when building an index?

Neither is specified per query. When we create a table and then insert data, the storage engine maintains the corresponding indexes behind the scenes; if a new secondary index is added later, the engine builds that index over the existing data.

6.7 R tree

R-tree is rarely used in MySQL; it supports only the geometry data type, and only the MyISAM, BDB, InnoDB, NDB, and Archive storage engines support it. An example of what R-trees can do in practice: find all restaurants within 20 miles. Without an R-tree, the usual approach is to split a restaurant’s coordinates (x, y) into two fields stored in the database, one for longitude and one for latitude. We would then have to traverse all restaurants to obtain their locations and compute whether each one meets the requirement. If there are 100 restaurants in an area, that means 100 distance calculations, which is certainly not feasible for large applications such as Google Maps or Baidu Maps. The R-tree solves this high-dimensional space search problem well: it extends the B-tree idea to multi-dimensional space, partitions the space using the B-tree approach, and merges and splits nodes on insertion and deletion to keep the tree balanced. The R-tree is therefore a balanced tree for storing high-dimensional data, and compared with the B-tree its advantage lies in range search.

6.8 summary

Using indexes helps us quickly locate the desired data among massive amounts of data. However, indexes also have disadvantages: they occupy storage space and reduce the performance of database write operations, and having many indexes increases the time spent choosing among them. When using indexes, we need to balance their benefit (improved query efficiency) against the cost of maintaining them.

In practical work, we also need to determine whether to use index based on demand and the distribution of data itself. Although index is not a panacea, it is unimaginable not to use index when there is a large amount of data. After all, the essence of index is to help us improve the efficiency of data retrieval.

7. Time complexity of the algorithm

The same problem can be solved by different algorithms, and the quality of one algorithm will affect the efficiency of the algorithm and even the program. The aim of algorithm analysis is to select suitable algorithm and improve algorithm.

Refer to the article

Chapter 6 “MySQL Technology Insider: InnoDB Storage Engine (version 2)” “Database index Design and Optimization”