(Reposted) Original link: blog.csdn.net/fujiandiyi0…

Introduction

Hi everyone, I'm Smokey. I once wrote an article called "Interviewer: what do you know about table design?", and I decided to follow it with another MySQL topic. This article covers MySQL interview questions about indexes, all in terms of the InnoDB storage engine; I haven't used MyISAM or the others. PS: I actually wanted to write this a long time ago, but I kept being lazy!

In fact, each of the questions below could fill an article of its own. And I'm not making these questions up, as shown below (note question 3).

So I went back over the commonly tested points about indexes and summarized them in this article. The main topics are as follows:

(1) How do you build indexes?

(2) How are indexes classified? Which types do you know?

(3) How do you avoid going back to the table for a query? What is a covering index?

(4) I have a column whose values are all unique and I need to index it. Should I choose a unique index or an ordinary index?

(5) What structure does the MySQL index use? Could a red-black tree be used instead?

Mysql > select * from table where id = 1;

Main text

1. How do you build indexes?

Note: I remember a funny scene from when a fan came to me:

Smokey: "Your resume says you have SQL optimization experience. How do you build indexes?"
The fan grinned and said: "I just build them..."
Smokey: "Oh (rising tone), you just build them..."
Fan: "...following all the indexing rules on the Internet."
Smokey: "How do you know which SQL statement has a problem and needs an index?"
Fan: "I..."

This is actually pretty basic, but the answer shows whether you have ever really done it.

# Enable the slow query log
slow_query_log=1
slow_query_log_file=/var/log/mysql/logLog
# Log any SQL statement whose execution time is greater than 3 seconds
long_query_time=3
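The same settings can usually be applied to a running server without editing the configuration file. The following is a minimal sketch, assuming an account that is allowed to set global variables; the 3-second threshold simply mirrors the value above.

```sql
-- Turn the slow query log on for the running server
SET GLOBAL slow_query_log = 'ON';

-- Log statements that take longer than 3 seconds
-- (a changed global value applies to connections opened afterwards)
SET GLOBAL long_query_time = 3;

-- Check the current settings, including the log file location
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
```

Values set this way do not survive a restart unless they are also written to the configuration file.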

Should you start building indexes as soon as a slow SQL statement is detected? Not necessarily. First try to optimize the SQL itself; for example, use LIMIT 1 when only one row of data is expected.

In most cases, however, the business SQL is too complex to optimize any further, so you have to build an index. At that point, follow these rules:

(1) More indexes are not always better. A large number of indexes not only occupies disk space but also slows down INSERT, DELETE, and UPDATE statements.

(2) Avoid over-indexing frequently updated tables, and keep the number of columns in each index as small as possible. Create indexes on fields that are often used in queries, and do not add unnecessary ones.

(3) Try not to index tables with little data. With so little data, scanning the table may take less time than traversing an index, so the index may bring no benefit.

(4) Create indexes on columns that have many distinct values and are often used in conditional expressions; do not create indexes on columns with few distinct values. For example, a gender field with only the values "male" and "female" does not need an index. An index there will not improve query efficiency and will seriously slow down data updates.

(5) Create indexes on columns that are frequently sorted or grouped. If several columns are sorted together, create a composite index on those columns.
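Rules (4) and (5) can be checked with a couple of statements. This is only a sketch: the tables users and orders and their columns are hypothetical and are not taken from the article.

```sql
-- Hypothetical table: users(id, gender, email, ...)
-- Selectivity = distinct values / total rows; the closer to 1, the more useful an index
SELECT COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity,
       COUNT(DISTINCT email)  / COUNT(*) AS email_selectivity
FROM users;

-- email has many distinct values and is used in WHERE clauses: a reasonable candidate
CREATE INDEX idx_users_email ON users (email);

-- Hypothetical table: orders(id, user_id, created_at, ...)
-- Rule (5): one composite index for a query that sorts by several columns
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);
SELECT id FROM orders ORDER BY user_id, created_at LIMIT 20;
```

A gender column would show a selectivity close to 0 on any table of realistic size, which is the numeric form of the advice in rule (4).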

2. How are indexes classified? Which types do you know?

From the point of view of data structure: B+ Tree index, hash index, FULLTEXT index, R-Tree index

  • Primary key index: a primary key index is a special unique index that does not allow NULL values

  • Ordinary index, or single-column index

  • Multi-column index (composite index): a composite index is an index created on multiple fields. The index is used only when the first field of the index appears in the query condition. Follow the leftmost-prefix rule when using composite indexes

  • Unique index or non-unique index

  • Spatial index: a spatial index is an index on spatial data types, which in MySQL are GEOMETRY, POINT, LINESTRING, and POLYGON.
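The index types in the list above, and the leftmost-prefix rule for composite indexes, can be sketched on one table. The table employee and all of its columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical table used only for illustration
CREATE TABLE employee (
    id         INT          NOT NULL,
    email      VARCHAR(100) NOT NULL,
    last_name  VARCHAR(50)  NOT NULL,
    first_name VARCHAR(50)  NOT NULL,
    dept_id    INT          NOT NULL,
    PRIMARY KEY (id),                      -- primary key index
    UNIQUE KEY uk_email (email),           -- unique index
    KEY idx_dept (dept_id),                -- ordinary single-column index
    KEY idx_name (last_name, first_name)   -- composite index
) ENGINE = InnoDB;

-- Both conditions include the leftmost column, so idx_name can be used
EXPLAIN SELECT * FROM employee WHERE last_name = 'Li';
EXPLAIN SELECT * FROM employee WHERE last_name = 'Li' AND first_name = 'Lei';

-- Skips the leftmost column: idx_name cannot be used to locate rows by first_name,
-- so this typically ends up as a full table scan
EXPLAIN SELECT * FROM employee WHERE first_name = 'Lei';
```

Comparing the key column of the three EXPLAIN results is a quick way to see the rule in action.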

3. How do you avoid going back to the table for a query? What is a covering index?

For a detailed treatment of this question, see the article "Innodb index principle".

Here’s a quick explanation.

Let's say I have a table table1 with a composite index on (a,b).

Execute the following SQL

select a,b from table1

The result can be found in the index itself, so there is no need to go back to the table!

select a,b,c from table1

Column c is not in the index, so the query has to go back to the table to fetch it.

MySQL can only use B+ tree indexes as covering indexes, because hash indexes, spatial indexes, and full-text indexes do not store the values of the indexed columns.
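Whether a query is covered can usually be confirmed with EXPLAIN. In the sketch below, only the index on (a, b) comes from the text; the column types, the id primary key, the WHERE condition, and the index names are assumptions made so that the example is complete.

```sql
-- Hypothetical definition of table1; only the index on (a, b) comes from the text
CREATE TABLE table1 (
    id INT NOT NULL PRIMARY KEY,
    a  INT NOT NULL,
    b  INT NOT NULL,
    c  INT NOT NULL,
    KEY idx_a_b (a, b)
) ENGINE = InnoDB;

-- Covered: a and b are both stored in idx_a_b,
-- so the Extra column is expected to contain "Using index"
EXPLAIN SELECT a, b FROM table1 WHERE a = 1;

-- Not covered: c has to be fetched from the clustered index (back to the table),
-- so "Using index" is not expected in Extra
EXPLAIN SELECT a, b, c FROM table1 WHERE a = 1;

-- One way to avoid the extra lookup: an index that also contains c
ALTER TABLE table1 ADD INDEX idx_a_b_c (a, b, c);
```

Widening an index trades extra write cost and disk space for the saved lookup, so it is only worth doing for queries that run often.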

4. I have a column whose values are all unique and I need to index it. Should I choose a unique index or an ordinary index?

Answer: the unique index.

Note: do not assume that a unique index hurts insert speed. The loss is negligible, while the gain in lookup speed is obvious. In addition, even with thorough validation at the application layer, as long as there is no unique index, Murphy's law says dirty data will appear.
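As a small illustration of letting the database enforce uniqueness, here is a sketch with a hypothetical table account and a hypothetical column phone.

```sql
-- Hypothetical table used only for illustration
CREATE TABLE account (
    id    INT         NOT NULL PRIMARY KEY,
    phone VARCHAR(20) NOT NULL
) ENGINE = InnoDB;

-- Enforce uniqueness in the database, not only in application code
ALTER TABLE account ADD UNIQUE INDEX uk_phone (phone);

INSERT INTO account (id, phone) VALUES (1, '13800000000');

-- Rejected with a duplicate-entry error (ERROR 1062),
-- because the same phone value is already present
INSERT INTO account (id, phone) VALUES (2, '13800000000');
```

The second insert fails regardless of what the application checked beforehand, which is the protection the note above is asking for.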

Here comes the follow-up question!

Why are inserts into a unique index slower than inserts into an ordinary index? And why are lookups on a unique index faster than on an ordinary index?

Advantage of the Insert Buffer: multiple inserts can be merged into a single operation, which greatly improves insert performance on non-clustered indexes.

InnoDB introduced the Change Buffer in version 1.0.x as an upgrade of the Insert Buffer. From that release on, the InnoDB storage engine can buffer inserts, deletes, and updates.

Inserts into a unique index are slower than inserts into an ordinary index because:

  • Unique indexes cannot take advantage of Change Buffer

  • Normal indexes can use Change Buffer

Which raises the next question: why can't updates to a unique index use the Change Buffer?

To guarantee uniqueness, a unique index has to load the data page into memory and check whether the uniqueness constraint is violated. But once the data page is in memory, it is simpler to update the page there directly, so the Change Buffer is no longer needed.
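On servers that expose them, the Change Buffer settings can be inspected with standard variables. This is a sketch for looking around, not a tuning recommendation; defaults differ between MySQL versions, so none are stated here.

```sql
-- Which operations are buffered: none, inserts, deletes, changes, purges, or all
SHOW VARIABLES LIKE 'innodb_change_buffering';

-- Maximum size of the change buffer, as a percentage of the buffer pool
SHOW VARIABLES LIKE 'innodb_change_buffer_max_size';

-- Change buffer activity is reported in the section
-- "INSERT BUFFER AND ADAPTIVE HASH INDEX" of the InnoDB status output
SHOW ENGINE INNODB STATUS;
```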

Finally, lookups on a unique index are faster than on an ordinary index because:

  • After finding the first record that meets the condition, an ordinary index keeps checking the next record until it reaches the first record that does not meet the condition.

  • A unique index returns as soon as it finds the first record that meets the condition, without checking the next record.
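The difference described in the two points above is usually visible in the type column of EXPLAIN. The table member and its columns are hypothetical, and the access types in the comments are what one would typically expect on a populated table, not guaranteed output.

```sql
-- Hypothetical table used only for illustration
CREATE TABLE member (
    id      INT         NOT NULL PRIMARY KEY,
    card_no VARCHAR(20) NOT NULL,
    city    VARCHAR(20) NOT NULL,
    UNIQUE KEY uk_card_no (card_no),
    KEY idx_city (city)
) ENGINE = InnoDB;

-- Unique index with equality on the whole key: at most one row can match
-- (with data in the table, EXPLAIN usually reports type = const)
EXPLAIN SELECT * FROM member WHERE card_no = 'A001';

-- Ordinary index: several rows may match, so the scan continues past the first hit
-- (EXPLAIN usually reports type = ref)
EXPLAIN SELECT * FROM member WHERE city = 'Xiamen';
```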

5. What structure does the MySQL index use? Could a red-black tree be used instead?

Why choose B+ Tree rather than B Tree?

Notice two distinct features of B Tree:

  • Every node in the tree stores data

  • Leaf nodes are not linked to each other

Notice two distinct features of B+ Tree:

  • Data only appears in leaf nodes

  • All leaf nodes are linked together by pointers

In short, the main reason database indexes use B+ Tree is that B Tree, while it improves disk I/O performance, does not solve the inefficiency of traversing elements. B+ Tree was created to solve exactly this problem: the whole tree can be traversed just by walking the leaf nodes. Moreover, range queries are very frequent in databases. With B Tree, each of them needs a partial in-order traversal, which may jump across levels and is too slow.
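The kind of query that benefits from linked leaf nodes is an ordinary range condition. The sketch below uses a hypothetical table event_log; the comments describe the general behaviour of a B+ tree range scan, not measured results.

```sql
-- Hypothetical table used only for illustration
CREATE TABLE event_log (
    id         INT      NOT NULL PRIMARY KEY,
    created_at DATETIME NOT NULL,
    KEY idx_created_at (created_at)
) ENGINE = InnoDB;

-- Range condition on the secondary index: locate the first matching leaf entry,
-- then follow the leaf chain until created_at leaves the range
EXPLAIN SELECT id FROM event_log
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

-- Range condition on the primary key walks the clustered index leaves in order
SELECT id FROM event_log WHERE id BETWEEN 1000 AND 2000 ORDER BY id;
```

With enough matching data, EXPLAIN typically reports type = range for conditions like the first one.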

Hint, my next question is:

Mysql > select * from table where id = 1;

I'd suggest taking a look at another article of mine.

The MySQL optimizer contains a component called the range optimizer, which is responsible for optimizing range queries!

It has two cost estimation methods, index dive and index statistics. The former is slow but gives accurate values; the latter is fast but the data is not necessarily accurate.
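For equality ranges such as IN lists on a non-unique index, the switch between the two methods is controlled by the system variable eq_range_index_dive_limit. The variable is not mentioned in the article; the sketch below assumes a table table1 with an index whose first column is a.

```sql
-- Number of equality ranges at which the optimizer switches
-- from index dives to index statistics
SHOW VARIABLES LIKE 'eq_range_index_dive_limit';

-- Session-level change: 0 means always use index dives,
-- however many values the IN list contains
SET SESSION eq_range_index_dive_limit = 0;

-- Hypothetical query with several equality ranges on an indexed column
EXPLAIN SELECT * FROM table1 WHERE a IN (1, 2, 3, 4, 5);
```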

To be honest, I am crying inside as I write this. Index dive and index statistics are not easy to write about.

For the index dive:

COST = CPU COST + IO COST

Here CPU COST is the cost of processing the returned records, and IO COST is the cost of reading pages.

MySQL performs the cost calculation above for each candidate index and finally executes the plan with the lowest cost.

However, in some cases the cost of performing an index dive is too high, so the optimizer estimates the cost from index statistics instead. The statistics can be viewed with:

SHOW INDEX FROM tbl_name [FROM db_name]

One of the result columns is Cardinality, an estimate of the number of distinct values in the indexed column. The larger the Cardinality, the more rows the index can exclude and the more efficient the execution.
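Because Cardinality is an estimate, it can be refreshed and read back as follows. The sketch assumes a table named table1 in the current database.

```sql
-- Recalculate the index statistics for the table
ANALYZE TABLE table1;

-- Cardinality column: estimated number of distinct values per index column
SHOW INDEX FROM table1;

-- The same figures can be read from information_schema
SELECT INDEX_NAME, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'table1';
```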

Finally

If this article helped you, give it a follow ⬇️ before you go!
