Mysql > index

  1. Index definition

An index is a structure that sorts the values of one or more columns in a database table. An index is used to provide quick access to specific information in a database table. If you want to find a particular employee by his or her last name, an index helps you get the information faster than searching all the rows in a table.

One of the main purposes of an index is to speed up retrieval of table data, that is, auxiliary data structures that help information searchers quickly find record IDs that meet the constraints.

  1. Type of index

1) Primary key index

It is a special unique index that does not allow null values. A table can have only one primary key

2) unique index

The values of a unique indexed column must be unique, but null values are allowed. If it is a composite index, the combination of column values must be unique.

3) Index

This is the most basic index, and it has no restrictions.

C/C++, Linux, Golang, Nginx, Zzeromq, MySQL, Redis, FastDFS, MongoDB, ZK, Streaming Media, CDN, Linux, Linux, Golang, Nginx, Zzeromq, MySQL, Redis, FastDFS, MongoDB, ZK, Streaming Media, CDN, etc. P2P, K8S, Docker, TCP/IP, Coroutine, DPDK, FFMPEG, etc.)

4) FullText FullText

Full-text index (also called full-text retrieval) is a key technology used by search engines. It can use a variety of algorithms such as word segmentation technology to intelligently analyze the frequency and importance of key words in the text, and then intelligently screen out the search results we want according to certain algorithm rules.

5) Composite indexes

Composite index, where an index contains more than one column.

  1. Index structure

MySQL generally uses B+Tree as the index, but the implementation is different according to the cluster index and non-cluster index, which will not be introduced in detail here, and will be explained in detail in the following chapters.

  1. The test data

We now have an order table with the following structure:

  1. The Left-Most Prefix Matching Rule is a very important rule. MySQL will match from left to right until it encounters a range query (>, <, between, like) and stops matching. Create an index
CREATE INDEX IDX_T_1 USING BTREE ON xa87_v2.t_xa87_order_info (store,delivery_msg,food_fee,delivery_fee);

We execute an SQL

Explain select * from t_xa87_order_info where delivery_msg=' Food_fee >1000 'and food_fee>1000; +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | | NULL ALL | NULL | NULL | NULL | 62017 | | 3.33 Using the where | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+

There is no store field in WHERE, so the IDX_T_1 index is not used.

Let’s add “store” to the query and see the effect:

Explain the select * from t_xa87_order_info where store = 'f9fd2705ad1d740a4bef42833b487cea' and delivery_msg = 'Gao Yue static, 17836031' and food_fee=1000; +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+ | 1 | SIMPLE | t_xa87_order_info | NULL | ref | IDX_T_1 | IDX_T_1 | 267 | const,const,const | 1 | 100.00 | NULL | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+
explain select * from t_xa87_order_info where store='f9fd2705ad1d740a4bef42833b487cea' and delivery_msg like '%17836031'  and food_fee=1000; +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- -+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows |  filtered | Extra | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | ref | IDX_T_1 | IDX_T_1 | 131 | const | | 1.11 633 | Using index condition | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- -+-----------------------+

The first query is more efficient because the second query uses’ delivery_msg ‘on the condition and does not continue matching food_fee

  1. Select the highly differentiated column as the index

High differentiation means less of the same value. For example, the gender of the field, only male, female two values, the distinction is very low, it is not suitable for the index.

  1. Expand indexes and try not to create new indexes

If you want to add index (a,b) to a table that already has index (a,b), you simply need to change the original index.

Second, query optimization

  1. Do not invalidate the index

1) SELECT * FROM ‘LIKE’ WHERE ‘LIKE’ DOES NOT START WITH A WILDCARD

explain select * from t_xa87_order_info where order_no like '2022%'; +----+-------------+-------------------+------------+-------+---------------+---------+---------+------+------+--------- -+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+-------+---------------+---------+---------+------+------+--------- - + -- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | range | PRIMARY | PRIMARY | 130 | NULL | 1 | | 100.00 Using where | +----+-------------+-------------------+------------+-------+---------------+---------+---------+------+------+--------- -+-------------+
explain select * from t_xa87_order_info where order_no like '%2022'; +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | | NULL ALL | NULL | NULL | NULL | 62017 | | 11.11 Using the where | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+

2) Where the condition must conform to the leftmost prefix matching principle, which has already been described above

3) Don’t use it! = and < >

explain select * from t_xa87_order_info where created_date<>'2020-08-24'; +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | | ALL IDX_T_2 | NULL | NULL | NULL | 62017 | | 50.00 Using the where  | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+

4) Don’t make null judgments

explain select * from t_xa87_order_info where created_date is not null; +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | | ALL IDX_T_2 | NULL | NULL | NULL | 62017 | | 90.00 Using the where  | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+

5) Don’t use OR

6) The columns in = and IN can be out of order

Explain select * from t_xa87_order_info where delivery_msg=' dd_fee,17836031' and food_fee=1000 and store='f9fd2705ad1d740a4bef42833b487cea'; +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+ | 1 | SIMPLE | t_xa87_order_info | NULL | ref | IDX_T_1 | IDX_T_1 | 267 | const,const,const | 1 | 100.00 | NULL | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------------------+---- --+----------+-------+

We put the store at the end of the query condition, and found that we could still match IDX_T_1 index, and the efficiency was the same as the store at the beginning.

Create an index on created_date

CREATE INDEX IDX_T_2 USING BTREE ON xa87_v2.t_xa87_order_info (created_date);
explain select * from t_xa87_order_info where created_date='2020-08-24'; +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- -+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- + -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | ref | IDX_T_2 | IDX_T_2 | 3 | const | 123 | | NULL | 100.00 +----+-------------+-------------------+------------+------+---------------+---------+---------+-------+------+--------- -+-------+ explain select * from t_xa87_order_info where DATE_FORMAT(created_date,'%Y-%m-%d')='2020-08-24'; +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- ------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+- -- -- -- -- -- -- -- -- -- -- -- -- + | | 1 SIMPLE | t_xa87_order_info | NULL | | NULL ALL | NULL | NULL | NULL | 62017 | | 100.00 Using the where |  +----+-------------+-------------------+------------+------+---------------+------+---------+------+-------+----------+ -------------+

You can see that the first query uses an index and the second does not.

  1. Query SQL optimization

1) avoid select ‘*’

During SQL parsing, the ‘*’ is converted into all column names in turn, which is done by querying the data dictionary, which means that it takes more time. So, you should get into the good habit of taking what you need.

2) Order by optimization

Rewrite the ORDER BY statement to use the index

Create another index for the column used.

Definitely avoid expressions in the ORDER BY clause;

3) Group by optimization

You can improve the efficiency of the GROUP BY statement BY filtering out unwanted records before the GROUP BY statement

4) exists in

5) Use VARCHAR/NVARCHAR instead of CHAR/NCHAR

6) Use DISTINCT instead of GROUP BY

7) You can use a UNION ALL

UNION ALL does not execute the SELECT DISTINCT function, which eliminates many unnecessary resources.

8) Join a table with an example of the same type and index it

If there are many JOIN queries, you should make sure that the Join fields in both tables are indexed. In this way, MySQL internally initiates the mechanism to optimize the SQL statement of the JOIN for you.

The fields to be joined should be of the same type. For example, if you were to Join a DECIMAL field with an INT field, MySQL would not be able to use their indexes. For those strings, the same character set is required. (It is possible that the character sets of the two tables are different)

3. Transaction

  1. Transaction characteristics

Atomicity (A) : All operations in A transaction are either completed or not completed, and do not end somewhere in between. An error occurring during the execution of a transaction can be rolled back to the state before the transaction started, as if the transaction had never been executed.

Consistency (C) : The integrity of the database is not compromised before and after the transaction begins. This means that the written data must fully conform to all preset rules, including the accuracy of the data, the concatenation of the data, and the ability of the subsequent database to do the intended work spontaneously.

Isolation (I) : the ability of the database to allow multiple concurrent transactions to read, write and modify its data at the same time. Isolation can prevent data inconsistency due to cross-execution during concurrent execution of multiple transactions;

Persistence (D) : After the transaction ends, changes to the data are permanent;

  1. Isolation level

Read uncommitted read repeatable read serializable Read uncommitted read repeatable read serializable

  1. Read Uncommitted

Thing A and Thing B, the uncommitted data of Thing A, the uncommitted data of Thing B can be read and the data that is read here is called “dirty data” and this is the lowest level of isolation, and this is usually A theoretical level, and the database isolation level is usually higher than that

  1. Read Committed

A and B things, things that A submitted data, things to read into this isolation level is higher than B read uncommitted In other words, the other things after submit the data, I can read the current things to this level can avoid the “dirty data” this isolation level will lead to “unrepeatable reads” Oracle default isolation level

  1. Repeatable Read

Transaction A and transaction B, after transaction A has committed, transaction B can’t read transaction B is repeatable read and that isolation level is higher than read committed in other words, I can’t read the data after the other party has committed and that isolation level prevents me from reading the non-repeatable read, Readable reads. For example, if you read from a point 1 and a point 2, it’s at the same MySQL default level. So you can get a repeatable read, but it will cause a phantom read.

  1. Serializable

Transaction A and transaction B. When transaction A operates the database, transaction B can only queue and wait. This isolation level is rarely used, the throughput is low, and the user experience is poor