Before writing the article, first of all thanks to fei you technology teacher Lu provided the document.

For the record, the following optimizations are based on mysql-indexBTree type”

A, the EXPLAIN

To do MySQL optimization, we should use EXPLAIN to view the SQL execution plan.

Here’s a simple example, annotating (1,2,3,4,5) the data we want to focus on

  1. Type column, connection type. A good SQL statement must be at least range level. Prevent the occurrence of all levels
  2. Key column, the name of the index used. If no index is selected, the value is NULL. Mandatory indexes can be used
  3. Key_len column, index length
  4. Rows, the number of rows scanned. This value is an estimate
  5. Extra column, details. Note that common unfriendly values are: Using filesort, Using temporary

2. SQL statements should not contain too many values IN

MySQL optimizes IN to store all constants IN an array that is sorted. However, if the number is large, the consumption is also relatively large. Select id from t where num in(1,2,3); Or use connections instead.

The SELECT statement must specify the name of the field

SELECT * adds a lot of unnecessary consumption (CPU, IO, memory, network bandwidth); Increased the possibility of using overridden indexes; When the table structure changes, the front break also needs to be updated. Therefore, the field name is required to be followed directly by select.

4. Use limit 1 when only one data item is required

This is to make the type column in EXPLAIN const

5, if the sort field does not use index, as little sort as possible

6. If the other fields in the constraint have no index, use or as little as possible

If one of the fields on both sides of or is not an index field and the other conditions are not index fields, the query does not remove the index. Many times it is better to use union all or union(if necessary) instead of “or”

B. Union all. C. union all

The main difference between union and Union all is that the former requires the combination of result sets and then carries out the unique filtering operation, which involves sorting, increases a large number of CPU operations, and increases resource consumption and delay. Of course, the premise of union all is that the two result sets have no duplicate data.

ORDER BY RAND()

select id from `dynamic` order by rand() limit 1000;Copy the code

The above SQL statement can be optimized to

select id from `dynamic` t1 join (select  rand() * (select max(id) from `dynamic`) as nid) t2 on t1.id > t2.nid limit 1000;Copy the code

Distinguish between in and exists, not in and not exists

Select * from table A where id in (select id from table B)Copy the code

The SQL statement above is equivalent to

Select * from A where exists(select * from B where b.id = a.id)Copy the code

The distinction between in and EXISTS mainly results in a change in driver order (which is key to performance changes). In the case of exists, the outer layer table is the driver table and is accessed first. In, the subquery is executed first. So IN is suitable for the situation where the outside is large and the inside is small; EXISTS applies to a situation where the outer surface is small and the inner surface is large. For not in and NOT EXISTS, it is recommended to use not EXISTS, not in May have logic problems, not in is not just a matter of efficiency. How to efficiently write a SQL statement that replaces not EXISTS?

The original SQL statement

The select colname... From A where a.id not in (select B. id from B)Copy the code

Efficient SQL statements

The select colname... From A Left join B on a.id = b.id where B.id is nullCopy the code

The extracted result set is shown in the figure below, with data in table A not in Table B

Ten, use reasonable paging method to improve the efficiency of paging

select id,name from product limit 866613, 20Copy the code

When you do paging with the SQL statement above, you may find that using limit paging queries directly gets slower and slower as the amount of table data increases.

The optimization method is as follows: You can take the ID of the maximum number of rows on the previous page, and then limit the start of the next page based on that maximum ID. For example, in this column, the maximum ID of the previous page is 866612. SQL can be written as follows:

select id,name from product where id> 866612 limit 20Copy the code

Eleven, section query

In some user selection pages, the time range selected by some users may be too large, resulting in slow query. The main reason is that too many lines are scanned. This time can be through the program, segmented query, circular traversal, the results of the combined processing for display.

The following SQL statement can be used when the number of rows scanned exceeds millions

Avoid null values for fields in the WHERE clause

The determination of NULL causes the engine to abandon the index and perform a full table scan instead.

Fuzzy query with % prefix is not recommended

For example, LIKE “%name” or “%name%”, this query will cause index invalidation and perform full table scan. But you can use LIKE “name%”.

How to query %name%

As shown in the figure below, although the secret field was indexed, it was not used in explain results

So how to solve this problem, answer: using full-text indexes

Select id,fnum, FDST from dynamic_201606 where user_name like ‘%zhangsan%’; . Such statements, ordinary index is not sufficient to meet the query requirements. Fortunately, in MySQL, there are full-text indexes to help us.

SQL syntax for creating a full-text index is:

ALTER TABLE `dynamic_201606` ADD  FULLTEXT INDEX  `idx_user_name` (`user_name`);Copy the code

SQL statements that use full-text indexes are:

select id,fnum,fdst from dynamic_201606 where match(user_name) against('zhangsan' in boolean mode);Copy the code

Note: Before you need to create a full-text index, contact your DBA to confirm that it can be created. It is also important to note that the query statement is written differently from ordinary indexes

Avoid expression manipulation of fields in where clauses

Such as

select user_id,user_project from user_base where age*2=36;Copy the code

SQL > alter index (); this will cause the engine to abandon the index

select user_id,user_project from user_base where age=36/2;Copy the code

Avoid implicit type conversions

The type conversion occurs when the type of the column column in the WHERE clause is inconsistent with the type of the parameter passed in. It is recommended to determine the type of the parameter in the WHERE clause first

For union indexes, follow the leftmost prefix rule

For example, the index contains the fields ID,name,school, you can use the id field directly, or you can use the order ID,name, but name; No school can use this index. Therefore, when creating a federated index, you must pay attention to the order of the index fields, and put the commonly used query fields first

If necessary, you can use force index to force a query on an index

Sometimes the MySQL optimizer takes the index it sees fit to retrieve SQL statements, but it may not use the index we want. Force Index is used to force the optimizer to use our index.Copy the code

Note the range query statement

For a federated index, if there is a range query, such as between,>,<, etc., the following index field will be invalidated.

19. About JOIN optimization

  • The LEFT JOIN A table is the driver table
  • INNER JOIN MySQL automatically finds out which table has less data and uses it to drive the table
  • The RIGHT JOIN B table is the driver table

Note: MySQL does not have full join

select * from A left join B on B.name = A.name 
where B.name is null
 union all
select * from B;
Copy the code

Use inner join instead of left Join

There are at least two tables participating in the joint query, which are generally of different sizes. If the join mode is inner JOIN, MySQL will automatically select the small table as the driver table without other filtering conditions. However, left JOIN follows the principle of left drive right in the selection of the driver table, that is, the left join table is called the driver table.

Rational use of indexes

The index field of the driven table serves as the limit field of ON.

Use small tables to drive large tables

From the schematic diagram can intuitively see that if you can reduce the drive table, reduce the number of cycles in the nested loop, to reduce the total IO and CPU operation times.

Use opportunely STRAIGHT_JOIN

Inner join is selected by mysql as the driver table, but some special cases need to select another table as the driver table, such as group by, order by, etc. “Using filesort”, “Using temporary”. By STRAIGHT_JOIN to enforce the join order, the table name on the left of STRAIGHT_JOIN would be the driven table and the table name on the right would be the driven table. One prerequisite for using STRAIGHT_JOIN was that the query was an inner join, that is, an inner join. STRAIGHT_JOIN is not recommended for other links because it could result in inaccurate query results.

This method can sometimes reduce the time by three times.

Here is only a list of the above optimization scheme, of course, there are other optimization methods, we can try to explore, thank you for your attention.