From: juejin. Cn/post / 684490…

When we write SQL statements, we tend to focus on the results of SQL execution, but whether we really pay attention to the efficiency of SQL execution, whether we pay attention to the specification of SQL writing?

The following dry goods sharing is summarized in the actual development process, I hope to help you!

1. Limit paging optimization

Limit efficiency is very low when the offset is very large.

SELECT id FROM A LIMIT 1000,10

SELECT ID FROM A LIMIT 9000010 slow

Solution a:

select id from A order by id limit90000, 10; Copy the codeCopy the code

If we use order by together. Very quickly, 0.04 seconds is OK. Because the primary key id is used as the index!

Of course, whether or not you can use an index depends on your business logic, but be careful when using it for paging purposes!

Scheme 2

select id from A order by id between 90000 and 90010; Copy the codeCopy the code

2. Use limit 1 and top 1 to obtain a row

Some business logic makes a query operation (especially when taking the maximum amount based on a certain field DESC). You can terminate [database index] by using limit 1 or top 1 to continue scanning the entire table or index.

counter-examples

SELECT id FROM A LIKE 'abc%'Copy the codeCopy the code

Is case

SELECT id FROM A LIKE 'abc%' limit1 Copy codeCopy the code

3. Do not use select * from table under any circumstances, replace “*” with a list of specific fields, and do not return fields that are not needed to avoid a full scan!

counter-examples

SELECT * FROM A to copy codeCopy the code

Is case

SELECT id FROM A copy codeCopy the code

4. Batch insert optimization

counter-examples

INSERT into person(name,age) values('A',24)
INSERT into person(name,age) values('B',24)
INSERT into person(name,age) values('C'24) Copy codeCopy the code

Is case

INSERT into person(name,age) values('A', 24), ('B', 24), ('C',24), copy codeCopy the code

SQL statement optimization mainly lies in the correct use of the index, and we often make a mistake in the development of the table is to complete the scan, one affect performance, and to consume time!

5. Optimization of like statement

counter-examples

SELECT id FROM A WHERE name like '%abc%'Copy the codeCopy the code

Because ABC is preceded by “%”, this query must follow the full table query. Do not prefix the keyword with % unless necessary (fuzzy queries need to include ABC)

Is case

SELECT id FROM A WHERE name like 'abc%'Copy the codeCopy the code

The instance

Mysql version: 5.7.26

select nick_name from member where nick_name like Xiao Ming '% %'Copy the codeCopy the code

Like ‘% xiaoming %’ does not use index!

select nick_name from member where nick_name like 'xiaoming %'Copy the codeCopy the code

Select * from index like’ xiaoming %’

6. Where clause uses an optimization of OR

It is often better to replace “or” with union all or union. If the OR keyword is used in the WHERE clause, the index will be discarded.

counter-examples

SELECT id FROM A WHERE num = 10 or num = 20Copy the code

Is case

SELECT ID FROM A WHERE num= 10 union all SELECT ID FROM A WHERE num=20 Copy codeCopy the code

7. The where clause uses IS NULL or IS NOT NULL optimizations

counter-examples

SELECT ID FROM A WHERE num IS NULLCopy the code

If you use IS NULL or IS NOT NULL in the WHERE clause, the index will be discarded

A full table query

.

Is case

Select * from table_name where num = 0;

The USE of IS NULL IS highly used in SQL in actual business scenarios

We should pay attention to
Avoid full table scan

SELECT id FROM A WHERE num=0Copy the code

8. Optimize expression operations on fields in the WHERE clause

Do not perform functions, arithmetic operations, or other expression operations to the left of the “=” in the WHERE clause, or the system may not use the index properly.

  • 1
SELECT id FROM A WHERE datediff(day,createdate,'2019-11-30')=0 copy codeCopy the code

Optimization for

SELECT id FROM A WHERE createdate>='2019-11-30' and createdate<'2019-12-1'Copy the codeCopy the code
  • 2
SELECT id FROM A WHERE year(addate) <2020Copy the code

Optimization for

SELECT id FROM A where addate<'2020-01-01'Copy the codeCopy the code

9. Sort index problems

Mysql queries only use an index, so if the WHERE clause already uses an index, the order BY column does not use an index. So the database default sort can meet the requirements of the situation do not use sort operation;

Try not to include more than one column sort, and create composite indexes for those columns if necessary.

10. Try to replace union all with union

The main difference between union and union all is that the former requires the combination of two (or more) result sets and then the uniqueness filtering operation, which will

It involves sorting, adding a lot of CPU processing, increasing resource consumption and latency

. So when we can

Verify that duplicate result sets are not possible or do not matter

“, use union all instead of union

Inner join and left join, right join, subquery

  • Inner join left/ rightJoin left/ rightJoin left/rightjoin
SELECT A.id,A.name,B.id,B.name FROM A LEFT JOIN B ON A.id =B.id; SELECT A.id,A.name,B.id,B.name FROM A RIGHT JOIN ON B A.id= B.id; SELECT A.id,A.name,B.id,B.name FROM A INNER JOIN ON A.id =B.id; Copy the codeCopy the code

After many aspects of the confirmation

inner join

Performance is faster because an inner join is an equivalent join and may return fewer rows. But remember that some statements implicitly use equivalent concatenations, such as:

SELECT A.id,A.name,B.id,B.name FROM A,B WHERE A.id = B.id;

Use inner Join as much as possible

  • Second: sub-query performance is slower than external join performance, try to use external join to replace the sub-query.

counter-examples

If the outer table is A large table, we can expect query performance to be worse than this.

Select* from A where exists (select * from B whereid>=3000 and A.uuid=B.uuid); Copy the codeCopy the code

Execution time: about 2s

Is case

Select* from A inner join B ON A.uuid=B.uuid whereb.uuid>=3000; This statement executes the test in less than a second; Copy the codeCopy the code

Execution time: less than 1s

  • Third: When using JOIN, you should use small results to drive large results

Left JOIN The left table results should be as small as possible, if there are conditions should be placed on the left side of the first processing, right join the same reverse. Such as:

counter-examples

Select * from A left join B A.id=B.ref_id whereA.id>10 Copy the codeCopy the code

Is case

select * from (select * from A wehre id >10) T1 left join B on T1.id=B.ref_id; Copy the codeCopy the code

12. Exist & in optimization

SELECT * from A WHERE id in(SELECT id from BCopy the code
SELECT * from A WHERE id EXISTS (SELECT 1 from a.id = B.idCopy the code

Analysis:

In is traversing the comparison in memory

Exist needs to query the database, so when the data volume of B is large, the efficiency of EXISTS is better than that of in**

In () is executed only once, caching all ids in table B, and then checking whether the IDS in table A are equal to those in table B. If the ids are equal, the records in table A are added to the result set until all records in table A are traversed.

The flow principle of an In operation is similar to the following code

    List resultSet={};

    Array A=(select * from A);
    Array B=(select id from B);

    for(int i=0; i<A.length; i++) {for(int j=0; j<B.length; j++) {if(A[i].id==B[j].id) {
             resultSet.add(A[i]);
             break; }}}returnresultSet; Copy the codeCopy the code

As you can see, it is not appropriate to use in() when table B is large, because table B will be traversed all at once

For example, if table A has 10,000 records and table B has 1,000,000 records, it is possible to traverse 10,000 x 1,000,000 records at most, which is inefficient.

Another example: Table A has 10000 records and Table B has 100 records, so it is possible to traverse 10000*100 times at most, which greatly reduces the number of traverse times and greatly improves the efficiency.

  

Conclusion: In () is suitable for the case where the data of table B is smaller than that of table A

Exist () will execute A.length() times, and the execution process code is as follows

List resultSet={};
Array A=(select * from A);
for(int i=0; i<A.length; i++) {if(exists(A[I].id) {// Execute select 1 from BwhereResultset.add (A[I]); }}returnresultSet; Copy the codeCopy the code

Use exists() when table B is larger than table A because it does not have as many traversals and only needs to perform one more query.

For example, if there are 10,000 records in table A and 1,000,000 records in table B, exists() performs 10,000 executions to determine whether the ids in table A are equal to those in table B.

For example, if there are 10000 entries in table A and 100000000 entries in table B, then exists() should be executed 10000 times because it only executes A.length. The more data in table B, the better exists() is.

If exists(A) has 10000 entries and exists(B) has 100 entries, it is better to use in() to perform 10000*100 times, because in() does the comparison in memory, and exists() needs to query the database.

We all know that querying a database consumes more performance and memory is faster.

Conclusion: Exists () applies to the case where table B has larger data than table A