Interviewer: “Chen, tell me about the SQL optimization techniques you commonly use.” Chen Xiaoha: “Oh, there are plenty. For example, don't use SELECT *, because its query efficiency is low. Blah blah blah…”

Interviewer: “Why not use SELECT *? Where exactly is it inefficient?”

Interviewer: “Hmm…” Chen Xiaoha: “Emmm… that's all I've got.”

Chen Xiaoha: “…?? (What is that supposed to mean?)”

Interviewer: “Hmm… OK, is there anything you'd like to ask me?” Chen Xiaoha: “Ask you what? Just give me back my resume!”

Whether at work or in interviews, we have all heard the advice not to use “SELECT *” in SQL. We hear it so often that it feels familiar, yet most people's understanding stays shallow, and few get to the bottom of it and explore the underlying principles.

Without further ado, this article takes an in-depth look at why “SELECT *” is inefficient and where the cost comes from.

This article is dense with practical content. Bring your own tea, and bookmark it if you don't have time to read it now. (Advice from a programmer who has been chewed out by his tech manager for years.)

First, the causes of low efficiency

The MySQL part of the Alibaba Java development manual states:

4-1. [Mandatory] In table queries, never use * as the list of query fields; the required fields must be stated explicitly.

Description:

  • It increases the parsing cost of the query parser.

  • Adding or removing fields can easily leave the query inconsistent with the resultMap configuration.

  • Useless fields add network overhead, especially fields of the text type.

The development manual outlines several reasons; let's take a closer look:

1. Unneeded columns increase data transfer time and network overhead

  • With “SELECT *” the database has to resolve more objects, fields, permissions, attributes, and related metadata. When SQL statements are complex and hard parses are frequent, this puts a heavy burden on the database.

  • It increases network overhead. * sometimes pulls in large, useless text fields such as log or IconMD5 by mistake, and the amount of data transferred grows sharply. The overhead is especially noticeable when the database and the application are not on the same machine.

  • Even when the MySQL server and client are on the same machine, they typically still talk over a protocol such as TCP, so the communication still takes extra time.
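The transfer cost described above can be made concrete with a rough sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `app_log` table, its columns, and the `payload_chars` helper are all hypothetical names made up for this illustration. Counting characters in the result set only approximates what would cross the network; it is not a measurement of the MySQL wire protocol.

```python
import sqlite3

# Hypothetical schema for illustration: `app_log` and its columns are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_log (id INTEGER PRIMARY KEY, level TEXT, detail TEXT)")
conn.executemany(
    "INSERT INTO app_log (level, detail) VALUES (?, ?)",
    [("INFO", "x" * 10_000) for _ in range(100)],  # 100 rows, each with a 10,000-char text field
)

def payload_chars(sql):
    """Rough result-set size: total characters across all returned values."""
    return sum(len(str(value)) for row in conn.execute(sql) for value in row)

select_star = payload_chars("SELECT * FROM app_log")
needed_only = payload_chars("SELECT id, level FROM app_log")

print("SELECT *         :", select_star, "chars")
print("SELECT id, level :", needed_only, "chars")
```

In this sketch the whole difference between the two results comes from the `detail` column that the caller never asked for.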

2. Large, unneeded fields such as VARCHAR, BLOB, and TEXT add I/O operations

To be precise, when a value is longer than 768 bytes, the excess data is stored in a separate location (an overflow page), so reading such a record adds an I/O operation. (MySQL InnoDB; the 768-byte in-row prefix applies to the COMPACT and REDUNDANT row formats.)

3. The MySQL optimizer loses the chance to apply the “covering index” strategy

SELECT * rules out covering indexes, and the covering-index strategy of the MySQL optimizer is a fast, efficient query optimization that is widely recommended in the industry.

For example, take a table t(a, b, c, d, e, f), where a is the primary key and b has an index.

There are then two B+ trees on disk, the clustered index and the secondary index, storing (a, b, c, d, e, f) and (a, b) respectively. If the where condition can filter out some records through the index on column b, the query goes through the secondary index first. If the user only needs columns a and b, the secondary index alone is enough to return the requested data.

If the user uses SELECT * and retrieves columns that are not needed, the rows are first filtered through the secondary index, and then all columns are fetched through the clustered index. That is an additional B+ tree lookup, so it is bound to be slower.

The secondary index holds far less data than the clustered index. In many cases a covering index (a secondary index from which all required columns can be read) can therefore serve the query straight from memory without reading the disk, while the clustered index data may well be on disk, depending on the buffer pool size and hit rate. In that case one is a memory read and the other a disk read, and the speed difference is significant, close to an order of magnitude.
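The difference between the two access paths can be observed in a query plan. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's sqlite3 module rather than MySQL, and the index name `idx_b` and the `plan` helper are invented here. SQLite labels the first case "COVERING INDEX"; in MySQL the comparable signal is "Using index" in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table t(a..f) as in the text: a is the primary key, b has a secondary index.
conn.execute(
    "CREATE TABLE t (a INTEGER PRIMARY KEY, b INTEGER, c TEXT, d TEXT, e TEXT, f TEXT)"
)
conn.execute("CREATE INDEX idx_b ON t (b)")  # idx_b is a name chosen for this sketch

def plan(sql):
    """Return the human-readable detail text of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column holds the detail string

narrow_plan = plan("SELECT a, b FROM t WHERE b = 5")  # answerable from the index alone
star_plan = plan("SELECT * FROM t WHERE b = 5")       # must also visit the table rows

print(narrow_plan)
print(star_plan)
```

The first plan should mention a covering index, while the second uses the same index only to locate rows and then reads the full row from the table.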


Second, more on indexes

Secondary indexes were mentioned above. This section looks more closely at one kind of secondary index: the joint (composite) index.

● Joint index (a, b, c)

A joint index (a, b, c) effectively provides three indexes: (a), (a, b), and (a, b, c).

We can think of a joint index as the level-1, level-2, and level-3 directories of a book. For index (a, b, c), a is the level-1 directory, b is the level-2 directory under a, and c is the level-3 directory under b. To use a directory you must first use its parent directory, except for the level-1 directory.

As follows:

| WHERE condition | Effect |
| --- | --- |
| where a=1 and c=1 | c is the level-3 directory; because the level-2 directory b is not used, the level-3 directory cannot be used |
| where a=1 and b=1 | Only the level-1 and level-2 directories are used |
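The directory rule can be sketched as a small function. This is a simplified model written for illustration, not how the MySQL optimizer is implemented: `usable_prefix` is a hypothetical helper that considers only equality conditions and ignores range conditions and other optimizer features.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that equality conditions can use.

    Walks the index left to right and stops at the first column without
    an equality condition, mirroring the "parent directory first" rule.
    """
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break
        used.append(column)
    return used

index = ["a", "b", "c"]
print(usable_prefix(index, {"a", "c"}))  # conditions on a and c
print(usable_prefix(index, {"a", "b"}))  # conditions on a and b
print(usable_prefix(index, {"b", "c"}))  # no condition on the leading column a
```

Under this model, skipping b stops the walk at a, and a query with no condition on a cannot use the index at all.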

● Advantages of joint indexes

1) Lower overhead

Creating one joint index (a, b, c) is equivalent to creating three indexes: (a), (a, b), and (a, b, c). Every additional index adds write overhead and disk space. For tables with a lot of data, using a joint index can greatly reduce this overhead.

2) Covering index

Given the joint index (a, b, c), consider the following SQL:

SELECT a, b, c FROM table_name WHERE a = 'xx' AND b = 'xx';

MySQL can then fetch the data directly by traversing the index, without going back to the table, which avoids many random I/O operations. Reducing I/O, especially random I/O, is one of the main optimization strategies for DBAs. In real applications, covering indexes are therefore one of the main ways to improve performance.

3) High efficiency

The more columns in the index, the fewer rows remain after filtering through the joint index. Take a table with 10 million (1000W) rows and the following SQL:

select col1, col2, col3 from table_name
 where col1=1 and col2=2 and col3=3;

Assumption: each condition matches 10% of the data.

  • A. With only a single-column index, the index narrows the data down to 10,000,000 × 10% = 1,000,000 rows. MySQL then goes back to the table to find the rows matching col2=2 and col3=3 among those 1,000,000, and then sorts them, paginates them, and so on.

  • B. With a joint index (col1, col2, col3), the three index columns narrow the data down to 10,000,000 × 10% × 10% × 10% = 10,000 rows. The efficiency gain is easy to imagine.
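The arithmetic in cases A and B can be checked with a few lines. This is only a worked calculation under the text's assumption that every condition keeps 10% of the rows; `rows_after` is a hypothetical helper, and real selectivity depends on the actual data distribution.

```python
TOTAL_ROWS = 10_000_000  # the 10-million-row table from the example
KEEP_ONE_IN = 10         # assumption from the text: each condition keeps 10% of the rows

def rows_after(conditions_in_index):
    """Rows left after filtering by the given number of indexed conditions."""
    rows = TOTAL_ROWS
    for _ in range(conditions_in_index):
        rows //= KEEP_ONE_IN
    return rows

single_column = rows_after(1)  # case A: index on col1 only
joint_index = rows_after(3)    # case B: joint index on (col1, col2, col3)

print("single-column index:", single_column)
print("joint index        :", joint_index)
print("reduction factor   :", single_column // joint_index)
```

Under these assumptions the joint index leaves 100 times fewer rows to fetch from the table than the single-column index.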

● Are more indexes always better?

The answer, of course, is no

  • Tables with little data do not need indexes; creating them only adds index overhead.

  • Do not index columns that are rarely referenced; an index on such a column brings little benefit.

  • Avoid indexing columns that are updated frequently, because this hurts insert and update performance.

  • Fields whose values are highly repetitive and evenly distributed gain little from an index (for example, a gender field with only male and female values is not suitable for indexing).

  • Every data change requires the indexes to be maintained, so more indexes mean higher maintenance cost.

  • More indexes also require more storage space

Third, closing thoughts

I believe anyone who has read this far either has a real passion for MySQL or just likes scrolling the mouse wheel. Meeting here is fate; if you learned something from this article, please don't be stingy with a like, and no freeloading ~

A friend asked me: you care so much about SQL conventions, so do you never use SELECT * in your own code?

How could that be? I use it every day, and it is in my code too (embarrassed face). In fact, our projects are generally small, the data volume is low, and performance has not hit a bottleneck, so I go easy on myself.

I wrote this article mainly because online summaries of this topic are few, scattered, and inconsistent. This is a more detailed summary for myself and for you, and it is worth remembering. Tell it to the interviewer and he won't be able to find fault with you.
