This article summarizes common operations and common anti-patterns, along with useful material gathered while optimizing slow queries. It is intended for developers who work with MySQL.

 

First, indexes

 

1. Index cardinality

 

Cardinality is the number of distinct values a column contains. For example, a column containing the values 1, 3, 7, 4, 7, 3 has a cardinality of 4.

 

An index works best when its cardinality is high relative to the number of rows in the table (that is, the column contains many distinct values with few duplicates).

 

If a column stores ages, with many distinct values, an index can quickly narrow down the matching rows. If a column records gender (only “M” and “F” values), an index is of little use: since the two values occur about equally often, a search for either value will match roughly half the rows.

 

In cases like this it is best not to use an index at all, because when the query optimizer finds that a value is present in a high percentage of the table’s rows, it will generally ignore the index and perform a full table scan. The commonly cited threshold is around 30%.

 

2. Causes of index failure

 

  • Performing operations on an indexed column, including +, -, *, /, !=, <>, %, or LIKE with a leading wildcard ('%...');

  • Type mismatch, e.g. the column is VARCHAR but the WHERE condition supplies a number (the implicit conversion defeats the index);

  • Applying a function to an indexed column, e.g. SELECT * FROM template t WHERE ROUND(t.logicdb_id) = 1; in this case a function-based index is needed. MySQL 5.7 supports this via generated (virtual) columns; before that, the only option was to maintain a separate column holding ROUND(t.logicdb_id);

  • If the condition contains an OR, the index is not used even if some of the columns are indexed (which is why OR is discouraged). If you want to use OR and still have indexes apply, every column in the OR condition must be indexed;

  • If the column type is a string, make sure the value in the condition is quoted, otherwise the index is not used;

  • A B-tree index is not used for IS NULL but is used for IS NOT NULL; a bitmap index is used for both IS NULL and IS NOT NULL;

  • Composite indexes follow the leftmost-prefix principle.
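As a sketch of the function case above (table and column names follow the example in the text), on MySQL 5.7+ a generated virtual column lets the rounded value be indexed directly:

```sql
-- WHERE ROUND(logicdb_id) = 1 cannot use a plain index on logicdb_id.
-- MySQL 5.7+: index a generated (virtual) column holding the expression.
ALTER TABLE template
  ADD COLUMN logicdb_id_rounded INT AS (ROUND(logicdb_id)) VIRTUAL,
  ADD INDEX idx_logicdb_id_rounded (logicdb_id_rounded);

-- The 5.7 optimizer can now use idx_logicdb_id_rounded for:
SELECT * FROM template WHERE ROUND(logicdb_id) = 1;
```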

 

3. Creating indexes

 

  • Most important, of course, is indexing the statements the business queries frequently;

  • Try to choose highly selective columns as indexes. The selectivity formula is COUNT(DISTINCT col)/COUNT(*), the ratio of distinct values to total rows. The higher the ratio, the fewer records each lookup has to scan;

  • For fields that are unique in the business, it is best to create a unique key. This both guarantees data correctness and greatly improves index efficiency.

 

Second, useful information in EXPLAIN

 

1. Basic Usage

 

  • Prefix your SQL with DESC or EXPLAIN;

  • Prefixing your SQL with EXPLAIN EXTENDED and then running SHOW WARNINGS is also useful: it shows the statement that is actually executed. Often, differently written queries turn out to be the same SQL after the optimizer rewrites them.
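A quick sketch of the second point (the table name is hypothetical):

```sql
-- EXPLAIN EXTENDED is deprecated as of 5.7, where plain EXPLAIN
-- includes the extended output by default
EXPLAIN EXTENDED SELECT * FROM user_info WHERE id = 1;

-- shows the rewritten statement the server will actually execute
SHOW WARNINGS;
```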

 

2. Performance-enhancing features

 

  • Covering index: when all the columns a query needs are present in the index, they can be read from the index alone, with no back-to-table lookup; the EXTRA column shows "Using index";

  • Index Condition Pushdown (ICP): originally the index was only an access path; rows fetched by the storage engine through the index and the back-to-table lookup were passed up to the MySQL server layer for WHERE filtering. Since version 5.6, with ICP enabled, WHERE conditions that use index columns are pushed down to the storage engine layer; EXTRA shows "Using index condition". (MySQL's architecture is split into the server layer and the storage engine layer.)

  • Index merge: scans several indexes separately for their conditions and then merges the results (intersect/union). If the OR condition is really an AND, consider creating a composite index instead. EXPLAIN shows type index_merge, and EXTRA shows the specific merge algorithm and the indexes used.
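A sketch of the first two features, using a hypothetical people table with a composite index on (zipcode, lastname):

```sql
-- Covering index: both selected columns live in the index,
-- so EXTRA shows "Using index"
EXPLAIN SELECT zipcode, lastname FROM people WHERE zipcode = '95054';

-- ICP (5.6+): lastname LIKE '%son%' cannot bound the index range,
-- but it can be checked inside the storage engine before the table
-- lookup; EXTRA shows "Using index condition"
EXPLAIN SELECT * FROM people
 WHERE zipcode = '95054' AND lastname LIKE '%son%';
```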

 

3. Extra field

 

  • Using filesort: MySQL performs an external sort on the data instead of reading rows in the table's index order; a sort that cannot be satisfied by an index is called a filesort;

  • Using temporary: a temporary table is used to hold intermediate results, typically when sorting or grouping query results with ORDER BY or GROUP BY;

  • Using index: the SELECT is served by a covering index, avoiding access to the table's data rows; efficiency is good;

  • Impossible WHERE: The value of the where clause is always false and cannot be used to retrieve any tuple;

  • Select tables optimized away: a MIN/MAX operation answered from an index, or a COUNT(*) on a MyISAM table, can be resolved without reading the table, provided there is no GROUP BY clause;

  • Distinct: Optimizes the DISTINCT operation to stop searching for the same value once the first matching tuple is found.

 

Using filesort and Using temporary both indicate that the query needs optimization. GROUP BY sorts its result by the grouping columns by default; if that ordering is not needed, append ORDER BY NULL to avoid the filesort.
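A sketch of the ORDER BY NULL trick (table name hypothetical); note that the implicit GROUP BY sort applies before MySQL 8.0:

```sql
-- the implicit sort on room_id may show "Using filesort" in EXPLAIN
SELECT room_id, COUNT(*) FROM room_favorite_info GROUP BY room_id;

-- suppress the implicit sort when the order does not matter
SELECT room_id, COUNT(*) FROM room_favorite_info
 GROUP BY room_id ORDER BY NULL;
```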

 

4. The type field

 

  • system: the table has only one row (a system table). A special case of const;

  • const: the table is accessed by comparing a primary key or unique index against constants. Because at most one row can match, it is very fast. If you place the primary key in the WHERE list, MySQL can convert that part of the query into a constant;

  • eq_ref: unique index scan; for each index key, exactly one row in the table matches it. Common with primary key or unique index scans;

  • ref: a non-unique index scan that returns all rows matching a single value. It is essentially an index access, but because several rows may match the condition, it is a mixture of lookup and scan;

  • range: retrieves only rows in a given range, using an index to select them. The key column shows which index is used. This typically appears with BETWEEN, <, >, IN, and similar conditions in WHERE. It is better than a full index scan because the scan only needs to start at one point and end at another;

  • index: full index scan. The difference from ALL is that the index type traverses only the index tree. This is usually faster than ALL because the index file is usually smaller than the data file. In other words, both ALL and index read the whole table, but index reads from the index while ALL reads from the data file on disk;

  • ALL: full table scan; the whole table is traversed to find the matching rows.

 

Reference Address:

https://blog.csdn.net/DrDanger/article/details/79092808

 

Third, field types and encoding

 

1) MySQL string length functions

 

CHARACTER_LENGTH (the same as CHAR_LENGTH) returns the number of characters, while the LENGTH function returns the number of bytes; under utf8, a multi-byte character such as a Chinese character occupies three bytes.
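For example, under the utf8 character set, where each CJK character takes three bytes:

```sql
-- ASCII: characters and bytes coincide
SELECT CHAR_LENGTH('abc'), LENGTH('abc');   -- 3 characters, 3 bytes

-- CJK text: LENGTH counts bytes, CHAR_LENGTH counts characters
SELECT CHAR_LENGTH('中文'), LENGTH('中文');  -- 2 characters, 6 bytes
```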

 

2) Calculating a prefix-index length for VARCHAR and similar fields

 

SELECT COUNT(DISTINCT LEFT(test, 5)) / COUNT(*) FROM table; the closer the result is to 1, the better the prefix discriminates.

 

3) MySQL utf8

 

MySQL's utf8 character set stores at most 3 bytes per character and does not support emoji; utf8mb4 must be used instead. You need to set the client character set to utf8mb4 in the MySQL configuration file.

 

JDBC connection strings do not support setting characterEncoding=utf8mb4. The best approach is to specify an initialization SQL in the connection pool, e.g. for the Hikari pool: spring.datasource.hikari.connection-init-sql=SET NAMES utf8mb4 (other connection pools are similar). Otherwise, SET NAMES utf8mb4 has to be executed before every SQL statement.

 

4) MySQL collations (_bin and _general_ci)

 

  • utf8_general_ci is case insensitive; ci is short for case insensitive;

  • utf8_general_cs would be case sensitive, but MySQL does not support *_general_cs collations; utf8_bin is used instead;

  • utf8_bin stores each character of a string as binary data and is case sensitive.

 

So what is the difference between utf8_general_cs and utf8_bin, given that both are case sensitive?

 

  • cs stands for case sensitive; bin means binary, i.e. comparison by binary code;

  • Under utf8_general_cs, some Western European and Latin characters that differ only in accents still compare as equal, which is sometimes not what you want; hence utf8_bin;

  • utf8_bin operates on the binary encoding of the characters: any difference in encoding compares as unequal, so under utf8_bin, 'A' <> 'a'.

 

5) In SQLyog, specify the encoding type for the initial connection via the initialization command in the connection configuration.

 

 

Fourth, summary of SQL statements

 

1. Frequently used but easily forgotten

 

  • Do not insert if there is a primary key or unique key conflict: INSERT IGNORE INTO ...

  • Update if there is a primary key or unique key conflict (note that this still advances the auto-increment counter): INSERT INTO room_remarks(room_id, room_remarks) VALUES (1, 'sdf') ON DUPLICATE KEY UPDATE room_remarks = '234'

  • Delete the conflicting row and insert the new one: REPLACE INTO room_remarks(room_id, room_remarks) VALUES (1, 'sdf')

  • Back up a table: CREATE TABLE user_info_backup SELECT * FROM user_info (the new table's name must differ from the source table's)

  • Copy a table's structure: CREATE TABLE user_v2 LIKE user

  • Populate from a query: INSERT INTO user_v2 SELECT * FROM user, or INSERT INTO user_v2(id, num) SELECT id, num FROM user

  • Multi-table update: UPDATE user a, room b SET a.num = a.num + 1 WHERE a.room_id = b.id

  • Multi-table delete: DELETE user FROM user, black WHERE user.id = black.id

 

2. Locking (rarely used; listed for understanding)

 

  • Shared (read) lock: SELECT id FROM tb_test WHERE id = 1 LOCK IN SHARE MODE;

  • Exclusive (write) lock: SELECT id FROM tb_test WHERE id = 1 FOR UPDATE;

 

3. Used in optimization

 

  • Force an index: SELECT * FROM table FORCE INDEX(idx_user) LIMIT 2;

  • Ignore an index: SELECT * FROM table IGNORE INDEX(idx_user) LIMIT 2;

  • Bypass the query cache: SELECT SQL_NO_CACHE * FROM table LIMIT 2;

 

4. Check the status

 

  • Check the character set: SHOW VARIABLES LIKE 'character_set%';

  • Check the collation rules: SHOW VARIABLES LIKE 'collation%';

 

5. Points to note when writing SQL

 

  • WHERE conditions are parsed from right to left;

  • Use the deferred join technique to optimize very deep paging (a LIMIT with a huge offset); a deferred join avoids back-to-table lookups for the skipped rows;

  • DISTINCT is expensive and can be optimized with GROUP BY;

  • Try not to join more than three tables.
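A sketch of the deferred join (table name and offset are hypothetical):

```sql
-- naive deep pagination: reads and discards 100000 full rows
SELECT * FROM user_info ORDER BY id LIMIT 100000, 10;

-- deferred join: the subquery pages over the covering primary-key
-- index; only the final 10 rows are fetched from the table
SELECT u.*
  FROM user_info u
  JOIN (SELECT id FROM user_info ORDER BY id LIMIT 100000, 10) t
    ON u.id = t.id;
```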

 

Fifth, pitfalls

 

  • If a table has an auto-increment column, TRUNCATE resets the auto-increment counter to its initial value, which matters in scenarios where the auto-increment value is used as a business ID;

  • When no rows match (or all matched values are NULL), SUM(a) returns NULL, not 0;

  • Comparing a value with NULL using = or <> always yields UNKNOWN, which WHERE and HAVING treat as false; therefore use a IS NULL instead.
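The NULL pitfalls above can be sketched as follows (table t is hypothetical):

```sql
-- matches nothing, even rows where a IS NULL: a = NULL is UNKNOWN
SELECT * FROM t WHERE a = NULL;

-- the correct way to test for NULL
SELECT * FROM t WHERE a IS NULL;

-- SUM over zero matching rows is NULL, not 0; coalesce if 0 is needed
SELECT IFNULL(SUM(a), 0) FROM t WHERE 1 = 0;
```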

 

Sixth, online modification of tables with tens of millions of rows

 

Modifying a MySQL table structure locks the table and blocks service requests. MySQL introduced online DDL in 5.6, but it still locks the table in some cases, so the pt tools (Percona Toolkit) are generally used.

 

Adding an index to a table:

 

pt-online-schema-change --user='root' --host='localhost' --ask-pass --alter "add index idx_user_id(room_id,create_time)" D=fission_show_room_v2,t=room_favorite_info --execute

 

Seventh, slow query logs

 

Sometimes, when an online request times out, you should look at the slow query log. Slow query analysis is easy: find the location of the slow query log file and run mysqldumpslow on it. To find the slow-log settings, you can run SQL commands to view the related variables; the common ones are as follows:
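Commonly, the relevant variables can be inspected like this:

```sql
-- is the slow log enabled, and where is the file written?
SHOW VARIABLES LIKE 'slow_query_log%';

-- statements running longer than this many seconds are logged
SHOW VARIABLES LIKE 'long_query_time';
```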

 

 

The mysqldumpslow tool is quite simple; the main arguments I use are as follows:

 

  • -t: limits the number of output lines; I usually take the top ten;

  • -s: what to sort by; the default is at (average query time). I also often use c (query count), because very frequent queries are worth optimizing even if each execution is not slow, and t (query time), to see which statements are particularly stuck;

  • -v: displays detailed information.

 

Example: mysqldumpslow -v -s t -t 10 mysql_slow.log.2018-11-20-0500

 

Eighth, viewing SQL processes and killing processes

 

If you execute a SQL statement and it does not return, you can query the process list to see its actual execution state; if it is too time-consuming, you can kill its process with the KILL command. Viewing the process list also shows the current SQL execution state directly: when the database load is high, the list may show a large number of processes stuck and executing for a long time.

 

The command is as follows:

 

-- View the process list
SHOW PROCESSLIST;

-- Kill a process
KILL 183665;

 

If you use SQLyog, there is also a graphical page for this: menu bar -> Tools -> Show -> Process list. On the process list page, you can right-click to kill a process, as follows:

 

 

 

 

Ninth, some thoughts on database performance

 

 

When optimizing the company's slow query logs, the most common finding is simply a forgotten index; problems like that are easily solved by adding one. But a few cases cannot be solved just by adding an index:

 

1. The business code queries the database in a loop

 

Consider a scenario: a user's follower list is fetched ten entries per page. The SQL for this is very simple, and the indexed query itself performs well. However, in some cases developers fetch a list of IDs and then loop over them, querying each ID's information individually; with many IDs this puts a lot of pressure on the database, and the performance is low.
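A minimal sketch of the fix (table name hypothetical): batch the lookups instead of looping.

```sql
-- instead of running this once per id inside an application loop:
--   SELECT * FROM user_info WHERE id = ?;
-- fetch the whole batch in one round trip:
SELECT * FROM user_info WHERE id IN (1, 2, 3, 4, 5);
```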

 

2. Statistical SQL

 

In many cases the business has leaderboards, and the company was found to compute them directly in the database in many places. Aggregations over large tables often take more than five seconds, and such SQL is generally long and hard to optimize. For scenarios like this, if the business allows it (for example, the consistency requirement is loose or the statistics only need to be produced periodically), compute the statistics separately rather than in the serving database. I also recommend Redis caching for this type of workload.

 

3. Deep pagination

 

A large number of queries with huge LIMIT offsets showed up in the slow query log. But looking at the related business code, normal business logic would not issue such requests, so they were most likely a malicious user scraping the interface. It is best to add validation to the interface during development to intercept these malicious requests.

 

That's all for this article; I hope it helps!