Contents

  • Preface
  • Main text
    • 1. Table structure optimization
    • 1.1 Split Fields
    • 1.2 Selection of field types
    • 1.3 Limits on the size of field types
    • 1.4 Add redundant fields reasonably
    • 1.5 New Fields must have default values
    • 2. Index
    • 2.1 Selection of index fields
    • 2.2 Make good use of MySQL features such as covering indexes and index condition pushdown
    • 2.3 Choice between unique index and normal index
    • 3. Query statement
    • 3.1 Avoiding Index Failure
    • 3.2 Write where condition fields in proper order
    • 3.3 Small tables drive large tables
    • 3.4 Force index() can be used to prevent the optimizer from selecting the wrong index
    • 4. Database and table sharding
  • Conclusion

Preface

MySQL optimization is a topic we bring up all the time, and it is a top priority: many companies have dedicated DBAs for it, and some job postings even list SQL optimization as a requirement for senior developers. Moon has recently been writing a series of articles about MySQL, including the earlier ones on indexes, and they were all paving the way for this one. Today I will talk about how to carry out SQL optimization at work along four dimensions: table structure, indexes, query statements, and database and table sharding.

Main text

1. Table structure optimization

The most basic prerequisite for SQL optimization is having a table. So how can we optimize SQL statements through the table design itself?

1.1 Split Fields

Imagine that you are the owner of a dumpling shop and you have to settle the accounts every day. There must be a balance table that records the total assets of your dumpling shop:

CREATE TABLE `accout_balance` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `account` varchar(64) NOT NULL DEFAULT '' COMMENT 'account',
  `balance` decimal(16,2) DEFAULT NULL COMMENT 'the balance',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;

Then business at the dumpling shop got better and better. The boss opened hundreds of stores, and it became a national chain.

The boss was very happy, but he noticed a problem: because business was so good, customers were paying all the time, and the system became slower and slower. Why?

Let’s analyze:

The balance field in the accout_balance table changes after every checkout, so this is a hot table. Each change opens a transaction (a standalone update statement is equivalent to a transaction), so under high concurrency the problem is obvious: because of the row lock, a transaction that touches the same row must wait for the other transaction to finish before it can run its own update statement. The system therefore gets slower and slower.

So how do we optimize for this situation? Moon's idea is to control concurrency.

Right now hundreds of stores operate on the same row. We can split that row into multiple rows, that is, divide the account balance into N parts. Each time the balance increases, one of the N records is picked, so the probability of a conflict drops to 1/N.
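The splitting idea can be sketched as follows. This is only an illustration under stated assumptions: the table name `accout_balance_split`, the `slot` column, the account name, and N = 10 are all hypothetical and not part of the original design. The slot is chosen once and stored in a user variable, because calling RAND() directly in the WHERE clause would be re-evaluated for every row.

```sql
-- Hypothetical split table: one account owns N = 10 rows instead of one.
CREATE TABLE `accout_balance_split` (
  `id` bigint NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `account` varchar(64) NOT NULL DEFAULT '' COMMENT 'account',
  `slot` tinyint unsigned NOT NULL DEFAULT 0 COMMENT 'slot number, 0..9',
  `balance` decimal(16,2) NOT NULL DEFAULT 0.00 COMMENT 'partial balance',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uk_account_slot` (`account`, `slot`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Seed the ten slots for one (hypothetical) account.
INSERT INTO `accout_balance_split` (`account`, `slot`) VALUES
  ('dumpling_shop', 0), ('dumpling_shop', 1), ('dumpling_shop', 2),
  ('dumpling_shop', 3), ('dumpling_shop', 4), ('dumpling_shop', 5),
  ('dumpling_shop', 6), ('dumpling_shop', 7), ('dumpling_shop', 8),
  ('dumpling_shop', 9);

-- Each checkout picks one slot at random, so concurrent checkouts
-- usually lock different rows.
SET @slot = FLOOR(RAND() * 10);
UPDATE `accout_balance_split`
   SET `balance` = `balance` + 25.00
 WHERE `account` = 'dumpling_shop' AND `slot` = @slot;

-- The real balance is the sum over all slots.
SELECT SUM(`balance`) AS total_balance
  FROM `accout_balance_split`
 WHERE `account` = 'dumpling_shop';
```

Increments work directly with this layout. Deductions would need extra handling, since a single slot may hold less than the amount to deduct.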

1.2 Selection of field types

This kind of optimization is fairly common, so we will keep it short.

For example, we can use char(36) as the type for a UUID, or tinyint(2) as the type for a Boolean. When we can predict the size of a field, it is better to restrict the field type accordingly and avoid wasting storage space.
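As a small illustration of the two choices just mentioned, here is a sketch with a hypothetical table `demo_field_types` (the table and column names are assumptions made for this example only):

```sql
-- Hypothetical table: fixed-size types chosen because the sizes are known.
CREATE TABLE `demo_field_types` (
  `id` char(36) NOT NULL COMMENT 'UUID in its 36-character text form',
  `is_enabled` tinyint(2) NOT NULL DEFAULT 0 COMMENT 'boolean flag: 0 = false, 1 = true',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- UUID() returns a 36-character string, which fits char(36) exactly.
INSERT INTO `demo_field_types` (`id`, `is_enabled`) VALUES (UUID(), 1);
```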

1.3 Limits on the size of field types

This is explicitly stated in our company's SQL table design specification.

There are several reasons to prefer varchar(255):

1. A varchar whose values need at most 255 bytes stores its length in one byte, while a longer one needs two bytes. Note that the limit is counted in bytes, so the number of characters that fits depends on the character set.

2. If you create an index on a varchar column, 255 characters can be indexed in full, while 256 falls back to a leftmost-prefix index. In a MyISAM table, a single-column index cannot exceed 1000 bytes, otherwise a warning is raised. In an InnoDB table (with the older COMPACT or REDUNDANT row formats), a single-column index exceeding 767 bytes raises a warning; the index is still created, but as a prefix index (255 characters, since 255 × 3 = 765 bytes with 3-byte utf8). A prefix index cannot benefit from MySQL's covering index acceleration.

3. In addition, with online DDL, extending a column within the 255 limit can be done in place, while crossing to 256 requires a table rebuild. In-place is the way indexes are created in MySQL 5.5 and the InnoDB plugin: the index is built directly on the original table without copying it to a temporary table, which is an improvement over the copy-table approach. While an index is being created in place, the original table remains readable but not writable.
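The online DDL point can be tried out with a sketch like the one below. It assumes MySQL 5.7 or later and a hypothetical table `demo_note`; with utf8mb4 each character may take up to 4 bytes, so the one-byte length limit is reached at 63 characters rather than 255.

```sql
-- Hypothetical table; utf8mb4 uses up to 4 bytes per character.
CREATE TABLE `demo_note` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `title` varchar(32) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- 32 -> 63 characters: at most 128 -> 252 bytes, still one length byte,
-- so the change can be requested in place.
ALTER TABLE `demo_note`
  MODIFY `title` varchar(63) NOT NULL DEFAULT '',
  ALGORITHM=INPLACE, LOCK=NONE;

-- 63 -> 64 characters: at most 256 bytes, the length prefix grows to two
-- bytes. Requesting ALGORITHM=INPLACE here is expected to be rejected,
-- so the table copy is requested explicitly.
ALTER TABLE `demo_note`
  MODIFY `title` varchar(64) NOT NULL DEFAULT '',
  ALGORITHM=COPY;
```

Stating the ALGORITHM explicitly is a defensive habit: if the requested algorithm is not possible, the statement fails instead of silently falling back to a slow table copy.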

1.4 Add redundant fields reasonably

When we first learn MySQL, we are taught the three normal forms of database design. In practice, for the sake of performance, we can sometimes give up strict normalization.

Moon ran into such a problem at a previous company. One SQL statement joined five tables and normally took more than a minute per query, so it was far too heavy. After careful observation, Moon found that two of the tables contributed only one field each. After discussing it with the DBA, Moon added these two fields redundantly to two other (business-related) tables, and the execution time of the statement dropped to a little over ten seconds.
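A minimal sketch of this kind of redundancy, using hypothetical `customer` and `orders` tables (the real tables from the story are not shown in the article, so every name here is an assumption):

```sql
-- Hypothetical tables: before the change, reading a customer name
-- for an order requires a join.
CREATE TABLE `customer` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(64) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `orders` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `customer_id` bigint NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Add the redundant copy of the one field that the join was needed for.
ALTER TABLE `orders`
  ADD COLUMN `customer_name` varchar(64) NOT NULL DEFAULT ''
  COMMENT 'redundant copy of customer.name';

-- Backfill existing rows once.
UPDATE `orders` o
  JOIN `customer` c ON c.`id` = o.`customer_id`
   SET o.`customer_name` = c.`name`;

-- The hot query no longer needs the join.
SELECT `id`, `customer_name` FROM `orders` WHERE `id` = 1;
```

The trade-off is that every write to the original field must also update the redundant copy, so this fits fields that rarely change.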

1.5 New Fields must have default values

The benefits are as follows:

1. Save space.

It may look as if allowing NULL saves space, but in fact a nullable field takes one more bit than a NOT NULL field, to record whether the value is NULL.

2. Avoid index failure and index page splits.

Conditions that reference NULL may prevent the index from being used. In addition, if a page runs out of space, this may cause an index page split.

3. Reduce calculation errors caused by NULL values.

COUNT(column) skips rows in which the column is NULL.
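The counting pitfall and the default-value rule can be seen in a short sketch. The table `demo_default` and its columns are hypothetical names used only for this example.

```sql
-- Hypothetical table with a nullable column.
CREATE TABLE `demo_default` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `nickname` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

INSERT INTO `demo_default` (`nickname`) VALUES ('moon'), (NULL), ('sun');

-- COUNT(*) counts rows, COUNT(nickname) skips the NULL row:
-- all_rows = 3, non_null_nicknames = 2.
SELECT COUNT(*) AS all_rows, COUNT(`nickname`) AS non_null_nicknames
  FROM `demo_default`;

-- A new field added as NOT NULL with a default value.
ALTER TABLE `demo_default`
  ADD COLUMN `city` varchar(32) NOT NULL DEFAULT '' COMMENT 'city, empty if unknown';

-- Existing rows receive the default, so nothing is skipped: counted = 3.
SELECT COUNT(`city`) AS counted FROM `demo_default`;
```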

2. Index

2.1 Selection of index fields

In general, you can use the slow query log to find the hot SQL statements, and then index the fields that appear after select and in the where conditions of those statements.
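If the slow query log is not enabled yet, it can be switched on at runtime. The sketch below assumes an account with sufficient privileges, and the one-second threshold is an arbitrary example value, not a recommendation from the article.

```sql
-- Turn the slow query log on for the running server.
SET GLOBAL slow_query_log = 'ON';

-- Log statements that take longer than 1 second (example threshold).
-- The new global value applies to connections opened afterwards.
SET GLOBAL long_query_time = 1;

-- Find out where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```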

2.2 Make good use of MySQL features such as covering indexes and index condition pushdown

select a from user where b = 5;

Add a composite index that contains both columns, for example (b, a), to take advantage of MySQL's covering index: the query can be answered from the index alone, without going back to the table.

select a from user where c = 5 and d > 5;

With a composite index on (c, d), MySQL locates the entries with c = 5 and can evaluate d > 5 inside the index itself. If d is not part of the index, MySQL has to go back to the table row for every c = 5 entry to check d.
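The two cases can be checked with EXPLAIN. The sketch below assumes a `user` table with integer columns a, b, c, d (the article does not show its definition) and uses hypothetical index names.

```sql
-- Assumed shape of the `user` table used in the queries above.
CREATE TABLE `user` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `a` int NOT NULL DEFAULT 0,
  `b` int NOT NULL DEFAULT 0,
  `c` int NOT NULL DEFAULT 0,
  `d` int NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Covering index: b is used for the lookup, a is read from the index.
ALTER TABLE `user` ADD INDEX `idx_b_a` (`b`, `a`);
-- The Extra column typically shows "Using index" for a covered query.
EXPLAIN SELECT `a` FROM `user` WHERE `b` = 5;

-- Composite index so that d > 5 can be checked inside the index.
ALTER TABLE `user` ADD INDEX `idx_c_d` (`c`, `d`);
-- Column a is not in this index, so matching rows are still fetched from
-- the table, but only those that already satisfy both conditions.
EXPLAIN SELECT `a` FROM `user` WHERE `c` = 5 AND `d` > 5;
```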

Secondly, for composite indexes: once the index (a, b) exists, there is no need for a separate index on a, because MySQL supports the leftmost-prefix principle. A composite index can be used through its leftmost columns, and an index on a string column through its leftmost N characters. On the other hand, if you have an (a, b) index but your business needs to query by b alone, consider creating a separate index on b.

If your table only has an index on a but your business needs an index on (a, b), remember to add the new index first and only then delete the old one, otherwise you may cause the service to hang. A colleague of one of Moon's friends chose to delete first and then add. As a result, many slow queries appeared in the gap between deleting and adding the index. The request volume happened to be high at the same time, and the business could not process requests for a short while.
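The safe order when replacing an index can be sketched as follows, using a hypothetical table `demo_ab` and hypothetical index names:

```sql
-- Hypothetical table that starts with a single-column index on a.
CREATE TABLE `demo_ab` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `a` int NOT NULL DEFAULT 0,
  `b` int NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  KEY `idx_a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Step 1: add the new composite index while the old one still serves queries.
ALTER TABLE `demo_ab` ADD INDEX `idx_a_b` (`a`, `b`);

-- Step 2: only after the new index exists, drop the one it makes redundant.
ALTER TABLE `demo_ab` DROP INDEX `idx_a`;
```

At every moment at least one index that starts with a is available, so queries on a never fall back to a full table scan.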

2.3 Choice between unique index and normal index

If we can guarantee at the business level that a field is unique, and the table is written to frequently, Moon recommends a normal index instead of a unique index, for the following reasons:

1. When reading data, a normal index finds the first record that meets the condition and then keeps scanning until it reaches the first record that does not. A unique index stops as soon as it finds the first matching record. Here the unique index is better, but in practice the performance difference is minimal, and we can add limit 1 to the query. The key point is the second one.

2. When updating, a change to a normal index can be written straight into the change buffer without checking uniqueness, and it is merged into the data page later, either periodically in the background or the next time the page is accessed. A unique index cannot use the change buffer, because MySQL has to check whether the record already exists in the table. That means an IO operation to read the data page into memory, and IO operations are expensive.
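A sketch of this choice, assuming a hypothetical write-heavy table `order_log` whose `order_no` is already guaranteed unique by the business logic:

```sql
-- Hypothetical table: order_no is unique by business rule, but it is
-- indexed with a normal index so that writes can use the change buffer.
CREATE TABLE `order_log` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `order_no` varchar(32) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `idx_order_no` (`order_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- limit 1 lets the read stop at the first match, like a unique index would.
SELECT `id`, `order_no`
  FROM `order_log`
 WHERE `order_no` = 'NO20240101'
 LIMIT 1;

-- Shows which operations are currently buffered by the change buffer.
SHOW VARIABLES LIKE 'innodb_change_buffering';
```

Without the unique constraint the database no longer protects against duplicates, so this only makes sense when uniqueness really is enforced elsewhere.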

3. Query statement

3.1 Avoiding Index Failure

1. Follow the leftmost prefix rule: the leading column of a composite index must not be missing, and columns in the middle must not be skipped.

2. Do not perform any operation on an indexed column (calculations, functions, automatic or manual type conversion). This causes the index to fail and leads to a full table scan.

3. In a composite index, the columns to the right of a column used in a range condition (between, <, >, in, etc.) can no longer use the index. For example, with a composite index on (c, b), the following query can only use the c part:

select a from user where c > 5 and b = 4;

4. If != or <> is used, the index may fail and a full table scan is performed.

5. If is null / is not null is used on an indexed field, the index may fail and a full table scan is performed.

6. If the pattern for an indexed field starts with a wildcard ('%string'), the index fails and a full table scan is performed.

7. If the indexed field is a string but the query value is not wrapped in single quotation marks, the implicit type conversion makes the index fail and a full table scan is performed.

8. If or is used on indexed fields, the index may fail and a full table scan is performed.
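Several of these rules can be compared side by side with EXPLAIN. The sketch below uses a hypothetical `member` table; the column names, index names, and literal values are assumptions made for illustration. In each pair, the first form tends to defeat the index and the second is the index-friendly rewrite.

```sql
-- Hypothetical table with three single-column indexes.
CREATE TABLE `member` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(32) NOT NULL DEFAULT '',
  `phone` varchar(20) NOT NULL DEFAULT '',
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`),
  KEY `idx_phone` (`phone`),
  KEY `idx_created_at` (`created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Rule 2: a function on the indexed column versus a plain range.
EXPLAIN SELECT * FROM `member` WHERE DATE(`created_at`) = '2024-01-01';
EXPLAIN SELECT * FROM `member`
 WHERE `created_at` >= '2024-01-01' AND `created_at` < '2024-01-02';

-- Rule 6: leading wildcard versus trailing wildcard.
EXPLAIN SELECT * FROM `member` WHERE `name` LIKE '%moon';
EXPLAIN SELECT * FROM `member` WHERE `name` LIKE 'moon%';

-- Rule 7: string column compared with a number versus a quoted string.
EXPLAIN SELECT * FROM `member` WHERE `phone` = 13800000000;
EXPLAIN SELECT * FROM `member` WHERE `phone` = '13800000000';
```

Comparing the type and key columns of each pair shows whether the index was actually chosen on your MySQL version and data.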

3.2 Write where condition fields in proper order

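This is really the leftmost prefix rule again. When you maintain existing code, look at the composite indexes on the table first, and when you write the where conditions, try to list them in the same order as the columns of the composite index. The optimizer can match conditions to index columns regardless of the order in which they are written, but keeping the same order makes it much easier to see which index a query is meant to use and whether a leading column is missing.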

3.3 Small tables drive large tables

In a join query, when there is an index on the join column, the driving table does not use it; it is the driven table that is looked up through the index.

MySQL joins tables with the Nested Loop Join algorithm. It takes the result set of the driving table as the base data of the loop, uses each row of that result set as a filter condition to query the next table, and then merges the results. If there is a third table in the join, the joined result of the first two tables becomes the base data of the loop, the third table is queried once per row in the same way, and so on. Therefore, a small table driving a large table performs far fewer lookups than a large table driving a small table.

You can use EXPLAIN to find out which table is the driving table in a SQL statement: the table in the first row of the EXPLAIN output is the driving table.
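A small sketch of checking the driving table, using hypothetical `dept` (small) and `employee` (large) tables. STRAIGHT_JOIN is shown only as a way to pin the join order when the optimizer's own choice needs to be overridden.

```sql
-- Hypothetical small table.
CREATE TABLE `dept` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `name` varchar(32) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Hypothetical large table, indexed on the join column.
CREATE TABLE `employee` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `dept_id` bigint NOT NULL DEFAULT 0,
  `name` varchar(32) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `idx_dept_id` (`dept_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- The first row of the EXPLAIN output is the driving table.
EXPLAIN
SELECT e.`name`, d.`name`
  FROM `dept` d
  JOIN `employee` e ON e.`dept_id` = d.`id`;

-- STRAIGHT_JOIN makes MySQL join the tables in the order written,
-- so dept drives and employee is probed through idx_dept_id.
EXPLAIN
SELECT STRAIGHT_JOIN e.`name`, d.`name`
  FROM `dept` d
  JOIN `employee` e ON e.`dept_id` = d.`id`;
```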

3.4 Force index() can be used to prevent the optimizer from selecting the wrong index

When we are sure which index a query should use, we can use force index() to force that index, so that the optimizer does not make the query inefficient in some cases.

The optimizer chooses an index in order to find the optimal execution plan and execute the statement at minimal cost.

It makes a comprehensive judgment based on whether temporary tables are needed, whether sorting is needed, the number of rows to be scanned, and so on.

Of course, force index has a drawback. If the index is later renamed or dropped and the SQL statement is not updated, the statement will fail with an error.
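For reference, the hint is written right after the table name. The sketch assumes the `user` table from the earlier queries with a hypothetical index `idx_b_a`; the CREATE TABLE is included only so that the example is self-contained.

```sql
-- Assumed shape of the `user` table, with a hypothetical index name.
CREATE TABLE IF NOT EXISTS `user` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `a` int NOT NULL DEFAULT 0,
  `b` int NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  KEY `idx_b_a` (`b`, `a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- Force the optimizer to use idx_b_a for this query.
SELECT `a` FROM `user` FORCE INDEX (`idx_b_a`) WHERE `b` = 5;

-- If idx_b_a were dropped or renamed, the statement above would fail
-- because the named key no longer exists in the table.
```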

4. Database and table sharding

Even after all of the optimizations above, queries may still be slow simply because the amount of data is too large. At that point we have to consider splitting databases and tables.

Here is a simple example from Moon:

Consider a company that provides a customer service system, with a large business volume and many customers. It may produce tens of millions of records every day. If you put all of this data in one table, there is no doubt it will end badly. In this case we can split tables along a business dimension. For example, if you have 10,000 customers, you can put every 100 customers in one table. Each table then only receives hundreds of thousands of records on an average day, which is acceptable. After a while, however, you will find that old data is rarely used. What customers care about is the latest data, so we can separate hot and cold data: the data that customers no longer care about is moved to cold storage, which improves query efficiency on the online hot data.
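Both ideas can be sketched in SQL. Everything here is hypothetical: the `ticket_NN` table naming, customer ids running from 0 to 9999, and the 90-day cutoff are assumptions chosen to match the "100 customers per table" example.

```sql
-- Hypothetical shard table: ticket_00 holds customers 0..99,
-- ticket_01 holds customers 100..199, and so on up to ticket_99.
CREATE TABLE `ticket_00` (
  `id` bigint NOT NULL AUTO_INCREMENT,
  `customer_id` int unsigned NOT NULL DEFAULT 0,
  `content` varchar(255) NOT NULL DEFAULT '',
  `created_at` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `idx_customer_created` (`customer_id`, `created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `ticket_01` LIKE `ticket_00`;

-- Routing rule: 100 customers per table. Customer 1234 maps to ticket_12.
SELECT CONCAT('ticket_', LPAD(1234 DIV 100, 2, '0')) AS target_table;

-- Hot/cold separation: move rows older than the cutoff to an archive table.
CREATE TABLE `ticket_00_archive` LIKE `ticket_00`;

SET @cutoff = NOW() - INTERVAL 90 DAY;
START TRANSACTION;
INSERT INTO `ticket_00_archive`
  SELECT * FROM `ticket_00` WHERE `created_at` < @cutoff;
DELETE FROM `ticket_00` WHERE `created_at` < @cutoff;
COMMIT;
```

In practice the routing rule usually lives in the application or in sharding middleware, and large archive jobs are run in batches rather than in one transaction.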

Conclusion

There is still a long way to go in MySQL optimization. Of course, this is not a complete optimization solution, but it covers the basic optimization techniques you can use in daily development. It is enough to handle the average interviewer, but I want to remind you to still be nice when you face a DBA.

See you next time, guys