Did you know MySQL has performance issues with LIMIT?

MySQL paging queries are usually implemented with LIMIT.

The basic usage of LIMIT in MySQL is very simple. LIMIT accepts one or two integer arguments. With two arguments, the first specifies the offset of the first row to return and the second the maximum number of rows to return. With one argument, the value is the maximum number of rows to return. The offset of the first row is 0, not 1.

For PostgreSQL compatibility, MySQL also supports the syntax LIMIT row_count OFFSET offset.
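
As a rough illustration of the two forms, the sketch below turns a 1-based page number into a 0-based offset and runs both spellings of the clause. It rests on some assumptions: an in-memory SQLite database stands in for MySQL so the example runs without a server (SQLite accepts both forms with the same argument order), and the table name `demo` and the helper `page_to_offset` are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `demo` is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO demo (id) VALUES (?)", [(i,) for i in range(1, 21)])


def page_to_offset(page, page_size):
    """Convert a 1-based page number to a 0-based row offset."""
    return (page - 1) * page_size


offset = page_to_offset(3, 5)  # page 3 with 5 rows per page skips 10 rows

# Two-argument form: LIMIT offset, row_count
comma_form = conn.execute(
    "SELECT id FROM demo ORDER BY id LIMIT ?, ?", (offset, 5)
).fetchall()

# PostgreSQL-compatible form: LIMIT row_count OFFSET offset
offset_form = conn.execute(
    "SELECT id FROM demo ORDER BY id LIMIT ? OFFSET ?", (5, offset)
).fetchall()

comma_ids = [row[0] for row in comma_form]
offset_ids = [row[0] for row in offset_form]
print(comma_ids)
print(offset_ids)
```

Both queries select the same rows. Note that the argument order is reversed between the two forms, which is an easy source of off-by-a-page mistakes.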

Problem:

For small offsets, using LIMIT directly is no problem. But as the amount of data grows and users page further back, the offset in the LIMIT clause gets larger and the query becomes significantly slower, because MySQL still has to read and discard every row before the offset.

Optimization idea:

When the table is large, avoid scanning more records than necessary.

Solution:

Subquery paging or JOIN paging.

JOIN paging and subquery paging are about the same in terms of efficiency and time consumption.

Here is an example. MySQL primary keys are usually auto-incrementing numeric types, in which case paging can be optimized in the following way.

The following uses a table with 800,000 rows in a real production environment to compare the query time before and after optimization:


-- Traditional LIMIT, scans the data file
[SQL]
SELECT * FROM tableName ORDER BY id LIMIT 500000, 2;
Affected rows: 0
Time: 5.371s

-- Subquery paging, index scan
[SQL]
SELECT * FROM tableName
WHERE id >= (SELECT id FROM tableName ORDER BY id LIMIT 500000, 1)
LIMIT 2;
Affected rows: 0
Time: 0.274s

-- JOIN paging
[SQL]
SELECT *
FROM tableName AS t1
JOIN (SELECT id FROM tableName ORDER BY id DESC LIMIT 500000, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT 2;
Affected rows: 0
Time: 0.278s

After optimization the query is nearly 20 times faster (5.371s versus about 0.27s).

Optimization principle:

The subquery runs on the index, whereas the regular query runs on the data file. Index files are generally much smaller than data files, so they are processed more efficiently. To return all columns, the first method has to read across a large number of data blocks just to skip the offset rows, while the second method locates the starting row through the indexed column and then reads only the rows it needs, which naturally improves efficiency greatly.

Therefore, instead of using LIMIT with a large offset directly, we can first get the id at the offset and then use LIMIT size to fetch the data starting from that id.
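
The two steps can be sketched in application code roughly as follows. This is a minimal sketch, not a definitive implementation: in-memory SQLite stands in for MySQL so it runs without a server, the helper name `fetch_page_via_anchor` and the `payload` column are hypothetical, and the outer query adds ORDER BY id so that row order does not depend on the storage engine.

```python
import sqlite3


def fetch_page_via_anchor(conn, offset, size):
    """Fetch `size` rows starting at row `offset` in two steps.

    Step 1 reads only the id column to find the id at the offset.
    Step 2 reads full rows starting from that id.
    """
    anchor = conn.execute(
        "SELECT id FROM tableName ORDER BY id LIMIT ?, 1", (offset,)
    ).fetchone()
    if anchor is None:  # the offset is past the last row
        return []
    return conn.execute(
        "SELECT * FROM tableName WHERE id >= ? ORDER BY id LIMIT ?",
        (anchor[0], size),
    ).fetchall()


# Assumption: in-memory SQLite stands in for MySQL; `payload` is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (id INTEGER PRIMARY KEY, payload TEXT)")
# Odd ids only (1, 3, 5, ...) to show that gaps in the id sequence are handled.
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 200, 2)],
)

plain = conn.execute("SELECT * FROM tableName ORDER BY id LIMIT 50, 2").fetchall()
anchored = fetch_page_via_anchor(conn, 50, 2)
print(plain)
print(anchored)
```

Because the anchor id is looked up by position rather than computed from the page number, the result matches plain LIMIT paging even when ids have gaps.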

In a real project, we can handle paging with something like the strategy pattern. For example, use the most basic paging method for the first 100 pages, and switch to subquery paging for anything beyond page 100.
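
One possible shape for such a strategy switch is sketched below. The names `basic_paging`, `subquery_paging`, `choose_strategy`, `fetch_page` and `PAGE_THRESHOLD` are all hypothetical, treating page 100 itself as a shallow page is an assumption, and in-memory SQLite again stands in for MySQL so the sketch is runnable. The right threshold depends on the table and should be measured.

```python
import sqlite3

PAGE_THRESHOLD = 100  # assumption: pages up to and including 100 use basic paging


def basic_paging(conn, offset, size):
    """Plain LIMIT offset, size: fine for shallow pages."""
    return conn.execute(
        "SELECT * FROM tableName ORDER BY id LIMIT ?, ?", (offset, size)
    ).fetchall()


def subquery_paging(conn, offset, size):
    """Locate the starting id through a subquery, then read `size` rows."""
    return conn.execute(
        "SELECT * FROM tableName "
        "WHERE id >= (SELECT id FROM tableName ORDER BY id LIMIT ?, 1) "
        "ORDER BY id LIMIT ?",
        (offset, size),
    ).fetchall()


def choose_strategy(page):
    """Pick the paging function for a 1-based page number."""
    return basic_paging if page <= PAGE_THRESHOLD else subquery_paging


def fetch_page(conn, page, page_size):
    offset = (page - 1) * page_size
    return choose_strategy(page)(conn, offset, page_size)


# Assumption: in-memory SQLite stands in for MySQL; `payload` is a hypothetical column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 1001)],
)

shallow = fetch_page(conn, 2, 5)   # page 2 goes through basic_paging
deep = fetch_page(conn, 150, 5)    # page 150 goes through subquery_paging
print([row[0] for row in shallow])
print([row[0] for row in deep])
```

Both strategies return the same rows for the same page, so callers only see `fetch_page` and the switch point can be tuned without touching them.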
