This is day 9 of my participation in the August More Text Challenge. For details, see: August More Text Challenge

📖 Preface

It's important to be able to make complex things simple

While learning, we go through a lot of material: videos, documents, and so on. There is so much of it now that a single topic often has many different videos explaining it. Setting aside the promotional and marketing content, there are indeed many excellent explanations. For example, teacher Li Yongle's short video lessons explain complex topics easily and thoroughly with just a blackboard. Those of us who are learning to code need to learn not only how to speak clearly, but also how to write clearly.


🌂 Problem Description

LIMIT 0, 10 returns the 10 rows of the first page, and LIMIT 10, 10 returns the second page. However, when LIMIT is combined with ORDER BY, you may turn to the second page and see records from the first page again.

The details are as follows:

SELECT
  `post_title`,
  `post_date`
FROM
  post
WHERE
  `post_status` = 'publish'
ORDER BY
  view_count desc
LIMIT
  5, 5

When running the SQL above, it is quite likely that some of the returned records are the same as those from LIMIT 0, 5. This does not happen with the following query:

SELECT
  *
FROM
  post
WHERE
  post_status = 'publish'
ORDER BY
  view_count desc
LIMIT
  5, 5

But the post table has many fields and I only want these two; I do not want to fetch post_content as well. To handle this case, use two sort conditions in ORDER BY, as follows:

SELECT
  `post_title`,
  `post_date`
FROM
  post
WHERE
  `post_status` = 'publish'
ORDER BY
  view_count desc,
  ID asc
LIMIT
  5, 5

In principle, one would expect MySQL to sort by the primary key ID by default. That is, when view_count values are equal, the primary key ID would act as the default sort condition, so there would be no need to add ID asc. In practice, however, when ORDER BY and LIMIT are combined in MySQL, the order of rows with equal sort values is not deterministic.
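The effect of the extra ID asc condition can be sketched in Python. This is only an illustration, not how MySQL sorts internally: the sample rows and the helper names page and pages_for_random_read_order are hypothetical, and shuffling stands in for an arbitrary read order. Because the composite key (view_count, id) is unique per row, the pages come out the same no matter how the rows arrive.

```python
import random

# Hypothetical rows: (id, post_title, view_count). Many rows share a view_count.
ROWS = [(i, f"post-{i}", 100 if i <= 8 else 50) for i in range(1, 13)]


def page(rows, offset, size):
    """Emulate ORDER BY view_count DESC, id ASC LIMIT offset, size."""
    ordered = sorted(rows, key=lambda r: (-r[2], r[0]))
    return [r[0] for r in ordered[offset:offset + size]]


def pages_for_random_read_order(seed):
    """Shuffle the rows to mimic an arbitrary read order, then fetch two pages."""
    rows = ROWS[:]
    random.Random(seed).shuffle(rows)
    return page(rows, 0, 5), page(rows, 5, 5)


if __name__ == "__main__":
    for seed in range(3):
        print(pages_for_random_read_order(seed))
```

Whatever the seed, the two pages never share an id, because no two rows compare as equal under the two-column key.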


🤳 Analyze the problem

In MySQL 5.6, the optimizer introduced an optimization for ORDER BY ... LIMIT statements: it uses a priority queue.

The purpose of the priority queue is this: when index order cannot be used, a sort is required, and LIMIT n is specified, only n records need to be kept during sorting. This does not remove the cost of examining every record, but the sort can be completed with only a small amount of sort buffer memory.
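The idea of keeping only n records while still scanning every row can be sketched with Python's heapq module. This is a minimal sketch of the general bounded-heap technique, not MySQL's actual filesort code; the function name top_n_by_view_count and the dict-shaped rows are assumptions made for the example.

```python
import heapq


def top_n_by_view_count(rows, n):
    """Return the n rows with the largest view_count, holding at most n heap entries."""
    heap = []  # min-heap: the smallest retained view_count sits at heap[0]
    for seq, row in enumerate(rows):
        # seq keeps the tuples comparable without ever comparing the row dicts
        entry = (row["view_count"], seq, row)
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif row["view_count"] > heap[0][0]:
            # The new row beats the weakest retained row: evict that one
            heapq.heapreplace(heap, entry)
    # Every row was examined once, but memory never exceeded n entries
    return [entry[2] for entry in sorted(heap, reverse=True)]


if __name__ == "__main__":
    sample = [{"id": i, "view_count": v} for i, v in enumerate([3, 9, 1, 7, 5, 8], 1)]
    print(top_n_by_view_count(sample, 3))
```

Note that a row whose view_count merely equals the weakest retained value is dropped here; that tie-handling choice is arbitrary, which hints at why equal values are where paging results become unpredictable.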

MySQL 5.6 shows duplicate data on the second page because the priority queue uses heap sort, and heap sort is an unstable sorting method: rows with the same sort value may come out in a different order from the one in which they were read.
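The instability of heap sort is easy to reproduce in a few lines of Python. The code below is a textbook heap sort written for illustration, assuming hypothetical (id, view_count) rows; it is not MySQL's implementation. Comparisons look only at the sort key, as ORDER BY view_count desc would.

```python
def heap_sort(items, key):
    """Textbook heap sort, ascending by key(item). Not a stable sort."""
    a = list(items)
    n = len(a)

    def sift_down(root, end):
        # Push a[root] down until the max-heap property holds within a[:end]
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and key(a[child]) < key(a[child + 1]):
                child += 1
            if key(a[root]) < key(a[child]):
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    for start in range(n // 2 - 1, -1, -1):  # build the max-heap
        sift_down(start, n)
    for end in range(n - 1, 0, -1):  # move the current maximum to the tail
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a


# Hypothetical rows: (id, view_count). Ids 1 to 3 tie on view_count.
ROWS = [(1, 10), (2, 10), (3, 10), (4, 20)]


def by_view_count_desc(row):
    return -row[1]


heap_ids = [r[0] for r in heap_sort(ROWS, by_view_count_desc)]
stable_ids = [r[0] for r in sorted(ROWS, key=by_view_count_desc)]

if __name__ == "__main__":
    print(heap_ids, stable_ids)
```

Python's built-in sorted is stable and keeps the tied rows in read order (4, 1, 2, 3), while this heap sort returns them as 4, 3, 2, 1. Both results are valid for ORDER BY view_count desc, which is exactly why the tied rows can land on different pages.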

MySQL 5.5 does not have this optimization, so the problem described in this article does not occur there; it only appeared with the 5.6 release.

A MySQL SELECT statement is written with its clauses in the following order:

(1)     SELECT 
(2)     DISTINCT <select_list>
(3)     FROM <left_table>
(4)     <join_type> JOIN <right_table>
(5)     ON <join_condition>
(6)     WHERE <where_condition>
(7)     GROUP BY <group_by_list>
(8)     HAVING <having_condition>
(9)     ORDER BY <order_by_condition>
(10)    LIMIT <limit_number>

The execution order is FROM ... WHERE ... SELECT ... ORDER BY ... LIMIT. Because of the priority queue mentioned above, once SELECT has completed, the records are arranged by heap sort, and during ORDER BY only the rows with larger view_count values are moved forward.

But because of LIMIT, only 5 records need to be kept during sorting. view_count has no index ordering, so when the second page is requested, MySQL takes whichever rows it encounters first. Rows with equal sort values are therefore ordered arbitrarily. Running the same SQL a second time should give the same result as the first, but a query with a different LIMIT may break the ties differently, which is how a row ends up on both pages.


🤳 Solve the problem

(1) Index sort field

If you add an index on the sort field, rows are read and paged directly in index order, which avoids this problem.

(2) Understand pagination correctly

Paging is built on top of sorting: it splits the sorted result into ranges by row count. Sorting is a database capability; pagination is an application requirement derived from it.

The official MySQL and Oracle documentation provide LIMIT n and rownum < n respectively, but neither clearly defines the concept of paging.

It is also important to note that while the solutions above may ease the problem, an issue remains from the user's perspective: for example, if rows are inserted into the table frequently, the first and second pages can still overlap under the read-committed isolation level when the user queries them.

So paging has always had this problem, and most scenarios do not place very high accuracy requirements on paged data.
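The overlap caused by concurrent inserts can be sketched in Python, even with a fully deterministic sort. This is a simplified simulation under stated assumptions: the table is a plain list of dicts, fetch_page and demo are hypothetical helpers, and appending to the list stands in for another session committing a row between the two page requests.

```python
def fetch_page(table, offset, size):
    """Emulate ORDER BY view_count DESC, id ASC LIMIT offset, size."""
    ordered = sorted(table, key=lambda r: (-r["view_count"], r["id"]))
    return [r["id"] for r in ordered[offset:offset + size]]


def demo():
    # Hypothetical table: ids 1..10 with view_count falling from 100 to 91
    table = [{"id": i, "view_count": 101 - i} for i in range(1, 11)]
    page1 = fetch_page(table, 0, 5)
    # Another session commits a new, very popular post before page 2 is requested
    table.append({"id": 11, "view_count": 500})
    page2 = fetch_page(table, 5, 5)
    return page1, page2


if __name__ == "__main__":
    first, second = demo()
    print(first, second, set(first) & set(second))
```

The new row pushes every existing row down by one position, so the last row of page 1 (id 5) shows up again as the first row of page 2.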

(3) Some common database sorting problems

Sort order without ORDER BY

When using Oracle or MySQL, users find that MySQL results always appear ordered while Oracle results look unordered. This is mainly because Oracle uses heap tables, whereas MySQL (with InnoDB) uses clustered-index tables. Without an ORDER BY, however, the database guarantees neither that records are returned in order nor that the order is the same every time.

Paging problem

As described earlier, paging is an application requirement derived from the sorting capabilities provided by the database, and the database does not guarantee that pages are free of duplicates.

NULL values and empty strings

Different databases interpret and process NULL values and empty strings differently. For example, in Oracle a NULL cannot be compared with another NULL: the result is neither equal nor unequal, but unknown. When inserting an empty string, MySQL stores a string of length 0, whereas Oracle stores a NULL value.
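The NULL comparison rule and the empty-string case can be checked quickly from Python. Note the assumption: this sketch uses SQLite through the standard sqlite3 module as a stand-in, so it is neither MySQL nor Oracle. On these two points SQLite behaves like the MySQL side described above; the Oracle behavior of turning an empty string into NULL is not reproduced here. The function name null_vs_empty_string is made up for the example.

```python
import sqlite3


def null_vs_empty_string():
    """Return (NULL = NULL, NULL <> NULL, length of '', whether '' IS NULL) from SQLite."""
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        # Comparing NULL with NULL yields NULL (unknown), surfaced as None in Python
        null_eq_null = cur.execute("SELECT NULL = NULL").fetchone()[0]
        null_ne_null = cur.execute("SELECT NULL <> NULL").fetchone()[0]
        # An inserted empty string stays a zero-length string, not NULL
        cur.execute("CREATE TABLE t (s TEXT)")
        cur.execute("INSERT INTO t (s) VALUES ('')")
        length, is_null = cur.execute("SELECT length(s), s IS NULL FROM t").fetchone()
        return null_eq_null, null_ne_null, length, is_null
    finally:
        conn.close()


if __name__ == "__main__":
    print(null_vs_empty_string())
```

Both comparisons come back as None, meaning unknown, and the stored empty string has length 0 and is not NULL.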

🎉 Finally

  • For more references, see here: Chen Yongjia's Blog
  • If you like this blog, feel free to follow and leave a like. More updates to come!