How to gracefully implement paging queries

Paging is a common feature, especially when the volume of data is increasing, and paging queries are essential. There are many ways to implement paging. If the ORM framework used is Mybatis, there are open source paging plug-ins available, such as Mybatis-PageHelper. If you do not use the paging plug-in, manual paging is required. Because different databases implement different SQL statements for paging, such as Mysql uses the limit keyword and Oracle uses rownum, the paging scheme described in this article only applies to Mysql databases.

Limit based paging scheme

I’ll start with a few requirements that paging must meet: one is orderliness and one is non-repetition. Orderliness can be considered a prerequisite for non-repetition, because if the data is unordered, there is no guarantee that multiple pages will not repeat. So paging often requires sorting the data and then adding paging conditions. The first scheme we looked at is the limit-based paging scheme used by many paging plug-ins. Let’s start with our test data.

Let’s have a look at the table structure:

mysql> desc user; +-------+------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+------------+------+-----+---------+-------+ | id | bigint(20) | NO | PRI | NULL | | | name | char(50) | NO | |  NULL | | +-------+------------+------+-----+---------+-------+ 2 rowsin set
Copy the code

As you can see, our user table has only two columns, id of type bigint and name of type char.

Let’s look at the following table data:

mysql> select count(*) from user;
+----------+
| count(*) |
+----------+
|    45116 |
+----------+
1 row in set

mysql> select * from user order by id asc limit10; +----+--------+ | id | name | +----+--------+ | 0 | user_0 | | 1 | user_1 | | 2 | user_2 | | 3 | user_3 | | 4 | user_4 |  | 5 | user_5 | | 6 | user_6 | | 7 | user_7 | | 8 | user_8 | | 9 | user_9 | +----+--------+ 10 rowsin set
Copy the code

You can see the total number of rows is about 45,000.

Implementing paging based on limit is relatively simple:

mysql> select * from user order by id asc limit10000, 10; +-------+------------+ | id | name | +-------+------------+ | 10000 | user_10000 | | 10001 | user_10001 | | 10002 | user_10002 | | 10003 | user_10003 | | 10004 | user_10004 | | 10005 | user_10005 | | 10006 | user_10006 | | 10007 | user_10007 | | 10008 | user_10008 | | 10009 | user_10009 | +-------+------------+ 10 rowsin set
Copy the code

The first argument after limit indicates the subscript, starting with the 10,000th row, and the second argument indicates the total of 10 rows.

Paging with limit is pretty simple to use, but are there any problems?

Let’s start by reviewing the two elements that paging needs to meet: orderliness and non-repetition. We have already used order by to sort the above statement, so it can satisfy the order, but it can not repeat? What if a record is inserted between the current page of the query and the next page, and the data ID is smaller than the largest ID in the current page record? Let’s test it out:

mysql> select * from user order by id asc limit10000, 10; +-------+------------+ | id | name | +-------+------------+ | 10000 | user_10000 | | 10001 | user_10001 | | 10002 | user_10002 | | 10003 | user_10003 | | 10004 | user_10004 | | 10005 | user_10005 | | 10006 | user_10006 | | 10007 | user_10007 | | 10008 | user_10008 | | 10009 | user_10009 | +-------+------------+ 10 rowsin set

mysql> insert into user(id,name) values(-1,'user_-1');
Query OK, 1 row affected
mysql> select * from user order by id asc limit10010, 10; +-------+------------+ | id | name | +-------+------------+ | 10009 | user_10009 | | 10010 | user_10010 | | 10011 | user_10011 | | 10012 | user_10012 | | 10013 | user_10013 | | 10014 | user_10014 | | 10015 | user_10015 | | 10016 | user_10016 | | 10017 | user_10017 | | 10018 | user_10018 | +-------+------------+ 10 rowsin set
Copy the code

As you can see, when we insert a record between two adjacent pages of queries, the next page duplicates the previous one (the record with ID 10009 appears on both adjacent pages). The reason is that after inserting a record, the paging structure has changed, so duplicate data occurs.

As a result, using limit for pagination may not seem very elegant, but I’ll show you another way to write pagination.

Paging scheme based on limit and comparison

Another way to think about pagination is that if we want the 10 rows starting at line 10,000, we can find and sort the first 10 rows, and then fetch the first 10 rows. Let’s look at the specific SQL statement:

mysql> select * from user where id >=10000 order by id asc limit 10;
+-------+------------+
| id    | name       |
+-------+------------+
| 10000 | user_10000 |
| 10001 | user_10001 |
| 10002 | user_10002 |
| 10003 | user_10003 |
| 10004 | user_10004 |
| 10005 | user_10005 |
| 10006 | user_10006 |
| 10007 | user_10007 |
| 10008 | user_10008 |
| 10009 | user_10009 |
+-------+------------+
10 rows in set
Copy the code

Does this prevent the problems above? We’ll find out with an experiment.

mysql> select * from user where id >=10000 order by id asc limit 10;
+-------+------------+
| id    | name       |
+-------+------------+
| 10000 | user_10000 |
| 10001 | user_10001 |
| 10002 | user_10002 |
| 10003 | user_10003 |
| 10004 | user_10004 |
| 10005 | user_10005 |
| 10006 | user_10006 |
| 10007 | user_10007 |
| 10008 | user_10008 |
| 10009 | user_10009 |
+-------+------------+
10 rows in set

mysql> insert into user(id,name) values(-4,'user_-4');
Query OK, 1 row affected
mysql> select * from user where id >=10010
 order by id asc limit 10;
+-------+------------+
| id    | name       |
+-------+------------+
| 10010 | user_10010 |
| 10011 | user_10011 |
| 10012 | user_10012 |
| 10013 | user_10013 |
| 10014 | user_10014 |
| 10015 | user_10015 |
| 10016 | user_10016 |
| 10017 | user_10017 |
| 10018 | user_10018 |
| 10019 | user_10019 |
+-------+------------+
10 rows in set
Copy the code

As you can see, when data is inserted between adjacent page queries, the paging query results do not duplicate. It makes sense, too, because even though the structure of the paging changes after the record is inserted, since our current paging query starts with a fixed ID, inserting new data has no effect on subsequent paging results.

Of course, there are limits to this paging query. This only works if the columns used for sorting are unique, and in the example above, the ID column is the primary key and therefore unique, so you can use this method for paging. If the columns used for sorting are not unique, such as timestamps, then this paging method can also duplicate, and you can think about why.

If you find this article helpful, you can scan the qr code below and follow my public account “Thoughts on Java Architecture”.

How to gracefully implement paging queries

Related Posts

Record a production MQ accident

God, how complicated is it to update a little registry? 【 Rip eureka source code NO.2】

– Approach the container Spring IOC ii again