A simple thing is not simple when you magnify it

Preface

Some people say that a single table with tens of millions of rows should have been sharded across databases and tables from the start, and that anything else is unreasonable. For an innovative business, however, the system cannot be designed for such capacity from day one; neither the budget nor the schedule allows it. I think the design of an innovative business system should first meet the current requirements, and secondly allow for temporary remedies if the business explodes, buying time for a proper system upgrade.

Everyone hopes the business will explode, and then it does!

01

Without naming a specific date: I took over a new business whose table held no more than 1 million rows and provided a query function. As business volume kept climbing, the MySQL disk began to alarm and queries started timing out. On top of that, customers needed to query and download the table's data in real time. There was no time to switch storage schemes, and the KPI could not slip.

02

To solve the immediate problem, first expand the disk, and stop the dual-datacenter synchronization to cut down unnecessary alarms.

However, even 1000 GB was unlikely to last long. After discussing it with the business side, they accepted real-time query and download only for data within the T-7 window. At the current rate, seven days still means over 100 million rows, but a disk holding seven days of data is certainly enough, so the historical data has to be archived offline first.

The archiving itself takes only a few lines of code, which of course depends on having sound infrastructure.
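The original does not show the archiving code, but a minimal sketch could look like the following, using SQLite as a stand-in for MySQL. The table name `t_archive` and the column `created_at` are illustrative assumptions, not names from the original system:

```python
import datetime
import sqlite3

def archive_old_rows(conn, cutoff_days=7):
    """Move rows older than the T-7 window into an archive table,
    then delete them from the hot table. Names are hypothetical."""
    cutoff = (datetime.datetime.now()
              - datetime.timedelta(days=cutoff_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.cursor()
    # Copy cold rows into the archive table first...
    cur.execute("INSERT INTO t_archive SELECT * FROM t WHERE created_at < ?", (cutoff,))
    # ...then remove them from the hot table so the disk stays bounded.
    cur.execute("DELETE FROM t WHERE created_at < ?", (cutoff,))
    conn.commit()
    return cur.rowcount  # number of rows just deleted (= archived)

# Tiny demonstration: one old row and one far-future row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE t_archive (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.execute("INSERT INTO t VALUES (1, '2000-01-01 00:00:00'), (2, '2999-01-01 00:00:00')")
moved = archive_old_rows(conn)
print(moved)  # 1
```

In a real deployment this would run as a scheduled job, and the archive destination would be cheap offline storage rather than another table on the same disk.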

03

With the capacity problem solved, the next step is to optimize the paginated query. To illustrate, sensitive business fields have been removed; the table structure is as follows:

CREATE TABLE `t` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `a` char(32) DEFAULT '' COMMENT '',
  `b` varchar(64) DEFAULT NULL COMMENT '',
  `c` bigint(20) unsigned NOT NULL COMMENT '',
  `d` varchar(64) NOT NULL COMMENT '',
  `e` tinyint(4) DEFAULT NULL COMMENT '',
  `f` int(11) NOT NULL DEFAULT '0' COMMENT '',
  `g` varchar(32) NOT NULL COMMENT '',
  `h` char(32) DEFAULT NULL COMMENT '',
  `i` varchar(64) DEFAULT NULL COMMENT '',
  `j` varchar(64) DEFAULT NULL COMMENT '',
  `k` datetime DEFAULT NULL COMMENT '',
  `l` int(11) DEFAULT NULL COMMENT '',
  `m` timestamp NULL DEFAULT NULL COMMENT '',
  `n` timestamp NULL DEFAULT NULL COMMENT '',
  PRIMARY KEY (`id`),
  UNIQUE KEY `UK_b` (`b`),
  KEY `IDX_c` (`c`) USING BTREE
)

When the data volume was small, paging like the following caused no problems:

SELECT id, a, b ... FROM t LIMIT n, m

For example:

pagesize: the number of rows displayed per page

pageno: the page number

So m = pagesize and n = (pageno - 1) * pagesize.
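The mapping above can be written down as a tiny helper (a sketch; the original does not show the application-side code):

```python
def page_to_limit(pageno: int, pagesize: int) -> tuple:
    """Translate a 1-based page number and page size into the (n, m)
    offset/count pair used by `LIMIT n, m`."""
    n = (pageno - 1) * pagesize
    m = pagesize
    return n, m

print(page_to_limit(1, 20))   # (0, 20)   first page starts at offset 0
print(page_to_limit(50, 20))  # (980, 20) deep pages mean large offsets
```

Note how the offset n grows linearly with the page number, which is exactly where the performance problem below comes from.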

MySQL executes LIMIT n, m by reading the first n + m records, discarding the first n, and returning the remaining m. So the larger n (the offset) grows, the more rows are read and thrown away, and the worse the performance.

Modify the SQL to reduce I/O consumption:

-- MySQL does not allow LIMIT inside an IN subquery, so paginate over the
-- primary key in a derived table and join back for the full rows:
SELECT t.id, t.a, t.b ...
FROM t
INNER JOIN (SELECT id FROM t LIMIT n, m) AS tmp ON t.id = tmp.id

In fact, this still scans the first n entries, but the inner query walks only the primary-key index instead of fetching full rows, which saves a lot of time.
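To check that the rewritten query returns the same page as the naive one, here is a small sketch using SQLite as a stand-in for MySQL (SQLite will not reproduce MySQL's I/O savings, so this only demonstrates that the two result sets are identical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT)")
conn.executemany("INSERT INTO t (a) VALUES (?)",
                 [(f"row{i}",) for i in range(1000)])  # ids 1..1000

n, m = 900, 10  # a deep page: skip 900 rows, return 10

# Naive offset pagination: reads and discards the first n full rows.
naive = conn.execute(
    "SELECT id, a FROM t ORDER BY id LIMIT ? OFFSET ?", (m, n)).fetchall()

# Deferred join: paginate over the primary key only, then fetch the
# full rows for just those m ids.
deferred = conn.execute(
    "SELECT t.id, t.a FROM t "
    "JOIN (SELECT id FROM t ORDER BY id LIMIT ? OFFSET ?) AS tmp "
    "ON t.id = tmp.id ORDER BY t.id", (m, n)).fetchall()

assert naive == deferred
print(naive[0])  # (901, 'row900')
```

An ORDER BY is added in both queries here to make the page deterministic; without one, LIMIT may return rows in any order.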

The per-page request RT is shown in the figure above: you can see that RT climbs gradually as the page number increases.

QPS, meanwhile, gradually decreases.

So when there is too much data, the last pages are very likely to time out.

After optimization

Now let's see how it performs after optimization. The performance of this interface is significantly improved, as shown in the figure:

The average RT is around 10 ms; the tail rises to around 15 ms because of processing on the returned data.

QPS is also smooth, and it should go higher as customer call volume increases.

To be continued...

Follow the official account [wang Ge wrote code] and reply [optimization] to get the latest updates to this article.