Design and practice of MySQL database and table sharding for massive enterprise data

Preface

Take MySQL as an example of a relational database: a single instance is limited in both storage capacity and connection count, so it easily becomes the bottleneck of the system. While single-table data volume stays in the millions, performance can still be improved by adding read replicas and optimizing indexes. But once the data grows past tens of millions of rows, no matter how we optimize the database, the performance of many operations still degrades badly.

In the era of the mobile Internet, massive numbers of users produce massive amounts of data every day, for example:

  • User tables
  • Order tables
  • Transaction tables

Alipay has about 800 million users; WeChat has over 1 billion. Order tables are even more extreme: Meituan's food-delivery business handles tens of millions of orders every day, and Taobao's historical orders likely number in the tens or even hundreds of billions. Data at this scale is far beyond what a single table can hold. Strictly speaking, MySQL can store a billion rows in one table, but performance suffers badly at that point. The commonly accepted ceiling for a single MySQL table is around 10 million rows, because at that size the height of its B+tree index stays between 3 and 5.
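The ten-million-row rule of thumb can be sanity-checked with a back-of-the-envelope B+tree calculation. Below is a sketch, assuming a 16 KB InnoDB page, an 8-byte BIGINT key plus a 6-byte page pointer per internal-node entry, and roughly 1 KB per row in leaf pages (all of these figures are assumptions, not measurements):

```python
# Back-of-the-envelope estimate of how many rows a B+tree of a given
# height can index in InnoDB (all sizes are assumptions, not measurements).
PAGE_SIZE = 16 * 1024      # default InnoDB page size
KEY_SIZE = 8               # BIGINT primary key
POINTER_SIZE = 6           # child-page pointer in an internal node
ROW_SIZE = 1024            # assumed average row size in a leaf page

fanout = PAGE_SIZE // (KEY_SIZE + POINTER_SIZE)   # entries per internal page
rows_per_leaf = PAGE_SIZE // ROW_SIZE             # rows per leaf page

def capacity(height):
    """Rows indexable by a B+tree with `height` levels, leaves included."""
    return fanout ** (height - 1) * rows_per_leaf

print(capacity(2))   # two-level tree
print(capacity(3))   # three-level tree: on the order of 20 million rows
```

With these assumptions a three-level tree tops out around 20 million rows, which is consistent with keeping single-table volume at the ten-million level so the tree height stays small.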

Since one table is not enough, the data has to be spread across multiple places. There are three common solutions at present:

  1. Partition;
  2. Sub-database sub-table;
  3. NoSQL / NewSQL;

Note: splitting only databases, splitting only tables, and splitting both are all treated here as "sharding" schemes, since database-only or table-only splitting is just a special case of sharding. Representative NoSQL stores are MongoDB and ES; a good NewSQL example is TiDB.

Why shard databases and tables

This article focuses on the general approach the Internet industry uses to handle massive data: sharding databases and tables.

Building sharding tooling from scratch is a huge effort. Fortunately, the industry already has many mature sharding middlewares, so we can spend more of our time on business implementation:

  • Sharding-JDBC (Dangdang)
  • TSharding (Mogujie)
  • Atlas (Qihoo 360)
  • Cobar (Alibaba)
  • MyCAT (based on Cobar)
  • Oceanus (58.com)
  • Vitess (Google)

Despite the variety, all of these sharding middlewares fall into two categories:

  • CLIENT mode;
  • PROXY mode;

CLIENT mode is represented by Alibaba's TDDL and the open-source Sharding-JDBC (since version 3.x, renamed Sharding-Sphere, it also supports proxy mode). The architecture is as follows:


PROXY mode is represented by Alibaba's Cobar and the community-driven MyCAT. The architecture is as follows:

Whether CLIENT mode or PROXY mode, the core steps are the same: SQL parsing, rewriting, routing, execution, and result merging.

The author prefers CLIENT mode: the architecture is simpler, the performance overhead smaller, and the operations cost lower.

Next, using a few common large tables as case studies, this article explains how to put sharding into practice.

Table design principles

Primary key selection: after weighing business (natural) primary keys against auto-increment primary keys, the conclusion is that business primary keys better match query patterns, and most Internet workloads are read-heavy. Therefore all online services here use business primary keys.

Number of indexes: keep it to five or fewer per table, because excessive indexes bloat the index files.

Column type selection: in general, the smaller and simpler the better. For example, use TINYINT for BOOL fields, TINYINT for enumeration fields, and LONG (BIGINT) for monetary amounts. BOOL and enumeration values extend easily within a TINYINT. InnoDB does provide DECIMAL for exact arithmetic, but DECIMAL is a storage format rather than a native machine type: the CPU cannot operate on it directly, so it is less efficient. Instead, simply convert decimal amounts to integers and store them in a LONG.
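For example, a monetary amount can be converted to integer cents before storage and converted back for display. A minimal sketch (the helper names are hypothetical):

```python
from decimal import Decimal

# Store monetary amounts as integer cents in a BIGINT/LONG column,
# avoiding DECIMAL arithmetic in the database.
def to_cents(amount_str):
    """Convert a decimal amount string to integer cents for storage."""
    return int(Decimal(amount_str) * 100)

def from_cents(cents):
    """Convert stored cents back to an exact Decimal amount for display."""
    return Decimal(cents) / 100

print(to_cents("19.99"))   # 1999
print(from_cents(1999))    # 19.99
```

Parsing via `Decimal` rather than `float` keeps the conversion exact, so no rounding error ever reaches the stored integer.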

Sharding strategy: first, be clear that database performance problems generally appear only after the data volume reaches a certain scale. So estimate capacity in advance rather than waiting until a split is forced on you, and generally keep per-table row counts at the ten-million level. There are two common strategies:

  • Modulo by sharding key, which spreads reads and writes evenly;
  • By time, which cleanly separates hot and cold data.
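The modulo strategy can be sketched as follows; the 4-database-by-8-table layout and the naming scheme are purely illustrative:

```python
# Route a record to a physical database and table by taking the sharding
# key modulo the shard counts (4 databases x 8 tables, chosen only for
# illustration).
DB_COUNT = 4
TABLES_PER_DB = 8

def route(sharding_key: int):
    """Return (database index, table index) for a numeric sharding key."""
    slot = sharding_key % (DB_COUNT * TABLES_PER_DB)   # logical slot 0..31
    return slot // TABLES_PER_DB, slot % TABLES_PER_DB

uid = 123456789
db, table = route(uid)
print(f"user_db_{db}.users_{table}")   # physical location of this user's row
```

Routing through a fixed number of logical slots (here 32) keeps reads and writes evenly spread, and leaves room to remap slots to more physical databases later.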

Practical cases

The first and most important step is choosing the sharding column, and the choice directly determines whether the whole sharding scheme ultimately succeeds. It is strongly tied to the business. In the author's view, the way to choose is to analyze your API traffic: prioritize the high-traffic APIs, extract the SQL behind them, and take the common conditions of that SQL as candidate sharding columns. For example, a typical OLTP system serves end users, and the SQL behind those APIs almost always carries a user-ID condition, so the user ID is an excellent sharding column.

The following sections use several real tables as examples to illustrate how to shard.

The users table

The core fields of a user table are generally as follows:

uid is the primary key. The business has two query patterns, by uid and by mobile, so an index must also be created on mobile. The switch column is of type BIGINT and stores BOOL-typed user attributes: each bit holds one attribute of the user. For example, the first bit can store whether to receive push notifications, the second bit whether to store offline messages, and so on.
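The bit-packed switch column is manipulated with ordinary bitwise operations. A sketch with two hypothetical flags (the flag names and positions are examples, not part of any real schema):

```python
# Each bit of the BIGINT `switch` column stores one BOOL user attribute.
# The flag names and bit positions below are hypothetical examples.
RECEIVE_PUSH  = 1 << 0   # bit 0: receive push notifications
STORE_OFFLINE = 1 << 1   # bit 1: store offline messages

def set_flag(switch, flag):
    """Turn a flag on."""
    return switch | flag

def clear_flag(switch, flag):
    """Turn a flag off."""
    return switch & ~flag

def has_flag(switch, flag):
    """Test whether a flag is on."""
    return switch & flag != 0

s = 0
s = set_flag(s, RECEIVE_PUSH)
print(has_flag(s, RECEIVE_PUSH))    # True
print(has_flag(s, STORE_OFFLINE))   # False
```

In MySQL the same operations map to bitwise `|`, `&`, and `~` in UPDATE and WHERE clauses on the BIGINT column.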

This design is highly extensible (BIGINT has 64 bits and can hold 64 states, which is hard to exhaust in practice), but it also has a problem: switch is queried very frequently, and since InnoDB stores data row by row, answering a switch query requires fetching the entire row.

How can the table design be optimized for this scenario? One common approach, which we will not dwell on here, is to split the table vertically, moving switch into a table of its own.

Another option is to take advantage of InnoDB's covering indexes by creating a composite index on (uid, switch). The secondary index then contains both the uid and switch values, so a query for switch by uid can be answered from the secondary index alone, without ever accessing the row. This is very efficient.

One more thing to consider: as you can imagine, switch changes quite frequently. Would updating a switch value cause the composite index to change structurally (node splits or reordering)?

The answer is no. Because uid, the first column of the composite index, is unique and immutable, it alone determines the order of the index. Changing the switch column only updates the value of the second key inside an index entry; it does not alter the index structure.
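The argument can be illustrated by modelling the (uid, switch) index as a list of entries kept sorted by its keys: since uid is unique, updating switch never changes the entry order. A minimal sketch with made-up data:

```python
# Model the composite secondary index (uid, switch) as a sorted list of
# [uid, switch] entries. uid is unique, so it alone fixes the ordering.
index = sorted([[3, 0b01], [1, 0b10], [2, 0b00]])

order_before = [uid for uid, _ in index]

# Covered query: read switch for uid=2 straight from the index entry,
# without touching the full row.
switch_of_2 = next(sw for uid, sw in index if uid == 2)

# Update switch for uid=2 in place.
for entry in index:
    if entry[0] == 2:
        entry[1] = 0b11

# The entry order is unchanged: no node split or reordering is needed.
order_after = [uid for uid, _ in index]
print(order_before == order_after)   # True
```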

In common scenarios, users can log in with mobile_no, email, or username, yet the user-related APIs all carry user_id. It may therefore be necessary to shard by all four columns, that is, to make all four of them sharding columns.

Account table

The core fields of the account table are generally as follows:

APIs related to the account table generally carry acc_no, so use acc_no as the sharding column.

Complex queries

Everything above covers SQL whose conditions include the sharding column. However, some query conditions inevitably lack it, and we cannot maintain unlimited redundant shard copies for every low-traffic query. So how is SQL without a sharding column in its conditions handled? Taking Sharding-JDBC as an example, the query is routed concurrently to every shard, executed on each, and the results are then merged. For the details of how merging works, see the author's Sharding-JDBC article series, which explains the merge logic from the source code.
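The scatter-gather execution can be sketched as follows: fan the query out to every shard, collect the partial results, and merge them. Here the "shards" are in-memory lists already sorted by a timestamp, and the thread-pool fan-out plus sort-merge is an illustration of the idea, not Sharding-JDBC's actual code:

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Three in-memory "shards" of an orders table, each already sorted by
# create_time (illustrative data, not a real schema).
shards = [
    [(1, "2021-01-01"), (4, "2021-01-04")],
    [(2, "2021-01-02"), (5, "2021-01-05")],
    [(3, "2021-01-03")],
]

def query_shard(rows):
    """Stand-in for executing the rewritten SQL on one shard."""
    return rows   # a real implementation would filter and sort per shard

# Fan the query out to all shards concurrently, then sort-merge the
# partial results: parse -> rewrite -> route -> execute -> merge.
with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    partials = list(pool.map(query_shard, shards))

merged = list(heapq.merge(*partials, key=lambda row: row[1]))
print([order_id for order_id, _ in merged])   # [1, 2, 3, 4, 5]
```

Note that the total latency is bounded by the slowest shard, which is exactly the weakest-link effect described next.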

Compared with conditional queries that carry the sharding column, this kind of query is obviously much slower. With dozens or even hundreds of shards, if any single shard executes slowly for whatever reason, the whole SQL is slow: a textbook case of the weakest-link (barrel) principle.

Worse still are the back-office queries: fuzzy matching, or filtering on ten different criteria at once. Such queries are difficult to index even on a single table, let alone across shards. So what to do? This is where the well-known ElasticSearch (ES) comes in handy: redundantly sync the full sharded dataset into ES, and hand those complex queries over to ES.

Taobao's "My Orders" page is a good example, with multiple filter conditions plus fuzzy matching on product titles. Even a single table cannot serve this (no index can cover such a scenario), let alone a sharded one:

To be specific: avoid multiple sharding columns unless absolutely necessary, because the cost is high. The user table has one very useful property: its size is bounded. Even if all 7 billion people in the world were your users, the data volume would not be huge. The author therefore suggests the single-sharding-column + ES approach to simplify the architecture.

Conclusion

Finally, the several schemes are summarized as follows (sharding column abbreviated as SC):

                           Single SC    Multiple SC    SC + ES
  Applicable scenarios     Narrow       General        Broad
  Query timeliness         Real-time    Real-time      Near real-time
  Storage capacity         Average      Average        Larger
  Code cost                Small        Large          Average
  Architecture complexity  Simple       Moderate       Complex

In a word, sharding massive data under real concurrency is never solved simply by introducing a sharding middleware; it is a systematic engineering effort. You need to analyze the entire business around the table and let each piece of middleware do what it does best: queries carrying the sharding column go to the sharded tables, fuzzy queries and multi-criteria filters go to ES, and massive cold storage goes to HBase.

Even after all this, plenty of work remains: keeping the synchronized data consistent, and, once some tables slowly approach the single-table bottleneck again after running for a while, migrating cold data out. In short, sharding databases and tables is a very complex systems-engineering effort. Processing massive data is never simple; be ready for a long fight!