Remember: what do you say if someone asks you for the most effective way to tune a database?

SQL optimization, distributed clusters, database and table sharding! Just do it and you're done ~

But is splitting tables really the right place to start, and what do you actually understand about it? When should you split? How many ways are there?

First of all, we need to know what database sharding and table sharding actually do. The protagonist of this article is still our MySQL, so we take its point of view. Let's start with the literal meaning:

Database sharding:

A single database instance is split into multiple database instances, and the data is distributed among them.

Table sharding:

A single table is split into multiple tables, and the data is distributed among them.

Keep in mind that for large-scale Internet projects, the data volume may be beyond what we imagine. Adding tens of millions of new rows every day is common, and relying on a single MySQL server is unrealistic. However great your Xiang Yu is, he can't carry four teammates who have gone AFK!! Xiang Yu: ??

As business data and website QPS keep growing, the pressure on the database grows too. A stand-alone database soon hits storage and concurrency bottlenecks, so its performance has to be optimized. Database sharding relieves the storage pressure and scalability limits of a single MySQL instance, while table sharding addresses the query bottleneck that appears once a single table grows too large. Frankly speaking, these problems are all “hard wounds” of relational databases.

Today we are going to find out exactly when to use each strategy, based on the common strategies and scenarios for database and table sharding. The common strategies are: vertical table sharding, horizontal table sharding, vertical database sharding, and horizontal database sharding.

One, unpretentious: table sharding

1. Vertical table sharding

Vertical table sharding, or vertical table splitting: as the name hints, this strategy is based on fields! Split the fields of one table into different tables (a main table and an extended table) according to how active and how long the fields are.

Features:

Each table has a different structure;

The data in each table is also different. A linking field, usually a primary key or a foreign key, associates the rows of the sibling tables;

Joined on that field, the sibling tables together make up the full data of the original table.
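The features above can be sketched roughly as follows. This is only an illustration: an in-memory SQLite database stands in for MySQL, and the table and column names (user_main, user_ext, bio) are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Main table: short fields that are read all the time
    CREATE TABLE user_main (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Extended table: long, rarely read fields, linked by the main table's key
    CREATE TABLE user_ext (
        user_id INTEGER PRIMARY KEY REFERENCES user_main(id),
        bio     TEXT
    );
""")
conn.execute("INSERT INTO user_main (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO user_ext (user_id, bio) VALUES (1, 'a very long biography ...')")


def load_full_user(user_id):
    """Rebuild the original wide row by joining the sibling tables on the key."""
    return conn.execute(
        "SELECT m.id, m.name, e.bio "
        "FROM user_main AS m JOIN user_ext AS e ON e.user_id = m.id "
        "WHERE m.id = ?",
        (user_id,),
    ).fetchone()


full_row = load_full_user(1)
```

Everyday queries only touch user_main; the join is needed only when the long field is really wanted.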

Scene:

A few fields are hot fields with a very high update frequency. Split these fields into their own table; otherwise InnoDB row locks get really nasty and lock you up. The balance field in the user table, for example? Not mine, my balance is stable. It's always 0.

Large fields, such as text, put pressure on storage because InnoDB keeps data and indexes in the same file. On top of that, I like to use SELECT *, and, you know, the disk I/O that costs is no joke; nobody can bear it.

There is obvious business differentiation, or the table design has redundant fields. Some friends, on reading the first point, will decide that Chen Haha is a rookie: how can there be a balance field in the user table? Something's obviously wrong! They rush to the comments to roast Chen Haha first, and then happily discover it was a little tail left on purpose. Shameless, isn't it? Yes, so we have to separate the fields that belong to different businesses, which also helps the business expand later.

2. Horizontal table sharding

Horizontal table sharding, also known as “horizontal cutting”. The split is based on row data, generally according to the values of some column.

For example, we can split the phone number table by the first two or three digits: 131, 132, 133 → phone_131, phone_132, phone_133. Phone numbers have 11 digits (with the leading 1 fixed, that is 10 billion possible values), so a huge table is normal; who doesn't have a mobile phone these days, right? In this way, one large table is cut into dozens of small tables, and the amount of data in each comes right down.

Some classmates ask: how do I know which table to check for a given mobile phone number? Look again at the prefixes in the example above (they were highlighted in red in the original). For example, to look up 13100001111, I cut off the first three digits, dynamically splice them into the table name, and that's it.
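A minimal sketch of that "cut the first three digits and splice the table name" step. The set of known prefixes, the column name phone, and the function names are assumptions for illustration; the %s placeholder is the parameter style used by common Python MySQL drivers.

```python
# Hypothetical: the prefixes for which a phone_<prefix> table exists.
KNOWN_PREFIXES = {"131", "132", "133"}


def phone_table(phone: str) -> str:
    """Return the name of the shard table that holds this phone number."""
    if len(phone) != 11 or not (phone.isascii() and phone.isdigit()):
        raise ValueError(f"not an 11-digit phone number: {phone!r}")
    prefix = phone[:3]
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(f"no table for prefix {prefix!r}")
    return f"phone_{prefix}"


def build_lookup(phone: str):
    """Build the SQL text and parameters for looking up one phone number."""
    table = phone_table(phone)
    # The table name comes from a whitelist; the value itself is passed as a
    # bound parameter rather than spliced into the SQL string.
    return f"SELECT * FROM {table} WHERE phone = %s", (phone,)


sql, params = build_lookup("13100001111")
```

Only the table name is spliced in, and only after it has been checked against the known prefixes; the phone number itself never becomes part of the SQL text.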

Features:

Each table has the same structure;

The data in each table is different and there is no intersection;

The union of all tables is the full data of the table.
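The three features can be checked on a toy example. Again this is just a sketch: in-memory SQLite stands in for MySQL, and the owner column and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL
SHARD_TABLES = ["phone_131", "phone_132", "phone_133"]

# Same structure for every shard table
for table in SHARD_TABLES:
    conn.execute(f"CREATE TABLE {table} (phone TEXT PRIMARY KEY, owner TEXT)")

# Hypothetical sample rows, each written to the table chosen by its prefix
rows = [
    ("13100001111", "a"),
    ("13200002222", "b"),
    ("13300003333", "c"),
    ("13100004444", "d"),
]
for phone, owner in rows:
    conn.execute(f"INSERT INTO phone_{phone[:3]} VALUES (?, ?)", (phone, owner))

# Different data per table, no intersection
per_table = {
    table: {r[0] for r in conn.execute(f"SELECT phone FROM {table}")}
    for table in SHARD_TABLES
}

# The union of all tables is the full data
union_sql = " UNION ALL ".join(f"SELECT phone FROM {t}" for t in SHARD_TABLES)
all_phones = sorted(r[0] for r in conn.execute(union_sql))
```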

Scene:

The amount of data in a single table is too large or is growing too fast, which already affects or will affect SQL query efficiency, increases the CPU burden, and brings the bottleneck forward. Remember to split the table horizontally as early as possible, and don't ask me why.

Two, fancy: database sharding

Note that database sharding in the traditional sense is not the same thing as the familiar clustering and master-slave replication. A multi-node cluster copies one database into N copies and uses read/write splitting to balance the load across multiple MySQL services. It all revolves around a single database, called the master.

Database sharding is different. It splits the master itself into N masters, say two, and then configures 2N slave nodes for these two masters.
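To make the difference concrete, here is a rough sketch of the two routing decisions. The class names, node names, and the CRC32-modulo rule are all assumptions chosen for the example, not something a particular MySQL setup prescribes.

```python
import itertools
import zlib


class ReplicatedCluster:
    """One master plus slaves: every node holds the same data."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def node_for(self, is_write):
        # Read/write splitting: writes go to the master, reads rotate over slaves.
        return self.master if is_write else next(self._slaves)


class ShardedDatabases:
    """Several masters, each with its own slaves: every cluster holds a different slice."""

    def __init__(self, clusters):
        self.clusters = clusters

    def node_for(self, shard_key, is_write):
        # First pick the cluster that owns the key, then split reads and writes inside it.
        index = zlib.crc32(shard_key.encode()) % len(self.clusters)
        return self.clusters[index].node_for(is_write)


single = ReplicatedCluster("master", ["slave-1", "slave-2"])
sharded = ShardedDatabases([
    ReplicatedCluster("master-a", ["slave-a1", "slave-a2"]),
    ReplicatedCluster("master-b", ["slave-b1", "slave-b2"]),
])
```

In the replicated cluster the key does not matter, because every node has everything. In the sharded setup the key decides which master owns the row before read/write splitting even starts.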

1. Vertical database sharding

Vertical database splitting is an all-too-classic way to shard. It splits by table, usually to carve out new business modules or to consolidate shared modules, such as the familiar single sign-on and authentication modules. It's a familiar taste: I remember once moving some useless tables onto a very well-performing server, which I used specifically for studying; later some snitch, I still don't know who, reported me ~

Features:

Each database has different tables;

The tables are different, so the data is even more different ~ there is no intersection;

Each database is relatively independent and modular;
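As a small sketch of what "each database has different tables" looks like from the application side, here is a table-to-database map. Every table and database name below is hypothetical.

```python
# Hypothetical mapping: each table lives in exactly one database, grouped by module.
TABLE_TO_DATABASE = {
    "sso_account": "auth_db",      # single sign-on / authentication module
    "sso_session": "auth_db",
    "dict_region": "common_db",    # shared dictionaries and configuration
    "sys_config": "common_db",
    "orders": "business_db",       # the business tables that stay behind
}


def database_for(table: str) -> str:
    """Return the database that owns a table."""
    try:
        return TABLE_TO_DATABASE[table]
    except KeyError:
        raise LookupError(f"no database configured for table {table!r}") from None


def tables_by_database() -> dict:
    """Group the tables per database, one group per module."""
    grouped = {}
    for table, database in TABLE_TO_DATABASE.items():
        grouped.setdefault(database, set()).add(table)
    return grouped


modules = tables_by_database()
```

Since a table maps to exactly one database, the groups cannot overlap, which is the "no intersection" feature in the list above.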

Scene:

When you can abstract out individual business modules, when you can abstract out common areas (such as dictionaries, common time data, common configurations, etc.), or when you want a server of your own?

2. Horizontal database sharding

Split the data of one database into multiple databases based on row data. Table sharding on a grand scale? Frankly speaking, this strategy is not very practical: it is very unfriendly to backend development and full of pitfalls. I don't recommend using it; understanding it is enough.

Features:

Each database has the same structure;

Each database has different data and no intersection;

The union of all databases is the full data;
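A toy sketch of these features, and of one likely pitfall. Two in-memory SQLite databases stand in for two MySQL instances; the orders table, the modulo rule, and the function names are assumptions for illustration.

```python
import sqlite3

SHARD_COUNT = 2  # hypothetical number of databases
shards = [sqlite3.connect(":memory:") for _ in range(SHARD_COUNT)]
for db in shards:
    # Same structure in every database
    db.execute("CREATE TABLE orders (user_id INTEGER, item TEXT)")


def shard_for(user_id):
    """Pick the database that owns this user's rows (simple modulo rule)."""
    return shards[user_id % SHARD_COUNT]


def add_order(user_id, item):
    shard_for(user_id).execute("INSERT INTO orders VALUES (?, ?)", (user_id, item))


def orders_of(user_id):
    """The shard key is known, so only one database is queried."""
    rows = shard_for(user_id).execute(
        "SELECT item FROM orders WHERE user_id = ? ORDER BY rowid", (user_id,)
    )
    return [r[0] for r in rows]


def users_who_bought(item):
    """No shard key in the filter, so every database has to be asked."""
    found = []
    for db in shards:
        found += [r[0] for r in db.execute(
            "SELECT user_id FROM orders WHERE item = ?", (item,))]
    return sorted(found)


add_order(1, "book")
add_order(2, "book")
add_order(1, "pen")
```

Queries by user_id stay cheap, but anything that filters on another column has to fan out to all databases and merge the results in application code. That is probably the kind of extra work that makes this strategy unfriendly to backend developers.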

Scene:

The system's absolute concurrency has gone up and CPU and memory are under pressure. Table sharding can hardly solve the volume problem at its root, there is no obvious business boundary for a vertical split, and the master's disk is close to saturation.

Conclusion

That's it for this article, I hope you've learned something! In actual work, before choosing a database and table sharding strategy, we should first consider caching, read/write splitting, SQL optimization, and so on, because these solve the problem more directly and at lower cost.

Remember that moving a table means moving the foundations. You never know how many legacy problems are tied to the history behind a table. If it is a very large project and you respond to a few problems by proposing database and table sharding to your manager, be careful not to get beaten to death ~

Original link: blog.csdn.net/qq_39390545…