How do you store tens of millions of rows in MySQL?

How many rows can a MySQL table hold?

According to the official MySQL documentation, a table can theoretically hold (2^32)^2 rows (roughly 1.8 × 10^19). In practice, however, it is usually limited by the following two factors:

  1. myisam_data_pointer_size: the size of the data pointer MySQL uses for MyISAM tables (6 bytes by default).
  2. The table storage size, which is capped at 256TB.
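To get a feel for these numbers, here is a quick back-of-the-envelope calculation in Python. It assumes myisam_data_pointer_size is left at its default of 6 bytes, which is where the 256TB figure comes from; the variable names are only for illustration.

```python
# Theoretical row limit quoted from the MySQL documentation: (2^32)^2 rows.
max_rows = (2 ** 32) ** 2

# Assumption: myisam_data_pointer_size keeps its default value of 6 bytes.
# A 6-byte pointer can address 2^(6 * 8) bytes of table data.
pointer_size_bytes = 6
max_table_bytes = 2 ** (pointer_size_bytes * 8)
max_table_tb = max_table_bytes // 2 ** 40  # 1 TB = 2^40 bytes

print(f"row limit: {max_rows:.3e}")              # row limit: 1.845e+19
print(f"table size limit: {max_table_tb} TB")    # table size limit: 256 TB
```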

So someone might say: as long as I stay under the maximum table size and the maximum number of rows, there’s no problem, right? Not quite.

In practice, no project ever actually hits the MySQL data limits, because once the data volume grows large, queries become extremely slow, and at that point the data volume is usually still far below MySQL’s theoretical limit!

Traditional enterprise applications generally deal with small data volumes that are fairly easy to handle, but in Internet projects, tens of millions or even hundreds of millions of rows are not uncommon. To keep the database running efficiently at that scale, we have to consider splitting databases and tables.

So next, let’s talk briefly about splitting databases and tables.

Database sharding

Database sharding means splitting the data in one database into N parts and storing them on different database instances. This has two advantages:

  1. It reduces the load on a single database instance.
  2. It makes the database easier to expand.

In general, there are two different sharding rules for databases:

  1. Horizontal segmentation
  2. Vertical segmentation

Let’s take a look at the two different sharding rules.

Horizontal segmentation

Let’s start with a simple diagram to give you a sense of what horizontal segmentation is:

Suppose my DB contains three tables: table-1, table-2, and table-3. Horizontal sharding means taking my trusty sword, aiming it at the black line, and slashing once, or N times!

Once the cut is made, move the cut-off portion to another database instance, like this:

The tables that originally sat in one DB now sit in two DBs, and we can observe the following:

  1. Both DBs have the complete set of tables: however many tables the DB had before, each DB still has that many now.
  2. The data in each table is incomplete, because the rows are split across the different DBs.

This is horizontal sharding of a database, also known as sharding by row: a table’s data is split across multiple databases according to some rule applied to a field in the table, with each database holding part of the data.
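The two characteristics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for two MySQL instances, and the column names and the parity rule are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite databases play the role of two MySQL instances.
db0 = sqlite3.connect(":memory:")
db1 = sqlite3.connect(":memory:")
shards = [db0, db1]

TABLES = ["table_1", "table_2", "table_3"]

# Every shard gets the complete set of tables ...
for db in shards:
    for name in TABLES:
        db.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, payload TEXT)")

# ... but each row lands on exactly one shard, chosen here by id parity.
rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
for row_id, payload in rows:
    shards[row_id % 2].execute("INSERT INTO table_1 VALUES (?, ?)", (row_id, payload))


def table_names(db):
    """List the user tables that exist on one shard."""
    query = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [r[0] for r in db.execute(query)]


def ids_in(db, table):
    """List the row ids stored in one table on one shard."""
    return [r[0] for r in db.execute(f"SELECT id FROM {table} ORDER BY id")]


print(table_names(db0) == table_names(db1))  # True: same tables everywhere
print(ids_in(db0, "table_1"))                # [2, 4]
print(ids_in(db1, "table_1"))                # [1, 3]
```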

What kinds of rules are we talking about? This is the matter of database sharding rules, which Songko will cover in detail in later articles. For now, here are a few common ones:

  1. By date: data from different dates is stored in different databases.
  2. By ID modulo: apply a modulo operation to the ID field in the table, and save the data to different instances according to the result.
  3. By consistent hashing: a consistent hashing algorithm decides where each piece of data goes.

I’ll cover the detailed usage carefully in later articles.
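Until then, here is a minimal Python sketch of what the three rules might look like. It is not how any particular middleware implements them: the instance names, the one-database-per-year granularity, the use of MD5, and the number of virtual nodes are all assumptions chosen for illustration.

```python
import bisect
import hashlib
from datetime import date

SHARDS = ["db0", "db1", "db2"]  # hypothetical instance names


def shard_by_date(created: date) -> str:
    """Date rule: one database per year (the granularity is an assumption)."""
    return f"db_{created.year}"


def shard_by_modulo(row_id: int) -> str:
    """Modulo rule: row id modulo the number of instances."""
    return SHARDS[row_id % len(SHARDS)]


def _hash(key: str) -> int:
    # MD5 is used only as a stable, well-spread hash, not for security.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Consistent hashing with virtual nodes placed on a ring."""

    def __init__(self, nodes, replicas=100):
        # Each node appears `replicas` times on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # Pick the first virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


print(shard_by_date(date(2020, 5, 17)))  # db_2020
print(shard_by_modulo(10))               # db1

ring = ConsistentHashRing(SHARDS)
bigger_ring = ConsistentHashRing(SHARDS + ["db3"])
keys = [f"user:{i}" for i in range(1000)]
moved = sum(ring.shard_for(k) != bigger_ring.shard_for(k) for k in keys)
print(f"{moved} of {len(keys)} keys moved after adding db3")
```

The last lines hint at why consistent hashing is attractive: when an instance is added, only the keys that now belong to the new instance move, whereas the modulo rule would reshuffle most keys once the number of instances changes.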

Vertical segmentation

Let’s start with a simple diagram to get a feel for vertical segmentation:

Vertical segmentation means taking my dragon-slaying saber, aiming it at the black line, and cutting. Once the cut is made, put the different tables into different database instances, like this:

At this point, we notice the following characteristics:

  1. Each database instance holds only some of the tables.
  2. The data in each table on an instance is complete.

That’s vertical segmentation. In general, vertical sharding is done by business, with the tables for different businesses placed in different database instances.
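As a rough sketch, splitting by business boils down to a lookup from table to instance. The business domains, table names, and instance names below are all hypothetical.

```python
# Hypothetical mapping from business domain to database instance.
INSTANCE_FOR_DOMAIN = {
    "user": "user_db",
    "order": "order_db",
    "product": "product_db",
}

# Hypothetical mapping from table to the business domain that owns it.
DOMAIN_FOR_TABLE = {
    "t_user": "user",
    "t_user_address": "user",
    "t_order": "order",
    "t_order_item": "order",
    "t_product": "product",
}


def instance_for(table: str) -> str:
    """Return the instance that holds the whole table, with all of its rows."""
    return INSTANCE_FOR_DOMAIN[DOMAIN_FOR_TABLE[table]]


print(instance_for("t_order_item"))  # order_db
```

Unlike the horizontal case, the routing key here is the table itself rather than a value inside a row.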

To be honest, in a real project, vertical partitioning is not easy, because tables often have complex cross-database JOIN relationships. Deciding what to split, and whether to split at all, is a real test of the architect’s skill!
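To see why cross-database JOINs hurt, here is a small sketch of what the application has to do once two related tables live on different instances. Two in-memory SQLite databases stand in for the instances, and the table and column names are made up for the example.

```python
import sqlite3

# Assumption: in-memory SQLite databases play the role of two separate instances.
user_db = sqlite3.connect(":memory:")
order_db = sqlite3.connect(":memory:")

user_db.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, name TEXT)")
user_db.executemany("INSERT INTO t_user VALUES (?, ?)", [(1, "alice"), (2, "bob")])

order_db.execute(
    "CREATE TABLE t_order (id INTEGER PRIMARY KEY, user_id INTEGER, item TEXT)"
)
order_db.executemany(
    "INSERT INTO t_order VALUES (?, ?, ?)",
    [(10, 1, "book"), (11, 2, "pen"), (12, 1, "lamp")],
)


def orders_with_user_names():
    """Emulate `t_order JOIN t_user` when the tables sit on different instances."""
    # Query 1: fetch the orders from the order instance.
    orders = order_db.execute(
        "SELECT id, user_id, item FROM t_order ORDER BY id"
    ).fetchall()
    user_ids = sorted({user_id for _, user_id, _ in orders})
    if not user_ids:
        return []
    # Query 2: fetch the matching users from the user instance.
    placeholders = ",".join("?" * len(user_ids))
    names = dict(
        user_db.execute(
            f"SELECT id, name FROM t_user WHERE id IN ({placeholders})", user_ids
        ).fetchall()
    )
    # Stitch the two result sets together in application code.
    return [(order_id, names[user_id], item) for order_id, user_id, item in orders]


print(orders_with_user_names())
# [(10, 'alice', 'book'), (11, 'bob', 'pen'), (12, 'alice', 'lamp')]
```

What used to be a single SQL statement becomes two round trips plus merging code, and the database can no longer optimize the join for us.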

Analysis of advantages and disadvantages

After the introduction above, I believe you have some understanding of horizontal and vertical segmentation. Their advantages and disadvantages are fairly obvious too, so Songko will sum them up with you once more.

Horizontal segmentation

  • Advantages
  1. The biggest advantage of horizontal sharding is good scalability. If the sharding rules are chosen in advance, the database can easily be expanded later.
  2. It effectively improves database stability and the load capacity of the system. If the split rules are abstracted well, join operations can still be done by the database.
  • Disadvantages
  1. After horizontal sharding, transaction consistency across shards is hard to guarantee.
  2. Split rules are not easy to abstract and demand a highly skilled architect.
  3. Cross-database join performance is poor.

Vertical segmentation

  • Advantages
  1. The split generally follows business lines. After splitting, business boundaries are clear, and the approach pairs well with microservices.
  2. It is much easier to integrate or extend systems.
  3. Data maintenance is relatively simple.
  • Disadvantages
  1. The biggest problem is that a single database can still become a performance bottleneck, and individual tables are not easy to scale.
  2. Cross-database joins are not easy.
  3. Transaction processing is complex.

Conclusion

Although MySQL’s theoretical storage limit is quite high, in real development we never wait until the data no longer fits before thinking about splitting databases and tables. Long before that point, you will clearly feel database performance declining, and that is when you start to consider splitting.

Well, today I’ve introduced a few concepts, as groundwork for the formal introduction to distributed database middleware.

References:

  1. MySQL official documentation
