Preface

We all know that startups usually start out as monolithic applications, typically backed by a single database. But as the product iterates version after version, the database has to tolerate higher and higher concurrency, and that becomes a point of architectural consideration.

To solve the problem, we naturally have to talk about solutions. But there are so many options; how do we choose?

Optimization approaches

Basically, we should optimize around three key ideas: shorten the distance, reduce the data, and spread the pressure.

Shorten the distance

By "shorten the distance", I mean shortening the path a request travels from the front end to the database.

  1. Static pages. If the data on a page stays constant during certain periods, the page can be rendered statically, which speeds up access.
  2. Use caching. Caching is fast because it lives in memory, so putting a memory-based cache in front of the database reduces database accesses and speeds up reads (see the cache-aside sketch after this list).
  3. Batch reads. Under high concurrency, multiple point queries can be combined into a single query, which reduces the number of round trips to the database.
  4. Delayed writes. Under high concurrency, multiple modifications can be buffered in the cache first and flushed to the database periodically, or pushed to the database asynchronously through a cache synchronization policy (a combined sketch of batching and delayed writes follows the list).
  5. Use indexes. There are many kinds of indexes, such as normal, primary key, composite, and full-text indexes; the right index lets the database locate rows without scanning the whole table.
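To make the caching idea concrete, here is a minimal cache-aside read path in Python. The in-memory dict merely stands in for a real cache such as Redis or Memcached, and `load_product_from_db` is a hypothetical query function invented for this sketch, not an API of any particular library.

```python
import time

CACHE = {}              # key -> (value, expires_at); stand-in for Redis/Memcached
CACHE_TTL_SECONDS = 60  # assumed TTL for the example

def load_product_from_db(product_id):
    # Placeholder for a real SQL query, e.g. SELECT * FROM product WHERE id = %s
    return {"id": product_id, "name": "demo"}

def get_product(product_id):
    key = f"product:{product_id}"
    entry = CACHE.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                       # cache hit: no database round trip
    value = load_product_from_db(product_id)  # cache miss: hit the database once
    CACHE[key] = (value, time.time() + CACHE_TTL_SECONDS)
    return value
```

Every request after the first within the TTL window is served from memory, which is exactly how the cache shortens the path to the data.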
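Batch reads and delayed writes (items 3 and 4) can share one sketch. `run_query` and `run_update` below are hypothetical stand-ins for whatever database driver you use; the point is that many lookups collapse into one `IN (...)` query, and many counter increments collapse into one `UPDATE`.

```python
from collections import defaultdict

def run_query(sql, params):       # stand-in for cursor.execute + fetchall
    print("QUERY:", sql, params)
    return []

def run_update(sql, params):      # stand-in for cursor.execute + commit
    print("UPDATE:", sql, params)

def batch_get_users(user_ids):
    # One IN (...) query instead of len(user_ids) separate SELECTs.
    if not user_ids:
        return []
    placeholders = ", ".join(["%s"] * len(user_ids))
    return run_query(f"SELECT id, name FROM user WHERE id IN ({placeholders})",
                     list(user_ids))

class DelayedCounter:
    """Accumulate view-count increments in memory, flush them periodically."""
    def __init__(self):
        self.pending = defaultdict(int)

    def incr(self, item_id, n=1):
        self.pending[item_id] += n        # absorbed in memory, no DB write yet

    def flush(self):                      # call from a timer or background job
        for item_id, n in self.pending.items():
            run_update("UPDATE item SET views = views + %s WHERE id = %s",
                       (n, item_id))
        self.pending.clear()
```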

Reduce the data

By "reduce the data", I simply mean making each query read and return less data.

  1. Split tables. Table splitting comes in two flavors: horizontal splitting and vertical splitting. Anyone who has run a system for a while knows that historical tables easily accumulate millions of rows; at that scale, even with indexes and well-tuned SQL, it is hard to make MySQL query much faster, so we split the table instead. Most commonly we split horizontally along the time dimension, keeping this year's data in the main table and moving last year's data into another table (a routing sketch follows this list).
  2. Separate the active data. This is a bit like caching, except the data still lives in MySQL. For example, in an item-lookup business, the most popular or most frequently searched items can be kept in a smaller active table: query the active table first, and only fall back to the full items table on a miss.
  3. Blocking. Blocking is similar to the "index-sequential (block) search" from classic algorithms: the data is organized into blocks, and a lookup first finds the right block and then searches only within it (see the block-search sketch after this list).
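As a rough illustration of a horizontal split along the time dimension, here is a sketch that routes a query to a per-year orders table. The `orders_<year>` table names and the query shape are assumptions made for the example, not a prescribed schema; executing the statement is left to your database driver.

```python
from datetime import date

def orders_table_for(day: date) -> str:
    # 2024 orders -> orders_2024, 2023 orders -> orders_2023, and so on.
    return f"orders_{day.year}"

def find_orders_sql(user_id: int, day: date):
    # Build the statement and parameters; run them with your DB driver of choice.
    table = orders_table_for(day)
    return (f"SELECT * FROM {table} WHERE user_id = %s AND order_date = %s",
            (user_id, day))

print(find_orders_sql(42, date(2023, 5, 1)))
# ('SELECT * FROM orders_2023 WHERE user_id = %s AND order_date = %s',
#  (42, datetime.date(2023, 5, 1)))
```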
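And here is a small sketch of the block (index-sequential) lookup mentioned in item 3: a tiny index records the largest key in each sorted block, so a search picks the block first and scans only that block instead of the whole data set.

```python
import bisect

def build_block_index(blocks):
    # blocks: list of sorted lists, with keys increasing across blocks.
    return [block[-1] for block in blocks]          # max key per block

def block_search(blocks, index, key):
    i = bisect.bisect_left(index, key)              # which block could hold the key
    if i == len(blocks):
        return None
    return key if key in blocks[i] else None        # scan only that one block

blocks = [[2, 5, 8], [11, 14, 17], [21, 30, 44]]
index = build_block_index(blocks)                   # [8, 17, 44]
print(block_search(blocks, index, 14))              # 14
print(block_search(blocks, index, 15))              # None
```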

Spread the pressure

By "spread the pressure", I mean spreading the load across different database servers.

  1. Clustering. The idea of a cluster is familiar: for application servers, it means deploying multiple servers that run the same business process and spreading requests across them through load balancing or similar means. Databases can be clustered too, with requests directed to specific database servers by particular rules and policies.
  2. Distribution. Distribution means splitting business logic that used to run in a single process across different servers, so the pieces execute "concurrently" and the whole job finishes faster.
  3. Split databases and tables. Table splitting is mainly horizontal and vertical. For a single table with high access frequency and a large amount of data, splitting it horizontally along a suitable dimension reduces the rows in any one table and increases database throughput. For tables whose business domains are loosely coupled, storing them in different databases (a vertical database split) improves write capacity (see the shard-routing sketch after this list).
  4. Set up a primary and replicas. The purpose of a primary/replica (master/slave) setup is read/write splitting. As long as the database's isolation level is high enough, concurrent reads do not corrupt data, while concurrent writes can. So we generally keep writes on the primary and reads on the replicas: the primary handles transactional operations (insert/update/delete), the replicas handle SELECT queries, and the changes produced by those transactional operations are replicated out to the replica databases in the cluster (a read/write routing sketch follows the list).
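For clustering rules and horizontal database splits alike, the core is a routing function that maps a key to a server. The sketch below uses simple modulo hashing on the user id; the DSN strings are placeholders, and real systems often prefer consistent hashing so that adding a shard does not remap every key.

```python
# Placeholder connection strings, not real servers.
SHARDS = [
    "mysql://db0.internal/app",
    "mysql://db1.internal/app",
    "mysql://db2.internal/app",
]

def shard_for_user(user_id: int) -> str:
    # Simple modulo routing: the same user always lands on the same database.
    return SHARDS[user_id % len(SHARDS)]

print(shard_for_user(7))   # mysql://db1.internal/app
```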
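Finally, a minimal sketch of read/write splitting for a primary/replica setup: plain SELECTs outside a transaction go to a replica chosen at random, and everything else goes to the primary. The connection values are stand-ins for real connections or a connection pool.

```python
import random

PRIMARY = "primary-connection"            # stand-in for a real connection/pool
REPLICAS = ["replica-1", "replica-2"]     # stand-ins for replica connections

def pick_connection(sql: str, in_transaction: bool = False) -> str:
    is_read = sql.lstrip().lower().startswith("select")
    if is_read and not in_transaction:
        return random.choice(REPLICAS)    # spread reads across the replicas
    return PRIMARY                        # inserts/updates/deletes hit the primary

print(pick_connection("SELECT * FROM user WHERE id = 1"))          # a replica
print(pick_connection("UPDATE user SET name = 'x' WHERE id = 1"))  # primary
```

Reads issued inside a transaction are kept on the primary here, since replication lag could otherwise return stale data mid-transaction.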

Conclusion

Finished!