How to choose a distributed database? What does qunar's database director suggest

As the Open Source Database Conference (ODF) draws to a successful conclusion in Beijing, many people may still be indulging in technology, because these technologies are the focus of every practitioner and even their own livelihood. Only such technical meetings can cause technical collision and resonance. As one of the big highlights of the conference, “MariaDB/MySQL vs PostgreSQL Armageddon,” the room was full of fireworks. I have the honor to participate, and become a member of the MySQL clan, personally, I think “Armageddon” may not be accurate, more should be collision, as ever, in the database world, two different database is on the stage of the public, they should be the first time walk so close, I fear is that this phenomenon will appear in the future.

In fact, the technology itself is good, I personally think, we should be in line with the “let a hundred flowers bloom, a hundred schools of thought contend” attitude to learn, to use. If it weren’t for PostgreSQL, MySQL wouldn’t be as popular as it is today, and if it weren’t for MySQL, PostgreSQL would feel lonely. The world’s Most Advanced Open Source Database and The World’s most popular Open Source Database are supposed to learn from each other, Mutual progress, so such “collision”, should be repeated in the future, looking forward to the next “open source database conference” arrival.

Is MySQL the only choice?

The fact is that MySQL is the “most popular” open source database, and PostgreSQL is the “most advanced” one, as was evident in Armageddon. PostgreSQL is the “most advanced” database, but PostgreSQL has fewer users than MySQL, the “most popular” database. MySQL has done a really good job with “popular.” It’s been very successful. As you can see, as long as the database is used, most people will consider MySQL, because this problem is quite consistent with my concept, so I think, any result has its deep reasons, MySQL popular, the reasons may be as follows:

Open source, which probably goes without saying, has no advantage over PostgreSQL, which is also open source.
Simple, MySQL entry can be said to be very simple, this we should have a feeling, as long as you want to use a database, in addition to using access, MySQL may be the only choice.
Plug-in type, plug-in type is also two-sided, on the one hand, limited his development; On the other hand, it is flexible and powerful, because there are so many plug-ins that you can choose from and use freely, and users value the latter more.
Preconceived, MySQL was popular when PostgreSQL wanted to be.
Promoted by large Internet companies, in the tide of going O, for the above reasons, the promotion of large Internet companies, the first choice is MySQL, resulting in the rapid development of MySQL.

There are many reasons, but as it turns out, MySQL is really popular, and with the advent of MGR, “done with MGR”, maybe it is.

However, MySQL also has its disadvantages, that is, its storage is a single point. Of course, this is also a common problem of large universal databases. Generally, multi-point redundancy is required to achieve high availability and high performance of data. However, if the amount of data exceeds the capacity of a single disk (currently, the maximum capacity of PCIe SSD cards is 12.8 TB), MySQL may encounter a bottleneck.

Where our solution, usually split on business, such as the total amount is 20 t, then 10 clusters, each cluster is 2 t the amount of data, it can solve the problem of storage, of course, this is from the business logic to solve above, combined with routing tables are needed to control data storage node position. Such a solution would solve the problem, but more people are probably looking for a more advanced solution, the popular distributed database.

The ideal solution is that we don’t need to care about the data store, we just need to write to or read from a node. Not only can the amount of data be any size, but when this node fails, we can also start another node “on top” at any time to achieve failover, thus realizing the true “cloud storage”. In such “cloud storage”, we do not need to worry about its high availability, multiple copies, capacity, performance and other issues, nor do we need to care about whether there is multi-point write, read and write nodes can be expanded at any time, maybe this is the distributed in our mind.

Therefore, from this point of view, MGR still has the problem of single disk, which cannot solve the distributed problem in the case of huge data volume.

Distributed database

Is there a better, similar to the distributed database in our mind, I think there is, at least to this direction. Qunar has been exploring, so my requirements are basically as follows:

1. To be compatible with MySQL, because I am a heavy researcher and user of MySQL, and highly approve of the architecture and usage of MySQL database. Compatibility with MySQL this requirement, in fact, is very high, we all know. But MySQL syntax is messy (speaking of code implementation, probably more is called), very loose, if achieve 90% compatibility, that is not enough, the best to achieve 100%, this can be done? I think so.

2. High storage rate, the use of distributed database business, most should be storage analysis type, if the use of distributed database, but also need to occupy too much hardware resources, and can not store too much data, then this is very high in the cost, outweighs the loss.

3. Have a perfect circle, will inevitably encounter problems in use, a problem, now in the early stages of development of distributed, so less people in the community, but only to turn to the official, if officials can’t help (and perhaps didn’t give money), that such a database, may not be attractive, too risky.

4. Sufficient performance, in the use of distributed database, in fact, has been accepted by default to reduce the performance requirements of the condition, so our requirements just say, sufficient performance, not to go and single point MySQL to compare, because it is meaningless. Enough is good, of course, in this respect, if enough is good, that is the best.

5. Less technology stack, such demand is very high, because the technology stack is too long, will increase the cost of operation and maintenance personnel, and under the current situation of difficult to find and recruit talents, such desire is more urgent.

Is there a distributed database that meets this requirement?

The SequoiaDB giant sequoia database, which was recently shared with the open Source community at the Open Source Database Forum (ODF) conference, came right into my sights. They have been doing distributed databases for many years and have only recently appeared in the MySQL community. One of the most important reasons is that they finally figured it out, or realized the importance of MySQL, so they also maintained a close relationship with MySQL, or more accurately, giant Sequoia database, and became a member of the MySQL community, a true MySQL system.

SequoiaDB Giant sequoia database

SequoiaDB is a Chinese database product. Technically, version 3.0 of SequoiaDB uses a computation-storage separation architecture, which loosely coupled SQL and storage engines for greater optimization in resource allocation and versatility, according to the official website. SequoiaDB’s data storage engine is a distributed JSON data storage engine developed by Giant Sequoia completely from scratch. All data management, distributed control, transactions, ACID support, and so on are done in SequoiaDB’s distributed storage engine. SQL layer, SequoiaDB directly uses the native parser of mysql through the connector (SequoiasQL-mysql) to achieve full compatibility with mysql, and currently supports PGSQL and SparkSQL.

Can giant sequoia database meet my needs?

1. Why do giant sequoia database belong to MySQL system? Because one thing it does is it’s 100% compatible with the syntax of MySQL, and more specifically, it becomes a plugin for MySQL, and when it comes to plugins, I think everyone is familiar with it, because you don’t get the sense that MySQL plugins are not part of the MySQL system. So this completely meets my first requirement. As a MySQL worker, I love to see this scenario;

2. In terms of storage rate, giant sequoia database only needs three nodes;

3. In terms of sound circle, I think, as a MySQL plug-in, the circle is large enough, because we can solve the problems of MySQL Server layer by ourselves, and the remaining Giant cedar database itself may need to continue to learn and share, but at least there are many fewer problems;

4. In terms of performance, we have tested that in the case of only reading and writing to one IP port (data without partition, SDB has only one node), the performance is basically two thirds of that of MySQL single point, which is acceptable, because as a distributed database, such a way of use is necessarily inferior to that of MySQL single point. Here the focus is on testing how much performance loss, if you want to improve performance, you can increase partitions, or increase coordination nodes and other ways to achieve, so as to maximize its distributed advantages;

5. In terms of technology stack, this is closely related to MySQL. For the Server layer, it is easy to do, and the giant Sequoia storage engine is only a few independent processes, with a clear and simple architecture and no great difficulty in maintenance.

Giant fir database architecture design details

Above is the architecture diagram of the Giant Sequoia database. There are multiple modules involved here. Here is an explanation:

1. Coordination node: it is used to do data routing. Its function is more like a middleware, and it will determine the storage location of data according to the data access KEY and catalog node. Multiple coordination nodes can be used to provide higher performance;

2. Catalog node: used to store routing information and cooperate with data nodes to finally locate data;

3. Data node: used to store data;

SDB Plugin: This is the MySQL plugin, the Giant cedar database itself has nothing to do with MySQL, but MySQL through this plugin, all the interfaces to access data, the relationship between the two has been established, so SDB plugin is more of an adapter. MySQL Server and giant Cedar database protocol converter.

Architecturally, this is a true implementation of MySQL’s cloud storage solution. At this point, MySQL Server itself does not store anything, and its role is more transformed into a middleware.

As a storage engine, when creating a table, it also needs to follow the rules of MySQL itself, such as creating an FRM file. In fact, IN my opinion, this FRM file is not strongly associated with the corresponding table in the Giant Sequoia database. It is just to “fool” MySQL Server to let it know that this table exists and can access the database normally. Then, after fooling MySQL Server, it will go to the storage engine layer for access.

After successfully passing various tests of MySQL Server, access to the storage engine, because Giant sequoia realized all the storage engine and Server layer interface, so the access to the storage engine, will be successful access to giant sequoia SDB plugin, For example, fetching a piece of data, writing a piece of data, fetching data with conditions (condition Push Down feature added in MySQL5.6), etc. As long as the correct data can be successfully returned to the interface requested by the Server, the Server layer will normally process these data and finally return them to the client.

Architecture of MySQL+SequoiaDB

Before sending data or requests to the Giant Sequoia storage engine, or sending data back from the storage engine to the Server layer, all of these operations have nothing to do with the Giant sequoia, it’s all the work of the MySQL Server layer. This includes parsing, semantic analysis, query optimization, MDL locks, database permissions, master/slave replication if replication is enabled, and of course some of the commands we run regularly, such as show ProcessList; information schema; Query MySQL database information. Given these familiar characteristics, this approach is very tempting.

From the above implementation principle, the architecture from MySQL Server to Giant Sequoia database should be as shown in the figure above. SDB Plugin itself does not store data, so its role is transformed into a lightweight middleware. Sdbplugin forwards application requests to the coordination node. Is the world of giant fir database, I will not repeat, I here focus on the problem of its own architecture.

The best way to frame MySQL Server as an intermediate layer is to have multiple sets of MySQL Servers running, and applications can access any one of them. This has the following benefits:

1. The read/write performance is enhanced.

2. Read/write separation is implemented.

3. Failover can be implemented

4. Supports the multi-point write function, which has no impact on failover.

In this architecture, configuration refers more to its metadata, such as the table structure, which is used to “fool” MySQL Server, because if this information is not synchronized across multiple nodes, First of all, the MySQL Server layer will not be able to pass smoothly, so it will not be able to access the giant sequoia.

The current solution to this problem is that Giant Sequoia provides a script to subscribe to DDL operations of MySQL Server layer through the audit function of MySQL Server, so that when there is metadata change, it will automatically synchronize to other MySQL Servers, so as to realize the metadata synchronization function. This implementation, both familiar, but helpless, because this is MySQL, plug-in has no way to provide such an interface to synchronize metadata, had to implement this. However, if you are familiar enough with MySQL, this implementation is a good idea.

Let’s further refine this architecture, as shown below:

A Keepalived layer is added to the front of multiple MySQL servers (middle layer) to bind the VIP to the Server. If one of them goes down, Keepalived will automatically switch to the live Server to implement automatic failover. Of course, you can also use Keepalived to access MySQL Server without adding this layer, because there is a problem with Keepalived is the VIP switch, during normal maintenance, it will also affect the business, so it may produce a little unfriendly.

Of course, if you want to achieve no impact on services during normal maintenance, you still need to evolve again. The solution is as follows:

In this case, if MySQL Server is properly maintained, you only need to configure in the common middle layer to take the maintenance node offline, so that traffic will not be routed to the maintenance node. After all operations are completed, the maintenance can be normal. Or if it fails, the sentry will notice its state change, the sentry will connect to the common middle tier, make configuration changes, and then failover can occur.

But this architecture in fact, nothing too big necessary, because this time, the MySQL Server is very lightweight, and its role has become an intermediary role, only when the machine needs to turn it off, or some MySQL parameter is read-only, but when I have had to modify to maintenance, impact will not be too big.

As for architecture, the giant sequoias itself many nodes, how to achieve high availability failover I think it can be more reference principle of their implementation, as I know, its content is to use a raft of similar algorithms, state elections do judgment, etc., these are the current mainstream distributed implementation method, should be reliable enough, here no longer do too much here.

Test case

summary

1. Self-increment column is not supported. Self-increment column is widely used in MySQL, but when actually used, it is a meaningless column and is not used in business logic. Therefore, at present, giant sequoia does not have columns supporting this attribute, which is not a big problem for new business. For the migration of old business, it may not be good enough for compatibility. But after the exchange, officials said it would be available within a few months.

2. Metadata synchronization, which has been mentioned before, is not perfect, and the functionality is there.

After testing the functionality and performance of SequoiaDB, it meets our requirements and will be available in suitable scenarios in the near future.

The future of MySQL is definitely bright, and I hope that more open source software will join the MySQL community to collide, share and grow together. We also hope that Giant Sequoia will soon build a community of technology sharing that will benefit more people in the open source community.

About the author:

Wang Zhufeng: Database director of Qunar, executive Director of Open Source Database Committee of China Computer Industry Association. Good at database development, database management and maintenance, has been committed to the research and exploration of MySQL database source code, has a deep understanding of database principle and implementation. Worked at dream database, engaged in years of database kernel development work, after to renren, served as a senior database engineer, now in where network is responsible for the MySQL source research and operations, database management and automatic operations platform design and development and practical work, is Inception open source projects and the MySQL operations inside of the author, It is also one of the few MySQL oriented Oracle ACE in China.

ACMUG’s motto is: Open source, Open, Happy

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

How to choose a distributed database? What does qunar’s database director suggest

How to choose a distributed database? What does qunar’s database director suggest

Related Posts

TableStore combat: GEO index to create a hundred million weight store search system

Eclipse, android Studio beginner tutorial (2) project import

FalconStor, the storage company, is delisted