From October 18 to 20, the 12th China Database Technology Conference (DTCC 2021) was held at the Beijing International Convention Center. PingCAP co-founder and CTO Huang Dongxu was invited to the main venue to deliver a keynote titled "TiDB Cloud: From Product to Platform," in which he discussed why being a platform matters for database products in the cloud-native era and shared PingCAP's experience and lessons in taking TiDB from a DB to a DBaaS. The following is a transcript of the talk.

In the database industry's recent development, the science question has become more prominent than the engineering question of "how well you write code": one thing that has changed the industry profoundly is what has happened underneath. In the past, when we thought about databases and system software, we started from an assumption: the software runs on concrete hardware, and even in a distributed database each node is still an ordinary computer. That assumption has now changed. When the next generation is old enough to write code, they won't see CPUs, hard drives, and networks the way we do; they'll see an S3 API provided by AWS. This is not merely a change in the carrier of software; more importantly, the underlying logic of architecture and programming has changed. The cloud's impact on infrastructure and software is profound. For PingCAP, the clearest sign is that our investment in the TiDB Cloud service is now probably well beyond our investment in the database kernel. That is the topic I want to share today: From Product to Platform, from DB to DBaaS, and the present and future of database technology.

PingCAP’s original idea

The figure above shows my understanding of how databases have evolved. More than ten years ago we started with standalone MySQL, and in that period all we demanded of a database were simple CRUD operations. Then the data explosion around 2010 made standalone databases hard to sustain, and the only way to deploy at scale was sharding, either in the application or through middleware.

However, sharding is far too intrusive to the business. So could there be a database that is as simple to use as standalone MySQL, yet scales without the application ever thinking about shards, relying on the system's own mechanisms to expand elastically and non-intrusively? That is how PingCAP started.
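To make the intrusiveness concrete, here is a toy sketch of what manual sharding forces into the application: the app itself must route every query to the right physical database, and any cross-shard query or resharding requires application-level rework. All names here are hypothetical, not any real middleware's API.

```python
# Toy illustration of application-level sharding -- exactly the
# intrusiveness a database like TiDB is meant to remove.

def shard_for(user_id: int, num_shards: int = 4) -> str:
    """Pick a physical database by hashing the sharding key."""
    return f"mysql_shard_{user_id % num_shards}"

def route_query(user_id: int, num_shards: int = 4) -> str:
    # The table name and target instance both depend on the key;
    # a cross-shard JOIN, or changing num_shards, breaks this code.
    table = f"orders_{user_id % num_shards}"
    return f"SELECT * FROM {table} -- route to {shard_for(user_id, num_shards)}"
```

With a database that scales through its own mechanisms, both functions disappear and the application just issues `SELECT * FROM orders`.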

To achieve this small goal, PingCAP has held to the following principles since its founding more than six years ago:

Ease of use first: protocol over implementation

The MySQL protocol matters more than the MySQL software. If a database is compatible with the MySQL protocol, it inherits the largest possible user base, and users don't have to worry about the impact on their applications and business when selecting it. We don't need to invent a new way to use a database, just as electric cars are still driven with a steering wheel and an accelerator even though the world under the hood is completely different from a petrol car.
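In practice this means any stock MySQL driver connects to TiDB unchanged; only the host and port differ. A minimal sketch using the PyMySQL driver (the host and credentials below are placeholders):

```python
# TiDB speaks the MySQL wire protocol, so an ordinary MySQL client
# library needs no TiDB-specific code at all.

def tidb_connection_args(host: str, user: str, password: str,
                         database: str, port: int = 4000) -> dict:
    """TiDB listens on port 4000 by default but is otherwise
    addressed exactly like a MySQL server."""
    return dict(host=host, port=port, user=user,
                password=password, database=database)

# import pymysql
# conn = pymysql.connect(**tidb_connection_args(
#     "tidb.example.internal", "app_user", "secret", "test"))
# with conn.cursor() as cur:
#     cur.execute("SELECT tidb_version()")
```

Swapping `pymysql` for any other MySQL-protocol driver works the same way, which is the whole point of "protocol over implementation."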

User experience first

Performance indicators such as TPS and QPS matter, but user experience is the key to a database's success, so TiDB makes every technical decision under the principle that user experience matters. In my past experience, many Internet companies maintain so many different databases that each new one opens up another data silo. A simplified technology stack that still meets users' data-processing needs may therefore address the real pain point. Whether it's OLTP, OLAP, or HTAP, what TiDB wants to do is make everyone's life a little bit better.

Open source first

PingCAP has benefited greatly from its open-source strategy. From an ecosystem point of view, the open-source development model accumulates users quickly. TiDB 1.0 was released in November 2017, and since the project's birth we have come to know more than 2,000 users and more than 1,500 contributors; PingCAP ranks sixth in contributions among CNCF open-source organizations.

From a technical point of view, open source speeds up product iteration. In the figure, the vertical axis is the amount of code, the horizontal axis is time, and each color block is the code written in a given year. As you can see, TiDB is being rewritten almost every year: the amount of new code written each year roughly matches the entire codebase of the year before. That speed of iteration, achieved through the open-source community, is a pace of evolution no single team, company, or enterprise could reach by building a database from scratch.

Why DBaaS

Rather than dwelling on TiDB's product capabilities, I'd like to talk about how important it is to turn a product into a cloud service. Let me state the conclusion up front: in this era, for CIOs, and especially for overseas customers, adapting database products to the cloud has become a must.

We are standing at a junction in time. Technically, databases have evolved from standalone to cloud-native, and we are now at the second red line: the boundary between shared-nothing and cloud-native. From a business perspective, the business model of the entire database and infrastructure-software industry is also changing dramatically: in the past we sold licenses for private deployment, but now we want to scale, which is the on-prem to DBaaS transformation. MongoDB, as a successfully commercialized database company, has walked a very representative path: its market value has roughly doubled every year, to more than $30 billion, and its financial statements show that MongoDB Atlas, its DBaaS product, has kept a compound annual growth rate of more than 100%. That is where the value of a cloud service lies.

The TiDB platform on the cloud

Over the past two years, I have redefined our vision and mission: to serve developers around the world, anywhere, at any scale. To achieve this, going from DB to DBaaS is a must; only a service on the cloud can break through geographic limits and provide effectively unlimited computing power. And going from DB to DBaaS involves far more than swapping the underlying resources for cloud ones. Technically, we must consider cost reduction and efficiency, operations automation, multi-tenant management, and data-security compliance; commercially, the pricing model and go-to-market strategy. Next, I will talk about TiDB's work on the DBaaS journey from a technical perspective.

Cost reduction: separating compute and storage

What cloud-native technology is ultimately about is cost. In the past, TiDB processed queries jointly across TiDB and TiKV, and the boundary between compute and storage was blurry, which made it difficult to handle workloads with different resource profiles. With on-premises deployment, adding storage capacity means adding storage nodes, and because of hardware constraints, CPU and network bandwidth grow along with the disks, wasting resources. This is a problem every shared-nothing database faces.

Up in the cloud, things are different. AWS's block storage service EBS, especially the gp3 series, delivers the same IOPS at the same cost no matter which machine it is attached to, with very good performance and integration. To exploit gp3's characteristics, can we push the compute-storage boundary down, from TiKV itself into the storage layer, so that most of TiDB and TiKV become compute units and the system becomes more flexible?
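A back-of-the-envelope model shows why gp3 changes the cost equation: unlike capacity-coupled storage, IOPS and throughput are provisioned independently of size, so growing a disk no longer drags extra I/O spend along with it. The unit prices and free baselines below are illustrative defaults, not authoritative AWS pricing.

```python
# Toy gp3 cost model: capacity, IOPS, and throughput are priced
# independently, with an included baseline for IOPS and throughput.

def gp3_monthly_cost(size_gb: int, iops: int = 3000, throughput_mbs: int = 125,
                     gb_price: float = 0.08, iops_price: float = 0.005,
                     tput_price: float = 0.04) -> float:
    """Only provisioning above the included baseline costs extra;
    all prices here are illustrative assumptions."""
    extra_iops = max(0, iops - 3000)
    extra_tput = max(0, throughput_mbs - 125)
    return size_gb * gb_price + extra_iops * iops_price + extra_tput * tput_price

# Doubling capacity changes only the GB term -- the IOPS bill is untouched,
# which is what decouples "more storage" from "more everything else."
```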

The cloud's cost savings don't stop there. The real money on the cloud goes to CPU; the bottleneck will be compute, not capacity. Many optimizations become possible: clusters and instances built on Spot Instances and shared resource pools, storage services selected on demand, delivery tuned to combinations of EC2 instance types for specific scenarios, serverless compute, and elastic compute resources. Going further, as far as I can judge, beyond separating compute from storage we will also see the network, memory, and even CPU cache disaggregated, because an application, especially a distributed one, needs each hardware resource in different proportions. It's like cooking: with only one ingredient at hand you can't do much, but with many raw materials you can combine them to taste. That is the opportunity the cloud brings.

Security

Besides cost, security on the cloud is also an important issue. The public clouds TiDB officially supports are AWS and GCP. Most users run their workloads inside their own VPCs on the cloud, so there is also the step of connecting their network with ours. We cannot see the user's data, yet the user can access their business with high performance. How do we ensure security?

This is TiDB's security architecture, and security on the cloud is completely different from security on-premises. A particularly simple example: on-premises you only need to consider RBAC permissions inside the database, but on the cloud it is much more complicated, and users need a robust security system all the way from the network to the storage. The key to good security on the cloud is to never reinvent the wheel, because homegrown mechanisms almost always have security holes. So we now take full advantage of the full set of security mechanisms the cloud provides, such as key management and access rules. Better still, these services are clearly priced, so we only need to build them into the billing model.

Operation and maintenance automation

Another important point about building a DBaaS, and one actually related to cost, is operations automation. The cloud is a business of scale, and one of the hardest parts of the domestic database business right now is delivery: if one big client demands twenty people on site, is that sustainable? What we want is to support a thousand customers with a ten-person delivery team, which is a prerequisite for scale.

This is the technology stack behind TiDB's own cloud service: Kubernetes for deployment on the cloud, Gardener for federated management of multiple Kubernetes clusters, and Pulumi as the infrastructure-as-code automation tool.

Kubernetes

What does it take to turn TiDB into a cloud service? The first step is to encode every human operation as code. When TiDB needs to scale, can the system expand itself instead of a person doing it? When TiDB needs failure recovery, can the machine handle it without a human in the loop? We have turned all of TiDB's operations into a Kubernetes Operator, which means TiDB's operations and maintenance are automated. Kubernetes also masks the interface differences between cloud vendors, since every cloud vendor provides a Kubernetes service.
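The operator pattern behind this can be shown in miniature: a controller repeatedly compares desired state with observed state and emits the actions that close the gap. TiDB Operator does this against the real Kubernetes API; this self-contained sketch only models the reconcile loop with plain dicts.

```python
# Minimal model of a reconcile loop: desired vs. observed state in,
# corrective actions out. No human decides when to scale or recover.

def reconcile(desired: dict, observed: dict) -> list:
    """Return the scaling/repair actions needed to converge."""
    actions = []
    diff = desired["tikv_replicas"] - observed["tikv_replicas"]
    if diff > 0:
        actions.append(f"scale-out: add {diff} TiKV pod(s)")
    elif diff < 0:
        actions.append(f"scale-in: remove {-diff} TiKV pod(s)")
    for pod in observed.get("failed_pods", []):
        actions.append(f"recover: recreate pod {pod}")
    return actions
```

In a real operator this function runs continuously, so any drift between the spec and the cluster, whether from a user edit or a node failure, is repaired without a ticket ever being filed.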

Pulumi

As I just said, if the logic of deployment, operations, and scheduling lives in people's heads, it is unstable and unmaintainable. Our philosophy is to solidify anything that can be turned into code and never rely on people: even opening a server or buying a virtual machine becomes a script in a Pulumi program.
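The idea can be modeled in a few lines: resources are declared as data, and a plan step computes what must change, so provisioning never depends on someone remembering shell commands. Pulumi does this against real cloud APIs; this toy sketch only models the declare-diff-apply loop, and the resource names are hypothetical.

```python
# Infrastructure as code in miniature: compare declared resources
# against what actually exists and derive the create/delete plan.

def plan(declared: dict, actual: dict) -> dict:
    """Compute create/delete sets from declared vs. actual resources."""
    return {
        "create": sorted(set(declared) - set(actual)),
        "delete": sorted(set(actual) - set(declared)),
    }

declared = {"vpc-main": {}, "ec2-tidb-0": {}, "ec2-tikv-0": {}}
actual = {"vpc-main": {}, "ec2-old-node": {}}
# plan(declared, actual) would create the two new instances and
# delete the stale one; applying the same plan twice is a no-op.
```

Idempotency is the point: re-running the program against an already-correct environment produces an empty plan, which is what makes machine-driven provisioning safe.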

Gardener

TiDB uses Gardener's API to manage Kubernetes clusters across regions, and each Kubernetes cluster can be divided into TiDB clusters for different tenants, forming one large multi-cloud, multi-region, multi-AZ system. One benefit of this architecture is that users can enable TiDB on demand with the cloud provider and in the geographic region where their application already lives, keeping the technology stack unified.
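The placement logic of such a federation can be sketched simply: a control plane keeps a registry of managed Kubernetes clusters per cloud and region, and a tenant's TiDB cluster lands in the region where the tenant's application runs. The registry contents below are hypothetical.

```python
# Toy model of multi-cloud, multi-region placement: one registry of
# managed Kubernetes clusters, keyed by (cloud, region).

CLUSTERS = {
    ("aws", "us-west-2"): "k8s-aws-usw2",
    ("aws", "eu-central-1"): "k8s-aws-euc1",
    ("gcp", "us-central1"): "k8s-gcp-usc1",
}

def place_tenant(cloud: str, region: str) -> str:
    """Return the managed Kubernetes cluster a tenant's TiDB lands on."""
    try:
        return CLUSTERS[(cloud, region)]
    except KeyError:
        raise ValueError(f"no managed cluster in {cloud}/{region}") from None
```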

Commercial SLA

There is also much to consider in the SLA, which is what TiDB is doing and will keep doing. TiDB has many overseas customers, whose demands differ greatly from domestic users', and cross-data-center capability is a hard requirement for them. Because of data-security regulations in various countries, there are many restrictions on moving data, so compliant cross-data-center capabilities matter to a database. Facing Europe's GDPR, for example, if the regulated data can be kept inside Europe while only the unregulated data leaves, it saves a great deal of trouble. We believe this capability will become a critical requirement for Chinese manufacturers and customers as well, including in manufacturing and domestic compliance. On the cloud, this is easy to implement: AWS itself has a multi-AZ, multi-region architecture, so without worrying about the underlying layer, a user only needs to click in the interface to open machines in another data center, and the data flows over. There is much more to consider for global data distribution and global versus local transactions. TiDB has prepared for this in advance, and it is coming soon.
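The residency requirement boils down to a placement policy: certain tables are pinned to a set of allowed regions, and replication anywhere else is refused. A minimal sketch of that check, with hypothetical region names and policy table:

```python
# Toy data-residency check: a GDPR-scoped table may only be
# replicated to regions its placement policy permits.

POLICY = {"eu_users": {"eu-west-1", "eu-central-1"}}  # hypothetical policy

def may_replicate(table: str, target_region: str) -> bool:
    """Allow replication only to regions the table's policy permits;
    tables without a policy are unrestricted."""
    allowed = POLICY.get(table)
    return allowed is None or target_region in allowed
```

In a real system this check sits in the replica scheduler, so compliance is enforced by the database itself rather than by operational discipline.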

To provide a service on the cloud, technology is important, but compliance is the prerequisite. Ecosystem integration on the cloud has one main thread: follow the data. The upstream, the downstream, and the control of data are the three most important points. TiDB's upstream is MySQL and data files in S3; downstream, it only needs to support synchronization to Kafka or other message-queue services. For data management and control, overseas users in particular would rather integrate with platforms like Datadog and Confluent than have the database vendor manage everything. As a final note, in Q4 TiDB Cloud will offer developers a free 12-month trial with quick deployment, HTAP support by default, container-based compute isolation, and dedicated block storage for everyone to use on the cloud. Our website is tidbCloud.com, and we will support domestic clouds in the future; we look forward to your experience and feedback. Let's hope PingCAP can do exactly that: provide developers around the world with Anywhere with Any Scale.