For any rapidly growing business, handling peak traffic is a constant challenge for the technical team. Faced with massive data volumes, both database and business teams want "transparent" capacity expansion, but the popular approach of splitting databases and tables (sharding) often falls short in expansion speed and consistency. The industry has been waiting for a database solution that is both powerful and easy to use, one that fundamentally removes the database performance bottleneck under peak traffic.

Industry demand is the biggest driver of technological innovation. In recent years, TiDB, the distributed database developed by PingCAP, has shown clear advantages in massive-data processing. Against this background, JD Zhilian Cloud and PingCAP jointly built Cloud-TiDB, a cloud distributed database based on TiDB, at the beginning of 2020.

On November 26, JD Zhilian Cloud and Intel jointly held an online live event themed "Breaking the Limit: Technical Architecture and Practice of TiDB on JD Zhilian Cloud". The live broadcast invited Ge Jibin, an architect in the product R&D department of JD Zhilian Cloud, and Qi Zheng, ecosystem technology evangelist for PingCAP's TiDB, to share their experience. We hope this helps more enterprises and developers broaden their thinking, offers a new option beyond database and table splitting, and shows how to realize the value of TiDB in production practice.

This article is based on that live share; the content has been lightly edited.

I. Technical architecture and practice of TiDB in JD Zhilian Cloud

In the first part of the live broadcast, Ge Jibin analyzed in depth why JD Zhilian Cloud chose the TiDB database, and introduced the technical architecture and ecosystem details of TiDB on JD Zhilian Cloud.

1. What problems the TiDB database aims to solve

Traditional stand-alone databases have exposed more and more limitations in the era of big data. For a rapidly growing enterprise, as data volume grows with business scale, a stand-alone database soon runs into multiple bottlenecks:

  • The data volume of a single table or a single database becomes too large;

  • Single-node storage capacity reaches its upper limit;

  • Single-machine processing capacity hits its ceiling;

  • Read latency and storage requirements keep growing, while write performance cannot scale out;

  • The traditional way to keep improving performance is to split databases and tables, but this brings many inherent constraints;

  • For high availability, MySQL has to rely on external components;

  • Fault detection, master/slave arbitration, and failover for MySQL all require customized policies;

  • Asynchronous and semi-synchronous replication cannot guarantee consistency, increasing the risk of data loss;

  • In addition, MySQL's OLAP capability is weak, and data analysis requires ETL into an external analytics system.

All of the above are bottlenecks of the traditional database approach. TiDB was created to solve these common problems and to break through the capability limits of stand-alone databases at the root.

At the technical level, what remedies does TiDB offer for these problems? The first thing to be clear about is that TiDB is not a traditional stand-alone architecture but a truly distributed database. Its design of separating compute from storage provides horizontal, linear scalability, along with strong consistency, high availability, automatic failover, and real-time data analysis, and it is highly compatible with the MySQL protocol. In terms of overall architecture, TiDB is divided into the TiDB Server layer, the distributed storage layer, and PD:

  • TiDB Server is compatible with the MySQL protocol and scales horizontally, so users can access TiDB just as they would MySQL (a short connection sketch follows this list).

  • The storage layer, TiKV, is a distributed key-value store that scales linearly and ensures strong consistency through multiple replicas and the Raft protocol. Data is split into Regions and managed per Region, distributed across the TiKV nodes, so the storage layer can be scaled out horizontally.

  • PD (Placement Driver) is responsible for cluster management, including scheduling and load balancing, and for generating global TSO timestamps. PD itself is a cluster with no single point of failure.
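As a quick, hedged illustration of the MySQL compatibility mentioned in the list above, the sketch below connects to a TiDB Server with an ordinary MySQL driver (pymysql). The host, credentials, and the `users` table are placeholders rather than details from the talk; the only TiDB-specific assumption is the default client port 4000.

```python
# Minimal sketch: a stock MySQL driver talking to TiDB Server.
# Host, credentials and the `users` table are placeholders.
import pymysql

conn = pymysql.connect(
    host="tidb.example.internal",  # TiDB Server address (placeholder)
    port=4000,                     # TiDB's default MySQL-protocol port
    user="app_user",
    password="***",
    database="app_db",
)
try:
    with conn.cursor() as cur:
        # Ordinary SQL: TiDB Server parses and plans it, PD hands out TSO
        # timestamps, and the rows live in TiKV Regions underneath.
        cur.execute("SELECT id, name FROM users WHERE id = %s", (42,))
        print(cur.fetchone())
    conn.commit()
finally:
    conn.close()
```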

TiDB uses the TiFlash columnar storage engine to support real-time data analysis. TiFlash replicates data asynchronously through Raft Learner, provides strongly consistent reads with MVCC, and supports computation push-down, so the AP and TP workloads do not interfere with each other. With TiFlash in place, the TiDB optimizer estimates the cost of each query and automatically chooses between TiKV row storage and TiFlash column storage.
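To make the row-store/column-store choice concrete, here is a hedged sketch that reuses the connection from the previous snippet; the `orders` table and its columns are made up. It asks TiDB to maintain one TiFlash replica of the table and then uses EXPLAIN to see which engine the optimizer picked.

```python
# Hedged sketch: add a TiFlash column-store replica and let the optimizer choose.
with conn.cursor() as cur:
    # Maintain one TiFlash replica of this table; data is replicated to
    # TiFlash asynchronously via Raft Learner peers.
    cur.execute("ALTER TABLE orders SET TIFLASH REPLICA 1")

    # Replication progress: PROGRESS approaches 1 as the replica catches up.
    cur.execute(
        "SELECT AVAILABLE, PROGRESS FROM information_schema.tiflash_replica "
        "WHERE TABLE_SCHEMA = 'app_db' AND TABLE_NAME = 'orders'"
    )
    print(cur.fetchone())

    # The cost-based optimizer now picks an engine per query; in the plan the
    # task column typically reads cop[tikv] for the row store or cop[tiflash]
    # for the analytical column store.
    cur.execute("EXPLAIN SELECT region, SUM(amount) FROM orders GROUP BY region")
    for row in cur.fetchall():
        print(row)
```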

Based on this architecture, a TiDB cluster achieves overall high availability and strong data consistency. Even if a minority of replicas is lost, data recovery and failover complete automatically without involving the business layer. TiDB also enables cross-data-center, multi-active deployment.

2. Implementation and functions of TiDB on the cloud

In recent years, JD Zhilian Cloud customers have had growing demand for data processing capability. To meet this demand, JD Zhilian Cloud and PingCAP jointly created Cloud-TiDB, a cloud distributed database based on TiDB, mainly oriented to scenarios requiring high performance, high reliability, and high availability.

The figure above shows the overall architecture of JD Zhilian Cloud-TiDB. Based on this architecture, Cloud-TiDB provides a set of functions with high business value, including elastic horizontal scaling, backup and recovery, real-time data analysis, data migration and synchronization, and cloud monitoring and alerting.

  • **Elastic horizontal scaling.** TiDB can add and remove storage and compute nodes online, giving nearly unlimited horizontal scalability. After scaling, the database automatically rebalances data (a sketch of observing this rebalancing follows after this list).

  • **Backup and restore.** TiDB supports automatic and manual full backups and stores backup data in OSS. During recovery, data does not overwrite the original cluster, which guards against misoperation.

  • **Real-time data analysis.** Provides real-time analysis and processing of business data while still serving OLTP traffic.

  • **Data migration and synchronization.** Supports full and incremental migration and can synchronize data to downstream storage such as MySQL and Kafka.

  • **Monitoring and alerting.** TiDB provides rich monitoring metrics that can be viewed directly in a browser. Cloud monitoring provides alarms at the resource and service levels, and cloud logging can be configured with error-log alarms.

Beyond these capabilities, another advantage of Cloud-TiDB in practice is lower operation and maintenance cost. Cloud-TiDB fits the on-demand scaling model of cloud services well, so users can control resource usage precisely and avoid waste.
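As a hedged sketch of the elastic scaling point above (the connection details and the `orders` table are placeholders): after new storage nodes join, the per-store region counts and the regions backing a specific table can be watched through SQL while PD rebalances the data.

```python
# Hedged sketch: watch PD rebalance Regions after a scale-out.
import pymysql

conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="app_user", password="***", database="app_db")
with conn.cursor() as cur:
    # Per-store view: how many Regions each TiKV node currently holds;
    # the counts converge as PD moves Regions onto the new nodes.
    cur.execute(
        "SELECT STORE_ID, ADDRESS, REGION_COUNT "
        "FROM information_schema.tikv_store_status"
    )
    for store in cur.fetchall():
        print(store)

    # Per-table view: the Regions backing one table and where they live.
    cur.execute("SHOW TABLE orders REGIONS")
    print(cur.fetchall()[:5])
conn.close()
```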

Choosing a new technology also means choosing an ecosystem, and the more complete the ecosystem, the more efficient development and operations become. One feature of the TiDB ecosystem is its compatibility with the MySQL protocol, which lets it benefit from the mature MySQL ecosystem: MySQL drivers, third-party development and management tools, and data exchange and migration tools can all be used with TiDB.

TiDB also integrates easily with other mainstream data processing technologies. For example, TiDB data can be exported to Kafka, connected to Flink, and even to Hive, HDFS, Amazon S3, and Spark. Users do not have to worry about technology lock-in, which lays the foundation for the prosperity of the TiDB ecosystem.

At the end of his share, Ge offered an outlook on the development of cloud databases:

Distribution is one of the important trends in technology: operating systems, applications, and databases are all moving toward distributed architectures. As a distributed database, TiDB is in line with this trend. At the same time, running databases on the cloud brings many benefits, such as flexible scheduling, integration with AI, a better understanding of the user's business, and intelligent optimization of data processing.

In the long run, moving databases to the cloud pays off in development efficiency, operations, and stability. That is why JD Zhilian Cloud chose to put TiDB on the cloud: to give users a better experience.

II. Application of TiDB in large data volume and high concurrency scenarios

Following Ge's talk, Qi Zheng, TiDB ecosystem technology evangelist from PingCAP, shared TiDB's application practice in scenarios with large data volumes and high concurrency.

1. Comparing TiDB and sharding solutions in OLTP scenarios

When enterprises face massive data, they are often also under pressure from data volume growing rapidly in a short time. Such services require the database to scale quickly, handle high concurrency, and keep response latency low enough and throughput high enough to absorb sudden traffic. OLTP scenarios mainly involve online consumer-facing transactions and place high demands on database stability, because fluctuations in database performance directly affect user experience.

Given these requirements, common industry solutions fall into several major categories, including sharding, middleware-based databases, and NewSQL. ShardingSphere and TiDB are representative examples of the sharding-middleware approach and the NewSQL approach, respectively; both are active open-source projects and embody two different answers to the need for massive data. Sharding means splitting databases and tables; in practice it is mainly divided into horizontal and vertical splitting. The vertical dimension splits by business module or data series, while the horizontal dimension splits by module, time, or hot and cold data. ShardingSphere is representative of the database and table splitting approach, and its structure is roughly as follows:

Compared with TiDB, a sharding solution requires third-party tools for data backup, high availability, and alerting and monitoring, whereas TiDB itself is a complete solution that meets high-performance database requirements in one stop. Newer versions of ShardingSphere are also evolving toward an integrated distributed database solution, which confirms that distributed databases are the inevitable trend.
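For readers less familiar with sharding, the toy sketch below (purely illustrative, not from the talk) shows the hash-routing idea that sharding middleware automates, and why adding shards later forces large data migrations, which is exactly the pain point that TiDB's region-based rebalancing removes.

```python
# Toy illustration of hash-based horizontal sharding (not production code).
NUM_SHARDS = 4  # fixed when the schema is designed

def shard_for(user_id: int) -> str:
    """Route a row to a physical MySQL instance by hashing the shard key."""
    return f"mysql_shard_{user_id % NUM_SHARDS}"

print(shard_for(10086))  # -> mysql_shard_2

# The pain point: growing from 4 to 6 shards changes `user_id % NUM_SHARDS`
# for most keys, so existing rows must be migrated, and cross-shard queries,
# transactions and unique constraints all need extra machinery. A distributed
# database like TiDB instead splits data into Regions and lets PD move them
# between nodes transparently.
```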

2. Massive data application cases of TiDB

TiDB was originally intended to solve many of the problems of database and table splitting, but some scenarios are still not suited to migrating to TiDB: those without rapid business growth, with simple business requests, and with no distributed transaction requirements. Apart from such scenarios, most massive-data requirements are better served by migrating to TiDB. Qi listed several practical cases.

A community app's personalized home page and push service. Because of its personalized push feature for a massive user base, the database generates 3 billion records every day, and historical data reaches the trillion-record scale. The service is also highly sensitive to throughput and latency. The user's original MySQL solution was based on splitting databases and tables, but the total number of MySQL instances reached several hundred, and the risk and latency could no longer meet requirements. After evaluation, the user concluded that TiDB was the only solution that could meet its needs for scale, strong consistency, and high availability, and decided on a full migration. During the migration, PingCAP used the fast import tool TiDB Lightning together with the DM tool to make the data transfer smooth, and carried out a series of optimizations afterwards that ultimately met the requirements well. The user is particularly satisfied with the strong scaling capability of the new architecture: after migration, data volume grew gradually from 1.3 trillion to 1.8 trillion records, performance and availability remained at a very high level, and cost did not increase significantly compared with before.

A telecom personal billing system. With 8 billion records in total and high performance requirements, the original MyCAT solution was nearing its scaling limits and could store less than a year of historical data. Because of the processing bottlenecks of sharded MySQL, further splitting would introduce many problems, so the user chose TiDB for the upgrade. After migrating to TiDB, a single table can hold 10 billion records, the data retention period was extended from half a year to 3-5 years, and QPS and latency improved significantly.

Qi also introduced cases of migrating to TiDB from the order flow business of an O2O platform PMC, a core financial accounting system, and an internet finance marketing platform. The common thread is that the users' original sharded databases had hit growth bottlenecks with growing negative impact on the business; after migrating to TiDB, those bottlenecks were completely resolved, no serious failures occurred during migration, and the cost remained under control.

3. TiDB 5.0 highlights analysis

At the end of the session, Qi introduced the performance optimization highlights and details of TiDB 5.0, including the following features.

**Async Commit.** Older versions of TiDB use a two-phase commit, which is relatively expensive. Async Commit makes the second phase asynchronous, and small transactions behave similarly to one-phase commit (1PC), yielding a clear performance improvement.

**Clustered Index.** This feature suits queries whose conditions cover the primary key columns. Similar to InnoDB clustered indexes, it saves the cost of a second table lookup for such queries. In TPC-C testing, the new version shows some performance gains in this area.
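A hedged illustration of the clustered-index idea, assuming a connection like the one in the earlier sketches; the `bills` table is made up, and the CLUSTERED keyword follows the TiDB 5.0 syntax as documented at the time. Because rows are organized by the primary key itself, primary-key range queries avoid the second lookup back to the table.

```python
# Hedged sketch: a clustered primary key in TiDB 5.0 (table is illustrative).
ddl = """
CREATE TABLE bills (
    bill_id  BIGINT        NOT NULL,
    user_id  BIGINT        NOT NULL,
    amount   DECIMAL(12,2) NOT NULL,
    PRIMARY KEY (bill_id) CLUSTERED  -- rows stored by bill_id, no separate row handle
)
"""
# A primary-key range scan like this benefits: no extra table lookup is needed.
query = "SELECT bill_id, amount FROM bills WHERE bill_id BETWEEN %s AND %s"

with conn.cursor() as cur:
    cur.execute(ddl)
    cur.execute(query, (1_000_000, 1_001_000))
    rows = cur.fetchall()
```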

**Compaction Filter.** This feature mainly targets the performance jitter caused by background compaction of data. After it is enabled, the standard deviation of QPS fluctuation drops to within 5%.

**SATA SSD optimization.** By combining new features such as Compaction Filter, fsync control, and Compaction Guard, version 5.0 delivers better performance and latency on SATA SSDs than version 4.0, and the QPS jitter caused by SATA disk jitter is also significantly reduced.

III. Breaking the capacity limit: TiDB removes the performance bottleneck of enterprise databases

Compared with traditional database and table splitting, TiDB is a true one-stop distributed database solution that can fully meet enterprises' demanding requirements in scenarios such as rapid business growth, massive data with high concurrency, real-time data analysis, and financial-grade high availability. Through the two speakers' sharing in this live broadcast, the audience gained a deeper understanding of the capabilities, implementation details, and business practice of the TiDB database, as well as the outstanding advantages of the TiDB database service.

As both speakers said, distributed databases are an inevitable trend in the industry. TiDB follows this trend and will become the solution of choice for more and more enterprises struggling with database performance bottlenecks. Meanwhile, the availability of TiDB on JD Zhilian Cloud opens a convenient channel for enterprises to adopt TiDB quickly and enjoy its benefits and value as soon as possible.


