TiDB 5.1 released: a smoother enterprise database experience

Since its release, TiDB 5.0 has been deployed in production by users in finance, Internet and new economy, logistics, and other industries, and has received a lot of positive feedback:

At 58.com, TiDB 5.0 serves complex reads and join queries for data warehouse reports in the finance, housing, and other business lines. In multi-table join queries, performance improved by up to 90% compared with version 4.0.

Compared with 4.0, the overall performance of TiDB 5.0 is more stable without obvious jitter.

In Autohome's big data join and aggregation scenarios, the MPP engine of TiDB 5.0 shows clear advantages. Compared with MySQL, overall efficiency is 20 to 50 times higher.

“Customer feedback drives us forward. Our mission is to keep improving the experience for developers and DBAs and to make TiDB easy to use. Every release of TiDB is designed to address DBA pain points,” says PingCAP co-founder and CTO Dongxu Huang. “The real world is the best architect. Starting with version 5.0, TiDB has shortened its release cycle and adopted a more flexible and agile release train model. Any requirement that users raise from real-world scenarios can become a feature delivered in the next release, within a two-month cycle.”

Thanks to rapid feedback from a large number of users in real application scenarios, TiDB 5.1 has arrived quickly and delivers an even smoother enterprise database experience. It offers more stable response latency, better MPP performance and stability, and easier operation and maintenance. Developers and DBAs can build business-critical applications of any size on TiDB 5.1.

TiDB 5.1 features and user value

Support for Common Table Expressions (CTE) of the ANSI SQL 99 standard lets users write concise and maintainable SQL for complex business logic and improves development efficiency.
Further improved MPP performance and stability help users make real-time decisions faster. By supporting partitioned tables in MPP mode, adding more function expressions, and optimizing operators, version 5.1 can improve real-time analysis performance by more than an order of magnitude. Enhanced memory management and load balancing make analytical queries faster and more stable.
Version 5.1 stabilizes long-tail query latency under sudden heavy write traffic, cluster scaling, and online data import and backup. Depending on the workload, latency can be reduced by 20-70%. Query stability under high load is greatly improved, especially for latency-sensitive critical business applications such as those in the financial industry.
Column type changes are supported, improving compatibility with MySQL. Version 5.1 adds a Stale Read mode that breaks up read hotspots in read/write splitting scenarios and substantially improves read throughput. A new system table is introduced to quickly locate lock conflicts in high-concurrency transaction scenarios. The statistics analysis engine is improved so that the optimizer selects indexes more accurately, keeping business queries efficient and stable.
A friendlier operation and maintenance (O&M) experience for large clusters further reduces DBA workload. In version 5.1, cluster scale-out and data migration are 40% faster, which improves the O&M reliability of large-scale clusters. The overall backup and recovery time of large-scale clusters is reduced, and the automatic recovery mechanism after a temporary interruption of a CDC data link is optimized to further improve the reliability of data synchronization links.

Common Table Expression simplifies SQL

In financial transaction scenarios, the inherent complexity of the business can produce a single 2,000-line SQL statement with many aggregations and multiple layers of nested subqueries, and maintaining it can be a developer's nightmare. Version 5.1 supports Common Table Expressions (CTE) of the ANSI SQL 99 standard, including recursive CTEs, which greatly improves the efficiency of developers and DBAs in writing SQL for complex business logic and makes the code easier to maintain.
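As a minimal sketch of the recursive form, the statements below walk a small hierarchy with a single CTE instead of nested subqueries. The table accounts_demo and its columns are hypothetical and exist only for this illustration.

```sql
-- Hypothetical table: each account optionally points at a parent account.
CREATE TABLE accounts_demo (
    id        INT PRIMARY KEY,
    name      VARCHAR(64),
    parent_id INT
);

INSERT INTO accounts_demo VALUES
    (1, 'Group', NULL),
    (2, 'Retail', 1),
    (3, 'Corporate', 1),
    (4, 'Retail-East', 2);

-- Recursive CTE: start from the root account, then repeatedly join
-- child accounts onto the rows found so far.
WITH RECURSIVE account_tree AS (
    SELECT id, name, parent_id, 0 AS depth
    FROM accounts_demo
    WHERE parent_id IS NULL
    UNION ALL
    SELECT a.id, a.name, a.parent_id, t.depth + 1
    FROM accounts_demo AS a
    JOIN account_tree AS t ON a.parent_id = t.id
)
SELECT id, name, depth
FROM account_tree
ORDER BY depth, id;
```

The anchor part of the CTE selects the starting rows and the part after UNION ALL is evaluated repeatedly until it returns no new rows, so the same query works for a hierarchy of any depth.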

HTAP’s real-time analytics capabilities have been upgraded

Further improve MPP performance and stability

Version 5.1 further enhances the overall capabilities of the TiFlash MPP compute engine to help users make business decisions faster:

MPP supports partitioned tables. Combined with business logic, this reduces the resources consumed by analytical queries over massive data and improves query speed.
More commonly used SQL functions are supported and operators are optimized, so that queries make better use of MPP acceleration.
A convenient switch for forcing MPP mode lets users decide whether to enable MPP.
Improved distribution and balancing of cluster load eliminates hotspots and raises the overall capacity of the system.
Memory usage issues in the engine are fixed to provide a smoother experience.
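The article does not name the forced MPP switch. As a sketch only, assuming it corresponds to the tidb_allow_mpp and tidb_enforce_mpp system variables described in the TiDB documentation, and using a hypothetical table orders_demo, a session could turn it on like this:

```sql
-- Hypothetical table; MPP only applies to tables that have a TiFlash replica,
-- so the cluster must contain at least one TiFlash node.
ALTER TABLE orders_demo SET TIFLASH REPLICA 1;

-- Let the optimizer consider MPP mode for this session.
SET @@session.tidb_allow_mpp = 1;

-- Assumed to be the "forced" switch: prefer MPP regardless of the cost estimate.
SET @@session.tidb_enforce_mpp = 1;

-- Inspect the plan to confirm whether the query is executed in MPP mode.
EXPLAIN SELECT customer_id, COUNT(*) FROM orders_demo GROUP BY customer_id;
```

Setting the variables at session scope keeps the experiment local to one connection, which is a cautious way to compare plans before changing anything cluster-wide.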

Improve query analysis stability under high pressure loads

In financial business scenarios, engineers run heavy daily batch computations to generate the latest market and marketing analysis reports that support business decisions. Batch jobs must run continuously and cannot tolerate errors partway through. For this scenario, version 5.1 optimizes the request retry mechanism of TiDB and the request processing mechanism of TiKV, significantly reducing the probability of Region Unavailable errors caused by delayed data synchronization in TiFlash under heavy load.

Seamless integration with TiSpark

TiSpark 5.1 supports reading and writing clustered index tables with no additional performance overhead. The support is completely transparent to users, who can migrate to the new TiSpark right away for seamless integration with TiDB 5.1.

Reduce read/write delay jitter

In latency-sensitive application scenarios, events such as bursts of online write traffic, cluster scaling, background statistics tasks, and online data import and backup can cause jitter in P99 and P999 latency, which affects long-tail queries. TiDB 5.1 strengthens the management of the disk read and write path and limits the disk resources used by background tasks, greatly reducing interference with online services and improving the efficiency and stability of reads and writes. On AWS EC2 r5b.4xlarge instances with EBS gp3 disks, the measured results of the TPC-C benchmark (10,000 warehouses) are as follows:

When the cluster is scaled in from 6 TiKV nodes to 3, P99 response time is reduced by 20% and P999 response time is reduced by 15%.
When 200GB of data is imported online, P99 response time is reduced by 71% and P999 response time is reduced by 70%.

Enhance business development flexibility

Column type changes are supported

In typical TiDB application scenarios, binlog is often used to aggregate data from multiple upstream MySQL instances into one TiDB cluster. Previously, TiDB did not support column type changes, so if an upstream MySQL instance changed the type of a column, data synchronization to TiDB was interrupted. Version 5.1 adds support for DDL statements that change column types, which resolves this problem and further improves MySQL compatibility.
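A column type change of this kind can be sketched as follows. The table payments_demo and its columns are hypothetical, and the exact set of supported conversions should be checked against the TiDB 5.1 documentation.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE payments_demo (
    id     INT PRIMARY KEY,
    amount INT,
    code   INT
);

-- Change an integer column to a decimal type.
ALTER TABLE payments_demo MODIFY COLUMN amount DECIMAL(20, 4);

-- Change an integer column to a string type.
ALTER TABLE payments_demo MODIFY COLUMN code VARCHAR(20);
```

The two changes are issued as separate ALTER TABLE statements rather than combined into one, which keeps the sketch conservative about what a single DDL statement is allowed to do.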

Stale Read

Stale Read applies to read-heavy, write-light scenarios that can tolerate slightly old data. For example, after a Twitter user posts a message, the system generates tens of thousands or even hundreds of millions of reads, and it is acceptable for the new message to become readable only after a short delay. Such a scenario puts heavy concurrent read pressure on the database and may create read hotspots. As a result, load is distributed unevenly across nodes and overall throughput becomes a bottleneck. Stale Read lets users specify a past point in time and read data from any replica (rather than only from the leader), which spreads the load across nodes and nearly doubles overall read capacity.

/* Stale Read */
SET TRANSACTION READ ONLY AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;
SELECT * FROM T;

Quickly locate lock conflicts (experimental feature)

Application developers need to handle concurrent database transactions very carefully. Once a lock conflict occurs, it can have a serious impact on online services, and DBAs need to locate its cause quickly so that services can return to normal. TiDB 5.1 adds the Lock View system tables, which quickly identify the transactions and related SQL statements that cause a lock conflict and make lock problems faster to resolve.
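As a hedged sketch of how Lock View might be queried: the table and column names below (information_schema.data_lock_waits and information_schema.tidb_trx) are taken from the TiDB documentation rather than from this article, and because the feature is experimental they may differ between versions.

```sql
-- List the lock waits that are happening right now: which transaction is
-- waiting, and which transaction currently holds the lock.
SELECT * FROM information_schema.data_lock_waits;

-- Look up details of the blocking transactions (session, user, database).
-- Column names are assumptions based on the TiDB documentation.
SELECT w.trx_id                 AS waiting_trx,
       w.current_holding_trx_id AS blocking_trx,
       t.session_id,
       t.user,
       t.db,
       t.state
FROM information_schema.data_lock_waits AS w
JOIN information_schema.tidb_trx AS t
  ON t.id = w.current_holding_trx_id;
```

With the session ID of the blocking transaction in hand, a DBA can decide whether to wait for it to finish or to terminate it.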

Faster and more accurate statistical information analysis

As business data keeps changing, table statistics become stale, so the optimizer chooses less accurate execution plans and queries slow down. DBAs run the ANALYZE operation to rebuild the statistics of a table. TiDB 5.1 optimizes the performance of the ANALYZE sampling algorithm, reducing the average time to generate statistics by one-third, and adds a new type of statistics that makes the optimizer's index selection more accurate.
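The manual refresh described above can be sketched as follows, assuming a hypothetical table named orders_demo. The SHOW statements are TiDB-specific.

```sql
-- Check how healthy the current statistics are (a lower value means more stale).
SHOW STATS_HEALTHY WHERE Table_name = 'orders_demo';

-- Rebuild the statistics for the table.
ANALYZE TABLE orders_demo;

-- Confirm the row count and the time the statistics were last updated.
SHOW STATS_META WHERE Table_name = 'orders_demo';
```

Checking the health value first helps avoid running ANALYZE on large tables whose statistics are still accurate.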

Improve the reliability of large cluster operation and data transmission

Backup optimization for very large numbers of tables

Version 5.1 optimizes backup for very large numbers of tables: at a scale of 50K tables, the full backup time of a TiDB cluster is reduced to 30-40% of its previous level. In addition, version 5.1 optimizes how the backup module organizes its meta information file (v2 for short). When starting BR, you can enable v2 by setting the parameter “--backupmeta-version=2”, which reduces the size of each write and lowers memory consumption. This prevents abnormal exits in environments with limited memory (≤8GB).

Improve the O&M reliability of large-scale clusters

The larger a TiDB cluster is, the longer routine O&M operations such as scaling, hardware upgrades, and node relocation take. TiDB 5.1 significantly improves the performance of data migration during scaling. Here are two sets of test results:

In a 100-node cluster, the time required to migrate all data across data centers is reduced by 20%.
The time required to add a node or to migrate the data on a node is reduced by about 40%.

Optimize memory usage

Out of memory (OOM) has always been a typical problem in the database industry. Version 5.1 includes a number of optimizations to TiDB memory usage that reduce the risk of OOM:

The window function row_number uses a fixed amount of memory regardless of the size of the data.
Reads of partitioned tables are optimized to consume less memory.
A configurable memory limit is added to the storage tier. When the limit is reached, the system releases part of the cache to reduce memory usage.
TiFlash uses 80% less memory for writes than the previous version.

Improve the reliability of CDC synchronization links

TiCDC 5.1 keeps synchronization links reliable without manual intervention. TiCDC maintains continuous synchronization in the event of environmental disturbances or hardware failures, and even if synchronization is interrupted, TiCDC automatically retries based on the actual situation.

Finally, special thanks to Xiaomi, Qihoo 360, Zhihu, iQIYI, Li Auto, Sina, Huya, Xiaodian, Cross Express, Eema Technology, and other companies and community developers for their contributions to the design, development, and testing of TiDB 5.1. It is your continued support that helps TiDB keep improving the experience of developers and DBAs in real-world scenarios and makes TiDB easier to use.

See the TiDB 5.1 Release Notes to start your TiDB 5.1 tour.