Introduction: This article describes how the high-availability (HA) architecture of the real-time data warehouse was upgraded from last year's design and successfully carried Double 11.

Author | Plum Sauce    Source | Ali Technology official account

1. Summary of Double 11 in 2021

During Alibaba's Double 11 in 2021, the high-availability real-time data warehouse built by CCO and Hologres, after two years of iteration, supported 10 core application scenarios in Alibaba Group, from intelligent service to human service, from applications to data products, and from B/C-side business to internal operations. In addition, the new generation of high-availability and disaster recovery solutions was fully launched in multiple scenarios such as the Double 11 real-time large screen and real-time monitoring dashboards. Even when the peak write rate of the Hologres primary cluster exceeded 4.5 million per second, data and services were delivered without noticeable delay.

Compared with 2020, the real-time write link was optimized this year. With Binlog consumption and dimension table Join traffic doubled, the Hologres Binlog read peak exceeded 17 million per second on the same resources, while overall resource utilization stayed stable and throughput stayed high. Also for the first time this year, a new generation of high-availability and disaster recovery solution was launched for the core promotion scenarios. The high-cost dual-instance + dual-link mode used last year was retired, which greatly reduced the cost of development, stress testing, and operations and maintenance (O&M): hundreds of redundant dual-link tasks were removed, human effort was cut by 50%, and thousands of CUs of computing resources were saved.

The rest of this article describes how the high-availability architecture of the real-time data warehouse was upgraded from last year's design and successfully carried Double 11.

Last year's highlights: "How Hologres Perfectly Supported the Double 11 Intelligent Customer Service Real-Time Data Warehouse"

2. Customer Profile

CCO stands for Chief Customer Officer, and it is also the name of the Customer Experience Business Division of Alibaba Group. In Alibaba's economy, CCO is the organizational guarantee for putting the value of "customer first" into practice, the neural network of customer experience across the whole economy, and the front line for reaching consumers and merchants. Our vision is to "become the cradle of the service ecosystem for new business" and to "make experience the core competitiveness of business". With service staff who provide professional service to consumers, merchants, and the economy, experience operations experts who keep uncovering the value of existing customers for the platform, and data, product, and technical staff who provide the underlying support for business development, we have become a unique digital service experience team in the Internet industry: a caring, responsible, and creative "Ali" team.

3. Business Challenges

Through close co-development with Hologres, CCO built a real-time, self-service, and systematic user experience data warehouse that served the 2020 Double 11 well. It supported more than a thousand service dashboards, shaved peaks by 30%, and cut cost by nearly 30%.

In 2021, however, the number of tasks grew 1.5-fold compared with 2020 and the real-time data volume more than doubled, so managing data and resources effectively became an increasingly critical problem. At the same time, the 2021 promotion was expected to bring even higher-concurrency, higher-throughput traffic. How to guarantee high availability of the real-time data warehouse while balancing stability and cost is the core challenge of building the real-time data warehouse this year.

For Double 11 in 2020, in order to cope with the traffic surge, we spent one month and invested a great deal of manpower to build a dual-link + dual-instance high-availability solution. The following is the real-time data warehouse architecture of last year's Double 11. Although this architecture withstood heavy traffic peaks such as last year's Double 11, it has several pain points in this year's more complex environment with more external challenges, mainly the following:

  • Waste of resources: Data is written to multiple instances at the same time to satisfy the active/standby setup, which wastes computing and storage resources and increases business development and O&M costs.
  • Data consistency between the active and standby links cannot be effectively guaranteed: With dual writes, when one instance lags for any reason, its data cannot be kept fully consistent with the other instance.
  • Complex O&M: Dual links mean that two sets of architecture are required. Although the build logic and development logic are the same, any O&M action on the primary link (such as scaling the configuration up or down, or a bug fix) or any logic change affects the whole system, and the standby link has to be maintained at the same time. Operations are therefore complicated and the O&M chain is long.

To solve the problems left over from 2020, the real-time data warehouse was upgraded for Double 11 in 2021 with a new generation of high-availability and disaster recovery solution. After thorough stress testing of the single link and evaluation of contingency plans, the instances adopt multiple replicas plus shared storage internally. When an unknown problem occurs, traffic can be switched quickly to a secondary instance, which improves business availability and also greatly reduces the cost of building dual links. In addition, the architecture can scale out and in quickly to handle both promotion peaks and everyday traffic, which improves its overall flexibility and achieves a better balance of cost and efficiency than last year. The high-availability solution reached production grade and was rolled out at scale for Double 11.

The following describes this year's high-availability and disaster recovery solution.

4. Business Plan

The overall data link remains the same as in 2020: source data is processed by Flink ETL and written into Hologres in real time. Row-store tables are used for online serving scenarios, column-store tables are used for complex multidimensional analysis scenarios, and Hologres connects directly to upper-layer applications according to the scenario.
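The row-store versus column-store choice can be made concrete with a small sketch. It assumes the storage format is chosen per table through Hologres's `set_table_property` procedure with the `orientation` property; the table names, columns, and the helper function are hypothetical and are not taken from the CCO deployment.

```python
def hologres_table_ddl(table, columns, primary_key, orientation):
    """Build DDL text for a Hologres table with a given storage orientation.

    Assumption: orientation is set with
    CALL set_table_property('<table>', 'orientation', 'row' | 'column').
    The function only builds a string; it does not connect to any database.
    """
    if orientation not in ("row", "column"):
        raise ValueError(f"unsupported orientation: {orientation}")
    column_defs = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        "BEGIN;\n"
        f"CREATE TABLE {table} (\n{column_defs},\n"
        f"    PRIMARY KEY ({primary_key})\n);\n"
        f"CALL set_table_property('{table}', 'orientation', '{orientation}');\n"
        "COMMIT;"
    )


# Hypothetical serving table: point lookups by key, so row storage.
serving_ddl = hologres_table_ddl(
    "user_profile_serving",
    [("user_id", "BIGINT"), ("profile", "TEXT")],
    "user_id",
    "row",
)

# Hypothetical analysis table: multidimensional aggregation, so column storage.
analysis_ddl = hologres_table_ddl(
    "service_metrics_analysis",
    [("event_id", "BIGINT"), ("channel", "TEXT"), ("wait_ms", "BIGINT")],
    "event_id",
    "column",
)

print(serving_ddl)
print(analysis_ddl)
```

Row storage suits key-based lookups in online serving, while column storage suits scans and aggregations over a few columns, which is the split the architecture relies on.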

Building on last year's solution, the architecture was upgraded to isolate the Hologres serving and analysis scenarios into separate clusters and to deploy them for high availability, forming the current real-time data warehouse 3.0 architecture.

Note: For historical reasons, some non-core services cannot be taken offline for the time being, so their data is synchronized from Hologres to a DB engine that continues to serve them.

Upgrade 1: Multiple isolation methods to meet production high availability

For high availability, this year's solution upgrade is mainly as follows:

1) Serving and analysis cluster isolation: Two instance clusters, one row-store and one column-store, effectively isolate the row-store serving scenarios from the column-store analysis scenarios, which have different QPS/RT requirements.

  • The row-store cluster carries the core real-time data and mainly handles the high-frequency interaction with Flink (data writes, Binlog consumption, and dimension table joins). It also provides online lookup services, such as feature data and user profiles, for online recommendation.
  • The column-store cluster is mainly used for complex multidimensional analysis. It is written by Flink in real time and serves a series of core scenarios such as the real-time data large screen and real-time monitoring.

2) Read/write separation high availability and disaster recovery for analysis scenarios: For the core column-store analysis scenarios that require a high level of protection during major promotions, we built two capabilities: read/write separation high availability with multiple instances on shared storage, and disaster recovery with multiple instances on non-shared storage.

  • Read/write separation high availability: Multiple instances share the same storage. The primary instance has full capabilities: it can read and write data and configure permissions and system parameters. The sub-instances are read-only. This scheme provides complete read/write separation and guarantees the SLA of different business scenarios. Loads such as high-throughput data writes, complex jobs, OLAP, ad hoc queries, and online serving are physically isolated from one another, so writes do not cause query jitter. In addition, when a sub-instance fails, traffic can be switched to another sub-instance in an emergency without data consistency problems.
  • Disaster recovery: Multiple instances are deployed in multiple data centers. Data is synchronized between instances in real time and stored redundantly on multiple file systems. If a cluster fails, an emergency switchover can be performed.
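The routing logic implied by the read/write separation scheme can be sketched as follows. This is an illustrative model only, not Hologres code: the `Instance` and `ReadWriteRouter` classes, the instance names, and the round-robin read policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    read_only: bool
    healthy: bool = True


class ReadWriteRouter:
    """Send writes to the primary and reads to healthy read-only sub-instances."""

    def __init__(self, primary, sub_instances):
        if primary.read_only:
            raise ValueError("the primary instance must be writable")
        self.primary = primary
        self.sub_instances = list(sub_instances)
        self._next = 0  # index of the sub-instance to try first on the next read

    def route_write(self):
        # Only the primary accepts writes; sub-instances are read-only.
        if not self.primary.healthy:
            raise RuntimeError("primary instance is unavailable for writes")
        return self.primary

    def route_read(self):
        # Round-robin over sub-instances, skipping any that are marked unhealthy.
        n = len(self.sub_instances)
        for offset in range(n):
            candidate = self.sub_instances[(self._next + offset) % n]
            if candidate.healthy:
                self._next = (self._next + offset + 1) % n
                return candidate
        raise RuntimeError("no healthy read-only sub-instance is available")


primary = Instance("primary", read_only=False)
olap = Instance("sub-olap", read_only=True)
serving = Instance("sub-serving", read_only=True)
router = ReadWriteRouter(primary, [olap, serving])

print(router.route_write().name)  # writes always go to the primary
print(router.route_read().name)   # reads go to a sub-instance

olap.healthy = False              # simulate a sub-instance failure
print(router.route_read().name)   # reads switch to the remaining sub-instance
```

Because all instances read the same shared storage, switching reads between sub-instances in this model needs no data copy, which is why the switchover raises no consistency concern.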

With daily incremental data in the tens of TB, real-time data synchronization latency stays below 10 ms, whether within one data center or across data centers, and whether using read/write separation high availability on shared storage or disaster recovery on non-shared storage. This fully meets the data timeliness requirements of high-protection promotion scenarios.

Upgrade 2: Real-time link optimization improves throughput

Core scenarios require high query performance and high write throughput during traffic peaks so that the next business decision is not affected. For example, the annual Double 11 transaction peak puts great pressure on writes and Binlog consumption, so the real-time link needs dedicated optimization and protection. In this year's task tuning for high-traffic scenarios, we optimized concurrency, batching, and the connector on the real-time link, which ensured high write throughput and reduced write latency to meet the needs of different business systems.

  • Optimize the connector's write bandwidth policy to work around the VIP bandwidth limit (from 5 GB/s to 20 GB/s).
  • Increase the number of shards for high-traffic row-store tables, such as the transaction table, to improve write concurrency.
  • Choose appropriate concurrency for high-traffic tasks. For example, the parameters used for the transaction table are Binlog concurrency: 512, sink concurrency: 512, batch size: 512, buffer size: 20000, ignore delete: true.
  • Appropriate batching: with suitable connector-side and server-side batching, the gap between the peak input traffic and the peak output traffic in the transaction scenario can reach 30%, which provides a degree of peak shaving and valley filling.
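The peak shaving effect of batching can be illustrated with a toy simulation. It is a sketch under stated assumptions: the per-second traffic numbers and the output cap are invented for illustration, and a single backlog counter stands in for the connector-side and server-side buffers.

```python
def smooth(input_rates, max_output_rate):
    """Simulate a buffer that forwards at most max_output_rate records per second.

    Records that cannot be forwarded in the current second stay in the backlog
    and are sent in later seconds, so bursts are flattened instead of dropped.
    """
    backlog = 0
    output = []
    for incoming in input_rates:
        backlog += incoming
        sent = min(backlog, max_output_rate)
        backlog -= sent
        output.append(sent)
    # Drain whatever is still buffered after the input ends.
    while backlog > 0:
        sent = min(backlog, max_output_rate)
        backlog -= sent
        output.append(sent)
    return output


# Hypothetical traffic in records per second, with a burst in the second step.
input_rates = [40, 100, 60, 20, 10]
output_rates = smooth(input_rates, max_output_rate=70)

peak_in = max(input_rates)
peak_out = max(output_rates)
print("output rates:", output_rates)
print("peak reduction:", (peak_in - peak_out) / peak_in)
```

No record is lost in this model: the total output equals the total input, and only the timing changes. The cost is added latency for records that wait in the buffer, so the buffer size has to be chosen against the latency budget of the downstream scenario.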

This approach is driven by strong scenario requirements as well as by objective constraints that forced a compromise in the design:

  • For historical reasons, different application systems may depend on different data serving engines. For example, some KV engines have not yet been migrated and taken offline. To keep the data consistent, we consume the Hologres Binlog and synchronize some real-time data to these other KV engines in real time, which also meets the needs of the different application systems.
  • If every engine were scaled to peak capacity to guarantee peak throughput, a great deal of resources would be wasted. Traffic relayed through the data warehouse therefore needs a certain "peak shaving and valley filling" capability so that the target engines do not have to be over-provisioned.

5. Summary

After two years of iteration, the real-time data warehouse built by CCO and Hologres has been upgraded step by step from a traditional data warehouse to the high-availability real-time data warehouse of 2021. For Double 11 in 2020, Hologres was adopted for the first time as the core real-time data warehouse solution, unifying the storage of core real-time data and part of the offline data. In 2021, the high-availability architecture was upgraded from dual-link dual writes to a read/write separation high-availability and disaster recovery architecture, which was applied in core scenarios such as Double 11 and Double 12. The number of real-time tasks grew from several hundred last year to more than a thousand, write pressure rose to 17 million+ per second, and the data volume reached several hundred terabytes, directly serving dozens of online service scenarios and hundreds of core analysis businesses. The upgrade effectively reduced the human and machine costs of building active and standby links for the real-time data warehouse and relieved the read stability pressure on core businesses. The architecture withstood every major promotion in the core scenarios and achieved production-grade high availability.

This article is the original content of Aliyun and shall not be reproduced without permission.