Abstract: As data lake technology has evolved from offline to real-time, the role of the data lake in the business has gradually shifted from supporting after-the-fact decisions to enabling real-time decisions, real-time intervention, and even advance prevention. At the same time, with data recognized as the fifth factor of production, the value of data keeps rising, which places new reliability requirements on the massive amounts of data stored in the lake.

This article is shared from the Huawei Cloud community post "FusionInsight MRS DR: Big Data Geo-Redundant DR Can Be So Easy", by Sailing27.

Background

As data lake technology has evolved from offline to real-time, the role of the data lake in the business has gradually shifted from supporting after-the-fact decisions to enabling real-time decisions, real-time intervention, and even advance prevention. At the same time, with data recognized as the fifth factor of production, the value of data keeps rising, which places new reliability requirements on the massive amounts of data stored in the lake.

First, as the storage layer for massive amounts of data, the data lake plays a vital role in enterprise data safety. How to ensure that the data in the lake is never lost, under any circumstances, is a question more and more companies are thinking about. Meanwhile, as the data lake absorbs interactive query and real-time analysis capabilities, large numbers of analysts are gradually moving their daily analysis work onto it. In this context, a data lake outage or data loss has a significant impact on the enterprise.

On the other hand, data center-level failures keep occurring around the world, with news of fires and floods emerging continuously. At the same time, misoperations during daily O&M hang over every platform like a sword of Damocles; they may be even more common than machine-room fires, but their impact on data is just as fatal.

Every time such an accident makes the news, many big data platform operations personnel worry. How to guarantee the absolute reliability of the data lake has become a concern for more and more enterprises.

For a data lake platform, the common failure types are as follows:

A data lake is generally a large-scale distributed system, and small-scope hardware faults are already handled in the system architecture, so they are not described in detail here. The following sections describe the MRS high-availability solutions for major disasters and misoperations.

Overview of the MRS DR Solutions

As the world's leading data lake platform, Huawei Cloud MRS has a relatively complete reliability plan and provides the following three solutions for different fault scenarios:

  • Data backup: Use OBS or a standby MRS cluster as the backup storage and back up key data to OBS/HDFS.

  • Single cluster across AZs: Deploy one cluster across multiple AZs. The HDFS block placement policy (BPP) and the YARN scheduling mechanism are optimized so that no data is lost and critical services are not interrupted when a single AZ fails.

  • Geo-redundant active-passive DR: Build an active MRS cluster and a standby MRS cluster, configure a DR relationship between them, and synchronize data from the active cluster to the standby cluster periodically or in real time. If the active cluster fails, services are switched to the standby cluster for rapid recovery.

The three solutions are described in detail below.

MRS Backup Solution

As a basic data protection measure, the backup solution is low-cost and simple, and compared with the other solutions it has a unique advantage in protecting against data deletion. However, backing up big data workloads is also challenging, mainly in the following aspects:

  • The big data platform contains many components whose backup mechanisms are not unified, so implementation is complicated; in some scenarios, data consistency across components must also be considered.

  • The data volume is huge, so full backups are expensive. For this reason, many big data projects do not adopt a backup solution at all.

As an enterprise-grade data lake platform, Huawei Cloud Native Data Lake (MRS) provides easy-to-use backup management and supports multiple backup storage options. The overall backup capabilities are as follows:

All data-related components are supported, such as Manager, DBService, HDFS, YARN, HBase, Hive, Elasticsearch, Redis, ClickHouse, and IoTDB. In addition, MRS provides a graphical backup configuration interface: users only need to select the data to back up and set the backup period, and the system automatically backs up data periodically while ensuring the consistency of associated data across components.

MRS supports backing data up to either a standby MRS cluster or OBS. When OBS is available, it is the preferred backup storage; otherwise, a standby MRS cluster is used, and the standby cluster can run HDFS with only two replicas to reduce storage cost.
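To make this preference concrete, here is a minimal sketch in Python of the target-selection logic described above; the class, function, and URI names are illustrative assumptions, not the MRS API:

```python
# A minimal sketch (not the MRS API) of the backup-target preference
# described above: prefer OBS when it is reachable, otherwise fall back
# to a standby MRS cluster whose HDFS can run with 2 replicas to save space.
from dataclasses import dataclass

@dataclass
class BackupTarget:
    kind: str          # "OBS" or "STANDBY_HDFS"
    uri: str
    replication: int   # HDFS replication factor; not applicable for OBS

def choose_backup_target(obs_available: bool) -> BackupTarget:
    if obs_available:
        # Object storage is preferred: cheap, durable, and off-cluster.
        return BackupTarget("OBS", "obs://backup-bucket/mrs", replication=0)
    # Fall back to a standby cluster; 2 replicas are enough for backup data.
    return BackupTarget("STANDBY_HDFS", "hdfs://standby-cluster/backup", replication=2)

if __name__ == "__main__":
    print(choose_backup_target(obs_available=True))
    print(choose_backup_target(obs_available=False))
```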

The following is the main interface for managing backup jobs:

When creating a backup job, you can select the following policies:

Backup task management:

Summary: Thanks to its advantages in coping with data loss, the backup solution provides the baseline data protection capability, in particular the ability to keep a full copy of the data to recover from accidental deletion.

MRS Single-Cluster Cross-AZ Solution

The backup solution can address data reliability, but not service reliability: if an equipment room fails, the data can be restored from the backup system, but the restoration takes a long time. To address both service and data reliability for equipment-room-level faults, MRS provides a single-cluster cross-AZ solution. Its core idea is to use the distributed nature of big data to deploy one MRS cluster across multiple AZs. At the storage layer, HDFS automatically recognizes the AZs and distributes replicas across them, so that the failure of any single AZ does not cause data loss. At the computing layer, the same queue can be deployed across AZs; when an AZ fails, tasks are automatically retried in another AZ, making AZ failover transparent to the application layer.

The deployment architecture of a single MRS cluster across AZs is as follows:

As shown in the figure above, the same MRS cluster is deployed across three AZs. At the storage tier, the three replicas of each block are placed in three different AZs according to the block placement policy (BPP), so the failure of any AZ does not cause data loss. At the computing tier, tenant queues are configured across multiple AZs, so tenant applications can still run in another AZ even if one AZ fails.
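To see why one replica per AZ survives any single-AZ failure, here is a toy Python model of an AZ-aware block placement policy; the node names and the placement function are illustrative assumptions, not MRS source code:

```python
# A toy model of AZ-aware block placement: choose one replica per AZ so
# that losing any single AZ still leaves two live copies of every block.
import random

def place_replicas(nodes_by_az: dict[str, list[str]], replication: int = 3) -> list[str]:
    azs = random.sample(sorted(nodes_by_az), k=replication)  # distinct AZs
    return [random.choice(nodes_by_az[az]) for az in azs]

nodes_by_az = {
    "AZ1": ["az1-dn1", "az1-dn2"],
    "AZ2": ["az2-dn1", "az2-dn2"],
    "AZ3": ["az3-dn1", "az3-dn2"],
}

replicas = place_replicas(nodes_by_az)
print("replicas:", replicas)

# Simulate an AZ failure: every block must keep >= 2 surviving copies.
for failed_az in nodes_by_az:
    survivors = [n for n in replicas if not n.startswith(failed_az.lower())]
    assert len(survivors) >= 2, failed_az
    print(f"{failed_az} down -> {survivors} still hold the block")
```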

Another big challenge in the cross-AZ scenario is the bandwidth between AZs, which is usually limited, so reducing the inter-AZ bandwidth required by a cross-AZ deployment is a problem that cannot be ignored. The MRS cross-AZ solution optimizes cross-AZ traffic in the following aspects:

  • Shuffle traffic within an application: Based on the self-developed Superior scheduler, the computing tasks of one application do not span AZs in normal scenarios, so shuffle traffic during execution stays within a single AZ, reducing cross-AZ traffic consumption.

  • Read traffic of service data: Read requests are served from the local AZ based on data-locality-aware scheduling, reducing cross-AZ read traffic to nearly zero.

  • Write traffic of service data: HDFS writes a large number of temporary files and logs. MRS lets you decide per directory whether replicas should span AZs, so only real service data is placed according to the cross-AZ policy, avoiding unnecessary cross-AZ traffic for temporary files and logs (see the sketch after this list).
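As referenced above, here is a hedged sketch of the per-directory placement idea; the directory prefixes and rule names are illustrative assumptions, not the actual MRS configuration:

```python
# Only business-data paths get cross-AZ replicas; temp/log paths stay in
# the local AZ to save inter-AZ bandwidth. Prefixes here are invented.
CROSS_AZ_PREFIXES = ("/warehouse/", "/user/hive/")     # protected business data
LOCAL_ONLY_PREFIXES = ("/tmp/", "/app-logs/", "/staging/")

def placement_for(path: str) -> str:
    if path.startswith(LOCAL_ONLY_PREFIXES):
        return "LOCAL_AZ"      # all replicas in the writer's AZ
    if path.startswith(CROSS_AZ_PREFIXES):
        return "CROSS_AZ"      # one replica per AZ, survives AZ loss
    return "LOCAL_AZ"          # default: do not burn inter-AZ bandwidth

for p in ["/warehouse/sales/part-0001", "/tmp/shuffle/123", "/app-logs/app42"]:
    print(p, "->", placement_for(p))
```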

As shown in the figure below, App1 runs on Queue1, a cross-AZ queue. Although this queue is associated with computing resources in both AZ1 and AZ2, the Superior scheduler developed by MRS is AZ-aware and does not distribute App1's computing tasks across the two AZs simultaneously; they all run in a single AZ.

When AZ1 fails, Superior automatically reruns the application in AZ2. The application does not need to perform its own failure retries, so the failover is completely transparent to it.
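The following toy sketch illustrates this behavior under stated assumptions (the function and AZ names are invented for illustration; this is not the Superior scheduler itself): all tasks of an application are placed in one AZ, and on AZ failure the platform reruns the whole application in a surviving AZ:

```python
# AZ-aware gang placement: every task of one app lands in a single AZ,
# and on AZ failure the platform reruns the app in a surviving AZ.
def schedule_app(app: str, queue_azs: list[str], healthy: set[str]) -> str:
    candidates = [az for az in queue_azs if az in healthy]
    if not candidates:
        raise RuntimeError("no healthy AZ for queue")
    return candidates[0]   # keep every task of the app inside this AZ

healthy = {"AZ1", "AZ2"}
az = schedule_app("App1", ["AZ1", "AZ2"], healthy)
print("App1 runs entirely in", az)

healthy.discard(az)        # the AZ fails mid-run
az = schedule_app("App1", ["AZ1", "AZ2"], healthy)   # platform-side rerun
print("App1 transparently rerun in", az)
```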

It can also be seen that no matter which AZ the application runs in, its access to the storage tier forms a closed loop within that AZ without cross-AZ access, which both preserves performance and reduces the demand for inter-AZ bandwidth.

In scenarios where resources for three full AZs are not available, MRS also supports a 2+1 deployment mode: two active AZs plus one quorum AZ. The quorum AZ stores no data and runs no service computing; it hosts only the few components that require a three-AZ quorum, such as ZooKeeper and the HDFS JournalNode.
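A short counting sketch shows why the 2+1 layout preserves quorum (the voter counts are illustrative assumptions): quorum services such as ZooKeeper and JournalNode need a strict majority of voters alive, and spreading voters over two data AZs plus a small quorum AZ keeps a majority through any single-AZ failure:

```python
# Strict-majority check for a 3-voter ensemble spread over 2+1 AZs.
VOTERS = {"AZ1": 1, "AZ2": 1, "AZ-quorum": 1}   # one voter per AZ (illustrative)
TOTAL = sum(VOTERS.values())

def has_quorum(failed_az: str) -> bool:
    alive = TOTAL - VOTERS[failed_az]
    return alive > TOTAL // 2    # strict majority required

for az in VOTERS:
    print(f"{az} down -> quorum kept: {has_quorum(az)}")
```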

In scenarios where resources are uneven across AZs, MRS also provides flexible configuration: you can choose which services (tenants) and data (tables/directories) to protect. Only the smallest AZ needs enough resources for the services and data that require cross-AZ protection; the AZs are not forced to have identical resources.

MRS provides an easy-to-use interface for cross-AZ deployment and configuration:

Enabling the cross-AZ capability for a cluster:

Selecting nodes in each AZ:

Summary: Through the co-design of the computing, storage, and cluster management layers, services can conveniently and flexibly deploy cross-AZ MRS clusters. From the service perspective, the cross-AZ cluster is still a single cluster; when an AZ fails, application retries are handled inside the platform, so the application layer does not need its own failure retries, making the high-availability capability fully transparent to applications.

MRS Active-Passive DR Solution

Although the cross-AZ solution can handle equipment-room-level faults, the network latency it requires means the AZs must be within the same city. To handle city-level fault scenarios, you need the MRS active-passive DR solution to achieve true high availability. To be clear, active-passive DR is an end-to-end solution that the big data platform layer cannot deliver unilaterally; in many cases the data sources and the application layer must participate in the architecture design. This document describes the active-passive DR solution at the big data platform layer. The following figure shows the active/standby replication scheme at this layer:

In the active-passive DR scenario, MRS provides unified DR management to address the complexity of synchronizing many components. You only need to configure the DR relationship between the active and standby clusters to obtain DR protection for all components.

In most cases, DR protects only core data and services. MRS therefore provides the concept of a protected group: multiple protected groups can be configured for a pair of active and standby clusters to protect different services and data.

Within a protected group, you can configure various protected contents, such as HDFS directories and Hive tables.

After protected groups are configured, the system shows their synchronization status and keeps a history of synchronization runs.
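As a rough illustration of the protected-group concept, the following sketch models a group that lists HDFS directories and Hive tables and records per-item synchronization status; all names are assumptions, not the MRS DR API:

```python
# A minimal model of a protected group: the items to replicate plus a
# periodic sync step that records what was shipped to the standby cluster.
from dataclasses import dataclass, field

@dataclass
class ProtectedGroup:
    name: str
    hdfs_dirs: list[str] = field(default_factory=list)
    hive_tables: list[str] = field(default_factory=list)
    status: dict[str, str] = field(default_factory=dict)

    def sync_to_standby(self, standby: str) -> None:
        # A real system would run an incremental copy (e.g. a DistCp-style
        # job); here we only record what would be shipped.
        for item in self.hdfs_dirs + self.hive_tables:
            self.status[item] = f"synced -> {standby}"

group = ProtectedGroup(
    name="core-reports",
    hdfs_dirs=["/warehouse/sales"],
    hive_tables=["dw.orders", "dw.customers"],
)
group.sync_to_standby("hdfs://standby-cluster")
for item, state in group.status.items():
    print(item, state)
```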

Summary: With the MRS active-passive DR capability, services can easily implement geo-redundant active-passive DR at the big data platform layer to cope with city-level disasters. Combined with an active-passive design on the service side, this achieves absolute high availability for the business.

Conclusion

With the three solutions described above, MRS achieves complete coverage from simple data backup to cross-AZ high availability (HA) and geo-redundant DR, supporting services in coping with various failure scenarios. The three solutions are compared as follows:

Services can flexibly choose and combine solutions based on their own characteristics and the failure scenarios they must handle. For example, the active cluster can be deployed across AZs in a city in north China while a standby cluster is built in a city in south China; this protects against AZ-level fires and power failures as well as major disasters such as a city-wide flood.
