About the author

Tao Wang is a co-founder of SequoiaDB and currently serves as the company's CTO and chief architect, responsible for the architectural design and development of SequoiaDB products. He was previously a core R&D member at the IBM DB2 lab in North America and has more than ten years of experience in database kernel architecture design, database engine development, and enterprise database applications.

Public cloud accidents have occurred frequently this year, erupting one after another like “black swans.” A Beijing-based startup recently said that after eight months on a cloud platform, all the data stored on its cloud servers, including backups, was lost, wiping out several years of platform data and causing “nearly 10 million yuan in losses.” The cloud platform apologized, and since the company's actual spending on the platform totaled only 3,569 yuan, it promised, in the interest of helping the user restore business quickly, to provide an additional 132,900 yuan in cash or cloud resources, for total compensation of 136,400 yuan, roughly 37 times what the company had spent on the platform's cloud services.

In addition, over the past six months, a number of other platforms have suffered data security incidents that many readers will have heard of, so I will not list them here. Clearly, the frequent eruption of data security “black swans” has raised public attention to data security to a new height.

As its value to the enterprise keeps increasing, data has become the enterprise's lifeline and core asset. Data security in the financial sector has become a top regulatory priority, both in China and overseas.

In China, the security of core banking systems has always been a focus of the China Banking Regulatory Commission. For most bank data centers, regulators currently require “two-site, three-center” and “active-active” capabilities to ensure data security and high availability.

“Two-site, three-center” is a construction plan comprising a production data center, a same-city disaster recovery (DR) center, and a remote DR center. In this mode, the three data centers across two cities are interconnected; if one data center fails or suffers a disaster, the others can continue to run normally and take over key services or all services. Today, some large banks have even achieved “three sites, five centers.”

An “active-active” setup deploys two data centers in the same city as peers of equal status. In normal operation, both serve business traffic in parallel, making full use of resources and avoiding the wasted investment of one or two idle backup centers. And if one data center fails or suffers a disaster, the other continues to run and takes over key services or all services, so that users never notice the fault.

Enterprises have been constantly trying new architectures and platforms for data management. Under strong supervision by the regulators, the financial industry's data security requirements are the most stringent of any industry, and only a few products and solutions can truly meet them. For a database carrying an enterprise's key data, security, reliability, and stability have always been core values. So how can a database platform achieve better data security through technology? And how can it implement data recovery, disaster recovery, and multi-active operation under exceptional conditions?

A comparison of database multi-active architectures

As enterprise customers demand ever more business continuity, the downtime windows of traditional businesses keep shrinking, and many Internet-style applications even require uninterrupted 7x24 service. This places new demands on database operations, continuous service capability, high availability, and disaster recovery.

From the RTO/RPO perspective, the database's multi-active architecture is critical to how business middleware distributes tasks. In a traditional multi-site master-slave architecture, business middleware often needs to understand the role of each data center so that read and write operations are always directed to the primary site, while read-only services that tolerate asynchronous access can be handled by the secondary center. In a complex business environment, however, this approach makes the business model extremely complex and requires many configuration adjustments during a switchover between the active and standby centers.
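To make that cost concrete, here is a minimal sketch of the role-aware routing such middleware has to carry. The host names, credentials, and the RoleAwareRouter class itself are hypothetical illustrations, not any specific product's API:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical role-aware router: the middleware must know which
// data center is currently primary, route every write there, and
// update its view of the world on each switchover.
public class RoleAwareRouter {
    // Placeholder endpoints; in practice these come from configuration
    // that must be adjusted whenever the active/standby roles change.
    private volatile String primaryUrl = "jdbc:mysql://dc-beijing:3306/app";
    private volatile String standbyUrl = "jdbc:mysql://dc-shanghai:3306/app";

    public Connection route(boolean isWrite) throws SQLException {
        // Writes must always reach the primary site; reads that
        // tolerate asynchronous replication may go to the standby.
        String url = isWrite ? primaryUrl : standbyUrl;
        return DriverManager.getConnection(url, "app", "secret");
    }

    // One of the many adjustments a switchover requires. A real
    // implementation would need to swap both fields atomically.
    public void promoteStandby() {
        String old = primaryUrl;
        primaryUrl = standbyUrl;
        standbyUrl = old;
    }
}
```

Every component that opens a connection needs this logic, or an equivalent proxy in front of it, which is exactly the complexity a peer-to-peer multi-active architecture removes.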

From an architectural perspective, multi-active in the traditional database industry is represented mainly by Oracle RAC and IBM GDPS.

Oracle RAC is usually deployed within a single data center and is built on shared disk: multiple Oracle instances run in the upper layer and connect to a SAN, while transaction control and lock-wait mechanisms are shared among the data center's Oracle instances over high-speed networks.

Figure 1: Oracle RAC abstract architecture

GDPS is a more standard multi-active architecture: IBM DB2 for z/OS databases are set up in multiple data centers, data is replicated between them through QRep, and an upper-layer workload controller distributes tasks, allowing the application to run multi-active across the databases at multiple sites.

Some traditional databases achieve application-level multi-active through business sharding. For example, databases in Beijing and Shanghai can maintain the accounts of users in the north and the south respectively, with the Shanghai and Beijing data centers serving as each other's disaster recovery nodes. This is a typical multi-active model based on partitioning application logic.

Whether it is RAC, GDPS, or application-level business sharding, the core design concept is essentially rooted in the traditional centralized database architecture. Distributed databases, by contrast, take a multi-active design approach that improves on the traditional architecture.

Since most distributed databases are designed around the separation of computing and storage, SQL parsing and execution often run in different processes from data storage and transaction control. Exploiting this characteristic, the three copies of the data can be spread across multiple data centers, with each data center hosting local SQL service nodes. From the application's point of view, there is no need to care about the underlying database's master-slave arrangement: a JDBC connection to the local SQL node is all that is needed for read and write operations.

In this architecture, every SQL node is a full peer and can handle both read and write operations. Transaction control, consistency control, lock waiting, and so on are all provided directly by the underlying distributed database.

Figure 2: Schematic of storage-computing (SQL) separation in a distributed database

Through this mechanism, the whole cluster can provide second-level RTO and RPO. Meanwhile, applications do not need to care about the primary/secondary configuration of the underlying database; they simply connect to the local MySQL service node and use the multi-active distributed database as if it were a traditional one.
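To illustrate how little the application has to know, here is a minimal sketch assuming a MySQL-compatible SQL service node listening in the local data center; the address, schema, table, and credentials are placeholders. Note that there is no primary/standby logic anywhere in the code:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LocalNodeDemo {
    public static void main(String[] args) throws Exception {
        // Connect to the SQL service node in the local data center.
        // The node is a full peer: it accepts both reads and writes,
        // and replication/consistency is handled below the SQL layer.
        String url = "jdbc:mysql://127.0.0.1:3306/bankdb"; // placeholder address
        try (Connection conn = DriverManager.getConnection(url, "app", "secret");
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(
                "UPDATE accounts SET balance = balance - 100 WHERE id = 1");
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT balance FROM accounts WHERE id = 1")) {
                while (rs.next()) {
                    System.out.println("balance = " + rs.getBigDecimal(1));
                }
            }
        }
    }
}
```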

Disaster recovery and multi-active practice with financial distributed databases

SequoiaDB implements disaster recovery, backup, and active-active capability internally. The following figure shows the remote multi-active architecture.

Figure 3: SequoiaDB remote multi-active architecture diagram

SequoiaDB has implemented a remote DR architecture for the database of a large banking enterprise. Below is a brief description of the solution based on the actual services.

Disaster preparedness architecture

The architecture is based on SequoiaDB's three-copy solution for same-city DR. Two copies are deployed in the local production environment and one copy in the DR environment, with a single cluster spanning both. To keep the DR environment consistent with the production environment in real time, SequoiaDB's data synchronization function is enabled.

After strong synchronization consistency is enabled, the application receives a data-update success message only after all surviving nodes have completed synchronization. This prevents data loss to the greatest extent possible.
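As a sketch of what this might look like through SequoiaDB's Java driver, the coordinator address and credentials below are placeholders, and the ReplSize collection option is used as I understand it from SequoiaDB's documentation, where -1 means a write is acknowledged only after all surviving replicas have synchronized it:

```java
import org.bson.BSONObject;
import org.bson.BasicBSONObject;

import com.sequoiadb.base.CollectionSpace;
import com.sequoiadb.base.DBCollection;
import com.sequoiadb.base.Sequoiadb;

public class StrongSyncDemo {
    public static void main(String[] args) {
        // Placeholder coordinator address and credentials.
        Sequoiadb sdb = new Sequoiadb("coord-host:11810", "admin", "secret");
        try {
            CollectionSpace cs = sdb.createCollectionSpace("bank");

            // ReplSize = -1 (assumed semantics): a write is acknowledged
            // only after every surviving replica, production and DR copies
            // alike, has synchronized it -- the strong-consistency mode
            // described above.
            BSONObject options = new BasicBSONObject("ReplSize", -1);
            DBCollection cl = cs.createCollection("accounts", options);

            cl.insert(new BasicBSONObject("id", 1).append("balance", 1000));
        } finally {
            sdb.disconnect();
        }
    }
}
```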

Figure 4: Physical deployment architecture of same-city DR

On top of same-city DR, a SequoiaDB cluster is deployed in a remote equipment room as the remote DR cluster, which keeps only one copy. Structured data is synchronized between the two clusters by shipping the same-city cluster's logs to the remote DR cluster and replaying them there.

Figure 5: Remote DR deployment architecture diagram

Disaster response

1. Single node failure

Thanks to the three-copy high-availability architecture, a data group can still work normally when an individual node fails. No special measures are needed for an individual node fault: simply recover the faulty node in time and restore its data through automatic or manual data synchronization.

Figure 6: Single node failure

2. Overall failure of the local production environment

If the production equipment room fails, two thirds of the nodes in the cluster are lost: each data group loses two of its three data nodes, leaving only one. If no measures are taken, the surviving nodes in the DR environment can only serve queries.

In this scenario, for the remaining copy in the DR environment to provide read and write services, SequoiaDB's cluster-splitting function is used to split the DR environment off into a standalone cluster, after which all nodes in the DR environment can serve both reads and writes. Splitting the cluster takes relatively little time, usually under 10 minutes.
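From the application's side, the takeover can be as simple as falling back to the DR SQL node once production is unreachable. The sketch below is purely illustrative: the endpoints and credentials are hypothetical, it assumes MySQL-compatible SQL nodes in front of both environments, and it is not a SequoiaDB API:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical application-side view of the takeover: once operations
// have split the DR environment into a standalone read-write cluster,
// the application only needs to fall back to the DR SQL node.
public class DrFailover {
    private static final String PROD_URL = "jdbc:mysql://prod-sqlnode:3306/bankdb"; // placeholder
    private static final String DR_URL   = "jdbc:mysql://dr-sqlnode:3306/bankdb";   // placeholder

    public static Connection connect() throws SQLException {
        try {
            return DriverManager.getConnection(PROD_URL, "app", "secret");
        } catch (SQLException productionDown) {
            // Production room unreachable. After the DR copy has been
            // split into its own cluster (usually within 10 minutes),
            // the DR node accepts writes as well as reads.
            return DriverManager.getConnection(DR_URL, "app", "secret");
        }
    }
}
```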

Figure 7: Local environment fault response diagram

After the cluster is split, the two copies in the production cluster must not simply be restarted; SequoiaDB's cluster-merging function must be used to bring the production cluster back. Otherwise the result is a split-brain: the production cluster would also begin serving data updates, the production and DR clusters would hold two divergent versions of the data, and merging them would be very difficult.

3. Overall failure of the disaster recovery environment

When the DR environment fails, the two replicas of each data group deployed in the production environment survive, so the number of surviving nodes in each data group is greater than half its total. Each data group can therefore still provide read and write services to the application layer. No special measures are needed: simply recover the faulty nodes in time and use automatic or manual data synchronization to restore their data.

Figure 8: Fault recovery in a DR environment

4. Network fault response

When a network fault prevents the local environment from communicating with the DR environment, the three-copy architecture still allows applications to access the local two-copy cluster. No special measures are needed for a same-city network fault: rectify it in a timely manner, then use automatic or manual data synchronization to restore data on the DR node.

Figure 9: Schematic diagram of network fault recovery

The user's image platform uses this same-city active-active architecture: each data group has two nodes in the main equipment room and one node in the disaster recovery room. For data synchronization it uses SequoiaDB's node-consistency function: when data is written to the master node, the database returns success only after synchronization has completed, ensuring data integrity and safety even if the main room suffers a total disaster.

When the whole equipment room fails, SequoiaDB's takeover function can quickly split the single copy in the disaster recovery room into an independent cluster that serves the business within a few minutes, so the RTO is close to zero. And since SequoiaDB enforces strong consistency of node data, the RPO is also close to zero.

Summary

Under strong financial supervision, whether the requirement is “two-site, three-center” deployment or data disaster backup, financial distributed databases keep innovating in data security.

In the future, as the core of data management, distributed databases will continue to improve data security and availability. Through innovative mechanisms such as active-active, multi-active, and highly available disaster recovery, database security will be raised to a new level.