Article | Yan Gaofei

Industrial and Commercial Bank of China

Cloud Computing Laboratory of Fintech Research Institute

Estimated reading time: **5 minutes**

 

Preface

In 2014, ICBC took the lead in exploring the transformation of the bank's core business systems to a distributed architecture and independently developed an industry-leading distributed service platform.

The registry provides the core service discovery capability of the service platform, and its robustness is critical to the success of service invocations. In recent years, as the service architecture has rapidly scaled out, the performance capacity of the registry has faced ever higher challenges.

Based on real operating scenarios in the financial industry, ICBC has deeply customized and optimized the registry architecture and continuously pushed the limits of its performance capacity and high availability.

PART. 1

Problems and Challenges

ICBC's distributed service platform uses ZooKeeper as its distributed service registry.

ZooKeeper is widely used in distributed systems across the industry to provide efficient and stable distributed coordination services. It supports clustered deployment, scalable performance, and high availability, and has the following features:

1. The cluster stays available as long as a majority survives, giving strong disaster recovery

As long as more than half of the voting (election) nodes work properly, the cluster can serve requests normally. If one server in the cluster breaks down, clients automatically switch to another available server.
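
For reference only (hostnames and paths here are hypothetical, not ICBC's actual topology), a minimal ZooKeeper ensemble configuration with three voting nodes looks like this; the quorum is 2, so the cluster keeps serving with any single server down:

```properties
# zoo.cfg (illustrative): a 3-node voting ensemble; quorum = 2,
# so the cluster survives the loss of any one server
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
```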

2. A session mechanism keeps client registration information stable

The client maintains a heartbeat session with the ZooKeeper server; by default, the session times out after 40 seconds. When the cluster is serving normally, the ZooKeeper server removes a client's registration information only after its session times out. If the entire cluster becomes unavailable, sessions do not expire; after the registry cluster recovers, the sessions and registration data from before the fault are restored automatically from snapshots without external intervention.
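
A minimal client-side sketch of this behavior (the paths, addresses, and service name are hypothetical; this is not ICBC's platform code): the provider registers itself as an ephemeral node tied to its session, so the registration disappears only when the session actually expires. The 40-second figure matches ZooKeeper's default maximum session timeout of 20 × tickTime with tickTime = 2000 ms.

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class ProviderRegistration {
    public static void main(String[] args) throws Exception {
        // Request a 40 s session; the server clamps the value between its
        // minSessionTimeout and maxSessionTimeout. Listing several servers in
        // the connect string lets the client fail over automatically.
        ZooKeeper zk = new ZooKeeper(
                "zk1:2181,zk2:2181,zk3:2181", 40_000,
                event -> System.out.println("session event: " + event));

        // The parent path is assumed to already exist as persistent nodes.
        // An EPHEMERAL node is removed by the server only when the session
        // expires, which is what keeps registrations stable across brief faults.
        zk.create("/services/demo/providers/10.0.0.1:20880",
                new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
    }
}
```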

3. An Observer mechanism provides read/write separation and traffic isolation

Observer nodes do not participate in cluster elections; they only handle client connections and requests, improving overall throughput and sharing the load of the voting nodes, which improves cluster stability. By grouping Observers, client traffic can also be isolated.
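
Continuing the illustrative zoo.cfg above, Observers are added as extra server lines; they receive all committed writes from the leader but never vote, so they raise read and connection capacity without enlarging the voting quorum:

```properties
# Appended to the illustrative zoo.cfg: two Observer nodes
server.4=zk4:2888:3888:observer
server.5=zk5:2888:3888:observer
```

On zk4 and zk5 themselves, the line `peerType=observer` must also be set so each node starts as an Observer.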

In its actual financial production scenarios, ICBC built a ZooKeeper registry cluster with voting nodes at the core and scaled it out with Observer nodes; this setup has run stably at ICBC for many years.

In theory, service invocation should not be affected when the ZooKeeper registry cluster fails.

However, in chaos-engineering fault drills, we found that in occasional scenarios, ZooKeeper system resource failures or other problems could still affect the distributed service platform.

Specific problems are listed as follows:

– Problem 1:

After a single service was scaled out, the number of provider nodes grew too large

The 4 MB upper limit on ZooKeeper push packets was reached. As a result, the ZooKeeper servers came under heavy performance pressure and could not process other service registration requests in time, affecting transactions (see the configuration note after this list).

– Problem 2:

A memory fault occurred on the ZooKeeper cluster's leader server

After the leader process died, the cluster elected a new leader, but some servers in the cluster did not perceive the new leader in time. As a result, some service providers were taken offline, affecting transactions.

– Problem 3:

ZooKeeper registration data grew significantly

After the cluster failed and re-elected, the volume of data the new leader had to synchronize to the other nodes was so large that the servers' network became a bottleneck; some servers failed to rejoin the new cluster in time, affecting transactions.
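
Regarding Problem 1: the per-packet ceiling in ZooKeeper is governed by the `jute.maxbuffer` system property (the stock default is about 1 MB, so the 4 MB figure above suggests a customized setting or build). A hedged sketch of raising it follows; the same value must be set on every server and every client, otherwise connections fail with "Packet len ... is out of range" errors:

```bash
# Illustrative only: raise the per-packet ceiling to 8 MB (8388608 bytes).
# Typically placed in conf/java.env, which zkEnv.sh sources on startup.
export SERVER_JVMFLAGS="-Djute.maxbuffer=8388608"   # ZooKeeper servers
export CLIENT_JVMFLAGS="-Djute.maxbuffer=8388608"   # ZooKeeper clients
```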

To address these problems, ICBC on the one hand deeply customized the ZooKeeper source code, and on the other hand began studying mechanisms to improve the high availability of service registration and service discovery.

PART. 2

Building a multi-registration and multi-subscription mechanism

to improve service invocation stability

To prevent service invocation from being affected by the failure of a single registry cluster, ICBC built a high-availability, multi-point-access registration model (as shown in Figure 1).

 

Figure 1. High availability registration model for distributed services

Note: In the figure above, DC1 and DC2 are two data centers, each with an independent registry cluster. Providers and consumers deployed in DC1 and DC2 register with, and subscribe to, both registry clusters. Consumers in DC1 preferentially use the provider list pushed by the DC1 registry, and consumers in DC2 preferentially use the provider list pushed by the DC2 registry.

Two ZooKeeper clusters are deployed in the same city. Providers register their services with both registries, and consumers subscribe from both. A consumer preferentially uses the data pushed by the registry in its own data center; when that registry cluster fails, the registry in the other data center takes over, so services continue to be invoked normally during registry cluster failover.
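
The article does not name the underlying RPC framework, so purely as an illustration, here is how a Dubbo-style framework would express dual registration and dual subscription (registry addresses and the service interface are hypothetical):

```xml
<!-- Two independent registries, one per data center (addresses hypothetical) -->
<dubbo:registry id="dc1" address="zookeeper://dc1-zk1:2181,dc1-zk2:2181,dc1-zk3:2181"/>
<dubbo:registry id="dc2" address="zookeeper://dc2-zk1:2181,dc2-zk2:2181,dc2-zk3:2181"/>

<!-- The provider registers with both registries -->
<dubbo:service interface="com.example.DemoService" ref="demoService" registry="dc1,dc2"/>

<!-- The consumer subscribes from both registries -->
<dubbo:reference id="demoService" interface="com.example.DemoService" registry="dc1,dc2"/>
```

The same-room preference described above would then be applied on top of the two subscribed views, for example via routing rules.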

Since the dual-registration and dual-subscription mechanism was established, the distributed service platform has successfully ridden out multiple underlying system resource failures and network isolation events, and the stability of distributed service registration and service discovery has improved markedly.

However, because both clusters in this active-active dual-ZooKeeper deployment share the same technology stack, a small systemic risk to transactions remains: if the two ZooKeeper registries suffer the same type of failure at the same time (for example, both hit a performance bottleneck simultaneously), business transactions could still be affected.

Therefore, to eliminate even this small systemic risk, ICBC went further and built a heterogeneous registry, pursuing the highest possible availability for service registration and discovery.

PART. 3

Building heterogeneous registries to avoid the risk of a single technology stack

Heterogeneous deployment of registries further improves high availability

ICBC built a heterogeneous registry: an independent registry cluster on a different technology stack was deployed alongside the ZooKeeper cluster. Business systems register with, and subscribe to services from, both the ZooKeeper cluster and the heterogeneous registry cluster (as shown in Figure 2).

Figure 2. Heterogeneous registration model

The heterogeneous registry works alongside ZooKeeper to improve the registry's overall service capability. If either the ZooKeeper cluster or the heterogeneous registry cluster suffers a systemic failure, the other registry cluster can take over (as shown in Figure 3).

Figure 3. Heterogeneous registration model under a registry cluster failure

Note: In the preceding figure, the ZooKeeper cluster is faulty, the heterogeneous registry is working properly, and services can be registered, subscribed, and invoked normally.

Proposing a merged-subscription mechanism

for smoother migration during registry failures

In ICBC's original multi-registry subscription mechanism, a consumer preferentially used the provider list pushed by one registry and switched to the data pushed by the next registry only when the preferred one had no available providers. Under this scheme, if the preferred registry pushed an incomplete provider list because of a fault, consumers would concentrate their calls on those providers, which could cause problems such as uneven provider load and concurrent rate limiting.

To ensure smoother migration during registry failures once the heterogeneous registry was built, ICBC studied and proposed a consumer-side merged-subscription mechanism: the subscription views from ZooKeeper and the heterogeneous registry are merged on the consumer side, and the invoker cache is refreshed only after the subscription data from all registries has been merged.

Even if one registry pushes an empty or incomplete provider list because of a fault, the merged data remains valid, which improves the efficiency with which consumers select providers and raises the transaction success rate.

Figure 4. Multi-registry merged-subscription mechanism
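
A minimal sketch of the merged-subscription idea (the types and names are hypothetical, not ICBC's platform code): every registry's latest push is kept separately, and the invoker cache is rebuilt from the union of all views, so a faulty empty push from one registry cannot wipe out providers that another registry still sees.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

public class MergedSubscription {
    // Latest provider list pushed by each registry, keyed by registry id
    private final Map<String, List<String>> viewByRegistry = new ConcurrentHashMap<>();
    private volatile List<String> invokerCache = List.of();

    // Called whenever any registry pushes a new provider list
    public void onPush(String registryId, List<String> providers) {
        viewByRegistry.put(registryId, providers);
        refresh();
    }

    // Merge and deduplicate the views from all registries, then swap the cache.
    // An empty or partial push from one registry cannot empty the merged result
    // as long as another registry still sees the providers.
    private void refresh() {
        Set<String> merged = new TreeSet<>();
        for (List<String> view : viewByRegistry.values()) {
            merged.addAll(view);
        }
        invokerCache = List.copyOf(merged);
    }

    public List<String> providers() {
        return invokerCache;
    }
}
```

Deduplicating in the merged view is also consistent with the memory saving mentioned below: each provider is cached once rather than once per registry.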

After the heterogeneous registry deployment and the consumer-side merged-subscription mechanism were in place, chaos drills simulated failure scenarios such as registry CPU saturation, high memory load, disk failure, network jitter, and network isolation; provider load stayed balanced and no transactions failed, as expected. The merged-subscription mechanism also reduces the memory footprint of the consumer's cached provider lists in multi-registry mode.

Building on these technical reserves and infrastructure, ICBC is carrying out the heterogeneous registry construction in 2020 to avoid the single-technology-stack risk within the registry and further improve the high availability of this core component of the whole distributed system.

PART. 4

Future

To meet the finance-grade high-availability requirements of its distributed service platform, ICBC explored and proposed a heterogeneous-registration high-availability scheme, which eliminates the systemic risk of a single registry in large-scale service registration scenarios and ensures the stable operation of the financial system.

Going forward, on the basis of heterogeneous registration, ICBC will further advance the transformation to unitized deployment and operations, breaking the status quo in which the registry is the single global traffic hub. Transaction flows will be closed within a unit to the greatest extent possible, supplemented by automatic unit failover for fault isolation, to further contain the blast radius of regional failure scenarios. This will comprehensively improve the traffic control and high-availability takeover capabilities of the distributed service platform and provide best practices and a reference for large-scale adoption of microservice architectures across the industry.