Most engineers have heard of Zookeeper at some point. It is a top-level Apache project that provides consistent, high-performance coordination services for distributed applications, and it can be used for configuration management, naming services, distributed locks, and more. Many open source components, especially in the middleware space, use Zookeeper as a configuration center or registry: it is an important component of Hadoop and HBase, the management and coordination service for Kafka, the registry for service frameworks such as Dubbo, and so on.

Principles

Before introducing high-availability deployment, it helps to review some Zookeeper basics, which make the deployment choices easier to understand.

Architecture

A Zookeeper cluster has three roles: Leader, Follower, and Observer.

  • Leader: initiates and decides votes and updates system state. The Leader is elected by the voting nodes.
  • Follower: receives client requests and returns results to the client, and votes during Leader election.
  • Observer: accepts client connections and read/write requests, forwarding write requests to the Leader. Observers do not participate in voting; they only synchronize state from the Leader. Their purpose is to scale the cluster out and improve read throughput.

A Client is any Zookeeper client, i.e., the initiator of requests.
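To make these roles concrete from the client's point of view, here is a minimal sketch using the standard Java client (the ensemble addresses and the znode path are hypothetical, not from the original article). The write is forwarded to the Leader and committed once a majority of voting nodes accept it; the read is answered directly by whichever node the client happens to be connected to.

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkClientDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical ensemble addresses; a client may connect to any node
        // (Leader, Follower, or Observer).
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 15000, event -> {});

        // Writes are forwarded to the Leader and committed once more than
        // half of the voting nodes acknowledge the proposal.
        if (zk.exists("/config/app", false) == null) {
            zk.create("/config/app", "v1".getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // Reads are served locally by the node the client is connected to.
        byte[] data = zk.getData("/config/app", false, null);
        System.out.println(new String(data));
        zk.close();
    }
}
```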

High availability

In a Zookeeper system, as long as more than half of the voting nodes (that is, the non-Observer nodes) in the cluster are working properly, the whole cluster can keep providing service normally.

Based on this, if you want a cluster that can tolerate N machines going down, you need to deploy a Zookeeper cluster of 2N + 1 voting servers.

Therefore, if three (non-Observer) Zookeeper nodes are deployed, the cluster is available as long as at least two of them are healthy, which means one node failure does not affect the service provided by the cluster. If five nodes are deployed, the cluster can still provide service even when two of them fail at the same time.

The number of non-Observer nodes in a Zookeeper cluster is usually odd

An odd node count is conventional, not mandatory. For example, with four nodes deployed, 4/2 + 1 = 3 nodes must be healthy for the cluster to serve, so only one failure can be tolerated; three nodes give exactly the same tolerance. In other words, from a high-availability standpoint, deploying an even number of nodes wastes one machine.
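The arithmetic behind this is just the majority rule. The following small sketch (the helper names are mine, not from the article) tabulates the quorum size and tolerated failures for three-, four-, and five-node ensembles:

```java
// Majority-quorum arithmetic for the voting (non-Observer) nodes: a cluster
// of n voters needs n/2 + 1 (integer division) healthy nodes, so it can
// tolerate n - (n/2 + 1) failures.
public class QuorumMath {
    static int quorum(int n)    { return n / 2 + 1; }
    static int tolerated(int n) { return n - quorum(n); }

    public static void main(String[] args) {
        for (int n : new int[] {3, 4, 5}) {
            System.out.printf("n=%d  quorum=%d  tolerated failures=%d%n",
                              n, quorum(n), tolerated(n));
        }
        // Prints: n=3 and n=4 both tolerate only 1 failure, n=5 tolerates 2,
        // which is why the fourth machine buys no extra availability.
    }
}
```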

Deployment

Since the cluster can serve as long as more than half of its nodes work properly, it seems very easy to ensure high availability simply by deploying more nodes.

Does deploying more nodes mean higher availability?

More nodes do seem to increase availability, but there are two issues to consider:

  • Impact on performance and write availability. More nodes means more node failures can be tolerated, which sounds like higher availability. However, more nodes also means more of them (more than half) must accept each proposal sent by the Leader, which inevitably increases proposal commit time and therefore hurts the performance and availability of write requests. For a normal business system, the node count must be a careful trade-off.
  • Ensuring that the Zookeeper cluster can keep providing service when an entire equipment room or region fails.

Equipment-room disaster recovery

To keep the Zookeeper cluster highly available when an entire equipment room fails, deploy Zookeeper across equipment rooms.

Single equipment room

Let’s first look at a single-room deployment, say five nodes all in one equipment room. Room-level disaster recovery cannot be achieved this way: if the equipment room fails, the whole Zookeeper cluster stops working.

Even within a single room, place the nodes on different hosts and in different cabinets to avoid multiple nodes failing at the same time.

Two equipment rooms in the same city

Since a single room cannot provide room-level disaster recovery, what about two rooms?

Suppose three nodes are deployed in equipment room 1 and two nodes in equipment room 2, five nodes in total. Does this achieve room-level disaster recovery? That is, can the cluster keep providing service when either equipment room fails?

Actually, it cannot. If room 2 fails and room 1 is healthy, the three nodes in room 1 are more than half of the cluster, so it keeps working. However, if room 1 fails, only two nodes survive, which is less than half, and the whole cluster stops working.

Therefore, a Zookeeper cluster deployed across two equipment rooms cannot achieve room-level disaster recovery.

Three equipment rooms in the same city

Now consider a three-room deployment, which can achieve room-level disaster recovery. Again, take a five-node cluster as an example:

Deploy two nodes each in room 1 and room 2, and one node in room 3. When any single equipment room fails, the surviving nodes still form a majority, so disaster recovery is guaranteed.
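A minimal zoo.cfg sketch of this 2 + 2 + 1 layout might look as follows (the hostnames are hypothetical; the same server list is shared by all five nodes, and each node additionally needs a myid file containing its own id):

```properties
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181

# Equipment room 1
server.1=zk1-room1:2888:3888
server.2=zk2-room1:2888:3888
# Equipment room 2
server.3=zk3-room2:2888:3888
server.4=zk4-room2:2888:3888
# Equipment room 3
server.5=zk5-room3:2888:3888
```

Port 2888 carries follower-to-leader traffic and 3888 is used for leader election; losing any single room leaves at least three of the five voters alive, preserving the quorum.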

Cross-region disaster recovery

Room-level disaster recovery should be enough for general business, but many companies now adopt the "two sites, three centers" model, and Ant Financial even runs "three sites, five centers". How should a Zookeeper cluster be deployed in such a setup?

Two sites, three centers

"Two sites, three centers" is a construction plan with a production data center, a same-city disaster-recovery center, and a remote disaster-recovery center. The three data centers in the two cities are interconnected; if one data center fails or suffers a disaster, the other data centers can keep running and take over key services or even all services.

What are the considerations for deploying a Zookeeper cluster in a two-site, three-center mode?

The following deployment mode is generally adopted for two sites and three centers: "Region 1" contains two same-city data centers, "Center 1" and "Center 2", and "Region 2" contains the remote disaster-recovery center. Two questions may arise here:

  • Why are the voting nodes (the Leader and Followers) all deployed in Center 1 of Region 1, rather than spread across the three centers?
  • The reason is that the long physical distance between remote nodes, and thus the long network transmission delay, would lengthen the cluster's voting decision time and hurt write performance. For example, between Beijing and Shanghai the leased-line network latency is about 30 ms, and a write only succeeds after more than half of the voting nodes accept the proposal, so every write would take a long time. In addition, wide-area networks between remote sites are unreliable, making cluster re-elections more likely; an election can leave the whole cluster unavailable for a long time. Therefore, the voting members are generally deployed in one center across three equipment rooms, while the other centers run Observer nodes. This also means a Zookeeper cluster cannot achieve cross-region disaster recovery.
  • Why were Observer nodes introduced?
  • Observers extend the Zookeeper cluster: they serve client read and write requests (forwarding writes to the Leader) but do not vote, so they affect neither the cluster's voting time nor its elections, and they significantly boost read performance. A configuration sketch follows this list.
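As a sketch of such a layout, the zoo.cfg below assumes hypothetical hostnames, with the five voting nodes in Region 1 / Center 1 and one Observer in each of the other two centers:

```properties
# Voting nodes, all in Region 1 / Center 1 (spread across its equipment rooms).
server.1=zk1-center1:2888:3888
server.2=zk2-center1:2888:3888
server.3=zk3-center1:2888:3888
server.4=zk4-center1:2888:3888
server.5=zk5-center1:2888:3888
# Observers in the other centers: they synchronize from the Leader but never
# vote, so cross-region latency slows down neither commits nor elections.
server.6=obs1-center2:2888:3888:observer
server.7=obs2-center3:2888:3888:observer
```

On each Observer machine itself, the configuration additionally needs the line peerType=observer.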

Three-center optimization

To further protect the cluster, Observer nodes are deployed in all three centers and clients interact only with the Observers. This offloads work from the voting nodes, reduces disturbance to the Leader and Followers, and improves the stability and availability of the whole cluster.
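On the client side this is simply a matter of the connection string: list only the Observer endpoints. A minimal sketch with hypothetical hostnames:

```java
import org.apache.zookeeper.ZooKeeper;

public class ObserverOnlyClient {
    public static void main(String[] args) throws Exception {
        // Listing only Observer endpoints means every client session lands on
        // an Observer, shielding the voting nodes from direct client load.
        ZooKeeper zk = new ZooKeeper(
                "obs1-center1:2181,obs2-center2:2181,obs3-center3:2181",
                15000, event -> {});
        System.out.println("session state: " + zk.getState());
        zk.close();
    }
}
```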

Conclusion

Deploying Zookeeper for high availability requires many considerations. A Zookeeper cluster can achieve equipment-room disaster recovery, but not cross-region disaster recovery. In addition, to improve the scalability and stability of the cluster, you can introduce Observer nodes, which improve read performance and shield the Leader and Follower nodes.

Source: https://www.jianshu.com/p/9c9543dc21ea
