Preface

But do you really know what ZooKeeper is? If an interviewer asked you to explain what ZooKeeper is, how far could you go?

I used ZooKeeper as the registry for Dubbo, and as the management tool for a Solr cluster when I set one up. A few days ago, while summarizing my project experience, I suddenly asked myself: what exactly is ZooKeeper? The only simple points that came to mind were: ① ZooKeeper can be used as a registry. ② ZooKeeper can be used for distributed locks. ③ When building a ZooKeeper cluster, use an odd number of servers. Clearly, my understanding of ZooKeeper was only scratching the surface.

So I started by looking at the official website and the wiki. Here is a quote from the website:

ZooKeeper: A Distributed Coordination Service for Distributed Applications

I actually like the introduction to ZooKeeper on the wiki more than the one on the website (even though my English is poor):

Let me summarize briefly:

ZooKeeper mainly serves distributed systems. It can be used for unified configuration management, unified naming services, distributed locks, and cluster management.

When you use a distributed system, you cannot avoid problems of node management (being aware of node status in real time, managing nodes in a unified way, and so on). Because these problems are relatively troublesome to handle and increase the complexity of the system, ZooKeeper came into being as a general solution to them.

So, in this article, I hope to give you a bit more detail about ZooKeeper. If you haven't studied ZooKeeper before, this article will be your first step through the ZooKeeper door. If you've already worked with ZooKeeper, this article will walk you through some of its basic concepts again.

ZooKeeper is an open-source distributed coordination service. The ZooKeeper framework was originally developed at Yahoo! to let its applications access coordination in a simple and robust way. Later, Apache ZooKeeper became the standard coordination service used by Hadoop, HBase, and other distributed frameworks; for example, Apache HBase uses ZooKeeper to track the status of distributed data. ZooKeeper is designed to encapsulate complex and error-prone distributed consistency services into an efficient and reliable set of primitives, exposed to users through a series of easy-to-use interfaces.

Primitive: a term from operating systems and computer networking. A primitive is a process made up of several instructions that performs a specific function, and it is indivisible: its execution must run to completion without being interrupted.

ZooKeeper is a typical distributed data consistency solution. Distributed applications can implement functions such as data publishing/subscribing, load balancing, naming services, distributed coordination/notification, cluster management, Master election, distributed locking, and distributed queuing based on ZooKeeper.

One of the most common usage scenarios for ZooKeeper is as a registry between service producers and service consumers. Service producers register their services with ZooKeeper, and service consumers first look up a service in ZooKeeper to obtain the details of its producer before invoking it. ZooKeeper acts as the registry in the Dubbo architecture, as shown in the figure below. In my own projects, I mainly used ZooKeeper as the registry for Dubbo (Dubbo officially recommends the ZooKeeper registry). In addition, when setting up a Solr cluster, I used ZooKeeper as the management tool of the Solr cluster, where it provides the following functions: 1. Cluster management: fault tolerance and load balancing. 2. Centralized management of configuration files. 3.
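To make the registry idea concrete, here is a minimal sketch built directly on the ZooKeeper Java API rather than on Dubbo itself. The connect string, paths, and provider address are made-up assumptions, and the persistent parent path /services/order-service is assumed to already exist.

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;
    import java.util.List;

    public class RegistrySketch {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30_000, event -> { });

            // Producer side: register this instance as an ephemeral node, so the
            // entry disappears automatically if the producer crashes or disconnects.
            zk.create("/services/order-service/192.168.0.10:20880", new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

            // Consumer side: list all live providers of the service and watch the
            // child list so we are notified when providers come and go.
            List<String> providers = zk.getChildren("/services/order-service",
                    event -> System.out.println("provider list changed: " + event.getPath()));
            System.out.println("available providers: " + providers);
        }
    }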

I personally feel that when using ZooKeeper, it is best to run it as a cluster rather than as a standalone instance. The architecture diagram on the official website describes a ZooKeeper cluster. Usually, three servers make up a ZooKeeper cluster.

Why is it best to use an odd number of servers to form a ZooKeeper cluster?

We know that Leader election in ZooKeeper uses the ZAB protocol. The core idea of ZAB is that a write is considered successful once a majority of the servers have written it successfully.

① If there are three servers, at most one server may fail.

② If there are four servers, still at most one server may fail (a majority of four is three).

Since three servers and four servers both tolerate at most one failed server, their fault tolerance is the same, so we choose an odd number of ZooKeeper servers. Here we choose three.
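A quick back-of-the-envelope check of this majority rule (just illustrative arithmetic, not ZooKeeper code):

    public class QuorumMath {
        public static void main(String[] args) {
            for (int servers : new int[] {3, 4, 5}) {
                int quorum = servers / 2 + 1;      // majority needed for writes and elections
                int tolerated = servers - quorum;  // how many servers may fail
                System.out.printf("%d servers -> quorum of %d, tolerates %d failure(s)%n",
                        servers, quorum, tolerated);
            }
        }
    }

Three and four servers both tolerate only one failure, while five tolerates two, which is why clusters are grown in odd steps.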

First of all, ZooKeeper has these concepts:

Session, Znode, Version, Watcher, and ACL

Then let’s talk about what it can do:

Unified configuration management, unified naming services, distributed locks, cluster management, and quite a lot more.

So how can ZooKeeper do so much? Let's start with its data structure.

ZooKeeper stores its data in a hierarchical namespace, much like a file system or a tree data structure.

ZooKeeper's data structure is much like a Unix file system: it is a tree, and each node is called a ZNode. Each node can be identified by its path. The structure diagram is as follows. ZNodes fall into two types:

Ephemeral: when the client disconnects from the server, the ZNode it created is deleted automatically.

Persistent: after the client disconnects from the server, the ZNode it created is not deleted.
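As a small sketch of the two types with the raw ZooKeeper Java API (the connect string and paths are placeholder assumptions):

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ZnodeTypes {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30_000, event -> { });

            // Persistent znode: survives after this client's session ends.
            zk.create("/app", "app-config".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

            // Ephemeral znode: deleted automatically when this session ends.
            zk.create("/app/worker-1", "alive".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

            // Closing the session removes /app/worker-1 but keeps /app.
            zk.close();
        }
    }

ZooKeeper also offers sequential variants of both (PERSISTENT_SEQUENTIAL and EPHEMERAL_SEQUENTIAL), and the distributed-lock example later relies on the ephemeral sequential kind.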

Let’s talk about listeners:

The Watcher is an important feature of ZooKeeper. ZooKeeper allows users to register Watchers on specified nodes, and when certain events are triggered, the ZooKeeper server notifies the interested clients. This mechanism is key to how ZooKeeper implements distributed coordination.

Common listening scenarios are as follows:

Listen for data changes on the Znode

Listen for changes in child nodes
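Here is a minimal sketch of both scenarios with the raw ZooKeeper Java API; the connect string and paths are assumptions, and the watched znodes are assumed to exist. Note that plain watches fire only once and must be re-registered after each event.

    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class WatchSketch {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30_000,
                    event -> System.out.println("session event: " + event.getState()));

            // Scenario 1: watch the data of a znode (fires once on NodeDataChanged).
            zk.getData("/app/config", event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDataChanged) {
                    System.out.println("data changed on " + event.getPath());
                }
            }, null);

            // Scenario 2: watch the child list of a znode (fires once on NodeChildrenChanged).
            zk.getChildren("/app/servers", event -> {
                if (event.getType() == Watcher.Event.EventType.NodeChildrenChanged) {
                    System.out.println("children changed under " + event.getPath());
                }
            });

            Thread.sleep(60_000); // keep the session alive long enough to see events
        }
    }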

Let’s see how distributed locking is implemented:

I won't go into the concept of a lock here; if you're not very familiar with locks, you can refer to the following article:

Java lock? Distributed locks? Optimistic locking? Row locks?

We can use ZooKeeper to implement a distributed lock. How does that work? Let's take a look:

Systems A, B, and C all access the /locks node. When they access it, each of them creates an EPHEMERAL_SEQUENTIAL node; for example, system A creates the id_000000 node, system B creates the id_000002 node, and system C creates the id_000001 node. Next, each system fetches all the child nodes under /locks (id_000000, id_000001, id_000002) and checks whether its own node is the smallest:

If so, it acquires the lock.

Releasing the lock: delete the created node once the operation is complete.

If not, it watches the node whose sequence number is one smaller than its own.

Here’s an example:

System A fetches all the child nodes under /locks, compares them, and finds that its own node, id_000000, is the smallest of all the children, so it acquires the lock.

System B fetches all the child nodes under /locks and finds that its node (id_000002) is not the smallest, so it watches the status of node id_000001, which is one smaller than its own.

System C fetches all the child nodes under /locks and finds that its node (id_000001) is not the smallest, so it watches the status of node id_000000, which is one smaller than its own.

After system A completes its operation, it deletes the node it created (id_000000). Through its watch, system C discovers that id_000000 has been deleted and that its own node is now the smallest, so it acquires the lock successfully.

… and system B proceeds in the same way.
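Below is a minimal sketch of this recipe using the raw ZooKeeper API. It follows the steps above (create an EPHEMERAL_SEQUENTIAL node, check whether it is the smallest, otherwise watch the next-smaller node); the node names, paths, and lack of error handling are simplifying assumptions, and a production system would rather use a tested recipe such as Apache Curator's InterProcessMutex.

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.CountDownLatch;

    public class SimpleZkLock {
        private final ZooKeeper zk;
        private final String lockRoot = "/locks"; // assumed to exist as a persistent node
        private String myNode;                    // e.g. /locks/id_0000000003

        public SimpleZkLock(ZooKeeper zk) {
            this.zk = zk;
        }

        public void lock() throws Exception {
            // Step 1: create an EPHEMERAL_SEQUENTIAL child under /locks.
            myNode = zk.create(lockRoot + "/id_", new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
            while (true) {
                // Step 2: list all children and sort them by sequence number.
                List<String> children = zk.getChildren(lockRoot, false);
                Collections.sort(children);
                String myName = myNode.substring(lockRoot.length() + 1);
                int myIndex = children.indexOf(myName);
                if (myIndex == 0) {
                    return; // our node is the smallest: we hold the lock
                }
                // Step 3: watch the node just below ours and wait until it is gone.
                String previous = lockRoot + "/" + children.get(myIndex - 1);
                CountDownLatch gone = new CountDownLatch(1);
                Stat stat = zk.exists(previous, event -> {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        gone.countDown();
                    }
                });
                if (stat != null) {
                    gone.await(); // wakes up when the previous node is deleted
                }
                // Loop back and re-check: we may now be the smallest node.
            }
        }

        public void unlock() throws Exception {
            // Releasing the lock is simply deleting our own node.
            zk.delete(myNode, -1);
        }
    }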

ZooKeeper ZAB protocol: ZooKeeper Atomic Broadcast (ZAB) is a crash-recovery atomic broadcast protocol designed specifically for ZooKeeper. It ensures data consistency and global ordering of commands across the ZooKeeper cluster.

Before introducing the ZAB protocol, you need to know some concepts related to ZooKeeper to better understand the ZAB protocol.

Cluster roles:

Leader: a cluster has only one Leader at a time. The Leader serves both read and write requests from clients and is responsible for synchronizing data to the other nodes.

Follower: serves read requests from clients and forwards write requests to the Leader for processing. If the Leader crashes or becomes disconnected, Followers take part in the Leader election.

Observer: unlike a Follower, an Observer does not take part in the Leader election.

Server states:

LOOKING: when a node believes there is no Leader in the cluster, it enters the LOOKING state to find or elect one.

FOLLOWING: the Follower role.

LEADING: the Leader role.

OBSERVING: the Observer role.

ZooKeeper uses its own state to determine its role and what tasks to perform.

ZAB states: ZooKeeper also moves through the four states defined by ZAB, which reflect the four steps from election to serving clients:

public enum ZabState { ELECTION, DISCOVERY, SYNCHRONIZATION, BROADCAST }

ELECTION: the cluster enters the election state, during which one node is chosen as the Leader.

DISCOVERY: nodes connect to the Leader, respond to its heartbeat, and check whether the Leader's role has changed. Only after this step can the elected Leader really start working.

SYNCHRONIZATION: once the Leader is confirmed, the cluster synchronizes data from the Leader to the other nodes to ensure data consistency across the cluster.

BROADCAST: the cluster starts serving requests.

ZXID: the ZXID is an extremely important concept. It is a 64-bit (long) integer split into two parts, an epoch part (the high 32 bits) and a counter part (the low 32 bits), and it is globally ordered.

The epoch indicates which Leader's "reign" the cluster is currently in. Electing a Leader is like a change of dynasty: just as the previous dynasty's sword cannot be used against the current dynasty's officials, the epoch marks whether a command is still valid. The counter is simply an increasing number.
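To illustrate the split, here is a tiny sketch that pulls the epoch and counter out of a zxid with bit operations; the sample value is made up.

    public class ZxidParts {
        public static void main(String[] args) {
            long zxid = 0x0000000500000007L;   // hypothetical zxid
            long epoch = zxid >>> 32;          // high 32 bits: which Leader "dynasty" issued it
            long counter = zxid & 0xFFFFFFFFL; // low 32 bits: increasing counter within that epoch
            System.out.printf("epoch=%d, counter=%d%n", epoch, counter);
            // Prints: epoch=5, counter=7
        }
    }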

The ZAB protocol has two basic modes: crash recovery and message broadcast. ZAB enters recovery mode and elects a new Leader whenever the whole service framework starts up, or when the Leader server is interrupted, crashes, exits, or restarts. ZAB leaves recovery mode once a new Leader has been elected and more than half of the machines in the cluster have finished synchronizing state with it. This state synchronization is data synchronization, and it ensures that a majority of the machines in the cluster share the same data state as the Leader.

Once more than half of the Follower servers are in sync with the Leader, the whole framework can enter broadcast mode. When a ZAB-compliant server starts up and joins a cluster that already has a Leader broadcasting messages, the new server automatically enters data recovery mode: it finds the Leader, synchronizes data with it, and then joins the message broadcast process. As mentioned above, ZooKeeper is designed so that only the single Leader server handles transaction requests. After receiving a transaction request from a client, the Leader generates the corresponding transaction proposal and starts a round of the broadcast protocol. If another machine in the cluster receives a transaction request from a client, that non-Leader server first forwards the request to the Leader.

There is so much to understand about the ZAB protocol and the Paxos algorithm that, to be honest, I still don't know exactly how these two work or how they are implemented. (This migrant worker still has to keep hauling bricks.)

Finally, this article has mainly covered an introduction to ZooKeeper, some of its concepts and usage, and the ZAB protocol. ZooKeeper achieves all these useful functions through its ZNode node types plus its watch mechanism!

Of course, ZooKeeper is not as simple as what we covered here, and I will continue to share more when I get the chance to dig deeper. I hope this article was helpful.