Preface

ZooKeeper Atomic Broadcast (Zab) is a crash-recovery atomic broadcast protocol designed for ZooKeeper. It ensures data consistency and a global order of commands across the ZooKeeper cluster.

Concepts

Before looking at the ZAB protocol itself, you need to be familiar with a few ZooKeeper concepts.

Cluster roles

1. Leader: a cluster can have only one Leader at a time. The Leader serves both read and write requests from clients and synchronizes data to every other node.

2. Follower: serves read requests from clients. Write requests are forwarded to the Leader for processing.

3. Observer: unlike a Follower, an Observer does not participate in the Leader election.

Server states

1. LOOKING: when a node believes the cluster has no Leader, it enters the LOOKING state to find or elect one.

2. FOLLOWING: the node is acting as a Follower.

3. LEADING: the node is acting as the Leader.

4. OBSERVING: the node is acting as an Observer.

Each ZooKeeper server uses its state to determine its role and the tasks it performs.

ZAB states

ZooKeeper also defines four states for ZAB, which correspond to the four steps a cluster goes through from leader election to serving external requests.

1. ELECTION: the cluster enters the ELECTION state, during which one node is elected as leader.

2. DISCOVERY: nodes connect to the elected leader, respond to its heartbeats, and check whether the leader role has changed. Once this step completes, the elected leader can take up its real duties.

3. SYNCHRONIZATION: once the leader is confirmed across the cluster, the leader's data is synchronized to all nodes so that the whole cluster is consistent.

4. BROADCAST: the cluster transitions to the BROADCAST state and serves external requests.
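The four steps above can be modeled as a simple linear progression. The following is only a minimal sketch: the enum `ZabPhase` and its methods `next()` and `onLeaderLost()` are hypothetical names made up for illustration, not ZooKeeper's own types. It also assumes that losing the leader sends a node back to ELECTION.

```java
/** Hypothetical model of the four ZAB states listed above (not a ZooKeeper class). */
enum ZabPhase {
    ELECTION, DISCOVERY, SYNCHRONIZATION, BROADCAST;

    /** Phase entered when the current step completes; BROADCAST is the steady state. */
    ZabPhase next() {
        return this == BROADCAST ? BROADCAST : values()[ordinal() + 1];
    }

    /** Assumption: losing the leader in any phase sends the node back to ELECTION. */
    ZabPhase onLeaderLost() {
        return ELECTION;
    }
}
```

Starting from `ZabPhase.ELECTION`, three calls to `next()` reach `BROADCAST`, where the node stays until the leader is lost.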

The zxid is a very important concept. It is a 64-bit integer (a long) divided into two parts, an epoch and a counter, and it is globally ordered.

The epoch identifies the leader's term that the cluster is currently in. Electing a new leader is like a change of dynasty: the sword of the previous dynasty cannot kill the officials of the current one. In other words, commands from a leader of an earlier epoch carry no authority in the current epoch.
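To make the two-part layout concrete, here is a minimal sketch of packing and unpacking a zxid, with the epoch in the high 32 bits and the counter in the low 32 bits. The class `ZxidSketch` and its method names are hypothetical, chosen for illustration only.

```java
/** Sketch of a 64-bit zxid: epoch in the high 32 bits, counter in the low 32 bits. */
final class ZxidSketch {
    private ZxidSketch() {}

    /** Packs an epoch and a counter into a single zxid. */
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    /** Extracts the epoch (high 32 bits). */
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    /** Extracts the counter (low 32 bits). */
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    public static void main(String[] args) {
        long lastOfOldEpoch = makeZxid(1, 500); // a transaction from epoch 1
        long firstOfNewEpoch = makeZxid(2, 0);  // counter restarts under a new epoch
        // Any zxid from a later epoch is larger than every zxid from an earlier one.
        System.out.println(firstOfNewEpoch > lastOfOldEpoch); // prints true
    }
}
```

Because the epoch occupies the high bits, comparing two zxids as plain numbers compares epochs first and counters second, which is what keeps zxids globally ordered across leader changes.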

Election

With the basic concepts covered, we can look at how the ZAB protocol supports leader election.

Leader election raises three questions. When does an election happen? What are the election rules? What does the election process look like?

I will answer these three questions one by one:

1. When to elect: Leader election happens at two points. The first is at service startup. If the cluster has no Leader yet, the starting node enters the election state. If a Leader already exists, the node is told who the Leader is and connects to it, and the cluster does not enter the election state.

The second is while the service is running. The Leader may crash, lose power, or suffer very high network latency, and can then no longer serve requests. When the other nodes detect through heartbeats that the Leader is unreachable, the cluster again enters the election state.

2. Election rules: once voting begins, how is a leader chosen? In other words, by what rules do the other nodes decide to elect a given node as leader?

3. Election process: the Zab protocol screens votes using a few comparison rules. If your vote is better than mine, I update my vote and vote for your candidate as leader.

First, compare epochs: the vote with the higher epoch wins. If the epochs are the same, compare ZXIDs and elect the node with the larger ZXID. Here the ZXID is the ID of the largest transaction the node has committed. The larger the ZXID, the more complete that node's data.

Finally, if both the epoch and the ZXID are the same, compare the serverId, which is set in the ZooKeeper cluster configuration. So when configuring a ZooKeeper cluster, we can give the machines with better performance larger serverIds, making them more likely to take the role of leader.
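The three comparison rules above can be sketched as follows. This is an illustrative sketch only: the class `VoteSketch` and the methods `losesTo` and `choose` are hypothetical names, and the code models just the comparison, not the message exchange between nodes.

```java
/** Hypothetical vote: the candidate a node currently proposes as leader. */
final class VoteSketch {
    final long epoch;    // epoch of the candidate
    final long zxid;     // candidate's largest transaction ID
    final long serverId; // candidate's configured server ID

    VoteSketch(long epoch, long zxid, long serverId) {
        this.epoch = epoch;
        this.zxid = zxid;
        this.serverId = serverId;
    }

    /** Returns true if other beats this vote: epoch first, then zxid, then serverId. */
    boolean losesTo(VoteSketch other) {
        if (other.epoch != epoch) {
            return other.epoch > epoch;
        }
        if (other.zxid != zxid) {
            return other.zxid > zxid;
        }
        return other.serverId > serverId;
    }

    /** A node keeps its own vote unless the received vote is better. */
    static VoteSketch choose(VoteSketch mine, VoteSketch received) {
        return mine.losesTo(received) ? received : mine;
    }

    public static void main(String[] args) {
        VoteSketch mine = new VoteSketch(2, 100, 1);
        VoteSketch received = new VoteSketch(2, 100, 3); // same epoch and zxid, larger serverId
        System.out.println(choose(mine, received).serverId); // prints 3
    }
}
```

In the usage above the epochs and ZXIDs tie, so the larger serverId decides, which is why assigning larger serverIds to stronger machines biases the election toward them.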

Broadcast

After the leader election, the cluster still goes through two more steps: connecting to the leader, and synchronization. This section does not analyze those two steps in detail. It focuses on how data consistency across nodes is maintained while the cluster is serving external requests.

Zab guarantees the following properties in the broadcast state:

Reliable delivery: if a message M is delivered by one server, it will eventually be delivered by all servers.

Total order: if message A is delivered before message B by one server, then every server that delivers both A and B delivers A before B.

Causal order: if message A causally precedes message B and both are delivered, then A must be ordered before B.

ZooKeeper stores data in a structure similar to a directory tree. Commands must therefore be applied in order.

For example, suppose command a creates /test and command b then creates /test/123. If the ordering is not guaranteed and b is applied first, b will fail because its parent node does not exist.
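The /test example can be reproduced with a tiny in-memory model of a path tree. This is a sketch under stated assumptions: `PathTreeSketch` and its `create` method are hypothetical and are not the ZooKeeper client API. The only rule modeled is that a node can be created only if its parent already exists.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory tree of paths; not the ZooKeeper client API. */
final class PathTreeSketch {
    private final Set<String> nodes = new HashSet<>();

    PathTreeSketch() {
        nodes.add("/"); // the root always exists
    }

    /** Creates a node; returns false if the path is invalid, exists, or has no parent. */
    boolean create(String path) {
        if (!path.startsWith("/") || path.length() < 2) {
            return false;
        }
        int slash = path.lastIndexOf('/');
        String parent = slash == 0 ? "/" : path.substring(0, slash);
        if (!nodes.contains(parent) || nodes.contains(path)) {
            return false;
        }
        nodes.add(path);
        return true;
    }

    public static void main(String[] args) {
        PathTreeSketch inOrder = new PathTreeSketch();
        System.out.println(inOrder.create("/test"));     // command a
        System.out.println(inOrder.create("/test/123")); // command b, parent exists

        PathTreeSketch reordered = new PathTreeSketch();
        System.out.println(reordered.create("/test/123")); // b applied first, parent missing
    }
}
```

Applied in order, both commands succeed. Applied out of order, b is rejected, which is the failure that the ordering guarantees are there to prevent.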