Bohemian, love of life. Java Cultivator (wechat official ID: Java Cultivator), welcome to follow.

Zookeeper is a distributed service framework used to solve data management problems in distributed applications, such as unified naming service, status synchronization service, cluster management, and configuration item management of distributed applications.

We can simply think of Zookeeper as the big steward of a distributed family. Then how does the steward team select the Leader? Curious? Let’s take a closer look.

The basic principles of human voting

Before explaining the Zookeeper election process, let’s take a look at human elections.

All of us have lived through elections at some point or another, and there are several things that can happen in the process of voting:

Case 1: You are familiar with several candidates, and you vote for the one you think is the most capable.

Case 2: You are a candidate yourself and you don't know any of the other candidates well, so you go out to canvass for votes because you think you are the strongest and everyone should vote for you. Unfortunately, in the process of canvassing, you find out that others are better than you. You start to feel inferior and end up voting for the person you think is the strongest.

After everyone has voted, the ballot box is finally counted and the person with the most votes is elected.

There are four core concepts that we can extract from the voting process:

  • Candidate competence: The basic principle of voting is to vote for the strongest person.

  • Switching to a stronger candidate: if you later find someone stronger, you can change your vote.

  • Ballot box: Everyone’s vote will be placed in the ballot box.

  • Leader: The person who gets the most votes is the leader.

From the principle of human election, we will simply deduce the principle of Zookeeper election.

Basic principles of Zookeeper elections

Note: An election is not required if Zookeeper is deployed on a single machine. An election is required only in cluster mode.

The principle of Zookeeper election is similar to the logic of human election. The four basic concepts of human election are used to explain Zookeeper in detail.

  • Personal ability

How do we measure the capability of a Zookeeper node? By how new its data is: the newer a node's data, the stronger the node. Strange, isn't it?

In Zookeeper, the transaction ID (hereafter referred to as ZXID) is used to mark how new the data is (its version). The larger the latest ZXID of a node, the newer its data, and the stronger the node.

The full name of the ZXID is ZooKeeper Transaction Id.

  • Switching to a stronger candidate

At the start of the cluster election, each node first assumes it is the strongest (that is, its data is the most up to date) and writes its own name (its ZXID and SID) on the ballot. The ZXID is the transaction ID, and the SID uniquely identifies the node.

It then sends its vote to the other nodes and receives their votes. Each time a node receives a vote, it checks whether the candidate on that vote is stronger than its own (that is, has a larger ZXID). If so, the node changes its vote. Obviously, if others are stronger than me, I can't be shameless, right?
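The comparison rule can be sketched in a few lines. This is a minimal illustration, not Zookeeper's actual code: the `Vote` type and the `pick_stronger` function are hypothetical names, and the sketch assumes that ties on ZXID are broken by the larger SID, which is how the startup walkthrough later in this article behaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """A ballot naming the proposed leader (hypothetical type for illustration)."""
    zxid: int  # latest transaction ID of the proposed leader
    sid: int   # server ID of the proposed leader


def pick_stronger(mine: Vote, received: Vote) -> Vote:
    """Return the vote a node should hold after seeing `received`."""
    # Tuple comparison: a larger ZXID wins; on equal ZXIDs the larger SID wins.
    if (received.zxid, received.sid) > (mine.zxid, mine.sid):
        return received
    return mine


# A node holding a vote with ZXID 99 sees a vote with ZXID 102 and switches.
print(pick_stronger(Vote(zxid=99, sid=1), Vote(zxid=102, sid=2)))
```

Comparing `(zxid, sid)` tuples keeps the rule in one place: data freshness is checked first, and the server ID only matters when the data is equally new.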

  • Ballot boxes

Slightly different from the human election ballot box, the Zookeeper cluster maintains a ballot box in memory per node. The node will place its vote in the ballot box along with the votes of other nodes. Since the ballots are passed around, the ballots in the ballot boxes at each node will eventually be the same.

  • The leader

During voting, each node counts whether more than half of the votes name the same node, that is, whether a majority agrees that one node is the strongest. Once more than half of the nodes in the cluster agree that a node is the strongest, that node is the leader, and voting ends.
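The "more than half" check on a ballot box might look like the sketch below. The function name `find_leader` and the dictionary layout (voter SID mapped to the SID it voted for) are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Optional


def find_leader(ballot_box: Dict[int, int], cluster_size: int) -> Optional[int]:
    """ballot_box maps each voter's SID to the SID it voted for.

    Returns the elected SID, or None if nobody has a majority yet.
    """
    if not ballot_box:
        return None
    candidate, votes = Counter(ballot_box.values()).most_common(1)[0]
    # "More than half" is measured against the whole cluster,
    # not just against the servers that have voted so far.
    return candidate if votes > cluster_size // 2 else None


print(find_leader({1: 2, 2: 2}, cluster_size=5))        # two votes out of five
print(find_leader({1: 3, 2: 3, 3: 3}, cluster_size=5))  # three votes out of five
```

Measuring against the full cluster size is what stops two running servers in a five-node cluster from electing a leader on their own.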

In what scenarios does Zookeeper need an election?

A server in the Zookeeper cluster enters the Leader election when either of the following occurs.

(1) The server is initialized and started.

(2) The Leader fails while the server is running.

Leader election at startup time

Assume that there are five servers in a Zookeeper cluster with ids from 1 to 5. They are newly started and have no historical data.

Assuming the servers start in sequence, let’s analyze the election process:

(1) Server 1 starts

Server 1 initiates an election and votes for itself. At this point server 1 has one vote, which is less than the majority (3 votes), so the election cannot be completed.

Voting result: 1 vote for server 1.

The state of server 1 remains LOOKING.

(2) Server 2 starts

An election is initiated. Server 1 and server 2 each vote for themselves and exchange votes. Server 1 finds that server 2's ID is larger than its own and changes its vote to server 2.

Voting result: 0 votes for server 1, 2 votes for server 2.

Servers 1 and 2 remain in the LOOKING state.

(3) Server 3 starts

An election is initiated. Server 1, server 2, and server 3 first vote for themselves, and then all change their votes to server 3 because server 3 has the largest ID.

Voting result: 0 votes for server 1, 0 votes for server 2, 3 votes for server 3. At this point, server 3 has more than half of the votes (3 votes), and server 3 is elected Leader.

Servers 1 and 2 change their state to FOLLOWING, and server 3 changes its state to LEADING.

(4) Server 4 starts

Server 4 initiates an election. Servers 1, 2, and 3 are no longer in the LOOKING state and will not change their votes. Voting result: 3 votes for server 3, 1 vote for server 4. At this point, server 4 follows the majority and changes its vote to server 3.

Server 4 changes its state to FOLLOWING.

(5) Server 5 starts

As with server 4, server 5 ends up voting for server 3. Voting result: 5 votes for server 3, 0 votes for server 5.

Server 5 changes its state to FOLLOWING.

The end result:

Server 3 is Leader and its state is LEADING. The other servers are followers and in the FOLLOWING state.
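The walkthrough above can be replayed with a small simulation. This is a deliberately simplified model, not how Zookeeper is implemented: the function `simulate_startup` is a hypothetical name, all servers are assumed to have no history (so only the SID matters), and vote exchange is collapsed into "everyone running adopts the largest SID seen so far".

```python
from typing import Dict, List, Optional, Tuple


def simulate_startup(sids: List[int], cluster_size: int) -> Tuple[Optional[int], Dict[int, str]]:
    """Start servers one by one; return (leader SID, final state per server)."""
    states: Dict[int, str] = {}
    leader: Optional[int] = None
    for sid in sids:
        if leader is not None:
            # A leader already exists, so the newcomer follows the majority.
            states[sid] = "FOLLOWING"
            continue
        states[sid] = "LOOKING"
        # With no history, every running server ends up voting
        # for the largest SID among the servers started so far.
        strongest = max(states)
        votes_for_strongest = len(states)
        if votes_for_strongest > cluster_size // 2:
            leader = strongest
            for s in states:
                states[s] = "LEADING" if s == leader else "FOLLOWING"
    return leader, states


leader, states = simulate_startup([1, 2, 3, 4, 5], cluster_size=5)
print(leader)  # prints 3
print(states)
```

Servers 4 and 5 have larger SIDs than server 3, but they start after a majority has already formed, which is why they become followers.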

Run-time Leader election

While Zookeeper is running, the Leader and the non-leader servers each perform their own duties. If a non-leader server breaks down or a new server joins the cluster, the Leader is not affected. However, once the Leader server dies, the whole Zookeeper cluster suspends external services and triggers a new round of election.

In the initial state, server 3 was elected Leader. Suppose server 3 now fails. At this point the ZXID of each server may be different: say server1 is at 99, server2 at 102, server4 at 100, and server5 at 101.

The run-time election is basically similar to the initial voting process, which can be roughly divided into the following steps:

(1) State change. After the Leader fails, the remaining non-Observer servers change their server state to LOOKING and start the Leader election process.

(2) Each server issues a vote.

(3) Each server receives the votes from the other servers. If another server's data is newer than its own, it changes its vote.

(4) Votes are processed and counted. The votes are counted after each round of voting, and a server with more than half of the votes is elected.

(5) The servers change their state and the election result is declared.

Without further ado, here is how it plays out:

(1) For the first time, each machine will vote for itself.

(2) Then each machine sends its vote to the other machines. If a machine finds that another machine's ZXID is larger than its own, it changes its vote and votes again. For example, server1 receives three votes and finds that server2's ZXID is 102. After comparing, it sees that it has lost and decides to vote for server2 as the leader.
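Using the ZXIDs from this example (99, 102, 100, 101), the two steps can be sketched as follows. The function `reelect` is a hypothetical name, and the sketch assumes every surviving server sees every other server's vote, so the exchange settles in a single pass.

```python
from collections import Counter
from typing import Dict, Tuple


def reelect(zxids: Dict[int, int], cluster_size: int) -> int:
    """zxids maps the SID of each surviving voter to its latest ZXID."""
    # Step 1: every server votes for itself, as a (zxid, sid) pair.
    votes: Dict[int, Tuple[int, int]] = {sid: (zxid, sid) for sid, zxid in zxids.items()}
    # Step 2: after exchanging votes, each server adopts the strongest vote seen.
    strongest = max(votes.values())
    votes = {sid: strongest for sid in votes}
    # Count: the winner needs more than half of the whole cluster.
    (_, winner), count = Counter(votes.values()).most_common(1)[0]
    if count <= cluster_size // 2:
        raise RuntimeError("no majority; the election cannot complete")
    return winner


# Server 3 (the old Leader) is down; the other four servers hold an election.
print(reelect({1: 99, 2: 102, 4: 100, 5: 101}, cluster_size=5))  # prints 2
```

Server 2 wins because it holds the largest ZXID, even though servers 4 and 5 have larger SIDs.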

The core concepts involved in the electoral machinery

Pay attention: these concepts are must-knows for interviews.

(1) Server ID (or SID): the unique identifier of a server

Let's say there are three servers, numbered 1, 2, and 3. The larger the number, the greater the weight in the election algorithm. For example, the election at initial startup is decided by comparing server IDs.

(2) ZXID: the transaction ID

The transaction ID of the data stored in the server. The larger the value is, the newer the data is, and the greater the weight is in the election algorithm.

(3) Epoch: logical clock

Also called the voting round. The logical clock value is the same on every server within one round of voting, and it increases with each new round.

(4) Server status: election status

LOOKING: election state. The server is looking for a Leader.

FOLLOWING: follower state. The server synchronizes with the leader and participates in voting.

OBSERVING: observer state. The server synchronizes with the leader but does not participate in voting.

LEADING: leader state.
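The four states, and the fact that observers sit out the vote, can be modelled as below. The Python names here (`ServerState`, `voting_members`, `majority_needed`) are illustrative only; the sketch assumes the majority is counted over voting members, with observers excluded.

```python
from enum import Enum
from typing import Dict, List


class ServerState(Enum):
    LOOKING = "LOOKING"      # searching for a Leader
    FOLLOWING = "FOLLOWING"  # follower, takes part in voting
    LEADING = "LEADING"      # the Leader
    OBSERVING = "OBSERVING"  # observer, does not vote


def voting_members(states: Dict[int, ServerState]) -> List[int]:
    """SIDs of the servers whose votes count, in ascending order."""
    return [sid for sid, st in sorted(states.items()) if st is not ServerState.OBSERVING]


def majority_needed(states: Dict[int, ServerState]) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return len(voting_members(states)) // 2 + 1


cluster = {
    1: ServerState.FOLLOWING,
    2: ServerState.FOLLOWING,
    3: ServerState.LEADING,
    4: ServerState.FOLLOWING,
    5: ServerState.FOLLOWING,
    6: ServerState.OBSERVING,  # extra observer, added for illustration
}
print(voting_members(cluster))   # prints [1, 2, 3, 4, 5]
print(majority_needed(cluster))  # prints 3
```

Adding the observer (server 6) does not raise the number of votes needed, which matches the idea that observers do not take part in the election.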

Conclusion

(1) A Zookeeper election occurs when the servers start up and while they are running.

(2) At initial startup, the election is decided by the server SID: the larger the SID, the greater the weight, and a server becomes Leader once it has more than half of the votes.

(3) A Leader failure triggers a new round of election. The newer a server's data, the greater its weight.

(4) An election at run time may also run into the split-brain problem, which you can study on your own.