Distributed administrator ZooKeeper, you need to know

Good evening, I am luca. The last half month has been fairly tense. For one thing, Chinese New Year is coming soon and the company has many personnel review and assessment documents to complete. For another, I have been updating my resume and summarizing the technologies I have used. It has been a full seven months since I joined in May, and I found that my grasp of many technologies is not very solid, the classic ZooKeeper among them. Apache ZooKeeper is an open source distributed coordination and notification service that is well suited to high-concurrency e-commerce systems and microservice systems. I restudied it and summarized the key points, which helped me refine and deepen my understanding. Technical life begins with relearning the fundamentals.

Zookeeper positioning

ZooKeeper is an open source Apache project.

Main functions of Zookeeper

ZooKeeper is used for distributed coordination and management of microservices, metadata management, and Master election. In practice it appears mainly in three types of projects.

1. Direction of ZooKeeper

① Java-based back-end systems such as e-commerce management systems

Here ZooKeeper mainly serves as a distributed lock and a service registry. In terms of the CAP theorem, ZooKeeper chooses CP (consistency and partition tolerance); that is, ZooKeeper puts data consistency first.

② Big data systems: data storage, data management, primary/secondary switchover, and so on

Familiar examples are Kafka and Canal, which use ZooKeeper for metadata management and for electing a master among their nodes. Underneath, ZooKeeper relies on the ZAB atomic broadcast protocol.

③ Distributed components developed in-house by large Internet companies

These are generally modeled on ZooKeeper. ZooKeeper has been applied at industrial scale in major distributed systems, for metadata management, and as a registry. A common example is Dubbo + ZooKeeper as a basic microservice architecture, where ZooKeeper supports basic service calls.

Summary:

ZooKeeper is a distributed coordination system that encapsulates the core, mainstream requirements of a distributed architecture:

distributed locking, metadata management, master election, and distributed coordination and notification.

Basic features of ZooKeeper:

Before looking at the basic structure of ZooKeeper, let’s understand why ZooKeeper can provide coordination and notification, metadata management, and so on for distributed systems.

Anyone familiar with the ZooKeeper technology stack knows that it has a strong notification and coordination mechanism and handles data synchronization well. This is why many self-developed frameworks use ZooKeeper underneath.

Background:

In a cluster environment, the ZooKeeper cluster is deployed on three physical machines.

1. Sequential consistency

Multiple requests, whether reads or writes, are executed in order. Internally, the leader assigns a globally unique, monotonically increasing zxid to each write request, and this global ID guarantees ordering. For now, think of the zxid simply as a request sequence number; it is covered in detail later.

2. Atomicity

A data synchronization operation either succeeds on all machines or fails on all machines, so the result of each synchronization is consistent.

3. High availability

Cluster deployment keeps a single machine failure from taking the service down. If the leader breaks down, a new leader is elected and service resumes: the cluster goes from recovery mode back to broadcast mode.

4. Data consistency

In a cluster, the data on every machine is consistent. Even if a machine breaks down during synchronization, the other machines have already saved the data to their disk logs. After the leader commits, the data is applied to the znodes, so the data on each machine ends up consistent. Data consistency is the core feature of ZooKeeper. Under the CAP theorem a distributed system cannot have all three properties, and ZooKeeper supports CP.

C stands for consistency, A for availability, and P for partition tolerance. Eureka, by contrast, supports AP.

5. High performance

Performance comes from ZooKeeper’s data model, the znode. Znodes are held purely in memory, which makes them fast. They form a tree structure similar to the Linux file system, and each node can store data. This gives ZooKeeper strong processing capability; more on this later.

Summary:

Features of the ZooKeeper-based cluster architecture:

Zookeeper clustering:

Background:

As in the scenario above, the ZooKeeper cluster is deployed on three machines. Now let’s break down the main function of each machine.

ZooKeeper’s data is synchronized to each follower after the leader commits the transaction, using a 2PC-style commit; more on that later.

1. Role division in Zookeeper cluster:

Leader

After the cluster starts, the machine that wins more than half of the votes becomes the Leader. It supports both read and write operations.

Follower (candidate)

The machines that do not win the election take the follower role. They support data reads and data synchronization. When the leader breaks down and can no longer serve writes, the followers elect a new leader to take over write operations.

Observer

It only serves reads and does not take part in leader election.

Each role can be thought of as a mode that a machine runs in.

Once each role is clear about its responsibilities, the cluster can synchronize data and provide services.

1.1 Matters needing Attention:

If a write request reaches the ZooKeeper process of a follower, the follower cannot handle it, because followers are not allowed to process writes. Instead, the follower forwards the write request to the leader. The leader processes it and then starts data synchronization, pushing the change out to the followers.
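The forwarding rule can be illustrated with a small in-memory sketch. The `Leader` and `Follower` classes and the path `/config/db` are hypothetical names made up for this example, and the synchronization step is deliberately simplified to a direct copy; this is not how the real server is implemented.

```python
class Leader:
    """Toy leader: the only node allowed to apply writes."""

    def __init__(self):
        self.data = {}
        self.followers = []

    def write(self, path, value):
        self.data[path] = value
        # Simplified synchronization: push the change to every follower.
        for follower in self.followers:
            follower.data[path] = value
        return "leader"


class Follower:
    """Toy follower: serves reads locally, forwards writes to the leader."""

    def __init__(self, leader):
        self.leader = leader
        self.data = {}
        leader.followers.append(self)

    def read(self, path):
        return self.data.get(path)

    def write(self, path, value):
        # A follower never applies a write itself; it hands it to the leader.
        return self.leader.write(path, value)


leader = Leader()
f1, f2 = Follower(leader), Follower(leader)
handled_by = f1.write("/config/db", "v1")   # sent to a follower, handled by the leader
```

After the call, the value written through `f1` is readable from `f2` as well, because the leader pushed it to every follower.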

2. The process of accessing the Zookeeper cluster and client:

Background:

The ZooKeeper cluster processes requests of many kinds, whether they come from message queues or are individual read requests. How does each request connect to the ZooKeeper cluster?

Once the ZooKeeper cluster has started successfully and the roles have been assigned, clients establish long-lived TCP connections to the cluster.

When a connection is established, a session is created on the server side, and each session has a corresponding sessionTimeout.

If the connection is suddenly interrupted, the session stays valid as long as the client reconnects within the sessionTimeout period, so the long-lived connection can continue to be used.
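A minimal sketch of the session timeout rule follows. The `Session` class, its field names, and the millisecond timestamps are assumptions made for illustration; the real server tracks sessions differently, but the reconnect-within-timeout idea is the same.

```python
class Session:
    """Toy server-side session record (hypothetical, for illustration only)."""

    def __init__(self, session_id, timeout_ms, now_ms):
        self.session_id = session_id
        self.timeout_ms = timeout_ms
        self.last_heard_ms = now_ms

    def is_expired(self, now_ms):
        # The session expires once the client has been silent longer than the timeout.
        return now_ms - self.last_heard_ms > self.timeout_ms

    def touch(self, now_ms):
        """Record a heartbeat or reconnect; returns False if it came too late."""
        if self.is_expired(now_ms):
            return False
        self.last_heard_ms = now_ms
        return True


session = Session(session_id=1, timeout_ms=30_000, now_ms=0)
reconnected_in_time = session.touch(now_ms=10_000)   # within the timeout, session kept
reconnected_too_late = session.touch(now_ms=50_000)  # 40 s of silence, session is gone
```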

3. How to ensure data consistency in the Zookeeper cluster?

Background:

ZooKeeper chooses CP in the CAP theorem, with consistency as the priority. Its architecture has specific features that ensure data consistency. Let’s unveil step by step how ZooKeeper does it.

ZooKeeper Atomic Broadcast (ZAB) protocol

This protocol is the core of ZooKeeper’s data operations and the key to metadata management.

The protocol synchronizes data across the machines of a ZooKeeper cluster to keep the data consistent.

Under ZAB, the roles are leader and follower.

Sending a proposal: the leader sends a Proposal for the transaction to all the followers. Each follower that receives the Proposal returns an ACK, and the leader waits until more than half of the followers have acknowledged.

On receiving a proposal, a follower does not write it to its znode data right away. It first writes the proposal to a log file on its local disk and then returns an ACK to the leader. The leader then sends a commit for the transaction.

Commit on majority: this is a two-phase commit (2PC) that needs acknowledgements from only more than half of the machines. If the leader breaks down unexpectedly, a new election is held among the followers.

4.Zookeeper cluster data synchronization process:

This section is broken down in finer detail; stick with it and you will get something out of it ⛽️

① When the cluster starts, the election algorithm assigns the leader and follower roles. The machine that wins more than half of the votes becomes the leader.

② The remaining machines take the follower role, and the cluster can then provide services.

③ When a write request arrives at the ZooKeeper cluster, the leader first allocates a zxid, a globally unique auto-increment ID, for the node creation or change. It then initiates a Proposal (only a proposal, the equivalent of telling the others to get ready) and puts it into the queue prepared for each follower, which preserves ordering.

④ Each follower takes the proposal, writes the data to its own disk log file (not yet to the znode), and replies to the leader with an ACK confirming receipt.

ACK (acknowledge character): in data communications, a transmission control character sent from the receiving station to the transmitting station to indicate that the data has been received. In TCP/IP, a receiver that successfully receives data responds with an ACK. An ACK signal usually has its own fixed format and length.

⑤ Once the leader has received ACKs from more than half of the followers, the preparation phase has succeeded, and the leader pushes a commit message.

On receiving the commit, the followers write the data from the disk log to the znode. After synchronization is complete, the data can be read.

Because of this process, a two-phase commit, also known as 2PC, is used to resolve the distributed transaction.
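The propose, ACK, commit sequence can be sketched as a small simulation. Everything here is a toy under stated assumptions: the class names are hypothetical, the zxid is a plain counter, messages are direct method calls, and the leader counts itself as one vote toward a majority of the whole cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    name: str
    alive: bool = True
    disk_log: list = field(default_factory=list)  # proposals logged, not yet readable
    znodes: dict = field(default_factory=dict)    # committed data that reads can see

    def on_proposal(self, zxid, path, value):
        if not self.alive:
            return False               # no ACK from a crashed machine
        self.disk_log.append((zxid, path, value))
        return True                    # ACK

    def on_commit(self, zxid):
        if not self.alive:
            return
        for logged_zxid, path, value in self.disk_log:
            if logged_zxid == zxid:
                self.znodes[path] = value


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.next_zxid = 1
        self.znodes = {}

    def write(self, path, value):
        zxid = self.next_zxid
        self.next_zxid += 1
        # Phase 1: propose and count ACKs (assumption: the leader is one vote).
        acks = 1 + sum(f.on_proposal(zxid, path, value) for f in self.followers)
        ensemble_size = 1 + len(self.followers)
        if 2 * acks <= ensemble_size:
            return False               # no majority, nothing is committed
        # Phase 2: commit locally, then tell the followers to commit.
        self.znodes[path] = value
        for f in self.followers:
            f.on_commit(zxid)
        return True


f1, f2 = Follower("f1"), Follower("f2", alive=False)
leader = Leader([f1, f2])
first = leader.write("/config", "v1")    # leader + f1 form a majority of 3
f1.alive = False
second = leader.write("/config", "v2")   # only the leader is left, write is refused
```

The first write succeeds with one follower down; the second is refused because the leader alone is not a majority, so `/config` keeps the value `v1`.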

5. Data model of Znode:

1. The data model of ZooKeeper

A purely memory-based tree structure: the znode

Znodes are classified as follows:
Persistent node: the node still exists after the client disconnects from ZK.
Temporary node: the node disappears when the client disconnects from ZK.

Distributed locks on ZK, for example with the Curator framework, are built on temporary nodes: acquiring the lock creates a temporary sequential node.

2. Functions of the ZK node

When ZooKeeper is used for metadata management, the nodes are persistent. For distributed locking and for distributed coordination and notification, the nodes are usually temporary: if the client disconnects, the temporary node disappears.
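To make the difference between node types concrete, here is a hedged in-memory sketch. `ToyZk`, `lock_holder`, the paths, and the session names are all hypothetical and are not part of any ZooKeeper client library. It assumes the common lock recipe in which the temporary sequential node with the lowest sequence number holds the lock.

```python
class ToyZk:
    """In-memory stand-in for a znode store; not a real ZooKeeper client."""

    def __init__(self):
        self.nodes = {}   # path -> (data, owning session, or None if persistent)
        self.next_seq = 0

    def create(self, path, data=b"", session=None, sequential=False):
        if sequential:
            path = f"{path}{self.next_seq:010d}"   # append a zero-padded counter
            self.next_seq += 1
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        self.nodes[path] = (data, session)
        return path

    def close_session(self, session):
        if session is None:
            return
        # Temporary nodes disappear together with the session that created them.
        for path in [p for p, (_, owner) in self.nodes.items() if owner == session]:
            del self.nodes[path]

    def children(self, prefix):
        return sorted(p for p in self.nodes if p.startswith(prefix))


def lock_holder(zk, lock_prefix):
    """The lock belongs to the temporary node with the lowest sequence number."""
    waiting = zk.children(lock_prefix)
    return waiting[0] if waiting else None


zk = ToyZk()
zk.create("/meta/topic-a", b"partitions=3")              # persistent metadata
a = zk.create("/locks/order-", session="client-A", sequential=True)
b = zk.create("/locks/order-", session="client-B", sequential=True)
holder_before = lock_holder(zk, "/locks/order-")
zk.close_session("client-A")                              # client A disconnects
holder_after = lock_holder(zk, "/locks/order-")
```

When client A disconnects, its temporary node vanishes and the lock passes to client B, while the persistent metadata node is untouched.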

3. Components of a Znode:

6. Zookeeper starts the cluster data synchronization process

Leader election

When the cluster starts, leader election begins. Once more than half of the machines agree that a given machine is the leader, each follower synchronizes its data with the leader, the cluster exits recovery mode, and it starts serving requests.

If the leader breaks down, a leader is elected again.

With 3 machines, the cluster keeps working as long as fewer than half of the machines fail, that is, at most one.

With 2 machines left, if both agree that one of them should be the leader, that machine can be elected, because 2 out of 3 is more than half.

A single machine cannot elect itself: 1 out of 3 is less than half, so the cluster cannot start.
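The arithmetic behind these three cases is short enough to write down. The function names below are made up for this sketch; the only rule assumed is "strictly more than half of the machines".

```python
def quorum_size(ensemble_size):
    """Smallest number of machines that is strictly more than half."""
    return ensemble_size // 2 + 1


def can_elect_leader(ensemble_size, alive):
    """A leader can be elected only if the survivors form a majority."""
    return alive >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """How many machines may fail while a majority still survives."""
    return ensemble_size - quorum_size(ensemble_size)


three_nodes = {
    "quorum": quorum_size(3),
    "two_alive": can_elect_leader(3, alive=2),
    "one_alive": can_elect_leader(3, alive=1),
    "tolerated_failures": tolerated_failures(3),
}
```

Note that a 4-machine cluster tolerates no more failures than a 3-machine one, since its quorum rises to 3.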

Data synchronization

After the leader is elected, all the other machines are followers and data synchronization begins.

The followers’ data is forced to be consistent with the leader’s.

Once the data is synchronized, the cluster enters message broadcast mode.

Message write: the leader writes messages in 2PC style and synchronizes them to the followers,

which then apply the data to their znodes.

Crash recovery

Whether the leader or a follower fails, as long as more than half of the machines survive, a new leader can be elected. The elected leader needs the support of more than half of the machines. The other followers then synchronize their data and the cluster returns to message broadcast mode.

On startup, ZK synchronizes data and then moves from recovery mode to message broadcast mode.

7. What is your understanding of data consistency in the ZooKeeper cluster?

During data synchronization, the leader puts proposals into a queue (first in, first out) and sends them to the followers. When more than half of the followers have returned an ACK (acknowledgement), the leader pushes a commit and the followers apply the data. (Note that the leader does not wait for every follower to respond.)

So ZK data synchronization is not strongly consistent.

On a follower, data sitting in the disk log file can be read only after the follower has applied it to the znode. The data becomes consistent eventually.

The official ZK position is that sequential consistency is guaranteed by the zxid and the order of proposals.

The leader makes sure all proposals are synchronized to the followers in order, which achieves sequential consistency.

ZK can also support strong consistency, but you need to call ZK’s sync() operation manually.
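The effect of a stale follower, and of catching up before reading, can be sketched with a toy model. The classes and the `sync` method below are hypothetical and only imitate the idea; they are not the signature of the real client's sync() call.

```python
class Leader:
    """Toy leader holding the ordered list of committed changes."""

    def __init__(self):
        self.committed = []   # ordered (zxid, path, value) entries

    def commit(self, path, value):
        zxid = len(self.committed) + 1
        self.committed.append((zxid, path, value))
        return zxid


class Follower:
    """Toy follower that may lag behind the leader."""

    def __init__(self, leader):
        self.leader = leader
        self.znodes = {}
        self.last_zxid = 0

    def apply_up_to(self, zxid):
        # Apply committed changes in zxid order, never skipping or reordering.
        for entry_zxid, path, value in self.leader.committed:
            if self.last_zxid < entry_zxid <= zxid:
                self.znodes[path] = value
                self.last_zxid = entry_zxid

    def sync(self):
        # Catch up with everything the leader has committed so far.
        self.apply_up_to(len(self.leader.committed))

    def read(self, path):
        return self.znodes.get(path)


leader = Leader()
follower = Follower(leader)
follower.apply_up_to(leader.commit("/price", "100"))
leader.commit("/price", "120")          # this commit has not reached the follower yet
stale = follower.read("/price")         # the follower still serves the old value
follower.sync()
fresh = follower.read("/price")         # after catching up, the read is current
```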

8. What are the cases of data inconsistency under ZAB?

Case 1:

The leader has just committed the data locally but breaks down before it can send the commit to the followers.

In brief: the client sends a write request and the leader receives ACKs from more than half of the followers. The leader writes the data to its own znode and the commit succeeds locally, but the leader fails before sending the commit to the followers. The client has been told that the leader committed, so it assumes the data has been updated. However, when a new request reads from a follower, the data has not been updated and differs from what the leader returned, which results in inconsistent data.

In this case, the follower data is inconsistent with the data on the leader that just crashed. What does ZK do next?

After a certain time, the followers in the ZK cluster notice that the leader no longer responds and conclude it has failed. They then elect a new leader, and when the old leader comes back it becomes a follower.

Case 2:

The client sends a write request to the leader, and the leader creates a proposal but dies before sending it to the followers.

The proposal exists in the log file on the old leader’s local disk, but the followers do not have this data.

When the cluster crashes and enters recovery mode, the core of the ZAB protocol comes into play:

Solution to Case 1:

After a new leader is elected, it checks whether its local disk log holds a proposal that has not been committed, and whether more than half of the followers hold the same proposal. If so, the new leader (itself a follower that never received the commit message) sends the commit to the other followers, which write the data to their znodes. This resolves the inconsistent reads seen by clients.

Solution to Case 2:

Here the failed leader holds a proposal that no follower has received. After recovery mode, a new leader is elected and takes over writes, but the first thing it does is check whether any proposals remain on local disk. If the data on one follower’s disk is inconsistent with the others, the new leader synchronizes its data to the followers. When the old leader rejoins as a follower, its leftover proposal is discarded.
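Both recovery outcomes can be sketched with one small function. `resync_with_leader` is a hypothetical helper written for this example: it assumes each log is a list of `(zxid, data)` pairs and that the new leader's log is the reference copy every other machine must match.

```python
def resync_with_leader(node_log, leader_log):
    """Return (new_log, discarded) for a node that rejoins under a new leader.

    Entries the new leader does not have are discarded; the node then
    adopts the leader's log as its own.
    """
    leader_zxids = {zxid for zxid, _ in leader_log}
    discarded = [entry for entry in node_log if entry[0] not in leader_zxids]
    return list(leader_log), discarded


# The new leader's log after election (it holds proposals 1 and 2).
new_leader_log = [(1, "create /a"), (2, "set /a")]

# First case: a follower that missed proposal 2 simply receives it.
lagging_log, lagging_discarded = resync_with_leader([(1, "create /a")], new_leader_log)

# Second case: the old leader logged proposal 3 but crashed before sending it.
old_leader_log = [(1, "create /a"), (2, "set /a"), (3, "set /b")]
rejoined_log, rejoined_discarded = resync_with_leader(old_leader_log, new_leader_log)
```

In the first case nothing is lost; in the second case the proposal that only the old leader ever saw is dropped.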

9. How does the new leader elected during crash recovery synchronize data with other followers

Let’s simulate a ZooKeeper cluster deployed on 5 machines.

Four of them are followers, which serve reads and synchronize data; the remaining one is the leader, which handles writes.

Background:

The leader sends out a proposal. Three of the follower machines receive it and one does not. Since three followers have returned ACKs, the leader decides that the proposal has been acknowledged by more than half of the followers and commits locally. It then fails before sending the commit to the followers.

Analysis:

The leader has already committed the proposal to its znode, but it is now down. Of the other four followers, three have written the proposal to their local disk log files and the remaining one has no record of the proposal at all. If, during recovery, the other three machines all elected that last machine as the new leader, this data would be lost forever.

For this reason, the ZooKeeper cluster does not elect a leader that is missing data.

The main thing to understand is the concept of the zxid:

Zxid: it can be understood as an auto-incrementing transaction ID. Each new proposal increases it.

If a follower did not receive the proposal, the zxid on that follower will be smaller than on the machines that received the proposal and returned an ACK.

The requirement for leader election is that, among the followers, the one with the largest transaction zxid, that is, the one with the most recent data, becomes the new leader.
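The 5-machine scenario can be reproduced with a very small sketch of the selection rule. `elect_leader` is a hypothetical function, the zxid values are made up, and breaking ties by the larger server id is an assumption of this example. Real election also involves exchanging votes until a majority agrees, which is left out here.

```python
def elect_leader(last_zxids):
    """Pick the server with the largest last zxid; ties go to the larger id."""
    return max(last_zxids, key=lambda server_id: (last_zxids[server_id], server_id))


# Four surviving followers after the leader crashed.
survivors = {
    1: 0x800000005,   # logged the proposal
    2: 0x800000005,   # logged the proposal
    3: 0x800000005,   # logged the proposal
    4: 0x800000004,   # never received the proposal
}
new_leader = elect_leader(survivors)
```

Server 4, which missed the proposal, can never win while a server with a larger zxid is among the candidates, so the logged proposal survives the election.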

10.zxid

The status information of a znode contains czxid. So what is a zxid?
Each change to the ZooKeeper state corresponds to an increasing transaction ID called the zxid. Because zxids only increase, if zxid1 is less than zxid2, then zxid1 happened before zxid2.
Creating a node, updating its data, or deleting it changes the ZooKeeper state and increases the zxid.

In ZooKeeper, the zxid is a 64-bit number. The lower 32 bits are an increasing counter: each time the leader issues a proposal for a client request, the lower 32 bits are incremented by 1. The higher 32 bits are the epoch, which identifies the leader’s term and acts as a version number of the leader.

For example, if the current epoch is 8, the current leader’s version is 8; when a new election is held after a failure, the epoch becomes 9.

Thinking back to the second case described earlier, you can now see why that data is discarded: the leader with the largest epoch is the one whose proposals can be committed, and it is the one that issues new proposals.
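The 64-bit layout described above is easy to check with a few lines of bit arithmetic. The helper names `make_zxid` and `split_zxid` are invented for this sketch.

```python
COUNTER_MASK = 0xFFFFFFFF   # lower 32 bits


def make_zxid(epoch, counter):
    """Pack the epoch into the high 32 bits and the counter into the low 32 bits."""
    return (epoch << 32) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a 64-bit zxid."""
    return zxid >> 32, zxid & COUNTER_MASK


old_leader_zxid = make_zxid(epoch=8, counter=5)
new_leader_zxid = make_zxid(epoch=9, counter=0)
epoch, counter = split_zxid(old_leader_zxid)
```

Because the epoch occupies the high bits, any zxid from epoch 9 compares greater than every zxid from epoch 8, no matter how large the old counter was.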

11. The role of the Observer node:

The observer does not take part in the majority-write mechanism and does not take part in elections.

It just passively receives data.

If read demand is high, we can add several observers.

Advantages:

Can handle a large number of concurrent reads,
does not affect the majority-write mechanism,
and does not noticeably affect write performance.

The reason:

When the leader sends a proposal, it has to wait for more than half of the followers to return an ACK before committing and synchronizing data.

Elections also need more than half of the votes. Both consume time and network resources, and more so as voting machines are added. Because observers only serve reads and do not vote, they largely remove this pain point.
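A short calculation shows why observers scale reads without slowing writes. The function names are hypothetical, and the sketch assumes that the voting members are the leader plus the followers, with a write needing a majority of them.

```python
def quorum(voters):
    """Strict majority of the voting members."""
    return voters // 2 + 1


def acks_needed(leader_and_followers, observers):
    # Observers receive data but never vote, so they are left out of the count.
    return quorum(leader_and_followers)


base = acks_needed(3, observers=0)                # plain 3-machine cluster
with_observers = acks_needed(3, observers=4)      # add 4 read-only observers
with_followers = acks_needed(3 + 4, observers=0)  # add 4 voting followers instead
```

Adding four observers leaves the required number of acknowledgements unchanged, while adding four followers raises it from 2 to 4.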

12. Zookeeper applies to the following deployment scenarios:

Suitable for small clusters with many reads and few writes.

Use an odd number of machines, because elections need more than half of the votes.

Writes follow the pattern: majority write to the disk log files, then commit to the znodes.

The main limitation is that all write pressure lands on the leader.
