28 ZooKeeper Questions: How to Nail the Interview

1. What is ZooKeeper?

ZooKeeper is an open-source distributed coordination service that acts as the manager of a cluster. It monitors the status of each node in the cluster and acts on the feedback the nodes submit, giving users an easy-to-use interface backed by a high-performance, stable system.

Distributed applications can implement functions such as data publish/subscribe, load balancing, naming services, distributed coordination/notification, cluster management, Master elections, distributed locks, and distributed queues based on Zookeeper.

ZooKeeper provides the following distributed consistency guarantees:

Sequential consistency
Atomicity
Single system image
Reliability
Timeliness (eventual consistency)

Read requests from clients can be processed by any machine in the cluster, and if a read registers a watcher on a node, that watcher is also handled by the ZooKeeper machine the client is connected to. Write requests are forwarded to the other ZooKeeper machines and must be agreed upon before the request returns success. Therefore, as the number of machines in a ZooKeeper cluster increases, read throughput increases but write throughput decreases.

Ordering is an important feature of ZooKeeper. All updates are globally ordered, and each update carries a unique stamp called the ZooKeeper Transaction Id (ZXID). Read requests are ordered only relative to updates: the result of a read request carries the latest ZXID of the ZooKeeper server that answered it.

2. What does ZooKeeper offer?

A file system
A notification mechanism

3. Zookeeper file system

ZooKeeper provides a multi-level node namespace whose nodes are called znodes. Unlike in a file system, where only file nodes can hold data and directory nodes cannot, every znode can have data associated with it.

To ensure high throughput and low latency, ZooKeeper keeps this tree-shaped directory structure in memory. As a result, ZooKeeper is not suited to storing large amounts of data: each node can store at most 1 MB.

4. What is the ZAB protocol?

The ZAB (ZooKeeper Atomic Broadcast) protocol is an atomic broadcast protocol with crash recovery, designed specifically for the distributed coordination service ZooKeeper.

The ZAB protocol includes two basic modes: crash recovery and message broadcast.

When the ZooKeeper cluster has just started, or when the Leader server crashes, restarts, or can no longer communicate normally with more than half of the servers because of a network fault, all processes (servers) enter crash recovery mode and first elect a new Leader server. The Follower servers in the cluster then synchronize data with the new Leader. Once more than half of the machines in the cluster have completed data synchronization with the Leader, the cluster exits recovery mode and enters message broadcast mode, and the Leader starts accepting transaction requests from clients and generating transaction proposals to process them.

5. The four types of znodes

1. PERSISTENT: persistent node

The node exists on ZooKeeper until it is explicitly deleted.

2. EPHEMERAL: ephemeral (temporary) node

The life cycle of an ephemeral node is bound to the client session. Once the client session expires (a client that merely disconnects from ZooKeeper has not necessarily lost its session), all ephemeral nodes created by that client are removed.

3. PERSISTENT_SEQUENTIAL: persistent sequential node

The basic features are the same as for a persistent node, with a sequence attribute added: an incrementing integer maintained by the parent node is appended to the node name.

4. EPHEMERAL_SEQUENTIAL: ephemeral sequential node

The basic features are the same as for an ephemeral node, with a sequence attribute added: an incrementing integer maintained by the parent node is appended to the node name.

6. Zookeeper Watcher mechanism – Data change notification

ZooKeeper allows a client to register a Watcher on a znode on the server. When a specified event on the server triggers the Watcher, the server sends an event notification to that client, which implements distributed notification. The client then applies its business logic based on the notification status and event type carried by the Watcher event.

Working mechanism:

The client registers watcher
The server handles watcher
The client calls back watcher

Watcher features:

1. One-time

Whenever a Watcher is triggered, ZooKeeper removes it from the corresponding storage, on both the server and the client side. This design effectively relieves pressure on the server. Otherwise, for a frequently updated node, the server would constantly send event notifications to clients, putting both the network and the server under great pressure.

2. Serial execution by the client

Client Watcher callbacks are executed serially and synchronously.

3. Lightweight

3.1 Watcher notifications are very simple. They tell the client only that an event has occurred, not the content of the change.

3.2 When a client registers a Watcher with the server, it does not send the actual Watcher object to the server. It simply marks the request with a boolean attribute.

4. Asynchronous delivery

Watcher events are sent asynchronously from the server to the client. This creates a problem: clients and servers communicate over sockets, so network latency and other factors can cause different clients to receive an event at different times. ZooKeeper provides an ordering guarantee, that is, a client sees the change to a znode it is watching only after it has received the watch event. So with ZooKeeper we cannot expect to observe every change to a node. ZooKeeper ensures only eventual consistency, not strong consistency.

5. Operations that register a Watcher: getData, exists, getChildren

6. Operations that trigger a Watcher: create, delete, setData

7. When a client connects to a new server, its watches are triggered for any session event. While the client is disconnected from a server, it cannot receive watch events. When the client reconnects, all previously registered watches are re-registered as needed, which is usually completely transparent. A watch can be lost in only one special case: for an exists watch on a znode that has not yet been created, the event is missed if the znode is created and then deleted while the client is disconnected.

7. How the client registers a Watcher

Call the getData()/getChildren()/exists() API and pass in the Watcher object
Mark the request and wrap the Watcher in a WatchRegistration
Wrap the request in a Packet object and send it to the server
After receiving the response from the server, register the Watcher with ZKWatchManager for management
The request returns and registration is complete.

8. How the server handles a Watcher

1. The server receives the Watcher and stores it

After receiving a client request, the server processes it and determines whether a Watcher needs to be registered. If so, it stores the node path of the data node together with the ServerCnxn in the watchTable and watch2Paths of the WatchManager. (A ServerCnxn represents the connection between one client and the server and implements the process interface of Watcher, so it can be treated as a Watcher object.)

2. Watcher trigger

Take the NodeDataChanged event triggered when the server receives a setData() transaction request as an example:

2.1 Wrap a WatchedEvent

Wrap the notification status (SyncConnected), event type (NodeDataChanged), and node path in a WatchedEvent object.

2.2 Look up the Watcher

Look up the Watchers for the node path in watchTable.

2.3 Not found: no client has registered a Watcher on the data node.

2.4 Found: extract the Watchers and remove them from watchTable and watch2Paths.

3. Call the process method to trigger Watcher

In this case, process mainly sends Watcher event notifications through the TCP connection corresponding to ServerCnxn.
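The store, look up, remove, and notify steps above can be sketched as a toy model. This is not ZooKeeper's server code: the class below only borrows the names watchTable and watch2Paths from the text, a plain Python callable stands in for ServerCnxn, and the method names are hypothetical.

```python
from collections import defaultdict


class WatchManager:
    """Toy model of one-time watches; not ZooKeeper's actual server code."""

    def __init__(self):
        self.watch_table = defaultdict(set)   # path -> watchers on that path
        self.watch2paths = defaultdict(set)   # watcher -> paths it watches

    def add_watch(self, path, watcher):
        self.watch_table[path].add(watcher)
        self.watch2paths[watcher].add(path)

    def trigger_watch(self, path, event_type):
        # Extract and remove the watchers first, so each fires at most once.
        watchers = self.watch_table.pop(path, set())
        for watcher in watchers:
            self.watch2paths[watcher].discard(path)
            watcher(event_type, path)   # stands in for ServerCnxn.process()
        return len(watchers)


received = []


def on_event(event_type, path):
    received.append((event_type, path))


manager = WatchManager()
manager.add_watch("/config", on_event)
first = manager.trigger_watch("/config", "NodeDataChanged")    # watcher fires
second = manager.trigger_watch("/config", "NodeDataChanged")   # nothing left to fire
```

The second trigger finds no watcher because the first one removed it, which is why a client that wants further notifications has to register again.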

9. The client calls back Watcher

The SendThread thread receives the event notification, and the EventThread calls back Watcher.

The client-side Watcher mechanism is also one-time, and once triggered, the Watcher is invalidated.

10. ACL permission control mechanism

UGO (User/Group/Others)

UGO is used on Linux/Unix file systems and is the most widely used permission control method. It is a coarse-grained file system permission control model.

ACL (Access Control List)

An ACL covers three aspects:

Permission mode (Scheme)

1. IP: controls permissions at the granularity of IP addresses

2. Digest: the most commonly used mode. It uses a permission identifier of the form username:password, which makes it easy to separate permissions for different applications

3. World: the most open permission control mode. It is a special case of Digest with only one permission identifier, "world:anyone"

4. Super: the super user

Authorization object

An authorization object is the user or entity to which the permission is granted, such as an IP address or a machine.

Access Permission

CREATE: permission to create child nodes under this znode
DELETE: permission to delete child nodes of this znode
READ: permission to read the data content and the child node list of this znode
WRITE: permission to update the data of this znode
ADMIN: permission to manage this znode, that is, to set its ACL
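One way to picture how the scheme, the authorization object, and the permissions fit together is a bitmask check over (scheme, id, permissions) entries. The sketch below is a simplified model with assumptions: the bit values are illustrative, is_allowed is a hypothetical helper, and ids are compared by exact string match (a real ip scheme can match address ranges, and digest ids are hashed).

```python
from enum import IntFlag


class Perms(IntFlag):
    # Illustrative bit values, one bit per permission.
    READ = 1 << 0
    WRITE = 1 << 1
    CREATE = 1 << 2
    DELETE = 1 << 3
    ADMIN = 1 << 4


def is_allowed(acl, scheme, identity, needed):
    """Return True if any (scheme, id, perms) entry grants all needed bits."""
    for entry_scheme, entry_id, perms in acl:
        same_subject = entry_scheme == scheme and entry_id == identity
        if same_subject and (perms & needed) == needed:
            return True
    return False


# Hypothetical ACL: everyone may read, one address may also write and create.
acl = [
    ("world", "anyone", Perms.READ),
    ("ip", "192.168.0.10", Perms.READ | Perms.WRITE | Perms.CREATE),
]
```

With this ACL, is_allowed(acl, "world", "anyone", Perms.WRITE) is False, while the listed IP address passes a check for Perms.WRITE | Perms.CREATE.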

11. Chroot feature

The Chroot feature, added in version 3.2.0, allows each client to set a namespace for itself. If a client has a Chroot set, every operation that client performs on the server is confined to its own namespace.

By setting a Chroot, a client is mapped onto a subtree of the ZooKeeper server. When multiple applications share the same ZooKeeper cluster, this isolates the applications from one another.
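The subtree mapping can be illustrated with plain string handling. A Chroot is normally given as a path suffix on the connection string, for example zk1:2181,zk2:2181/apps/orders. The two helpers below are hypothetical and only model the path translation, not a real client.

```python
def split_connect_string(connect_string):
    """Split 'host1:2181,host2:2181/apps/x' into (hosts, chroot)."""
    hosts, slash, suffix = connect_string.partition("/")
    # An absent or empty suffix means no chroot is set.
    chroot = "/" + suffix.rstrip("/") if slash and suffix.strip("/") else ""
    return hosts.split(","), chroot


def to_server_path(chroot, client_path):
    """Map a path as seen by the client onto the real path on the server."""
    if client_path == "/":
        return chroot or "/"
    return chroot + client_path


hosts, chroot = split_connect_string("zk1:2181,zk2:2181/apps/orders")
server_path = to_server_path(chroot, "/config/db")
```

A client that believes it is working on /config/db is really working on /apps/orders/config/db, so two applications with different Chroots cannot touch each other's nodes.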

12. Session management

Bucketing strategy: similar sessions are managed in the same block, so that ZooKeeper can isolate the handling of sessions in different blocks and process the sessions in one block together.

Allocation principle: the next timeout point (ExpirationTime) of each session

Calculation formula:

ExpirationTime_ = currentTime + sessionTimeout
ExpirationTime = (ExpirationTime_ / ExpirationInterval + 1) * ExpirationInterval

The division is integer division. ExpirationInterval is the interval at which ZooKeeper checks for session timeouts. It defaults to tickTime.
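As a worked example of the formula, the sketch below rounds a session's raw expiration time up to the next multiple of the check interval. The function name and the millisecond values are made up for illustration.

```python
def expiration_bucket(current_time_ms, session_timeout_ms, expiration_interval_ms):
    """Round the raw expiration time up to the next interval boundary."""
    raw_expiration = current_time_ms + session_timeout_ms
    return (raw_expiration // expiration_interval_ms + 1) * expiration_interval_ms


# Two sessions touched 500 ms apart, same timeout, 2000 ms check interval.
bucket_a = expiration_bucket(10_500, 4_000, 2_000)   # raw expiration 14_500
bucket_b = expiration_bucket(11_000, 4_000, 2_000)   # raw expiration 15_000
```

Both sessions land in the bucket that ends at 16000 ms, so a single timeout check at that boundary can handle them together.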

13. Server roles

Leader

1. The sole scheduler and processor of transaction requests, which guarantees that cluster transactions are processed in order

2. The scheduler of the servers inside the cluster

Follower

1. Handles non-transaction requests from clients and forwards transaction requests to the Leader server

2. Takes part in voting on transaction request Proposals

3. Takes part in Leader election voting

Observer

A server role introduced in version 3.3.0 that improves the non-transaction (read) capacity of a cluster without affecting its transaction processing capacity
Handles non-transaction requests from clients and forwards transaction requests to the Leader server
Does not take part in any form of voting

14. Running status of the Zookeeper Server

A server is in one of four states: LOOKING, FOLLOWING, LEADING, and OBSERVING.

LOOKING: searching for a Leader. A server in this state believes there is currently no Leader in the cluster and therefore needs to enter Leader election.
FOLLOWING: follower state. The current server's role is Follower.
LEADING: leader state. The current server's role is Leader.
OBSERVING: observer state. The current server's role is Observer.

15. Data synchronization

After the Leader election completes, the Learners (the collective name for Followers and Observers) register with the Leader server. Once a Learner has completed registration with the Leader, the data synchronization phase begins.

Data synchronization process (all steps are carried out by message passing):

Learner registers with the Leader

Data synchronization

Synchronization acknowledgment

Zookeeper data synchronization is classified into four types:

Direct differential synchronization (DIFF synchronization)
TRUNC+DIFF Synchronization
Rollback synchronization only (TRUNC synchronization)
Full synchronization (SNAP synchronization)

Before data synchronization starts, the Leader server initializes the following values:

peerLastZxid:

The last ZXID processed by the Learner server, extracted from the ACKEPOCH message that the Learner sends when it registers.

minCommittedLog:

The minimum ZXID in committedLog, the Leader server's proposal cache queue.

maxCommittedLog:

The maximum ZXID in committedLog, the Leader server's proposal cache queue.

Direct differential synchronization (DIFF synchronization)

Scenario: peerLastZxid is between minCommittedLog and maxCommittedLog

TRUNC+DIFF Synchronization

Scenario: the new Leader server finds that a Learner server holds a transaction record that the Leader itself does not have. The Learner first rolls back to the ZXID that exists on the Leader and is closest to peerLastZxid, and then receives the differences.

Rollback synchronization only (TRUNC synchronization)

Scenario: peerLastZxid is greater than maxCommittedLog

Full synchronization (SNAP synchronization)

Scenario 1: peerLastZxid is smaller than minCommittedLog
Scenario 2: There is no Proposal cache queue on the Leader server and peerLastZxid is not equal to lastProcessZxid
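The four scenarios can be read as a decision function. The sketch below is one interpretation, not the Leader's actual code: it models committedLog as a plain list of ZXIDs, treats "the Learner holds a record the Leader lacks" as "peerLastZxid is inside the cached range but absent from the list", and assumes that an already up-to-date Learner with an empty cache needs only an empty DIFF.

```python
def choose_sync_type(peer_last_zxid, committed_log, leader_last_processed_zxid):
    """Pick a synchronization type from the scenarios described above."""
    if not committed_log:
        # Assumption: an up-to-date Learner needs nothing more than an empty DIFF.
        if peer_last_zxid == leader_last_processed_zxid:
            return "DIFF"
        return "SNAP"
    min_committed, max_committed = min(committed_log), max(committed_log)
    if peer_last_zxid > max_committed:
        return "TRUNC"
    if peer_last_zxid < min_committed:
        return "SNAP"
    if peer_last_zxid in committed_log:
        return "DIFF"
    # Assumption: inside the range but unknown to the Leader, so roll back first.
    return "TRUNC+DIFF"


# Hypothetical proposal cache spanning epochs 5 and 6.
log = [0x500000001, 0x500000002, 0x600000001, 0x600000002]
```

For example, a Learner at 0x500000003 is inside the cached range but holds a transaction the Leader never committed, so it gets TRUNC+DIFF, while a Learner at 0x400000005 is older than anything cached and gets SNAP.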

16. How does ZooKeeper ensure transaction order consistency?

ZooKeeper uses globally increasing transaction Ids to identify all proposals. Every proposal is stamped with a ZXID when it is put forward. The ZXID is a 64-bit number. The high 32 bits are the epoch, which identifies the leader's term: each time a new leader is elected, the epoch is incremented. The low 32 bits are an incrementing counter. When a new proposal is generated, the leader first sends a transaction execution request to the other servers, in a way similar to a database two-phase commit. If more than half of the machines can execute it successfully, the proposal is committed.
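The 64-bit layout is easy to check with a little bit arithmetic. The helper names below are hypothetical. The point is that comparing two ZXIDs as plain integers orders them by epoch first and by counter second.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch, counter):
    """Pack the epoch into the high 32 bits and the counter into the low 32 bits."""
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid):
    """Return (epoch, counter) for a 64-bit ZXID."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


old_leader_last = make_zxid(5, 3)    # third proposal of epoch 5
new_leader_first = make_zxid(6, 0)   # counter restarts after a new election
```

Any ZXID issued under epoch 6 is larger than every ZXID from epoch 5, whatever the counters are, which is what keeps proposals ordered across leader changes.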

17. Why does a distributed cluster have a Master?

In a distributed environment, some business logic only needs to be executed by one machine in the cluster, and the results can be shared by other machines. In this way, repeated computing can be greatly reduced and performance can be improved. Therefore, leader election is required.

18. What happens if a ZK node goes down?

ZooKeeper itself runs as a cluster, and at least three servers are recommended. ZooKeeper ensures that when one node goes down, the other nodes continue to provide service.

If a Follower crashes, the two remaining servers still provide access to the ZooKeeper data. Because ZooKeeper keeps multiple copies of the data, no data is lost.

If the Leader crashes, ZooKeeper elects a new Leader.

A ZK cluster can provide service normally as long as more than half of its nodes are working. The cluster fails only when so many ZK nodes are down that half or fewer remain working. So:

A 3-node cluster can lose 1 node (the leader can still get 2 votes, and 2 > 1.5)

A 2-node cluster cannot lose any node (the leader could get only 1 vote, and 1 <= 1)
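The "more than half" rule reduces to two lines of arithmetic. The function names are hypothetical.

```python
def quorum_size(cluster_size):
    """Smallest number of servers that is strictly more than half."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """How many servers can be down while a quorum still exists."""
    return cluster_size - quorum_size(cluster_size)


summary = {n: (quorum_size(n), tolerated_failures(n)) for n in (2, 3, 4, 5)}
```

A 4-node cluster tolerates only 1 failure, the same as a 3-node cluster, so an even-sized cluster adds cost without adding fault tolerance.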

19. Differences between ZooKeeper and Nginx load balancing

ZooKeeper-based load balancing can be tuned freely, whereas Nginx can only adjust weights, and anything else that needs to be controlled requires writing a plug-in. However, Nginx's throughput is much higher than ZooKeeper's.

20. What are the deployment modes of Zookeeper?

Deployment mode: Single-machine deployment mode, pseudo-cluster deployment mode, and cluster deployment mode.

21. How many machines are required in a cluster? What are the clustering rules?

A cluster should have 2N+1 machines (N > 0), that is, an odd number of machines and at least three.

22. Does the cluster support dynamically adding machines?

Horizontal scaling is not something ZooKeeper does particularly well. There are two approaches:

Full restart: stop all ZooKeeper services, modify the configuration, and start them again. Previous client sessions are not affected.

Rolling restart: restart one machine at a time. As long as more than half of the machines remain available, restarting a single machine does not affect the service the cluster provides. This is the more common approach.

Version 3.5 added support for dynamic scaling.

23. Is a ZooKeeper watch on a node permanent? Why not?

No. According to the official documentation, a watch event is a one-time trigger: when the watched data changes, the server sends the change to the clients that set the watch so that they are notified.

Why is it not permanent? If the server's data changes frequently and many clients are watching, notifying every client of every change would put a lot of pressure on the network and the server.

Typically the client calls getData("/nodeA", true). If nodeA is changed or deleted, the client receives the watch event. If nodeA then changes again and the client has not set a new watch, nothing more is sent to the client.

In practice, clients often do not need to know about every change on the server. They only need the latest data.

24. What Are the Java clients of Zookeeper?

Java clients: the native client that ships with ZooKeeper, and the open-source Apache Curator.

25. What is Chubby and how do you compare it to Zookeeper?

Chubby is a Google system that fully implements the Paxos algorithm and is not open source. ZooKeeper is an open-source counterpart to Chubby that uses the ZAB protocol, a variant of the Paxos algorithm.

26. Describe some common ZooKeeper commands.

Common commands include ls, get, set, create, and delete.

27. What are the connections and differences between ZAB and Paxos?

Similarities:

1. Both have a role similar to a Leader process, which coordinates the work of multiple Follower processes

2. The Leader process waits for correct feedback from more than half of the Followers before committing a proposal

3. In the ZAB protocol, each Proposal contains an epoch value that represents the current Leader term. Paxos has the same concept under the name Ballot

Difference:

ZAB is designed for building a highly available distributed primary-backup data system (ZooKeeper), whereas Paxos is designed for building a distributed consistent state machine system.

28. Typical Zookeeper application scenarios

ZooKeeper is a typical publish/subscribe style framework for distributed data management and coordination, which developers can use to publish and subscribe to distributed data.

By combining ZooKeeper's rich set of node types with the Watcher event notification mechanism, it is convenient to build a range of core functions that distributed applications need, such as:

Data publish/subscribe
Load balancing
Naming service
Distributed coordination/notification
Cluster management
Master election
Distributed locks
Distributed queues

1. Data publishing/subscription

Introduction

A data publish/subscribe system, also known as a configuration center, is one where publishers publish data for subscribers to subscribe to.

Purpose

Retrieve data (configuration information) dynamically

Manage data (configuration information) centrally and update it dynamically

Design patterns

Push model

Pull model

Characteristics of the data (configuration information)

The amount of data is usually small
The data content is updated dynamically at run time
All machines in the cluster share the same configuration

For example: machine list information, run-time switch configuration, database configuration information, and so on

Implementation based on Zookeeper

Data storage: store the data (configuration information) in a data node on ZooKeeper
Data acquisition: at startup, the application reads the data from the ZooKeeper data node and registers a data change Watcher on that node
Data change: when the data changes, the ZooKeeper node data is updated and ZooKeeper sends a data change notification to each client. After receiving the notification, the client reads the changed data again.
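The read, watch, and re-read cycle combines push (the notification) with pull (the client fetches the data itself). The sketch below imitates it with an in-memory stand-in. FakeConfigNode and ConfigClient are hypothetical classes, not part of any ZooKeeper client library.

```python
class FakeConfigNode:
    """In-memory stand-in for a znode that supports one-time data watches."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get_data(self, watcher=None):
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set_data(self, data):
        self.data = data
        # Watches are one-time: clear the list before notifying.
        watchers, self._watchers = self._watchers, []
        for notify in watchers:
            notify()


class ConfigClient:
    def __init__(self, node):
        self.node = node
        self.config = None
        self.reload()

    def reload(self):
        # Pull the latest data and register a fresh watch in the same call.
        self.config = self.node.get_data(watcher=self.reload)


node = FakeConfigNode("timeout=30")
client = ConfigClient(node)
node.set_data("timeout=60")   # first change is pushed, then pulled
node.set_data("timeout=90")   # still seen, because reload() re-registered
```

Registering the watch inside the same read that fetches the data is what keeps the client from missing the second update.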

2. Load balancing

ZK naming service

A naming service resolves a given name to the address of a resource or service. With ZK you create a global path that serves as the name and points to a cluster, the address of a service, a remote object, and so on.

Distributed notification and coordination

For system scheduling: an operator sends a notification through a console, which in practice changes the state of a node, and ZK then sends that change to all clients that registered a Watcher on the node. For progress reporting: each worker process creates an ephemeral node under a directory and stores its progress data in it, so that a summarizing process can watch for changes to the directory's child nodes and obtain a real-time global view of the work progress.

ZK naming service (file system)

A naming service resolves a given name to the address of a resource or service. With ZK you create a global path, which is therefore a unique path. That path serves as the name and points to a cluster, the address of a service, a remote object, and so on.

ZK configuration management (file system, notification mechanism)

Programs are deployed across different machines and their configuration information is kept in a ZK znode. When the configuration changes, you update the content of that znode, and the Watcher mechanism notifies every client so that each can apply the new configuration.

Zookeeper Cluster management (file system, notification mechanism)

Cluster management comes down to two things: detecting machines leaving or joining, and electing a master.

For the first, all machines agree to create an ephemeral directory node under a parent directory and to watch the parent directory node for child node changes. Once a machine dies, its connection to ZooKeeper is broken, the ephemeral directory node it created is deleted, and all other machines are notified that a sibling directory has been removed, so everyone knows that machine is gone.

A new machine joining works the same way: all machines are notified that a new sibling directory has been added, so everyone knows the head count has gone up again. For the second, we change the scheme slightly: every machine creates an ephemeral sequential directory node, and the machine whose node has the smallest number is chosen as the master each time.

Zookeeper distributed lock (file system, notification mechanism)

With ZooKeeper's consistent file system, locking becomes easier. Lock services fall into two categories: one keeps exclusive ownership, and the other controls ordering.

For the first type, we treat a znode on ZooKeeper as the lock and implement it by creating that znode. All clients try to create the /distribute_lock node, and the client that succeeds owns the lock. Deleting the /distribute_lock node it created releases the lock.

For the second type, /distribute_lock already exists, and every client creates an ephemeral sequential directory node under it. As with master election, the client whose node has the smallest number obtains the lock, and it deletes its node when it is finished.
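The "smallest number wins" rule, which also decides master election, can be sketched over a list of child node names. This assumes the usual naming of sequential nodes, a prefix followed by a 10-digit zero-padded counter. The lock- prefix and the function names are made up for the example.

```python
def sequence_number(node_name):
    """Parse the 10-digit sequence suffix appended to a sequential node."""
    return int(node_name[-10:])


def lock_holder(children):
    """Return the child with the smallest sequence number, or None if empty."""
    return min(children, key=sequence_number) if children else None


# Hypothetical children of /distribute_lock, in arbitrary listing order.
children = ["lock-0000000012", "lock-0000000007", "lock-0000000009"]
holder = lock_holder(children)

# When the holder finishes and deletes its node, the next smallest takes over.
children.remove(holder)
next_holder = lock_holder(children)
```

Because the nodes are ephemeral, a holder that crashes loses its session, its node disappears, and the same rule hands the lock to the next client.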

Zookeeper queue Management (file system, notification mechanism)

There are two types of queues:

1. Synchronization queue: the queue becomes available only when all of its members have gathered. Until then, it waits for all members to arrive.

2. FIFO queue: items are enqueued and dequeued in first-in, first-out order.

For the first type, create ephemeral directory nodes under an agreed directory and watch whether the number of nodes has reached the required count.

The second type follows the same basic principle as the ordering scenario in the distributed lock service: items are numbered on entry and leave in number order. A PERSISTENT_SEQUENTIAL node is created under a specific directory. When creation succeeds, a Watcher notifies the waiting consumers, and a consumer removes the node with the smallest sequence number and consumes it. In this scenario, ZooKeeper's znodes are used for message storage: the data stored in a znode is the message content, and the SEQUENTIAL number is the message number. Because the nodes created are persistent, there is no need to worry about losing queue messages.
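The FIFO behaviour follows from the sequence numbers alone, as this in-memory sketch suggests. SequentialQueue is a hypothetical class that imitates a parent node handing out zero-padded sequence suffixes. It does not talk to ZooKeeper and leaves out the Watcher notification.

```python
class SequentialQueue:
    """Toy model of a FIFO queue built on sequentially numbered child nodes."""

    def __init__(self, prefix="item-"):
        self.prefix = prefix
        self.next_sequence = 0   # counter maintained by the parent node
        self.children = {}       # node name -> message content

    def enqueue(self, data):
        name = f"{self.prefix}{self.next_sequence:010d}"
        self.next_sequence += 1
        self.children[name] = data
        return name

    def dequeue(self):
        if not self.children:
            return None
        # Zero-padded names sort in creation order, so min() is the oldest.
        oldest = min(self.children)
        return self.children.pop(oldest)


queue = SequentialQueue()
first_name = queue.enqueue("order-created")
queue.enqueue("order-paid")
queue.enqueue("order-shipped")
first_out = queue.dequeue()
second_out = queue.dequeue()
```

Messages come out in the order they went in because the consumer always takes the node with the smallest sequence number.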
