This article is part of “Java Theme Month – Java Development in Action”; see the activity link for details.

Hello everyone. Today Jie Ge (ha ha, that is what people call me in real life, and it feels friendly, so I’ll take it) will walk you through ZooKeeper, a registry commonly used in microservice architectures.

While planning this piece on ZooKeeper, I remembered how a senior engineer “grilled” me just after I joined my current company.

Overview

In my first week at the company, I had dinner with my tutor. He said he had been studying ZooKeeper recently (maybe he was eager to find someone to discuss it with), and that is how the following dialogue started (yes, I was grilled while eating…).

Why is it called ZooKeeper

Tutor: I hear you have used ZooKeeper for a year. Can you first tell me what ZooKeeper is?

Me (who knew that what I said earlier would dig a hole for myself): ZooKeeper is something we use all the time. It is an important part of a microservice architecture, and as I understand it, it is often used as the registry for Dubbo.

Tutor: Is that it?

Me (thinking: that was a bit thin, but I can’t think of anything else): Um, hang on… um, that’s it. What would you say?

Tutor: That answer is rather broad. To be honest, it misses the point; it isn’t what I was hoping to hear.

Why is it called ZooKeeper instead of being named after an animal like other technologies, such as a pig or a cat? Isn’t that the tradition? ZooKeeper is the coordinator of the distributed environment, the one who looks after all the animals, so it is called the zookeeper.

Me: Ha ha, so that’s where it comes from. The name really is vivid.

Tutor: Yes. Broadly speaking, it is a typical solution for distributed data consistency. On top of ZooKeeper, distributed applications can implement data publish/subscribe, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues.

Me: Wow, that’s professional, but a little abstract…

Characteristics of ZooKeeper

Tutor: Yes, a little. But once you understand its architecture and characteristics, it no longer feels abstract. By the way, do you know what ZooKeeper’s characteristics are?

Me (surely I can bluff my way through this one): It implements publish/subscribe and acts as a good service coordinator. Besides high performance, it also has to be fault tolerant, consistent, and efficient.

Tutor: Well, that’s good. Your answer is a little brief, but it shows that you have used it, and you hit the key points. Let me go through it. It has four characteristics:

1) Simple and efficient

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace, which is organized much like a standard file system. Unlike a typical file system, which is designed for storage, ZooKeeper keeps its data in memory, which lets it achieve high throughput and low latency.

2) Cluster node synchronization

Like the distributed processes it coordinates, ZooKeeper itself is replicated and synchronized across the hosts in a cluster.

The servers that make up the ZooKeeper service must all know about each other. They maintain an in-memory image of the state, together with a transaction log and snapshots in persistent storage. As long as a majority of the servers are available, the ZooKeeper service as a whole is available.

A client only needs to connect to a single ZooKeeper server. It maintains a TCP connection through which it sends requests, gets responses, receives watch events, and sends heartbeats.

If the TCP connection to the server breaks, the client will connect to a different server.

3) Ordered

ZooKeeper stamps each update with a number that reflects the order of all ZooKeeper transactions. Subsequent operations can use this order to implement higher-level abstractions, such as synchronization primitives.

4) Fast

ZooKeeper is especially fast in read-dominant workloads. ZooKeeper applications run on thousands of machines, and it performs best when reads are more common than writes, at a ratio of about 10:1.

In general, ZooKeeper’s implementation puts a premium on high performance, high availability, and strictly ordered access. The performance aspect means it can be used in large distributed systems. The reliability aspect keeps it from becoming a single point of failure. The strict ordering means that sophisticated synchronization primitives can be implemented on the client side.
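The ordering guarantee above can be sketched in a few lines of Java. In ZooKeeper the number stamped on each update is the transaction id (zxid), a 64-bit value whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter, so comparing two zxids orders the updates. The Zxid class and its method names are hypothetical helpers for illustration, not part of the ZooKeeper API.

```java
// Hypothetical helper, not a ZooKeeper class: packs, unpacks, and compares zxids.
final class Zxid {
    private Zxid() {}

    // High 32 bits: leader epoch. Low 32 bits: counter within that epoch.
    static long of(int epoch, int counter) {
        return ((long) epoch << 32) | (counter & 0xFFFFFFFFL);
    }

    static int epoch(long zxid) {
        return (int) (zxid >>> 32);
    }

    static int counter(long zxid) {
        return (int) zxid;
    }

    // True when update a was ordered before update b.
    static boolean happenedBefore(long a, long b) {
        return Long.compare(a, b) < 0;
    }
}
```

Because the epoch sits in the high bits, every update made under a newly elected leader compares as later than any update from an earlier epoch, whatever the counters are.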

Me: Oh, I see. I probably haven’t paid much attention to these things.

ZooKeeper’s data model

Tutor: Right. Then you probably know ZooKeeper’s data model, that is, what structure the data is stored in?

Me: Yeah, yeah, yeah! I often use client commands to check whether my microservice instances are registered with ZooKeeper. The layout is like a file system directory tree, with levels separated by slashes.

For example, to find an instance, I go to its group, then to the instance level, and then to the name of the specific registered instance. I can check whether the instance is registered by running ls on that path.

If the command prints [], the instance has not been registered successfully.

If it prints an instance number, the registration succeeded.

Tutor: Well, that is how it is commonly used, and what you said is correct. As I mentioned, the namespace ZooKeeper provides is very similar to that of a standard file system. A name is a sequence of path elements separated by slashes (/), and every znode in the ZooKeeper namespace is identified by a path.

It is like a file system in which a file can also be a directory: every znode can hold data and have children at the same time.

Note that ZooKeeper is designed to store coordination data: status information, configuration, location information, and so on. The data stored on each node is therefore usually small, in the byte to kilobyte range.
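To make the path model concrete, here is a minimal in-memory sketch in Java. It is a toy, neither how ZooKeeper is implemented nor its client API: ToyNamespace, create, and ls are hypothetical names. It only illustrates that nodes are addressed by slash-separated paths, that a parent must exist before a child can be created, and that listing a node without children yields an empty list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory namespace; NOT the ZooKeeper implementation or client API.
final class ToyNamespace {
    private final TreeMap<String, byte[]> nodes = new TreeMap<>();

    ToyNamespace() {
        nodes.put("/", new byte[0]); // the root always exists
    }

    // Creates a node; the parent must already exist.
    void create(String path, byte[] data) {
        if (!path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("bad path: " + path);
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("no parent for " + path);
        }
        if (nodes.putIfAbsent(path, data) != null) {
            throw new IllegalStateException("node exists: " + path);
        }
    }

    // Returns the names of the direct children, similar to ls in the CLI.
    List<String> ls(String path) {
        if (!nodes.containsKey(path)) {
            throw new IllegalStateException("no node: " + path);
        }
        String prefix = path.equals("/") ? "/" : path + "/";
        List<String> children = new ArrayList<>();
        // Keys sharing a prefix are contiguous in sorted order.
        for (String candidate : nodes.tailMap(prefix, false).keySet()) {
            if (!candidate.startsWith(prefix)) break;
            String rest = candidate.substring(prefix.length());
            if (!rest.contains("/")) children.add(rest); // skip grandchildren
        }
        return children;
    }

    private static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }
}
```

With made-up paths such as /services/order, calling ls before any instance node is created returns an empty list, which corresponds to the [] output described above.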

Me (didn’t he just repeat what I said?): Yeah, yeah.

Znodes

Tutor: OK, so what do you know about the znode, the building block of this model?

Me: Well, I know there are ephemeral (temporary) nodes and persistent (permanent) nodes. When an instance registers, a node is created for it. The node named after the instance is ephemeral, so when the instance dies, the node disappears.

A node also stores at least its own data, which can be read with the get command. Beyond that, I don’t really know yet. Hee hee ~

Tutor: Yes, you are right about ephemeral and persistent nodes. A node can also be created as sequential, in which case it is given a sequence number at creation time to keep the nodes in order. When you only use ZooKeeper, it’s easy to overlook what a znode actually contains, but it matters.
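The lifetime rules just described can be sketched with a toy model in Java. ToyRegistry and its methods are hypothetical names, and the paths are made up. One simplification is stated in the code: ZooKeeper keeps the sequence counter per parent node, while this sketch keeps it per prefix. The ten-digit zero padding follows the format ZooKeeper uses for sequential node names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of node lifetimes; NOT ZooKeeper's implementation.
final class ToyRegistry {
    // path -> owning session id, or null for a persistent node
    private final TreeMap<String, Long> owners = new TreeMap<>();
    // Simplification: counter per prefix (ZooKeeper keeps it per parent node).
    private final Map<String, Integer> nextSequence = new HashMap<>();

    void createPersistent(String path) {
        owners.put(path, null);
    }

    // The node lives only as long as the session that created it.
    void createEphemeral(String path, long sessionId) {
        owners.put(path, sessionId);
    }

    // Appends a zero-padded, increasing counter to the requested prefix.
    String createEphemeralSequential(String prefix, long sessionId) {
        int n = nextSequence.merge(prefix, 1, Integer::sum) - 1;
        String path = prefix + String.format("%010d", n);
        owners.put(path, sessionId);
        return path;
    }

    // When a session ends, all of its ephemeral nodes are removed.
    void closeSession(long sessionId) {
        owners.values().removeIf(owner -> owner != null && owner == sessionId);
    }

    boolean exists(String path) {
        return owners.containsKey(path);
    }
}
```

This is why a registry entry disappears when the service instance dies: the instance’s session ends, and its ephemeral node goes with it, while persistent parent nodes stay.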

Znodes maintain a stat structure that includes version numbers for data changes and ACL changes, as well as timestamps, to allow cache validation and coordinated updates.

Of course, as you mentioned, a znode first of all contains its data.

Besides the data, a znode also contains an ACL, a stat structure, and references to its child nodes:

  • ACL (access control list): records the access permissions of a znode, that is, who is allowed to access the node.

  • Stat: Contains various metadata for ZNode, such as transaction ID, version number, timestamp, size, and so on.

  • Child: references to the child nodes of the current node, similar to the children of a node in a binary tree, except that a znode is not limited to the two children shown in my drawing.
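The parts listed above, and the way the version number supports coordinated updates, can be sketched as a small Java class. ToyZnode is a hypothetical model, not ZooKeeper’s internal data structure. The version check mirrors the convention of ZooKeeper’s setData, where passing -1 matches any version; the real client signals a mismatch with an exception, while this sketch simply returns false.

```java
import java.util.ArrayList;
import java.util.List;

// Toy znode; only a model of the parts described above.
final class ToyZnode {
    byte[] data;
    final List<String> acl = new ArrayList<>();         // who may access the node
    final List<ToyZnode> children = new ArrayList<>();  // any number of children
    int version = 0;  // stat: bumped on every data change
    long mtime;       // stat: time of the last modification, in milliseconds

    ToyZnode(byte[] data) {
        this.data = data.clone();
        this.mtime = System.currentTimeMillis();
    }

    // Conditional update: succeeds only if the caller saw the latest version.
    // Passing -1 skips the check.
    boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // someone else updated the node first
        }
        data = newData.clone();
        version++;
        mtime = System.currentTimeMillis();
        return true;
    }
}
```

Two clients that both read version 0 and then both try to write with expected version 0 cannot silently overwrite each other: the second write is rejected and that client has to read again.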

Me: Hmm, I really didn’t know this part.

Tutor: Ok, how much do you know about its Watch feature?

Me: I have studied Eureka. By comparison, ZooKeeper’s watch mechanism lets a client learn in real time when the service instances it needs to call are added, removed, or updated in ZooKeeper, so that it can apply circuit breaking, fallback, and other follow-up handling. This mechanism is a great help in production.

Tutor: No problem. A watch is actually a trigger registered on a particular znode. When a client calls any read operation (getData(), getChildren(), or exists()) with the watch parameter set to true, the ZooKeeper server sends a change notification to that client when the znode changes. A watch set this way fires only once; to keep being notified, the client has to set it again.

In addition, a new feature was added in 3.6.0: clients can also set permanent, recursive watches on a znode. These are not removed when triggered, and they fire for changes to the registered znode as well as to all of its child znodes, recursively.
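The difference between the two kinds of watch can be sketched with a toy watch table in Java. ToyWatchManager and its methods are hypothetical names, not the ZooKeeper client API, and the paths are made up. The sketch only models the bookkeeping: a one-shot watch is removed when it fires, while a permanent recursive watch stays registered and also covers descendants.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Toy watch table; NOT the ZooKeeper client API.
final class ToyWatchManager {
    private final Map<String, List<Consumer<String>>> oneShot = new HashMap<>();
    private final Map<String, List<Consumer<String>>> persistentRecursive = new HashMap<>();

    void watchOnce(String path, Consumer<String> watcher) {
        oneShot.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    void watchPersistentRecursive(String path, Consumer<String> watcher) {
        persistentRecursive.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    // Called when the node at changedPath changes.
    void nodeChanged(String changedPath) {
        // One-shot watches are removed as they fire.
        List<Consumer<String>> fired = oneShot.remove(changedPath);
        if (fired != null) {
            fired.forEach(w -> w.accept(changedPath));
        }
        // Persistent recursive watches stay registered and cover descendants.
        persistentRecursive.forEach((root, watchers) -> {
            if (covers(root, changedPath)) {
                watchers.forEach(w -> w.accept(changedPath));
            }
        });
    }

    private static boolean covers(String root, String path) {
        return path.equals(root) || root.equals("/") || path.startsWith(root + "/");
    }
}
```

A client that relies on one-shot watches has to re-register after every notification, and it can miss changes that happen between the notification and the new registration, so it usually re-reads the node at that point.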

Tutor: Do you know something about cluster architecture?

Me (ha ha, finally we got to the cluster architecture I read up on just yesterday; this one I can handle!): Yeah, yeah. I know it has a master-slave structure. To be specific, there is one leader and multiple followers, plus observers, which serve read requests but do not take part in elections. This is the picture I drew when I was studying it:

  • First of all, there are two roles: client and server. A client establishes a TCP connection with one of the servers. Servers are in turn divided into three roles: leader, follower, and observer.

Leader: initiates votes and decides their outcome.

Follower: synchronizes its state with the leader and takes part in leader election. It forwards write operations to the leader and votes under the “a write succeeds once more than half acknowledge it” policy.

Observer: synchronizes its state with the leader and forwards write operations to the leader. It takes part neither in leader election nor in the “more than half” voting on writes.

  • If the leader fails due to network or other reasons, a new leader is selected from the followers.

  • Observer machines can improve cluster read performance without affecting write performance.
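The “more than half” rule behind elections and writes comes down to simple arithmetic, sketched here in Java. Quorum is a hypothetical helper, not a ZooKeeper class. Observers are left out of the count because they do not vote.

```java
// Hypothetical helper for reasoning about ensemble sizes; not a ZooKeeper API.
final class Quorum {
    private Quorum() {}

    // Votes needed: strictly more than half of the voting members
    // (leader and followers; observers are not counted).
    static int majority(int votingMembers) {
        return votingMembers / 2 + 1;
    }

    // How many voting members may fail while a majority is still alive.
    static int tolerableFailures(int votingMembers) {
        return votingMembers - majority(votingMembers);
    }

    static boolean isAvailable(int votingMembers, int alive) {
        return alive >= majority(votingMembers);
    }
}
```

For example, a cluster with 5 voting members needs 3 votes and tolerates 2 failures, and a cluster with 6 needs 4 votes and still tolerates only 2. Adding observers raises read capacity without raising the number of votes each write needs.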

Tutor: Yes, yes, this picture is quite clear. Learning really does work best when you organize what you know into a flow chart or architecture diagram like this and take the chance to explain your understanding to others. Great!

Me (praised at last! But stay humble): Hee hee, not at all. I was just interested in this architecture, so I took the time to learn about it.

The ZAB protocol

Tutor: Do you know that ZooKeeper uses the ZAB protocol to keep data consistent between the master and slave nodes?

Me (go on, tutor, the stage is yours ~): I have heard of it, but I’m not clear on the details. I probably never really understood it.

Tutor: That’s ok. Let me tell you something.

The ZAB protocol has two basic modes: crash recovery mode and broadcast mode.

Let’s start with crash recovery mode. As you mentioned earlier, there is a leader in a cluster architecture. But what if the leader fails due to network failure?

Me: A new election is needed; the followers vote to choose one.

Tutor: Yes, and the election is carried out through the ZAB protocol. ZAB enters crash recovery mode to elect a leader.

Once a leader has been elected and more than half of the followers have completed data synchronization with it, ZAB leaves crash recovery mode and switches to broadcast mode.

And what about broadcast mode? I drew the flow, and it goes like this:

1) The leader receives a write request (a client may send the write request to any follower, which forwards it to the leader)

2) The leader creates a proposal for the write

3) The leader broadcasts the proposal to all followers

4) Each follower receives the proposal and writes it to its log

5) The follower returns an ACK message to the leader

6) Once the leader has received ACKs from more than half, it returns a success message to the client and broadcasts a commit request to the followers

7) Each follower commits the transaction after receiving the commit request
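The steps above can be sketched as a toy, single-threaded simulation in Java. ToyBroadcast and Follower are hypothetical names, and this is not the real ZAB implementation: there is no networking, no persistence, and no recovery. The sketch also assumes that the leader counts its own acknowledgement toward the majority.

```java
import java.util.ArrayList;
import java.util.List;

// Toy, single-threaded model of the broadcast steps; NOT the real ZAB implementation.
final class ToyBroadcast {
    static final class Follower {
        final boolean reachable;
        final List<Long> log = new ArrayList<>();        // proposals written to the log
        final List<Long> committed = new ArrayList<>();  // committed transactions

        Follower(boolean reachable) {
            this.reachable = reachable;
        }

        // Steps 4-5: write the proposal to the log, then acknowledge it.
        boolean propose(long zxid) {
            if (!reachable) return false;
            log.add(zxid);
            return true;
        }

        // Step 7: commit once the leader's commit request arrives.
        void commit(long zxid) {
            if (reachable) committed.add(zxid);
        }
    }

    // Steps 2-6 from the leader's side. Returns true if the write succeeded.
    static boolean write(long zxid, List<Follower> followers) {
        int clusterSize = followers.size() + 1; // followers plus the leader
        int acks = 1;                           // assumption: the leader acks its own proposal
        for (Follower f : followers) {
            if (f.propose(zxid)) acks++;
        }
        if (acks <= clusterSize / 2) {
            return false; // no majority, so nothing is committed
        }
        for (Follower f : followers) {
            f.commit(zxid);
        }
        return true;
    }
}
```

With one leader and two followers, a write still succeeds when one follower is unreachable, because two of three servers acknowledged it. With both followers unreachable the write fails.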

Me (thinking: my tutor really is something!): Wow, this diagram gives a clear view of how a ZooKeeper cluster processes a request. It helps me understand why ZooKeeper can guarantee atomicity.

However, that was not the end of it. Speaking from his own perspective, my tutor moved on to performance.

Tutor: Do you know anything about reliability?

Me (not at all!): Heh heh, no idea.

Tutor: Well, look at the chart on the official website and you’ll get the idea. It marks the following events:

1) Failure and recovery of a follower

2) Failure and recovery of a different follower

3) Failure of the leader

4) Failure and recovery of two followers

5) Failure of another leader

From this chart, we can make some important observations.

1) First, if followers fail and recover quickly, ZooKeeper can sustain high throughput despite the failure.

2) More importantly, the leader election algorithm lets the system recover fast enough to prevent a substantial drop in throughput. In the observations reported on the website, ZooKeeper takes less than 200 milliseconds to elect a new leader.

3) Finally, as followers recover, ZooKeeper can raise throughput again once they start processing requests.

Me: Silence…

Tutor: What’s wrong? It doesn’t matter. You may not have built the right learning habits yet, but I can see that your ability to learn is good, and you should put more effort into thinking things through from now on. When I ask you to look into a technology, the result I want is not only that you know how to use it, but also that you can explain its underlying principles to others with a flow chart and name the business scenarios it suits in practice.

That way, not only will you understand the technology deeply, but your listeners will quickly grasp it too and be able to choose the right technology for each business scenario. For example, whether we should pick Eureka, ZooKeeper, or some other registry depends on our specific business scenario.

Me: Right. Thank you, tutor. It seems I still have a very long way to go!

Conclusion

And so, after being “grilled” by my tutor, I took the lesson to heart, and from then on I studied technology honestly and thoroughly.

There are probably many people who think the way I used to: isn’t it enough to be able to use a technology? Interviews are such a hassle. Isn’t a programmer’s job just create, read, update, and delete? Do I really have to be so theoretical? Only after failing interviews many times, and after reading a lot of interview write-ups and the experience of experts, did I begin to reflect on this.

Later, after falling into many pitfalls, I gradually came to understand that programmers come in many levels. If you only use a technology without thinking about its principles, not only will you fail to move up, but even your everyday work may suffer because you haven’t considered things thoroughly. So I gradually developed the habit of continuous learning.

So I hope you will keep exploring the underlying principles of each component of the architecture with me. You can study one technology in depth every month, whether it is one you use often or one that is relatively unfamiliar. Think more, analyze more, and talk more with your peers. I believe you will not only come to like technology more and more, but also become the kind of expert you currently look up to!