Envy neither the mandarin ducks nor the immortals: one line of code can take half a day

I personally don’t like this component, because its code has tortured me. For networking functionality that Netty could implement easily, it hand-rolls its own NIO, which leaves you reading the code in a fog.

In addition, scaling a ZooKeeper cluster up and down once cost my team dearly: we lost a lot of data. When something burns you, you carry a bad impression of it. Fortunately, it is an old technology and I rarely use it these days.

1. What is ZooKeeper

ZooKeeper is a widely used distributed coordination system. It was originally a subproject of Hadoop and is now a top-level project of the Apache Foundation. Common microservice frameworks such as Spring Cloud and Dubbo can use ZooKeeper as their registry.

In addition to being a registry, it has a wide range of usage scenarios, including naming services, distributed coordination/notification, elections, distributed locks, distributed queues, load balancing, configuration services, and more.

Since ZooKeeper is a distributed system, we can’t discuss it without first covering CAP theory.

2. What is CAP theory

CAP theory concerns Consistency, Availability, and Partition tolerance in a distributed system, and says a system can guarantee at most two of the three at once. Here are the three concepts:

  • Consistency (C): whether different nodes or replicas of a distributed system see the same value at the same moment. For example, with a MySQL master and slave, a read from the slave may return different data than the master because of binlog replication lag.

  • Availability (A): whether the cluster as a whole can still respond to clients’ read/write requests after some nodes fail.

  • Partition tolerance (P): whether the system keeps working when network failures split the cluster into partitions that cannot communicate. In practical terms, partitioning is defined by a time bound on communication: if the system cannot bring data into agreement within a certain period, a partition is considered to have occurred.

Since a distributed system must tolerate partitions, P is a given, and the real choice is between C and A. ZooKeeper chooses strong consistency: it is a CP system.

Therefore, it is important to remember that ZooKeeper favors consistency, and it can be used in scenarios that require high data consistency.

At the same time, we should realize that ZooKeeper does not guarantee the availability of every request. In extreme circumstances, ZooKeeper may discard some requests, and the consumer application may need to re-request to get a result.

As can be seen from the official architecture diagram, different clients on different machines connected to a ZooKeeper cluster see the same data, which is what its strong consistency means.

3. Usage scenarios of ZooKeeper

3.1 Registry and configuration center

Service-node information in frameworks such as Spring Cloud and Dubbo, like machine lists, is typically small in volume, changes frequently, and demands very high consistency, which makes it a very suitable scenario for ZooKeeper.

By publishing this information to ZooKeeper, application nodes always retrieve a consistent view of the data and can be notified when it changes.
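To sketch why ephemeral nodes suit a registry, here is a tiny in-memory simulation. The `Registry` class is a hypothetical stand-in, not ZooKeeper's API (real code would use a client such as Curator or kazoo): each live instance registers an ephemeral child node, and the node vanishes when its session ends, so consumers always see the current machine list.

```python
# In-memory sketch of ZooKeeper-style service registration.
# Each live instance creates an "ephemeral" child node; when its session
# ends (e.g. the process crashes), the node disappears automatically.

class Registry:
    def __init__(self):
        self._children = {}  # parent path -> {node name: owning session}

    def create_ephemeral(self, parent, name, session):
        self._children.setdefault(parent, {})[name] = session

    def close_session(self, session):
        # Ephemeral nodes vanish together with their session.
        for kids in self._children.values():
            for name in [n for n, s in kids.items() if s == session]:
                del kids[name]

    def get_children(self, parent):
        return sorted(self._children.get(parent, {}))

reg = Registry()
reg.create_ephemeral("/services/order", "10.0.0.1:8080", session="s1")
reg.create_ephemeral("/services/order", "10.0.0.2:8080", session="s2")
print(reg.get_children("/services/order"))  # both instances visible
reg.close_session("s1")                     # instance 1 crashes
print(reg.get_children("/services/order"))  # only instance 2 remains
```

A consumer that also sets a watch on the parent node gets told the moment the machine list changes, instead of polling.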

3.2 Distributed coordination/notification

ZooKeeper’s Watcher registration and asynchronous notification mechanism can coordinate different service nodes, or even different systems. It looks a lot like the Pub/Sub mechanism of traditional message queues, but thanks to ZooKeeper’s timeliness and strong data consistency, data distribution becomes very reliable.
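One point where watches differ from a Pub/Sub subscription: a ZooKeeper watch is a one-time trigger, so after it fires the client must re-read the data and re-register. A minimal in-memory sketch of that behavior (`ZNodeStore` is a hypothetical stand-in, not a real API):

```python
# Minimal in-memory stand-in for a ZooKeeper data node with one-shot watches.
# Real ZooKeeper watches behave the same way: fire once, then must be re-set.

class ZNodeStore:
    def __init__(self):
        self._data = {}      # path -> bytes
        self._watches = {}   # path -> list of one-shot callbacks

    def set(self, path, value):
        self._data[path] = value
        # Fire and clear the watches, mimicking the one-time trigger.
        for cb in self._watches.pop(path, []):
            cb(path)

    def get(self, path, watch=None):
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self._data.get(path)

store = ZNodeStore()
store.set("/config/db_url", b"mysql://host1")

seen = []
def on_change(path):
    # A client typically re-reads the node and re-registers its watch.
    seen.append(store.get(path, watch=on_change))

store.get("/config/db_url", watch=on_change)   # initial read + watch
store.set("/config/db_url", b"mysql://host2")  # triggers the watch once
```

Because notification and the subsequent read both go through the same strongly consistent store, the subscriber never acts on stale data, which is what makes this pattern more reliable than loose Pub/Sub.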

3.3 Election

The so-called election picks, from many service nodes, a leader with the final say, generally called the Master of the cluster. For example, suppose a service exposes an interface and must be highly available: when the serving node goes down, the election mechanism selects one of the backup nodes to continue serving, and the remaining machines become backups of the newly elected one.
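The standard ZooKeeper election recipe can be simulated in a few lines: each candidate creates an ephemeral sequential node under an election path, and whoever owns the lowest sequence number is the Master. The `Election` class below is a hypothetical stand-in, not ZooKeeper's API:

```python
# Sketch of ZooKeeper-style leader election with ephemeral sequential nodes:
# each candidate creates /election/node-NNNN; the lowest number leads.

class Election:
    def __init__(self):
        self._seq = 0
        self._nodes = {}  # node name -> candidate id

    def join(self, candidate):
        name = f"node-{self._seq:010d}"  # sequential suffix, like ZooKeeper's
        self._seq += 1
        self._nodes[name] = candidate
        return name

    def leave(self, name):
        # An ephemeral node is removed when its owner's session dies.
        del self._nodes[name]

    def leader(self):
        return self._nodes[min(self._nodes)] if self._nodes else None

e = Election()
a = e.join("server-A")
b = e.join("server-B")
print(e.leader())  # server-A, the earliest joiner
e.leave(a)         # server-A goes down
print(e.leader())  # server-B takes over automatically
```

Failover is automatic here because the dead candidate's node disappears with its session, and the next-lowest node becomes the Master without any manual step.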

3.4 Distributed Lock

A distributed lock coordinates access to shared resources in a distributed environment. For example, suppose you have a scheduled job running on two nodes, but only one node should execute the business logic at any given time. The task then becomes a shared resource, and acquiring it under mutual exclusion ensures the nodes do not interfere with each other and the result stays consistent.
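The classic ZooKeeper lock recipe uses the same sequential-node trick as the election, with one refinement: each waiter watches only the node immediately before its own, so a release wakes exactly one client instead of the whole herd. A hypothetical in-memory sketch (`LockDir` is not a real API):

```python
# Sketch of the standard ZooKeeper lock recipe: create an ephemeral
# sequential node under /lock; you hold the lock when yours is the lowest,
# otherwise you watch only the node just before yours.

class LockDir:
    def __init__(self):
        self._seq = 0
        self.nodes = []  # currently existing lock nodes

    def enter(self):
        name = f"lock-{self._seq:010d}"
        self._seq += 1
        self.nodes.append(name)
        return name

    def holds_lock(self, name):
        return min(self.nodes) == name

    def predecessor(self, name):
        # The node to watch: the one immediately before ours in order.
        earlier = [n for n in self.nodes if n < name]
        return max(earlier) if earlier else None

    def release(self, name):
        self.nodes.remove(name)

d = LockDir()
n1 = d.enter()
n2 = d.enter()
assert d.holds_lock(n1) and not d.holds_lock(n2)
assert d.predecessor(n2) == n1  # client 2 watches client 1, not the whole dir
d.release(n1)                   # client 1 finishes; client 2's watch fires
assert d.holds_lock(n2)
```

Because the lock node is ephemeral, a crashed holder's lock is released automatically when its session expires, so the lock can never be stuck forever.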

3.5 Distributed Queue

ZooKeeper can also implement distributed queues, for example when a batch of tasks must be processed in order: earlier tasks finish before later ones start. The task information can be stored in ZooKeeper.

It is similar to the queue concept in message queues, but is better suited to small batches of strictly ordered tasks.
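The queue recipe is once again sequential nodes: producers create task nodes under a queue path, and consumers always take the lowest sequence number, which gives strict FIFO order. A hypothetical sketch (`ZQueue` stands in for real client calls):

```python
# Sketch of a ZooKeeper-style FIFO queue: tasks are sequential nodes
# under /queue; consumers always take the lowest sequence number.

class ZQueue:
    def __init__(self):
        self._seq = 0
        self._items = {}  # node name -> task payload

    def offer(self, payload):
        name = f"task-{self._seq:010d}"
        self._seq += 1
        self._items[name] = payload

    def take(self):
        if not self._items:
            return None
        head = min(self._items)  # strict order: earliest task first
        return self._items.pop(head)

q = ZQueue()
q.offer("step-1")
q.offer("step-2")
print(q.take())  # step-1
print(q.take())  # step-2
```

Every `take` requires a round trip to the coordination service, which is why this fits small, strictly ordered batches rather than high-throughput messaging.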

4. Similar components

ZooKeeper is built on the ZAB protocol, which is similar to Paxos. Because these protocols are so complex, Etcd and Consul, built on the Raft protocol, came along later. ZooKeeper is written in Java, while the latter two are written in Go.

Etcd and Consul are up-and-comers that outperform ZooKeeper in features and performance, and both are CP systems as well, so there is little difference in how they are used.

In the Java ecosystem, ZooKeeper sees the most use. Given its surrounding tooling and product ecosystem, ZooKeeper plays a very important role in Java enterprise applications.

End

When I use it again, it will definitely be for work, not for pleasure. This also proves I am not proficient with it at all. Although I bought the book and read a little of it, please don’t leave comments asking me about the technology.