ZooKeeper concepts

  • ZooKeeper is a distributed coordination service for managing large sets of hosts. Coordinating and managing services in a distributed environment is complex; ZooKeeper addresses this with a simple architecture and API

    ZooKeeper can be used to implement distributed locks

    A distributed lock has three elements: locking, unlocking, and lock timeout
  • ZooKeeper's data structure resembles a tree and is made up of nodes called ZNodes
  • There are four types of ZNodes:

    • PERSISTENT: The default node type. The node continues to exist after the client that created it disconnects from ZooKeeper
    • PERSISTENT_SEQUENTIAL: A persistent node to whose name ZooKeeper appends a sequence number reflecting the order of creation
    • EPHEMERAL: The node is deleted when the session of the client that created it ends, for example after the client disconnects from ZooKeeper
    • EPHEMERAL_SEQUENTIAL: An ephemeral node that ZooKeeper also numbers in order of creation
  • Ephemeral sequential nodes are what ZooKeeper-based distributed locks are built on
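The four node types can be illustrated with a toy in-memory model. This is only a sketch and not the ZooKeeper client API: the class `ToyZooKeeper`, its methods, and the session strings are hypothetical names made up for illustration. It assumes a per-parent counter appended as a zero-padded 10-digit suffix, which is how ZooKeeper names sequential nodes.

```python
class ToyZooKeeper:
    """Toy model of the four ZNode types (hypothetical, not the real client API)."""

    def __init__(self):
        self.nodes = {}     # path -> owning session id (None for persistent nodes)
        self.counters = {}  # parent path -> next sequence number

    def create(self, path, session=None, ephemeral=False, sequential=False):
        if ephemeral and session is None:
            raise ValueError("an ephemeral node needs an owning session")
        if sequential:
            parent = path.rsplit("/", 1)[0] or "/"
            seq = self.counters.get(parent, 0)
            self.counters[parent] = seq + 1
            path = f"{path}{seq:010d}"  # zero-padded 10-digit suffix
        self.nodes[path] = session if ephemeral else None
        return path

    def close_session(self, session):
        # Ephemeral nodes disappear together with the session that created them.
        for path in [p for p, owner in self.nodes.items() if owner == session]:
            del self.nodes[path]


zk = ToyZooKeeper()
zk.create("/config")                                     # PERSISTENT
zk.create("/jobs/job-", sequential=True)                 # PERSISTENT_SEQUENTIAL
zk.create("/workers/w1", session="s1", ephemeral=True)   # EPHEMERAL
lock = zk.create("/locks/lock-", session="s1",
                 ephemeral=True, sequential=True)        # EPHEMERAL_SEQUENTIAL
print(lock)              # /locks/lock-0000000000
zk.close_session("s1")
print(sorted(zk.nodes))  # ['/config', '/jobs/job-0000000000']
```

Once session `s1` closes, only the two persistent nodes remain. That automatic cleanup is why ephemeral nodes suit locks: a crashed client cannot hold a lock forever.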

ZooKeeper vs. Redis Distributed Locking:

  • ZooKeeper

    • Advantages: 1. Well-encapsulated frameworks make it easy to implement. 2. Waiting clients form a queue for the lock, which makes lock acquisition more efficient
    • Disadvantages: Creating and deleting nodes performs poorly
  • Redis

    • Advantages: The SET and DEL commands are fast
    • Disadvantages: 1. The implementation is complicated: atomicity, accidental deletion of another client's lock, and lock timeout all have to be handled. 2. There is no queue of waiters; a client can only spin while waiting for the lock, which is inefficient
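The "queue waiting for lock" on the ZooKeeper side comes from the sequence numbers of ephemeral sequential nodes. The sketch below shows only the ordering logic, under the assumption that every contender has created a child named `lock-` plus a 10-digit sequence number under one lock node. The function names and the `lock-` prefix are hypothetical; no real ZooKeeper calls are made.

```python
def seq_of(name):
    """Sequence number taken from the 10-digit suffix of a sequential node name."""
    return int(name[-10:])


def lock_state(children, mine):
    """Return (holds_lock, node_to_watch) for the client that created `mine`."""
    queue = sorted(children, key=seq_of)
    position = queue.index(mine)
    if position == 0:
        return True, None               # lowest sequence number holds the lock
    return False, queue[position - 1]   # otherwise wait on the predecessor only


children = ["lock-0000000002", "lock-0000000000", "lock-0000000001"]
print(lock_state(children, "lock-0000000000"))  # (True, None)
print(lock_state(children, "lock-0000000002"))  # (False, 'lock-0000000001')

# When the holder releases the lock (or its session ends), its node goes away
# and the next client in line becomes the holder.
children.remove("lock-0000000000")
print(lock_state(children, "lock-0000000001"))  # (True, None)
```

Because each waiter watches only the node directly ahead of it, a release wakes a single client instead of all of them. A spinning Redis client has no equivalent of this.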

ZooKeeper’s data model

  • It resembles a tree data structure, or the directory hierarchy of a file system
  • ZooKeeper stores its data in nodes called ZNodes
  • ZNodes are referenced by path, and each ZNode has a unique path

Elements of a ZNode

  • Data: the data the ZNode stores
  • ACL: the access permissions of the ZNode, that is, which processes and IPs may access the node
  • Stat: metadata about the ZNode (data about the data)
  • ZooKeeper is meant for small amounts of state and configuration information, not large volumes of business data (a ZNode can hold at most 1 MB of data)

ZooKeeper basic operations

  • Create a node: create
  • Delete a node: delete
  • Check whether a node exists: exists
  • Get a node's data: getData
  • Set a node's data: setData
  • Get all child nodes of a node: getChildren

    exists, getData, and getChildren are read operations. When a ZooKeeper client issues a read operation, it can choose whether to set a watch
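To make the semantics of the six operations concrete, here is a toy path-keyed store. It is a sketch, not the ZooKeeper client: `ToyTree` and its snake_case method names are hypothetical, and it uses Python's built-in exceptions in place of ZooKeeper's own error codes. It assumes that a node needs an existing parent and that a node with children cannot be deleted.

```python
MAX_DATA_BYTES = 1024 * 1024  # the 1 MB per-node limit mentioned earlier


class ToyTree:
    """Toy store mirroring create/delete/exists/getData/setData/getChildren."""

    def __init__(self):
        self.nodes = {"/": b""}  # the root always exists

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    def create(self, path, data=b""):
        if path in self.nodes:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self.nodes:
            raise KeyError(f"parent does not exist: {path}")
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("data too large")
        self.nodes[path] = data

    def exists(self, path):
        return path in self.nodes

    def get_data(self, path):
        return self.nodes[path]

    def set_data(self, path, data):
        if path not in self.nodes:
            raise KeyError(f"no such node: {path}")
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("data too large")
        self.nodes[path] = data

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        names = []
        for p in self.nodes:
            rest = p[len(prefix):]
            if p.startswith(prefix) and rest and "/" not in rest:
                names.append(rest)  # direct children only
        return sorted(names)

    def delete(self, path):
        if self.get_children(path):
            raise ValueError(f"node has children: {path}")
        del self.nodes[path]


tree = ToyTree()
tree.create("/app")
tree.create("/app/config", b"v1")
tree.set_data("/app/config", b"v2")
print(tree.get_data("/app/config"))  # b'v2'
print(tree.get_children("/app"))     # ['config']
tree.delete("/app/config")
print(tree.exists("/app/config"))    # False
```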

ZooKeeper event notification

  • A watch can be thought of as a trigger registered on a particular ZNode
  • When the ZNode changes, that is, when create, delete, or setData is called on it, the corresponding event registered on the ZNode fires and the client that requested the watch receives an asynchronous notification
  • The event notification interaction proceeds as follows:

    • The client calls getData with the watch parameter set to true. The server receives the request, returns the node data, and records the watched ZNode path and its list of watchers in a hash table
    • When the watched ZNode is deleted, the server looks it up in the hash table, finds all watchers registered for that ZNode, notifies the clients asynchronously, and removes the corresponding key-value entry from the hash table
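The hash-table bookkeeping described above can be sketched as follows. This is a simplified model under stated assumptions: `WatchTable` and its methods are hypothetical names, and the callbacks run synchronously here, whereas ZooKeeper delivers notifications to the client asynchronously. The event strings follow ZooKeeper's `NodeDataChanged` and `NodeDeleted` event types.

```python
from collections import defaultdict


class WatchTable:
    """Toy server-side view: node data plus a path -> watchers hash table."""

    def __init__(self):
        self.data = {}
        self.watchers = defaultdict(list)

    def get_data(self, path, watcher=None):
        if watcher is not None:
            self.watchers[path].append(watcher)  # register the watch
        return self.data[path]

    def _fire(self, event, path):
        # pop() removes the key-value entry, so each watch fires at most once.
        for callback in self.watchers.pop(path, []):
            callback(event, path)

    def set_data(self, path, value):
        self.data[path] = value
        self._fire("NodeDataChanged", path)

    def delete(self, path):
        del self.data[path]
        self._fire("NodeDeleted", path)


server = WatchTable()
server.set_data("/cfg", b"v1")

events = []
server.get_data("/cfg", watcher=lambda event, path: events.append((event, path)))
server.set_data("/cfg", b"v2")  # fires the watch and removes it
server.set_data("/cfg", b"v3")  # no watch left, nothing fires
print(events)  # [('NodeDataChanged', '/cfg')]
```

A client that wants to keep following a node has to set a new watch on its next read, because the entry is removed once it has fired.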

ZooKeeper's consistency

  • A ZooKeeper cluster has a one-master, multi-slave structure (one Leader and several Followers)
  • A data update is applied to the master first and then synchronized to the slaves
  • Reads can be served directly by any node
  • The ZAB protocol keeps data consistent between the master and slave nodes

ZAB protocol

  • ZAB (ZooKeeper Atomic Broadcast): handles crash recovery of the ZooKeeper cluster and master-slave data synchronization
  • ZAB has three node states:

    • Looking: the election state
    • Following: the state of a Follower (slave node)
    • Leading: the state of the Leader (master node)
  • Maximum ZXID: the most recent transaction ID stored locally on a node, composed of an epoch and a counter
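A ZXID is a 64-bit number with the epoch in the high 32 bits and the counter in the low 32 bits. The helper names below are hypothetical; the snippet only shows how the two parts combine and why a transaction from a newer epoch always compares as larger.

```python
def make_zxid(epoch, counter):
    """Pack an epoch and a counter into one 64-bit transaction ID."""
    return (epoch << 32) | (counter & 0xFFFFFFFF)


def split_zxid(zxid):
    """Return (epoch, counter) for a ZXID."""
    return zxid >> 32, zxid & 0xFFFFFFFF


zxid = make_zxid(epoch=2, counter=5)
print(hex(zxid))         # 0x200000005
print(split_zxid(zxid))  # (2, 5)

# The first transaction of epoch 3 is newer than anything from epoch 2.
print(make_zxid(3, 0) > make_zxid(2, 0xFFFFFFFF))  # True
```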

ZAB cluster crash recovery

  • When the ZooKeeper primary server (Leader) goes down, the cluster goes through crash recovery, which has three stages:

    • 1. Leader election

      • Every node in the cluster is in the Looking state, and each sends a vote to the other nodes containing its own server ID and its latest transaction ID (ZXID)
      • Each node compares its own ZXID with the ZXIDs it receives from other nodes. If another node's ZXID is larger, meaning that node's data is newer, it casts a new vote for the node with the largest known ZXID
      • After each round of voting, every server counts the votes and checks whether any node has more than half of them. Such a node becomes the prospective Leader and enters the Leading state, while the other nodes enter the Following state
    • 2. Discovery

      • This stage finds the latest ZXID and transaction log among the Followers, and guards against more than one Leader having been elected in unexpected situations
      • The Leader receives the latest epoch from each Follower, takes the largest, adds 1 to produce a new epoch, and distributes it to every Follower
      • On receiving the new epoch, each Follower returns an ACK to the Leader together with its largest ZXID and its transaction history. The Leader selects the largest ZXID and updates its own history log accordingly
    • 3. Synchronization

      • The Leader synchronizes the latest transaction history it has collected to all Followers in the cluster. Only once more than half of the Followers have synchronized successfully does the prospective Leader become the established Leader, at which point crash recovery is complete
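The voting rule and the epoch bump can be condensed into a few lines. This is a deliberately simplified sketch, not ZooKeeper's election implementation: it assumes every reachable server sees every ballot, it breaks ZXID ties by the higher server ID (an assumption not stated in the text above), and `elect_leader` and `next_epoch` are hypothetical names.

```python
def elect_leader(last_zxid, ensemble_size):
    """last_zxid maps each reachable server ID to its latest ZXID.

    Returns the elected server ID, or None when no majority is possible.
    """
    # Round 1: every server votes for itself.
    ballots = {sid: (zxid, sid) for sid, zxid in last_zxid.items()}
    # Round 2: after exchanging ballots, each server switches to the best one
    # it has seen (largest ZXID first, then largest server ID).
    best = max(ballots.values())
    ballots = {sid: best for sid in ballots}
    # A candidate wins only with votes from more than half of the ensemble.
    tally = sum(1 for ballot in ballots.values() if ballot == best)
    return best[1] if tally > ensemble_size // 2 else None


def next_epoch(follower_epochs):
    """Discovery stage: the new epoch is the largest reported epoch plus 1."""
    return max(follower_epochs) + 1


print(elect_leader({1: 0x100000005, 2: 0x100000007, 3: 0x100000007}, ensemble_size=3))  # 3
print(elect_leader({1: 0x100000005}, ensemble_size=3))  # None (1 of 3 is not a majority)
print(next_epoch([3, 4, 4]))  # 5
```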

ZAB master-slave data synchronization

  • Broadcast: during normal operation, ZooKeeper broadcasts every data update from the Leader to all Followers:

    • The client sends a write request to any Follower
    • The Follower forwards the write request to the Leader
    • The Leader uses a two-phase commit: first persist the proposal to the log, then commit the data. It starts by broadcasting a PROPOSAL to the Followers
    • Each Follower receives the PROPOSAL, writes it to its log, and returns an ACK to the Leader
    • Once the Leader has received ACKs from more than half of the nodes, it returns success to the client and broadcasts a COMMIT to the Followers
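The two phases above can be sketched with a toy Leader and a few Followers. The class and function names are hypothetical, and the model assumes the Leader counts itself toward the majority and that a Follower which is down simply never ACKs. Networking, ordering, and retries are left out.

```python
class Follower:
    def __init__(self, up=True):
        self.up = up
        self.log = {}        # zxid -> value, written when a PROPOSAL arrives
        self.committed = []  # values applied when a COMMIT arrives

    def on_proposal(self, zxid, value):
        if not self.up:
            return False     # no ACK from a follower that is down
        self.log[zxid] = value
        return True          # ACK

    def on_commit(self, zxid):
        if self.up and zxid in self.log:
            self.committed.append(self.log[zxid])


def broadcast(followers, zxid, value):
    """Phase 1: send the PROPOSAL and collect ACKs. Phase 2: COMMIT on a majority."""
    ensemble = len(followers) + 1  # followers plus the leader
    acks = 1 + sum(f.on_proposal(zxid, value) for f in followers)
    if acks <= ensemble // 2:
        return False               # no majority, the write is not committed
    for f in followers:
        f.on_commit(zxid)
    return True


followers = [Follower(), Follower(up=False)]
print(broadcast(followers, zxid=1, value="x=1"))  # True (2 of 3 nodes acknowledged)
print(followers[0].committed)                     # ['x=1']

all_down = [Follower(up=False), Follower(up=False)]
print(broadcast(all_down, zxid=2, value="x=2"))   # False
```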

  • Data consistency models:

    • Strong consistency
    • Weak consistency
    • Sequential consistency: ZooKeeper relies on transaction IDs and version numbers to ensure that data is updated and read in order

ZooKeeper application scenarios

  • Distributed locks: built on ZooKeeper's ephemeral sequential nodes
  • Service registration and discovery: ZNodes and watchers are used to implement distributed service registration and discovery, as in Dubbo
  • Sharing configuration and state information: Codis, a distributed solution for Redis, uses ZooKeeper to store its data routing table and the metadata of codis-proxy nodes. Commands issued by codis-config are also propagated through ZooKeeper to every live codis-proxy
  • High availability: Kafka, HBase, and Hadoop all rely on ZooKeeper to synchronize node information and achieve high availability

Running ZooKeeper with Docker

  1. Write docker-compose.yml:

        zoo:
          image: zookeeper
          restart: always
          hostname: zoo
          ports:
            - 2181:2181
          environment:
            ZOO_MY_ID: 1
            # format: server.<id>=<hostname or IP>:2888:3888
            ZOO_SERVERS: server.1=zoo:2888:3888

  2. Run docker-compose up -d

ZooKeeper works in three modes

  • Standalone mode: a single instance, which is a single point of failure
  • Cluster mode: the ZooKeeper cluster is deployed across multiple servers
  • Pseudo-cluster mode: multiple ZooKeeper instances run on the same server with staggered port numbers; the server itself is still a single point of failure

ZooKeeper has three port numbers

  • 2181: the port on which the ZooKeeper cluster listens for client connections
  • 3888: used for Leader election
  • 2888: used for communication between machines within the cluster (data synchronization between the Leader and Followers; the Leader listens on this port)
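The last two ports appear together in each `server.<id>=<host>:<port>:<port>` entry of the cluster configuration, with the synchronization port first and the election port second. The small parser below makes that mapping explicit. `parse_server_line` and the dictionary keys are hypothetical names, and the handling of an optional `;<client port>` suffix is an assumption about newer configuration formats.

```python
def parse_server_line(line):
    """Split a 'server.<id>=<host>:<sync port>:<election port>' entry."""
    key, value = line.strip().split("=", 1)
    server_id = int(key.split(".", 1)[1])
    value = value.split(";", 1)[0]  # drop an optional ';<client port>' suffix
    host, sync_port, election_port = value.split(":")[:3]
    return {
        "id": server_id,
        "host": host,
        "sync_port": int(sync_port),          # Leader/Follower data synchronization
        "election_port": int(election_port),  # Leader election
    }


print(parse_server_line("server.1=zoo:2888:3888"))
# {'id': 1, 'host': 'zoo', 'sync_port': 2888, 'election_port': 3888}

# In pseudo-cluster mode every instance shares one host, so the ports are staggered.
print(parse_server_line("server.2=localhost:2889:3889")["sync_port"])  # 2889
```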