ZooKeeper data structure and operations

As mentioned earlier, ZooKeeper is an open source distributed coordination service framework. It is essentially a small distributed storage system that organizes data in a tree, much like the directory tree of a file system, and manages the nodes of that tree so that the state of the stored data can be maintained and monitored. Cluster management is then built on monitoring changes in the state of that data.

ZooKeeper = file system + listening notification mechanism + ACL

In this section we'll focus on the file-system-like data structure.

1. ZooKeeper data structure

ZooKeeper maintains a data structure similar to a file system (as shown in the official diagram). Each entry in the tree (such as APP1) is called a znode (directory node). Just like in a file system, we can create, read, update and delete a znode or the child znodes under it. The main difference is that a znode that acts as a directory can also store data.

1.1 Znode type

There are three types of ZNodes:

(1) Persistent node. The node is persisted and remains until it is explicitly deleted.

(2) Ephemeral node. After the client is disconnected, ZooKeeper automatically deletes the ephemeral node.

(3) Sequential node. Each time a sequential node is created, ZooKeeper automatically appends a zero-padded 10-digit number to the end of the path. The number comes from a counter that starts at 0 under a new parent node and can go up to 2147483647 (2^31-1).

The counter is kept separately for each parent node, increases monotonically, and is assigned on the server side through the ZooKeeper Leader instance.
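To make the naming rule concrete, here is a minimal Python sketch of how such a suffix can be formed. It only imitates the formatting and does not talk to a ZooKeeper server; the function name sequential_name and the constant MAX_SEQUENCE are made up for this illustration.

```python
# Toy illustration of the sequential-node naming rule.
# Hypothetical helper names; no ZooKeeper server is involved.

MAX_SEQUENCE = 2**31 - 1  # 2147483647, the largest signed 32-bit integer


def sequential_name(prefix, counter):
    """Append the counter to the prefix as a zero-padded 10-digit number."""
    if not 0 <= counter <= MAX_SEQUENCE:
        raise ValueError("counter outside the signed 32-bit range")
    return f"{prefix}{counter:010d}"


print(sequential_name("/test/s", 1))             # /test/s0000000001
print(sequential_name("/test/s", MAX_SEQUENCE))  # /test/s2147483647
```

The largest value, 2147483647, is itself 10 digits long, which is why a 10-digit suffix is enough.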

Combining these types, a znode actually comes in four forms, with persistent as the default:

(1) PERSISTENT: a persistent node, created with a plain create and no flag, for example create /test/a "hello"

(2) PERSISTENT_SEQUENTIAL: a persistent sequential node (for example s0000000001), created with create -s

(3) EPHEMERAL: a temporary node, created with create -e

(4) EPHEMERAL_SEQUENTIAL: a temporary sequential node (for example s0000000001), created with create -s -e

ZooKeeper 3.5.x introduced container nodes and TTL nodes (marked unstable):

(1) A container node exists to hold child nodes. Once the container has no child nodes left, the server will delete it at some point in the future. The check runs as a scheduled task, every 60 seconds by default.

(2) TTL nodes are disabled by default and must be enabled through configuration (the zookeeper.extendedTypesEnabled system property). If a TTL node has no child nodes and is not modified within the specified time, the server will delete it.

2. Znode example

Let's create some nodes to get a deeper understanding of the ZK data structure. Run bin/zkCli.sh to connect to the ZK service and perform the following operations.

2.1 PERSISTENT Persistent node

The creation format is create followed by the path and the data, with no flag.

Create a persistent /test node that holds "hello". After the node is created, use ls / to view it. Because the node is persistent, we can disconnect the client, reconnect, and run ls / again to confirm that it still exists.

Of course we can create any number of child nodes under the /test node:

(1) ls -R traverses the directory structure of the nodes;

(2) get obtains the data of a node.

In addition, the data of a node can be modified with set, and a node can be deleted with delete.
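The operations above can be imitated with a small in-memory model. The following Python sketch is only a toy stand-in for the znode tree, not the ZooKeeper client API: the class ToyTree and its method names are invented for this example, and the error messages are illustrative.

```python
class ToyTree:
    """In-memory stand-in for the znode tree: a dict from path to data."""

    def __init__(self):
        self.nodes = {"/": ""}  # the root always exists

    @staticmethod
    def parent(path):
        # "/test/a" -> "/test", "/test" -> "/"
        return path.rsplit("/", 1)[0] or "/"

    def create(self, path, data=""):
        if path in self.nodes:
            raise KeyError(f"Node already exists: {path}")
        if self.parent(path) not in self.nodes:
            raise KeyError(f"Parent does not exist: {self.parent(path)}")
        self.nodes[path] = data

    def get(self, path):
        return self.nodes[path]

    def set(self, path, data):
        if path not in self.nodes:
            raise KeyError(f"Node does not exist: {path}")
        self.nodes[path] = data

    def children(self, path):
        prefix = path.rstrip("/") + "/"
        return sorted(
            p for p in self.nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )

    def delete(self, path):
        # In this toy model a node with children cannot be deleted.
        if self.children(path):
            raise ValueError(f"Node not empty: {path}")
        del self.nodes[path]

    def ls_recursive(self, path="/"):
        """List the node and everything below it, like a recursive ls."""
        result = [path]
        for child in self.children(path):
            result.extend(self.ls_recursive(child))
        return result


tree = ToyTree()
tree.create("/test", "hello")
tree.create("/test/a", "1")
tree.set("/test", "world")
print(tree.get("/test"))
print(tree.ls_recursive("/"))
```

Keying the dictionary by the full path mirrors the idea that every znode is identified by its path.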

2.2 PERSISTENT_SEQUENTIAL Persistent sequential node

The creation format is create -s

Note:

(1) If the path has a parent node, the parent node must be created first. Otherwise, an error is reported.

(2) To create sequential children under /seq, write the path with a trailing slash: create -s /seq/. Without the slash, seq is treated as a name prefix and the number is appended directly to it.

(3) Other modification and deletion operations are the same as those for persistent nodes.
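The difference between the two forms in note (2) can be sketched in Python. This is a toy model under stated assumptions: the class SequenceNamer is hypothetical, its counter starts at 0 for every parent, and it counts only sequential creations, which is a simplification of what a real server tracks.

```python
from collections import defaultdict


class SequenceNamer:
    """Toy model: one monotonically increasing counter per parent path."""

    def __init__(self):
        self.counters = defaultdict(int)

    def create_sequential(self, path):
        # Everything before the last "/" is the parent; the rest is the prefix.
        parent = path.rsplit("/", 1)[0] or "/"
        number = self.counters[parent]
        self.counters[parent] += 1
        return f"{path}{number:010d}"


namer = SequenceNamer()
print(namer.create_sequential("/seq/"))  # child of /seq:  /seq/0000000000
print(namer.create_sequential("/seq/"))  # next child:     /seq/0000000001
print(namer.create_sequential("/seq"))   # child of /:     /seq0000000000
```

With the trailing slash the number becomes the whole name of a child of /seq; without it, seq is only the prefix of a new node under the root.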

2.3 EPHEMERAL Temporary node

The creation format is create -e

(1) Press Ctrl+C (on a Mac) to disconnect the client, then reconnect and check with ls /: the /ephemeral node has been deleted automatically after the disconnection.

(2) The node life cycle is bound to the current session.

(3) The node is not deleted the instant the connection is lost. It is deleted when the session expires, which takes about 30 seconds here (the session timeout). If you reconnect immediately you can still see the newly created temporary node, but it will be gone after about 30 seconds.

(4) A temporary node cannot have child nodes. Trying to create /ephemeral/test fails with: Ephemerals cannot have children: /ephemeral/test
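The rules for temporary nodes can be summarized in a short Python sketch. It is a toy model rather than the real server logic: the class ToySessions and its methods are invented for illustration, and session expiry is triggered by hand instead of by a timeout.

```python
class ToySessions:
    """Toy model of ephemeral ownership: path -> owning session id.

    A session id of 0 marks a persistent node.
    """

    def __init__(self):
        self.owner = {}

    def create(self, path, session_id=0):
        parent = path.rsplit("/", 1)[0]
        # A node owned by a session is ephemeral and may not get children.
        if self.owner.get(parent, 0) != 0:
            raise ValueError(f"Ephemerals cannot have children: {path}")
        self.owner[path] = session_id

    def expire(self, session_id):
        """Delete every ephemeral node that belongs to the expired session."""
        for path in [p for p, s in self.owner.items() if s == session_id]:
            del self.owner[path]


zk = ToySessions()
zk.create("/test")                    # persistent
zk.create("/ephemeral", session_id=7)  # ephemeral, owned by session 7
zk.expire(7)                          # session 7 times out
print(sorted(zk.owner))               # ['/test']
```

The persistent node survives the expiry, while the node owned by session 7 disappears with it.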

2.4 EPHEMERAL_SEQUENTIAL Temporary sequential node

The creation format is create -e -s

2.5 Container Container node

A container node exists to hold child nodes. Once the container has no child nodes left, the server will delete it at some point in the future. The check runs as a scheduled task, every 60 seconds by default.

The creation format is create -c

(1) Create a container node with create -c;

(2) Create some child nodes in the container node and then delete the child nodes;

(3) After about 60 seconds, the container node is automatically deleted because it has no children.

3. Znode properties

A znode has the features of both a file and a directory: it can hold data like a file, and it can appear as a component of a path like a directory. Each znode maintains:

Data: data information stored in a Znode.

ACL: records access permissions of ZNodes.

Stat: Contains various metadata for Znode, such as transaction ID, version number, timestamp, size, and so on.

Child: references to the child nodes of the current node.

Breaking these categories down further gives the following attributes:

cZxid: the transaction ID of the change that created the node

mZxid: the transaction ID of the change that last modified the node

ctime: the time when the node was created

mtime: the time when the node was last modified

pZxid: the transaction ID of the last change to the node's child list. pZxid is updated only when the child list changes (a child is added or removed), not when the content of a child changes

cversion: the version number of the child list, that is, how many times the children of the node have changed

dataVersion: the version number of the node's data

dataLength: the length of the node's data

numChildren: the number of child nodes

ephemeralOwner: the session ID of the session that created the node, if it is a temporary node. For a persistent node the value is 0.
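To see which attribute moves on which operation, here is a small Python sketch. It is a simplified toy model, not ZooKeeper's implementation: the classes Stat and ToyStatTree are invented for this example, only a few of the attributes are tracked, and the transaction counter simply counts up from 1.

```python
from dataclasses import dataclass


@dataclass
class Stat:
    czxid: int             # transaction that created the node
    mzxid: int             # transaction that last modified the data
    pzxid: int             # transaction that last changed the child list
    cversion: int = 0      # number of child-list changes
    data_version: int = 0  # number of data changes
    num_children: int = 0


class ToyStatTree:
    def __init__(self):
        self.zxid = 0
        self.stats = {"/": Stat(0, 0, 0)}

    def _next_zxid(self):
        self.zxid += 1
        return self.zxid

    def create(self, path):
        # Assumes the parent already exists.
        zxid = self._next_zxid()
        self.stats[path] = Stat(czxid=zxid, mzxid=zxid, pzxid=zxid)
        parent = self.stats[path.rsplit("/", 1)[0] or "/"]
        parent.pzxid = zxid       # the child list of the parent changed
        parent.cversion += 1
        parent.num_children += 1

    def set_data(self, path):
        zxid = self._next_zxid()
        stat = self.stats[path]
        stat.mzxid = zxid         # only the node itself is touched
        stat.data_version += 1


tree = ToyStatTree()
tree.create("/test")      # transaction 1
tree.create("/test/a")    # transaction 2: changes the child list of /test
tree.set_data("/test/a")  # transaction 3: changes only the data of /test/a
print(tree.stats["/test"])
print(tree.stats["/test/a"])
```

After the three operations, /test still has pzxid 2, because the third transaction modified the content of a child and not the child list.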

3.1 Zxid

Each operation that changes the state of a ZooKeeper node is stamped with a Zxid (ZooKeeper transaction ID), and these stamps are globally ordered. That is, each change to a node produces a unique Zxid. If the value of Zxid1 is less than the value of Zxid2, the event corresponding to Zxid1 occurred before the event corresponding to Zxid2. Besides pZxid, described above, each ZooKeeper node records two Zxid values for its own changes, cZxid and mZxid.

(1) cZxid: the Zxid of the change that created the node

(2) mZxid: the Zxid of the change that last modified the node

In the implementation, the Zxid is a 64-bit number. The high 32 bits are the epoch, which identifies whether the Leader has changed: each time a Leader is elected, it gets a new epoch. The low 32 bits are an increasing counter.
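The 64-bit layout can be illustrated with plain bit arithmetic. The following Python sketch uses the hypothetical helper names split_zxid and make_zxid, and the sample value is made up for the example.

```python
def split_zxid(zxid):
    """Split a 64-bit Zxid into (epoch, counter)."""
    epoch = zxid >> 32             # high 32 bits
    counter = zxid & 0xFFFFFFFF    # low 32 bits
    return epoch, counter


def make_zxid(epoch, counter):
    """Combine an epoch and a counter into a 64-bit Zxid."""
    return (epoch << 32) | counter


print(split_zxid(0x200000005))     # (2, 5)
print(hex(make_zxid(2, 5)))        # 0x200000005
```

Because the epoch occupies the high bits, any Zxid from a later epoch compares greater than every Zxid from an earlier one, which keeps the ordering intact across Leader changes.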

3.2 Version number

A version number records how many times the node's data, its child list, or its permission information has been modified. If a version of a node is 1, that part of the node has changed once since its creation. Each modifying operation on a node increases the corresponding version number. Each node maintains three version numbers:

(1) dataVersion: the version number of the node's data

(2) cversion: the version number of the node's child list

(3) aclVersion (aversion): the version number of the node's ACL

ZooKeeper maintains these version numbers so that clients can validate caches and coordinate updates. For example, dataVersion increases each time the data in the znode is updated.
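One way version numbers help coordinate updates is a conditional write: the update succeeds only if the node still has the version the client last saw. The Python sketch below is a toy model of that idea, with the invented names VersionedNode and BadVersion. It follows the convention of the ZooKeeper client API that an expected version of -1 matches any version, but it is not that API.

```python
class BadVersion(Exception):
    """Raised when the expected version does not match the current one."""


class VersionedNode:
    def __init__(self, data):
        self.data = data
        self.version = 0  # plays the role of dataVersion

    def set(self, data, expected_version=-1):
        # -1 means "do not check"; any other value must match exactly.
        if expected_version != -1 and expected_version != self.version:
            raise BadVersion(
                f"expected {expected_version}, current {self.version}"
            )
        self.data = data
        self.version += 1
        return self.version


node = VersionedNode("hello")
seen = node.version           # a client reads the node at version 0
node.set("from-client-a", seen)  # succeeds, version becomes 1
try:
    node.set("from-client-b", seen)  # stale version 0 is rejected
except BadVersion as err:
    print("rejected:", err)
```

The second writer has to read the node again and retry, so concurrent updates cannot silently overwrite each other.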

4. Summary

ZK data structure features:

(1) Each subdirectory entry is called a znode, and each znode is uniquely identified by its path. For example, the znode test1 under /test is identified by /test/test1.

(2) A znode can have child nodes, and each znode can store data.

(3) A znode can be a temporary node. Once the session between the client and the server expires, the znode is deleted (note that after a disconnection it is not deleted immediately, but with a delay of about 30 seconds).

(4) If the data in a znode is modified, the change can be monitored and the corresponding clients notified (this is the listening notification mechanism, which will be introduced later).