Preface

We’ve talked about many locks before, such as synchronized, ReentrantLock, AQS, and so on.

Those locks are used within a single, standalone application (one JVM). As we all know, our applications are distributed now, so those locks alone are no longer enough.

We need distributed locks. There are generally three ways to implement them: a MySQL database, Redis, or ZooKeeper.

The ZooKeeper and Redis implementations are the ones commonly used by major companies, and each has its own advantages and disadvantages. Today we focus on the ZooKeeper distributed lock; we’ll cover Redis distributed locks later.

ZooKeeper profile

ZooKeeper is an open-source distributed coordination service and a typical solution for distributed data consistency.

Based on ZooKeeper, distributed applications can implement features such as data publishing/subscription, load balancing, naming services, distributed coordination/notification, cluster management, master election, distributed locks, and distributed queues.

Two ZK concepts that are key to implementing distributed locks

The ZNode node

ZK’s storage structure is similar to a Windows file system, except that in Windows a directory cannot hold data itself; only files can.

In ZK, every level of the tree is called a ZNode, and they are all alike: there is no distinction between directories and files, only between parent nodes and child nodes.

Data can be stored at any level of ZNode.

(A ZooKeeper client GUI displays this tree of ZNodes; the original post included a screenshot here.)

ZNode node types

  • Temporary (ephemeral) node

    After the client disconnects from ZooKeeper, the node is automatically deleted.

  • Temporary ordered (ephemeral sequential) node

    After the client disconnects from ZooKeeper, the node is automatically deleted; these nodes are also numbered in creation order.

  • Persistent node

    After the client disconnects from ZooKeeper, the node still exists.

  • Persistent ordered (persistent sequential) node

    After the client disconnects from ZooKeeper, the node still exists; these nodes are also numbered in creation order (a code sketch showing all four types follows this list).
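For reference, here is a minimal sketch, using the plain ZooKeeper Java client, of how each node type is created. The server address, paths, and class name are example values, and in real code you would wait for the connection event instead of sleeping:

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZNodeTypesDemo {
    public static void main(String[] args) throws Exception {
        // Assumes a local ZooKeeper server; 5000 ms session timeout, no-op default watcher
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 5000, event -> { });
        Thread.sleep(1000); // crude wait for the session to be established (sketch only)

        // Persistent node: survives after the client session closes
        zk.create("/demo-persistent", "data".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Persistent ordered node: the server appends an increasing sequence number
        zk.create("/demo-persistent-seq-", "data".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);

        // Temporary (ephemeral) node: deleted automatically when the session ends
        zk.create("/demo-ephemeral", "data".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Temporary ordered node: ephemeral + sequence number, e.g. /demo-ephemeral-seq-0000000003
        String path = zk.create("/demo-ephemeral-seq-", "data".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("Created: " + path);

        zk.close(); // the two ephemeral nodes disappear at this point
    }
}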

Watch monitoring mechanism

A client can register a Watch on a ZNode; ZooKeeper then notifies the client of changes to that node, mainly: the node being created, the node being deleted, the node’s data changing, and the node’s children changing.
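As a small illustration (plain ZooKeeper Java client; the path /zkLock/lock and the class name are example values invented for this post), this is roughly how a client registers a one-shot watch and reacts when the node it cares about changes:

import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class WatchDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 5000, event -> { });

        // exists() checks the node and registers a one-shot watch on it;
        // the watch fires once, on the next create/delete/data change of this node
        zk.exists("/zkLock/lock", event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                System.out.println("Lock node deleted - we could try to grab the lock now");
            } else if (event.getType() == Watcher.Event.EventType.NodeCreated) {
                System.out.println("Lock node created by some client");
            }
        });

        Thread.sleep(60_000); // keep the session alive long enough to receive a notification
        zk.close();
    }
}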

Incorrect implementation of distributed locking

Lock principle

When multiple clients attempt to create a temporary node at the same time, the first client that successfully creates the node obtains the lock, while the other clients fail to obtain the lock.

It’s like a Double Eleven flash sale: 100,000 people try to buy the same item at the same instant, and whoever is fastest gets it.

The process of obtaining locks

Here we use temporary nodes.

  • Four clients try to create the same temporary node at the same time.
  • Whoever creates the temporary node first holds the lock (here, the temporary node represents the lock).
  • The other clients find that the node already exists, conclude that someone else holds the lock, and start listening (watching) for changes to that temporary node.

The process of releasing locks

  • The client that holds the lock disconnects from ZooKeeper.
  • The temporary node is automatically deleted because it is ephemeral.
  • The watches of the other clients fire when the temporary node is deleted, and those clients all rush to create it again (that is, to grab the lock). A minimal code sketch of this flow follows below.
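Here is a minimal sketch of this naive scheme using the plain ZooKeeper Java client. The class and the node name /naiveLock are made up for this example, and it deliberately shows the flawed approach described above, not something to copy into production:

import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class NaiveZkLock {

    private static final String LOCK_PATH = "/naiveLock"; // example node name
    private final ZooKeeper zk;

    public NaiveZkLock(ZooKeeper zk) {
        this.zk = zk;
    }

    // Block until we manage to create the temporary node (i.e. grab the lock)
    public void lock() throws Exception {
        while (true) {
            try {
                // Whoever creates the ephemeral node first holds the lock
                zk.create(LOCK_PATH, new byte[0],
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
                return;
            } catch (KeeperException.NodeExistsException e) {
                // Someone else holds it: watch the node and wait until it is deleted
                CountDownLatch latch = new CountDownLatch(1);
                if (zk.exists(LOCK_PATH, event -> {
                    if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                        latch.countDown();
                    }
                }) == null) {
                    continue; // the holder vanished between create() and exists(): retry now
                }
                latch.await(); // every waiting client wakes up here -> herd effect
            }
        }
    }

    // Deleting the node (or simply closing the session) releases the lock
    public void unlock() throws Exception {
        zk.delete(LOCK_PATH, -1);
    }
}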

Analysis of existing problems

When the temporary node is removed, the other three clients rush to create a new one. Three clients is a small number, so there is no performance problem.

But what if there are a thousand clients listening on that node? The moment it is deleted, a thousand clients wake up and a thousand clients try to create the node at the same time.

Only one of them can succeed, yet a thousand clients had to compete for it.

This puts a lot of strain on ZooKeeper and wastes the clients’ thread resources: 999 of them woke up for nothing.

This is known as the herd effect, also called the thundering herd problem.

One node is released and deleted, yet a thousand clients get notified. That is obviously wasteful.

Correct implementation of distributed locking

Here we use temporary ordered (ephemeral sequential) nodes.

Lock principle

Multiple clients compete for the lock by each creating its own ordered node; whoever holds the node with the smallest sequence number obtains the lock.

It’s like waiting in line to buy something. Whoever is first in line buys first.

The process of acquiring locks

  • Clients A, B, C, and D all try to grab the lock.
  • A arrives first and creates temporary ordered node 000001. It sees that its node is the smallest, so it successfully obtains the lock.
  • B comes next and creates temporary ordered node 000002. It sees a smaller node ahead of it, so it fails to obtain the lock and instead starts listening to A’s node, waiting for A to release the lock.
  • The same goes for C and D (a code sketch of this scheme follows below).
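Below is a simplified, hedged sketch of this queue-based scheme using the plain ZooKeeper Java client; the class name is made up for this example, and the parent node /zkLock is assumed to already exist as a persistent node. Curator’s InterProcessMutex, used in the code section later, implements essentially the same idea plus re-entrancy and proper error handling:

import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class FairZkLock {

    private static final String LOCK_ROOT = "/zkLock"; // example; must exist as a persistent node
    private final ZooKeeper zk;
    private String myNode; // e.g. /zkLock/seq-0000000002

    public FairZkLock(ZooKeeper zk) {
        this.zk = zk;
    }

    public void lock() throws Exception {
        // 1. Create my own temporary ordered node under the lock root
        myNode = zk.create(LOCK_ROOT + "/seq-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        while (true) {
            // 2. List all waiters and sort them by sequence number
            List<String> children = zk.getChildren(LOCK_ROOT, false);
            Collections.sort(children);

            String myName = myNode.substring(LOCK_ROOT.length() + 1);
            int myIndex = children.indexOf(myName);

            // 3. If my node is the smallest, I hold the lock
            if (myIndex == 0) {
                return;
            }

            // 4. Otherwise watch only the node directly in front of me (no herd effect)
            String previous = LOCK_ROOT + "/" + children.get(myIndex - 1);
            CountDownLatch latch = new CountDownLatch(1);
            if (zk.exists(previous, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    latch.countDown();
                }
            }) == null) {
                continue; // my predecessor is already gone: re-check the children right away
            }
            latch.await();
        }
    }

    public void unlock() throws Exception {
        // Deleting my node (or losing the session) releases the lock and wakes only my successor
        zk.delete(myNode, -1);
    }
}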

The process of releasing locks

  • After finishing its task, client A disconnects from ZooKeeper. Its temporary ordered node is automatically deleted, which releases the lock.
  • Client B’s watch has been keeping a close eye on A. When it sees that A has released the lock, B immediately checks whether it is now the smallest node; if so, it obtains the lock.
  • Likewise, C listens to B, and D listens to C.

Rationality analysis

When A releases the lock, only B is woken up and acquires it. C and D are unaffected, because the node each of them is watching (B’s node and C’s node, respectively) has not changed.

Likewise, when B releases the lock, only C is woken up and acquires it. D is unaffected, because C’s node remains unchanged.

And so on for D.

Releasing the lock wakes up only the next client in line, not all of the waiting clients, so this scheme avoids the herd effect.

PS: creating the temporary node = acquiring the lock; deleting the temporary node = releasing the lock.

Code implementation

We’ll go straight to a well-tested, packaged utility. If you write the lock yourself and don’t test it thoroughly, a failure in production is a serious problem.

So we’ll use Curator, which has already implemented distributed locks for us; we use them much like a ReentrantLock, which is very simple.

If you’re interested, dig into Curator’s source code yourself.

Pom file configuration

<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>4.0.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.apache.zookeeper</groupId>
            <artifactId>zookeeper</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.4.10</version>
</dependency>

Java code

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.RetryNTimes;
import java.util.concurrent.TimeUnit;

/** Curator distributed lock test */
public class CuratorDistrLockTest implements Runnable {

    // ZooKeeper's address
    private static final String ZK_ADDRESS = "127.0.0.1:2181";

    private static final String ZK_LOCK_PATH = "/zkLock";

    static CuratorFramework client = null;

    static {
        // Connect to ZK. If the connection fails, retry every 5000 ms, up to 10 times
        client = CuratorFrameworkFactory.newClient(ZK_ADDRESS,
                new RetryNTimes(10, 5000));
        client.start();
    }

    private static void curatorLockTest() {

        InterProcessMutex lock = new InterProcessMutex(client, ZK_LOCK_PATH);
        boolean acquired = false;
        try {
            // Try to acquire the lock, waiting up to 6 * 1000 seconds
            acquired = lock.acquire(6 * 1000, TimeUnit.SECONDS);
            if (acquired) {
                System.out.println("====== " + Thread.currentThread().getName() + " grabbed the lock ======");
                // Execute the business logic
                Thread.sleep(15000);
                System.out.println(Thread.currentThread().getName() + " finished its task.");
            }
        } catch (Exception e) {
            System.out.println("Business exception");
        } finally {
            // Only release the lock if we actually acquired it
            if (acquired) {
                try {
                    lock.release();
                } catch (Exception e) {
                    System.out.println("Lock release exception");
                }
            }
        }
    }

    public static void main(String[] args) {
        // Use two threads to simulate two clients competing for the lock
        // (both threads share the same static Curator client; InterProcessMutex
        // tracks lock ownership per thread)
        new Thread(new CuratorDistrLockTest()).start();

        new Thread(new CuratorDistrLockTest()).start();
    }

    @Override
    public void run() {
        curatorLockTest();
    }
}

The execution result

While the test is running, if you watch the ZooKeeper client, two temporary ordered nodes appear under the /zkLock node.

When we refresh the ZooKeeper client after the test finishes, the temporary ordered nodes under /zkLock have been automatically deleted.
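If you prefer to check this programmatically rather than through a GUI client, here is a small snippet (a separate helper class invented for this post) that uses the same Curator setup to list whatever currently sits under /zkLock:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.RetryNTimes;

public class ZkLockInspector {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new RetryNTimes(10, 5000));
        client.start();

        // Print whatever lock children currently exist under /zkLock
        // (InterProcessMutex names them internally; the exact format is a Curator detail)
        for (String child : client.getChildren().forPath("/zkLock")) {
            System.out.println("/zkLock/" + child);
        }

        client.close();
    }
}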

Conclusion

Why don’t we use persistent nodes? Because a persistent node has to be deleted by the client explicitly; otherwise it stays in ZooKeeper forever.

If a client acquired the lock and then crashed before releasing it, the lock would never be released, and no other client could ever obtain it.

The locking functionality ZooKeeper provides is fairly robust, but its performance is somewhat lower: every creation and deletion of a node has to be kept consistent across the ZooKeeper cluster, and locking does a lot of creating and deleting.

It would be a waste to maintain a ZooKeeper cluster just to implement distributed locks.

If your company already has a ZooKeeper cluster and the concurrency is not very large, you can use ZooKeeper to implement distributed locking.

Redis performs better on distributed locks than ZooKeeper, but it also has its drawbacks, which we will examine later.

Giving this post a [like] is the biggest support you can give the author.