What is distributed locking?

Hi, I’m Jack Xu, and today I’m going to talk to you about distributed locking. In business scenarios such as placing orders, deducting inventory, grabbing tickets, selecting courses, or snatching red packets, the absence of lock control leads to very serious problems. If you have studied multithreading, you know that to prevent multiple threads from executing the same piece of code at the same time you can use the synchronized keyword or the ReentrantLock class from JUC. But almost every system these days is deployed on multiple machines, and single-machine deployment is rare; synchronized and ReentrantLock only work within a single JVM, so a global lock is needed to take their place.

There are three popular implementations of distributed locks: one based on the Redis cache, one based on ZooKeeper temporary sequential nodes, and one based on database row locks. Let’s start by building the lock with the SET command (NX option) in Jedis.

This article source: github.com/xuhaoj/redi…

The Jedis implementation

The idea of using Redis for distributed locking is to set a value in Redis to indicate that the lock is held, and to delete the key when the lock is released. The idea is very simple, but there are some pitfalls to avoid along the way. Let’s first look at the locking code:

    /**
     * Try to acquire the distributed lock
     *
     * @param jedis      Redis client
     * @param lockKey    lock key
     * @param requestId  id of the request (identifies the lock holder)
     * @param expireTime expiration time in seconds
     * @return whether the lock was acquired
     */
    public static boolean tryGetDistributedLock(Jedis jedis, String lockKey, String requestId, int expireTime) {
        // SET supports several options: NX (set if not exist), XX (set if exist), EX (seconds), PX (milliseconds)
        String result = jedis.set(lockKey, requestId, "NX", "EX", expireTime);
        if (LOCK_SUCCESS.equals(result)) {
            return true;
        }
        return false;
    }

This code is very simple. The command used here is SET key value [EX seconds | PX milliseconds] [NX | XX] [KEEPTTL], rather than the SETNX + EXPIRE pair. The reason is that SETNX followed by EXPIRE cannot guarantee atomicity (the client could crash between the two commands, leaving a key with no expiration), while a single SET is an atomic operation. So why set a timeout at all? Because if a client acquires the lock and then hangs in the middle of its task without having a chance to explicitly release the lock, the resource stays locked forever, resulting in a deadlock; a timeout must therefore be set.

The code to release the lock is as follows:

    /**
     * Release the distributed lock
     *
     * @param jedis     Redis client
     * @param lockKey   lock key
     * @param requestId id of the request (the current holder, e.g. the worker thread's name)
     * @return whether the release succeeded
     */
    public static boolean releaseDistributedLock(Jedis jedis, String lockKey, String requestId) {
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";
        Object result = jedis.eval(script, Collections.singletonList(lockKey), Collections.singletonList(requestId));
        if (RELEASE_SUCCESS.equals(result)) {
            return true;
        }
        return false;
    }

There are two points to note here. First, the value of the lock should identify the current holder, for example Thread.currentThread().getId() or a unique request id, and before deleting the key we must check that the value still belongs to us; otherwise a client whose lock has already expired could delete a lock now held by someone else. Second, the verification and the release are two separate operations, not one atomic operation. How do we solve this? With the Lua script above: if redis.call(‘get’, KEYS[1]) == ARGV[1] then return redis.call(‘del’, KEYS[1]) else return 0 end — Redis executes a Lua script atomically, so the check and the delete cannot be interleaved with another client.
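Putting the two helpers together, a typical call site looks like the sketch below. It assumes the two static methods above live in a DistLock class (as the source path suggests) and that a Jedis connection to a reachable Redis instance is available; the key name is hypothetical. The important pattern is the unique requestId per acquisition and the release in a finally block.

```java
import java.util.UUID;
import redis.clients.jedis.Jedis;

public class OrderService {
    public void deductStock(Jedis jedis) {
        String lockKey = "lock:stock";                        // hypothetical key name
        String requestId = UUID.randomUUID().toString();      // unique per acquisition
        boolean locked = DistLock.tryGetDistributedLock(jedis, lockKey, requestId, 10);
        if (!locked) {
            return;                                           // or retry with backoff
        }
        try {
            // ... deduct inventory, place the order ...
        } finally {
            // only succeeds if we are still the holder (checked atomically in the Lua script)
            DistLock.releaseDistributedLock(jedis, lockKey, requestId);
        }
    }
}
```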

Source: com/XHJ/distributedLock/DistLock.java

The Redisson implementation

Redisson is one of the Redis clients for Java, and it provides APIs that make it convenient to operate Redis. But the Redisson client is more than a bit powerful. Open its project page (Github.com/redisson/re…) and you’ll see that Redisson is not a pure Redis client but a set of distributed services built on top of Redis. You can find classes named after the ones in the JUC package: Redisson provides distributed versions of them, so in place of AtomicLong you can simply use RedissonAtomicLong. Locking is just the tip of the iceberg, and it supports master-slave, sentinel, and cluster modes as well as, of course, single-node mode.

Redisson makes distributed locking much simpler to implement. Let’s take a look at its usage: it’s quite simple, just two lines of code, much simpler than Jedis, and it encapsulates all the issues we had to handle ourselves with Jedis.

Let’s see: there are many kinds of lock to acquire, fair locks and read/write locks among them, and here we use redissonClient.getLock(), which returns a reentrant lock.
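Those “two lines” look roughly like the sketch below. The configuration assumes a local single-node Redis at the default port, and jackxu is the lock name that shows up in the screenshots later; both are illustrative, not prescribed.

```java
import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class RedissonLockDemo {
    public static void main(String[] args) {
        // assumed: a single-node Redis reachable at 127.0.0.1:6379
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379");
        RedissonClient redisson = Redisson.create(config);

        RLock lock = redisson.getLock("jackxu"); // a reentrant lock
        lock.lock();                             // blocks until acquired
        try {
            // ... protected business logic ...
        } finally {
            lock.unlock();
        }
        redisson.shutdown();
    }
}
```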

Now let’s start the program.

Open the Redis Desktop Manager tool to see what it contains: the lock is stored as a HASH, whose key is the lock name jackxu, whose field is the thread name, and whose value is 1 (the number of lock reentries).

You may think I’m stating the obvious; no matter, let’s dive into the source code to see how this is actually implemented.

The call chain is: tryLock() -> tryAcquire() -> tryAcquireAsync() -> tryLockInnerAsync().

Now let me pull up this Lua script and analyze it. It’s easy.

-- KEYS[1]  lock name, e.g. updateAccount
-- ARGV[1]  key expiration time, e.g. 10000 ms
-- ARGV[2]  thread name
-- the lock name does not exist yet
if (redis.call('exists', KEYS[1]) == 0) then
    -- create a hash with key = lock name, field = thread name, value = 1
    redis.call('hset', KEYS[1], ARGV[2], 1);
    -- set the hash expiration time
    redis.call('pexpire', KEYS[1], ARGV[1]);
    return nil;
end;
-- the lock name exists; check whether it is owned by the current thread
if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then
    -- if so, value + 1: the reentry count goes up by one
    redis.call('hincrby', KEYS[1], ARGV[2], 1);
    -- on re-acquisition, reset the key's expiration time
    redis.call('pexpire', KEYS[1], ARGV[1]);
    return nil;
end;
-- the lock exists but is held by another thread: return its remaining TTL
return redis.call('pttl', KEYS[1]);

unlockInnerAsync(), called from unlock(), releases the lock, again using a Lua script.

-- KEYS[1]  lock name, e.g. updateAccount
-- KEYS[2]  channel name, e.g. redisson_lock__channel:{updateAccount}
-- ARGV[1]  lock-released message, e.g. 0
-- ARGV[2]  lock release time, e.g. 10000
-- ARGV[3]  thread name
-- the lock does not exist (already expired or released)
if (redis.call('exists', KEYS[1]) == 0) then
    -- publish the message that the lock has been released
    redis.call('publish', KEYS[2], ARGV[1]);
    return 1;
end;
-- the lock exists but was not acquired by the current thread
if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then
    return nil;
end;

-- the lock exists and was acquired by the current thread
-- decrement the reentry count
local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1);
-- if the count is still greater than 0, the thread holds the lock
-- and still has work to finish
if (counter > 0) then
    -- reset the lock expiration time
    redis.call('pexpire', KEYS[1], ARGV[2]);
    return 0;
else
    -- the count reached 0, so the lock can be deleted
    redis.call('del', KEYS[1]);
    -- after deletion, publish the lock-released message
    redis.call('publish', KEYS[2], ARGV[1]);
    return 1;
end;

-- otherwise return nil
return nil;
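To make the reentrancy bookkeeping in the two Lua scripts concrete, here is a minimal in-memory sketch in plain Java. It is hypothetical and single-JVM only, not a distributed lock; it merely mirrors the hash of thread name to reentry count that Redisson keeps inside Redis.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory illustration of Redisson's hash-based reentrancy; NOT a distributed lock. */
public class ReentrantCountSketch {
    // Mirrors the Redis hash stored under the lock key: thread name -> reentry count.
    private final Map<String, Integer> holders = new HashMap<>();

    /** Mirrors the lock script: returns true if the lock is acquired or re-entered. */
    public synchronized boolean tryLock(String threadName) {
        if (holders.isEmpty()) {                        // exists == 0
            holders.put(threadName, 1);                 // hset ... 1
            return true;
        }
        if (holders.containsKey(threadName)) {          // hexists == 1
            holders.merge(threadName, 1, Integer::sum); // hincrby ... 1
            return true;
        }
        return false;                                   // held by another thread
    }

    /** Mirrors the unlock script: returns true only when the lock is fully released. */
    public synchronized boolean unlock(String threadName) {
        Integer count = holders.get(threadName);
        if (count == null) {
            return false;                               // not held by this thread
        }
        if (count > 1) {
            holders.put(threadName, count - 1);         // hincrby ... -1, still held
            return false;
        }
        holders.remove(threadName);                     // counter hit 0 -> del
        return true;
    }
}
```

A thread that locks twice must unlock twice before another thread can get in, which is exactly the behavior the counter in the hash provides.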

Source: com/XHJ/distributedLock/LockTest.java

After watching it in action, we find that it really works as smoothly as ReentrantLock in the JDK. Here are a few more questions:

  • What if the lock expires before the business logic finishes? Redisson’s watchdog guarantees this: it periodically renews the expiration time while the holder is still running.
  • In cluster mode with multiple masters, which one gets the lock? Redisson maps the lock key to a slot, so the same key always lands on the same master.
  • If the Redis master fails while the service is running, the slave still has the data, provided the lock write had been replicated before the failure; the RedLock section below deals with the case where it hadn’t.

RedLock

RedLock (literally “red lock”) is not a tool but a distributed-locking algorithm officially proposed by Redis. We know that with single-machine deployment there is a single point of failure: once that Redis instance goes down, locking stops working. With master-slave mode the lock is written to only one node, and even with sentinel for high availability, if the master fails and a master/slave switchover occurs, the lock may be lost. Antirez, the author of Redis, considered exactly this problem and proposed the RedLock algorithm in response.

I’ve drawn a diagram here: each of these five instances is deployed independently, with no master-slave relationship between them, so they are five master nodes.

To obtain a lock, do the following:

  • Get the current timestamp, in milliseconds
  • Try to create the lock on each master node in turn, with a short per-node timeout (typically a few tens of milliseconds) so that one failed node does not block the whole round
  • The lock must be created on a majority of the nodes; for example, with 5 nodes that is 3 nodes (n / 2 + 1)
  • The client computes how long the whole acquisition took; if a majority was locked and the elapsed time is less than the lock’s timeout, the lock is established successfully
  • If the lock fails to be established, release the locks on all nodes in turn
  • While someone else holds the distributed lock, you have to keep polling to try to acquire it
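The success condition in the steps above can be sketched as pure logic. This is a simplification under stated assumptions: the real algorithm also subtracts a clock-drift allowance from the validity window, which is omitted here.

```java
/** Simplified sketch of the RedLock success check; ignores clock-drift compensation. */
public class RedlockCheck {
    /**
     * @param totalNodes    number of independent Redis masters (e.g. 5)
     * @param locksAcquired how many masters accepted our lock
     * @param elapsedMs     time the whole acquisition round took, measured by the client
     * @param lockTtlMs     the TTL the lock was created with
     */
    public static boolean lockAcquired(int totalNodes, int locksAcquired,
                                       long elapsedMs, long lockTtlMs) {
        int quorum = totalNodes / 2 + 1;                 // majority, e.g. 3 of 5
        return locksAcquired >= quorum && elapsedMs < lockTtlMs;
    }
}
```

With 5 nodes, acquiring 3 locks quickly succeeds; acquiring only 2, or taking longer than the TTL, fails and all partial locks must be released.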

But this algorithm is still quite controversial: it may have many problems and cannot fully guarantee that the locking process is correct. Martin Kleppmann questioned the algorithm, and Antirez responded. One is a veteran distributed-systems expert, the other is the father of Redis, and this became the famous clash of titans over RedLock.

Finally, the Redisson website also shows how to use RedLock: a few lines of code and you’re done, still very smooth. Interested readers can take a look.
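Those few lines look roughly like this sketch. It assumes three RedissonClient instances, each pointing at an independent Redis master, and reuses the jackxu lock name from earlier; RedissonRedLock composes the per-master locks and acquires a majority of them.

```java
import org.redisson.RedissonRedLock;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;

public class RedLockDemo {
    // client1/client2/client3 would each be configured against an independent master
    public static void runWithRedLock(RedissonClient client1,
                                      RedissonClient client2,
                                      RedissonClient client3) {
        RLock lock1 = client1.getLock("jackxu");
        RLock lock2 = client2.getLock("jackxu");
        RLock lock3 = client3.getLock("jackxu");

        RedissonRedLock redLock = new RedissonRedLock(lock1, lock2, lock3);
        redLock.lock();    // succeeds once a majority of the masters are locked
        try {
            // ... protected business logic ...
        } finally {
            redLock.unlock();
        }
    }
}
```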

The ZooKeeper implementation

ZooKeeper is a centralized service that provides configuration management, distributed coordination, and naming. Its model looks like this: it consists of a tree of nodes, called znodes, which behave much like a file system, with each znode representing a directory. Znodes have some features that we can categorize into four types:

  • Persistent nodes (the node still exists after the client disconnects from ZooKeeper)
  • Persistent sequential nodes (if the first child created is /lock/node-0000000000, the next is /lock/node-0000000001, and so on)
  • Temporary (ephemeral) nodes (the node is deleted after the client disconnects)
  • Temporary (ephemeral) sequential nodes

The ZooKeeper distributed lock applies temporary sequential nodes. Let’s see how this is done in a graphical way.

Acquiring a lock

First, create a persistent node, ParentLock, in ZooKeeper. When the first client wants to acquire the lock, it creates a temporary sequential node, Lock1, under the ParentLock node. Client1 then lists all temporary sequential nodes under ParentLock, sorts them, and checks whether the node it created, Lock1, is first in the order. If it is the first node, the lock is acquired successfully.

Now if another client, Client2, comes to acquire the lock, a temporary sequential node Lock2 is created under ParentLock. Client2 lists and sorts all temporary sequential nodes under ParentLock to determine whether Lock2 is first in the sequence, and finds that Lock2 is not the smallest node.

Thus, Client2 registers a Watcher on Lock1, the node sorted just ahead of it, to listen for Lock1’s existence. This means Client2 failed to grab the lock and enters the waiting state. Now if yet another client, Client3, comes to acquire the lock, a temporary sequential node Lock3 is created under the ParentLock node. Client3 lists and sorts all temporary sequential nodes under ParentLock to determine whether Lock3 is first in the sequence; the result again shows that Lock3 is not the smallest node.

Thus, Client3 registers a Watcher on Lock2, the node sorted just ahead of it, to listen for Lock2’s presence. This means Client3 also failed to grab the lock and enters the waiting state. So Client1 holds the lock, Client2 listens on Lock1, and Client3 listens on Lock2. This forms a waiting queue, much like the one AQS (AbstractQueuedSynchronizer) maintains for ReentrantLock in Java.
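The ordering decision above can be sketched as pure logic, without any ZooKeeper calls. This is a hypothetical helper for illustration: given the children of the ParentLock node, either our node sorts first (we hold the lock) or we should watch the node immediately before ours.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative ordering logic of the ZK lock queue; no ZooKeeper client involved. */
public class ZkLockOrder {
    /** Returns null if ourNode holds the lock, otherwise the predecessor node to watch. */
    public static String nodeToWatch(List<String> children, String ourNode) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);              // sequential nodes sort by their numeric suffix
        int idx = sorted.indexOf(ourNode);
        return idx <= 0 ? null : sorted.get(idx - 1);
    }
}
```

Each waiter watches only its predecessor rather than the lock holder, which avoids a herd of clients all waking up when the lock is released.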

Release the lock

There are two ways for the lock to be released:

1. The task is complete and the client releases the lock explicitly

When the task is complete, Client1 explicitly issues the instruction to delete the node Lock1.

2. The client crashes during task execution

If Client1, which has acquired the lock, crashes during task execution, it is disconnected from the ZooKeeper server. By the nature of temporary nodes, the associated node Lock1 is deleted automatically.

Because Client2 is always listening for Lock1’s existence, Client2 is notified immediately when Lock1 is removed. Client2 then queries all nodes under ParentLock again to check whether Lock2 is now the smallest node; if it is, Client2 acquires the lock. Similarly, if Client2 also deletes Lock2, whether because its task completed or its node crashed, Client3 is notified. Eventually, Client3 succeeds in obtaining the lock.

Curator

The Apache open source framework Curator includes an implementation of ZooKeeper distributed locks. Github.com/apache/cura…

It is also simple to use, as follows. We tried it and it runs just as smoothly; I won’t analyze its source code here, but interested readers can see my colleague’s blog post on the Curator ZK distributed lock implementation principle.
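A minimal usage sketch with Curator’s InterProcessMutex recipe looks like this. The ZooKeeper address and the /ParentLock path are assumptions for illustration; InterProcessMutex implements the temporary-sequential-node scheme described above.

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorLockDemo {
    public static void main(String[] args) throws Exception {
        // assumed: a ZooKeeper server reachable at 127.0.0.1:2181
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // each acquire() creates an ephemeral sequential child under /ParentLock
        InterProcessMutex lock = new InterProcessMutex(client, "/ParentLock");
        lock.acquire();
        try {
            // ... protected business logic ...
        } finally {
            lock.release();
        }
        client.close();
    }
}
```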

Conclusion

ZooKeeper is designed for distributed coordination: it is strongly consistent, and its locks are robust. If you can’t get the lock, just add a listener instead of polling all the time. Disadvantage: under high request volume and high concurrency, the system frantically locks and releases, and ZooKeeper may not withstand so much pressure, with a risk of going down.

ZooKeeper lock performance is lower than Redis lock performance. Every write request can only go to the leader, which broadcasts the write to all followers; once a majority of followers acknowledge, the write is committed. This process is essentially a 2PC (two-phase commit). Locking is a write request, so when there are many write requests ZooKeeper comes under heavy pressure and the server ends up responding slowly.

Redis locks are simple to implement, easy to reason about, perform well, and can support high-concurrency lock acquire and release operations. Disadvantages: Redis is prone to single points of failure; cluster deployment is not strongly consistent, so locking is not robust; the right expiration time for the key is unclear and can only be tuned to the actual situation; and clients must keep retrying to acquire the lock themselves, which costs performance.

Finally, both Redis and ZooKeeper should meet the characteristics of distributed locking:

  • Reentrancy (a thread that has acquired the lock does not need to acquire it again during execution)
  • On exception or timeout, the lock is deleted automatically to avoid deadlock
  • Mutual exclusion: only one client can hold the lock at a time
  • High performance, high availability, and a fault tolerance mechanism in a distributed environment

Each has its own merits; use whichever suits the specific business scenario.