
There are many distributed lock solutions, ZooKeeper being a common one. Today we’ll look at how to implement a distributed lock with Redis.

We’ll start with the simplest version and build it up into a proper distributed lock.

Implementing a distributed lock with the SETNX command

SETNX means SET if Not eXists: the key is set only if it does not exist; otherwise nothing is done.

With SETNX we can implement a crude distributed lock as follows

  1. Client 1 requests the lock, and locking succeeds:

127.0.0.1:6379> SETNX lock 1
(integer) 1    // client 1 locked successfully
  2. Client 2 then requests the lock and fails, because it arrived later:

127.0.0.1:6379> SETNX lock 1
(integer) 0    // client 2 failed to lock
  3. The client that “locked successfully” can then operate on the “shared resource” (for example, modify a row of MySQL data, or call an API).

  4. Once the “shared resource” operation is complete, it releases the lock (in effect, deleting the data it just added), freeing the lock for other clients:

127.0.0.1:6379> DEL lock    // release the lock
(integer) 1

The whole logic: every client that wants to operate on the “shared resource” tries to SETNX the same key in Redis. Whoever sets it first holds the distributed lock and can operate on the “shared resource”; when done, it deletes the key to release the lock.
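To make the flow concrete, here is a minimal sketch in Java. It assumes the Jedis client (an assumption for illustration; any Redis client exposing SETNX/DEL works the same way):

import redis.clients.jedis.Jedis;

public class CrudeLock {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            // SETNX returns 1 if we created the key (lock acquired), 0 otherwise
            boolean locked = jedis.setnx("lock", "1") == 1L;
            if (!locked) {
                return;            // someone else holds the lock
            }
            try {
                // ... operate on the "shared resource" ...
            } finally {
                jedis.del("lock"); // release the lock
            }
        }
    }
}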

Deadlocks and handling

The method above works as a distributed lock, but it has a big problem: while client 1 holds the lock, any of the following scenarios causes a “deadlock”:

  • The program hit an exception during business processing and never released the lock
  • The process crashed before it had a chance to release the lock
  • …

How do we avoid deadlocks? A natural idea is to give the lock a “lease period” when acquiring it.

In Redis, that means giving the key an expiration time.

How long should it be? Not too short, or the lock will be released before the operation on the “shared resource” completes. The value depends mainly on how long your business operation takes.

If operating on the shared resource takes less than 10 seconds, set the key to expire after 10 seconds when locking:

127.0.0.1:6379> SETNX lock 1    // acquire the lock
(integer) 1
127.0.0.1:6379> EXPIRE lock 10  // expires automatically after 10s
(integer) 1

This way, even if the client fails to release the lock manually because of an exception, the lock is “automatically released” after 10s.

Is that foolproof? Note that the above uses two Redis commands. If Redis crashes right after the first command executes, the second never runs, and we still get a deadlock.

In short, as long as the two commands are not guaranteed to execute atomically (succeed or fail together), there is a risk of “deadlock”.

So what can we do?

Prior to Redis 2.6.12, we had to ensure the atomicity of SETNX and EXPIRE ourselves, and handle the various failure cases.

From Redis 2.6.12 onward, however, the SET command gained extra options and natively supports setting an expiration time while adding the key:

127.0.0.1:6379> SET lock 1 EX 10 NX
OK
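From Java, the same atomic lock-plus-expiry can be sketched like this (again assuming Jedis; SetParams is how Jedis passes the NX/EX options):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class AtomicLock {
    // one atomic command: NX = only if absent, EX 10 = expire after 10 seconds
    static boolean tryLock(Jedis jedis) {
        String result = jedis.set("lock", "1", SetParams.setParams().nx().ex(10));
        return "OK".equals(result);  // "OK" on success, null if the key already exists
    }
}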

The lock was released early

In the previous section on deadlocks, we said the duration of the shared resource operation should be properly estimated from the business, so the lock isn’t released before the operation completes.

But even with a reasonable estimate, the “shared resource” operation can still take longer than the lock lease, for example:

  • The shared resource operation blocks
  • A network request times out
  • …

Once the lock expires and is automatically deleted, another client can SET the lock and start operating on the “shared resource”. Then the first client finishes its operation and runs DEL — releasing a lock that now belongs to someone else.

This raises two problems:

  1. How do we prevent other clients from releasing our lock?
  2. How do we avoid the lock being released early?

Preventing other clients from releasing the lock

The key to this problem is: how does a client determine whether the lock is its own?

When a client acquires the lock, it sets a unique value that only it knows — its thread ID, a UUID, and so on. Here we use a UUID as an example:

// set the lock VALUE to a UUID
127.0.0.1:6379> SET lock $uuid EX 20 NX
OK

Here we assume 20s is plenty of time for the shared resource operation, and set the early-release problem aside for now.

After that, before releasing the lock, we need to check whether it is still our own. In pseudocode:

if redis.get("lock") == $uuid:
    redis.del("lock")

Releasing the lock now takes two commands, GET + DEL, and the atomicity problem is back. What to do? This time Redis does not offer a two-in-one command.

Although Redis itself does not provide one, we can put the two commands in a Lua script — Redis executes a Lua script atomically.

The Lua script for safely releasing locks is as follows:

-- release the lock only if it still belongs to us
if redis.call("GET", KEYS[1]) == ARGV[1]
then
    return redis.call("DEL", KEYS[1])
else
    return 0
end
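To call this from application code, the script is sent with EVAL. A minimal sketch, again assuming Jedis:

import java.util.Collections;
import redis.clients.jedis.Jedis;

public class SafeUnlock {
    private static final String RELEASE_SCRIPT =
        "if redis.call('GET', KEYS[1]) == ARGV[1] then " +
        "  return redis.call('DEL', KEYS[1]) " +
        "else return 0 end";

    // delete the lock only if it still holds our uuid; Redis runs the script atomically
    static boolean unlock(Jedis jedis, String uuid) {
        Object result = jedis.eval(RELEASE_SCRIPT,
                Collections.singletonList("lock"),
                Collections.singletonList(uuid));
        return Long.valueOf(1L).equals(result);  // 1 = deleted, 0 = not our lock
    }
}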

Before solving the early-release problem, let’s review. A rigorous Redis-based distributed lock flow looks like this:

  1. Lock: SET lock_key $unique_id EX $expire_time NX
  2. Operate on the shared resource
  3. Release the lock: a Lua script that first GETs to check the lock is ours, then DELs it

Now we have one problem left to solve

Avoid locks being released early

As mentioned above, even with a reasonably estimated lease, there is still some probability the lock gets released early. How do we solve this?

Extend the lease? A longer lease only reduces the probability; it cannot eliminate the problem entirely.

We can design a scheme like this: when locking, set an expiration time as usual, then start a “daemon thread” that periodically checks the lock. If the lock is about to expire and the shared resource operation isn’t finished, the daemon automatically “renews” the lock by resetting its expiration time.
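A rough sketch of such a daemon, assuming Jedis with a dedicated connection for the watchdog (Jedis connections are not thread-safe). The renewal reuses the Lua trick: only extend the lease if the lock still carries our uuid:

import java.util.Arrays;
import java.util.Collections;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class Watchdog {
    // renew only if the lock still holds our uuid, so we never extend someone else's lock
    private static final String RENEW_SCRIPT =
        "if redis.call('GET', KEYS[1]) == ARGV[1] then " +
        "  return redis.call('EXPIRE', KEYS[1], ARGV[2]) " +
        "else return 0 end";

    // re-extend the lease to 30s every 10s until the caller shuts the daemon down
    static ScheduledExecutorService start(Jedis watchdogConn, String uuid) {
        ScheduledExecutorService daemon = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);  // dies with the process instead of blocking shutdown
            return t;
        });
        daemon.scheduleAtFixedRate(
            () -> watchdogConn.eval(RENEW_SCRIPT,
                    Collections.singletonList("lock"),
                    Arrays.asList(uuid, "30")),
            10, 10, TimeUnit.SECONDS);
        return daemon;  // call shutdownNow() right after releasing the lock
    }
}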

If you’re on the Java technology stack, fortunately the Redisson library already encapsulates all of this.

Redisson

Redisson is a Redis client SDK implemented in Java. For distributed locks, it uses “automatic renewal” to avoid early lock expiration; the daemon thread that does the renewing is called the “watchdog” thread in Redisson.

In addition, the SDK packages a number of easy-to-use features:

  • Reentrant lock
  • Optimistic locking
  • Fair lock
  • Read-write lock
  • Redlock (more on that below)

The SDK provides a friendly API: you can manipulate a distributed lock the same way you’d use a local lock.
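For example, acquiring and releasing a lock with Redisson looks roughly like this (a sketch against a single local Redis; lock() starts the watchdog renewal automatically):

import org.redisson.Redisson;
import org.redisson.api.RLock;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class RedissonExample {
    public static void main(String[] args) {
        Config config = new Config();
        config.useSingleServer().setAddress("redis://127.0.0.1:6379");
        RedissonClient redisson = Redisson.create(config);

        RLock lock = redisson.getLock("lock");
        lock.lock();           // the watchdog keeps renewing the lease while we hold it
        try {
            // ... operate on the shared resource ...
        } finally {
            lock.unlock();     // only the owning thread can release
        }
        redisson.shutdown();
    }
}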

Summary

The problems a Redis-based distributed lock can run into, and the corresponding solutions:

  • Deadlock: set an expiration time
  • Lock released by someone else: write a unique identifier into the lock and check it before releasing (Lua script)
  • Lock released early: use a daemon thread to renew it automatically

Distributed lock in master-slave cluster + sentinel mode

The scenarios analyzed above all concern locking on a “single” Redis instance, without going into the Redis deployment architecture.

Is there a problem with the Redis distributed lock in master-slave cluster + sentinel mode?

Take a look at this scenario

  1. Client 1 runs the SET command on the master and locks successfully
  2. The master goes down before the SET command is synchronized to the slave (master/slave replication is asynchronous)
  3. The sentinel promotes the slave to be the new master — the lock never made it to the new master. It’s lost!

To solve the problem of lock loss during master/slave failover, the author of Redis proposed a solution called Redlock.

Redlock

The Redlock solution proposed by the author is based on two premises:

  1. Use only master libraries — no slave libraries or sentinel instances
  2. Multiple masters are required; officially, at least five instances are recommended

In other words, to use Redlock you need to deploy at least five Redis instances, all masters, with no relationship to one another.

Note: this is not a Redis Cluster — it is five plain, independent Redis instances.

How to implement Redlock

  1. The client gets “current timestamp T1”
  2. The client sends lock requests to the five Redis instances in turn (using the SET command above), each with a timeout (in milliseconds). If one instance fails to lock (network timeout, lock held by someone else, etc.), the client immediately moves on to the next instance
  3. If the client locked >= 3 instances (the majority), it gets “current timestamp T2”. If the lock lease period is greater than T2 - T1, locking is considered successful; otherwise it failed
  4. On success, the client operates on the shared resource
  5. On lock failure, or when the operation is finished, the client sends a release request to “all nodes” (the Lua release script mentioned earlier)

The whole idea: everyone races for the lock, and whoever locks the most instances holds it, as the sketch below shows.
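A simplified sketch of the locking half, assuming Jedis connections to the five masters; it ignores clock drift, which the full algorithm also budgets for:

import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedlockSketch {
    // try to lock a majority of the independent masters within the lease period
    static boolean tryLock(List<Jedis> masters, String uuid, long leaseMillis) {
        long t1 = System.currentTimeMillis();            // step 1: timestamp T1
        int locked = 0;
        for (Jedis master : masters) {                   // step 2: lock each in turn
            try {
                String r = master.set("lock", uuid,
                        SetParams.setParams().nx().px(leaseMillis));
                if ("OK".equals(r)) locked++;
            } catch (Exception e) {
                // network timeout etc. -- move straight on to the next instance
            }
        }
        long elapsed = System.currentTimeMillis() - t1;  // step 3: T2 - T1
        // success = majority locked AND the lease hasn't already burned away;
        // on failure the caller must still send the Lua release to ALL nodes (step 5)
        return locked >= masters.size() / 2 + 1 && elapsed < leaseMillis;
    }
}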

A few important points here:

  1. The client must request the lock on multiple Redis instances
  2. Make sure the majority of nodes lock successfully
  3. The total time spent locking the majority of nodes must be less than the lock’s expiration time (don’t set the lease too short, or the lock may expire before it is even fully acquired)
  4. Send the release request to all nodes, to avoid residual locks (a residual lock is one that was actually set on an instance, but the client believed locking failed there because of a network error)

By registering the lock on multiple existing masters, Redlock keeps the distributed lock usable even when some nodes go down, solving the problem of lock loss during master/slave failover.

Redlock does not conflict with the master-slave cluster + sentinel model: Redlock only requires five masters; it does not care whether slaves and sentinels sit behind them.

Note that Redlock still needs to be paired with Redisson’s renewal to address early lock expiration.

So is Redlock really safe? Not everyone thinks so.

The Redlock debate

For those interested, the extended debate over Redlock’s safety is worth reading further.