Implementing a distributed lock on Redis sounds very simple in principle: check whether the Key exists; if it does not, Set the Key and the lock is acquired; if it does, locking fails. Is it really that simple?

If that is what you think, then you really should read on. Let's walk through an example.

Before we get started, let’s set some rules:

  • About the sample code:

    • The article comes with sample code that I prepared, written in C#
    • The material Id in the examples is fixed at 10000
    • The inventory of the material always starts at 100
  • About Key in Redis:

    • The Key indicating material inventory is ProductStock_10000
    • In a self-implemented distributed lock, the Key of the lock is indicated as DistributedLock_10000
    • In RedLock.net, the Key indicating the lock is redlock:10000

1 | 0 Without a lock

Without any lock, we send 100 requests through JMeter and check whether the inventory ends up at zero.

    /// <summary>
    /// Deduct inventory without a lock
    /// </summary>
    /// <returns></returns>
    [HttpPost("DecreaseProductStockWithNoLock")]
    public async Task<string> DecreaseProductStockWithNoLock()
    {
        var stockKey = GetProductStockKey(ProductId);
        var currentQuantity = (long)(await _redisDatabase.Database.StringGetAsync(stockKey));
        if (currentQuantity < 1)
            throw new Exception("Stock insufficient");

        var leftQuantity = currentQuantity - 1;
        await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

        return $"Remaining inventory: {leftQuantity}";
    }

It's over: the inventory is a complete mess. Time to pack up and run o(╥﹏╥)o!
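To see why the unlocked version goes wrong, here is a minimal sketch of the lost update. It is not taken from the sample project: the Dictionary is a hypothetical stand-in for Redis, and the interleaving of the two requests is written out by hand so the result is deterministic.

```csharp
using System;
using System.Collections.Generic;

public static class LostUpdateDemo
{
    // Hypothetical in-memory stand-in for Redis; the key name follows the article.
    public static long RunInterleaved()
    {
        const string key = "ProductStock_10000";
        var store = new Dictionary<string, long> { [key] = 100 };

        // Both requests read the stock before either one writes it back.
        long readByA = store[key];
        long readByB = store[key];

        store[key] = readByA - 1; // request A writes 99
        store[key] = readByB - 1; // request B also writes 99, overwriting A's update

        return store[key];
    }

    public static void Main()
    {
        // Two deductions happened, but only one is reflected in the stock.
        Console.WriteLine(RunInterleaved()); // prints 99
    }
}
```

Every such interleaving swallows one deduction, which is why the inventory does not reach zero after 100 requests.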

2 | 0 Locking in a single application

When it comes to locks, the first thing most people think of is the lock statement, which is syntactic sugar over Monitor and the first kind of lock most people come across. In a single application, lock generally works without problems, because it synchronizes threads within one process.

    /// <summary>
    /// Deduct inventory in a single application
    /// </summary>
    /// <returns></returns>
    [HttpPost("DecreaseProductStockInSingleApp")]
    public string DecreaseProductStockInSingleApp()
    {
        long leftQuantity;
        lock (_lockObj)
        {
            var stockKey = GetProductStockKey(ProductId);
            var currentQuantity = (long)_redisDatabase.Database.StringGet(stockKey);
            if (currentQuantity < 1)
                throw new Exception("Stock insufficient");

            leftQuantity = currentQuantity - 1;
            _redisDatabase.Database.StringSet(stockKey, leftQuantity);
        }

        return $"Remaining inventory: {leftQuantity}";
    }

The result is as expected: the remaining inventory is 0.

But if we deploy several identical instances of the application as a cluster, lock can no longer help, because it only works inside one process. Let's launch two application instances and take a look.

    # Run in the Development environment to see more information
    dotnet XXTk.Redis.DistributedLock.Api.dll --urls http://localhost:5000 --environment Development
    dotnet XXTk.Redis.DistributedLock.Api.dll --urls http://localhost:5010 --environment Development

As you can see, 100 requests were sent, yet 17 items remain in an inventory that should have been zero.

3 | 0 Locking in an application cluster

3 | 1 Version 1

Obviously, lock no longer works, so it is time to move on to the main topic: designing a distributed lock on Redis.

Here’s the initial idea:

  1. Use the material Id as part of the Redis Key of the lock
  2. If the Key exists in Redis, the lock is considered to be held by another thread
  3. If the Key does not exist in Redis, add the Key to Redis with an arbitrary Value; the lock is acquired
  4. Remove the Key from Redis once the work is done, which releases the lock

With that in mind, it is time to think about the implementation. Fortunately, the Redis command SETNX key value does exactly what we need.

If you are unfamiliar with Redis commands, please refer to the Redis command documentation.

    /// <summary>
    /// Deduct inventory in the application cluster, V1
    /// </summary>
    /// <returns></returns>
    [HttpPost("V1/DecreaseProductStockInAppCluster")]
    public async Task<string> DecreaseProductStockInAppClusterV1()
    {
        var lockKey = GetDistributedLockKey(ProductId.ToString());

        // Acquire the lock with the SETNX key value command
        if (await _redisDatabase.Database.StringSetAsync(lockKey, 1, null, When.NotExists, CommandFlags.DemandMaster))
        {
            try
            {
                var stockKey = GetProductStockKey(ProductId);
                var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
                if (currentQuantity < 1)
                    throw new Exception("Stock insufficient");

                var leftQuantity = currentQuantity - 1;
                await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

                return $"Remaining inventory: {leftQuantity}";
            }
            finally
            {
                // Release the lock
                await _redisDatabase.Database.KeyDeleteAsync(lockKey, CommandFlags.DemandMaster);
            }
        }
        else
            throw new Exception("lock failed");
    }

I did not find a way to make JMeter count successful and failed requests directly, so I used the Aggregate Report and worked the counts out by hand from the error rate. If you know a better way, please share it with me, thanks!

By that calculation, 50 requests succeeded and 50 failed, and the inventory we checked also shows 50 left, so our basic requirement is met.

3 | 2 Version 2

Although version 1 has basically fulfilled our requirements, consider this:

  • The application crashes while executing the try block, so the finally block never runs and the lock is not released
  • When releasing the lock, the connection to Redis fails due to a network problem, so the lock is not released

If either of these happens, the lock Key stays in Redis forever and nobody can acquire the lock again: a deadlock.

So what should we do? Right, add an expiration time to the lock! The SETNX command has no expiration parameter, so we would have to set the expiration with the EXPIRE command after acquiring the lock.

Is that enough? Of course not. SETNX followed by EXPIRE is two separate commands, and a crash between them leaves a lock with no expiration. We need to merge the two into one atomic operation. Don't worry, Redis has enhanced the SET command: SET key value EX seconds NX sets the Key only if it does not exist, just like SETNX, and sets the expiration in the same step.
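To make the effect of the expiration concrete, here is a small sketch under stated assumptions. FakeRedis and ExpiryDemo are hypothetical names of mine, not part of the sample project or of any Redis client; the class only imitates the "set if absent, optionally with a time to live" semantics in memory, with a clock that the demo advances by hand.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of a lock key with an optional expiration.
// It only imitates the semantics of SET ... NX; it is not a Redis client.
public sealed class FakeRedis
{
    private readonly Dictionary<string, double?> _expiresAt = new Dictionary<string, double?>();

    public double Now { get; set; } // simulated clock, in seconds

    // Models "SET key value [EX seconds] NX": succeeds only when the key
    // is absent or has already expired.
    public bool SetIfNotExists(string key, double? ttlSeconds = null)
    {
        if (_expiresAt.TryGetValue(key, out var expiresAt) &&
            (expiresAt == null || expiresAt > Now))
        {
            return false; // the key is still alive, so the lock is taken
        }

        _expiresAt[key] = ttlSeconds.HasValue ? Now + ttlSeconds.Value : (double?)null;
        return true;
    }
}

public static class ExpiryDemo
{
    // Returns whether a second client can lock 31s after the first holder crashed.
    public static bool CanRelock(double? ttlSeconds)
    {
        const string key = "DistributedLock_10000";
        var redis = new FakeRedis();

        redis.SetIfNotExists(key, ttlSeconds); // the holder crashes and never deletes the key
        redis.Now = 31;

        return redis.SetIfNotExists(key, ttlSeconds);
    }

    public static void Main()
    {
        Console.WriteLine(CanRelock(null)); // prints False: no expiration, locked forever
        Console.WriteLine(CanRelock(30));   // prints True: the 30s lock has expired
    }
}
```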

    /// <summary>
    /// Deduct inventory in the application cluster, V2
    /// </summary>
    /// <returns></returns>
    [HttpPost("V2/DecreaseProductStockInAppCluster")]
    public async Task<string> DecreaseProductStockInAppClusterV2()
    {
        var lockKey = GetDistributedLockKey(ProductId.ToString());
        var expiresIn = TimeSpan.FromSeconds(30);

        // SET key value EX seconds NX
        if (await _redisDatabase.AddAsync(lockKey, 1, expiresIn, When.NotExists, CommandFlags.DemandMaster))
        {
            try
            {
                var stockKey = GetProductStockKey(ProductId);
                var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
                if (currentQuantity < 1)
                    throw new Exception("Stock insufficient");

                var leftQuantity = currentQuantity - 1;
                await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

                return $"Remaining inventory: {leftQuantity}";
            }
            finally
            {
                // Release the lock
                await _redisDatabase.Database.KeyDeleteAsync(lockKey, CommandFlags.DemandMaster);
            }
        }
        else
            throw new Exception("lock failed");
    }

3 | 3 Version 3

OK, we have solved the deadlock problem, so is our distributed lock perfect now? No! No! No! Some problems remain:

  1. If thread A acquires the lock with a 30s expiration while the business logic takes 40s, the lock expires and is released early
  2. If the lock expires early and thread B acquires it, then when thread A reaches the release code in the finally block, it releases a lock that belongs to thread B, not to thread A

Does it feel like the more we change, the more problems appear? Let's take them one at a time and first solve the second problem, releasing a lock that is not yours. For a thread to know which lock is its own, we give it a unique identifier. When releasing the lock, we first check whether the lock is ours, and only then release it. Where do we store that identifier? In the Value of the lock Key.
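The timeline of this mistaken release can be replayed with a small sketch. FakeLockStore and OwnerCheckDemo are hypothetical helpers of mine that model the lock Key in memory with a hand-driven clock; they are not the sample project's code and do not talk to Redis.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of one lock key per resource.
public sealed class FakeLockStore
{
    private readonly Dictionary<string, (string Owner, double ExpiresAt)> _locks =
        new Dictionary<string, (string Owner, double ExpiresAt)>();

    public double Now { get; set; } // simulated clock, in seconds

    private bool IsAlive(string key) =>
        _locks.TryGetValue(key, out var held) && held.ExpiresAt > Now;

    public bool TryAcquire(string key, string owner, double ttlSeconds)
    {
        if (IsAlive(key)) return false;
        _locks[key] = (owner, Now + ttlSeconds);
        return true;
    }

    public bool IsHeldBy(string key, string owner) =>
        IsAlive(key) && _locks[key].Owner == owner;

    // Unconditional release: deletes whatever lock is stored under the key.
    public void Delete(string key) => _locks.Remove(key);

    // Release guarded by the owner's identifier.
    public bool DeleteIfOwner(string key, string owner)
    {
        if (!IsHeldBy(key, owner)) return false;
        _locks.Remove(key);
        return true;
    }
}

public static class OwnerCheckDemo
{
    // Replays the timeline and reports whether B still holds its lock at the end.
    public static bool BStillHoldsLock(bool checkOwner)
    {
        const string key = "DistributedLock_10000";
        var store = new FakeLockStore();

        store.TryAcquire(key, "A", 30); // t=0: A locks for 30s
        store.Now = 31;                 // A's lock has expired, A is still working
        store.TryAcquire(key, "B", 30); // t=31: B acquires the lock
        store.Now = 40;                 // t=40: A finishes and releases "its" lock

        if (checkOwner) store.DeleteIfOwner(key, "A");
        else store.Delete(key);

        return store.IsHeldBy(key, "B");
    }

    public static void Main()
    {
        Console.WriteLine(BStillHoldsLock(checkOwner: false)); // prints False
        Console.WriteLine(BStillHoldsLock(checkOwner: true));  // prints True
    }
}
```

In this in-memory model the check and the delete cannot be interleaved with other clients. Against a real Redis they are separate commands unless they are combined on the server side.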

    /// <summary>
    /// Deduct inventory in the application cluster, V3
    /// </summary>
    /// <returns></returns>
    [HttpPost("V3/DecreaseProductStockInAppCluster")]
    public async Task<string> DecreaseProductStockInAppClusterV3()
    {
        var lockKey = GetDistributedLockKey(ProductId.ToString());
        var resourceId = Guid.NewGuid().ToString();
        var expiresIn = TimeSpan.FromSeconds(30);

        // SET key value EX seconds NX: set the expiration time
        // and store this request's unique id as the value
        if (await _redisDatabase.AddAsync(lockKey, resourceId, expiresIn, When.NotExists, CommandFlags.DemandMaster))
        {
            try
            {
                var stockKey = GetProductStockKey(ProductId);
                var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
                if (currentQuantity < 1)
                    throw new Exception("Stock insufficient");

                var leftQuantity = currentQuantity - 1;
                await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

                return $"Remaining inventory: {leftQuantity}";
            }
            finally
            {
                // Release the lock only if it is still ours
                if (await _redisDatabase.GetAsync<string>(lockKey) == resourceId)
                {
                    await _redisDatabase.Database.KeyDeleteAsync(lockKey, CommandFlags.DemandMaster);
                }
            }
        }
        else
            throw new Exception("lock failed");
    }

3 | 4 Version 4

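Did you spot the problem in the code above? Right, the lock release at the end is a two-step operation (read the Value, then delete the Key), not an atomic one, so the lock could expire and be taken by another thread between the two steps. That is definitely not acceptable. Redis does not provide a single command for this, so we have to use a Lua script: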

    /// <summary>
    /// Deduct inventory in the application cluster, V4
    /// </summary>
    /// <returns></returns>
    [HttpPost("V4/DecreaseProductStockInAppCluster")]
    public async Task<string> DecreaseProductStockInAppClusterV4()
    {
        var lockKey = GetDistributedLockKey(ProductId.ToString());
        var resourceId = Guid.NewGuid().ToString();
        var expiresIn = TimeSpan.FromSeconds(30);

        // SET key value EX seconds NX: set the expiration time
        // and store this request's unique id as the value
        if (await _redisDatabase.AddAsync(lockKey, resourceId, expiresIn, When.NotExists, CommandFlags.DemandMaster))
        {
            try
            {
                var stockKey = GetProductStockKey(ProductId);
                var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
                if (currentQuantity < 1)
                    throw new Exception("Stock insufficient");

                var leftQuantity = currentQuantity - 1;
                await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

                return $"Remaining inventory: {leftQuantity}";
            }
            finally
            {
                // Release the lock; a Lua script keeps the check and the delete atomic
                await _redisDatabase.Database.ScriptEvaluateAsync(@"
                    if redis.call('get', KEYS[1]) == ARGV[1] then
                        return redis.call('del', KEYS[1])
                    else
                        return 0
                    end",
                    keys: new RedisKey[] { lockKey },
                    values: new RedisValue[] { resourceId },
                    flags: CommandFlags.DemandMaster);
            }
        }
        else
            throw new Exception("lock failed");
    }

If you did not use my sample code and wrote your own, you may run into a lock that is not released properly: the Lua script returns 0. This is probably because you used a JSON serializer to turn objects into strings before storing them in Redis. JSON serialization wraps a string in quotes, and the quote characters are stored too, so the string "123" ends up in Redis as "\"123\"" and no longer equals the raw value passed to the script. For a solution, see the RedisNewtonsoftSerializer class in my implementation.
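The quoting effect is easy to reproduce. The sketch below uses System.Text.Json only because it ships with .NET; that choice is my assumption, not what the sample project uses. Newtonsoft's JsonConvert.SerializeObject produces the same output for a plain string.

```csharp
using System;
using System.Text.Json;

public static class JsonQuoteDemo
{
    // What a JSON-serializing Redis wrapper would store for a plain string value.
    public static string Stored(string value) => JsonSerializer.Serialize(value);

    public static void Main()
    {
        string resourceId = "123";
        string stored = Stored(resourceId);

        Console.WriteLine(stored);               // prints "123" with the quote characters
        Console.WriteLine(stored.Length);        // prints 5
        Console.WriteLine(stored == resourceId); // prints False
    }
}
```

The stored value is five characters long, while the script receives the three-character original in ARGV[1], so the comparison in the Lua script never matches.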

3 | 5 Version 5

Finally, let's solve the last problem: the business logic runs longer than the lock's expiration time, so the lock is released early. We cannot accurately predict how long the work will take, and setting a very long expiration is not reasonable either. Therefore, while the work is unfinished, we must extend the lock's expiration before it runs out. A timer can do this.

    /// <summary>
    /// Deduct inventory in the application cluster, V5
    /// </summary>
    /// <returns></returns>
    [HttpPost("V5/DecreaseProductStockInAppCluster")]
    public async Task<string> DecreaseProductStockInAppClusterV5()
    {
        var lockKey = GetDistributedLockKey(ProductId.ToString());
        var resourceId = Guid.NewGuid().ToString();
        var expiresIn = TimeSpan.FromSeconds(30);

        // SET key value EX seconds NX: set the expiration time
        // and store this request's unique id as the value
        if (await _redisDatabase.AddAsync(lockKey, resourceId, expiresIn, When.NotExists, CommandFlags.DemandMaster))
        {
            System.Threading.Timer timer = null;
            try
            {
                // Start a timer that periodically extends the key's expiration
                var interval = expiresIn.TotalMilliseconds / 2;
                timer = new System.Threading.Timer(
                    callback: state => ExtendLockLifetime(lockKey, resourceId, expiresIn),
                    state: null,
                    dueTime: (int)interval,
                    period: (int)interval);

                var stockKey = GetProductStockKey(ProductId);
                var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
                if (currentQuantity < 1)
                    throw new Exception("Stock insufficient");

                var leftQuantity = currentQuantity - 1;
                await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

                return $"Remaining inventory: {leftQuantity}";
            }
            finally
            {
                // Stop the timer first, even when the business logic throws;
                // otherwise it could re-create the lock after it has been released
                timer?.Change(Timeout.Infinite, Timeout.Infinite);
                timer?.Dispose();

                // Release the lock; a Lua script keeps the check and the delete atomic
                await _redisDatabase.Database.ScriptEvaluateAsync(@"
                    if redis.call('get', KEYS[1]) == ARGV[1] then
                        return redis.call('del', KEYS[1])
                    else
                        return 0
                    end",
                    keys: new RedisKey[] { lockKey },
                    values: new RedisValue[] { resourceId },
                    flags: CommandFlags.DemandMaster);
            }
        }
        else
            throw new Exception("lock failed");
    }

    private void ExtendLockLifetime(string lockKey, string resourceId, TimeSpan expiresIn)
    {
        // Extend the expiration only if the lock is missing or still ours
        _redisDatabase.Database.ScriptEvaluate(@"
            local currentVal = redis.call('get', KEYS[1])
            if (currentVal == false) then
                return redis.call('set', KEYS[1], ARGV[1], 'PX', ARGV[2]) and 1 or 0
            elseif (currentVal == ARGV[1]) then
                return redis.call('pexpire', KEYS[1], ARGV[2])
            else
                return -1
            end",
            keys: new RedisKey[] { lockKey },
            values: new RedisValue[] { resourceId, (long)expiresIn.TotalMilliseconds },
            flags: CommandFlags.DemandMaster);
    }

3 | 6 Distributed lock using RedLock.net

Version 5 above already covers the basic ideas of distributed locking, but my implementation is surely rather rough, so let me recommend a good open source implementation: RedLock.net.

The official Redis documentation lists distributed lock implementations for common languages and also explains how the RedLock algorithm works.

    /// <summary>
    /// Deduct inventory in the application cluster using RedLock
    /// </summary>
    /// <returns></returns>
    [HttpPost("DecreaseProductStockInAppClusterWithRedLock")]
    public async Task<string> DecreaseProductStockInAppClusterWithRedLock()
    {
        // The lock expires after 30s and we wait up to 20s to acquire it;
        // while the lock is unavailable, retry every 1s
        using var redLock = await _distributedLockFactory.CreateLockAsync(
            resource: ProductId.ToString(),
            expiryTime: TimeSpan.FromSeconds(30),
            waitTime: TimeSpan.FromSeconds(20),
            retryTime: TimeSpan.FromSeconds(1)
        );

        if (redLock.IsAcquired)
        {
            var stockKey = GetProductStockKey(ProductId);
            var currentQuantity = (long)await _redisDatabase.Database.StringGetAsync(stockKey);
            if (currentQuantity < 1)
                throw new Exception("Stock insufficient");

            var leftQuantity = currentQuantity - 1;
            await _redisDatabase.Database.StringSetAsync(stockKey, leftQuantity);

            return $"Remaining inventory: {leftQuantity}";
        }
        else
            throw new Exception("lock failed");
    }

4 | 0 From the perspective of Redis

We have implemented distributed locks from the perspective of the application above, but from the perspective of Redis, there are a few issues to consider:

Redis goes down and locking is impossible

If Redis goes down, the Redis server becomes unavailable, which makes locking impossible.

The solution is simple: improve the availability of Redis by configuring master-slave replication. But this creates the following problem.

Redis master/slave failover causes lock failure

Here is how it happens:

  1. Client A has obtained the lock from Redis Master
  2. The master crashed before the Key representing the lock was synchronized to the Redis slave
  3. The Redis slave is upgraded to Redis Master
  4. Client B obtains from the new Redis Master the lock that client A still holds
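The four steps can be replayed with a tiny sketch. The two HashSet instances are hypothetical stand-ins for the master and the slave, and replication is modeled as something that simply has not happened yet when the master crashes.

```csharp
using System;
using System.Collections.Generic;

public static class FailoverDemo
{
    // Returns true when both clients end up believing they hold the same lock.
    public static bool BothClientsHoldLock()
    {
        const string key = "DistributedLock_10000";

        var master = new HashSet<string>();
        var slave = new HashSet<string>(); // replication is asynchronous in this model

        // Step 1: client A obtains the lock from the master.
        bool acquiredByA = master.Add(key);

        // Step 2: the master crashes before the key is copied to the slave,
        // so the slave is never updated.

        // Step 3: the slave is promoted to master.
        master = slave;

        // Step 4: the key is missing on the new master, so client B succeeds too.
        bool acquiredByB = master.Add(key);

        return acquiredByA && acquiredByB;
    }

    public static void Main()
    {
        Console.WriteLine(BothClientsHoldLock()); // prints True
    }
}
```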

This is obviously a serious problem! Thus the RedLock algorithm was born.

5 | 0 RedLock

We won't discuss clock drift here; we simply assume that the drift between the servers is small enough to ignore.

Basic principle

First of all, we need at least 5 Redis servers (an odd number greater than or equal to 5). These 5 Redis servers are independent of each other without any master-slave or cluster relationship.

Next, we acquire the lock on the Redis servers one by one, from left to right. Assume that:

  • The lock expiration time is 10 seconds
  • Lock acquisition starts at 00:00:00
  • The lock is obtained on the first server at 00:00:01
  • The lock is obtained on the second server at 00:00:02
  • The lock is obtained on the third server at 00:00:03

More than half (3/5) of the Redis servers have now granted the lock.

Time taken to obtain the lock = Time at which the last Redis server granted the lock - Start time of lock acquisition

Remaining validity of the lock (TTL) = Lock expiration time - Time taken to obtain the lock

Time taken to obtain the lock = 00:00:03 - 00:00:00 = 3s, TTL = 10s - 3s = 7s. Since the time taken to obtain the lock does not exceed the lock expiration time, we consider the lock successfully acquired.

There are two conditions for successful lock acquisition:

  • More than half of the Redis servers granted the lock
  • The lock was obtained within the lock validity period
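The arithmetic and the two conditions can be written down in a few lines. RedLockTiming is a hypothetical helper of mine, not part of RedLock.net. I also require the remaining validity to be strictly positive, which is my own reading: a lock with no time left is of no use.

```csharp
using System;

public static class RedLockTiming
{
    // Remaining validity (TTL) = expiration time - time taken to obtain the lock.
    public static TimeSpan RemainingValidity(TimeSpan expiry, TimeSpan acquireStart, TimeSpan lastGrant)
    {
        TimeSpan elapsed = lastGrant - acquireStart;
        return expiry - elapsed;
    }

    // Both conditions: a majority of servers granted the lock,
    // and some validity time is still left.
    public static bool IsAcquired(int grantedServers, int totalServers, TimeSpan remainingValidity)
    {
        return grantedServers > totalServers / 2 && remainingValidity > TimeSpan.Zero;
    }

    public static void Main()
    {
        // Values from the example: 10s expiration, start at 00:00:00, last grant at 00:00:03.
        TimeSpan validity = RemainingValidity(
            expiry: TimeSpan.FromSeconds(10),
            acquireStart: new TimeSpan(0, 0, 0),
            lastGrant: new TimeSpan(0, 0, 3));

        Console.WriteLine(validity.TotalSeconds);      // prints 7
        Console.WriteLine(IsAcquired(3, 5, validity)); // prints True
        Console.WriteLine(IsAcquired(2, 5, validity)); // prints False
    }
}
```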

Retry

The above example is a very successful lock acquisition case. However, many times, distributed lock acquisition is not so smooth, and the following situations may occur:

  • Client A has obtained the lock on two Redis servers
  • Client B has obtained the lock on two Redis servers
  • Client C has obtained the lock on one Redis server

None of the three clients has a majority. If all of them simply keep what they have and block until the locks expire, lock acquisition becomes very inefficient, so a retry mechanism is required.

Retry mechanism: first, send the SET key value EX seconds NX command to all Redis servers at the same time. Once all servers have responded, check whether the two conditions for successful lock acquisition are met. If so, the lock has been acquired. If not, immediately release the locks already obtained, wait a short time, and repeat the steps above (usually up to three times). If the two conditions are still not met after that, lock acquisition is considered to have failed.
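The retry loop can be sketched roughly as follows. This is a simplified model under my own assumptions, not the RedLock.net implementation: each Dictionary is a hypothetical stand-in for one independent Redis server, expiration and the validity check are left out, and the waiting between attempts is only marked by a comment.

```csharp
using System;
using System.Collections.Generic;

public static class RedLockRetrySketch
{
    // Tries to take the lock on a majority of servers, retrying a few times.
    public static bool TryAcquire(
        List<Dictionary<string, string>> servers, string key, string owner, int maxAttempts = 3)
    {
        int quorum = servers.Count / 2 + 1;

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            int granted = 0;
            foreach (var server in servers)
            {
                // Models SET key value NX on one server.
                if (server.TryAdd(key, owner)) granted++;
            }

            if (granted >= quorum) return true;

            // No majority: give back what we obtained so other clients can proceed.
            Release(servers, key, owner);

            // A real client would wait a short time here before the next attempt.
        }

        return false;
    }

    // Deletes the key on every server, but only where it belongs to this owner.
    public static void Release(List<Dictionary<string, string>> servers, string key, string owner)
    {
        foreach (var server in servers)
        {
            if (server.TryGetValue(key, out var current) && current == owner)
                server.Remove(key);
        }
    }

    public static void Main()
    {
        var servers = new List<Dictionary<string, string>>();
        for (int i = 0; i < 5; i++) servers.Add(new Dictionary<string, string>());

        // Client B already holds the lock on two of the five servers.
        servers[0]["redlock:10000"] = "B";
        servers[1]["redlock:10000"] = "B";

        Console.WriteLine(TryAcquire(servers, "redlock:10000", "A")); // prints True
    }
}
```

Releasing the partial locks before retrying is what keeps clients that each hold a minority from blocking one another until expiration.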

Lock failure caused by master/slave failover

In fact, even with the RedLock algorithm, if the Redis servers are configured with master/slave replication, the problem we mentioned earlier can still occur: a master/slave failover causes the lock to fail.

To solve this problem, we need to delay the promotion of the Redis slave to master by at least the TTL of the lock, so that any lock held on the old master has expired by the time the new master takes over. (This seems to work only in theory. If you know how to delay the promotion of a slave, please share it with me.)

Release the lock

Releasing the lock is easy: just send the lock deletion script to every server, because our script already guarantees that only the lock belonging to the current business operation is deleted.

6 | 0 Epilogue

After working through all of this, we have finally reached the end. As you have seen, implementing a distributed lock on Redis is not as simple as it first seems, and there are really a lot of details to get right. On the other hand, at least in my opinion, the RedLock algorithm is a bit heavy, and if you do not care about the lock inconsistency caused by master/slave failover, a single Redis is enough.