Implementing a distributed lock with Redis, in Go.

Preface

My previous articles were all theory, which can make your head spin, so I wanted to write something practical for a change. As it happens, my company has recently been migrating asynchronous tasks with the help of a distributed lock and task sharding, so I plan to write two hands-on articles: one on distributed locks and one on task sharding. Both come up frequently in real projects. This article covers the implementation of a distributed lock.

Usage scenarios

There are many scenarios where distributed locks are useful. At Xiaomi, I mainly encounter the following two:

  • When a scheduled task runs in a service cluster, we want only one machine to execute it, so we use a distributed lock: only the machine that acquires the lock runs the task.
  • When an external request hits the cluster, such as an operation on an order, we take a distributed lock on the order at the entry point to prevent concurrent re-entry.

Redis distributed lock

Many people equate a Redis distributed lock with a single SetNx() call. But that naive approach immediately raises two questions:

  • What if the machine that acquired the lock crashes before releasing it?
  • When the lock times out, machines A and B may both try to grab it at the same moment. How do we make sure only one succeeds?

So a Redis distributed lock takes more than just SetNx(). First, a quick refresher on what SetNx() actually does.

The Redis SETNX (SET if Not eXists) command sets a key to a specified value only if the key does not already exist. Return value: 1 if the set succeeds, 0 if it fails.
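To make the semantics concrete, here is a minimal in-memory sketch of SETNX's behavior. This is not a Redis client; the map simply stands in for the server, and the function name is my own:

```go
package main

import "fmt"

// setNx mimics Redis SETNX against an in-memory map:
// set the key only if it is absent, and report whether the set happened.
func setNx(store map[string]string, key, value string) int {
	if _, exists := store[key]; exists {
		return 0 // key already present: set fails
	}
	store[key] = value
	return 1 // key was absent: set succeeds
}

func main() {
	store := map[string]string{}
	fmt.Println(setNx(store, "lock", "a")) // first caller wins: 1
	fmt.Println(setNx(store, "lock", "b")) // second caller loses: 0
}
```

Because only the first writer succeeds, SETNX gives us the "only one machine gets in" property that a lock needs.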

If SetNx() returns 1, we have acquired the lock; if it returns 0, we have not. To cope with a machine crashing or restarting before it releases the lock, we store the lock's expiry time as its value. The overall flow is:

  • First try SetNx(), storing the expiry time as the value. If it succeeds, we hold the lock and return immediately.
  • If it fails, the current holder may have crashed or restarted, so call GetKey() to read the lock's expiry time. If the lock has not yet expired, the holder is alive, and our acquisition fails.
  • If the lock has expired, we may take it over and write a new expiry time. To avoid several machines taking it over at once, we use GetSet(), which returns the old value. Suppose machines A and B both saw the expired value and both call GetSet(): A runs first and gets back the expired value it read earlier, so A wins; when B's GetSet() runs, it returns the new expiry that A just wrote, which no longer matches what B read, so B knows it lost. Comparing "what I read" with "what GetSet() returned" is how we tell who got the lock first. (This is probably the hardest part of this distributed lock to understand, and the step I get stuck on every time I revisit the logic…)

The Redis GETSET command sets a key to a specified value and returns the key's old value. Return value: the old value of the given key; nil if the key did not exist; an error if the key exists but does not hold a string.
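The takeover step above can be sketched with the same in-memory stand-in. Again this is not a Redis client, and `tryTakeExpiredLock` is a name I made up; it models the moment when a machine has already read an expired expiry value and now races to overwrite it with GetSet:

```go
package main

import "fmt"

// getSet mimics Redis GETSET against an in-memory map: write the new
// value and return the old one (0 stands in for nil on a missing key).
func getSet(store map[string]int64, key string, value int64) int64 {
	old := store[key]
	store[key] = value
	return old
}

// tryTakeExpiredLock models the takeover step: the caller has already read
// oldExpire via a GET and found it expired; it now calls getSet and wins
// only if the value it gets back is exactly the one it previously read.
func tryTakeExpiredLock(store map[string]int64, key string, oldExpire, newExpire int64) bool {
	return getSet(store, key, newExpire) == oldExpire
}

func main() {
	// Both A and B have just read the expired expiry value 100.
	store := map[string]int64{"lock": 100}
	aWins := tryTakeExpiredLock(store, "lock", 100, 200) // A's GetSet runs first
	bWins := tryTakeExpiredLock(store, "lock", 100, 201) // B now reads back 200, not 100
	fmt.Println(aWins, bWins) // true false
}
```

Even though both machines issue GetSet, only the first one sees the old expired value echoed back, so exactly one of them claims the lock.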

Some readers may find that wall of text hard to follow, so here is a diagram that should make things much clearer:


The specific implementation

Now that the principle is clear, let's get to the code. First, the lock-acquisition logic. The comments are quite detailed, so even readers who don't know Go should be able to follow along:

// GetDistributeLock tries to acquire the distributed lock. Two failure modes
// are handled:
// 1. Machine A acquires the lock but crashes or restarts before releasing it,
//    which would block every other machine. The lock therefore carries a
//    timeout, and an expired lock can be taken over.
// 2. When the lock has expired, two machines may try to take it over at the
//    same time. GETSET lets exactly one of them win; the other keeps waiting.
func GetDistributeLock(key string, expireTime int64) bool {
	currentTime := time.Now().Unix()
	expires := currentTime + expireTime
	redisAlias := "jointly"

	// 1. Try to acquire the lock, storing its expiry time as the value
	redisRet, err := redis.SetNx(redisAlias, key, expires)
	if nil == err && utils.MustInt64(1) == redisRet {
		// Lock acquired successfully
		return true
	}

	// 2. The holder may have crashed or restarted, so check the lock's
	// expiry; if it has expired, a new machine may take the lock over
	// 2.1 Read the lock's expiry time
	currentLockTime, err := redis.GetKey(redisAlias, key)
	if err != nil {
		return false
	}

	// 2.2 If the expiry time is greater than or equal to the current time,
	// the lock has not expired, so acquisition fails
	if utils.MustInt64(currentLockTime) >= currentTime {
		return false
	}

	// 2.3 Write the new expiry time into the lock and get the old one back
	oldLockTime, err := redis.GetSet(redisAlias, key, expires)
	if err != nil {
		return false
	}

	// 2.4 If the old value returned by GetSet equals the value read in 2.1,
	// no other machine slipped in between, so we own the lock. If A and B
	// race, A's GetSet runs first, so B's GetSet returns the expiry that A
	// just wrote, which no longer matches what B read
	if utils.MustString(oldLockTime) == currentLockTime {
		return true
	}
	return false
}


The utils.MustString() and utils.MustInt64() helpers simply coerce a value to the corresponding type; their implementations are omitted here.
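Since the article doesn't show those helpers, here is a minimal sketch of what such coercion functions might look like. The names match the calls above, but the bodies and fallback behavior (returning zero values on failure) are my assumptions:

```go
package main

import (
	"fmt"
	"strconv"
)

// MustInt64 coerces a value to int64, falling back to 0 on failure.
// (Sketch only: the article's real helper may behave differently.)
func MustInt64(v interface{}) int64 {
	switch t := v.(type) {
	case int64:
		return t
	case int:
		return int64(t)
	case string:
		n, err := strconv.ParseInt(t, 10, 64)
		if err != nil {
			return 0
		}
		return n
	default:
		return 0
	}
}

// MustString coerces a value to string. (Sketch only.)
func MustString(v interface{}) string {
	switch t := v.(type) {
	case string:
		return t
	case int64:
		return strconv.FormatInt(t, 10)
	default:
		return fmt.Sprintf("%v", t)
	}
}

func main() {
	fmt.Println(MustInt64("42"), MustString(int64(7)))
}
```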

Let’s look at the logic for deleting locks:

// DelDistributeLock deletes the distributed lock
// @return bool true - delete succeeded; false - delete failed
func DelDistributeLock(key string) bool {
	redisAlias := "jointly"
	redisRet := redis.Del(redisAlias, key)
	if redisRet != nil {
		return false
	}
	return true
}


Then there is the business processing logic:

func DoProcess(processId int) {
	fmt.Printf("Start thread %d\n", processId)

	redisKey := "redis_lock_key"
	for {
		// Try to acquire the distributed lock
		isGetLock := GetDistributeLock(redisKey, 10)
		if isGetLock {
			fmt.Printf("Get Redis Key Success, id:%d\n", processId)
			time.Sleep(time.Second * 3)
			// Release the distributed lock
			DelDistributeLock(redisKey)
		} else {
			// Lock not acquired: sleep briefly to avoid overloading Redis
			time.Sleep(time.Second * 1)
		}
	}
}


Finally, main() starts 10 goroutines, each running DoProcess():

func main() {
	// Initialize resources
	var group string = "i18n"
	var name string = "jointly_shop"
	var host string

	host = "http://ip:port"
	_, err := xrpc.NewXRpcDefault(group, name, host)
	if err != nil {
		panic(fmt.Sprintf("initRpc when init rpc failed, err:%v", err))
	}
	redis.SetRedis("jointly", "redis_jointly")

	// Start 10 goroutines to compete for the Redis distributed lock
	for i := 0; i <= 9; i++ {
		go DoProcess(i)
	}

	// Sleep for a while so the main goroutine doesn't exit and kill the rest
	time.Sleep(time.Second * 100)
	return
}


Over the 100-second run, only one goroutine holds the lock at any given moment; in this run, ids 2, 1, 5, 9, and 3 each held it in turn:

Start thread 0

Start thread 6

Start thread 9

Start thread 4

Start thread 5

Start thread 2

Start thread 1

Start thread 8

Start thread 7

Start thread 3

Get Redis Key Success, id:2

Get Redis Key Success, id:2

Get Redis Key Success, id:1

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:5

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:9

Get Redis Key Success, id:3

Get Redis Key Success, id:3

Get Redis Key Success, id:3

Get Redis Key Success, id:3

Get Redis Key Success, id:3


Pitfalls

I hit a few pitfalls along the way, so let me briefly share them:

  • We once migrated a service from physical machines to Neo cloud. After the traffic moves over, don't forget to stop the scheduled tasks on the physical machines, or they will keep competing for the distributed lock. This is especially dangerous once the code has changed: if a physical machine grabs the lock, it keeps executing the old code, which is a big pit.
  • Don't casually change the timeout of a distributed lock. We once changed it to speed up troubleshooting, and it caused a very strange problem that took a whole day to track down. I no longer remember the details; if you're curious, you can try to reproduce it yourself.

Afterword

I actually wrote this distributed lock back in 2019, and it has been running in production for two years. With minor modifications it can go straight online, and you don't need to worry about hidden pits, because I've already stepped on them for you.

Last week I wrote an article on rate limiting; together with today's distributed lock, both are things I've used in recent projects, so I tidied them up. What I really want to write about is task sharding, a new skill I picked up while working on my company's asynchronous tasks: it lets multiple machines execute a single task concurrently. Isn't that neat? I'll share it with you later.

Looking at the time, it's already early morning. Off to sleep ~~

If you found this useful, please give it a like. For more articles, follow my WeChat public account "Lou Zai advanced road" so you don't get lost ~~