
Introduction to this Article

1. Introduce the business scenarios

2. Meet the distributed lock family members

3. Analyze how each family member implements distributed locking

4. Final conclusion

2019 is over!

2020 has arrived!


Let’s start with a scene:

Recently, the boss landed a big deal: our APP will come pre-installed on a manufacturer's terminal devices — at least hundreds of thousands, possibly millions, of them. The APP was designed from a competitive analysis of the market and built by a handful of programmers; its main function is one-stop online shopping, and the backend can grant permissions to various merchants so they can maintain the information of the goods they sell.

Boss O: "That deal wasn't easy to land. Next, think about how to attract more users to register on the terminal devices and how to guide them to buy. P is in charge of this part; it needs to happen as soon as possible — I'm off on a business trip tomorrow!"

Product manager P: "Heh heh~" — eyes lighting up, he immediately thought: "This is simple. At the very least we can put an activity page on the home page…"

Tech lead T quickly understood the product requirements. Developer J is mainly in charge of this part and has already arranged the activity page with the front-end and back-end teammates.

Business scenario 1:

Because T had just taken over the project, he was still getting familiar with the code and the deployment architecture. While reading the code he found a potential problem in the ordering logic: the service is deployed on multiple nodes, so if several users buy the same product at the same time, its inventory may be oversold (the data becomes inconsistent), and the code does nothing to guard against this.

It turned out that in the past they had only sold virtual goods with no inventory, so nobody had considered this…

This time is different: physical goods are being sold, so the word "inventory" now matters. At the very least, the quantity sold must not exceed the configured stock.

T stared wide-eyed at the screen, holding his breath. Fortunately the problem was found in advance; it had to be fixed quickly — selling at a loss instead of a profit would drive the boss crazy.

Business Scenario 2 appears:

One of T's teammates was running load tests and found a small problem. Because we cooperate closely with the "goose factory" (Tencent) on the terminal devices, we must obtain an access_token when calling their interface; the access_token expires after 2 hours and has to be fetched again after expiration.

During the load test, several different access_tokens showed up in the logs around the expiration time. This is because the service is also distributed, and multiple nodes initiated the third-party interface request at the same time.

Although keeping whichever access_token was obtained last has no harmful side effect, it causes multiple unnecessary calls to the third-party interface and repeated, wasted access_token fetches (duplicate work) within a short period.

Business Scenario 3 appears:

After an order is placed, warehouse logistics must be notified as well. Once the user completes payment, the payment callback may send the same order message to MQ several times, and the warehouse service consumes order messages from MQ. The order messages therefore must be de-duplicated and processed idempotently.

The scenarios above should make clear why a distributed lock is needed to solve these problems.

All of them boil down to the same thing: operations on a shared resource must be serialized so they run safely and correctly.


At this point, the Synchronized keyword, ReentrantLock, ReentrantReadWriteLock… provided by Java can only guarantee thread safety for shared resources among threads within a single JVM process; in a distributed system environment they are all useless. How discouraging.

For this problem we have to turn to the distributed lock family for support. Word is that the family has many members, each implementing the distributed lock function in its own way — so let's start exploring.



Why do we need distributed locks to solve this problem?

Listen to what Martin has to say:

Martin Kleppmann, a distributed-systems researcher at the University of Cambridge in the UK, had a heated debate with Antirez, the father of Redis, about whether RedLock (a distributed lock algorithm implemented on Redis) is safe.

What exactly did they argue about? That alone could fill its own article.

Please read Martin's blog post for yourself:

https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

Efficiency:

Using a distributed lock prevents multiple clients from repeating the same work and wasting resources. For example, without one, a user might receive multiple SMS or email reminders after completing a payment.

For example, in business scenario 2, the access_token is obtained repeatedly.

Here the operations on the shared resource are idempotent: no matter how many times they run, the result is the same. The lock essentially avoids repeating work on the shared resource, improving efficiency.

Correctness:

Using a distributed lock also prevents concurrent interference that would break correctness — causing data inconsistency, data loss, or other serious problems.

For example, business scenario 1: the inventory oversell problem.

Here the operations on the shared resource are non-idempotent: multiple clients operating on it concurrently may leave the data inconsistent.


What are the characteristics of distributed locks?

The following are common characteristics of distributed locks. Not every member of the distributed lock family satisfies all of them; the implementation mechanisms differ. A sketch of how they map onto an interface follows the list.

Mutual exclusion: Distributed locks are mutually exclusive between multiple clients.

Reentrancy: the same thread on the same client can acquire the lock multiple times.

Lock timeout: supports lock expiration like a local lock, preventing deadlocks.

Non-blocking: supports a non-blocking tryLock(), just as ReentrantLock does.

Fair and unfair locks: a fair lock grants the lock in the order it was requested; an unfair lock does not, which usually performs better.
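To make these characteristics concrete, here is a hypothetical minimal interface sketch — it is not taken from any particular framework, though Redisson's RLock and Curator's InterProcessLock cover similar ground:

import java.util.concurrent.TimeUnit;

// A hypothetical interface capturing the characteristics above.
public interface DistributedLock {

    void lock();                                  // blocks until acquired; mutually exclusive across clients

    boolean tryLock();                            // non-blocking attempt, returns immediately

    boolean tryLock(long waitTime, TimeUnit unit) // bounded blocking attempt
            throws InterruptedException;

    void lock(long leaseTime, TimeUnit unit);     // acquire with a lease (expiration) to prevent deadlocks

    void unlock();                                // reentrant locks must be released once per acquisition
}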


Introducing the distributed lock family implementers

Distributed lock family implementers at a glance:

The mind map gives a simple classification — not necessarily precise — of almost every component that implements a distributed lock.

Let them introduce themselves:

1. Database

Exclusive lock (pessimistic lock): based on SQL statements like SELECT * FROM table WHERE xx = yy FOR UPDATE. It has many defects and is generally not recommended; more on it later.

Optimistic lock (version number or timestamp): based on statements like UPDATE xx SET version = new WHERE id = y AND version = old. If the update fails, the client reads the latest version number (or timestamp) and tries the update again. This mechanism is similar to CAS.

2. Redis

Features: AP in the CAP model | no consensus algorithm | good performance

This is recommended if you happen to use Redis in your project and do not want to introduce additional distributed lock components.

Several ready-made frameworks support distributed locking, such as Redisson and spring-integration-redis, or you can use the raw Redis SETNX command.

In addition, you can implement distributed locking yourself using Redis commands and the atomicity of Redis Lua scripts.

3. Zookeeper

Features: CP in the CAP model | ZAB consensus algorithm | good stability

Commonly used in development; recommended if you already run a ZK cluster in your project.

The Apache Curator framework provides ready-made distributed locking functionality, which is recommended to use directly.

In addition, distributed locks can be implemented from scratch on top of Zookeeper's features using the native Zookeeper API.

4. Others

Chubby: a coarse-grained distributed lock service developed by Google, not open source. Google has published papers and related documents about it for further reading; search for them if interested — no further discussion here.

Tair: Alibaba's open-source distributed KV storage solution. I have not used it, so it is not discussed further.

Etcd: CP in the CAP model, implemented with the Raft consensus algorithm. Not used here, not discussed further.

Hazelcast: an open-source in-memory data grid project providing flexible and scalable distributed in-memory computing, widely recognized as a strong solution for improving application performance and scalability.

Of course, the commonly recommended distributed locks above — Zookeeper and Redis — must be weighed against your specific business scenario to achieve the desired effect; their principles differ greatly.

Voice-over: use the one you know — the one whose principle you actually understand.


Database pessimistic lock implementation

Operating with a "pessimistic mindset": a resource that cannot be locked causes the caller to block and wait.

1. There is a resource lock table

CREATE TABLE `resource_lock` (
  `id` int(4) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `resource_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'lock resource name',
  `owner` varchar(64) NOT NULL DEFAULT '' COMMENT 'lock owner',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'description',
  `update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_resource_name` (`resource_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='lock table';

resource_name, the lock resource name, must have a unique index.

2. Usage

A transaction is required: the query and the update must execute atomically within a single transaction.

Pseudo-code implementation:

@Transactional
public void lock(String name) {
    ResourceLock rlock = exeSql("select * from resource_lock where resource_name = name for update");
    if (rlock == null) {
        exeSql("insert into resource_lock(resource_name, owner, count) values (name, 'ip', 0)");
    }
}

The resource is locked with FOR UPDATE. If the lock is acquired, the statement returns immediately, the row is inserted if absent, and the rest of the business logic executes until the transaction commits, releasing the lock. Any other client that tries to lock the same resource stays blocked.

You can observe this with a database client tool: execute the FOR UPDATE in one session without committing the transaction, then execute a FOR UPDATE with the same condition in another session — it just hangs there, spinning in circles…

Distributed locking is also possible, but there are performance bottlenecks.

3. Pros and cons of the pessimistic lock

Advantages: Easy to use, easy to understand, ensure data consistency.

The disadvantages are many; to list a few:

1) At the RR (REPEATABLE READ) isolation level, SELECT … FOR UPDATE is implemented with gap locks. It is a pessimistic locking approach, so blocking is inherent.

2) Under high concurrency, a large number of requests queue up, affecting database stability and consuming CPU and other resources of the service.

When a client waits too long for the lock, MySQL reports:

[40001][1205] Lock wait timeout exceeded; try restarting transaction

In the case of high concurrency, too many application threads are occupied and services cannot respond properly.

3) If the thread that acquired the lock first does not release the lock for some reason, a deadlock may occur.

4) If the lock is not released for a long time, it will occupy the database connection and may burst the database connection pool, affecting other services.

5) MySQL performs query optimization: even when an index is used, if the optimizer decides a full table scan is more efficient, it may escalate the row lock to a table lock — which is even worse.

6) Reentrancy is not supported, and the lock wait timeout is global and cannot be changed per call.


Database optimistic lock implementation

An optimistic lock operates on the shared resource with an "optimistic attitude": if it fails to acquire the lock, no harm done — it can try again later, or give up after a certain number of attempts and finish the work some other time. The caller is never blocked.

1. There is a resource table

Add a field to the table — either a version number or a timestamp. The version number or timestamp guarantees ordering and correctness when multiple threads operate on the shared resource concurrently.

CREATE TABLE `resource` (
  `id` int(4) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `resource_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'resource name',
  `share` varchar(64) NOT NULL DEFAULT '' COMMENT 'status',
  `version` int(4) NOT NULL DEFAULT 0 COMMENT 'version number',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'description',
  `update_time` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`),
  KEY `uidx_resource_name` (`resource_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='resource table';

2. Usage

Pseudo-code implementation:

Resource resource = exeSql("select * from resource where resource_name = xxx");
boolean succ = exeSql("update resource set version = 'newVersion' ... where resource_name = xxx and version = 'oldVersion'");
if (!succ) {
    // initiate a retry
}

In real code you would write a while loop: if the update fails, re-read the latest version number and retry until the update succeeds, as in the sketch below.
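A minimal sketch of that retry loop — ResourceDao, findByName, and updateVersion are hypothetical helpers standing in for your data-access layer:

// Retry a CAS-style conditional update until it succeeds or we give up.
public boolean lockWithRetry(ResourceDao dao, String resourceName, int maxRetries) {
    for (int i = 0; i < maxRetries; i++) {
        // Read the row to get the current version
        Resource r = dao.findByName(resourceName);
        // Conditional update: succeeds only if nobody changed the version in between
        int updated = dao.updateVersion(resourceName, r.getVersion(), r.getVersion() + 1);
        if (updated == 1) {
            return true; // we won the race
        }
        // Lost the race: loop again with a freshly read version
    }
    return false; // give up after maxRetries attempts
}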

3. Pros and cons of the optimistic lock

Advantages: Easy to use, ensure data consistency.

Disadvantages:

1) The conditional update still briefly acquires a row lock, which carries some performance overhead.

2) In high-concurrency scenarios, the spinning retries consume a certain amount of CPU.

Besides a version number or timestamp, a state transition can play the same role, for example: update table set state = 1 where id = xxx and state = 0;

Optimistic locking is similar to the CAS (Compare-And-Swap) update mechanism.
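For comparison, here is the in-memory analogue using Java's AtomicInteger — illustration only, since CAS on a shared variable works within one JVM, not across a distributed system:

import java.util.concurrent.atomic.AtomicInteger;

public class CasAnalogy {
    public static void main(String[] args) {
        AtomicInteger version = new AtomicInteger(0);

        int old = version.get();                                // "select version from resource ..."
        boolean success = version.compareAndSet(old, old + 1);  // "update ... where version = old"
        if (!success) {
            // Someone else updated first: re-read and retry, just like the SQL retry loop.
        }
        System.out.println("updated: " + success + ", version = " + version.get());
    }
}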



Distributed lock implementation based on Redis

Distributed lock based on SetNX

Of all the implementations, the Redis-based distributed lock offers the best performance but is also the most complex to get right.

RedLock, mentioned earlier, is the "robust" distributed lock algorithm proposed by Antirez, the father of Redis, but it is also controversial and generally not recommended.

Before Redis 2.6.12, distributed locks were implemented with setnx + expire, as in the following example:

public static boolean lock(Jedis jedis, String lockKey, String requestId, int expireTime) {
    // Set the lock
    Long result = jedis.setnx(lockKey, requestId);
    if (result == 1) {
        // Lock acquired.
        // If the program crashes right here, the expiration time is never set
        // and a deadlock occurs.
        // The lock is released later via the expiration time.
        jedis.expire(lockKey, expireTime);
        return true;
    }
    return false;
}

If lockKey already exists, failure is returned; otherwise success. expire() sets an expiration time so the lock is released after the synchronized code finishes; otherwise the lock could never be released for the next thread.

However, setnx + expire are two separate commands issued by the program — not an atomic operation — which makes accidents likely.

If the program crashes after the lock is set but before the expiration time is set, the lockKey never expires and a deadlock occurs.

There are two ways to solve the above problems:

1) Method 1: Lua script

We can make setting the lock and its expiration time atomic with a Lua script, run through the jedis.eval() method:

// Lock script: KEYS[1] is the lock key, ARGV[1] is a random UUID value,
// ARGV[2] is the expiration time in milliseconds
private static final String SCRIPT_LOCK =
    "if redis.call('setnx', KEYS[1], ARGV[1]) == 1 then " +
    "redis.call('pexpire', KEYS[1], ARGV[2]) return 1 else return 0 end";

// Unlock script: KEYS[1] is the lock key, ARGV[1] is the random value set when locking
private static final String SCRIPT_UNLOCK =
    "if redis.call('get', KEYS[1]) == ARGV[1] then " +
    "return redis.call('del', KEYS[1]) else return 0 end";
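A usage sketch of running the lock script atomically via jedis.eval(); the tryLockByScript helper name is ours, not Jedis's:

import java.util.Arrays;
import java.util.Collections;
import redis.clients.jedis.Jedis;

// The whole check-and-set runs atomically inside Redis via EVAL.
public static boolean tryLockByScript(Jedis jedis, String lockKey, String requestId, long expireMillis) {
    Object result = jedis.eval(SCRIPT_LOCK,
            Collections.singletonList(lockKey),                        // KEYS[1]
            Arrays.asList(requestId, String.valueOf(expireMillis)));   // ARGV[1], ARGV[2]
    return Long.valueOf(1L).equals(result);
}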

2) Method 2: the SET native command

Since Redis 2.6.12, the SET command accepts NX and expiration parameters, combining SETNX and PEXPIRE atomically:

SET lockKey anystring NX PX max-lock-time

The program code is as follows:

public static boolean lock(Jedis jedis, String lockKey, String requestId, int expireTime) {
    // NX: only set if the key is absent; PX: expiration in milliseconds
    String result = jedis.set(lockKey, requestId, "NX", "PX", expireTime);
    return "OK".equals(result);
}

Although the SET approach guarantees that the lock and its expiration time are set atomically, if the expiration time is set too short and the business takes longer to execute, the lock expires prematurely. Another client can then acquire the same lock and execute the same business concurrently — and problems may follow.

We need an expiration time long enough to avoid this, but how long is reasonable depends on the business. And if other clients must block until they get the lock, you have to design a polling/timeout waiting mechanism yourself (see the sketch below), which is rather troublesome.
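A minimal sketch of such a wait loop, built on the lock() method above — the polling interval and deadline handling are illustrative, not prescriptive:

// Blocking acquire with a timeout, on top of the non-blocking lock() above.
public static boolean lockWithTimeout(Jedis jedis, String lockKey, String requestId,
                                      int expireTime, long waitMillis) throws InterruptedException {
    long deadline = System.currentTimeMillis() + waitMillis;
    while (System.currentTimeMillis() < deadline) {
        if (lock(jedis, lockKey, requestId, expireTime)) {
            return true;   // got the lock within the waiting window
        }
        Thread.sleep(100); // polling interval; tune to your business
    }
    return false;          // timed out without acquiring the lock
}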


The Spring Enterprise integration pattern implements distributed locking

Besides using the Jedis client directly, you can use the Enterprise Integration Patterns framework provided by Spring, which offers many distributed lock options. Spring provides a unified distributed lock abstraction; the concrete implementations currently supported are:

  • Gemfire

  • Jdbc

  • Zookeeper

  • Redis

In the early days, the code for distributed locks existed in the Spring Cloud Cluster, a sub-project of Spring Cloud, and was later moved to Spring Integration.

Spring Integration project address: https://github.com/spring-projects/spring-integration

This is where Spring's power lies: a single global abstraction over distributed Lock implementations.

The abstract structure is as follows:

LockRegistry as the top-level abstract interface:

/**
 * Strategy for maintaining a registry of shared locks.
 *
 * @author Oleg Zhurakousky
 * @author Gary Russell
 * @since 2.1.1
 */
@FunctionalInterface
public interface LockRegistry {

    /**
     * Obtains the lock associated with the parameter object.
     * @param lockKey The object with which the lock is associated.
     * @return The associated lock.
     */
    Lock obtain(Object lockKey);
}

The obtain() method defined here returns the concrete Lock implementation class, created in the corresponding XxxLockRegistry implementation.

In Spring Boot 2.x (Spring 5), obtain() is implemented with the SET + PEXPIRE commands combined in a Lua script; in Spring Boot 1.x (Spring 4) it is done with the SETNX command.

In ZookeeperLockRegistry, the Lock returned by obtain() is a ZkLock, implemented internally on the Apache Curator framework.

In JdbcLockRegistry, the Lock returned by obtain() is a JdbcLock, implemented internally on an INT_LOCK database lock table operated through a JdbcTemplate.

Client usage:

private final String registryKey = "sb2";

RedisLockRegistry lockRegistry = new RedisLockRegistry(getConnectionFactory(), this.registryKey);
Lock lock = lockRegistry.obtain("foo");
lock.lock();
try {
    // doSth...
} finally {
    lock.unlock();
}

The following is the latest version of the implementation to explain the specific process of locking and unlocking.

RedisLockRegistry$RedisLock lock()

Locking steps:

1) The lockKey is registryKey:path (sb2:foo in this example); client C1 applies for the lock first.

2) The Lua script runs: GET lockKey finds nothing, so SET lockKey succeeds. The value is the clientId (a UUID) and the default expiration time is 60 seconds.

3) If the same thread of client C1 locks again, the script runs PEXPIRE lockKey, resetting the expiration time to 60 seconds.

4) Client C2 applies for the lock and runs the Lua script: GET lockKey shows the key exists with a clientId different from C2's, so locking fails.

5) Client C2 blocks and retries the lock every 100 ms.

RedisLock#lock()

You can check the steps above against the source code:

@Override
public void lock() {
    this.localLock.lock();
    while (true) {
        try {
            while (!obtainLock()) {
                Thread.sleep(100); //NOSONAR
            }
            break;
        }
        catch (InterruptedException e) {
            /*
             * This method must be uninterruptible so catch and ignore
             * interrupts and only break out of the while loop when
             * we get the lock.
             */
        }
        catch (Exception e) {
            this.localLock.unlock();
            rethrowAsLockException(e);
        }
    }
}

private boolean obtainLock() {
    Boolean success =
        RedisLockRegistry.this.redisTemplate.execute(RedisLockRegistry.this.obtainLockScript,
            Collections.singletonList(this.lockKey),
            RedisLockRegistry.this.clientId,
            String.valueOf(RedisLockRegistry.this.expireAfter));
    boolean result = Boolean.TRUE.equals(success);
    if (result) {
        this.lockedAt = System.currentTimeMillis();
    }
    return result;
}

Lua script:

private static final String OBTAIN_LOCK_SCRIPT =
    "local lockClientId = redis.call('GET', KEYS[1])\n" +
    "if lockClientId == ARGV[1] then\n" +
    "  redis.call('PEXPIRE', KEYS[1], ARGV[2])\n" +
    "  return true\n" +
    "elseif not lockClientId then\n" +
    "  redis.call('SET', KEYS[1], ARGV[1], 'PX', ARGV[2])\n" +
    "  return true\n" +
    "end\n" +
    "return false";

RedisLockRegistry$RedisLock unlock()

RedisLock#unlock()

@Override
public void unlock() {
    if (!this.localLock.isHeldByCurrentThread()) {
        throw new IllegalStateException("You do not own lock at " + this.lockKey);
    }
    if (this.localLock.getHoldCount() > 1) {
        this.localLock.unlock();
        return;
    }
    try {
        if (!isAcquiredInThisProcess()) {
            throw new IllegalStateException("Lock was released in the store due to expiration. " +
                    "The integrity of data protected by this lock may have been compromised.");
        }
        if (Thread.currentThread().isInterrupted()) {
            RedisLockRegistry.this.executor.execute(this::removeLockKey);
        }
        else {
            removeLockKey();
        }
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Released lock; " + this);
        }
    }
    catch (Exception e) {
        ReflectionUtils.rethrowRuntimeException(e);
    }
    finally {
        this.localLock.unlock();
    }
}

// Remove the lock key from Redis
private void removeLockKey() {
    if (this.unlinkAvailable) {
        try {
            RedisLockRegistry.this.redisTemplate.unlink(this.lockKey);
        }
        catch (Exception ex) {
            LOGGER.warn("The UNLINK command has failed (not supported on the Redis server?); " +
                    "falling back to the regular DELETE command", ex);
            this.unlinkAvailable = false;
            RedisLockRegistry.this.redisTemplate.delete(this.lockKey);
        }
    }
    else {
        RedisLockRegistry.this.redisTemplate.delete(this.lockKey);
    }
}

unlock() does not remove the key by calling the Redis DEL command directly. This is another optimization in Spring Boot 2.x: the UNLINK command, available in Redis 4.0 and above, is used instead.

In other words, the latest version of this distributed lock implementation is best used with Redis 4.0 or later (the code falls back to DEL on older servers).

Take a look at the explanation on Redis’s website:

This command is very similar to DEL: it removes the specified keys. Just like DEL a key is ignored if it does not exist. However the command performs the actual memory reclaiming in a different thread, so it is not blocking, while DEL is. This is where the command name comes from: the command just unlinks the keys from the keyspace. The actual removal will happen later asynchronously.

DEL always releases the value part in a blocking way. If the value is large — say a big list or hash with many allocations — it can block Redis for a long time. To solve this, Redis implemented the UNLINK command, a "non-blocking" delete. For small values, DEL is generally as efficient as UNLINK.

In essence, this lock is implemented with SETNX semantics; Spring adds only a thin layer of encapsulation on top, supporting reentrancy, timeout waiting, and interruptible locking.

However, there is a problem: the lock expiration time cannot be set flexibly per lock. It can be set when the RedisLockRegistry is created during client initialization — but it is global.

/**
 * Constructs a lock registry with the supplied lock expiration.
 * @param connectionFactory The connection factory.
 * @param registryKey The key prefix for locks.
 * @param expireAfter The expiration in milliseconds.
 */
public RedisLockRegistry(RedisConnectionFactory connectionFactory, String registryKey, long expireAfter) {
    Assert.notNull(connectionFactory, "'connectionFactory' cannot be null");
    Assert.notNull(registryKey, "'registryKey' cannot be null");
    this.redisTemplate = new StringRedisTemplate(connectionFactory);
    this.obtainLockScript = new DefaultRedisScript<>(OBTAIN_LOCK_SCRIPT, Boolean.class);
    this.registryKey = registryKey;
    this.expireAfter = expireAfter;
}

Because the expireAfter parameter is global, problems can arise: the lock may expire before the business finishes, be acquired by another client, and cause trouble.

Having analyzed the source code, we could borrow from the RedisLockRegistry implementation and wrap our own distributed lock, for example:

1) Allow the expiration time to be set per key instead of globally.

2) While the business is still running, have the current client run a scheduled task that detects this and automatically extends the expiration time.

Do it yourself? Too much trouble? No worries — the industry already has a ready-made implementation: the Redisson framework, discussed in the Redisson section below.


Look at it from the Redis cluster perspective

From the perspective of the Redis architecture there is still a problem: data in a Redis cluster is replicated to other nodes asynchronously. If the master crashes after a lock is acquired but before it is replicated, the lock does not exist on the newly elected master, so another client can still acquire it — and multiple application services hold the lock at the same time.

Based on the above considerations, Antirez, the father of Redis, proposed a RedLock algorithm.

Analysis of RedLock algorithm implementation process:

Assume the deployment is a Redis cluster with a total of 5 master nodes; a lock is acquired through the following steps (a simplified sketch follows the list):

1) Get the current timestamp in milliseconds

2) Try to acquire the lock on each master node in turn, with a short per-node timeout, usually tens of milliseconds

3) The lock must be acquired on a majority of nodes, e.g. 3 out of 5 (n/2 + 1)

4) The client computes the time spent acquiring the lock; if it is less than the lock's validity time, the acquisition succeeded

5) If acquisition fails, delete the lock from each node in turn

6) Once one client holds the distributed lock, other clients can only poll and keep trying to acquire it
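To make the steps concrete, here is a highly simplified sketch of the idea — illustration only, not the real RedLock implementation; it omits per-node timeouts, clock-drift allowance, and atomic compare-and-delete unlocking:

import java.util.List;
import redis.clients.jedis.Jedis;

public class RedLockSketch {

    // Step 2: try to lock one independent master with SET NX PX
    private static boolean tryLockOnNode(Jedis node, String key, String value, int ttlMillis) {
        return "OK".equals(node.set(key, value, "NX", "PX", ttlMillis));
    }

    // Step 5 helper: sketch only — a real unlock must compare-and-delete atomically (Lua)
    private static void unlockOnNode(Jedis node, String key, String value) {
        if (value.equals(node.get(key))) {
            node.del(key);
        }
    }

    public static boolean redLock(List<Jedis> masters, String key, String value, int ttlMillis) {
        long start = System.currentTimeMillis();              // step 1
        int acquired = 0;
        for (Jedis node : masters) {
            if (tryLockOnNode(node, key, value, ttlMillis)) { // step 2
                acquired++;
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        // steps 3 + 4: majority of nodes locked, and still within the validity window
        if (acquired >= masters.size() / 2 + 1 && elapsed < ttlMillis) {
            return true;
        }
        for (Jedis node : masters) {                          // step 5: release everything
            unlockOnNode(node, key, value);
        }
        return false;
    }
}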

As hinted above, deeper analysis shows the RedLock algorithm may still have problems — and this is exactly the point of contention between Martin and Antirez.

Problem 1: a node crashes and restarts

When a node crashes and restarts, multiple clients can end up holding the same lock.

Suppose there are five Redis nodes: A, B, C, D, and E. Imagine the following sequence of events:

1) Client C1 successfully locks nodes A, B, and C in Redis cluster (but D and E are not locked).

2) Node C crashes and restarts, but the lock client C1 set on it was not persisted and is lost.

3) After node C restarts, client C2 successfully locks C, D, and E.

And so, tragedy! Clients C1 and C2 both hold the same distributed lock.

To deal with the lock failure caused by node restart, Antirez proposes the concept of delayed restart. That is, after a node crashes, it is not restarted immediately, but restarted after a period of time, which is longer than the lock validity period.

In this way, locks that the node participates in will expire until the restart, and it will not affect existing locks after the restart.

This is also through artificial compensation measures to reduce the probability of the occurrence of inconsistency.

Problem 2: Clock jumping

Suppose there are five Redis nodes: A, B, C, D, and E. Imagine the following sequence of events:

1) Client C1 successfully locks nodes A, B, and C in Redis cluster. However, the communication with D and E fails due to network problems.

2) The clock on node C jumps forward, causing locks maintained on it to expire quickly.

3) Client C2 successfully adds the same lock to nodes C, D, and E in Redis cluster.

At this point, tragedy again! Clients C1 and C2 both hold the same distributed lock.

To deal with lock failures caused by clock jumps, Antirez proposed forbidding manual changes to the system time and using an ntpd setup that never makes the clock "jump". This, again, reduces the probability of inconsistency through manual, operational safeguards.

But… the RedLock algorithm still does not solve the problem of an operation on the shared resource outliving the lock timeout.

The implementation of such a controversial algorithm is not recommended.

In general, the framework introduced in this article provides a distributed lock implementation that meets most of the requirements.

Summary:

Above, we have carried out an in-depth analysis of the implementation principle of Spring-Integration-Redis, and also analyzed the controversial issues of RedLock.

We also mentioned the spring-integration distributed locks backed by Jdbc, Zookeeper, and Gemfire; Gemfire and Jdbc are left for you to explore.

Why provide a Jdbc distributed lock implementation?

Consider this: your application has low concurrency — a back-office service, say — and depends only on a database, without Zookeeper, Redis, or other additional components.

In that case spring-integration-jdbc is worth a look; it only requires creating a lock table:

CREATE TABLE INT_LOCK (
    LOCK_KEY CHAR(36) NOT NULL,
    REGION VARCHAR(100) NOT NULL,
    CLIENT_ID CHAR(36),
    CREATED_DATE DATETIME(6) NOT NULL,
    constraint INT_LOCK_PK primary key (LOCK_KEY, REGION)
) ENGINE=InnoDB;

The concrete implementation logic is also very simple — go read it; a usage sketch follows.
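A usage sketch under stated assumptions — a configured DataSource pointing at the table above, using spring-integration-jdbc's DefaultLockRepository and JdbcLockRegistry (normally wired as Spring beans):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import javax.sql.DataSource;
import org.springframework.integration.jdbc.lock.DefaultLockRepository;
import org.springframework.integration.jdbc.lock.JdbcLockRegistry;

public void runWithJdbcLock(DataSource dataSource) throws Exception {
    DefaultLockRepository repository = new DefaultLockRepository(dataSource);
    repository.afterPropertiesSet();              // done by the container when wired as a bean
    JdbcLockRegistry registry = new JdbcLockRegistry(repository);

    Lock lock = registry.obtain("foo");
    if (lock.tryLock(5, TimeUnit.SECONDS)) {      // bounded wait instead of blocking forever
        try {
            // critical section
        } finally {
            lock.unlock();
        }
    }
}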

The distributed lock backed by Zookeeper is implemented on top of the Curator framework; it is not expanded on in this section and is analyzed below.


Distributed lock based on Redisson

Redisson is a Java client for Redis; its API provides comprehensive support for Redis commands.

Jedis uses plain blocking I/O to interact with Redis, while Redisson supports non-blocking I/O via Netty.

Redisson encapsulates the Lock implementation so we can use it as if operating a local lock, and it also provides friendly wrappers for collections, objects, common caching frameworks, and so on.

So far the project has 11.8K stars on GitHub, which shows this open-source project is worth following and using.

Redisson distributed lock Github:

https://github.com/redisson/redisson/wiki/8.-Distributed-locks-and-synchronizers

Redisson can easily support a variety of Redis deployment architectures:

1) Single Redis instance

2) Master-slave + Sentinel

3) Redis Cluster

Config config = new Config();
MasterSlaveServersConfig serverConfig = config.useMasterSlaveServers()
        .setMasterAddress("")
        .addSlaveAddress("")
        .setReadMode(ReadMode.SLAVE)
        .setMasterConnectionPoolSize(maxActiveSize)
        .setMasterConnectionMinimumIdleSize(maxIdleSize)
        .setSlaveConnectionPoolSize(maxActiveSize)
        .setSlaveConnectionMinimumIdleSize(maxIdleSize)
        .setConnectTimeout(CONNECTION_TIMEOUT_MS) // 10 seconds by default
        .setTimeout(socketTimeout);
RedissonClient redisson = Redisson.create(config);

RLock lock = redisson.getLock("myLock");
// Acquire the lock
lock.lock();
// Acquire the lock with a 10-second lease
lock.lock(10, TimeUnit.SECONDS);
// Wait up to 100 seconds to acquire; auto-release after a 10-second lease
boolean res = lock.tryLock(100, 10, TimeUnit.SECONDS);
if (res) {
    try {
        // ...
    } finally {
        lock.unlock();
    }
}

Usage is very simple. The RedissonClient provides many interface implementations: reentrant locks, fair locks, read-write locks, lock timeouts, RedLock, and more all come complete.

The lock() locking process:

To stay compatible with older Redis versions, Redisson executes Redis commands through Lua scripts, which also keeps the operations atomic.

The Lua script executed when locking stores the lock in a Redis Hash structure; a reconstructed sketch follows the parameter description below.

Parameter Description:

KEYS[1]: the name of the key to lock, e.g. myLock in the example.

ARGV[1]: the expiration time set for the locked key.

ARGV[2]: the Hash field name, lockName = UUID:threadId

protected String getLockName(long threadId) {
    return id + ":" + threadId;
}
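The script figure is missing from this copy. Based on the parameter description above, the locking script in Redisson's RedissonLock looked roughly like this — a reconstruction shown the way the article embeds Lua in Java; treat it as a sketch, since the exact script varies across Redisson versions:

private static final String LOCK_SCRIPT =
    "if (redis.call('exists', KEYS[1]) == 0) then " +           // lock key absent: acquire it
    "  redis.call('hset', KEYS[1], ARGV[2], 1); " +             // hash field = lockName, value = hold count
    "  redis.call('pexpire', KEYS[1], ARGV[1]); " +
    "  return nil; " +
    "end; " +
    "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " + // same lockName: reentrant acquire
    "  redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
    "  redis.call('pexpire', KEYS[1], ARGV[1]); " +
    "  return nil; " +
    "end; " +
    "return redis.call('pttl', KEYS[1]);";                      // held by someone else: return remaining TTL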

1) Client C1 applies for the lock; the key is myLock.

2) If the key does not exist, set the value via HSET and the expiration via PEXPIRE, and start the WatchDog task: by default it checks every 10 seconds and, if the key still exists, resets its expiration to 30 seconds.

(Figure: WatchDog start-up source code, omitted here.)

3) The same thread of client C1 locks again: the key exists and the lockName in the Redis Hash matches the current thread's lockName, so the value under that lockName is incremented by 1 — reentrant locking is supported.

4) Client C2 applies for the lock: the key exists but the lockName in the Redis Hash differs from the current thread's lockName, so the script executes PTTL and returns the remaining time-to-live.

5) Client C2's thread keeps retrying after the PTTL interval. This is implemented with a Semaphore: if a permit is available it returns immediately; otherwise, once the PTTL time has passed without one, it tries again.

(Figure: lock retry source code, omitted here.)

An implementation like Redisson's solves the problem of the business taking longer than the lock's expiration time.

Redisson also extends the Lock interface with its own RLock interface, adding things like setting the key expiration time and non-blocking acquisition with timeouts:

void lock(long leaseTime, TimeUnit unit);

boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException;

The WatchDog logic in Redisson ensures that no deadlocks occur.

If the client goes down, the WatchDog task stops. In this case, the expiration time of the Key is not reset. When the expiration time of the Key held by the hanging client expires, the lock is automatically released, and other clients try to obtain the lock.

You can further read the description of WatchDog on the official website:

If Redisson instance which acquired lock crashes then such lock could hang forever in acquired state. To avoid this Redisson maintains lock watchdog, it prolongs lock expiration while lock holder Redisson instance is alive. By default lock watchdog timeout is 30 seconds and can be changed through Config.lockWatchdogTimeout setting.

The unlock() process is similar: a Lua script executes a batch of commands.

Unlock Lua script (the original figure is omitted; a reconstruction follows):
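This reconstruction is likewise a sketch from Redisson's sources of that era — here KEYS[2] is the pub/sub channel used to wake up waiters, ARGV[1] the unlock message, ARGV[2] the lease time, and ARGV[3] the lockName:

private static final String UNLOCK_SCRIPT =
    "if (redis.call('hexists', KEYS[1], ARGV[3]) == 0) then " + // not held under this lockName
    "  return nil; " +
    "end; " +
    "local counter = redis.call('hincrby', KEYS[1], ARGV[3], -1); " +
    "if (counter > 0) then " +                                  // still re-entered: just refresh the TTL
    "  redis.call('pexpire', KEYS[1], ARGV[2]); " +
    "  return 0; " +
    "else " +                                                   // last release: delete and notify waiters
    "  redis.call('del', KEYS[1]); " +
    "  redis.call('publish', KEYS[2], ARGV[1]); " +
    "  return 1; " +
    "end; " +
    "return nil;";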

With the locking flow already analyzed, you can walk through this script yourself.


Distributed lock based on Zookeeper

Zookeeper is a centralized service providing distributed coordination, implemented on the ZAB protocol (a Paxos-like consensus algorithm). Its data nodes resemble a file directory tree, and it offers a Watch mechanism; its distributed locking is built on these two features.

Data node:

Sequential ephemeral nodes: Zookeeper provides a multi-level node namespace (each node is called a znode). Every node is identified by a slash-separated path (/), and every node has a parent (except the root), much like a file system.

Nodes can be PERSISTENT or EPHEMERAL, and each node can additionally be marked SEQUENTIAL. Once marked SEQUENTIAL, the node name carries a monotonically increasing sequence number.

We usually combine these node types to create what we need: for example, create a persistent node as the parent, then create temporary nodes under it, marked sequential.

Watch mechanism:

Zookeeper also provides another important feature, the Watcher (event listener).

ZooKeeper allows users to register some Watcher on a specified node, and the ZooKeeper server notifies users of certain events when they are triggered.

How Zookeeper implements a distributed lock:

First create a parent node of type PERSISTENT, such as the /locks/lock_name1 node shown in the figure. Whenever the shared resource must be accessed, create a sequential child under this parent, of type EPHEMERAL and marked SEQUENTIAL; its name is the temporary node name + parent node name + a sequence number, as in /0000000001, /0000000002, /0000000003.

After creating its child node, the client sorts all children under the parent by the temporary node name and checks whether the node it just created has the smallest sequence number. If it does, it holds the lock.

If it is not the smallest node, the client blocks waiting for the lock: it finds the sequential node immediately before its own and registers a watch event on it. When the holder finishes with the shared resource — its node is deleted or its ZK session closes — the watch event fires and the lock is passed to the next node.

InterProcessMutex lock = new InterProcessMutex(client, lockPath);
if (lock.acquire(maxWait, waitUnit)) {
    try {
        // do some work inside of the critical section here
    } finally {
        lock.release();
    }
}

public void acquire() throws Exception {
    if (!internalLock(-1, null)) {
        throw new IOException("Lost connection while trying to acquire lock: " + basePath);
    }
}

private boolean internalLock(long time, TimeUnit unit) throws Exception {
    /*
     * Note on concurrency: a given lockData instance
     * can be only acted on by a single thread so locking isn't necessary
     */
    Thread currentThread = Thread.currentThread();
    LockData lockData = threadData.get(currentThread);
    if (lockData != null) {
        // re-entering
        lockData.lockCount.incrementAndGet();
        return true;
    }
    String lockPath = internals.attemptLock(time, unit, getLockNodeBytes());
    if (lockPath != null) {
        LockData newLockData = new LockData(currentThread, lockPath);
        threadData.put(currentThread, newLockData);
        return true;
    }
    return false;
}
// ... other code omitted

InterProcessMutex is the reentrant lock implementation provided by Curator.

Locking process:

1) Reentrancy is recorded in the ConcurrentMap<Thread, LockData> threadData.

2) If threadData.get(currentThread) returns a value, the lock is being re-entered, and the lock count is incremented by 1.

3) Create a node in the resource directory: for example /0000000002, with type EPHEMERAL_SEQUENTIAL.

4) Fetch all child nodes in the current directory and check whether your own node is the smallest.

5) If it is the smallest, the lock is acquired. If not, someone else already holds the lock, so obtain the node immediately before your own.

6) The node before /0000000002 is /0000000001. After obtaining it, register a Watcher on it; when it fires, the Watcher calls object.notifyAll() to unblock the waiter.

7) Meanwhile the waiter blocks in object.wait(timeout) or object.wait().

Unlock process:

1) Decrement the reentrant lock count by 1. If the count is not yet 0, return; if it reaches 0, continue.

2) Delete the current node.

3) Delete the reentrant lock data for this thread from threadData.

The distributed locks provided by Apache Curator, Redisson, and the Spring framework integration introduced above are framework implementations, so they consider user needs and try to design a universal distributed lock interface.

Basically, they cover the implementation approaches described above.

Of course, both Redisson and Curator also define their own distributed lock interfaces, making them easy to extend.

Curator defines its own InterProcessLock interface; Redisson defines its own RLock interface, which extends java.util.concurrent.locks.Lock.

For distributed locks implemented by Redis:

Most requirements never hit "extremely complex scenarios". Redis-based distributed locks are widely used and perform well.

Their acquisition strategy is simple and crude: if the lock cannot be obtained, keep retrying — which costs some performance.

Also, Redis's design positioning means its data is not strongly consistent; there is no consensus algorithm, so in some extreme cases problems can occur. The lock model is not robust enough.

Even the RedLock implementation is controversial: in some complex scenarios its correctness cannot be fully guaranteed, and it is also relatively expensive in performance.

For distributed locks implemented by Zookeeper:

Zookeeper advantages:

Natural design positioning is distributed coordination, strong consistency. The lock model is robust, easy to use and suitable for distributed locking.

If you can’t get the lock, you just need to add a listener instead of polling all the time.

If the client goes down, it doesn’t matter, the temporary node is automatically deleted, triggering a listener to notify the next node.

Zookeeper disadvantages:

If a large number of clients frequently apply for locks or release locks, the ZK cluster will be under great pressure.

In addition, this article analyzed the spring-integration Redis distributed lock implementation in detail — it can be used directly. Even more recommended is Redisson, which implements a wide range of distributed lock mechanisms and ships a separate Spring Boot integration jar; it is very convenient to use.

For the business scenarios mentioned at the beginning of the article, after this introduction to the distributed lock family and the analysis of their principles, you can choose your own technical solution.

Among the options above, there must be one that meets your needs — I hope you got something out of this!

Writing is not easy; if anything in the article is off, corrections are welcome.

Previous picks

Thoroughly understand "Java bytecode files"

The Raft distributed protocol as implemented in middleware

Thoroughly understand the CAS implementation principle

Unveiling the mystery of the underlying AQS through a real-life scenario

From an online failure: the TCP three-way handshake and four-way teardown, plus Java stack analysis down to the source code
