Modern computers are essentially multicore, and our business processes typically run on multiple computers. Whether it is multiple threads on a computer or different processes running on multiple computers, conflicts will inevitably occur when dealing with system resources. In order to solve the conflict of shared resources, we often need to lock.

Here are two common examples.

Example 1: a= 0,10 threads execute a++ at the same time. After the program runs for several times, it will be found that the value of a is uncertain each time. If you want to ensure that the final value of a is 10, you can serialize the concurrent threads by locking them.

Example 2: In the limited purchase activity of an e-commerce platform, commodity A can only buy 50 pieces. How to ensure that only 50 pieces can be sold in the end? We can also rely on the locking mechanism to lock the fastener stock every time. The locking here does not have to be our business code, but can also be a database row lock or a distributed lock depending on other middleware.

The nature of the lock

Combine the above two examples to explain my understanding of locks. Lock: is the nature of the business scenario exists in the Shared resource, multiple processes or threads need to deal with the competition to obtain and share resources, in order to guarantee a fair, reliable, accurate results, etc, the business logic, to see the problem of concurrent execution into serial, serial third-party lock is introduced as a judgment of who has the authority to operate to a Shared resource.

This successfully abstracts the business problem into a lock fetching problem. So how do you handle locks in a computer?

The realization of the lock

The most fundamental implementation of locking still needs hardware support. CPU provides atomic operation instructions. For example, X86, ARM and other architectures provide some mechanisms to ensure that multiple operation instructions are executed in one atomic operation. These atomic operations are the basis for our locking implementation.

The operating system further encapsulates the lock provided by CPU. For example, the Linux operating system provides spinlock, mutex, DWLock, RCU lock and other common logical locks.

The various languages we use to do business development give us a more convenient, out-of-the-box lock structure on top of the operating system and assembly. Consider the Mutex (sync.mutex) in Golang, the read-write lock (sync.rwmutex), and the various atomic operations of automic.

The problem of the lock

We can divide locks into pessimistic locks and optimistic locks. Pessimistic locking assumes that other threads will compete for shared resources and must acquire the lock structure before executing subsequent logic. Optimistic locks, on the other hand, assume that there is no need for contention, and directly modify the value to achieve the target if successful. If not, retry until successful or timeout.

CAS (Compare and Swap) is a typical technique to implement optimistic locking. There are three basic operation values in CAS mechanism: value memory value, old old expected value, and new expected value.

func compareAndSwap(value int, old int.new int) bool{
    if(value ! =old){return False
    }
    *value= new
    return True
}
Copy the code

In the above code, we compare whether value and old are equal. If they are not equal, the old value has been modified and the lock fails. If they are equal, new is assigned to value and the lock is successfully acquired. When lock acquisition fails, the system changes the value of value to the current value and waits for the next lock acquisition until timeout or lock acquisition succeeds.

CAS have a typical problem: ABA. The so-called ABA problem is that during the CAS operation, other concurrent threads change the memory value from A to B, and then to A. The CAS mechanism considers that A has not changed, but actually A has changed state, which may cause multiple threads to acquire the lock at the same time. The typical solution to ABA problems is to increase the version number, which is also implemented in Golang and the Java language. Methods to add the version number is in the concurrent execution of multiple threads to modify the variable A, add an incremental version number, version number are required to execute the condition of success is consistent, and after A successful execution version number + 1, it’s A way to ensure that multiple processes at the same time, there can be only one access to lock, also won’t because retry logic problems.

A distributed lock

In order to ensure high availability of the system, many processes need to deploy multiple instances. In order to enable multiple instances to properly process shared resources, the same lock cannot be used inside these instances. Therefore, third-party systems must be introduced to implement the lock. We can rely on MySQL row locks, Redis atomic operations, etc.

When a third-party system is required to generate the lock structure, the third-party system must ensure its high availability. For example, we use the stand-alone Redis to realize the lock operation. If the stand-alone Redis is down, all processes will fail to obtain the lock. In order to ensure the high availability of Redis, we can adopt the master-slave architecture mode. When we adopt the master-slave architecture, there will be a time delay for the Master node to synchronize data to the Slave node. If two processes AB, A after the Master node obtains the lock structure, before the data is synchronized to the Slave node, B reads the data of the Slave node and finds that there is no such lock, and also changes the value of A key to obtain the lock, AB will obtain the lock at the same time, which obviously violates the original intention of using the lock.

Other solutions exist when Redis does distributed locking, but we’ll just illustrate the problem.

To sum up, the three-party system we rely on needs to maintain the Consistency of its external data, which is commonly referred to as the C (Consistency) feature in CAP theory. Based on this feature, ETCD and Zookeeper are good choices for open source software.

In the actual work, whether to develop or choose an open source product to do distributed lock depends on our business scenario. If a CP model with strong consistency is needed, ETCD is our first choice. Redis is also a good choice if there are some inconsistencies. Of course, it should also be considered comprehensively in combination with the amount of business adjustment and operation and maintenance capacity.

One caveat: Locking implementations, CAS or otherwise, are cpu-intensive, so it is best if you can design your business logic to avoid locking. Lock with as little force as possible. Sharding is an effective means to improve lock efficiency.

conclusion

In this paper, we mainly introduced the lock in the computer, briefly introduced the implementation of the lock, the lock of the common problems and how to choose distributed lock. In this paper, there is no specific implementation of a source code level analysis, more is standing in the perspective of onlookers thinking about the principle and nature of the lock problem.

The above is purely personal learning, summary of some views, welcome everyone private letter exchange.

Invite attention to the public number: a row of weekly update technical articles, and we progress together.