
Locks are easy to use in Go; this article sorts out how they are implemented.

There are two types of locks in Golang: Mutex and RWMutex. Mutex is a mutual exclusion lock; RWMutex is a read/write lock, which follows these rules:

  1. Write locks block write locks: when one coroutine holds the write lock, other coroutines block when acquiring the write lock

  2. Write locks block read locks: when one coroutine holds the write lock, other coroutines block when acquiring a read lock

  3. Read locks block write locks: when one coroutine holds a read lock, other coroutines block when acquiring the write lock

  4. Read locks do not block read locks: when one coroutine holds a read lock, other coroutines can also hold a read lock

1. Usage

Mutex and RWMutex differ little in how they are used:

  • A Mutex is locked with Lock() and unlocked with Unlock()

  • An RWMutex acquires a read lock with RLock() and releases it with RUnlock(); it acquires the write lock with Lock() and releases it with Unlock()

But they are used in different scenarios:

  • Mutex serializes operations, which guarantees that they are fully ordered. It suits situations where a resource may only be operated on by one coroutine at a time; its concurrency is weak.

  • A read/write lock suits read-heavy, write-light workloads; its concurrency is strong.

package main

import (
   "fmt"
   "sync"
   "time"
)

/**
 * @author: Jason Pang
 * @description: Test mutex
 */
func testMutex() {
   count := 0
   var l sync.Mutex
   for i := 0; i < 10; i++ {
      go func() {
         l.Lock()
         defer l.Unlock()
         fmt.Println("--------- mutex", count)
         count++
      }()
   }
}

/**
 * @author: Jason Pang
 * @description: Test read/write lock
 */
func testRWMutex() {
   count := 0
   var l sync.RWMutex
   for i := 0; i < 10; i++ {
      go func() {
         l.RLock()
         defer l.RUnlock()
         fmt.Println("--------- read/write mutex", count)
         count++ // note: writing under a read lock is a data race; it is kept here only to show that read locks do not serialize
      }()
   }
}

func main() {
   testMutex()
   testRWMutex()
   time.Sleep(10 * time.Second)
}


Output:
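One possible run is shown below; this is illustrative, since the exact interleaving varies between runs, and the read-lock values may repeat because the ten readers run concurrently:

--------- mutex 0
--------- mutex 1
…
--------- mutex 9
--------- read/write mutex 0
--------- read/write mutex 0
--------- read/write mutex 2
--------- read/write mutex 2
…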

You can see that with the mutex the count value increases sequentially, whereas with the read/write lock the printed values are unordered and may repeat.

2. Basic knowledge

Before we get into how locks work, we need to review a few basics: process synchronization, semaphores, and spinning.

2.1 Process Synchronization

Process synchronization is essentially achieved by controlling access to critical sections.

  1. Critical resources: resources that may be accessed by only one process at a time are called critical or exclusive resources. Most physical devices in a computer system, as well as the stacks, variables, and tables used by some software, are critical resources and must be shared in a mutually exclusive way

  2. Critical section: the section of code in each process that accesses critical resources is called the critical section. If every process is guaranteed mutually exclusive entry into its own critical section, mutually exclusive access to the critical resources is achieved.

  • Entry section: if the critical resource is not being accessed at this moment, the process may enter the critical section to access the resource and set the flag marking it as being accessed. If the critical resource is currently being accessed by another process, the process cannot enter the critical section.

  • Exit section: resets the being-accessed flag set in the critical section back to not-accessed.

  3. Synchronization mechanism rules

    (1) Free: let in. When no process is in the critical section, the critical resource is idle; a process requesting to enter should be allowed into its own critical section immediately, to use the critical resource efficiently.

    (2) Busy: wait. When a process is already inside the critical section, the critical resource is being accessed, so other processes attempting to enter must wait, to guarantee mutually exclusive access to the critical resource.

    (3) Bounded waiting. A process requesting access to the critical resource should be guaranteed to enter its own critical section within a finite time, to avoid falling into a "dead wait" state.

    (4) Yield while waiting. When a process cannot enter its own critical section, it should release the processor immediately, so that it does not fall into a "busy waiting" state.

These rules match everyday intuition: if the resource is free, use it; if it is busy, wait your turn; nobody should be kept waiting forever; and while waiting, step aside and rest instead of hogging the processor.

2.2 Semaphores

In 1965, the Dutch scholar Dijkstra proposed the semaphore mechanism, an effective process synchronization tool. Dijkstra, the GOAT.

2.2.1 Types

Semaphores have now evolved into the following four types:

  1. Integer semaphore

  2. Recording semaphore

  3. AND type semaphore

  4. Semaphore set

Although there are different types of semaphores, the core is the same: an integer quantity S that represents the number of available resources and can only be accessed through two standard atomic operations, wait(S) and signal(S). wait decrements S, and signal increments S.

wait(S):   while S <= 0 do no-op;
           S := S - 1;

signal(S): S := S + 1;

2.2.2 Applications

  1. Using a semaphore to achieve mutual exclusion between processes

    To let multiple processes access a critical resource mutually exclusively, set a mutex semaphore for the resource with an initial value of 1, and place each process's critical section between the wait(mutex) and signal(mutex) operations.

  2. Using a semaphore to realize a precedence relationship

    Suppose two concurrently executing processes P1 and P2, where P1 contains statement S1 and P2 contains statement S2, and we want S2 to execute only after S1. To achieve this precedence, P1 and P2 simply share a common semaphore S with an initial value of 0: place the signal(S) operation after statement S1, and insert wait(S) before S2 (see the Go sketch after this list), i.e.

    In process P1: S1; signal(S);

    In process P2: wait(S); S2;
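In Go, a buffered channel behaves like a counting semaphore, so the precedence relationship above can be sketched as follows. This is a minimal sketch; the names sem, S1, and S2 are illustrative and not from any library:

package main

import (
   "fmt"
   "sync"
)

func main() {
   sem := make(chan struct{}, 1) // counting semaphore S, initial value 0 (empty channel)
   var wg sync.WaitGroup
   wg.Add(1)

   go func() { // process P2
      defer wg.Done()
      <-sem // wait(S): blocks until P1 executes signal(S)
      fmt.Println("S2: runs strictly after S1")
   }()

   // process P1
   fmt.Println("S1: runs first")
   sem <- struct{}{} // signal(S): increments the semaphore, releasing P2

   wg.Wait()
}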

2.3 Spinning

The synchronization rules above say a process should wait when it cannot enter its critical section; spinning is essentially one way that wait can happen.

2.3.1 Definition

If a coroutine attempting to lock finds that the lock is currently held by another coroutine, it does not block immediately; instead it repeatedly checks whether the lock has been released. This is called spinning. The spin time is short, and if the lock is released during the spin, the coroutine can acquire it immediately.

2.3.2 Process

The spin process checks whether the lock can be acquired; if not, it executes the CPU PAUSE instruction. PAUSE does nothing useful but consumes roughly 30 clock cycles. After PAUSE completes, the check is repeated.

During this process, the coroutine keeps executing on the CPU; it does not sleep.
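To make the spin loop concrete, here is a minimal spin lock sketched with atomics. It is illustrative only, not how sync.Mutex spins: user code cannot execute the runtime's PAUSE-based procyield, so runtime.Gosched stands in for it here.

package main

import (
   "fmt"
   "runtime"
   "sync/atomic"
)

// spinLock is an illustrative spin lock; it is NOT the runtime's implementation.
type spinLock int32

// Lock spins until the CAS from 0 to 1 succeeds.
func (l *spinLock) Lock() {
   for !atomic.CompareAndSwapInt32((*int32)(l), 0, 1) {
      // Someone else holds the lock: keep checking.
      // The runtime's real spin executes the PAUSE instruction instead.
      runtime.Gosched()
   }
}

// Unlock releases the lock by storing 0.
func (l *spinLock) Unlock() {
   atomic.StoreInt32((*int32)(l), 0)
}

func main() {
   var l spinLock
   counter := 0
   done := make(chan struct{})
   for i := 0; i < 4; i++ {
      go func() {
         l.Lock()
         counter++
         l.Unlock()
         done <- struct{}{}
      }()
   }
   for i := 0; i < 4; i++ {
      <-done
   }
   fmt.Println("counter =", counter) // always 4
}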

2.3.3 Advantages

The main purpose of spinning is efficiency: it reduces overhead. Spinning makes better use of the CPU and avoids coroutine switching as much as possible. Because the coroutine applying for the lock already owns the CPU, if it acquires the lock after a short spin, it can continue running without ever entering the blocked state.

2.3.4 Conditions

Spinning should not be used indiscriminately; otherwise its advantage is lost and it causes extra waste. For example, if the number of spins were unlimited and the coroutine holding the lock took a long time to release it, the spinning coroutine would simply burn CPU.

So to use spin, the following conditions must be met:

  • The number of spins must be small, usually 4, i.e. at most 4 spins

  • The number of CPU cores must be greater than 1; otherwise spinning is meaningless, because no other core could be releasing the lock at the same time

  • GOMAXPROCS (the number of P's in the scheduler) must be greater than 1; otherwise spinning is meaningless

  • The current P's local runnable queue must be empty; otherwise the CPU should be ceded to goroutines that need it more, and spinning would delay their scheduling

2.3.5 Problems

Spinning has one characteristic: it ignores the coroutines already queued for the lock. A spinning coroutine grabs the lock the moment it becomes available, which is effectively queue-jumping.

This can cause problems: in extreme cases, many coroutines wait in the queue while a newly arrived coroutine spins; if the spin succeeds, it jumps the queue and takes the lock. If new coroutines keep winning this way, the queued coroutines may be unable to obtain the lock for a very long time.

A common solution is to add a starvation mode to the lock, in which spinning is not allowed.

3. Implementation Principle

3.1 Mutex

3.1.1 Structure

The Mutex structure is as follows:

// A Mutex must not be copied after first use.
// A Mutex cannot be copied after first use (value copies are forbidden; passing a reference is fine)
// Copying can easily lead to unintended deadlock: https://mozillazg.com/2019/04/notes-about-go-lock-mutex.html#hidcopy
type Mutex struct {
   state int32
   sema  uint32
}

  • state indicates the status of the mutex, such as whether it is locked.

  • sema is the semaphore: coroutines block waiting on it, and an unlocking coroutine releases it to wake a waiting coroutine.

state is a 32-bit integer, internally divided into four parts that record four aspects of the Mutex.

const (
   mutexLocked = 1 << iota // value: 1
   mutexWoken              // value: 2
   mutexStarving           // value: 4
   mutexWaiterShift = iota // value: 3; the waiter count lives in state >> mutexWaiterShift
)


Locked: whether the Mutex is locked. 0: not locked. 1: locked.

Woken: whether a coroutine has been woken. 0: none. 1: a coroutine has been woken and is trying to lock; while this bit is set, an unlock in normal mode will not wake other coroutines.

Starving: whether the Mutex is in starvation mode. 0: normal. 1: starving.

Waiter: the number of blocked coroutines, used to decide whether a semaphore needs to be released when unlocking.

Locking is really a race between coroutines to set the Locked bit: whoever sets the Locked field to 1 acquires the lock, and the rest block waiting on the Mutex's semaphore (sema). Once the coroutine holding the lock unlocks, the waiting coroutines are woken in turn.
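To make the bit layout concrete, here is a small illustrative decoder (not part of the sync package) that unpacks a state value using the same constants:

package main

import "fmt"

const (
   mutexLocked = 1 << iota // 1
   mutexWoken              // 2
   mutexStarving           // 4
   mutexWaiterShift = iota // 3
)

// decode prints the four parts packed into a Mutex state word.
func decode(state int32) {
   fmt.Printf("locked=%d woken=%d starving=%d waiters=%d\n",
      state&mutexLocked,
      (state&mutexWoken)>>1,
      (state&mutexStarving)>>2,
      state>>mutexWaiterShift)
}

func main() {
   decode(0)           // locked=0 woken=0 starving=0 waiters=0
   decode(mutexLocked) // locked=1 woken=0 starving=0 waiters=0
   decode(2<<mutexWaiterShift | mutexLocked | mutexStarving) // two waiters, locked, starving
}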

3.1.2 Mutual exclusion and fairness

The following analysis is based on Go 1.13, studying how the Starving, Woken, and Locked bits are used.

// Mutex fairness.
//
// Mutex can be in 2 modes of operations: normal and starvation.
// In normal mode waiters are queued in FIFO order, but a woken up waiter
// does not own the mutex and competes with new arriving goroutines over
// the ownership. New arriving goroutines have an advantage -- they are
// already running on CPU and there can be lots of them, so a woken up
// waiter has good chances of losing. In such case it is queued at front
// of the wait queue. If a waiter fails to acquire the mutex for more than 1ms,
// it switches mutex to the starvation mode.
//
// In starvation mode ownership of the mutex is directly handed off from
// the unlocking goroutine to the waiter at the front of the queue.
// New arriving goroutines don't try to acquire the mutex even if it appears
// to be unlocked, and don't try to spin. Instead they queue themselves at
// the tail of the wait queue.
//
// If a waiter receives ownership of the mutex and sees that either
// (1) it is the last waiter in the queue, or (2) it waited for less than 1 ms,
// it switches mutex back to normal operation mode.
//
// Normal mode has considerably better performance as a goroutine can acquire
// a mutex several times in a row even if there are blocked waiters.
// Starvation mode is important to prevent pathological cases of tail latency.


Mutex has two modes: normal mode and starvation mode.

Normal mode: In normal mode, the waiting coroutines are arranged in first-in, first-out order. When a coroutine is awakened, it does not own the lock directly. It must compete with the newly arrived coroutine for ownership of the lock. The advantage of a newly arrived coroutine is that it is already running on the CPU, and there may be many new coroutines, so the awakened coroutine may not be able to grab the lock. In this case, the awakened coroutine is placed at the head of the waiting queue. If the waiting coroutine does not acquire the lock for more than 1ms, the lock will be put into starvation mode.

Starvation mode: in starvation mode, the unlocking coroutine hands ownership of the lock directly to the coroutine at the head of the waiting queue. Newly arriving coroutines do not try to acquire the lock even if it appears unlocked, and they do not spin; instead, they put themselves at the tail of the waiting queue.

When a coroutine in the waiting queue acquires the lock, it checks:

(1) whether it is the last coroutine in the waiting queue

(2) whether it waited for less than 1ms

If either condition holds, the lock is switched back to normal mode.

Normal mode is generally considered to have better performance, because a coroutine can acquire the lock several times in a row even if there are blocked waiters. Starvation mode exists to prevent pathological tail latency, where waiting coroutines cannot obtain the lock for long periods.
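The effect of starvation mode can be observed with a hostile workload: one goroutine re-acquires the lock in a tight loop while another waits. Because of the 1ms starvation threshold, the waiter still gets the lock quickly. This is an illustrative experiment, and the printed duration varies by machine:

package main

import (
   "fmt"
   "sync"
   "time"
)

func main() {
   var mu sync.Mutex
   done := make(chan time.Duration, 1) // buffered so the waiter never blocks sending

   mu.Lock()
   start := time.Now()
   go func() { // the waiter: queues behind the hot loop below
      mu.Lock()
      done <- time.Since(start)
      mu.Unlock()
   }()
   mu.Unlock()

   for {
      mu.Lock() // hot loop: re-acquires the lock immediately, crowding out the waiter
      select {
      case d := <-done:
         // Without starvation mode the hot loop could win indefinitely;
         // the 1ms threshold forces a handoff to the waiter.
         fmt.Println("waiter acquired the lock after", d)
         mu.Unlock()
         return
      default:
      }
      mu.Unlock()
   }
}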

3.1.3 Lock

// Lock locks m.
// If the lock is already in use, the calling goroutine
// blocks until the mutex is available.
// If the lock is already occupied, the calling coroutine blocks until the lock becomes available
func (m *Mutex) Lock() {
   // Fast path: grab unlocked mutex.
   // If the lock is unoccupied, not starving, not woken, and has no waiting coroutines, lock it with one CAS
   // This is the perfect state
   if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
      if race.Enabled { // The default is false, so you can ignore it
         race.Acquire(unsafe.Pointer(m))
      }
      return
   }
   // Slow path (outlined so that the fast path can be inlined)
   m.lockSlow()
}

func (m *Mutex) lockSlow() {
   var waitStartTime int64
   starving := false
   awoke := false
   iter := 0
   old := m.state
   for {
      // Don't spin in starvation mode, ownership is handed off to waiters
      // so we won't be able to acquire the mutex anyway.
      // In normal mode with the lock held, spin as long as the spin conditions are met
      // We must ensure coroutines in the wait queue can acquire the lock, to avoid starving the queue
      // Stop spinning once the lock becomes starving or unlocked, or the spin conditions no longer hold
      if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
         // Active spinning makes sense.
         // Try to set mutexWoken flag to inform Unlock
         // to not wake other blocked goroutines.
         // If the wait queue has coroutines and the lock is not set to wake up, try to set the wake up state
         // The advantage of this is that when the lock is unlocked, it does not wake up the blocked coroutine, ensuring that the lock is more likely to be acquired
         if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
            atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
            awoke = true
         }
         runtime_doSpin()
         iter++
         old = m.state
         continue
      }
      // Reaching here: the lock is starving or unlocked, or the spin conditions no longer hold (it may still be locked)
      // Possible lock states: starving+locked, starving+unlocked, normal+locked, normal+unlocked
      // Compute the new state from the old one
      new := old
      // Don't try to acquire starving mutex, new arriving goroutines must queue.
      // In normal mode, try to lock; in starvation mode, ownership is handed off, so a new arrival cannot lock
      if old&mutexStarving == 0 {
         new |= mutexLocked
      }
      // If the lock is still occupied, or the lock is in starvation mode, the coroutine can only add itself to the wait queue
      if old&(mutexLocked|mutexStarving) != 0 {
         new += 1 << mutexWaiterShift
      }
      }
      // The current goroutine switches mutex to starvation mode.
      // But if the mutex is currently unlocked, don't do the switch.
      // Unlock expects that starving mutex has waiters, which will not
      // be true in this case.
      // If this coroutine is starving and the lock is still occupied, switch the lock to starvation mode
      // If the lock is not occupied, don't switch: Unlock assumes that a starving mutex has waiters,
      // and an unlocked-but-starving mutex would violate that assumption
      if starving && old&mutexLocked != 0 {
         new |= mutexStarving
      }
      // If the coroutine sets the wake state of the lock, it needs to reset the wake state.
      // The coroutine is either locked or asleep, regardless of the wake state
      if awoke {
         // The goroutine has been woken from sleep,
         // so we need to reset the flag in either case.
         if new&mutexWoken == 0 {
            throw("sync: inconsistent mutex state")
         }
         new &^= mutexWoken
      }
      // old -> new
      // (starving=0, locked=1) normal and locked   -> waiters+1, starving undetermined, locked   -> join the wait queue
      // (starving=0, locked=0) normal and unlocked -> waiters unchanged, normal, locked          -> lock acquired
      // (starving=1, locked=1) starving and locked -> waiters+1, starving undetermined, locked   -> join the wait queue
      // (starving=1, locked=0) starving, unlocked  -> waiters+1, starving, unlocked              -> join the wait queue
      if atomic.CompareAndSwapInt32(&m.state, old, new) { // the CAS succeeded
         // If the lock is unlocked and in the normal state, the lock is successfully occupied
         if old&(mutexLocked|mutexStarving) == 0 {
            break // locked the mutex with CAS
         }
         // If we were already waiting before, queue at the front of the queue.
         queueLifo := waitStartTime != 0
         if waitStartTime == 0 {
            waitStartTime = runtime_nanotime()
         }
        
         // Call runtime_SemacquireMutex to suspend the coroutine
         // waitStartTime tells whether the coroutine is new or was woken before:
         // a new coroutine (queueLifo = false) goes to the tail of the wait queue
         // a previously woken coroutine (queueLifo = true) goes to the head of the wait queue
         // When the coroutine is woken later, execution continues from this point
         runtime_SemacquireMutex(&m.sema, queueLifo, 1)
         // The coroutine has been awakened
         // If the coroutine failed to acquire the lock for longer than the threshold (1ms), it counts as starving
         starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
         // The coroutine was suspended for a long time, so we need to retrieve the current lock state
         old = m.state
         // The lock is in starvation mode: ownership was handed directly to this woken coroutine
         if old&mutexStarving != 0 {
            // If this goroutine was woken and mutex is in starvation mode,
            // ownership was handed off to us but mutex is in somewhat
            // inconsistent state: mutexLocked is not set and we are still
            // accounted as waiter. Fix that.
            // Sanity check: while starving, the lock must not be marked locked or woken,
            // and there must be at least one waiter (this coroutine itself)
            if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
               throw("sync: inconsistent mutex state")
            }
            // delta = 1 - 8 = -7: set the locked bit and decrement the waiter count by one in a single add
            delta := int32(mutexLocked - 1<<mutexWaiterShift)
            // If this coroutine is not starving, or it is the last coroutine in the wait queue, switch back to normal mode
            if !starving || old>>mutexWaiterShift == 1 {
               // Exit starvation mode.
               // Critical to do it here and consider wait time.
               // Starvation mode is so inefficient, that two goroutines
               // can go lock-step infinitely once they switch mutex
               // to starvation mode.
               delta -= mutexStarving
            }
            // Set the lock state to wait count minus one and set it to lock. Locking success
            // This calculation method is too fancy
            atomic.AddInt32(&m.state, delta)
            break
         }
         // This coroutine is, indeed, a coroutine awakened by the system
         awoke = true
         // Set the spin count to 0
         iter = 0
      } else { // If the CAS fails, start again
         old = m.state
      }
   }

   if race.Enabled {
      race.Acquire(unsafe.Pointer(m))
   }
}


3.1.4 Unlock

// Unlock unlocks m.
// It is a run-time error if m is not locked on entry to Unlock.
// Unlocking a Mutex that is not locked causes a runtime error
//
// A locked Mutex is not associated with a particular goroutine.
// It is allowed for one goroutine to lock a Mutex and then
// arrange for another goroutine to unlock it.
// Lock and unlock can be different coroutines
func (m *Mutex) Unlock() {
   if race.Enabled { // The default is false
      _ = m.state
      race.Release(unsafe.Pointer(m))
   }

   // Fast path: drop lock bit.
   // If there is no starvation, no waiting coroutine, and no woken coroutine, the unlock is complete
   new := atomic.AddInt32(&m.state, -mutexLocked)
   // Otherwise there may be a starving state, waiting coroutines, or a woken coroutine: unfinished business to handle
   if new != 0 {
      // Outlined slow path to allow inlining the fast path.
      // To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
      m.unlockSlow(new)
   }
}

func (m *Mutex) unlockSlow(new int32) {
   if (new+mutexLocked)&mutexLocked == 0 {
      throw("sync: unlock of unlocked mutex")
   }
   // Normal mode
   if new&mutexStarving == 0 {
      old := new
      for {
         // If there are no waiters or a goroutine has already
         // been woken or grabbed the lock, no need to wake anyone.
         // In starvation mode ownership is directly handed off from unlocking
         // goroutine to the next waiter. We are not part of this chain,
         // since we did not observe mutexStarving when we unlocked the mutex above.
         // So get off the way.
         // If there are no more coroutines in the wait list, or if there are already coroutines waking up, there is no need to waste energy waking up other coroutines
          if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
            return
         }
         // Grab the right to wake someone.
          // Decrement the waiter count by 1 and set the woken bit
         new = (old - 1<<mutexWaiterShift) | mutexWoken
         if atomic.CompareAndSwapInt32(&m.state, old, new) {
             runtime_Semrelease(&m.sema, false, 1)
            return
         }
         old = m.state
      }
   } else { // starvation mode
      // Starving mode: handoff mutex ownership to the next waiter.
      // Note: mutexLocked is not set, the waiter will set it after wakeup.
      // But mutex is still considered locked if mutexStarving is set,
      // so new coming goroutines won't acquire it.
      // In starvation mode, lock ownership is handed directly to the coroutine at the head of the wait queue
      // Note that the locked bit is not set here; the woken coroutine will set it after waking
      // But as long as mutexStarving is set, the mutex is still considered locked, so newly arriving coroutines cannot acquire it
      runtime_Semrelease(&m.sema, true, 1)
   }
}

3.1.5 Function Descriptions

[runtime_canSpin]: implemented as sync_runtime_canSpin in src/runtime/proc.go. It places several restrictions on spinning: the coroutine must have spun fewer than 4 times, the machine must have more than one CPU core, GOMAXPROCS must be greater than 1, at least one other P must be running, and the current P's local run queue must be empty.

[runtime_doSpin]: implemented as sync_runtime_doSpin in src/runtime/proc.go. It calls procyield, a function written in assembly that loops executing the PAUSE instruction. PAUSE does nothing useful but consumes CPU time, and the CPU performs no unnecessary optimization while executing it.

[runtime_SemacquireMutex]: implemented as sync_runtime_SemacquireMutex in src/runtime/sema.go. It blocks the current coroutine on a semaphore.

[runtime_Semrelease]: implemented as sync_runtime_Semrelease in src/runtime/sema.go. It releases the semaphore, unblocking a coroutine waiting on it.

3.1.6 Flowchart

www.processon.com/view/link/6…

Go's mutex implementation logic is very complex, full of performance-optimization code, which makes the whole flow hard to follow. Even with the comments and the flowchart, it may still be difficult. My suggestions: first, understand the roles of the Locked, Woken, and Starving bits; second, if the Go 1.13 implementation is too much, read an earlier, simpler version first.

4. Summary

I thought the write-up of the lock implementation would be done quickly, but these one or two hundred lines of code took about a week. Personally, I don't recommend starting from the 1.13 lock implementation; it is too complicated. You can have a look at www.cnblogs.com/niniwzw/p/3…

An easier way to understand it is as a state machine: put the Lock and Unlock operations into a state machine. But since the lock state has four parts, plus the various operations, drawing it takes a lot of time; draw it yourself if you are interested.

While writing this article, I went back to the university textbook Computer Operating Systems for some of the material, and found these books are really good: accurate and easy to understand. What I once memorized by rote now reads like a treasure.

The rest will have to wait for another article; this one is already too long.

References

  1. Thread synchronization (the difference between mutexes and semaphores)

  2. Semaphores and their use and implementation (in detail)

  3. Go Expert Programming

  4. Golang synchronization mechanism implementation

  5. [My Path as an Architect] Talking about the sync package in Go

  6. Golang mutex internal implementation

  7. Design ideas and evolution of Go sync.Mutex

  8. Reading the semaphore source code in Go

  9. Go: some considerations about using locks (mutex)