This is the 25th day of my participation in the August Challenge

Race

Preface

In the previous article, I covered concurrency built from goroutines and channels (the CSP model). This article focuses on concurrency through traditional synchronization mechanisms.

Go is not limited to CSP concurrency. It also has traditional synchronization mechanisms, although it is recommended to use them sparingly. WaitGroup, mentioned in the previous post, is one of them, and mutexes are another. Go also has a package of atomic operations, sync/atomic, whose functions are safe to call when multiple goroutines run concurrently. However, this article will not use that package for its demonstrations.

Mutexes are used below to make an addition safe when several goroutines execute it concurrently. The demonstration is a simple counter increment. First, try it without a lock and see whether there is a problem:

package main

import (
    "fmt"
    "time"
)

type atomicInt int

// increment adds 1 without any synchronization.
func (a *atomicInt) increment() {
    *a++
}

func (a *atomicInt) get() int {
    return int(*a)
}

func main() {
    var a atomicInt
    a.increment()
    go func() {
        a.increment()
    }()
    time.Sleep(time.Millisecond)
    fmt.Println(a) // unsynchronized read while the goroutine may be writing
}

The program prints a result that looks normal, but that does not mean it is correct. The go run -race command mentioned in the previous article reports conflicting data accesses. Running the program above with that command produces a race report.

The report shows increment writing to address 0x00c0000bc008 while main reads the same address. That is, one goroutine is writing while another goroutine is reading. Now use a lock to solve this problem, as follows:

package main

import (
    "fmt"
    "sync"
    "time"
)

type atomicInt struct {
    value int
    lock  sync.Mutex
}

// increment adds 1 while holding the lock.
func (a *atomicInt) increment() {
    a.lock.Lock()
    defer a.lock.Unlock()
    a.value++
}

// get reads the value while holding the lock.
func (a *atomicInt) get() int {
    a.lock.Lock()
    defer a.lock.Unlock()
    return a.value
}

func main() {
    var a atomicInt
    a.increment()
    go func() {
        a.increment()
    }()
    time.Sleep(time.Millisecond)
    fmt.Println(a.get())
}

Run this code with go run -race and no data race is reported. This is only a simple demonstration of using a lock in Go. The details are explained below.
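The same locking pattern can be exercised with many goroutines instead of one. The following is a minimal sketch, not part of the original demonstration: the names safeCounter and countWithMutex are made up for illustration, and a sync.WaitGroup replaces the time.Sleep call so the result does not depend on timing.

```go
package main

import (
    "fmt"
    "sync"
)

// safeCounter is a hypothetical name for a counter guarded by a mutex.
type safeCounter struct {
    mu    sync.Mutex
    value int
}

func (c *safeCounter) increment() {
    c.mu.Lock()
    defer c.mu.Unlock()
    c.value++
}

func (c *safeCounter) get() int {
    c.mu.Lock()
    defer c.mu.Unlock()
    return c.value
}

// countWithMutex starts n goroutines that each increment the counter once
// and returns the value after all of them have finished.
func countWithMutex(n int) int {
    var c safeCounter
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            c.increment()
        }()
    }
    wg.Wait()
    return c.get()
}

func main() {
    fmt.Println(countWithMutex(1000)) // prints 1000
}
```

Because every increment happens while the lock is held, no update is lost and the count equals the number of goroutines.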

Race & data race

In a serial program (that is, a program with only one goroutine), the order in which the steps execute is determined by the program logic. For example, in a series of statements, the first statement is executed before the second, and so on. When a program has two or more goroutines, the steps within each goroutine are still executed in order, but in general we have no way of knowing whether an event x in one goroutine happens before or after an event y in another. If we cannot confidently say that one event precedes the other, then the two events are concurrent.

Consider a function that works correctly in a serial program. If the function still works correctly when called concurrently, it is concurrency safe, where "called concurrently" means called from two or more goroutines at the same time without additional synchronization. The concept extends to a set of collaborating functions, such as the methods and operations of a particular type. A type is concurrency safe if all of its accessible methods and operations are concurrency safe.

Making a program concurrency safe does not require every concrete type in it to be concurrency safe. In fact, concurrency-safe types are the exception rather than the rule, so you should access a variable concurrently only if the documentation for its type says this is safe. For most variables, concurrent access is avoided either by confining the variable to a single goroutine or by maintaining a higher-level invariant of mutual exclusion. These concepts are explained below.

In contrast, exported package-level (global) functions are generally expected to be concurrency safe. Because package-level variables cannot be confined to a single goroutine, functions that modify these variables must enforce mutual exclusion.

There are many reasons why a function might not work when called concurrently, including deadlock, livelock, and resource starvation. The following section focuses on one of these problems: races.

A race is a situation in which a program fails to give the correct result for some interleavings of the operations of multiple goroutines. Races are pernicious because they can lurk in a program and appear only rarely, perhaps only under heavy load or with a particular compiler, platform, or architecture. All of this makes races difficult to reproduce and analyze.

Data race scenario

The classic bank example is used here to explain races:

package bank

var balance int

func Deposit(amount int)  {
    balance = balance + amount
}

func Balance() int {
    return balance
}

For a simple program like the one above, you can see at a glance that any serial sequence of calls to Deposit and Balance gives the correct result. That is, Balance returns the sum of all amounts previously deposited. If these functions are called concurrently instead of serially, Balance is no longer guaranteed to return the correct value. Consider the following two goroutines, which represent two transactions on the same shared account:

package main

import (
    "fmt"
    "go.language/ch9/bank"
)

func main() {
    //A
    go func() {
        bank.Deposit(200)//------A1
        fmt.Println("=", bank.Balance())//------A2
    }()
    //B
    go bank.Deposit(100)//------B
}

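A deposits 200 and then checks the balance, while B deposits 100. A1 and A2 run concurrently with B, so we cannot predict the actual order of execution. Intuitively, there are three possible orders, called "A first", "B first", and "A/B/A". Tracing the value of the balance variable after each step gives the following, where a quoted string is the balance printed by A:

  • A first: A1 sets balance to 200, A2 prints "= 200", B sets balance to 300
  • B first: B sets balance to 100, A1 sets balance to 300, A2 prints "= 300"
  • A/B/A: A1 sets balance to 200, B sets balance to 300, A2 prints "= 300"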

In all cases, the final account balance is 300. The only difference is whether the balance seen by A includes B's deposit, which it does in the second and third orders.
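The three intuitive orders can be checked with a small single-goroutine replay. This is only a sketch: the replay function and the step names are invented here for illustration, and it deliberately assumes that each deposit is one indivisible step.

```go
package main

import "fmt"

// replay runs the given steps one after another in a single goroutine.
// "A1" is A's deposit of 200, "A2" is A reading the balance, and
// "B" is B's deposit of 100. Each deposit is treated as indivisible.
func replay(order []string) (seenByA, final int) {
    balance := 0
    for _, step := range order {
        switch step {
        case "A1":
            balance += 200
        case "A2":
            seenByA = balance
        case "B":
            balance += 100
        }
    }
    return seenByA, balance
}

func main() {
    schedules := [][]string{
        {"A1", "A2", "B"}, // A first
        {"B", "A1", "A2"}, // B first
        {"A1", "B", "A2"}, // A/B/A
    }
    for _, s := range schedules {
        seen, final := replay(s)
        fmt.Printf("%v: A sees %d, final balance %d\n", s, seen, final)
    }
}
```

Every schedule ends at 300 only because replay treats a deposit as a single step that nothing can interrupt.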

But that intuition is wrong. There is a fourth possibility, in which B's deposit executes in the middle of A's deposit, after the balance has been read (balance + amount) but before it has been updated (balance = ...). That causes B's deposit to disappear. This is because A's deposit operation A1 is really a sequence of two operations, a read and a write, which we'll call A1r and A1w. Here is the problematic order of execution:

  1. A1r reads the value of balance, which is 0
  2. B runs in full and sets balance to 100 (A1r has already read the old value, 0)
  3. A1w writes the result of balance + amount computed from that stale read, 200, to balance
  4. A2 reads balance and sees 200

After A1r, the expression balance + amount evaluates to 200, and that value is written in step A1w, ignoring the intervening deposit entirely. The final balance is only 200, and the bank is 100 richer at B's expense.
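The problematic order can be replayed deterministically in a single goroutine by splitting A's deposit into its read and write halves by hand. This is a sketch for illustration only: lostUpdate and tmp are hypothetical names, and a real data race would not be this easy to reproduce.

```go
package main

import "fmt"

// lostUpdate replays the problematic schedule step by step in one goroutine.
// tmp stands for the value A holds between its read (A1r) and its write (A1w).
func lostUpdate() (seenByA, final int) {
    balance := 0

    tmp := balance + 200    // A1r: A reads balance (0) and computes 200
    balance = balance + 100 // B:   B's whole deposit runs in between
    balance = tmp           // A1w: A writes 200, overwriting B's deposit
    seenByA = balance       // A2:  A reads 200

    return seenByA, balance
}

func main() {
    seen, final := lostUpdate()
    fmt.Println(seen, final) // prints 200 200
}
```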

This kind of race is called a data race. A data race occurs when two goroutines access the same variable concurrently and at least one of the accesses is a write.

Things get more complicated when the data race involves a variable whose type is larger than a single machine word, such as an interface, a string, or a slice. The following code concurrently sets x to two slices of different lengths:

var x []int
go func() {
    x = make([]int, 10)
}()
go func() {
    x = make([]int, 1000000)
}()

x[99999] = 1

The value of x in the last statement is undefined: it could be nil, a slice of length 10, or a slice of length 1000000. Recall that a slice has three parts: a pointer, a length, and a capacity. If the pointer comes from the first make call and the length comes from the second, x becomes a chimera whose nominal length is 1000000 but whose underlying array has only 10 elements. In that case, storing to element 99999 would overwrite memory far outside the array, with unpredictable consequences that are hard to debug and locate. This semantic minefield is called undefined behavior, and C programmers are familiar with it. Fortunately, Go has far fewer of these problems.
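The fact that a slice value spans several machine words can be checked directly. The sketch below uses unsafe.Sizeof and assumes the standard gc toolchain, where a slice value is laid out as a pointer, a length, and a capacity; sliceWords is a name made up for this example.

```go
package main

import (
    "fmt"
    "unsafe"
)

// sliceWords reports how many machine words a []int value occupies.
// Assumption: the slice value is stored as pointer, length, and capacity,
// as in the standard gc toolchain.
func sliceWords() uintptr {
    var x []int
    return unsafe.Sizeof(x) / unsafe.Sizeof(uintptr(0))
}

func main() {
    fmt.Println(sliceWords()) // 3 with the standard gc toolchain
}
```

Since the three words are not written as a single indivisible unit, an unsynchronized reader can observe a mixture of two different assignments.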

The notion that a concurrent program is just an interleaving of several sequential programs is a false intuition. As you'll see later, data races can have even stranger causes. How do I avoid data races in my programs?

How do I avoid data races

We know that data races occur when two or more Goroutines concurrently read and write to the same variable and at least one of them is a write. As you can see from the definition, there are three ways to avoid data races

  • Method 1: Do not modify variables. Consider the map below, which is lazily populated: each key is loaded the first time it is requested. If the calls to Icon are serial, the program works fine, but if the calls to Icon are concurrent, there is a data race when accessing the map

var icons = make(map[string]image.Image)

func loadIcon(name string) image.Image // body omitted

// Note: not concurrency safe!
func Icon(name string) image.Image {
    icon, ok := icons[name]
    if !ok {
        icon = loadIcon(name)
        icons[name] = icon
    }
    return icon
}

If the map is instead initialized with all of its data before any other goroutine is created and is never modified afterwards, it is safe to call the Icon function concurrently no matter how many goroutines there are, because each goroutine only reads the map

var icons = map[string]image.Image{
    "spades.png":   loadIcon("spades.png"),
    "hearts.png":   loadIcon("hearts.png"),
    "diamonds.png": loadIcon("diamonds.png"),
    "clubs.png":    loadIcon("clubs.png"),
}

// Concurrency safe.
func Icon(name string) image.Image { return icons[name] }

In the example above, the icons variable is assigned during package initialization, before the program's main function starts to run. Once initialized, icons is never modified. Data structures that are never modified, or that are immutable, are inherently concurrency safe and need no synchronization
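The read-only pattern can be tried out with a stand-in map. In this sketch, string values replace image.Image so that no image files are needed, and suits, Suit, and concurrentReads are hypothetical names chosen for illustration.

```go
package main

import (
    "fmt"
    "sync"
)

// suits is filled in once, at package initialization, and never written again.
// Strings stand in for image.Image values.
var suits = map[string]string{
    "spades.png":   "spades",
    "hearts.png":   "hearts",
    "diamonds.png": "diamonds",
    "clubs.png":    "clubs",
}

// Suit only reads the map, so concurrent calls need no synchronization.
func Suit(name string) string { return suits[name] }

// concurrentReads starts n goroutines that each look up every key
// and returns the total number of successful lookups.
func concurrentReads(n int) int {
    hits := make([]int, n) // one slot per goroutine, so no slot is shared
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            for name := range suits {
                if Suit(name) != "" {
                    hits[i]++
                }
            }
        }(i)
    }
    wg.Wait()

    total := 0
    for _, h := range hits {
        total += h
    }
    return total
}

func main() {
    fmt.Println(concurrentReads(100)) // prints 400
}
```

Running this with go run -race should report nothing, because the only shared variable is read and never written after initialization.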

  • Method 2: Avoid data races by avoiding accessing the same variable from more than one Goroutine

Since other goroutines cannot access the variable directly, they must use a channel to send the confining goroutine a request to query or update it. This is what the Go motto means: "Do not communicate by sharing memory; instead, share memory by communicating" (this was mentioned in an earlier article). A goroutine that uses channel requests to broker all access to a confined variable is called a monitor goroutine for that variable.

Here is the bank example rewritten so that the balance variable is confined to a monitor goroutine called teller

var deposits = make(chan int) // send amount to deposit
var balances = make(chan int) // receive balance

func Deposit(amount int) { deposits <- amount }
func Balance() int       { return <-balances }

func teller() {
    var balance int // balance is confined to the teller goroutine
    for {
        select {
        case amount := <-deposits:
            balance += amount
        case balances <- balance:
        }
    }
}

func init() {
    go teller() // start the monitor goroutine
}
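To see the monitor goroutine at work, the teller code can be driven from many goroutines at once. The sketch below repeats the teller code so that it is self-contained. The depositConcurrently helper is an addition for illustration and is not part of the original example.

```go
package main

import (
    "fmt"
    "sync"
)

var deposits = make(chan int) // send amount to deposit
var balances = make(chan int) // receive balance

func Deposit(amount int) { deposits <- amount }
func Balance() int       { return <-balances }

func teller() {
    var balance int // balance is confined to the teller goroutine
    for {
        select {
        case amount := <-deposits:
            balance += amount
        case balances <- balance:
        }
    }
}

func init() {
    go teller() // start the monitor goroutine
}

// depositConcurrently (hypothetical helper) makes n deposits of the given
// amount from n goroutines and returns how much the balance grew.
func depositConcurrently(n, amount int) int {
    before := Balance()
    var wg sync.WaitGroup
    for i := 0; i < n; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            Deposit(amount)
        }()
    }
    wg.Wait()
    return Balance() - before
}

func main() {
    fmt.Println(depositConcurrently(100, 1)) // prints 100
}
```

No deposit is lost, because only the teller goroutine ever reads or writes balance, and it handles one request at a time.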

  • Method 3: Allow multiple goroutines to access the same variable, but only one at a time. This approach is called mutual exclusion. It will be covered in the next post

References

The Go Programming Language, Alan A. A. Donovan and Brian W. Kernighan

Go Language Learning Notes, Rain Marks