Preface

Let’s start with a little fiction

It's 11 o'clock at night, and programmer Xiao Ming's hands are still flying across the keyboard, his eyes still glued to the screen.

There's no way around it: the company's business has been growing lately, so the requirements keep piling up, and overtime is unavoidable.

The weather was unpredictable. Outside the window, heavy rain poured down amid thunder and lightning.

None of this bothered Xiao Ming, until, out of nowhere, a huge thunderclap knocked out the power in the office building, and the whole building echoed with Xiao Ming's piercing "oh my god."

At this moment, just how large is the psychological shadow looming over Xiao Ming?

After Xiao Ming's heart calmed down, his stomach suddenly started hurting badly and he needed the toilet. He figured something must have been wrong with the burger he'd eaten that evening.

With the power out in the whole building, Xiao Ming couldn't see a thing. He could only feel his way along the wall, step by step, toward the toilet door.

When he reached the toilet (the shared resource), the urge was too strong. Xiao Ming rushed straight in, groped around, found that the first stall's door wasn't locked, grabbed the door, and went inside.

The ridiculous part is that this stall happened to be occupied; its lock was broken, so the person inside couldn't lock the door.

In the dark, Xiao Hong couldn't see either, but by the sound she could tell something was moving in front of the stall door. Sensing that something was wrong, she gathered all her strength and kicked out with her high-heeled foot.

Xiao Ming was "lucky" enough to take the kick right in his "vitals," and let out a heart-rending, lung-splitting cry: "Ow!"

That's the end of the story. The point of all this rambling is simply this: if a shared resource is not locked, then in a multi-threaded environment things can go badly off the rails.

Next, with 30+ pictures, I'll take you inside the operating system to look at the mutual exclusion and synchronization methods that prevent multiple threads from racing on shared resources.


Main text

Competition and Collaboration

In a single-core CPU system, to give the appearance of multiple programs running at the same time, the operating system usually schedules processes with time slices: each process runs for one time slice, and when the slice is used up, the OS switches to the next process. Because a time slice is very short, this creates the phenomenon of "concurrency."

Concurrency

In addition, the operating system creates the illusion of a large, private virtual memory for each process. This address-space abstraction makes it seem as if each program has memory to itself, when in fact the operating system secretly "multiplexes" physical memory or disk across multiple address spaces.

Virtual memory management – swap in and swap out

If a program has only one flow of execution, it is single-threaded. A program can of course have multiple flows of execution, which is the so-called multi-threaded program. A thread is the basic unit of scheduling, while a process is the basic unit of resource allocation.

So there are resources that threads can share, such as the code segment, heap space, data segment, and open files, but each thread has its own stack space.

Multithreading

The problem is that when multiple threads compete for shared resources, the shared data can become corrupted if no effective measures are taken.

Let's do a small experiment: create two threads, each of which increments the shared variable i by 1 a total of 10,000 times. The code is as follows:

The final value of the variable i should be 20,000, but unfortunately it isn't always. Let's run the program above:

Running it twice, I got 15173 one time and 20000 the other.

Not only can each run produce a wrong result, the wrong results differ from run to run. That is intolerable in computing. Even though the error has a small probability, Murphy's law tells us that small-probability events will happen sooner or later.

Why does this happen?

To understand why this happens, we must understand the sequence of instructions the compiler generates to update the counter variable i, that is, the order in which the assembly instructions execute.

In this example, we just want to add 1 to i, so the corresponding assembly executes like this:

You can see that simply adding 1 to i actually takes three instructions when the CPU runs.

Imagine thread 1 enters this code region. It loads the value of i (say 50 at this point) from memory into its register, then adds one to the register, so the value of i in the register is 51.

Now something unfortunate happens: a clock interrupt occurs. The operating system therefore saves the state of the currently running thread into the thread's thread control block (TCB).

Then something worse happens: thread 2 is scheduled and enters the same code. It also executes the first instruction, fetching the value of i from memory into a register. The value of i in memory is still 50, so the value in thread 2's register is also 50. Suppose thread 2 then executes the next two instructions: it adds 1 to i in the register and stores the register value back to memory, so the global variable i becomes 51.

Finally, another context switch occurs and thread 1 resumes execution. Remember, it had executed two of the assembly instructions and is now ready to execute the last one. Recall that the value of i in thread 1's register is 51, so after the last instruction executes and the value is stored to memory, the global variable i is set to 51 again.

In simple terms, the code that increments i (value 50) ran twice, so the final value of i should be 52, but due to uncontrollable scheduling, the final value of i is 51.

To make this clearer, I have drawn a flow chart for thread 1 and thread 2 above:

Blue is thread 1, red is thread 2

The concept of mutual exclusion

The situation shown above is called a race condition. When multiple threads compete to operate on a shared variable, unlucky timing, that is, a context switch in the middle of execution, gives us the wrong result. In fact, each run can produce a different result, so the output is indeterminate.

Because code in which multiple threads operate on shared variables can cause a race condition, we call that code a critical section. It is a code fragment that accesses shared resources and must not be executed by multiple threads simultaneously.

We want this code to be mutually exclusive, meaning that while one thread executes inside the critical section, other threads must be prevented from entering it. In plain English, at most one thread may be executing this code at any time.

The mutex

Also, mutual exclusion is not just for multithreading. When multiple processes compete for shared resources, mutual exclusion can likewise prevent the chaos caused by resource races.

The concept of synchronization

Mutual exclusion solves the problem of concurrent processes/threads using critical sections. This form of interaction, based on controlling the critical section, is relatively simple: once one process/thread enters the critical section, all others attempting to enter it block until the first one leaves.

We all know that in a multi-threaded program, threads do not execute in any fixed order; they advance at independent, unpredictable speeds. Sometimes, however, we want multiple threads to cooperate closely to accomplish a common task.

For example, thread 1 reads data while thread 2 processes it. The two threads cooperate and depend on each other: thread 2 blocks and waits until it receives a wake-up notification from thread 1. When thread 1 has read the data and needs to hand it over, it wakes thread 2 and gives it the data to process.

So-called synchronization means that concurrent processes/threads may need to wait for each other and exchange messages at certain key points. This mutually constrained waiting and information exchange is called process/thread synchronization.

Take a synchronization example from daily life: you're hungry and want to eat, so you ask your mom to cook soon. Mom hears you and starts cooking, but until she finishes the meal, you must block and wait. Once the meal is ready, she will naturally let you know, and then you can eat.

The synchronous relationship between eating and cooking

Note that synchronization and mutex are two different concepts:

  • Synchronization is like: “Operation A should be performed before operation B”, “operation C must be performed after operation A and operation B are completed”, etc.
  • Mutual exclusion is like: “Operation A and operation B cannot be performed at the same time”;

Implementation and use of mutual exclusion and synchronization

During the concurrent execution of processes/threads, cooperative relationships exist between them, such as mutual exclusion and synchronization.

To achieve proper cooperation between processes and threads, the operating system must provide measures and methods for it. There are two main ones:

  • Locks: lock and unlock operations;
  • Semaphores: P and V operations;

Both make process/thread mutual exclusion easy, and semaphores are a bit more powerful than locks: they also make process/thread synchronization easy.

Locks

Using locking and unlocking can solve the mutual exclusion problem of concurrent threads/processes.

Any thread that wants to enter the critical section must first perform the lock operation. If the lock is acquired successfully, the thread enters the critical section; after it finishes accessing the critical resource, it performs the unlock operation to release that resource.

Lock – unlock

Depending on the implementation, locks can be divided into "busy-waiting locks" and "non-busy-waiting locks."

Let's first look at the implementation of the busy-waiting lock.

Before explaining that implementation, let me first introduce a special atomic instruction provided by modern CPU architectures: the test-and-set instruction.

If the test-and-set instruction were expressed in C code, it would take the following form:

The test-and-set instruction does the following things:

  • update old_ptr to the new value new;
  • return the old value of old_ptr;

The key, of course, is that this sequence executes atomically. Because it both tests the old value and sets a new one, we call this instruction "test and set."

So what is an atomic operation? An atomic operation either runs to completion or doesn't run at all; there is no such thing as being half-executed.

We can use the test-and-set instruction to implement a "busy-waiting lock." The code is as follows:

Busy waiting for lock implementation

Let’s make sure we understand why this lock works:

  • First scenario: suppose a thread is running and calls lock(), and no other thread holds the lock, so flag is 0. The thread calls TestAndSet(flag, 1), which returns 0, so the thread breaks out of the while loop and acquires the lock. The call also atomically set flag to 1, marking the lock as held. When the thread leaves the critical section, it calls unlock() to clear flag back to 0.

  • Second scenario: a thread already holds the lock (flag is 1). Another thread calls lock() and then TestAndSet(flag, 1), which this time returns 1. As long as the first thread keeps holding the lock, TestAndSet() keeps returning 1, and this thread busy-waits. When flag is finally set back to 0, this thread's TestAndSet() returns 0 and atomically sets flag to 1, so it acquires the lock and enters the critical section.

Obviously, when the lock cannot be acquired, the thread just loops in the while statement doing nothing useful, which is why this is called a "busy-waiting lock," also known as a spin lock.

This is the simplest kind of lock: it keeps spinning, burning CPU cycles, until the lock becomes available. On a single processor it requires a preemptive scheduler (one that keeps interrupting the running thread via the clock so others can run). Otherwise a spin lock cannot be used on a single CPU, because a spinning thread never gives up the CPU.

Now let's look at the implementation of the "non-busy-waiting lock."

The point of a non-busy-waiting lock is that when the lock is unavailable, the thread does not spin.

Since we don't want spinning, when the lock cannot be acquired we put the current thread into the lock's wait queue and then invoke the scheduler, yielding the CPU to other threads.
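The original code figure is missing; a pseudocode sketch follows. Here `queue_t`, `park()`, `unpark()`, and `gettid()` are hypothetical OS-provided primitives, and `guard` is a tiny spin lock that only protects the lock's bookkeeping, so any spinning on it is very brief:

```
typedef struct {
    int flag;       // is the lock held?
    int guard;      // spin lock protecting flag and the queue
    queue_t *q;     // queue of waiting threads
} lock_t;

void lock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ;                           // briefly spin to acquire guard
    if (m->flag == 0) {
        m->flag = 1;                // lock acquired
        m->guard = 0;
    } else {
        queue_add(m->q, gettid());  // join the wait queue
        m->guard = 0;
        park();                     // block: yield the CPU to the scheduler
    }
}

void unlock(lock_t *m) {
    while (TestAndSet(&m->guard, 1) == 1)
        ;
    if (queue_empty(m->q))
        m->flag = 0;                    // no one waiting: release the lock
    else
        unpark(queue_remove(m->q));     // hand the lock to a waiting thread
    m->guard = 0;
}
```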

Implementation of the non-busy-waiting lock

This article only sketches two simple ways to implement a lock. In a concrete operating system implementation things are of course more complex, but they still contain the two basic elements shown in these examples.

If you want a deeper understanding of locks, I recommend reading chapter 28 of "Introduction to Operating Systems," which is available for free on WeChat Reading.

Semaphores

Semaphores are a means provided by operating systems to coordinate access to shared resources.

Usually a semaphore represents a count of resources, and the corresponding variable is an integer variable (sem).

In addition, two atomically executed system-call functions control the semaphore:

  • P operation: decrement sem by 1; if sem < 0 afterwards, the process/thread blocks and waits, otherwise it continues. This shows the P operation may block;
  • V operation: increment sem by 1; if sem <= 0 afterwards, wake up a waiting process/thread. This shows the V operation never blocks;

The P operation is used before entering the critical section, and the V operation after leaving it; the two operations must appear in pairs.

For example, the semaphore of two resources is equivalent to two train tracks. PV operation is shown in the following figure:

Semaphores and train tracks

How does the operating system implement PV operation?

The semaphore data structure and PV operation algorithm are described as follows:

Algorithm description of PV operation
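The algorithm figure is missing here; in pseudocode (where `queue_t`, `self()`, `block()`, and `wakeup()` are hypothetical kernel primitives), the semaphore structure and the two operations might be described as:

```
typedef struct {
    int sem;        // resource count
    queue_t *q;     // queue of blocked processes/threads
} sem_t;

void P(sem_t *s) {
    s->sem--;                       // one more user of the resource
    if (s->sem < 0) {
        queue_add(s->q, self());    // no resource left: block and wait
        block();
    }
}

void V(sem_t *s) {
    s->sem++;                       // one resource released
    if (s->sem <= 0)
        wakeup(queue_remove(s->q)); // someone is waiting: wake one up
}
```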

The P and V functions are managed and implemented by the operating system, which has made them atomic.

How does the PV operation work?

Semaphores can not only achieve mutually exclusive access control of critical sections, but also synchronize events between threads.

Let’s start by talking about how to use semaphores to achieve mutually exclusive access to critical sections.

Set a semaphore s for each class of shared resource, with an initial value of 1, indicating that the critical resource is free.

Process/thread mutual exclusion can then be achieved simply by placing the code that enters the critical section between P(s) and V(s):

Now any thread that wants to enter the critical section must first perform P on the semaphore and perform V after it finishes accessing the critical resource. Since the initial value of s is 1, after the first thread performs P, s becomes 0, which means the critical resource was free; it is allocated to that thread, which enters the critical section.

If a second thread then wants to enter the critical section, it also performs P first. The result makes s negative, which means the critical resource is already occupied, so the second thread blocks.

Only after the first thread performs V to release the critical resource, restoring s to 0, is the second thread woken up and allowed into the critical section. Once it finishes accessing the critical resource, it performs V as well, restoring s to its initial value of 1.

For two concurrent threads, the mutex semaphore's value can only be 1, 0, or -1:

  • If the mutex is 1, no thread enters the critical region.
  • If the mutex is 0, a thread enters the critical section.
  • If the mutex is -1, one thread enters the critical section and another waits to enter.

Through the mutex semaphore, at most one thread can be executing in the critical section at any time, which achieves the mutual-exclusion effect.

Again, let’s talk about how to synchronize events using semaphores.

Synchronization is done by setting a semaphore with an initial value of 0.

Let's take the earlier "cooking and eating" synchronization example and implement it in code:

When the mother first asks her son whether he wants to eat, she executes P(s1), which amounts to asking whether the son needs to eat. Since the initial value of s1 is 0, s1 becomes -1, indicating that the son does not yet need to eat, so the mother thread blocks and waits.

When the son gets hungry, he executes V(s1), which changes the semaphore s1 from -1 to 0, indicating that the son needs to eat. This wakes the blocked mother thread, and she starts cooking.

Then the son thread executes P(s2), which amounts to asking whether the meal is ready. Since the initial value of s2 is 0, s2 becomes -1 at this point, indicating that the mother has not finished cooking, so the son thread waits.

Finally, the mother finishes the meal and executes V(s2); s2 changes from -1 to 0, waking the waiting son thread, who can now eat.

Producer-consumer problems

Producer-consumer model

Producer-consumer Problem description:

  • After the producer generates the data, it puts it in a buffer.
  • The consumer fetches data from the buffer for processing;
  • Only one producer or consumer can access the buffer at any time;

The analysis of the problem can be concluded as follows:

  • Only one thread can operate on the buffer at any time, so the buffer-manipulating code is a critical section and requires mutual exclusion;
  • When the buffer is empty, the consumer must wait for the producer to generate data; when the buffer is full, the producer must wait for the consumer to fetch data. In other words, producers and consumers need to synchronize.

So we need three semaphores, which are:

  • Mutex semaphore mutex: used for mutually exclusive access to the buffer, initialized to 1;
  • Resource semaphore fullBuffers: used by the consumer to ask whether the buffer holds data, reading data when there is some; initialized to 0 (the buffer starts empty);
  • Resource semaphore emptyBuffers: used by the producer to ask whether the buffer has empty slots before producing data; initialized to n (the buffer size);

Specific implementation code:

If the consumer thread runs first and executes P(fullBuffers), then since the semaphore fullBuffers has initial value 0, fullBuffers becomes -1, indicating that the buffer holds no data, so the consumer can only block and wait.

Next, it's the producer's turn. It executes P(emptyBuffers), reducing the number of empty slots by 1. If no other producer thread is currently executing in the critical section, this producer can put data into the buffer; once done, it executes V(fullBuffers), changing the semaphore fullBuffers from -1 to 0, which indicates that a consumer thread is blocked waiting for data, so the blocked consumer thread is woken up.

Once the consumer thread is woken, if no other consumer is reading at that moment, it can enter the critical section directly and read data from the buffer. Finally, after leaving the critical section, it increments the number of empty slots by 1.


Classical synchronization problems

The dining philosophers problem

Back when I was in school, an interviewer once asked me about the "dining philosophers" problem. I just stared blankly; however the interviewer steered the question, I still didn't get it, and had no clue why it would "deadlock."

Needless to say, that interview was game over.

Today, let me walk you through this problem with pictures.

The dining philosophers problem

Let's look at the description of the dining philosophers problem:

  • 5 philosophers, with nothing better to do, sit around a round table eating noodles;
  • As it happens, the table has only 5 forks, one between each pair of philosophers;
  • The philosophers sit together thinking; when one gets hungry, he wants to eat;
  • Oddly, these philosophers need two forks to eat noodles, meaning one must hold both the left and the right fork to eat;
  • When they finish eating, they put the two forks back and continue thinking.

So the question is: how do we ensure the philosophers act in an orderly fashion, so that no one is left forever unable to grab a fork?

Plan 1

We try to solve it with semaphores, that is, PV operations, as follows:
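The original code figure is missing; a pseudocode sketch follows (`think()` and `eat()` are placeholders). Beware: actually running this version risks the deadlock discussed next.

```
#define N 5                 // number of philosophers

sem_t fork[N];              // one semaphore per fork, each initialized to 1

void philosopher(int i) {   // i: philosopher number, 0..4
    while (1) {
        think();
        P(&fork[i]);            // pick up the left fork
        P(&fork[(i + 1) % N]);  // pick up the right fork
        eat();
        V(&fork[i]);            // put down the left fork
        V(&fork[(i + 1) % N]);  // put down the right fork
    }
}
```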

The program above seems very natural: picking up a fork performs a P operation, meaning take the fork when it's available, and wait for another philosopher to put it back when it isn't.

The problem with Plan 1

However, this solution has an extreme problem: suppose all five philosophers pick up their left forks at the same time. Then no fork is left on the table, no one can get a right fork, and every philosopher blocks at the statement P(fork[(i + 1) % N]). Clearly, a deadlock has occurred.

Plan 2

Since Plan 1 deadlocks when everyone grabs the left fork at the same time, we can add a mutex semaphore around taking the forks. The code is as follows:

The effect of the mutex in this program is that once one philosopher enters the "critical section," reaching for the forks and eating, no one else can do anything; only when that philosopher puts the forks back down can the next philosopher eat.

The problem with Plan 2

Plan 2 lets the philosophers eat in turn, but only one philosopher can eat at any moment, even though the table has five forks and in theory two philosophers could eat at the same time. So from an efficiency standpoint, this is not the best solution.

Plan 3

Since the mutex in Plan 2 allows only one philosopher to eat at a time, let's drop it.

Also, the problem with Plan 1 is that all the philosophers may pick up their left forks at the same time. So let's prevent them from all grabbing the left fork simultaneously by using a branch: philosophers take different actions depending on their number.

Have the even-numbered philosophers take the left fork first and then the right, and the odd-numbered philosophers take the right fork first and then the left.

In the program above, the P operations pick up the left and right forks in different orders depending on the philosopher's number. The V operations need no branch, because a V operation never blocks.

Plan 3 solves the problem

Plan 3 does not deadlock, and it allows two philosophers to eat at the same time.

Plan 4

Another alternative is to use an array state to track whether each philosopher is eating, thinking, or hungry (trying to grab forks).

Then a philosopher may eat only when neither of his neighbors is eating.

The neighbors of the i-th philosopher are defined by the macros LEFT and RIGHT:

  • LEFT : ( i + 5 - 1 ) % 5
  • RIGHT : ( i + 1 ) % 5

For example, if i is 2, then LEFT is 1 and RIGHT is 3.

The specific code is as follows:

The program above uses an array of semaphores, one per philosopher, so a philosopher who wants to eat while the needed forks are occupied simply blocks.

Note that each process/thread runs the smart_person function as its main code, while take_forks, put_forks, and test are ordinary functions, not separate processes/threads.

Plan 4 also solves the problem

Plan 4 likewise does not deadlock, and it allows two philosophers to eat at the same time.

The reader-writer problem

The dining philosophers problem above is useful for modeling processes competing for exclusive access to a limited set of resources, such as I/O devices.

Another well-known problem is the “reader-writer” problem, which creates a model for database access.

The reader reads the data and does not modify it, while the writer can either read or modify the data.

Reader-writer problem description:

  • Read-read permit: Multiple readers are allowed to read at the same time
  • Read-write is mutually exclusive: readers can read when there is no writer and writers can write when there is no reader
  • Write-write is mutually exclusive: the writer writes only when there are no other writers

Next, several solutions are proposed for analysis.

Plan 1

Try to solve it with semaphores:

  • Semaphore wMutex: the mutex controlling write operations, initial value 1;
  • Reader counter rCount: the number of readers currently reading, initialized to 0;
  • Semaphore rCountMutex: the mutex controlling modifications to the rCount counter, initial value 1;

Let’s look at the implementation of the code:

This implementation is a reader-preference strategy: as long as some reader is reading, later readers can come straight in, and if readers keep arriving, the writer starves.

Plan 2

Since there is a reader-preference strategy, there is naturally also a writer-preference strategy:

  • As soon as a writer is ready to write, it should be able to write as soon as possible, and subsequently arriving readers must block;
  • If writers keep arriving, readers starve;

On top of Plan 1, add the following variables:

  • Semaphore rMutex: the mutex controlling readers' entry, initial value 1;
  • Semaphore wDataMutex: the mutex controlling writers' write operations, initial value 1;
  • Writer counter wCount: records the number of writers, initial value 0;
  • Semaphore wCountMutex: the mutex controlling wCount, initial value 1;

The specific implementation code is as follows:

Notice the role of rMutex here: several readers begin reading and all enter the reader queue. Then a writer arrives and executes P(rMutex), after which subsequently arriving readers block on rMutex and cannot join the reader queue. So once a writer arrives, later readers have to queue behind it, which is what gives the writer priority.

Meanwhile, after the first writer executes P(rMutex), it cannot start writing immediately either. It must wait until all readers already in the reader queue finish reading, and the last of them wakes the writer via V(wDataMutex).

Plan 3

Since both the reader-preference and the writer-preference strategies cause starvation, let's implement a fair strategy.

Fair strategy:

  • Readers and writers have the same priority;
  • Writers and readers access the resource mutually exclusively;
  • Only one writer may access the critical section at a time;
  • Multiple readers may access the critical resource at the same time;

Specific code implementation:

After reading the code, you may be wondering: why does adding a semaphore flag achieve fair competition?

Comparing with the reader-preference strategy of Plan 1, we find that there, as long as new readers keep arriving, they can enter the reader queue, while the writer must wait until no more readers arrive.

If no readers arrive, the reader queue is empty, that is, rCount==0, at which point the writer can enter the critical section to perform the write operation.

The role of flag is to strip readers of this special privilege (the privilege being that a reader may enter the reader queue the moment it arrives).

For example, some readers begin reading data and all enter the reader queue. Then a writer arrives and performs the P(flag) operation, so subsequently arriving readers block on flag and cannot enter the reader queue. The reader queue therefore gradually empties, i.e. rCount drops to 0.

This writer cannot start writing immediately either (the reader queue is not yet empty), so it blocks on the semaphore wDataMutex. After all readers in the reader queue finish reading, the last reader executes V(wDataMutex), waking the earlier writer, and the writer then proceeds to write.


Chatter

True to form, Xiao Lin is late again and missed the "weekly update." I'm ashamed; I reckon many readers have already forgotten who Xiao Lin is. Xiao Lin begs you all not to unfollow!

Yes, just last week, on a perfectly nice weekend, Xiao Lin got lazy and spent two days rewatching the "Dragon Ball Z" anime. It was so much fun; I've seen it many times, but what can I say, I'm a Dragon Ball fan.

That's life: sometimes loose, sometimes tight.

Xiao Lin is your tool man. Goodbye, see you next time!


Good article recommendation:

Process and thread basics family bucket, a full set in 30 pictures

20 images reveal the mystery of memory management