preface

Back when I was learning the C++ keyword volatile, I read the article "C/C++ Volatile Keyword In-depth Analysis" by He Dengcheng, a database expert at Alibaba. I struggled with it, mainly because I did not yet understand what visibility, atomicity, and orderliness were, nor the memory model and its related specifications. Many beginners, like me, jump straight to volatile, synchronized, and wait/notify in their Java concurrency books — for example, skipping chapters 1 and 2 of Java High Concurrency Programming. A recent concurrency column on Geek Time cleared up many things I had not understood while studying C++. I have organized the concepts I now understand into a series of notes as preparation for concurrent programming. The article may be a bit long, but the content is relatively simple, so please read patiently.

Why concurrent programming

In the days when CPUs were slow, memory was small, and hard disks were expensive, most programs were serial — single-process, let alone multithreaded. But as hardware developed, CPUs became more powerful with more and more cores, memory grew larger, and disks got cheaper. To make full use of the computer's hardware, especially its most precious resource, the CPU, a single thread or single process was no longer enough, and concurrent programs gradually appeared. Concurrent programs make full use of computer resources on the one hand and respond to users more quickly on the other — the best of both worlds. But a new world opens with both light and darkness: by learning concurrent programming we can navigate the new world, and by understanding it we can avoid its darkness.

Multithreading and multiprocess in concurrency

When we talk about concurrent programming, we usually mean concurrency within one program rather than concurrency between different programs. The former is the programmer's problem to solve; the latter is the operating system's. There are several common approaches to concurrent programming: multiprocess, multithreading, and combinations of the two. C++ programs can use either multiple processes or multiple threads, while concurrent programming in Java is essentially multithreaded within a single JVM process. This choice makes sense: switching between threads consumes fewer resources and is faster than switching between processes. The difference between a process and a thread should be familiar to any programmer who has ever interviewed, but note that the rest of this discussion assumes single-process, multithreaded concurrent programming.

The premise that the program can be concurrent

IO-intensive and CPU-intensive are familiar terms. The former means the program spends most of its execution on IO operations (disk reads and writes, network reads and writes) — for example, MySQL reading and writing data. The latter means it spends most of its execution on the CPU — for example, MATLAB performing matrix multiplication. Apart from sleeping, a running program is always either using the CPU or doing IO, and this is exactly what makes concurrency possible. Imagine if every program on a machine only used the CPU and never did IO — programs like while(true){}!

We know that the CPU is the brain of the computer and is, in theory, needed for every operation — so why are CPU-intensive and IO-intensive discussed separately? The answer is DMA (Direct Memory Access), which lets the CPU be involved only at the beginning and end of an I/O operation; the rest of the time the CPU is free to do other work. This makes it possible for CPU work and IO to proceed in parallel. So there are two prerequisites for a program to benefit from concurrency:

  • CPU execution instructions and IO operations can be parallel
  • Most programs use both CPU and IO

The difficulty of concurrent programming

Concurrent programming is difficult for the following reasons:

  • Theory: Unlike other syntactic parts of a language, concurrent programming involves not only key language features but also the underlying operating system. In many languages, the concurrency section alone could fill a thick book, so the theory runs deep.
  • Low exposure: In a few months I will have been working for three years and programming for ten. In all that time, I have only written a handful of concurrent programming examples while studying and have rarely used concurrency at work. Concurrent programming — especially high concurrency — is not common in most jobs; it mainly appears in middleware systems.
  • Error prone: Concurrent programs often exhibit weird bugs caused by visibility, atomicity, and orderliness issues, and these bugs are hard to reproduce and locate. Many teams simply give up on concurrency; for example, the large systems I maintained during my time at Tencent were single-process and single-threaded — when performance fell short, we added machines.

Core knowledge

There is a saying that to kill a snake you strike seven inches below its head; likewise, to learn concurrent programming you must grasp its key technologies. But the key to learning those key technologies is to understand what problems they were designed to solve and how they solve them — then you learn faster and remember better. The key techniques of multithreaded concurrent programming are all built around visibility, atomicity, and orderliness. Master their definitions and the scenarios that trigger problems, and you will be able to track down concurrency bugs and learn concurrent programming with ease.

What is visibility

Visibility: a change made by one thread to a shared variable can be immediately seen by other threads. Serial programs have no visibility problems, but concurrent programs may. One way the problem arises is as follows:

A CPU executes instructions tens or even hundreds of times faster than memory can be read or written, so if every read and write instruction went directly to memory, most of the CPU's performance would be wasted. As anyone who has studied operating systems knows, hardware engineers solved this by adding a cache between the CPU and memory. When the CPU executes read and write instructions, data is read into the cache and results are stored there, then synchronized back to memory at some later point. This brings many benefits, but it introduces a new problem: cache consistency.

The visibility problem is a broad one; the CPU cache is only one possible cause. Compiler optimizations and instruction reordering, among other things, can also break visibility.
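To make visibility concrete, here is a minimal Java sketch (class and field names are my own, not from any source). A reader thread spins on a shared flag; the volatile modifier is what guarantees the writer's update becomes visible. Without volatile, on some JVMs and platforms, the JIT may hoist the read out of the loop and the reader can spin forever.

```java
import java.util.concurrent.TimeUnit;

class VisibilityDemo {
    // Without 'volatile', the reader thread might never see this update.
    static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!stop) {
                // spin until the writer's update to 'stop' becomes visible
            }
        });
        reader.start();
        TimeUnit.MILLISECONDS.sleep(100);
        stop = true;       // the write the reader thread needs to see
        reader.join(2000); // wait up to 2s for the reader to exit
        System.out.println(reader.isAlive() ? "reader stuck" : "reader stopped");
    }
}
```

Try deleting volatile and running with the server JIT: the loop body may be compiled with a cached read of stop, illustrating exactly the cache/optimization causes described above.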

What is atomicity

Atomicity: the property that one or more CPU operations execute without interruption. The word "atom" is familiar to programmers because the concept also appears in databases. In chemistry, an atom was once considered an indivisible fundamental particle. Back in the computer world, many programmers who have never looked under the hood assume that every statement in a high-level language is an atomic operation, but it is not — take a simple statement such as i += 1. In C++, this requires at least three CPU instructions.

  • Instruction 1: first, load the variable i from memory into a CPU register;
  • Instruction 2: then, perform i += 1 in the register;
  • Instruction 3: finally, write the result back to memory.
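This three-step load/modify/store sequence is exactly why the innocent-looking count++ races between threads: two threads can both load the old value before either stores. A minimal Java sketch (class and field names are mine) contrasting a plain int counter with java.util.concurrent.atomic.AtomicInteger:

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicityDemo {
    static int plain = 0;                                // plain++ is 3 steps: load, add, store
    static final AtomicInteger atomic = new AtomicInteger(); // atomic read-modify-write

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                plain++;                  // not atomic: increments can be lost
                atomic.incrementAndGet(); // atomic: no increment is ever lost
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println("atomic = " + atomic.get()); // always 200000
        System.out.println("plain = " + plain);         // often less than 200000
    }
}
```

The plain counter's final value varies from run to run, which is also why such bugs are so hard to reproduce.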

If the program is single-threaded and serial, there is no atomicity problem even when one statement becomes multiple instructions. In a concurrent program, this is no longer true. Java's long type is 8 bytes, and on a 32-bit system, reading or writing such a variable takes two memory operations (two virtual machine instructions), and the Java Virtual Machine specification allows those two operations to be non-atomic. The following scenario can therefore occur:

  • Thread 1 has just read the first 4 bytes of i and is about to read the remaining 4 bytes;
  • Thread 2 writes a new value to i, changing the last 4 bytes;
  • Thread 1 reads the last 4 bytes of i and uses the combined value.

Thread 1 ends up with a torn, incorrect value of i — an atomicity problem. See Understanding the Java Virtual Machine: Advanced JVM Features and Best Practices (2nd edition), section 12.3.4, "Special rules for long and double variables".

What is orderliness

Orderliness: the order in which a program's operations execute. We tend to assume code executes from top to bottom, and in a single thread that is effectively true. In concurrent programs, however, execution can appear out of order, causing orderliness problems. In short: observed from within a thread, its own operations appear in order; observed from another thread, they may appear out of order. Consider the following code:

class OrderingExample {
    int x = 0;
    boolean flag = false;

    public void writer() {
        x = 42; // The ultimate answer to the universe
        flag = true;
    }

    public void reader() {
        if (flag) {
            // x = ?
        }
    }
}

In a single thread, x must be 42 inside reader once flag is true; in a concurrent program, however, x may turn out to be 0. Compiler optimizations, instruction reordering, and the like may cause the code above to execute as if it were written like this:

class OrderingExample {
    int x = 0;
    boolean flag = false;

    public void writer() {
        flag = true; // reordered before the write to x
        x = 42;      // The ultimate answer to the universe
    }

    public void reader() {
        if (flag) {
            // x may still be 0 here
        }
    }
}

Suppose thread 1 runs the reordered writer and, right after flag = true but before x = 42, the CPU switches to thread 2. Thread 2 runs reader, sees flag == true, and reads x as 0 — an orderliness problem.

The memory model

Given visibility, atomicity, and orderliness problems, how can concurrent programs correctly read and write shared variables in memory? That is what the memory model is for: it defines a specification for the behavior of read and write operations in a multithreaded, shared-memory program. These rules regulate memory reads and writes so that instruction execution remains correct. The memory model concerns the processor, the cache, concurrency, and the compiler. It addresses the memory-access problems caused by multi-level CPU caches, processor optimizations, and instruction reordering, and it guarantees visibility, atomicity, and orderliness in concurrent scenarios.
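As a concrete instance of such a rule, the Java Memory Model's happens-before guarantee for volatile repairs the OrderingExample shown earlier: a write to a volatile field happens-before every subsequent read of that field, which forbids the problematic reordering. A sketch (the single-threaded driver in main is mine, just to keep the example checkable):

```java
class OrderingFixed {
    int x = 0;
    // volatile: the JMM forbids reordering the write to x past this write,
    // and a read that sees flag == true is guaranteed to also see x == 42.
    volatile boolean flag = false;

    void writer() {
        x = 42;      // (1) ordinary write
        flag = true; // (2) volatile write: (1) happens-before (2)
    }

    int reader() {
        if (flag) {  // (3) volatile read: (2) happens-before (3)
            return x; // guaranteed to see 42, never 0
        }
        return -1;   // flag not yet set
    }

    public static void main(String[] args) {
        OrderingFixed demo = new OrderingFixed();
        demo.writer();
        System.out.println(demo.reader()); // prints 42
    }
}
```

This is the kind of rule a memory model specifies: not "no reordering ever", but a precise contract about which reorderings a programmer is allowed to observe.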

The memory model is only a model, that is, a specification; each language has its own implementation. Java has the Java Memory Model (JMM), and C++ has the C++ memory model (whose implementation depends on the underlying platform). One thing to note here is the difference between a memory model and an object model. When I was learning C++, there was a famous book, Inside the C++ Object Model, about how C++ objects are laid out in memory — which is completely different from the C++ memory model, and the same distinction holds in Java. Also distinguish the memory model from the memory structure; many people confuse the two. If you don't believe me, search Baidu for "memory model" — many results actually describe the heap and the stack, which belong to the memory structure (heap, stack, static area), not the memory model.

Operating systems and JVMS

I originally did not want to include this section, since this prequel was meant to cover language-independent concepts, but I added it as a reminder to Java programmers. We all know C++ programs are not portable, mainly because the compiled binary depends on the operating system. Java is different: it can truly "write once, run anywhere", mainly thanks to the Java Virtual Machine (JVM). The JVM adds a layer of abstraction over the operating system that hides OS-level details. For a Java program, the JVM effectively is the operating system, so many OS concepts carry over directly to the JVM — processes/threads, IO operations, and so on — and most books do not bother to distinguish them, because most of these APIs are implemented by the JVM calling native methods. Some concepts differ, however: virtual machine instructions, the virtual machine program counter, main memory and working memory, the JMM, and so on, presumably because their implementations differ from the operating system's. By analogy, in the Java Virtual Machine, virtual machine instructions correspond to CPU instructions, main memory corresponds to physical memory, and working memory corresponds to the CPU cache.

conclusion

The content of this article is language-independent; it covers some preliminary concepts of concurrent programming as I personally understand them. I hope it helps you, and please point out anything I got wrong. Remember to follow my WeChat public account, which records a C++ programmer's road to learning Java.