Introduction to Multithreading: Step by step approach to the world of multithreading

A gentle introduction to multithreading

Say, Say, Triangles

The Nuggets translation Project

Permanent link to this article: github.com/xitu/gold-m…

Translator: steinliber

Proofreader: Graywd, Endone

Modern computers have the ability to perform multiple operations at the same time. Supported by more advanced hardware and smarter operating systems, this feature can make your programs run and respond faster.

Writing software that takes advantage of this feature can be fun, but also tricky: it requires you to understand what’s going on under your computer’s hood. In this first installment, I’ll briefly cover threads, one of the tools operating systems provide to perform this kind of magic. Let’s get started!

Processes and threads: naming things the right way

Modern operating systems can run multiple programs at the same time. That’s why you can read this article in your browser (one program) while listening to music on your player (another program). Each running program is known as a process. The operating system knows many software tricks for making one process run alongside others, and it also takes advantage of the underlying hardware. Either way, the end result is the feeling that everything is running at the same time.

Running multiple processes is not the only way to perform several operations simultaneously. Each process can also run multiple sub-tasks, called threads, at the same time. You can think of a thread as a slice of the process itself. Each process starts at least one thread when it launches, called the main thread. Additional threads can then be started and terminated within the process as the program (or its developer) requires. Multithreading is the technique of running multiple threads in the same process.

For example, your player might have multiple threads running: one for rendering the interface — usually the main thread, another for playing music, and so on.
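As a minimal sketch of the idea, here is how a program could start one additional thread next to its main thread using Python’s standard threading module. The play_music function and the "audio" thread name are made up for this illustration and do not come from any real player.

```python
import threading

def play_music(log):
    # Runs in the additional thread and records which thread executed it.
    log.append(threading.current_thread().name)

log = []

# The process already has one thread when it starts: the main thread.
log.append(threading.main_thread().name)

# Start an additional thread inside the same process, then wait for it to end.
audio = threading.Thread(target=play_music, args=(log,), name="audio")
audio.start()
audio.join()

print(log)
```

The list ends up holding the name of the main thread followed by the name of the thread that ran play_music.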

You can think of an operating system as a container containing multiple processes, each of which is a container containing multiple threads. In this article, I’ll focus only on threads, but this whole topic is fascinating enough to warrant a more in-depth analysis in the future.

Figure 1: An operating system can be thought of as a box containing processes, each of which is in turn a box containing one or more threads.

The difference between processes and threads

Each process has its own chunk of memory, allocated by the operating system. By default, processes cannot share memory with each other: the browser cannot access memory allocated to the music player, and vice versa. The same holds if you run two instances of the same program (say, if you launch the browser twice): the operating system treats each instance as a new process and gives it its own separate memory. So, in general, processes cannot share data with each other unless they use more advanced techniques known as interprocess communication (IPC).

Unlike processes, threads share the same block of memory allocated by the operating system to their parent process: this makes it easy for the player’s audio engine to read data from the main interface, and vice versa. Threads therefore communicate with each other more easily than processes. In addition, threads are generally lighter than processes: they take up fewer resources and are faster to create, which is why they are also called lightweight processes.
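To make the shared-memory point concrete, here is a small Python sketch in which a worker thread changes an object that the main thread created. The playlist and audio_engine names are hypothetical, chosen only to echo the music player example.

```python
import threading

# A plain list living in the process's memory: every thread can reach it.
playlist = ["intro.mp3"]

def audio_engine():
    # The worker thread mutates the very same object the main thread created.
    playlist.append("outro.mp3")

worker = threading.Thread(target=audio_engine)
worker.start()
worker.join()  # wait for the worker to finish before reading

# The main thread sees the change directly: no copying and no
# interprocess communication were needed.
print(playlist)  # ['intro.mp3', 'outro.mp3']
```

Two separate processes would each have their own copy of such a list, and a change made in one would not be visible in the other.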

Threads are an easy way to have your program perform multiple operations at the same time. Without threads, you need to write a program for each task, run them as processes and synchronize those processes through the operating system. This is not only harder (interprocess communication is tricky) but also slower (processes are heavier than threads).

Green threads, fibers

The threads mentioned so far are an operating system concept: a process that wants to start a new thread must go through the operating system. However, not every platform supports threads natively. Green threads, also known as fibers, are an emulation of threads that lets multithreaded programs work in environments without native thread support. For example, a virtual machine can implement green threads even when the underlying operating system has no native support for threads.

Green threads are faster to create and manage because they bypass the operating system entirely, but they also have drawbacks. I’ll cover this topic in a later installment.

The name “green threads” comes from the Green Team at Sun Microsystems, which designed Java’s original thread library in the 1990s. Java no longer uses green threads: it switched to native threads in 2000. Some other programming languages, such as Go, Haskell, or Ruby, implement (or have implemented) equivalents of green threads instead of relying on native threads alone.

What threads are for

Why should a process use more than one thread? As I mentioned earlier, doing things in parallel can greatly speed things up. Suppose you want to render a movie in a movie editor. If the editor is smart enough, it can spread the rendering over multiple threads, with each thread handling a portion of the movie. So if the task takes an hour with one thread, it would ideally take 30 minutes with two threads, 15 minutes with four threads, and so on.
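The splitting step can be sketched in Python with the standard concurrent.futures module. Both render_chunk and render_movie are hypothetical stand-ins: the "rendering" here just doubles each frame number. Note also that in standard CPython builds the global interpreter lock keeps pure-Python CPU-bound code from running in parallel across threads, so this sketch illustrates how the work is divided rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def render_chunk(frames):
    # Stand-in for real rendering work: "render" each frame by doubling it.
    return [frame * 2 for frame in frames]

def render_movie(frames, num_threads):
    # Split the frame list into at most num_threads contiguous chunks.
    chunk_size = max(1, -(-len(frames) // num_threads))  # ceiling division
    chunks = [frames[i:i + chunk_size] for i in range(0, len(frames), chunk_size)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields results in submission order, so frames stay in sequence.
        rendered_chunks = pool.map(render_chunk, chunks)
        return [frame for chunk in rendered_chunks for frame in chunk]

movie = render_movie(list(range(8)), num_threads=4)
print(movie)  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Each worker thread handles one portion of the movie, and the portions are stitched back together in their original order.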

Is it really that simple? Here are three things to consider:

Not every program needs multiple threads. If your application performs inherently sequential operations or spends most of its time waiting for the user to do something, multithreading might not bring much benefit;
You can’t simply add more threads to an application to make it run faster: each subtask must be carefully thought through and designed to operate in parallel;
There is no 100% guarantee that threads will truly execute their operations in parallel (i.e., simultaneously): it really depends on the underlying hardware on which the program is running.

The last point deserves emphasis: if your computer cannot perform multiple operations at the same time, the operating system has to fake it. We’ll see how in a minute. For now, let’s think of concurrency as the perception of tasks running at the same time, and of true parallelism as tasks literally running at the same time.

Figure 2: Parallelism is a subset of concurrency.

What makes concurrency and parallelism possible

The computer’s central processing unit (CPU) does the grunt work of running programs. It consists of several parts, the main one of which is called the core: this is where the calculations are actually performed. A core can only perform one operation at a time.

This is, of course, a major limitation. For this reason, operating systems have developed advanced techniques that let users run multiple processes (or threads) at once, especially in graphical environments, even on single-core machines. The most important of these is called preemptive multitasking, where preemption is the ability to interrupt a running task, switch to another one, and resume the first task at a later time.

So if you have only one CPU core, part of the operating system’s job is to spread that single core’s computing power across multiple processes or threads, which take turns executing one after another in a loop. This gives you the illusion that more than one program is running in parallel or, if a program uses multiple threads, that it is doing many things at once. This satisfies concurrency but not true parallelism: the ability to run tasks literally simultaneously is still missing.

Modern CPUs have multiple cores, each of which performs an independent operation at a time. This means that true parallelism can be achieved with multiple cores. For example, my Intel Core i7 processor has four cores: it can run four different processes or threads at the same time.

The operating system can detect the number of CPU cores and assign processes or threads to each of them. A thread can be assigned to whichever core the operating system prefers, and this scheduling is completely transparent to the running program. In addition, preemptive multitasking kicks in when all the cores are busy. This allows you to run more processes and threads than the number of cores actually available.
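A short Python sketch can illustrate both points: a program can ask how many cores the operating system reports, and it can start more threads than that without ever choosing a core itself. The task function and the done list are hypothetical names used only for this example.

```python
import os
import threading

# Number of logical cores reported by the operating system
# (os.cpu_count() may return None, hence the fallback to 1).
cores = os.cpu_count() or 1

# Deliberately start twice as many threads as there are cores.
thread_count = cores * 2
done = [False] * thread_count

def task(index):
    # Each thread writes only to its own slot of the list.
    done[index] = True

threads = [threading.Thread(target=task, args=(i,)) for i in range(thread_count)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"{cores} cores reported, {sum(done)} threads completed")
```

The program never says which core should run which thread: that decision is left entirely to the operating system’s scheduler.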

Multithreaded applications on a single core: does it make sense?

True parallelism is impossible on a single-core machine. However, it still makes sense to write a multithreaded program for one if your application can benefit from it. When a process uses multiple threads, preemptive multitasking keeps the application running even if one of the threads is performing a slow or blocking task.

Let’s say you’re developing a desktop application that reads some data from a slow disk. If you write a single-threaded program, the entire application will be unresponsive until the read completes: the CPU time allocated to its only thread is wasted waiting for the disk to respond. Of course, the operating system keeps running many other processes in the meantime, but your particular application will not make any progress.

Let’s rethink your application in a multi-threaded way. Thread A of the program is responsible for disk access and thread B is responsible for the main interface. If thread A is stuck due to slow device reads, thread B is still running the main interface, keeping your application responsive. This is possible because with two threads, the operating system can switch between them to allocate CPU resources without letting the program get stuck with slower threads.
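Here is a minimal Python sketch of that two-thread design. The slow disk is simulated with time.sleep, the main thread plays the role of thread B, and the read_from_slow_disk function and ui_updates counter are invented for this illustration.

```python
import threading
import time

def read_from_slow_disk(result):
    # Thread A: time.sleep stands in for a blocking read from a slow device.
    time.sleep(0.2)
    result["data"] = "file contents"

result = {}
thread_a = threading.Thread(target=read_from_slow_disk, args=(result,))
thread_a.start()

# Thread B (here, the main thread) keeps servicing the "interface"
# while thread A is blocked on the disk.
ui_updates = 0
while thread_a.is_alive():
    ui_updates += 1   # stand-in for redrawing the interface
    time.sleep(0.01)

thread_a.join()
print(f"interface refreshed {ui_updates} times, loaded: {result['data']}")
```

In a single-threaded version the loop could not run until the read had returned, so the interface would stay frozen for the whole duration of the read.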

The more threads, the more problems

As we know, threads share the same block of memory as their parent process. This makes it easy to exchange data between threads of the same application. For example, a movie editor might hold the video timeline in a large portion of shared memory. That shared memory is read by several worker threads that render the movie to a file. All they need is a handle (such as a pointer) to the memory region so that they can read the data and write the rendered frames to disk.

This works fine as long as multiple threads are reading from the same memory location. Trouble starts when one or more of them write to shared memory while other threads are reading from it. There are two problems:

Data race – while a writer thread modifies memory, a reader thread may be reading it. If the writer has not finished its work yet, the reader gets corrupted data;
Race condition – a reader thread is supposed to read only after a writer has finished writing. What if things happen in the opposite order? More subtle than a data race, a race condition occurs when multiple threads do their work in an unpredictable order when, in fact, the operations must be performed in a specific sequence. Your program can still trigger a race condition even if it is protected against data races.

The concept of thread safety

A piece of code is said to be thread-safe if it works correctly, that is, without data races or race conditions, even when many threads execute it at the same time. You may have noticed that some libraries declare themselves thread-safe: if you’re writing a multithreaded program, look for such statements to make sure that third-party functions can be used across threads without triggering concurrency problems.

The root cause of data races

We know that a CPU core can only execute one machine instruction at a time. Such an instruction is said to be atomic because it is indivisible: it cannot be broken down into smaller operations. The word comes from the Greek “atomos” (ἄτομος), meaning uncuttable.

Being indivisible makes atomic operations inherently thread-safe. When a thread performs an atomic write on shared data, no other thread can read the modification half-complete. Conversely, when a thread performs an atomic read on shared data, it reads the entire value as it appeared in memory at a single moment in time. No other thread can slip in while an atomic operation is in progress, so no data race can occur.

Unfortunately, the vast majority of operations are non-atomic. On some hardware, even a simple assignment such as x = 1 may be composed of multiple machine instructions, making the assignment itself non-atomic as a whole. A data race is then triggered if one thread reads x while another thread is performing the assignment.
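A torn assignment is hard to reproduce on demand, so the Python sketch below shows a closely related and classic consequence of non-atomicity instead: a lost update. An increment is really a read followed by a write, and a threading.Barrier is used here purely as an artificial device to force the unlucky interleaving every time. Real programs hit it only occasionally, which is what makes it so hard to spot. The counter and increment names are hypothetical.

```python
import threading

counter = 0
both_have_read = threading.Barrier(2)

def increment():
    global counter
    local_copy = counter        # step 1: read the shared value
    both_have_read.wait()       # artificially wait until both threads have read
    counter = local_copy + 1    # step 2: write back a result based on stale data

threads = [threading.Thread(target=increment) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter)  # 1, not 2: one of the two increments was lost
```

Both threads read 0 before either of them writes, so both write 1 and the second increment silently overwrites the first.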

The root cause of race conditions

Preemptive multitasking gives the operating system complete control over thread management: it can start, stop, or suspend threads based on advanced scheduling algorithms. As a developer, you have no control over the timing or order in which threads are executed. In fact, simple code like the following does not guarantee a particular order:

writer_thread.start()
reader_thread.start()

Run this program several times and you’ll notice that it behaves differently on each run: sometimes the writer thread starts first, sometimes the reader does. If your program needs the write to always happen before the read, you’re bound to hit a race condition.

This behavior is called nondeterminism: the result of a run changes every time and you can’t predict it. Debugging programs that are affected by race conditions is very annoying because you can’t always reproduce the problem in a controlled way.
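As a preview of one possible remedy, and not the only one, the Python sketch below uses a threading.Event so that the reader always waits for the writer, no matter which thread the operating system happens to run first. The shared dictionary, the observed list and the data_ready event are hypothetical names for this illustration.

```python
import threading

shared = {}
observed = []
data_ready = threading.Event()

def writer():
    shared["value"] = 42
    data_ready.set()            # signal that the write is complete

def reader():
    data_ready.wait()           # block until the writer has signalled
    observed.append(shared["value"])

writer_thread = threading.Thread(target=writer)
reader_thread = threading.Thread(target=reader)

# Start the reader first on purpose: the outcome is the same either way.
reader_thread.start()
writer_thread.start()
writer_thread.join()
reader_thread.join()

print(observed)  # [42]
```

The start order is still up to the scheduler, but the explicit signal makes the order of the write and the read deterministic.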

Teaching threads to get along: concurrency control

Data races and race conditions are real-world problems: some people have even died because of them. The art of coordinating two or more concurrent threads is called concurrency control, and operating systems and programming languages offer several solutions for it. The most important of these are:

Synchronization – a way to ensure that a resource is used by only one thread at a time. Synchronization means marking specific parts of your code as “protected” so that two or more concurrent threads cannot execute them simultaneously and mess up the shared data;
Atomic operations – thanks to special instructions provided by the operating system, many non-atomic operations (like the assignment mentioned earlier) can be turned into atomic ones. This way, the shared data always remains in a valid state, no matter how other threads access it;
Immutable data – shared data is marked as immutable and nothing can change it: threads can only read from it, which eliminates the root cause. As we know, threads can safely read from the same memory location as long as none of them modifies it. This is the main philosophy behind functional programming.
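The first of these techniques, synchronization, can be sketched in Python with a threading.Lock that marks an increment as a protected section. This is one illustrative approach under simple assumptions, and the counter, counter_lock and add_many names are hypothetical.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # Protected section: only one thread at a time can hold the lock,
        # so the read-modify-write below is never interleaved.
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counter)  # 40000
```

The price of the lock is that threads sometimes have to wait for each other, which is why protected sections are usually kept as short as possible.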

I’ll discuss all of these fascinating topics in the next installments of this mini-series on concurrency. Stay tuned!

