This article starts from how processes and threads are implemented in Linux, moves on to the Linux thread model, and finishes with a brief look at the cost of thread switching.

I am just starting to learn this, so not everything here is necessarily right. Corrections are welcome!

Processes and threads in Linux

First, the basic concepts of process and thread:

  • A process is the basic unit of resource allocation
  • A thread is the basic unit of CPU scheduling
  • A process may contain multiple threads
  • Threads share the resources of their process

Basic principles

User-mode processes and threads in Linux largely match the concepts above, but the kernel does not distinguish between processes and threads. In the kernel, everything can be regarded as a process: some are “normal” processes and some are “lightweight” processes (pthread threads, for example those of NPTL), and each one is described by a task_struct structure.

Use fork to create a process and pthread_create to create a thread. fork is a system call, while pthread_create is a library function built on the clone system call. Both end up in do_fork in the kernel (renamed kernel_clone in recent kernels), which copies the task_struct structure and adds the new task to the kernel scheduler.

A process is the basic unit of resource allocation, and threads share the resources of the process

Creating a normal process requires a deep copy of virtual memory, file descriptors, signal handlers, and so on (in practice, memory pages are copied lazily through copy-on-write). Lightweight processes are “lightweight” because most of this information, such as virtual memory, only needs a shallow copy, so multiple lightweight processes share the resources of one process.

Threads are the basic unit of CPU scheduling, and there may be multiple threads in a process

Linux introduced the concept of thread groups: what was originally a “process” (one task_struct) now corresponds to a thread, and a “thread group” corresponds to a process, which is how one process can have multiple threads:

  • The operating system contains multiple process groups
  • A process group contains multiple processes (1:n)
  • Each process corresponds to one thread group (1:1)
  • A thread group contains multiple threads (1:n)

Each task_struct is associated with three IDs: pgid identifies the process group, tgid identifies the thread group, and pid identifies the individual process or thread. Given a process group, the concepts above correspond as follows:

  • A process group has a leader process (loosely, the “parent process”) whose pid equals the pgid of the process group. The other processes in the group are typically child processes of the leader, and their pid differs from the pgid.
  • Each process corresponds to a thread group whose tgid equals the pid of the process.
  • A thread group has a “main thread” (a loose term by analogy with the main process; it must not be called a “parent thread”) whose pid equals the tgid of the thread group. All other threads in the thread group are peers of the main thread, and their pid differs from the tgid.

Therefore, getpgid returns the pgid, getpid returns the tgid, and gettid returns the pid. Be careful not to confuse them.
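The mapping between these calls and the three IDs can be checked from user space. The following is a small sketch, assuming Linux and Python 3.8 or later. The helper names current_ids and ids_in_new_thread are made up for illustration; threading.get_native_id() stands in for gettid here, since on Linux it reports the kernel thread ID.

```python
import os
import threading


def current_ids():
    """Return the three IDs of the calling thread, using the article's terms."""
    return {
        "pgid": os.getpgid(0),             # process group ID
        "tgid": os.getpid(),               # getpid() reports the thread group ID
        "pid": threading.get_native_id(),  # kernel thread ID, as gettid() would return
    }


def ids_in_new_thread():
    """Collect the same three IDs from inside a freshly started thread."""
    result = {}
    t = threading.Thread(target=lambda: result.update(current_ids()))
    t.start()
    t.join()
    return result


if __name__ == "__main__":
    print("calling thread:", current_ids())
    print("second thread: ", ids_in_new_thread())
```

Both threads should report the same pgid and tgid, while the pid differs because each thread has its own task_struct.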

It is easy to see that threads other than the main thread are basic units of CPU scheduling. The main thread also has its own task_struct and can be scheduled onto a CPU, so it is a basic unit of CPU scheduling as well.

All threads with the same tgid form a conceptual “process”. Only the main thread actually has resources allocated when it is created; the other threads share the main thread's resources through shallow copies. Combined with the normal and lightweight processes introduced earlier, this is why the process is the basic unit of resource allocation.

An example

pgid  tgid  pid
111   111   111
112   112   112
112   112   113
113   113   113
113   113   114
113   115   115
113   115   116
113   115   117

  • There are three process groups: 111, 112, and 113
    • Process group 111 has one parent process, 111, with its own resources
      • Process 111 has one thread, 111, which uses the resources of process 111
    • Process group 112 has one parent process, 112, with its own resources
      • Process 112 has two threads, 112 and 113, which share the resources of process 112
    • Process group 113 has one parent process, 113, and one child process, 115, each with its own resources
      • Process 113 has two threads, 113 and 114, which share the resources of process 113
      • Process 115 has three threads, 115, 116, and 117, which share the resources of process 115
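The reading of the table above can be reproduced mechanically. This sketch copies the (pgid, tgid, pid) rows from the table and groups them first by pgid and then by tgid. The names TASKS and group_tasks are made up for illustration.

```python
from collections import defaultdict

# (pgid, tgid, pid) rows copied from the example table
TASKS = [
    (111, 111, 111),
    (112, 112, 112),
    (112, 112, 113),
    (113, 113, 113),
    (113, 113, 114),
    (113, 115, 115),
    (113, 115, 116),
    (113, 115, 117),
]


def group_tasks(tasks):
    """Map each process group to its processes, and each process to its threads."""
    groups = defaultdict(lambda: defaultdict(list))
    for pgid, tgid, pid in tasks:
        groups[pgid][tgid].append(pid)
    # Convert the nested defaultdicts into plain dicts for easy printing.
    return {pgid: dict(processes) for pgid, processes in groups.items()}


if __name__ == "__main__":
    for pgid, processes in group_tasks(TASKS).items():
        print("process group", pgid)
        for tgid, threads in processes.items():
            print("  process", tgid, "threads", threads)
```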

Summary

It is now much easier to understand processes and threads in Linux:

  • A process is a logical concept used to manage resources, corresponding to the resources in task_struct
  • Each process has at least one thread that does the actual execution, corresponding to the scheduling information in task_struct
  • In task_struct, pid distinguishes threads, tgid distinguishes processes, and pgid distinguishes process groups

Linux thread model

One to one

Both LinuxThreads and NPTL use a one-to-one thread model, where one user thread corresponds to one kernel thread. The kernel is responsible for scheduling each thread and can dispatch threads to other processors. Since Linux 2.6, the default thread library is NPTL, which uses the one-to-one thread model.

Advantages:

  • Simple implementation.

Disadvantages:

  • Most operations on a user thread are mapped to its kernel thread, causing frequent switches between user mode and kernel mode.
  • The kernel maintains a scheduling entity for each thread, which can hurt system performance when the system has a large number of threads.

Many to one

As the name implies, in the many-to-one thread model, multiple user threads are mapped to the same kernel thread, and all the details of thread creation, scheduling, and synchronization are handled by the process’s user-space thread library.

Advantages:

  • Many operations on user threads are transparent to the kernel and do not require frequent switches between user mode and kernel mode, which makes thread creation, scheduling, synchronization, and so on very fast.

Disadvantages:

  • Because multiple user threads correspond to the same kernel thread, if one of the user threads blocks, the other user threads cannot execute.
  • The kernel does not know which threads exist in user mode, so user threads cannot get the full scheduling, priorities, and so on that kernel threads get.
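The idea of a user-space thread library scheduling several user threads on one kernel thread can be sketched with generators. This is only a toy model under stated assumptions, not how any real thread library is implemented: each generator plays the role of a user thread, yield marks the point where it gives up control, and the names worker and run_round_robin are made up for illustration.

```python
from collections import deque


def worker(name, steps, log):
    """A toy user thread: records one step, then hands control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # cooperative switch point, handled entirely in user space


def run_round_robin(tasks):
    """A toy user-space scheduler running all tasks on the calling kernel thread."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
        except StopIteration:
            continue            # the task finished, drop it
        ready.append(task)      # otherwise put it back at the end of the queue


if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

Switching between the workers never enters the kernel, which is the advantage of this model. The disadvantage is just as visible: if one worker made a blocking call instead of yielding, the scheduler loop and every other worker would stall with it.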

Many to many

The many-to-one thread model is very lightweight, but its problem is that multiple user threads correspond to one fixed kernel thread. The many-to-many thread model solves this problem: M user threads correspond to N kernel threads, usually with M > N. NGPT, led by IBM, used a many-to-many thread model, but it is now obsolete.

Advantages:

  • As lightweight as the many-to-one model
  • Because multiple kernel threads are involved, if one user thread blocks, the other user threads can still execute
  • Since there are multiple kernel threads, complete scheduling and priority can be achieved

Disadvantages:

  • Complex to implement
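As a rough analogy only, the M:N mapping can be imitated by running M tasks on a pool of N kernel threads. The tasks in this sketch are plain functions rather than real user threads (they cannot be suspended and resumed by a user-space scheduler), and the name run_tasks and its parameters m and n are made up for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tasks(m=6, n=3):
    """Run m short blocking tasks on at most n kernel threads."""

    def task(i):
        time.sleep(0.01)  # blocking here parks only the kernel thread running this task
        return i, threading.get_native_id()

    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() returns the results in the order the tasks were submitted
        return list(pool.map(task, range(m)))


if __name__ == "__main__":
    results = run_tasks()
    kernel_threads = {tid for _, tid in results}
    print(len(results), "tasks ran on", len(kernel_threads), "kernel threads")
```

While one task sleeps, the remaining kernel threads keep picking up other tasks, which corresponds to the second advantage listed above.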

Thread switching

Linux uses a one-to-one thread model, so there is very little difference between user thread switching and kernel thread switching. If you also ignore the cost of a user thread voluntarily giving up execution (yield), you only need to consider the cost of kernel thread switching.

Note that this is just a simplification to help you understand. In fact, the user thread library does a lot of work in the scheduling, synchronization, and so on of user threads, and this overhead cannot be ignored.

Take the JVM's handling of Thread#yield() as an example: if the underlying OS does not support yield semantics, the JVM lets the user thread spin until the end of its time slice, so that the thread is switched out passively, which achieves a similar effect.

What causes thread switching

  • The time slice expires (time-slice rotation)
  • The thread blocks
  • The thread voluntarily gives up its time slice
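On Unix-like systems the kernel counts these switches per process, split into voluntary ones (the thread blocked or gave up the CPU) and involuntary ones (for example, the time slice expired). The following is a minimal sketch using Python's resource module; the helper names count_switches, block_briefly and spin are made up for illustration, and the actual numbers depend on the machine and its load.

```python
import resource
import time


def count_switches(fn):
    """Return (voluntary, involuntary) context switches of this process while fn runs."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    fn()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)


def block_briefly():
    """Block repeatedly; each sleep hands the CPU back to the kernel."""
    for _ in range(20):
        time.sleep(0.001)


def spin():
    """Stay busy for a while; any switch here is forced by the scheduler."""
    deadline = time.perf_counter() + 0.05
    while time.perf_counter() < deadline:
        pass


if __name__ == "__main__":
    print("blocking (voluntary, involuntary):", count_switches(block_briefly))
    print("spinning (voluntary, involuntary):", count_switches(spin))
```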

Overhead of thread switching

Direct overhead

The direct overhead is caused by the thread switch itself and is inevitable.

Switching between user mode and kernel mode

Thread switching can only be performed in kernel mode. If the thread is currently running in user mode, switching it necessarily involves a transition between user mode and kernel mode. (What exactly is the cost of switching between user mode and kernel mode?)

Context switch

Thread (or, loosely, process) information is stored in a task_struct. When a thread is switched, the task_struct of the old thread is switched out in the kernel and that of the new thread is switched in, which is a context switch. In addition, the registers, program counter, and thread stack (including the operand stack and data stack) have to be switched.

Thread scheduling algorithm

Thread scheduling algorithms need to manage thread states, wait conditions, etc., and maintain priority queues if scheduled by priority. This cost is significant if thread switching is frequent.
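A rough feel for the direct overhead can be obtained by forcing two threads to hand control back and forth. This is only a sketch: the measured time also includes Python interpreter, GIL and queue overhead, so treat it as a loose upper bound rather than the cost of a kernel thread switch. The name ping_pong and its parameter rounds are made up for illustration.

```python
import queue
import threading
import time


def ping_pong(rounds=1000):
    """Return the average time in seconds of one handoff between two threads."""
    ping, pong = queue.Queue(), queue.Queue()

    def echo():
        # Wait for each message and send it straight back.
        for _ in range(rounds):
            pong.put(ping.get())

    t = threading.Thread(target=echo)
    t.start()
    start = time.perf_counter()
    for i in range(rounds):
        ping.put(i)  # wake the echo thread
        pong.get()   # block until it answers, forcing a switch each way
    elapsed = time.perf_counter() - start
    t.join()
    return elapsed / (2 * rounds)  # two handoffs per round trip


if __name__ == "__main__":
    print(f"about {ping_pong() * 1e6:.1f} microseconds per handoff")
```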

Indirect overhead

Indirect overhead is a side effect of direct overhead, depending on the system implementation and user code implementation.

Cache misses

After a switch, the new thread's code starts to execute. If the old and new threads access addresses that are far apart, cache misses occur; how severe they are depends on the system implementation and the user code. A larger cache reduces the impact of cache misses, and if the threads access data at nearby addresses, the miss rate is low to begin with.

The same applies to address translation structures such as the TLB and the page table.


This article is published under the Creative Commons Attribution-ShareAlike 4.0 International License. Commercial use is permitted.