Process management

Where does a process come from? Let's start with the most basic question: how does a program actually run?

A few words about the difference between a program and a process:

  • Program: static; an executable file stored on disk, i.e., a set of instructions.

  • Process: dynamic; an execution of a program in progress. Running the same program several times produces several distinct processes.

How a program runs

As the figure shows, the compiler first translates the program into machine instructions. Before the program runs, it is loaded into memory, where a process entity is created for it. A process entity (process image) consists of the PCB, the program segment, and the data segment. The CPU then fetches instructions from memory and executes the program.

The composition of the process entity

If you keep three QQ accounts logged in at the same time, there are three QQ processes: their PCBs and data segments differ, but their program segments are identical (all three run the same QQ program).

The PCB is for the operating system's use; the program segment and data segment are for the process's own use.

Having introduced the process entity, we can define a process as the execution of a process entity, and as an independent unit of resource allocation and scheduling in the system.

Organization of processes

In Linux, the PCB is the structure struct task_struct. An excerpt from an early kernel version:

```c
struct task_struct {
	/*
	 * offsets of these are hardcoded elsewhere - touch with care
	 */
	volatile long state;        /* -1 unrunnable, 0 runnable, >0 stopped */
	unsigned long flags;        /* per process flags, defined below */
	int sigpending;
	mm_segment_t addr_limit;    /* thread address space:
	                               0-0xBFFFFFFF for user-thread
	                               0-0xFFFFFFFF for kernel-thread */
	struct exec_domain *exec_domain;
	volatile long need_resched;
	unsigned long ptrace;
	int lock_depth;             /* Lock depth */
	/*
	 * offset 32 begins here on 32-bit platforms. We keep
	 * all fields in a single cacheline that are needed for
	 * the goodness() loop in schedule().
	 */
	long counter;
	long nice;
	unsigned long policy;
	struct mm_struct *mm;
	int processor;
	/* ... many more fields ... */
};
```

Linux links all processes together with a doubly linked list.
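As an illustration only (a simplified sketch, not the kernel's real list_head API; all names below are made up for the example), chaining PCBs into a doubly linked list looks roughly like this:

```c
#include <stddef.h>

/* A highly simplified PCB; the real task_struct has many more fields. */
struct pcb {
    long        state;          /* e.g. -1 unrunnable, 0 runnable, >0 stopped */
    int         pid;
    struct pcb *prev;           /* doubly linked: previous process */
    struct pcb *next;           /* doubly linked: next process */
};

/* Insert a PCB at the head of a list (e.g. the list of all processes). */
void pcb_list_insert(struct pcb **head, struct pcb *p)
{
    p->prev = NULL;
    p->next = *head;
    if (*head)
        (*head)->prev = p;
    *head = p;
}

/* Unlink a PCB in O(1): this is why a *doubly* linked list is used. */
void pcb_list_remove(struct pcb **head, struct pcb *p)
{
    if (p->prev)
        p->prev->next = p->next;
    else
        *head = p->next;
    if (p->next)
        p->next->prev = p->prev;
}
```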

Process state and transitions

  • Created state: When a process is being created, it is in the “Created state”. At this stage, the operating system allocates resources to the process and initializes the PCB.

  • Ready state: After the process is created, it enters the “ready state”. The process in the ready state is ready to run, but cannot run temporarily because there is no idle CPU.

  • Running state: a process that currently holds a CPU is in the "running state"; the CPU executes that process's instruction sequence.

  • Blocked state: while running, a process may request to wait for an event (such as a resource allocation or a response from another process). It cannot proceed until the event occurs, so the operating system takes it off the CPU and puts it into the "blocked state". The now-idle CPU then picks another ready process to run.

  • Terminated state: a process can make an exit system call asking the operating system to terminate it. The process then enters the "terminated state": the operating system takes it off the CPU, reclaims its memory and other resources, and finally reclaims its PCB. Once termination completes, the process is gone entirely.

Many people have trouble remembering the transitions among ready, running, and blocked. Rote memorization is not the way; just remember the transitions below and you will be able to reconstruct the diagram in your head.

Note:

  1. Blocked -> ready is not under the process's own control; it is passive. (A process blocks while waiting for a resource or a signal; once the resource is acquired or the signal is delivered, it becomes ready and can be scheduled onto the CPU.)
  2. Running -> blocked is an active act of the process itself. (The program stops midway to wait for a resource, such as a critical resource or critical section, or to perform I/O and memory reads/writes; these are things the program itself initiates.)
  3. A process can never go directly from blocked to running, nor from ready to blocked. (Entering the blocked state is something the process requests, so the request can only be made while it is running.)

The essential differences among the ready, blocked, and running states (a small sketch follows the list):

  • Blocked: the process is stopped and lacks a necessary resource; even if the scheduler offered it the CPU, it could not run
  • Ready: the process is stopped but lacks nothing except the CPU; once scheduled, it can run
  • Running: the process lacks nothing and is currently executing
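To make the transition rules concrete, here is a small C sketch that encodes the five states and the legal transitions described above. The enum and function names are illustrative, not from any real operating system:

```c
#include <stdbool.h>

enum proc_state { CREATED, READY, RUNNING, BLOCKED, TERMINATED };

/* Encode the legal transitions described above. Note that
 * blocked -> running and ready -> blocked are deliberately absent. */
bool transition_allowed(enum proc_state from, enum proc_state to)
{
    switch (from) {
    case CREATED: return to == READY;
    case READY:   return to == RUNNING;        /* scheduled onto a CPU */
    case RUNNING: return to == READY           /* preempted (passive)  */
                      || to == BLOCKED         /* waits for an event (active) */
                      || to == TERMINATED;     /* exit system call     */
    case BLOCKED: return to == READY;          /* awaited event occurs (passive) */
    default:      return false;
    }
}
```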

Process control

Why do you need a primitive?

Process control is about managing all processes in the system effectively: creating new processes, destroying existing ones, and carrying out process state transitions.

How is process control implemented? Answer: with primitives.

A primitive executes atomically, from start to finish without interruption. But why must process control (a state transition) be completed in one go?

For example, suppose the variable state in the PCB represents the process's current state: 1 means ready, 2 means blocked, and so on.

Suppose the event that process 2 is waiting for occurs. The kernel routine responsible for process control must then do at least two things:

  1. Set the state in PCB 2 to 1;
  2. Move PCB 2 from the blocked queue to the ready queue.

If an interrupt arrives after step 1 but before step 2, PCB 2 says state=1 (ready) while it still sits in the blocked queue: an inconsistent picture, precisely because the two steps were not one atomic operation.

Redis achieves this kind of atomicity with Lua scripts. So how does the operating system make a primitive atomic?

How are primitives implemented?

The interrupt mechanism

Before we get to that, it's important to understand the interrupt mechanism.

Disabling and enabling interrupts work like switches in everyday life: interrupts are turned off to protect instruction sequences that must not be stopped midway. A computer's CPU is time-division multiplexed, meaning that at any instant it executes only one instruction.

In a multiprogramming environment (what we usually call several programs running at the same time), the CPU actually executes these programs alternately, one at a time. Macroscopically, it feels as if multiple programs run simultaneously; microscopically, the CPU executes instructions of different programs in different, very short time slices.

The CPU switches among these instruction streams via interrupts. An interrupt diverts the CPU from the program it was executing to handle an urgent event; afterwards the CPU must restore the working environment it had before the event (commonly called "restoring the scene"). While restoring that environment, the CPU must not be disturbed by other programs, so it turns interrupts off and answers no other requests. Once the restoration completes, the CPU turns interrupts back on, the instructions of the waiting programs are executed again, and the computer returns to normal.

The classification of interrupts is not covered here; interested readers can look it up themselves.

Implementing primitives

Atomicity is achieved with two privileged instructions: the "off-interrupt" (disable interrupts) instruction and the "on-interrupt" (enable interrupts) instruction.

Normally, the CPU routinely checks for pending interrupt signals after each instruction; if there is one, it suspends the current program and runs the corresponding interrupt handler instead. After executing the off-interrupt instruction, the CPU stops performing this routine check and does not resume it until the on-interrupt instruction is executed.
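Putting the pieces together, here is a toy user-space sketch of a "wakeup" primitive. The disable_interrupts()/enable_interrupts() functions are stand-ins for the privileged off/on-interrupt instructions (on x86 those would be cli/sti), and the PCB and queue helpers are invented for the sketch:

```c
#include <stdio.h>

/* Stand-ins for the privileged off/on-interrupt instructions; in a real
 * kernel each would be a single machine instruction. */
static void disable_interrupts(void) { puts("[interrupts off]"); }
static void enable_interrupts(void)  { puts("[interrupts on]"); }

struct pcb { int pid; int state; struct pcb *next; };  /* 1 ready, 2 blocked */

static struct pcb *blocked_queue, *ready_queue;

/* Illustrative singly linked queue helpers. */
static void queue_remove(struct pcb **q, struct pcb *p) {
    for (struct pcb **cur = q; *cur; cur = &(*cur)->next)
        if (*cur == p) { *cur = p->next; p->next = NULL; return; }
}
static void queue_push(struct pcb **q, struct pcb *p) { p->next = *q; *q = p; }

/* The wakeup primitive: both steps happen with interrupts off, so no
 * interrupt can ever observe the inconsistent in-between state. */
void wakeup(struct pcb *p)
{
    disable_interrupts();
    p->state = 1;                    /* step 1: mark the PCB ready (state = 1) */
    queue_remove(&blocked_queue, p); /* step 2: move it to the ready queue */
    queue_push(&ready_queue, p);
    enable_interrupts();
}

int main(void)
{
    struct pcb p2 = { .pid = 2, .state = 2 };  /* process 2, blocked */
    queue_push(&blocked_queue, &p2);
    wakeup(&p2);
    printf("pid 2 state = %d (1 = ready)\n", p2.state);
    return 0;
}
```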

Primitive operations exist for process creation, termination, blocking, wakeup, and switching.

Whatever the process control primitive, it does three main things:

  1. Update information in the PCB
  • Every process control primitive modifies the process state flag;
  • Depriving the currently running process of the CPU requires saving its running environment;
  • Before a process starts running, its running environment must be restored.
  2. Insert the PCB into the appropriate queue
  3. Allocate or reclaim resources

Interprocess communication

Interprocess communication comes in three flavors: shared storage, message passing, and pipe communication.

As the name implies, interprocess communication is the exchange of information between processes. A process is the unit of system resource allocation (including memory address space), so each process has its own independent memory address space.

Shared memory

For safety, one process cannot directly access another process's address space, yet processes must still be able to exchange information. So how is sharing done? If you read my earlier posts on memory management and file systems, you should already have some ideas.

Note:

  1. Two processes' accesses to the shared space must be mutually exclusive (mutual exclusion is achieved with tools the operating system provides).
  2. The operating system is only responsible for providing the shared space and the synchronization/mutual-exclusion tools (such as the P and V operations).

Shared storage can be used in two ways (a code sketch follows the list):

  1. Sharing by data structure: for example, the shared space may only hold one array of length 10. This mode is slow and restrictive; it is a low-level communication method.
  2. Sharing by storage region: a shared storage region is carved out of memory, and the form and placement of the data in it are controlled by the processes, not the operating system. By contrast, this mode is faster; it is an advanced communication method.
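As one concrete way to get a shared region on Linux/Unix, here is a minimal sketch using the POSIX shared-memory API (shm_open/ftruncate/mmap). The region name "/demo_shm" is an arbitrary choice, and the mutual exclusion a real program needs (e.g., a POSIX semaphore) is deliberately omitted for brevity:

```c
/* One process creates a shared region and writes to it; another process
 * can shm_open the same name and read. Link with -lrt on older glibc. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/demo_shm";             /* illustrative name */
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }

    ftruncate(fd, 4096);                        /* size the region */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    strcpy(p, "hello from process A");          /* data layout is up to us */

    munmap(p, 4096);
    close(fd);
    shm_unlink(name);                           /* remove when done */
    return 0;
}
```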

Pipe communication

A "pipe" is a shared file, also called a pipe file, used to connect a reading process and a writing process. In essence it is a fixed-size buffer created in memory. (A minimal example follows the list of properties below.)

  1. A pipe is half-duplex: in any given period, data can flow in only one direction. For simultaneous two-way communication, two pipes must be set up;
  2. Processes access the pipe under mutual exclusion;
  3. Data is written into the pipe as a character stream. When the pipe is full, the writing process's write() system call blocks until the reading process takes data out; when the pipe is empty, the reading process's read() system call blocks;
  4. Until the pipe is full, reading is not allowed; until it is empty, writing is not allowed;
  5. Once data is read, it is discarded from the pipe, which means there can be at most one reading process; otherwise data may be misread.
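Here is a minimal sketch of an anonymous pipe between a parent (writer) and child (reader), using the POSIX pipe() and fork() calls; it shows the one-way character stream and the blocking behavior described above:

```c
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    if (fork() == 0) {                /* child: the reading process */
        close(fds[1]);                /* one-way: close the unused write end */
        char buf[64];
        ssize_t n = read(fds[0], buf, sizeof(buf) - 1);  /* blocks if empty */
        if (n > 0) { buf[n] = '\0'; printf("child read: %s\n", buf); }
        close(fds[0]);
        return 0;
    }

    close(fds[0]);                    /* parent: the writing process */
    const char *msg = "data through the pipe";
    write(fds[1], msg, strlen(msg));  /* blocks if the pipe buffer is full */
    close(fds[1]);
    wait(NULL);
    return 0;
}
```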

Message passing

Processes exchange data in units of formatted messages, using two primitives provided by the operating system: "send message" and "receive message".

A message consists of a header and a body. The header holds formatted information such as the ID of the sending process, the ID of the receiving process, the message type, and the message length (the "messages" sent in computer networks are in fact formatted messages of this kind).

In direct communication, the message is attached directly to the receiving process's message buffer queue. In indirect communication, the message is first sent to an intermediate entity (a mailbox), which is why this is also called "mailbox communication". E.g., the e-mail system on a network.
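As a sketch of the send/receive primitives, here is a minimal example using POSIX message queues (mq_open/mq_send/mq_receive). The queue name "/demo_mq" and the attribute values are arbitrary choices for the demo; link with -lrt on older Linux systems:

```c
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Queue holds at most 8 messages of up to 128 bytes each. */
    struct mq_attr attr = { .mq_maxmsg = 8, .mq_msgsize = 128 };
    mqd_t q = mq_open("/demo_mq", O_CREAT | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1) { perror("mq_open"); return 1; }

    const char *msg = "formatted message body";
    mq_send(q, msg, strlen(msg) + 1, 0);        /* the "send message" primitive */

    char buf[128];                              /* must hold mq_msgsize bytes */
    mq_receive(q, buf, sizeof(buf), NULL);      /* the "receive message" primitive */
    printf("received: %s\n", buf);

    mq_close(q);
    mq_unlink("/demo_mq");
    return 0;
}
```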

Summary

Threads

If you don’t know about thread properties, look at the figure above.

Three threading models

For each threading model, we focus on four questions:

  1. Who is responsible for managing the threads?
  2. Does a thread switch require switching into kernel mode?
  3. Is the operating system aware of user-level threads?
  4. What are the advantages and disadvantages of the approach?

Many-to-one model

Historical background: Early operating systems (e.g., early Unix) only supported processes, not threads. At the time, “threading” was implemented by thread libraries.

Many-to-one model: Multiple user-level threads map to a single kernel-level thread. Only one kernel thread is assigned to a process.

Thread simulation in code works as follows:
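(The original snippet is not preserved here; below is a minimal C reconstruction of the idea, with illustrative function names.)

```c
#include <stdio.h>

/* Three pieces of code logic: logically, three "threads". */
void thread_a(void) { printf("logic of thread A\n"); }
void thread_b(void) { printf("logic of thread B\n"); }
void thread_c(void) { printf("logic of thread C\n"); }

int main(void)
{
    /* The while loop is the "weakest thread library": it decides which
     * logic runs next (round-robin), entirely in user space. */
    int n = 0;
    while (n < 9) {
        switch (n % 3) {
        case 0: thread_a(); break;
        case 1: thread_b(); break;
        case 2: thread_c(); break;
        }
        n++;
    }
    return 0;
}
```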

From the code's point of view, a thread is just a piece of code logic. The three functions above can logically be viewed as three "threads", and the while loop is the weakest possible "thread library", managing the threads (e.g., scheduling them).

Many programming languages provide powerful thread libraries that can create, destroy, and schedule application threads.

Features of many-to-one model:

  1. User-level threads are implemented by the application through a thread library, and all thread management (including thread switching) is done by the application;
  2. Switching between user-level threads can be done in user mode, without operating system intervention;
  3. From the user's perspective there are multiple threads, but the operating system kernel is unaware of them; user-level threads are the threads visible from the user's perspective;
  4. Advantages and disadvantages
  • Advantages: user-level thread switches complete in user space without switching into kernel mode, so thread management overhead is low and efficiency is high.
  • Disadvantages: when one user-level thread blocks, the entire process blocks, so concurrency is poor; multiple threads cannot run in parallel on a multicore processor.

Note: The operating system only "sees" kernel-level threads, so only kernel-level threads are units of processor allocation.

One-to-one model

Most modern operating systems implement kernel-level threads, e.g., Windows and Linux.

One-to-one model: each user-level thread maps to one kernel-level thread, so each user process has as many kernel-level threads as it has user-level threads.

One-to-one model features:

  1. Kernel-level threads are managed by the operating system kernel;
  2. Thread scheduling, switching, and related work are the kernel's responsibility, so kernel-level thread switches must happen in kernel mode;
  3. The operating system creates a TCB (Thread Control Block) for each kernel-level thread and manages threads through it; "kernel-level threads" are the threads visible from the operating system kernel's perspective;
  4. Advantages and disadvantages
  • Advantages: when one thread blocks, other threads can continue to execute, so concurrency is strong; multiple threads can run in parallel on a multicore processor.
  • Disadvantages: each user process occupies multiple kernel-level threads, and thread switching is done by the kernel and requires entering kernel mode, so thread management is costly and has high overhead.

Note that most one-to-one implementations limit the number of threads the system supports: every user thread requires creating a corresponding kernel thread, and the overhead of creating kernel threads can hurt application performance. Linux and the Windows family of operating systems implement the one-to-one model.

Java uses the one-to-one threading model: each Java thread corresponds to a kernel thread, and scheduling is left entirely to the operating system. Thread switches are therefore expensive, and tuning the number of threads is an important part of Java engineering.

Many-to-many model

Many-to-many model: n user-level threads map to m kernel-level threads (n >= m); each user process corresponds to m kernel-level threads.

Kernel-level threads are the unit of processor allocation. For example, in a multi-core CPU environment, the process on the left can be allocated a maximum of two cores.

Many-to-many model features:

  1. It overcomes the many-to-one model's weakness of low concurrency (one block blocks all) and the one-to-one model's weakness of one user process occupying too many kernel-level threads at too high an overhead;
  2. A kernel-level thread can run the code of any user-level thread mapped to it; the process blocks only when the code running on every one of its kernel-level threads blocks;
  3. Whereas the many-to-one model lets developers create as many user threads as they wish without gaining true concurrency (its single kernel thread can be scheduled only one at a time), the many-to-many model does not have this limitation;
  4. When one user thread makes a blocking system call, the kernel can schedule another user thread to run;
  5. Unlike the one-to-one model, the user threads in a process are not fixedly bound to kernel threads; they are bound dynamically. When a kernel thread yields the CPU because its bound user thread performed a blocking operation, the process's remaining user threads can be rebound to other kernel threads and continue running.

This is the foundation of Go's popularity in recent years: Go's goroutine scheduler implements this scheme. In Go, one process can start thousands of goroutines, and a few kilobytes of stack are enough for a goroutine to run, so a huge number of goroutines fit in limited memory, allowing much greater concurrency.

Summary

Understanding these threading models gives you a deeper grasp of each language's concurrency machinery, along with the strengths and weaknesses of those high-concurrency mechanisms.

Process scheduling

Scheduling sits fairly far from day-to-day development, so I won't go too deep; here is scheduling in a nutshell.

When there is a pile of tasks to handle but limited resources prevent handling them all at once, some rule is needed to decide the order in which to process them. That is the problem "scheduling" studies.

Let's look at three topics first: the timing of process scheduling, the procedure of process switching, and the modes of process scheduling.

The timing of process scheduling

Process scheduling (low-level scheduling) means selecting a process from the ready queue, according to some algorithm, and assigning it the processor.

Scheduling and switching cannot happen while an atomic operation is in progress, since such code must run to completion without outside interference; however, processes can be scheduled and switched while inside ordinary critical sections.

Explain two nouns:

Critical resource: a resource that is allowed to be used by only one process at a time. Processes need exclusive access to critical resources.

Critical section: the piece of code that accesses critical resources.

Process switching and its procedure

Broadly, process scheduling includes two steps: selecting a process, then switching to it. The switch itself mainly involves the following (a user-space illustration follows the list):

  1. Saving the various data of the previously running process
  2. Restoring the various data of the new process (e.g., the program counter, program status word, data registers and other processor-context information, generally kept in the process control block)
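As a user-space illustration of "save the old context, restore the new one", the sketch below uses the POSIX ucontext API (obsolescent but still available on Linux). A real kernel switch additionally swaps privileged registers and the address space:

```c
#include <stdio.h>
#include <ucontext.h>

static ucontext_t ctx_main, ctx_task;
static char task_stack[64 * 1024];    /* stack for the second context */

static void task(void)
{
    printf("task runs, then switches back\n");
    swapcontext(&ctx_task, &ctx_main);   /* save task's state, resume main */
}

int main(void)
{
    getcontext(&ctx_task);               /* initialize the new context */
    ctx_task.uc_stack.ss_sp = task_stack;
    ctx_task.uc_stack.ss_size = sizeof(task_stack);
    ctx_task.uc_link = &ctx_main;
    makecontext(&ctx_task, task, 0);

    printf("main: switching to task\n");
    swapcontext(&ctx_main, &ctx_task);   /* save main's state, restore task's */
    printf("main: back again\n");
    return 0;
}
```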

Process scheduling mode

Non-preemptive scheduling, also called non-deprivation scheduling: a process gives up the processor only voluntarily. Even if a more urgent task arrives while it runs, the current process keeps the processor until it terminates or voluntarily blocks.

It is simple to implement with low system overhead, but cannot respond to urgent tasks in time; it suits early batch systems.

Preemptive scheduling, also called deprivation scheduling: while a process executes on the processor, if a more important or urgent process needs the processor, the executing process is suspended immediately and the processor is given to the more urgent one.

This allows prioritizing more urgent processes, and also lets each process execute in time slices (driven by the clock interrupt). It suits time-sharing and real-time operating systems.

Evaluation metrics for scheduling algorithms

Most of these metrics are self-explanatory from their names; I will focus on CPU utilization and system throughput, the two that concurrent architectures and Java garbage collectors care most about today.

CPU utilization

CPU utilization: the percentage of total CPU time during which the CPU is busy.

Utilization = busy time / total time

E.g.: a computer supports only a single program. A job runs on the CPU for 5 seconds, then prints on the printer for 5 seconds, then runs another 5 seconds to finish. Here, CPU utilization = (5+5)/(5+5+5) ≈ 66.7%.

System throughput

With computers, we want as much work done in as little time as possible. System throughput: the number of jobs completed per unit of time.

System throughput = total number of jobs completed/total time spent

E.g.: a computer system finishes 10 jobs in 100 seconds. Its throughput is 10/100 = 0.1 jobs/second.

There is no best scheduling algorithm, only the most suitable one; which algorithm to use in practice depends mainly on whether the scenario favors response time or throughput.

For example, among garbage collectors, CMS puts response time first: to get minimal pause times and reduce stop-the-world (STW) pauses, it sacrifices some throughput. Applications or websites with strict response-time requirements cannot tolerate long pauses in user programs, and CMS suits that scenario. The UseParallelGC + UseParallelOldGC combination puts throughput first but requires long STW pauses.

Scheduling algorithm

Tips: how to study each scheduling algorithm

  1. The idea behind the algorithm
  2. The algorithm's rules
  3. Is it used for job scheduling or process scheduling?
  4. Preemptive or non-preemptive?
  5. Advantages and disadvantages
  6. Whether it can cause starvation

Algorithms suited to early batch systems

These algorithms mainly care about fairness to users and about metrics such as average turnaround time and average waiting time, which evaluate overall system performance; they do not care about response time and do not distinguish the urgency of tasks, so interactivity is very poor. The three algorithms (FCFS, SJF, and HRRN in the usual textbook treatment) therefore generally suit early batch systems, though FCFS is often combined with other algorithms and still plays an important role today.

Algorithms suited to interactive systems

Compared with the early batch era, the cost of computers has dropped dramatically, so the interactive operating systems that followed (time-sharing systems, real-time systems, and so on) pay more attention to response time, fairness, and balance. The corresponding algorithms (round-robin, priority, and multilevel feedback queue in the usual textbook treatment) meet those requirements and suit interactive systems. (UNIX, for example, uses multilevel feedback queue scheduling.)

The specific algorithms are not expanded here; if there is enough reader interest later, I may cover them in detail.

That wraps up this post. The next one will focus on process synchronization and mutual exclusion.

Likes, shares, and views are welcome. If you enjoyed this, you can follow my WeChat official account, where these articles appear first ~~~

Previous recommendations:

  1. Operating system storage management, high energy warning!!
  2. Operating system file management, everything is a file!!
  3. What is DevOps

Reference: the Wang Dao operating system course.