preface

The application layer usually focuses on using an API as a black box, but understanding the mechanics of the operating system will help us use it better and avoid mistakes.

The Unix system

In 1969, Unix was born out of A failed operating system called Multics from Bell LABS, and it was released with the source code so that many organizations improved on it. It is written in C with only a few hundred system calls and is designed with everything in file.

Linux system

Linus developed by Finnish students in 1991, based on Intel80386 microprocessor, after the development of the Internet released the source code. It borrows a lot of design ideas from Unix, but its implementation is completely different, a different operating system. The open source license is very free and can be modified freely, but the modified source code needs to continue to be published.

Linux features

  • It is a single kernel, as opposed to a microkernel.
  • Modular design supports dynamic loading of kernel modules.
  • Symmetric multiprocessing (SMP) is supported, where multiple processors need to consider sharing resources.
  • In preemption multi-task operating system, the kernel scheduling of process belongs to preemption.
  • Kernel threads are supported, but it does not distinguish between threads and processes; all processes are the same to it.

What is an operating system

The operating system is responsible for the most basic functions and system management, including the kernel, device drivers, boot, shell, interface, file management, and other system tools. Applications typically use library functions, which tell the kernel to do various tasks through system calls.

CPU activity status

  • Runs in user space and executes user processes.
  • Run in kernel space, in process context, to execute a process.
  • Running in kernel space, in interrupt context, independent of the task process, it is handling some interrupt.

Process & Thread

A process is an executable program that contains various resources, such as files, signals, kernel data, processor state, memory space, execution threads, global data segments, and so on. Processes provide virtual processors and virtual memory, making it look like the process has its own resources.

The kernel keeps all processes in a task queue, which is a bidirectional circular linked list, and each node in the list is of type task_struct process descriptor structure in < Linux /sched.h>.

All processes are descendants of the init process with PID 1. The init process is started at the end of kernel startup, which reads the system initialization script and executes other related programs. Each process has a parent process, and each process can have zero or more child processes. Fork () is used to create a child process by copying the current process. The differences between the two processes are PID, PPID, certain resources, and statistics. Fork () uses the copy-on-write mechanism, in which the parent and child processes share the same copy and the data is copied only when it needs to be written. The actual overhead of fork() is copying the parent’s page table and creating a unique process descriptor for the child. Exec () is responsible for reading the executable and loading it into space to start running. The process is terminated when it calls the exit() system call.

A thread is the active object of a process, and each thread has its own program counter, process stack, and set of registers. Threads can share memory address Spaces, files, and other resources within the same program. Linux has no concept of threads from a kernel perspective, it treats threads as processes. Kernel threads have no separate address space and only run in kernel space, never switching to user space.

Process scheduling

The purpose of the process scheduler is to maximize CPU time by executing as many tasks as possible, but we know that there are almost always more tasks than processors in the system, so at some point a task will not be executed. Multitasking systems can execute multiple processes concurrently, one-to-many on one CPU, many-to-many on multiple cpus, and all tasks appear to be running at the same time.

Multi-task system is divided into preemptive and non-preemptive, Linux is preemptive scheduling, scheduler to determine the process suspension and execution, each process can execute time slice, modern systems generally have certain policies to dynamically allocate time slice. In non-preemptive mode, the process is stopped by itself, otherwise it will continue running. The scheduler cannot decide whether to stop the process. If a process keeps running, it will crash the system.

Scheduling policy

Processes can be EITHER I/O intensive or processor intensive. The former spends most of its time making various I/O requests, and usually these processes run for only a short period of time and then block on the I/O. On the other hand, most of the time is spent executing code, running non-stop with few I/O requests. Of course, some processes are both I/O intensive and processor intensive. For the former, we want fewer time slices, and for the latter, we want more time slices. The scheduling strategy is to find a balance between the two contradictions, so that the process can respond quickly and maximize the system utilization.

Priority-based scheduling is the most basic scheduling algorithm. It classifies processes with higher priorities to run first, and those with lower priorities to run later. Processes with the same priority run in turn. High-priority processes on some systems will run more frequently and in longer slices. The scheduler selects the process with the highest priority and unexhausted time slice to execute. Linux has two priority ranges. The first is the NICE value, which ranges from -20 to +19 and defaults to 0. The higher the value, the lower the priority. The second is real-time priority. The default value ranges from 0 to 99. A higher value indicates a higher priority. Any real-time process has a higher priority than a normal process, which means that nice priority and real-time priority are two different dimensions.

Time slice refers to the time that a process can continue to run before being preempted. The default time slice size is not easy to determine. Too long time slice will worsen the interactive response, and too short time slice will increase the loss caused by process switchover. I/O intensive wants short slices, processor intensive wants long slices.

The CFS scheduler

Linux uses the Full Fair Scheduling (CFS) algorithm, a scheduler for common processes called SCHED_NORMAL in Linux and SCHED_OTHRER in POSIX, implemented in kernel/sched_fair.c. CFS allows each process to run for a certain amount of time, cycle around, and select the least running process as the pending process. CFS calculates how long a process should run based on the total number of runnable processes, rather than counting time slices by priority. CFS has a minimum time slice (minimum granularity), which defaults to 1ms. This means that even with an infinite number of processes, each process can have at least a 1ms time slice. CFS ensures a fair processor usage ratio for each process.

  • Instead of simply using the concept of time slices, CFS must maintain an accounting of the time each process runs to ensure that each process is fairly allocated.
  • When CFS selects the next process to run it picks the one with the minimum running time.
  • A sleeping (blocked) process that is in an unexecutable state is removed from the executable red-black tree and placed on a waiting queue. Wait queues are simple linked list structures,
  • Wake up, as opposed to sleep, the process is set to the executable state and moved from the wait queue to the executable red-black tree.
  • Context switches, that is, switching from one executable thread to another, are handled by the kernel/ Sched. c context_switch function. Map virtual memory from the previous process to the new process, saving recovery stack and register information and other related information.
  • The kernel provides need_resCHED flags to indicate whether a rescheduling is needed. The kernel checks this flag whenever it interrupts a handler or returns after a system call. If set, the kernel selects a more appropriate process to run.

Real-time scheduling

Linux provides two real-time scheduling policies: SCHED_FIFO and SCHED_RR. The former is a simple first-in, first-out scheduling algorithm, which does not use time slices. Processes at SCHED_FIFO are scheduled before any SCHED_NORMAL processes. Once a SCHED_FIFO level process is in an executable state, it will execute until it blocks itself or displays a CPU release. Only a higher-priority SCHED_FIFO or SCHED_RR process can preempt it. Two SCHED_FIFO processes of the same priority take turns executing, while other regular processes wait until they become inoperable.

SCHED_RR is roughly the same as SCHED_FIFO, but it runs out of pre-allocated time and doesn’t continue, that is, SCHED_RR is SCHED_FIFO with time slices. Time slices are only relative to processes of the same priority, and a low-priority process cannot preempt the SCHED_RR task even if its time slice runs out.

Giving up processor time

Linux uses the sched_yield() system call to give up processor time for the current process to other pending processes and, in the case of real-time processes, to move the process from the active queue to the back of its priority queue. The yield semantics of early Linux were different, simply putting processes at the end of the priority queue and not giving up for very long.

Focus on artificial Intelligence, reading and feeling, talk about mathematics, computer Science, distributed, machine learning, deep learning, natural language processing, Algorithms and Data Structures, Java depth, Tomcat kernel, etc.