Linux introduction

UNIX is an interactive system designed to handle multiple processes and multiple users online at the same time. We mention UNIX because Linux evolved from it: UNIX was designed by programmers and primarily serves programmers, and Linux inherits its design goals. The Linux operating system is ubiquitous, from smartphones and cars to supercomputers and appliances, from home desktops to corporate servers.

Most programmers like to keep their systems simple, elegant, and consistent. For example, at the lowest level, a file should be just a collection of bytes. Sequential access, random access, keyed access, and remote access mechanisms would only get in the way. Similarly, if the command

ls A*

means list all files whose names start with A, then the command

rm A*

should remove all files starting with A, not just a file literally named A*. This property is known as the Principle of Least Surprise.

The principle of least surprise is often applied to user interface and software design. The idea is that a feature should conform to the user's expectations and should not surprise or shock the user.

Experienced programmers often expect systems to be functional and flexible. A fundamental goal of the design of Linux is that each application should do one thing and do it well. So the compiler only compiles; it doesn't produce listings, because other applications do that better.

A lot of people don't like redundancy, so why type copy when cp tells you exactly what you want to do? The longer name is a complete waste of valuable hacking time. To extract all the lines containing the string ard from the file f, the Linux programmer types

grep ard f

Linux interface

Linux is a pyramidal system, as shown below

An application makes a system call by placing the parameters in registers (and sometimes on the stack) and issuing a trap instruction that switches the CPU from user mode to kernel mode. Since trap instructions cannot be written directly in C, C provides a library with one function per system call. Some of these functions are written in assembly but can be called from C. Each function first puts its arguments in place and then executes the trap instruction. So to make the read system call, a C program calls the read library procedure. By the way, the library interface specified by POSIX is not the system call interface. That is, POSIX tells a conforming system which library procedures to provide, what their parameters are, what they must do, and what results they must return.

In addition to the operating system and system call libraries, the Linux operating system also provides some standard programs, such as text editors, compilers, file manipulation tools, etc. These are the applications that deal directly with the user. So we can say that Linux has three different interfaces: system call interface, library function interface, and application program interface

The Graphical User Interface (GUI) in Linux is very similar to that in UNIX. This GUI creates a desktop environment, including windows, icons and folders, toolbars, and file drag-and-drop capabilities. A complete GUI also includes a window manager and various applications.

The GUI on Linux is supported by the X Window System. Its main component is the X server, which controls the keyboard, mouse, monitor, and so on. When using a graphical interface on Linux, users can run programs or open files with mouse clicks, copy files by dragging and dropping, and so on.

Linux components

In fact, a Linux operating system can be made up of the following components

  • Bootloader: The software that manages the computer's startup process. For most users it is just a splash screen that pops up, but internally it does a great deal of work.
  • Kernel: The core of the operating system. It manages the CPU, memory, and peripheral devices.
  • Init system: The subsystem that bootstraps user space and controls daemons. Once the initial boot is handed over from the bootloader, the init system manages the rest of the boot process.
  • Daemons: Background processes that provide services such as printing, sound, and scheduling. They can be started during boot or after logging in to the desktop.
  • Graphical server: The subsystem that displays graphics on the monitor. It is usually referred to as the X server, or simply X.
  • Desktop environment: The part the user actually interacts with. There are many desktop environments to choose from, each with built-in applications such as file managers, web browsers, and games.
  • Applications: Desktop environments do not include every application. Like Windows and macOS, Linux offers thousands of high-quality programs that can easily be found and installed.

Shell

Although Linux applications provide GUIs, most programmers still prefer to use the command-line interface, called the shell. The user usually launches a shell window from the GUI and works inside it.

Shell commands are faster and more powerful, are easy to extend, and do not cause repetitive strain injury (RSI).

Here is a brief look at bash, one of the simplest shells. When the shell starts, it initializes itself, prints a prompt on the screen (usually a percent or dollar sign), and waits for user input.

After the user types a command, the shell extracts the first word, where a word is a string of characters delimited by spaces or tabs. Assuming the word is the name of a program to run, the shell searches for that program and runs it if found. The shell then suspends itself until the program terminates, after which it tries to read the next command. The shell is an ordinary user program; its main job is to read user input and display the output of computations. Shell commands can contain arguments, which are passed as strings to the invoked program. For example

cp src dest

calls the cp application with two arguments, src and dest. The program interprets the first argument as the name of an existing file, then creates a copy of that file named dest.

Not all arguments are file names, as shown below

head -20 file

The first argument, -20, tells head to print the first 20 lines of the file instead of the default 10. Arguments that control the operation of a command or specify optional values are called flags; by convention they begin with a dash (-). The dash is necessary, for example

head 20 file

is a perfectly legal command that tells head to print the first 10 lines of a file named 20, followed by the first 10 lines of a file named file. Linux commands can accept multiple arguments.

To make it easier to specify multiple file names, the shell supports magic characters, also known as wildcards. For example, * matches any string of characters, so

ls *.c

tells ls to list all files whose names end in .c. If several files match the pattern, they are all listed, one after another.

The other wildcard is the question mark, which matches any single character. A set of characters in square brackets matches any one of them, so

ls [abc]*

lists all files beginning with a, b, or c.

Shell programs do not necessarily take their input from and send their output to the terminal. When the shell starts, it gets access to three files: standard input, standard output, and standard error.

Standard input comes from the keyboard by default, while standard output and standard error go to the monitor. Many Linux programs read from standard input and write to standard output by default. For example

sort	

invokes sort, which reads data from the terminal (until the user types Ctrl-D), sorts it alphabetically, and prints the result to the screen.

Standard input and standard output can often be redirected: standard input with < followed by a file name, standard output with the greater-than sign >. A single command may redirect both. For example, the command

sort <in >out

causes sort to take its input from the file in and write its output to the file out. Since standard error is not redirected, error messages are still printed to the screen. Programs that read from standard input, process it, and write the result to standard output are called filters.

Consider the following directive consisting of three separate commands

sort <in >temp; head -30 <temp; rm temp

The sort program is invoked first, reading from the file in and writing its standard output to temp. When it finishes, the shell runs head, telling it to print the first 30 lines of temp on standard output (the terminal by default). Finally, the temporary file temp is deleted.

The first program on the command line produces output that the next one needs as input; in the example above that output had to pass through the temp file. Linux provides a simpler way to connect the two, as follows

sort <in | head -30

The | above is called the pipe symbol. It feeds the output of sort directly into head as input, without having to create, use, and remove a temporary file. A collection of commands connected by pipe symbols is called a pipeline. For example, the following

grep cxuan *.c | sort | head -30 | tail -5 >foo

All lines containing the string cxuan in any file whose name ends in .c are written to standard output, where they are sorted. The first 30 lines of that output are selected by head, which passes them to tail, which in turn writes the last five of them to foo. This example connects multiple commands with pipes.

You can put a series of shell commands in a file and run that file as input. The shell processes them in order, just like typing commands on a keyboard. Files that contain shell commands are called shell scripts.

A recommended site for learning shell commands is www.shellscript.sh/

A shell script is actually a program: it can assign values to variables and contains control-flow statements such as if, for, and while. The shell was designed to look like C (there is no doubt that C is its father). Since the shell is also just a user program, the user can choose a different shell.

Linux applications

The Linux command-line environment, the shell, comes with a large number of standard applications. These applications fall into six main types

  • File and directory manipulation commands
  • Filters
  • Text processing programs
  • System administration tools
  • Program development tools, such as editors and compilers
  • Others

In addition to these standard applications, there are other applications such as Web browsers, multimedia players, image browsers, office software, and game programs.

We’ve already seen several Linux applications in the examples above, such as sort, cp, ls, and head, but let’s take a look at some others.

Let’s start with a couple of examples, for example

cp a b

makes a copy of a named b, while

mv a b

moves a to b, deleting the original file.

There is a difference between these two commands: cp copies the file, so after it completes there are two files, a and b; mv moves the file, so afterwards the file a no longer exists. The cat command concatenates the contents of multiple files; rm deletes files; chmod lets the owner change access permissions; mkdir and rmdir create and delete directories. ls lists directory contents and can display many attributes, such as size, owner, and creation date, and can sort its output in various orders.

Linux applications also include filters: grep extracts lines matching a particular pattern from standard input or from one or more input files; sort sorts its input and writes it to standard output; head extracts the first few lines of its input; tail extracts the last few lines. Other filters include cut and paste, which cut out and reassemble columns of text; od, which converts its input to octal or other representations; tr, which does character substitution such as case conversion; and pr, which formats output for printing.

Programs are compiled using the gcc compiler.

The make command automates compilation. It is a powerful tool for maintaining large programs whose source code consists of many files. Typically, some of these are header files, which source files incorporate using the include directive. make keeps track of which files depend on which headers and schedules the necessary recompilations automatically.

Standard POSIX applications are listed below

Program   Typical use
ls        List a directory
cp        Copy a file
head      Display the first few lines of a file
make      Compile files to build a binary
cd        Change directory
mkdir     Create a directory
chmod     Change file access permissions
ps        List processes
pr        Format text for printing
rm        Delete a file
rmdir     Delete a directory
tail      Extract the last few lines of a file
tr        Translate between character sets
grep      Search for lines matching a pattern
cat       Concatenate multiple files to standard output
od        Dump a file in octal
cut       Cut columns of text from a file
paste     Paste columns of text into a file

Linux kernel Architecture

Now that we have seen the overall structure of Linux, let’s take a look at the kernel structure of Linux as a whole

The kernel sits directly on the hardware and is responsible for I/O interactions, memory management, and controlling access to the CPU. Interrupts and the dispatcher are also shown in the figure above. Interrupts are the primary means of interacting with devices. When an interrupt occurs, the low-level code stops the running process, saves its state in the kernel's process structure, and starts the appropriate driver. Process scheduling also occurs when the kernel completes an operation and it is time to start a user process again. The dispatcher in the figure is exactly that, a dispatcher.

Note that this component is a dispatcher rather than a scheduler; the two are different.

Scheduler and dispatcher are both concepts related to process scheduling. The difference is that the scheduler chooses which of the runnable processes will run next, while the dispatcher actually allocates the CPU to the process the scheduler selected.

The kernel can be divided into three parts.

  • The I/O part is responsible for all parts of the kernel that interact with the device and perform network and storage I/O operations.

The diagram shows the I/O hierarchy. At the highest level is the virtual file system: whether a file lives in memory or on disk, access to it goes through the virtual file system. At the bottom, all drivers are either character device drivers or block device drivers; the main difference between the two is whether random access is allowed. A network device is not an independent driver type: it is treated as a character device, but handled differently from ordinary character devices.

Above the device drivers, the kernel code differs for each device type. Character devices can be used in two ways. Programs such as vi or emacs want every keystroke as it is typed. Others, such as the shell, work line by line: the user types a line, edits it as needed, and presses Enter to send the whole string to the program.

Network software is typically modular, supporting many different devices and protocols. Most Linux systems include the functional equivalent of a complete router in the kernel, although its performance is lower than that of a dedicated hardware router. Above the routing code is the protocol stack, including TCP/IP. On top of the stack is the socket interface, the door through which programs communicate with the outside world.

Above the disk drive is the I/O scheduler, which sorts and allocates disk reads and writes to minimize unwanted head movements.

  • To the right of the I/O part is the memory-management part: programs are loaded into memory and executed by the CPU; this part also implements virtual memory, swaps pages in and out, replaces little-used pages, and caches frequently used ones.
  • The process-management part is responsible for process creation and termination and for process scheduling; Linux treats processes and threads alike as runnable entities and schedules them with a unified scheduling policy.

At the top level of the kernel is the system call interface, through which all system calls pass. A system call triggers a trap that switches the system from user mode to kernel mode and then transfers control to the appropriate kernel component.

Linux processes and threads

Let’s take a closer look at the Linux kernel to understand the basic concepts of Linux processes and threads. System calls are interfaces to the operating system itself, which are important for creating processes and threads, allocating memory, sharing files, and I/O.

We will start by discussing the features common to the various versions.

The basic concept

Each process runs a separate program and has a separate thread of control when initialized. In other words, each process has its own program counter that keeps track of the instructions that need to be executed. Linux allows processes to create additional threads at run time.

Linux is a multiprogramming system, so multiple independent processes run at the same time. In addition, each user may have several active processes at once; on a large system, there may be hundreds or thousands of processes running simultaneously.

Some background processes, called daemons, keep running even after the user logs out.

Linux has a special kind of daemon called a Cron daemon that wakes up every minute to check if there is work to be done, and then goes back to sleep to wait for the next wake-up.

Cron can do anything you schedule, such as regular system maintenance or regular backups. Other operating systems have similar programs: the equivalent of the cron daemon on Mac OS X is called launchd, and on Windows it is the Task Scheduler.

On Linux systems, processes are created in a very simple way: the fork system call creates a copy of the calling process. The process that calls fork is called the parent process, and the newly created process is called the child process. Parent and child each have their own private memory image. If the parent changes some variables after the child is created, the child does not see the changes; after the fork, parent and child are independent of each other.

Although the parent and child remain independent, they can share the same file. If the parent has opened a file before the fork, the parent and child still share the open file after the fork. Changes to shared files are visible to both parent and child processes.

So how do you distinguish the parent from the child? The child is just a copy of the parent, so they are the same in almost everything, including memory images, variables, registers, and so on. The key is the return value of the fork call. In the parent, fork returns a positive value, the child's Process Identifier (PID); in the child, it returns zero. This can be expressed in the following code.

pid = fork();
if (pid < 0) {
    error();            /* fork failed */
} else if (pid > 0) {
    parent_handle();    /* parent process code */
} else {
    child_handle();     /* child process code */
}

After the fork, the parent knows the PID of the child, which is the child's unique identifier. If the child wants to know its own PID, it can call getpid. When a child terminates, the parent is given the PID of the terminated child, which matters because a process may fork many children, and those children may fork as well. We call the process that makes the first fork call the original process; it can generate a whole inheritance tree of descendants.

Linux interprocess communication

The interprocess communication mechanisms in Linux are usually called inter-process communication (IPC). Generally speaking, they can be divided into six types.

Let’s outline each of them below

Signal

Signaling was the first interprocess communication mechanism used by UNIX systems. Because Linux inherits from UNIX, Linux also supports signals, which work by sending asynchronous event notifications to one or more processes. Signals can be generated by the keyboard or by errors such as accessing a nonexistent memory location. The shell also uses signals to send job-control commands to its child processes.

You can type kill -l on your Linux system to list the signals the system uses. Some of the common ones are described below.

A process can choose to ignore incoming signals, with two exceptions: SIGSTOP and SIGKILL cannot be ignored. The SIGSTOP signal tells the running process to stop, and the SIGKILL signal tells it to terminate. For the other signals, a process can choose which ones to handle: it can block them, handle them itself with its own handler, or leave them to the kernel, in which case the default action is performed.

To deliver a signal, the operating system interrupts the target process. Execution can be interrupted at any non-atomic instruction. If the process has registered a signal handler, the handler is executed; if not, the default action is taken.

For example, when a process receives a SIGFPE floating-point-exception signal, the default action is to dump core and exit. Signals have no priority: if two signals are generated for a process at the same time, they can be delivered to the process in any order.

Now let’s see what these signals are used for

  • SIGABRT and SIGIOT

SIGABRT and SIGIOT signals are sent to a process to tell it to terminate. This signal is usually sent by the process to itself, when it calls the C library's abort() function.

  • SIGALRM, SIGVTALRM, SIGPROF

SIGALRM, SIGVTALRM, or SIGPROF is sent to a process when the corresponding timer expires: SIGALRM when real (wall-clock) time expires, SIGVTALRM when the CPU time used by the process expires, and SIGPROF when the CPU time used by the process and by the system on its behalf expires.

  • SIGBUS

SIGBUS is sent to a process when it causes a bus error.

  • SIGCHLD

SIGCHLD is sent to the child process when it terminates, is interrupted, or is resumed from an interrupt. A common use of this signal is to instruct the operating system to clean up the resources used by the child process after it terminates.

  • SIGCONT

A SIGCONT signal instructs the operating system to resume a process that was previously suspended by a SIGSTOP or SIGTSTP signal. An important use of this signal is in job control in Unix shells.

  • SIGFPE

The SIGFPE signal is sent to the process when an incorrect arithmetic operation (such as dividing by zero) is performed.

  • SIGHUP

The SIGHUP signal is sent to a process when its controlling terminal is closed. Rather than exiting on this signal, many daemons reload their configuration files and reopen their log files.

  • SIGILL

The SIGILL signal is emitted when an attempt is made to execute an illegal, malformed, unknown, or privileged instruction

  • SIGINT

When the user wishes to interrupt a process, the operating system sends the SIGINT signal to it, typically when the user types Ctrl-C.

  • SIGKILL

The SIGKILL signal is sent to a process to terminate it immediately. In contrast to SIGTERM and SIGINT, this signal cannot be caught or ignored, and the process cannot perform any cleanup after receiving it, with a few exceptions below

A zombie process cannot be killed because it is already dead and is waiting for its parent to reap it

A blocked process will not be killed until it wakes up again

The init process is Linux’s initialization process and will ignore any signals.

SIGKILL is usually sent to the process as the last signal to kill it, usually when there is no response from SIGTERM.

  • SIGPIPE

SIGPIPE is sent to a process when it attempts to write to a pipe whose read end has been closed, so that the data cannot be written

  • SIGPOLL

A SIGPOLL signal is sent when an event occurs on a file descriptor that is explicitly monitored.

  • SIGRTMIN to SIGRTMAX

SIGRTMIN to SIGRTMAX are real-time signals

  • SIGQUIT

When a user requests to exit the process and perform a core dump, the SIGQUIT signal is sent to the process by its controlling terminal.

  • SIGSEGV

SIGSEGV is sent to a process when it makes an invalid virtual memory reference, that is, a segmentation fault.

  • SIGSTOP

SIGSTOP instructs the operating system to suspend the process for later resumption

  • SIGSYS

SIGSYS is sent to a process that passes a bad argument to a system call.

  • SIGTERM

SIGTERM, briefly mentioned above, is a signal sent to a process to request its termination. Unlike SIGKILL, this signal can be caught or ignored by the process, allowing it to terminate gracefully, freeing resources and saving state when appropriate. SIGINT is almost identical to SIGTERM.

  • SIGTSTP

The SIGTSTP signal is sent to a process by its controlling terminal to request that it stop.

  • SIGTTIN and SIGTTOU

SIGTTIN and SIGTTOU are sent to a background process when it attempts, respectively, to read from or write to the TTY.

  • SIGTRAP

SIGTRAP is sent to a process when an exception or trap occurs

  • SIGURG

The SIGURG signal is sent to the process when the socket has readable emergency or out-of-band data.

  • SIGUSR1 and SIGUSR2

SIGUSR1 and SIGUSR2 signals are sent to the process to indicate user-defined conditions.

  • SIGXCPU

A SIGXCPU signal is sent to a process when it has used more CPU time than a predetermined, user-settable limit

  • SIGXFSZ

The SIGXFSZ signal is sent to a process when it grows a file beyond the maximum allowed file size.

  • SIGWINCH

The SIGWINCH signal is sent to the process when its controlling terminal changes its size (window changes).

Pipe

Processes on Linux systems can communicate by setting up pipes.

A channel can be established between two processes; one process writes a stream of bytes into it, and the other reads bytes out of it. Pipes are synchronous: when a process tries to read data from an empty pipe, it blocks until data is available. Shell pipelines are implemented with pipes, for example

sort <f | head

This creates two processes, sort and head, and sets up a pipe between them so that the standard output of sort becomes the standard input of head. The output of sort never needs to be written to a file; if the pipe fills up, the system suspends sort until head has read some data.

The pipe symbol | is what requests this. Neither application knows the pipe exists; everything is managed and controlled by the shell.

Shared memory

Two processes can also communicate through shared memory, where two or more processes gain access to a common region of memory. Their joint work is done through that shared region, and changes made by one process are visible to the others (much like communication between threads).

Before using shared memory, a series of calls is needed, as follows

  • Create a shared memory segment, or use an existing one (shmget())
  • Attach the process to the already-created segment (shmat())
  • Detach the process from the attached segment (shmdt())
  • Perform control operations on the segment (shmctl())

First-in, first-out queue (FIFO)

First-in, first-out queues (FIFOs) are often referred to as Named Pipes. Named pipes work much like regular pipes but with some notable differences. Unnamed pipes have no backing file: the operating system maintains an in-memory buffer that transfers bytes from the writer to the reader, and once the writer and reader terminate, the buffer is reclaimed and any untransferred data is lost. Named pipes, by contrast, have a backing file and their own API; they exist in the file system as special device files. When all process communication is complete, the named pipe remains in the file system for later use. Named pipes have strict FIFO behavior

The first byte written is the first byte read, the second byte written is the second byte read, and so on.

Message Queue

The term message queue may be unfamiliar at first. A message queue is a linked list of messages maintained inside the kernel's address space. Messages can be sent to the queue in order and retrieved from the queue in several different ways. Each message queue is uniquely identified by an IPC identifier. Message queues have two modes: strict mode, which behaves like a FIFO queue (messages are read in the order they were sent), and non-strict mode, where the order of messages is unimportant.

Socket

Another way to manage communication between two processes is to use sockets, which provide end-to-end, two-way communication. A socket can be associated with one or more processes. Just as pipes come in named and unnamed varieties, so do sockets. Sockets are generally used for network communication between two processes, in which case they rely on underlying protocols such as TCP (Transmission Control Protocol) or the lower-level UDP (User Datagram Protocol).

Sockets are classified as follows

  • Sequential Packet Socket: provides reliable connections for datagrams of a fixed maximum length; the connection is bidirectional and preserves order.
  • Datagram Socket: supports bidirectional data flow, but messages may arrive in a different order than they were sent.
  • Stream Socket: works like a telephone conversation, providing a reliable, bidirectional stream of data.
  • Raw Socket: gives access to the underlying communication protocols.

Process management system calls in Linux

Now let's look at the system calls related to process management in Linux. First, we need to know what a system call is.

The operating system shields us from the differences among hardware devices; its primary job is to provide users with an abstraction that hides the internal implementation, so users only need to care about how to use it. A CPU running an operating system operates in one of two modes

  • Kernel mode: mode used by the operating system kernel
  • User mode: The mode used by the user application

Switching between kernel mode and user mode happens frequently. A system call is one way to trigger the switch from user mode to kernel mode: it is the mechanism by which a program requests a service from the operating system kernel, usually running silently in the background.

There are many system call instructions, but here are some of the most important system calls related to process management

fork

The fork call creates a new child process with a copy of the parent's program counter, CPU registers, and open files.

exec

The exec system call replaces the current process image with a new executable: when exec is called, the new executable replaces the previous one and begins executing. The PID of the process does not change, because no new process is created; only the process image is replaced. The new executable is loaded into the same execution space, but the process's data, code, and stack are all replaced. If the process being replaced contains multiple threads, all threads are terminated and the new process image is loaded and executed.

The concept of a process image needs some explanation here.

What is a process image? A process image is the in-memory layout needed to execute a program, and it usually includes the following parts

  • Code segment (code segment / text segment)

Also known as the text segment, this is the block of memory that stores the program's instructions.

Its size is determined before the code runs.

The memory is generally read-only, although some architectures allow writable code.

The code segment may also contain read-only constants, such as string literals.

  • Data segment

Readable and writable, because the values of variables can change at run time; its size is nevertheless fixed.

Stores initialized global variables and initialized static variables.

Data in the data segment lives as long as the process: it exists when the process is created and disappears when the process exits.

  • BSS segment

Readable and writable.

Stores uninitialized global variables and uninitialized static variables.

Data in the BSS segment defaults to 0.

  • Stack

Readable and writable.

Stores local (non-static) variables of functions and code blocks.

Stack data lives as long as its code block: space is allocated when the block starts running and automatically reclaimed when the block ends.

  • Heap

Readable and writable.

Stores memory dynamically allocated with malloc/realloc during program execution.

Heap data lives with the process, from malloc/realloc until free.

Below is a diagram of how these regions are laid out

The exec system call is actually a family of functions:

  • execl
  • execle
  • execlp
  • execv
  • execve
  • execvp

Here's how exec works

  1. The current process image is replaced by the new one
  2. The new process image is the executable passed as an argument to exec
  3. The currently running program ends
  4. The new process image keeps the same PID, the same environment, and the same file descriptors (because the process itself is not replaced, only its image)
  5. CPU state and virtual memory are affected: the virtual memory mapping of the old image is replaced by the virtual memory of the new image.

waitpid

Waits for a child process to finish or be terminated

exit

On many operating systems, a process terminates by executing the exit system call. An exit status of 0 indicates that the process ended normally; other values indicate abnormal termination.

Some other common system calls are as follows

System call                                    Description
pause                                          Suspends the process until a signal arrives
nice                                           Changes the priority of a time-sharing process
ptrace                                         Traces a process
kill                                           Sends a signal to a process
pipe                                           Creates a pipe
mkfifo                                         Creates a FIFO special file (named pipe)
sigaction                                      Sets the handler for a specified signal
msgctl                                         Performs message queue control operations
semctl                                         Performs semaphore control operations

Implementation of Linux processes and threads

Linux process

In the Linux kernel, a process is represented as a task and created via the task_struct structure. Unlike other operating systems, Linux does not distinguish between processes, lightweight processes, and threads: it uses one uniform task structure to represent any execution context. So a single-threaded process is represented by one task structure, and a multi-threaded process gets one task structure per user-level thread. The Linux kernel itself is multithreaded, and its kernel-level threads are not associated with any user-level thread.

For each process there is a task_struct process descriptor in memory. The process descriptor contains all the information the kernel needs to manage the process, including scheduling parameters, open file descriptors, and so on. The process descriptor exists from the moment the process is created.

Linux, like Unix, uses PIDs to distinguish processes. The kernel organizes the task structures of all processes into a doubly linked list, and a PID can be mapped directly to the address of the process's task structure, so a task can be reached directly without traversing the list.

We mentioned the process descriptor above; it is an important concept. We also said that the process descriptor sits in memory. To spell that out: the process descriptor is the user task's task structure, and it is brought into memory when the process is placed in memory and starts running.

A program that has been loaded into memory and is executing is called a process, sometimes described as a Process In Memory (PIM); this is an embodiment of the von Neumann architecture. Simply put, a process is a program in execution.

The information in a process descriptor falls into the following categories

  • Scheduling parameters: the process priority, the CPU time recently consumed, and the time recently spent sleeping together determine which process runs next
  • Memory image: as mentioned above, the process image is what is needed to execute a program; it consists of data and code
  • Signals: which signals are caught and which are being handled
  • Registers: the register contents saved when a trap to the kernel occurs
  • System call state: information about the current system call, including its parameters and results
  • File descriptor table: when a system call involving a file descriptor is made, the descriptor is used as an index into this table to locate the file's i-node data structure
  • Accounting statistics: some systems record the CPU time consumed by the process, the maximum stack space used, the number of page faults, and so on
  • Kernel stack: a fixed stack used by the kernel-mode part of the process
  • Other: current process state, the event being waited for, the alarm timeout, the PID, the parent's PID, the user identifier, and so on

With this information, it is now easy to describe how processes are created in Linux, and creating a new process is actually quite simple: a new process descriptor and user area are allocated for the child and largely filled in from the parent. The child is assigned a PID, its memory map is set up, it is given access to the parent's files, its registers are set up, and it is ready to run.

When a fork system call is made, the calling process traps into the kernel, and task-specific data structures such as the kernel stack and the thread_info structure are created.

See the thread_info structure

Docs.huihoo.com/doxygen/lin…

This structure contains a pointer to the process descriptor and sits at a fixed location, allowing Linux to locate the data structure of a running process with minimal overhead.

The content of the process descriptor is mainly filled in from the parent's descriptor. Linux then looks for an available PID that is not in use by any process and updates the PID hash table entry to point at the new task structure. To handle collisions, the hash table chains process descriptors into linked lists. It also sets the fields of task_struct to point to the previous/next process on the task list.

task_struct is the Linux process descriptor; it is a large C structure in the kernel source, which we will cover later.

In principle, a memory area should now be set up for the child: data and stack segments would be allocated and the parent's contents copied into them. But copying is expensive, and since the child and parent do not share memory after fork, copy semantics must be preserved somehow. So the Linux operating system uses a trick: the child is given its own page tables, but they point to the parent's pages, with all pages marked read-only. Whenever either process writes to one of these pages, a protection fault occurs. The kernel sees the write, allocates a fresh copy of the page for the writing process, and the write goes to that copy, which is marked read-write. This technique is called copy on write. It avoids keeping two copies of the same data in memory and saves memory space.

After the child starts running, suppose it makes the exec system call. The kernel then looks up and verifies the executable, copies the arguments and environment variables into the kernel, and releases the old address space.

Now the new address space must be created and populated. If the system supports memory-mapped files, as Unix systems do, a new page table is created indicating that no pages are in memory, except perhaps the stack pages, and the address space is backed by the executable on disk. When the new process starts running it immediately takes a page fault, which causes the first page of code to be loaded from the executable. Finally, the arguments and environment variables are copied to the new stack, the signals are reset, and the registers are zeroed. The new program then starts running.

Here is an example: when the user types ls, the shell calls fork to create a new process, and the child then calls exec to overlay its memory with the contents of the executable file ls.

Linux threads

Now let's talk about threads in Linux. A thread is a lightweight process; you have heard that many times. Threads in the same process share open files and other resources, thread switching is much cheaper than process switching, and threads communicate with each other more easily. There are two kinds of threads: user-level threads and kernel-level threads.

User-level thread

User-level threads avoid using the kernel. Typically, each thread explicitly calls a switch routine, sends a signal, or performs some other operation to give up the CPU; a timer can likewise force a switch. User-level thread switching is usually much faster than kernel-level switching. One problem with implementing threads at user level is that a single thread can monopolize the CPU time slice, starving the other threads in the process. And if a thread performs a blocking I/O operation, the whole process blocks and none of its other threads can run.

Some user-level thread packages solve these problems: a clock-tick monitor can prevent one thread from monopolizing the time slice, and libraries can handle blocking system calls through special wrappers, or the code can be written to use non-blocking I/O.

Kernel level thread

Kernel-level threads are typically implemented in the kernel, with a process-table-like entry for each thread. In this case, the kernel schedules each thread within each process's time slice.

All calls that might block are implemented as system calls; when a thread blocks, the kernel can choose to run another ready thread in the same process or a thread from a different process.

The cost of going user space -> kernel space -> user space is high, but the one-time thread initialization cost is negligible. The advantage of this implementation is that switching is driven by the clock, so one thread is unlikely to tie up the time slices of the other threads in the task. Likewise, I/O blocking is not a problem.

Hybrid implementation

Combining the advantages of user-level and kernel-level threads, designers take a kernel-level thread approach and then multiplex user-level threads onto some or all of the kernel threads

In this model, the programmer is free to control the number of user threads and kernel threads, with great flexibility. With this approach, the kernel identifies only kernel-level threads and schedules them. Some of these kernel-level threads are multiplexed by multiple user-level threads.

Linux scheduling

Let's look at the scheduling algorithms in Linux. The first thing to realize is that Linux threads are kernel threads, so Linux schedules threads, not processes.

For scheduling purposes, Linux divides threads into three categories

  • Real-time first in, first out
  • Real-time round robin
  • Time-sharing

Real-time first-in-first-out threads have the highest priority and are not preempted except by a newly readied real-time thread with even higher priority. Real-time round-robin threads are basically the same, except that each has a time quantum and can be preempted when the quantum expires. If several real-time round-robin threads are ready, each runs for its quantum and then goes to the end of its queue.

Note that this real time is only relative, not absolute, because the running time of a thread cannot be guaranteed. These classes are merely more real-time than the time-sharing class.

Linux assigns each thread a nice value, which represents its priority. The default nice value is 0, but it can be changed with the nice system call; the range is -20 to +19. The nice value determines the thread's static priority. A system administrator can give a thread higher priority than ordinary threads by using nice values in the range -20 to -1.

Let's look at two Linux scheduling algorithms in more detail; internally they share a similar run-queue design. A run queue is a data structure that tracks all runnable tasks in the system and selects the next one to run; there is one run queue per CPU.

The Linux O(1) scheduler was historically a popular scheduler. The name comes from its ability to make scheduling decisions in constant time. In the O(1) scheduler, the run queue is organized as two arrays, one for active tasks and one for expired tasks. As shown in the figure below, each array contains 140 list heads, one per priority level.

The general process is as follows:

The scheduler selects the highest-priority task from the active array. If that task's time slice expires, it is moved to the expired array. If a task blocks, for example waiting for an I/O event, before its time slice expires, then once the I/O completes it goes back into the active array, and because it already consumed part of its time slice it runs only for the remainder. When a task has used up its entire time slice, it is placed in the expired array. Once no tasks are left in the active array, the scheduler simply swaps the pointers: the expired array becomes the active array and vice versa. This ensures that tasks of every priority get to run, without starvation.

In this scheme, tasks of different priorities receive different CPU time slices: higher-priority processes usually get longer slices, and lower-priority tasks get shorter ones.

To ensure good responsiveness, interactive processes, which are typically user-facing, are given higher priority.

Linux cannot know in advance whether a task is I/O-bound or CPU-bound; it can only rely on heuristics about interactivity, so Linux distinguishes between static priority and dynamic priority. Dynamic priority uses a reward mechanism: interactive threads are rewarded and CPU-hogging threads are penalized. In the Linux O(1) scheduler, the maximum reward is -5 (note that a lower value means the thread is scheduled sooner), and the maximum penalty is +5. Concretely, the scheduler maintains a variable called sleep_avg for each task: it increases when the task wakes up and decreases when the task is preempted or its quantum expires, and it feeds the reward mechanism.

The O(1) scheduling algorithm was the scheduler of the 2.6 kernel series and was first introduced in the unstable 2.5 series. It showed that, even in multiprocessor environments, scheduling decisions could be made just by accessing the active array, so scheduling could complete in constant O(1) time.

What does it mean that the O(1) scheduler uses a heuristic approach?

In computer science, a heuristic is a way to solve a problem quickly when traditional methods are slow, or to find an approximate solution where traditional methods cannot find any exact solution.

The heuristics used by the O(1) scheduler made estimating task interactivity complex and imperfect, resulting in poor performance for interactive tasks.

To address this shortcoming, the developers of the O(1) Scheduler proposed a new solution, the Completely Fair Scheduler (CFS). The main idea of CFS is to use a red-black tree as a scheduling queue.

Data structures are too important.

CFS arranges tasks in the tree according to how long they have run on the CPU, measured down to the nanosecond. The following is the CFS construction model

The scheduling process of CFS is as follows:

The CFS algorithm always runs the task that has used the least CPU time, which is normally the leftmost node of the tree. When a new task becomes runnable, CFS compares it with the leftmost value: if the new task has the smallest time value it runs; otherwise CFS walks the tree to find the right place to insert it. The CPU then runs whatever task is currently leftmost in the red-black tree.

Selecting the node to run from the red-black tree takes constant time, but inserting a task takes O(log N), where N is the number of tasks in the system. This is acceptable at current system load levels.

The scheduler only needs to consider runnable tasks, which are placed on the appropriate run queue. Tasks that are not runnable, because they are waiting for I/O or some kernel event, go onto a wait queue. A wait queue header contains a pointer to the list of tasks and a spin lock; the spin lock is needed in concurrent scenarios.

Synchronization in Linux

Let's talk about synchronization in Linux. Early Linux kernels had a single Big Kernel Lock (BKL), which prevented different processors from running kernel code concurrently, so finer-grained locking mechanisms had to be introduced.

Linux provides several types of synchronization variables, used both inside the kernel and by user applications. At the lowest layer, Linux wraps hardware-supported atomic instructions in operations such as atomic_set and atomic_read. Because hardware may reorder memory operations, Linux also provides memory barriers.

A higher-level primitive is the spin lock: when two processes contend for a resource and one holds it, the other does not block but spins, busy-waiting until the resource becomes free. Linux also provides mechanisms such as mutexes and semaphores, along with non-blocking variants such as mutex_trylock and mutex_trywait. Interrupt handling is supported as well, and individual interrupts can be dynamically disabled and enabled.

Linux boot

Let’s talk about how Linux starts.

When the computer is powered on, the BIOS performs a power-on self-test (POST) to check and initialize the hardware, since the operating system will use the disk, screen, keyboard, mouse, and other devices. Next, the first sector of the boot disk, the MBR (Master Boot Record), is read into a fixed memory location and executed. This sector contains a tiny 512-byte program that loads a standalone boot program from the disk; the boot program first copies itself to high memory to free low memory for the operating system.

After this copy completes, the boot program reads the root directory of the boot device; for this it must understand the file system and directory format. Then the boot program loads the kernel and transfers control to it. At this point the boot program has done its job and the kernel starts running.

The kernel start-up code is written in assembly language. It creates the kernel stack, identifies the CPU type, computes the amount of memory, disables interrupts, and enables the memory management unit, then calls the C-language main function to run the rest of the operating system.

This part also does a number of things. First, a message buffer is allocated to hold debugging messages; debugging information is written to it and can be examined by a diagnostic program when something goes wrong.

Then the operating system autoconfigures devices: it probes for devices, loads configuration files, and adds each device that responds to the table of attached devices. Devices that do not respond are considered absent and are ignored.

Once all the hardware is configured, the next job is to carefully hand-craft process 0: set up its stack, run it, perform initialization, configure the clock, and mount the root file system. Process 0 then creates the init process (process 1) and a daemon (process 2).

The init process checks a flag to determine whether it should serve a single user or multiple users. In the single-user case, it calls fork to create a shell process and waits for that process to terminate. In the multi-user case, it forks a process that runs the system initialization shell script /etc/rc, which checks file system consistency, mounts file systems, starts daemons, and so on.

The /etc/rc process then reads /etc/ttys, which lists all terminals and their attributes. For each enabled terminal, it calls fork to create a copy of itself, which does some housekeeping and then runs a program called getty.

The getty program then prints on the terminal

login:

and waits for the user to type a user name. After the name is entered, getty terminates and the /bin/login program starts running. login prompts for a password and compares it with the one stored in /etc/passwd. If it is correct, login replaces itself with the user's shell, which waits for the first command. If it is incorrect, login simply asks for a user name again.

The entire system startup process is as follows

Linux Memory Management

The Linux memory management model is straightforward, which helps make Linux portable: it can be implemented on machines with similar memory management units. Let's take a look at how Linux memory management works.

The basic concept

Every Linux process has an address space that consists of three segments: text, data, and stack. The following is an example of a process address space.

The data segment stores program variables, strings, arrays, and other data. It has two parts: initialized data and uninitialized data; the uninitialized part is what we call the BSS. The initialized part holds variables and constants whose values are fixed at compile time and needed when the program starts. All variables in the BSS are initialized to 0 after loading.

Unlike the text segment, the data segment can change: programs modify their variables all the time, and many programs also need to allocate space dynamically during execution. Linux allows the data segment to grow or shrink as memory is allocated and reclaimed, so a program can allocate memory by increasing the size of its data segment. The C standard library provides malloc, which is commonly used for this; the dynamically allocated part of the process address space is called the heap.

The third segment is the stack. On most machines, the stack starts near the top of the virtual address space and grows downward (toward address zero). For example, on 32-bit x86 machines the stack starts at 0xC0000000: the 3 GB limit on the address space a process may see in user mode. If the stack grows below the bottom of the stack segment, a hardware fault occurs and the operating system grows the stack segment by one page.

When a program starts, its stack is not empty: it contains all the shell environment variables as well as the command line typed to the shell to invoke it. For example, when you type

cp cxuan lx

the cp program runs with the string cp cxuan lx on its stack, from which it finds the names of the source and destination files.

When two users run the same program, such as an editor, it would be possible to keep two copies of the program's code in memory, but that is inefficient. Instead, Linux supports shared text segments. In the figure below, processes A and B share the same text region.

Data segments and stack segments are shared only after a fork, and then only the unmodified pages are shared. If either segment needs to grow and there is no adjacent room, that is no problem, since adjacent virtual pages do not have to map to adjacent physical pages.

Besides dynamically allocating more memory, processes in Linux can access file data through memory-mapped files. This feature lets us map a file into a portion of the process address space so that the file can be read and written as if it were a byte array in memory. Mapping a file makes random access much easier than using I/O system calls such as read and write. Shared libraries are accessed through this mechanism. As shown below.

We can see that the same file can be mapped into two processes at the same physical memory, even though the mappings belong to different address spaces.

The advantage of mapped files is that two or more processes can map the same file at the same time, and any process's writes to the file are then visible to the others. Mapping a temporary file, which disappears after the processes exit, provides high-bandwidth shared memory for multiple processes. In reality, though, no two address spaces are identical, because each process maintains its own open files and signals.

Linux memory management system call

Let's look at the system calls for memory management. POSIX actually specifies no system calls for memory management; Linux, however, has its own, the main ones being the following

System call                                    Description
s = brk(addr)                                  Change the size of the data segment
a = mmap(addr, len, prot, flags, fd, offset)   Map a file into memory
s = munmap(addr, len)                          Remove a mapping

If an error occurs, the return value s is -1; a and addr are memory addresses, len is a length, prot is the protection bits, flags contains additional flag bits, fd is a file descriptor, and offset is a file offset.

brk specifies the size of the data segment by giving the address of the first byte beyond it. If the new value is larger than the old one, the data segment grows; otherwise it shrinks.

The mmap and munmap system calls control memory-mapped files. The first parameter of mmap, addr, determines the address at which the file is mapped; it must be a multiple of the page size. If the argument is 0, the system assigns an address itself and returns it in a. The second parameter, len, tells how many bytes to map; it is rounded up to a multiple of the page size. prot determines the protection of the mapped file: readable, writable, executable, or a combination of these. The fourth parameter, flags, controls whether the mapping is private or shared, and whether addr is a requirement or merely a hint. The fifth parameter, fd, is the descriptor of the file to map; only open files can be mapped, so to map a file you must first open it. The last parameter, offset, indicates where in the file the mapping starts; it does not have to begin at byte zero.

Linux memory management implementation

The memory management system is one of the most important parts of an operating system. Since the early days of computing, we have needed more memory than physically exists in the system. Memory allocation strategies overcome this limitation, the most famous being virtual memory: by sharing memory among competing processes, virtual memory makes the system appear to have more memory than it really does. The virtual memory subsystem involves the following concepts.

Large address space

The operating system makes the system appear to have much more memory than its physical memory, because the virtual address space can be many times larger than physical memory.

Protection

Each process in the system has its own virtual address space. These virtual address spaces are completely separate from each other, so a process running one application cannot affect another. The hardware virtual memory mechanisms also allow critical memory areas to be protected.

The memory mapping

Memory mapping is used to map images and data files to the process address space. In memory mapping, the contents of a file are mapped directly into the virtual space of a process.

Fair physical memory allocation

The memory management subsystem allows each running process in the system to fairly allocate the system’s physical memory.

Shared virtual memory

Although virtual memory gives each process its own address space, processes sometimes need to share memory. For example, several processes may need to exchange information through IPC; shared memory lets them pass information directly instead of copying data between them.

Now let's talk formally about what virtual memory is.

Abstract model of virtual memory

Before considering the methods Linux uses to support virtual memory, it’s useful to consider an abstract model that doesn’t get bogged down in too many details.

When the processor executes an instruction, it reads the instruction from memory and decodes it. In decoding and executing it, the processor may need to fetch the contents of some location or store data to one. The processor then goes on to the next instruction. Thus the processor is constantly accessing memory, both to fetch instructions and to fetch and store data.

In a virtual memory system, all the addresses a program uses are virtual rather than physical. But instructions and data are actually stored and fetched at physical addresses, so the processor must translate virtual addresses into physical addresses, using tables maintained by the operating system.

To make translation easy, virtual and physical address spaces are divided into fixed-size blocks called pages. Pages all have the same size; if they did not, the operating system would find them hard to manage. Linux on Alpha AXP systems uses 8 KB pages, while Linux on Intel x86 systems uses 4 KB pages. Each page has a unique number, the page frame number (PFN).

This is the Linux memory mapping model. In this paging model, a virtual address consists of two parts: an offset and a virtual page frame number. Whenever the processor encounters a virtual address, it extracts the offset and the virtual page frame number, converts the virtual page frame number to a physical one, and then accesses the physical page at the correct offset.

The figure above shows the virtual address spaces of two processes, A and B, each with its own page table. The page tables map each virtual page of the process onto a physical page in memory. Each page table entry contains

  • A valid flag: indicates whether the page table entry is valid
  • The physical page frame number this entry describes
  • Access control information: how the page may be used, whether it is writable, and whether code in it may be executed

To map a virtual address from the processor to a physical memory address, the page number and offset of the virtual address must first be computed. Because the page size is a power of 2, this can be done with shifts and masks.

If the current process attempts to access a virtual address that cannot be translated, a page fault is raised. The processor notifies the operating system of the faulting virtual address and the reason for the fault.

By mapping virtual addresses to physical addresses in this way, virtual memory can be mapped to the physical pages of the system in any order.

On-demand paging

Since physical memory is much smaller than virtual memory, the operating system must be careful not to use physical memory inefficiently. One way to save physical memory is to load only the pages the executing program is currently using (a lazy-loading idea, isn't it?). For example, a database program running a query does not need all of the data in memory, only the records being examined. This technique of loading virtual pages only when they are needed is called demand paging.

Swapping

If a process needs to put a virtual page into memory, but no physical page is available at this point, the operating system must discard another page in physical memory to make room for that page.

If the page has been modified, the operating system must preserve the content of the page so that it can be accessed later. Pages of this type are called dirty pages, and when removed from memory, they are saved in special files called swap files. Access to swap files is very slow relative to the speed of the processor and physical memory, and the operating system needs to balance writing pages to disk with keeping them in memory for reuse.

Linux uses a least recently used (LRU) page-aging technique to choose pages fairly for removal from the system. In this scheme, every page in the system has an age that changes with the number of accesses: the more a page is accessed, the younger it is; the less it is accessed, the older it becomes and the more likely it is to be swapped out.
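A minimal sketch of the aging idea (not the actual kernel data structures): each page carries an age, a periodic sweep makes every page older, an access makes a page young again, and the oldest page becomes the eviction candidate.

```python
class PageAger:
    """Toy model of LRU-style page aging."""
    def __init__(self, pages):
        self.age = {p: 0 for p in pages}

    def touch(self, page):
        self.age[page] = 0          # an access makes the page young again

    def tick(self):
        for p in self.age:          # periodic sweep: every page grows older
            self.age[p] += 1

    def victim(self):
        return max(self.age, key=self.age.get)   # oldest page is evicted
```

A page that is touched between sweeps will never be the victim ahead of a page that was not touched at all.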

Physical and virtual addressing modes

Most general-purpose processors support both a physical address mode and a virtual address mode. Physical addressing mode requires no page tables, and the processor performs no address translation in this mode. The Linux kernel is linked to run in the virtual address space.

The Alpha AXP processor has no dedicated physical addressing mode. Instead, it divides the memory space into several regions, two of which are designated as physically mapped addresses. The kernel's address space is called the KSEG address space and contains all addresses from 0xFFFFFC0000000000 upward. To execute code linked into KSEG (kernel code, by definition) or access data there, the processor must be in kernel mode. The Linux kernel on Alpha executes from address 0xFFFFFC0000310000.

Access control

Each entry in the page table also contains access control information, which the processor checks to decide whether the process may access the memory in the way it is attempting.

Memory access is restricted where necessary. For example, memory containing executable code is naturally read-only; the operating system should not allow a process to write data over its own executable code. Conversely, pages containing data can be written, but attempts to execute that memory as instructions should fail. Most processors have at least two modes of execution: kernel mode and user mode. Kernel code and kernel data structures should not be accessible to user-mode code; they should be reachable only when the processor is running in kernel mode.

The access control information is kept in the page table entry (PTE); the entry shown above is the Alpha AXP PTE. Its bit fields have the following meanings:

  • V: Valid. Indicates whether this page table entry is valid.
  • FOR: Fault On Read. A fault is raised whenever an attempt is made to read this page.
  • FOW: Fault On Write. A fault is raised whenever an attempt is made to write this page.
  • FOE: Fault On Execute. Whenever the processor attempts to execute instructions in this page, it reports a page fault and passes control to the operating system.
  • ASM: Address Space Match. Used when the operating system wants to clear only some of the entries in the translation buffer.
  • GH: Granularity hint, used when a whole block is mapped with a single translation buffer entry rather than several.
  • KRE: Code running in kernel mode can read this page.
  • URE: Code running in user mode can read this page.
  • KWE: Code running in kernel mode can write this page.
  • UWE: Code running in user mode can write this page.
  • Page frame number: For a PTE with the V bit set, this field contains the physical page frame number for the entry. For an invalid PTE, if the field is not zero, it holds information about where the page is located in the swap file.

In addition, Linux uses two bits of its own:

  • _PAGE_DIRTY: if set, the page needs to be written out to the swap file
  • _PAGE_ACCESSED: used by Linux to mark the page as having been accessed
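The flags-plus-frame-number layout can be pictured with a small decoder. The bit positions below are invented for the sketch; the real Alpha AXP layout differs.

```python
# Decode a page table entry into its flags and page frame number.
# Bit positions here are illustrative only, not the real Alpha AXP layout.
FLAG_BITS = {"V": 0, "FOR": 1, "FOW": 2, "FOE": 3}
PFN_SHIFT = 32                     # assume the PFN lives in the high bits

def decode_pte(pte):
    fields = {name: bool((pte >> bit) & 1) for name, bit in FLAG_BITS.items()}
    fields["pfn"] = pte >> PFN_SHIFT
    return fields
```

An entry with V and FOW set, for instance, is a valid page that traps on every write, which is exactly the behavior needed to notice a page becoming dirty.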

The cache

The virtual memory model described above could be implemented exactly as stated, but not very efficiently. Operating system and processor designers therefore work hard to improve performance. Besides making processors and memory faster, the best approach is to maintain caches of useful information and data so that common operations become faster. Linux uses a number of memory-management-related caches to improve efficiency.

Buffer cache

The buffer cache contains the data buffers used by block device drivers.

Remember what a block device is? A quick review:

A block device stores information in fixed-size blocks that can be read and (optionally) written individually, as blocks, sectors, or clusters. Each block has its own physical address; block sizes typically range from 512 to 65536 bytes. All data is transferred in units of one or more contiguous blocks. The defining property of a block device is that each block is independent and can be read or written on its own. Common block devices include hard disks, Blu-ray discs, and USB drives.

Block devices generally require fewer pins than character devices.

The buffer cache is used to quickly find blocks of data by device identifiers and block numbers. If the data can be found in the buffer cache, there is no need to read the data from the physical block device, which is much faster.

Page caching

The page cache is used to speed up access to executable images and data on disk.

It is used to cache the contents of a file one page at a time and can be accessed through the file and offsets within the file. When pages are read into memory from disk, they are cached in the page cache.

Swap cache

Only modified (dirty) pages are saved in the swap file.

As long as these pages are not modified after being written to the swap file, then the next time such a page needs to be swapped out it does not have to be written again, because it is already in the swap file; it can simply be discarded. On a heavily swapped system, this saves many unnecessary and expensive disk operations.

Hardware cache

A hardware cache is typically implemented inside the processor: a cache of page table entries. Rather than reading the page table directly every time, the processor caches page translations as it needs them. These caches are the translation lookaside buffers (TLBs), which hold cached copies of page table entries from one or more processes in the system.

When a virtual address is referenced, the processor tries to find a matching TLB entry. If one is found, the virtual address is translated directly into a physical address and the correct operation is performed on the data. If the processor cannot find a matching entry, it asks the operating system for help by signaling that a TLB miss has occurred. A system-specific mechanism delivers this exception to operating system code that can fix it; the operating system generates a new TLB entry for the address mapping. When the exception is cleared, the processor retries the translation of the virtual address, and this time it succeeds.
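The hit/miss/retry flow can be sketched with a toy TLB in front of the page table. A miss here plays the role of the exception in which the operating system installs the missing entry; all structures and names are invented for the sketch.

```python
# Toy TLB sitting in front of a page table: a hit translates directly;
# a miss stands in for the fault in which "the OS" installs the entry,
# after which the lookup succeeds.
PAGE_SHIFT = 12
OFFSET_MASK = (1 << PAGE_SHIFT) - 1

def tlb_translate(vaddr, tlb, page_table):
    vpn = vaddr >> PAGE_SHIFT
    if vpn not in tlb:                 # TLB miss: "exception" to the OS
        tlb[vpn] = page_table[vpn]     # OS installs the missing mapping
    return (tlb[vpn] << PAGE_SHIFT) | (vaddr & OFFSET_MASK)
```

After the first access to a page, its translation sits in the TLB and later accesses to the same page never consult the page table.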

There are downsides to using caches: in exchange for speed, Linux must spend extra time and space maintaining them, and if a cache becomes inconsistent with its source, the system will crash.

Linux page table

Linux assumes that page tables are organized in three levels, where each entry at one level points to the page table at the next level.

The PGD in the figure is the page global directory; whenever a new process is created, a new page global directory (PGD) is created for it.

To convert a virtual address to a physical address, the processor takes the index field for each level, converts it to an offset into the physical page holding that level's table, and reads out the page frame number of the next level's table. It does this three times, until it finds the frame number of the physical page containing the virtual address.
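A toy three-level walk makes the repetition concrete. The field widths and the nested-dict "tables" below are assumptions for the sketch, not a real layout:

```python
# Toy three-level page table walk (PGD -> middle level -> page table):
# each level is indexed by a 9-bit slice of the virtual address.
PAGE_SHIFT, LEVEL_BITS = 12, 9
LEVEL_MASK = (1 << LEVEL_BITS) - 1

def walk(pgd, vaddr):
    i1 = (vaddr >> (PAGE_SHIFT + 2 * LEVEL_BITS)) & LEVEL_MASK
    i2 = (vaddr >> (PAGE_SHIFT + LEVEL_BITS)) & LEVEL_MASK
    i3 = (vaddr >> PAGE_SHIFT) & LEVEL_MASK
    pfn = pgd[i1][i2][i3]          # three lookups, one per level
    return (pfn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1))
```

Translation macros in the kernel serve the same role as the index computations here: they hide the exact field widths from the code doing the walk.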

Every platform on which Linux runs must provide translation macros that allow the kernel to traverse a particular process’s page table. This way, the kernel does not need to know the format of the page table entries or how they are arranged.

Page allocation and deallocation

There are many requirements for physical pages in the system. For example, when an image is loaded into memory, the operating system needs to allocate pages.

All physical pages in the system are described by the mem_map data structure, a list of mem_map_t structures. Its important fields include:

  • count: the number of users of the page; it is greater than 1 when the page is shared between multiple processes
  • age: describes the age of the page and is used to decide whether the page is a good candidate for discarding or swapping
  • map_nr: the physical page frame number that this mem_map_t describes

The page allocation code finds and frees pages using the free_area vector, each element of which contains information about the page block.

Page allocation

Linux allocates and deallocates pages with the well-known buddy algorithm. Pages are allocated in blocks whose sizes are powers of two: 1 page, 2 pages, 4 pages, and so on, as long as enough free pages remain in the system (the criterion is nr_free_pages > min_free_pages). When it holds, a page block of the required size is searched for in free_area to complete the allocation. Each element of free_area holds a bitmap of allocated pages and a list of free page blocks for blocks of that size.

The allocation algorithm first searches for a page block of the requested size. If none is available, it searches for a block of twice that size, and so on up through free_area until one is found. If the block found is larger than requested, it is subdivided until a block of the right size remains.

Since each block is a power of two, the splitting process is easy because you just split the block in half. Free blocks are queued in the appropriate queue, and the allocated page block is returned to the caller.

For example, if a 2-page block is requested and the smallest free block is a 4-page block starting at page frame 4, that block is split into two 2-page blocks. The first (starting at page frame 4) is returned to the caller as the allocation, and the second (starting at page frame 6) is queued as a free 2-page block on element 1 of the free_area array.

Page deallocation

The biggest drawback of this approach is memory fragmentation: large free blocks get split into smaller ones. The page deallocation code therefore recombines pages into larger free blocks whenever it can. Each time a block is freed, the adjacent block of the same size (its buddy) is checked; if it is free, the two are combined into a free block of the next size up. Each time two blocks are combined, the code tries to combine the result into a still larger block. In this way, the free blocks are kept as large as possible.

For example, in the figure above, if page frame 1 is freed, it is combined with the already free page frame 0 and queued as a free 2-page block on element 1 of free_area.
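Both halves of the scheme, splitting on allocation and buddy-merging on free, fit in a short sketch. This is purely illustrative, not kernel code; the class and method names are invented.

```python
class BuddyAllocator:
    """Toy buddy allocator over page frames 0 .. 2**max_order - 1.
    alloc(order) returns the first frame of a 2**order-page block;
    free_block() merges a freed block with its buddy whenever the
    buddy is also free."""
    def __init__(self, max_order):
        self.max_order = max_order
        self.free_lists = {o: set() for o in range(max_order + 1)}
        self.free_lists[max_order].add(0)      # start with one big block

    def alloc(self, order):
        o = order
        while o <= self.max_order and not self.free_lists[o]:
            o += 1                             # search for a bigger block
        if o > self.max_order:
            return None                        # out of memory
        block = min(self.free_lists[o])
        self.free_lists[o].remove(block)
        while o > order:                       # split halves back down
            o -= 1
            self.free_lists[o].add(block + (1 << o))
        return block

    def free_block(self, block, order):
        while order < self.max_order:
            buddy = block ^ (1 << order)       # buddy differs in one bit
            if buddy not in self.free_lists[order]:
                break
            self.free_lists[order].remove(buddy)   # coalesce with buddy
            block = min(block, buddy)
            order += 1
        self.free_lists[order].add(block)
```

With 8 frames, two consecutive 2-page allocations return frames 0 and 2, and freeing both merges everything back into the original 8-page block.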

Memory mapping

The kernel supports two types of memory mapping: shared and private. A private mapping is used when a process only reads a file and does not write it; private mappings are more efficient in that case. However, any write to a privately mapped page causes the kernel to stop mapping that page from the file: the write neither changes the file on disk nor is visible to other processes accessing the file.

On-demand paging

Once an executable image has been memory-mapped into virtual memory, it can start executing. Because only the beginning of the image is physically pulled into memory, the process will soon touch regions of virtual memory that are not yet in physical memory. When a process accesses a virtual address that has no valid page table entry, the processor reports a page fault to the operating system.

The page fault describes the faulting virtual address and the type of memory access that caused the fault.

Linux must find the vm_area_struct that represents the memory area in which the fault occurred. Because searching the vm_area_struct structures is critical to handling faults efficiently, they are linked together in an AVL (Adelson-Velskii and Landis) tree. If there is no vm_area_struct for the faulting virtual address, the process has accessed an invalid address: Linux sends the process a SIGSEGV signal, and if the process has no handler for it, the process is terminated.

Linux then checks the type of fault that occurred against the types of access allowed for this virtual memory area. If the process accessed the memory in an illegal way, for example by writing to a read-only area, a memory access error signal is raised as well.

Once Linux has determined that the page fault is legitimate, it must handle it.

The file system

The most intuitive and visible part of Linux is the file system. Let's look at the principles and ideas behind the Linux file system, its system calls, and its implementation. Some of these ideas came from MULTICS and are now used by other operating systems such as Windows. Linux follows the design philosophy that small is beautiful: using only the simplest mechanisms and a handful of system calls, Linux provides a powerful and elegant file system.

Basic Concepts of Linux file system

Linux originally used the MINIX 1 file system, which supported only 14-byte file names and a maximum file size of 64 MB. The file system after MINIX 1 was ext, which allowed longer file names and larger files than MINIX 1 but was slower. Ext 2 was then developed to support long file names and large files with better performance, and it became the primary file system of Linux. Linux, however, uses a VFS layer to support multiple file systems: different file systems can be dynamically mounted into the VFS.

A file in Linux is a sequence of bytes of arbitrary length, and a file in Linux can contain arbitrary information, such as ASCII, binary, and other types of files.

For convenience, files can be organized into directories. A directory is itself stored as a file and can largely be treated as one. Directories can contain subdirectories, forming a hierarchical file system. On Linux, the root directory is /, and it usually holds several subdirectories. The / character also separates directory names: for example, /usr/cxuan means the usr directory under the root, with a subdirectory named cxuan.

Let’s introduce the directory names under the root directory of the Linux system

  • /bin, essential binary applications; contains the binaries used by all users of the system
  • /boot, boot-related files, including the boot loader
  • /dev, device files: terminals, USB devices, or any other device connected to the system
  • /etc, configuration files, startup scripts, and so on, including the configuration needed by all programs and the shell scripts that start and stop individual applications
  • /home, the home directories in which users store their personal files
  • /lib, system libraries, including the libraries needed by the binaries under /bin and /sbin
  • /lost+found, files recovered by the file system checker; you must be root to view its contents
  • /media, mount point for removable media
  • /mnt, mount point for temporarily mounted file systems
  • /opt, installation directory for optional applications
  • /proc, a special dynamic directory that exposes system information and state, including information about running processes
  • /root, the home directory of the root user
  • /sbin, essential system binaries
  • /tmp, temporary files created by the system and by users; everything in this directory is deleted when the system restarts
  • /usr, applications and files accessible to most users
  • /var, frequently changing files such as log files and databases

In Linux there are two kinds of paths. The first is the absolute path, which names a file starting from the root directory; its disadvantage is that it can be long and inconvenient. The second is the relative path, which is interpreted relative to the current working directory.

If /usr/local/books is the working directory, then shell command

cp books books-replica 

This is the relative path, and

cp /usr/local/books/books /usr/local/books/books-replica

Represents an absolute path.

It is common in Linux for one user to need another user's files, or for a file to be shared within a file tree. When two users share a file that lives in one user's directory tree, the other user must refer to it by an absolute path. If that path is long, typing it each time becomes tedious, so Linux provides a link mechanism.

As an example, here is a diagram before using links

As shown above, there are two accounts, jianshe and cxuan. If jianshe wants to use the A directory under cxuan, it must type /usr/cxuan/A; this is the picture before links are used.

After using the link, the following is an illustration

jianshe can now create a link to use the directory under cxuan.

When a directory is created, two directory entries, . and .., are created in it at the same time. The former refers to the working directory itself; the latter refers to its parent, the directory in which it resides. So, from /usr/jianshe, the directory in cxuan can be reached as ../cxuan/xxx

What does it mean for Linux file systems to be independent of disks? Normally, the file systems on different disks remain separate from each other: in Windows, to reach a file system on another disk, a path must refer to that disk explicitly.

The two file systems are on different disks and are independent of each other.

In Linux, however, there is mount support: a file system on one disk can be mounted onto a directory of another, so that the picture above becomes the following

After mounting, neither file system needs to care which disk the other is on, and the two file systems are mutually visible.

Another feature of Linux file systems is support for locking. In some applications, two or more processes may use the same file at the same time, which can lead to race conditions. One solution is locking at various granularities, so that, for example, a process modifying a single record does not have to lock the entire file.

POSIX provides a flexible locking mechanism with varying granularity that lets a process lock as little as one byte or as much as an entire file in a single indivisible operation. The locking mechanism requires the process to specify the file to be locked, the starting position, and the number of bytes to lock.

Linux provides two kinds of locks: shared locks and exclusive locks. If part of a file already holds a shared lock, placing an exclusive lock on it will fail; if part of a file is exclusively locked, no lock of any kind can be placed on it until the exclusive lock is released. For a lock to succeed, every byte of the requested region must be available.

When requesting a lock, a process must decide what should happen if the lock cannot be acquired: whether to block or not. If it chooses to block, then when the existing lock is removed the process is unblocked and acquires the lock. If it chooses not to block, the system call returns immediately with a status code saying whether the lock was acquired, and the process decides when to try again.
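A sketch of this interface through Python's fcntl module, which wraps the POSIX byte-range locking calls: a shared lock is placed on part of a scratch file and then released. LOCK_NB selects the non-blocking behavior described above, so the call fails immediately instead of blocking if the region is already locked.

```python
import fcntl
import os
import tempfile

def lock_demo():
    """Take and release a shared lock on bytes 2..5 of a scratch file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789")
        # Shared (read) lock on 4 bytes starting at offset 2;
        # LOCK_NB means fail immediately rather than block.
        fcntl.lockf(fd, fcntl.LOCK_SH | fcntl.LOCK_NB, 4, 2)
        fcntl.lockf(fd, fcntl.LOCK_UN, 4, 2)   # release the region
        return True
    finally:
        os.close(fd)
        os.unlink(path)
```

Swapping LOCK_SH for LOCK_EX would request the exclusive variant instead.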

Locking areas can be overlapped. Below we demonstrate the locking regions for three different conditions.

As shown in the figure above, process A holds a shared lock on bytes 4 through 8.

As shown above, processes A and B each hold a shared lock, with bytes 6 through 8 locked by both.

As shown in the figure above, processes A, B, and C all hold shared locks, so bytes 6 and 7 are locked three times over.

If a process now tries to place an exclusive lock on byte 6, the attempt fails and the process blocks. Since that region is locked by A, B, and C at once, the process will not acquire its lock until A, B, and C have all released theirs.

Linux file system call

Many system calls involve files and file systems. Let’s first look at system calls to individual files and then look at system calls to entire directories and files.

To create a new file, the creat call is used; note, without the e.

There is a well-known anecdote: when Ken Thompson, one of the creators of UNIX, was asked what he would do if he had the chance to redesign UNIX, he said he would spell creat as create.

The two arguments to this system call are the file name and the protection mode

fd = creat("aaa",mode);

This call creates a file named aaa and sets the file's protection bits according to mode. These bits determine which users may access the file and how.

The creat system call not only creates the file named aaa but also opens it. So that subsequent system calls can access the file, creat returns a non-negative integer called a file descriptor, the fd above.

If creat is called on an existing file, that file is truncated to length 0. The open system call can also create files when given the appropriate flags.

Let’s take a look at the main system calls, as shown in the following table

The system calls and their descriptions:

fd = creat(name, mode) Create a new file
fd = open(file, …) Open a file for reading, writing, or both
s = close(fd) Close an open file
n = read(fd, buffer, nbytes) Read data from a file into a buffer
n = write(fd, buffer, nbytes) Write data from a buffer into a file
position = lseek(fd, offset, whence) Move the file pointer
s = stat(name, &buf) Get a file's status information
s = fstat(fd, &buf) Get an open file's status information
s = pipe(&fd[0]) Create a pipe
s = fcntl(fd, …) File locking and other operations

In order to read or write a file, it must first be opened with creat or open. One parameter says whether the file is opened read-only, read-write, or write-only. The open system call also returns a file descriptor. Once finished with the file, it must be closed with the close system call. open always returns the lowest-numbered file descriptor not currently in use.
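The whole open/write/read/close round trip can be shown through the thin wrappers in Python's os module, which map one-to-one onto the system calls in the table. The file name "aaa" and mode 0o644 are just example values.

```python
import os
import tempfile

# creat-style open: create (or truncate) the file, then use the
# returned descriptor for every subsequent call.
path = os.path.join(tempfile.mkdtemp(), "aaa")
fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
os.write(fd, b"hello")           # write 5 bytes through the descriptor
os.lseek(fd, 0, os.SEEK_SET)     # rewind the file pointer to the start
data = os.read(fd, 5)            # read the same bytes back
os.close(fd)
```

Reopening the same path with O_CREAT | O_TRUNC would truncate it to length 0 again, matching the creat behavior described above.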

What is a file descriptor? A file descriptor is a number that identifies the open file in a computer’s operating system. It describes data resources and how they are accessed.

When a program asks to open a file, the kernel does the following:

  • Grants access
  • Creates an entry in the global file table
  • Provides the location of that entry to the program

File descriptors consist of unique non-negative integers, and at least one file descriptor exists for every open file on the system. File descriptors were originally used in Unix and are used by modern operating systems including Linux, macOS, and BSD.

When a process successfully opens a file, the kernel returns a file descriptor that points to an entry in the global file table. That file entry contains the file's inode information, the byte offset, the access mode, and so on. For example, see the following figure

By default, the first three file descriptors are STDIN(standard input), STDOUT(standard output), and STDERR(standard error).

The file descriptor for standard input is 0, which in terminals defaults to the user’s keyboard input

The file descriptor for standard output is 1, which in terminals defaults to the user’s screen

The file descriptor for standard error is 2, which in terminals defaults to the user's screen.

After a brief talk about file descriptors, let’s return to the discussion of file system calls.

Among the file system calls, the most heavily used are read and write. Both take three parameters:

  • File descriptor: says which open file to read or write
  • Buffer address: says where the data should be read from or written to
  • Count: says how many bytes to transfer

That is all. It is a very simple, lightweight design.

While almost all programs read and write files sequentially, some need to access an arbitrary part of a file at random. Associated with each file is a pointer indicating the current position in the file. When reading (or writing) sequentially, it points to the next byte to be read (written). For example, if the pointer is at position 4096 before 1024 bytes are read, it will be at position 5120 after a successful read system call.

The lseek system call changes the value of the file-position pointer, so that subsequent calls to read or write can begin anywhere in the file, even beyond the end.

The call is named lseek rather than seek because seek was already used for searching on earlier 16-bit computers.

lseek takes three arguments: the first is the file descriptor, the second is the file position, and the third says whether that position is relative to the beginning of the file, the current position, or the end of the file

lseek(int fildes, off_t offset, int whence);

The return value of lseek is the absolute position in the file after the file pointer has been changed. lseek is the one system call that never causes an actual disk seek; it merely updates the current file position, a number in memory.
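All three whence values, and the fact that seeking past end-of-file is allowed, can be demonstrated through os.lseek, which wraps the system call directly:

```python
import os
import tempfile

# The three whence values of lseek: SEEK_SET (from the start),
# SEEK_CUR (from the current position), SEEK_END (from the end).
fd, path = tempfile.mkstemp()
os.write(fd, b"abcdefgh")                 # file is now 8 bytes long
pos = os.lseek(fd, 2, os.SEEK_SET)        # absolute: position 2
pos = os.lseek(fd, 3, os.SEEK_CUR)        # relative: 2 + 3 = 5
end = os.lseek(fd, 0, os.SEEK_END)        # end of file: 8
past = os.lseek(fd, 100, os.SEEK_END)     # beyond EOF is legal: 108
os.close(fd)
```

None of these calls touch the disk; each just returns the new absolute position.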

For each file, Linux keeps track of the file's mode (regular, directory, special file), size, last-modified time, and other information. A program can see this information through the stat system call. The first argument is the file name; the second is a pointer to the structure in which the requested information is placed. The fields of that structure are shown below.

Device the file is stored on
I-node number
File mode (including protection bit information)
Number of links to the file
Identity of the file's owner
Group the file belongs to
File size (bytes)
Creation time
Time of last modification/access

The fstat call is the same as stat, except that fstat can operate on open files, while stat can only operate on paths.
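That difference is easy to see through os.stat and os.fstat: one takes a path, the other an open descriptor, and both return the same i-node information for the same file.

```python
import os
import stat
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"12345")
by_path = os.stat(path)      # stat: look the file up by name
by_fd = os.fstat(fd)         # fstat: use the open descriptor
os.close(fd)
```

The returned structure carries the fields listed above: st_size, st_mode, st_ino, st_nlink, and so on.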

The pipe system call is used to create shell pipelines. It creates a kind of pseudofile to buffer data between the pipeline's components, and it returns file descriptors for reading and writing the buffer. In a pipeline such as

sort <in | head -40

the sort process writes to file descriptor 1 (standard output, redirected into the pipe), and the head process reads from the pipe. sort simply reads from file descriptor 0 and writes to file descriptor 1, never even knowing they have been redirected. Without the redirection, sort would read from the keyboard and print to the screen.
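The underlying mechanism can be shown by itself with os.pipe, which returns the pair of descriptors the shell would wire into the two processes:

```python
import os

# pipe() returns a read descriptor and a write descriptor for an
# in-kernel buffer: the same mechanism the shell uses to connect
# sort's standard output to head's standard input.
r, w = os.pipe()
os.write(w, b"sorted lines\n")
os.close(w)                 # closing the write end gives the reader EOF
out = os.read(r, 100)
os.close(r)
```

In a real pipeline, the shell would additionally fork the two processes and duplicate w onto descriptor 1 of one and r onto descriptor 0 of the other.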

The last system call in the table is fcntl, which locks and unlocks files, applies shared or exclusive locks, and performs a few other file-related operations.

Instead of focusing on individual files, let’s focus on the overall directory and file system-related system calls listed below.

The system calls and their descriptions:

s = mkdir(path, mode) Create a new directory
s = rmdir(path) Remove a directory
s = link(oldpath, newpath) Create a link to an existing file
s = unlink(path) Unlink a file
s = chdir(path) Change the working directory
dir = opendir(path) Open a directory for reading
s = closedir(dir) Close a directory
dirent = readdir(dir) Read one directory entry
rewinddir(dir) Rewind a directory so it can be read again from the start

You can use mkdir and rmdir to create and delete directories. Note that you can delete the directory only when the directory is empty.

Creating a link to an existing file creates a new directory entry for it. The link system call takes oldpath, the existing path, and newpath, the new name to be linked to it; unlink removes a directory entry. The file itself is deleted automatically when its last link is removed.

The working directory can be changed using the chdir system call.

The last four system calls are for reading directories. Like ordinary files, directories can be opened, closed, and read. Each call to readdir returns one directory entry in a fixed format. Users cannot write to a directory directly, but they can create entries in it with creat or link and remove entries with unlink. A user cannot seek to a specific file within a directory, but rewinddir applied to an open directory lets it be read again from the beginning.
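These directory operations map straight onto the os module. The scratch names base, "d", "f", and "f2" below are arbitrary:

```python
import os
import tempfile

base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "d"))              # mkdir: new directory
open(os.path.join(base, "f"), "w").close()     # create an empty file
os.link(os.path.join(base, "f"),
        os.path.join(base, "f2"))              # link: second name, same i-node
entries = sorted(os.listdir(base))             # readdir-style listing
os.unlink(os.path.join(base, "f2"))            # drop one link to the file
os.rmdir(os.path.join(base, "d"))              # works only because "d" is empty
```

After the unlink, the file still exists under the name "f"; only when that last link is removed too would the file itself be deleted.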

Implementation of Linux file system

Let's focus on the virtual file system. The VFS hides from higher-level processes and applications the differences between all the file systems Linux supports, and whether a file system is stored on a local device or must be accessed over the network. Devices and other special files are also associated with the VFS layer. Next we will look at the first widely used Linux file system, ext2, and then at the improvements made in ext4. A variety of other file systems are also in use; all Linux systems can handle multiple disk partitions, each with a different file system on it.

Linux virtual file system

In order for applications to interact with different types of file systems on local and remote devices, Linux supports many file systems: disk-based ones such as ext3 and ext4, memory-based ones such as ramfs and tmpfs, network-based ones such as NFS, and user-space ones through FUSE. FUSE, strictly speaking, is not a complete file system but a module that lets a file system implementation run in user mode while satisfying the kernel's file system interface. Linux places an abstraction layer over all of these file systems, called the VFS, the virtual file system.

The following table summarizes the four main file system structures supported by VFS.

The objects and their descriptions:

Superblock A specific mounted file system
Dentry A directory entry, one component of a path
I-node A specific file
File An open file associated with a process

Superblocks contain important information about the layout of a file system, and if they are broken, the entire file system becomes unreadable.

The i-node index node contains the descriptor for each file.

In Linux, directories and devices are also represented as files because they have corresponding I-Nodes

The superblock and i-nodes have corresponding structures on disk in the underlying file system.

To facilitate directory operations and path traversal for paths such as /usr/local/cxuan, the VFS supports a dentry data structure that represents a directory entry. The dentry structure has many fields (books.gigatux.nl/mirror/kern…).

Directory entries are cached in the dentry cache (dentry_cache). For example, the cache holds entries for /usr, /usr/local, and so on. If multiple processes access the same file through hard links, their file objects point to the same entry in this cache.

Finally, the file data structure is the in-memory representation of an open file. It is created by the open system call and supports read, write, sendfile, lock, and the other system calls described earlier.

A real file system implemented beneath the VFS need not use exactly the same abstractions and operations internally. However, it must implement file system operations that are semantically identical to those specified by the VFS objects. The elements of the operations data structure of each of the four VFS objects are pointers to the corresponding functionality in the underlying file system.

Linux Ext2 file system

Now let’s look at one of the most popular on-disk file systems in Linux: ext2. The first Linux release used the MINIX 1 file system, which was limited to short file names and a maximum file system size of 64 MB. The MINIX 1 file system was eventually replaced by its extension, ext, which allowed longer file names and larger files. Because of its poor performance, ext was in turn replaced by ext2, which is still in wide use.

An ext2 Linux disk partition contains a file system, which is laid out as follows

The boot block, block 0, is not used by Linux itself; it contains code to load and boot the computer. After block 0, the disk partition is divided into block groups, irrespective of where the disk cylinder boundaries fall.

The first block of each group is the superblock. It contains information about the layout of the file system, including the number of i-nodes, the number of disk blocks, and the start of the list of free disk blocks. Next comes the group descriptor, which holds the location of the bitmaps, the number of free blocks and i-nodes in the group, and the number of directories in the group. This information is important because ext2 spreads directories evenly across the disk.

The two bitmaps in the figure record free blocks and free i-nodes, respectively, a choice inherited from the MINIX 1 file system (most UNIX file systems use bitmaps instead of free lists). Each bitmap is one block long. With 1-KB blocks, this design limits a block group to 8192 blocks and 8192 i-nodes. This is a strict limit on the number of blocks per group; with 4-KB blocks, the limit grows fourfold.

After the superblock come the i-nodes themselves, numbered from 1 up to some maximum. Each i-node is 128 bytes long and describes exactly one file. An i-node contains accounting information (including everything returned by the stat system call, which simply takes it from the i-node) as well as enough information to locate all the disk blocks that hold the file's data.

Following the i-nodes are the data blocks, where all the files and directories are stored. If a file or directory occupies more than one block, those blocks need not be contiguous on disk. In fact, the blocks of a large file are likely to be scattered all over the disk.

I-nodes for directories are scattered across the whole set of block groups. Space permitting, ext2 places ordinary files in the same block group as their parent directory, and data blocks in the same block group as the file's own i-node. The bitmaps make it quick to find where to allocate new file system data. When new file blocks are allocated, ext2 also preallocates several additional blocks for the file, which reduces fragmentation when future writes extend it. Thanks to these policies, the file system performs well even as the disk fills up, without needing to sort or defragment files.

To access a file, a process first uses a Linux system call such as open, which requires the file's path. A path can be relative or absolute: with a relative path, the search starts from the current directory; with an absolute path, it starts from the root directory.

File names in a directory may be up to 255 characters long; the layout of a directory is shown below.

Each directory consists of an integer number of disk blocks so that the directory can be written to the disk as a whole. In a directory, the entries of files and subdirectories are unsorted and are placed next to each other. Directory entries cannot span disk blocks, so there are usually unused bytes at the end of each disk block.

Each directory entry in the figure above consists of four fixed-length fields and one variable-length field. The first field is the i-node number: 19 for the file first, 42 for the file second, and 88 for the directory third. Next comes the rec_len field, giving the size of the entry in bytes. Since names are padded out to a boundary by an unknown amount, rec_len is needed to locate the start of the next entry; that is what the arrows in the figure indicate. Then comes the type field: F for a file, D for a directory. Next is the name length (here 5, 6, and 5), and the entry ends with the file name itself.

How does rec_len come into play when an entry is removed? As shown in the figure below

As you can see, the middle entry, second, has been removed, and its space has become padding for the first entry, reflected in an enlarged rec_len. That padding can, of course, later be reused for a new entry.
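The rec_len walk can be sketched with a toy entry format (the field sizes here are simplified, not the exact on-disk ext2 layout):

```python
import struct

# Toy entry: i-node number (4 bytes), rec_len (2), name length (2), then name.
def pack_entry(inode, name, rec_len):
    header = struct.pack("<IHH", inode, rec_len, len(name))
    padding = b"\x00" * (rec_len - 8 - len(name))
    return header + name.encode() + padding

def walk(block):
    """Yield (inode, name) for every live entry, stepping by rec_len."""
    off = 0
    entries = []
    while off < len(block):
        inode, rec_len, name_len = struct.unpack_from("<IHH", block, off)
        name = block[off + 8 : off + 8 + name_len].decode()
        if inode != 0:          # an i-node number of 0 marks a dead entry
            entries.append((inode, name))
        off += rec_len          # rec_len jumps over the padding too
    return entries

# "second" was deleted: its space was folded into first's rec_len (32 bytes).
block = pack_entry(19, "first", 32) + pack_entry(88, "third", 16)
print(walk(block))  # → [(19, 'first'), (88, 'third')]
```

The deleted entry is simply skipped because the enlarged rec_len of the first entry carries the walk straight past its old bytes.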

Because directories are searched linearly, it can take a long time to find an entry near the end of a large directory. Therefore, the system maintains a cache of recently accessed directories, looked up by file name; a cache hit avoids a costly linear search. A dentry object is entered into the directory cache for each component of the path, and the directory entry for each subsequent path component is found via its i-node, until the i-node of the actual file is reached.

For example, to use the absolute path to find a file, we temporarily set the path to /usr/local/file. The steps are as follows:

  • First, the system locates the root directory, which usually uses i-node 2 (i-node 1 is the bad-block i-node on ext2/3/4 file systems). The system places an entry in the dentry cache for future lookups of the root directory.
  • Then it looks up the string usr in the root directory to obtain the i-node number of the /usr directory, which also goes into the dentry cache. That i-node is fetched and its disk blocks extracted, so the /usr directory can be read and searched for the string local. Once that entry is found, the i-node number of /usr/local is available. With it, the i-node of /usr/local can be read and the disk blocks of that directory located. Finally, file is looked up in /usr/local and its i-node number determined.

If the file exists, the system extracts its i-node number and uses it as an index into the i-node table to locate the corresponding i-node and bring it into memory. The i-node is kept in the i-node table, a kernel data structure that holds the i-nodes of all currently open files and directories. Below are the i-node fields used by Linux file systems.

| Attribute | Bytes | Description |
| --- | --- | --- |
| Mode | 2 | File type, protection bits, setuid and setgid bits |
| Nlinks | 2 | Number of directory entries pointing to the i-node |
| Uid | 2 | UID of the file owner |
| Gid | 2 | GID of the file owner |
| Size | 4 | File size in bytes |
| Addr | 60 | Addresses of the 12 direct disk blocks and the 3 indirect blocks |
| Gen | 1 | Generation number, incremented each time the i-node is reused |
| Atime | 4 | Time the file was last accessed |
| Mtime | 4 | Time the file was last modified |
| Ctime | 4 | Time the i-node was last changed |
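Many of these attributes are exactly what the stat call returns; a quick sketch using Python's os.stat wrapper:

```python
import os
import stat
import tempfile

# Create a small file, then read its i-node attributes via stat.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

st = os.stat(path)                     # stat takes these fields from the i-node
is_regular = stat.S_ISREG(st.st_mode)  # Mode: file type and protection bits
print(is_regular)                      # → True
print(st.st_nlink)                     # Nlinks → 1
print(st.st_size)                      # Size in bytes → 5
print(st.st_ino > 0)                   # the i-node number itself → True
os.unlink(path)
```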

Now let’s talk about the file reading process. Remember how the read function is called?

```c
n = read(fd, buffer, nbytes);
```

When the kernel takes over, all it has is these three parameters and its internal tables about the user. One item in those internal tables is the file descriptor array, which is indexed by file descriptor and holds one entry for each open file.

Files are associated with i-node numbers. So how is a file's i-node found from a file descriptor?

The design used here inserts a new table, the open file description table, between the file descriptor table and the i-node table. The file's read/write position lives in the open file description table, as shown in the following figure

Consider a shell and its two children, P1 and P2. The shell first spawns P1, whose data structures are a copy of the shell's, so both point to the same open file description entry. When P1 finishes, the shell's file descriptor still points to the open file description holding the position P1 reached in the file. The shell then spawns P2, and the new child automatically inherits the file's read/write position, even though neither P2 nor the shell knows exactly what that position is.

When an unrelated process opens a file, it will get its own open file descriptor entry and its own file read/write location, which is what we need.

Thus, shared open file descriptions give related processes a common read/write position, while unrelated processes each get their own private position.
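The sharing of a read/write position through one open file description can be observed with dup, which creates a second descriptor for the same description (a sketch; fork shows the same effect across processes):

```python
import os
import tempfile

fd1, path = tempfile.mkstemp()
os.write(fd1, b"abcdef")
os.lseek(fd1, 0, os.SEEK_SET)

# dup gives a second descriptor for the SAME open file description,
# so fd1 and fd2 share one read/write position.
fd2 = os.dup(fd1)
a = os.read(fd1, 3)        # → b'abc'
b = os.read(fd2, 3)        # continues where fd1 left off → b'def'

# A separate open creates its OWN open file description and position.
fd3 = os.open(path, os.O_RDONLY)
c = os.read(fd3, 3)        # starts from the beginning again → b'abc'

print(a, b, c)
for fd in (fd1, fd2, fd3):
    os.close(fd)
os.unlink(path)
```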

An i-node also contains the disk addresses of three indirect blocks (single, double, and triple indirect). Each level of indirection points to further blocks of disk addresses, allowing files of widely varying sizes to be represented.
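A back-of-the-envelope calculation, assuming 1-KB blocks and 4-byte block addresses, shows how the 12 direct blocks plus the single, double, and triple indirect blocks bound the maximum file size:

```python
block = 1024           # block size in bytes (assumed)
ptrs = block // 4      # 4-byte block addresses → 256 pointers per block

direct = 12 * block            # 12 direct blocks in the i-node
single = ptrs * block          # 1 block full of pointers
double = ptrs ** 2 * block     # pointers to pointer blocks
triple = ptrs ** 3 * block     # three levels of indirection

max_size = direct + single + double + triple
print(max_size)  # → 17247252480 (about 16 GB) under these assumptions
```

With 4-KB blocks both the block size and the pointers per block grow, so the limit rises dramatically.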

Linux Ext4 file system

To prevent data loss after system crashes and power failures, the ext2 file system would have to write each block out to disk as soon as it was created, and the latency incurred by the required disk-head seeks would be intolerable. To enhance the robustness of the file system, Linux relies on journaling file systems: ext3 is a journaling file system that improves on ext2, and ext4 is in turn an improvement on ext3, also a journaling file system. Ext4 changed ext3's block addressing scheme to support larger files and larger file system sizes. Let's describe the features of the ext4 file system.

The basic function of a journaling file system is to keep a journal, which records all file system operations in order. By writing out changes to file system data or metadata sequentially, the operations avoid the overhead of disk-head movement during disk access. Eventually, the changes are written out and committed to their proper disk locations. If the system crashes before a change is committed to disk, then during reboot the system detects that the file system was not unmounted cleanly, traverses the journal, and applies the logged records to bring the file system up to date.

The ext4 file system was designed to be highly compatible with ext2 and ext3, even though it changes the kernel data structures and disk layout. Nevertheless, an ext2 file system can be mounted as an ext4 file system and gain journaling in the process.

The journal is a file managed as a circular buffer; it can be stored on the same device as the main file system or on a different one. Reads and writes of the journal are handled by a separate layer, the JBD (Journaling Block Device).

JBD has three main data structures: log records, atomic operations, and transactions. A log record describes a low-level file system operation, typically a change within one block. Because a system call such as write touches several places (i-nodes, existing blocks, new blocks, the free list, and so on), related log records are grouped into atomic operations. Ext4 notifies JBD of the start and end of each system call so that JBD can ensure either all of an atomic operation's records are applied or none of them are. Finally, mainly for efficiency, JBD treats a collection of atomic operations as a transaction; the log records of a transaction are stored consecutively and can be discarded only after all the changes have been applied to disk together.

Because journaling every disk change is expensive, ext4 can be configured to journal all disk changes or only those involving file system metadata. Journaling only metadata reduces overhead and improves performance but does not guarantee that file data will not be corrupted. Several other journaling file systems, such as SGI's XFS, also journal only metadata operations.

/proc file system

Another Linux file system is the /proc (process) file system

The main idea comes from version 8 of UNIX, developed by Bell Labs, and was later adopted by BSD and System V.

However, Linux expands on this idea in several ways. The basic concept is to create a directory in /proc for each process on the system. The name of the directory is the process PID, expressed in decimal. For example, /proc/1024 is a directory with process number 1024. In this directory are files related to process information, such as process command lines, environment variables, and signal masks. In fact, these files do not exist on disk. When this information is needed, the system reads it from the process on demand and returns it to the user in a standard format.

Many Linux extensions relate to other files and directories in /proc. They contain all kinds of information about CPUs, disk partitions, devices, interrupt vectors, kernel counters, file systems, loaded modules, and so on. Unprivileged users can read much of this information, giving them a safe way to learn about the system.
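For example, a process can inspect its own /proc entry (a Linux-specific sketch; the exact files and fields vary by kernel version):

```python
import os

pid = os.getpid()

# Every process has a /proc directory named after its PID.
has_proc_dir = os.path.isdir(f"/proc/{pid}")
print(has_proc_dir)  # → True on Linux

# /proc/self is a symlink to the current process's own directory.
with open("/proc/self/status") as f:
    first_line = f.readline()
print(first_line.startswith("Name:"))  # → True
```

None of these files exist on disk; the kernel synthesizes their contents on each read.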

NFS Network file system

Networking has played an important role in Linux since its inception. Here we discuss the Network File System (NFS), whose role in modern Linux is to link the file systems on different computers into one logical whole.

The NFS architecture

The basic idea behind NFS is to allow arbitrarily selected clients and servers to share a common file system. In many cases, all clients and servers will be shared within the same Local Area Network (LAN), but this is not required. It may also be the case that the client and server can run on a WAN if they are far apart. The client can be a server, and the server can be a client, but for simplicity, we’re talking about clients consuming services, and servers providing services.

Each NFS server exports one or more directories for access by remote clients. When a directory is made available, so are all of its subdirectories, so entire directory trees are usually exported as a unit. The list of exported directories is kept in the file /etc/exports, so the directories can be exported automatically when the server boots. Clients access exported directories by mounting them; a mounted remote directory becomes part of the client's directory hierarchy, as shown in the figure below.
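A minimal sketch of what an export entry and the corresponding client-side mount might look like (the host names and options here are hypothetical examples):

```
# /etc/exports on the server: directory, then the clients allowed to mount it
/usr/local   clientA(ro)   clientB(rw)

# On a client, the exported tree is then mounted into the local hierarchy:
#   mount -t nfs server:/usr/local /usr/local
```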

In this example, client 1 has mounted the server's bin directory, so it can now access /bin/cat or any other file in that directory using the shell. Similarly, client 1 can also mount a directory from server 2 and access /usr/local/projects/proj1 or other files there. Client 2 mounts the same directory from server 2 at /mnt/projects/proj2.

As you can see from the above, the same file has different access paths and different names on different clients because different clients mount the file to different locations in their respective directory trees. Mount points are typically local to the client, and the server is unaware of the existence of any of them.

NFS protocol

Since one of NFS's goals is to support heterogeneous systems, clients and servers may run different operating systems on different hardware, so the interface between server and client must be carefully defined. Only then can anyone writing a new client expect it to work with existing servers, and vice versa.

NFS achieves this goal by defining two client-server protocols. A protocol is a set of requests sent by clients to servers, together with the replies the servers send back.

The first NFS protocol handles mounting. A client can send a path name to a server and ask permission to mount that directory somewhere in its own directory hierarchy. Because the server does not care where it is mounted, the request does not contain the mount location. If the path name is valid and the specified directory has been exported, the server returns a file handle to the client.

The file handle contains fields uniquely identifying the file system type, the disk, the i-node number of the directory, and security information.

Subsequent calls that read or write files in the mounted directory or any of its subdirectories use this file handle.

When Linux boots, the shell script /etc/rc is run before any users log in. Commands to mount remote file systems can be placed in this script, so the necessary remote file systems are mounted automatically before anyone may log in. Alternatively, most Linux versions support automounting, which associates a set of remote directories with a local directory.

Compared with manual mounting to /etc/rc, automatic mounting has the following advantages

  • If one of the NFS servers listed in /etc/rc happens to be down, the client may fail to boot, or boot only with difficulty, delay, or error messages; and if the client never needs that server, the effort of mounting it is wasted.
  • Allowing clients to try a set of servers in parallel can achieve a degree of fault tolerance and improve performance.

On the other hand, automounting implicitly assumes that all the alternative file systems are identical. Since NFS provides no support for file or directory replication, users must arrange this themselves. Consequently, automounting is mostly used for read-only file systems containing binaries and other files that rarely change.

The second NFS protocol is for file and directory access. Clients manipulate directories and read and write files by sending messages to the server; they can also access file attributes such as the file mode, size, and time of last modification. NFS supports most Linux system calls, with the notable exceptions of open and close.

Omitting open and close is not an oversight but a deliberate design: it makes it unnecessary to open a file before reading it or to close it when you are done.

NFS uses the standard UNIX protection mechanism, with rwx bits for the owner, group, and others. Originally, each request message simply carried the caller's user ID and group ID, which NFS validated; in effect, it trusted clients not to cheat. Nowadays, public-key cryptography can be used to establish a secure key for authenticating the client and server on every request and reply.

NFS implementations

Even though the client and server code implementations are independent of the NFS protocol, most Linux systems use the three-layer implementation shown in the figure below. The top layer is the system call layer, which handles calls such as open, read, and close. The second layer, the virtual file system (VFS) layer, is invoked after the call has been parsed and its parameters checked.

The task of the VFS layer is to maintain a table with an entry for every open file. The VFS layer keeps a virtual i-node, or v-node, for each open file; the v-node indicates whether the file is local or remote. For remote files, the v-node holds enough information for the client to access them. For local files, it records the file system and the file's i-node, since modern operating systems support multiple file systems. Although the VFS was created to support NFS, modern operating systems use a VFS whether or not they use NFS.

Linux IO

We’ve looked at Linux processes and threads and Linux memory management, so let’s take a look at I/O management in Linux.

In Linux, as in other UNIX systems, I/O management is straightforward and simple: all I/O devices are treated as files, which are read and written internally with the same read and write calls used for regular files.

Basic Concepts of Linux IO

Linux also has I/O devices such as disks, printers, and networks, which Linux incorporates into the file system as a special file, usually located in the /dev directory. These special files can be treated in the same way as regular files.

Special files generally fall into two categories:

A block special file represents a device that stores fixed-size blocks of information and supports reads and (optionally) writes in fixed-size blocks, sectors, or clusters. Each block has its own physical address. Block sizes typically range from 512 to 65,536 bytes, and all transfers are in units of one or more whole blocks. The defining property of a block device is that each block can be addressed and read or written independently of the others. Common block devices include hard disks, Blu-ray discs, and USB flash drives. Block devices usually require fewer pins than character devices.

A drawback of block special files is that, for a given kind of solid-state memory, block access is slower than byte addressing of the same memory, because every read or write must start at the beginning of a block. To read any part of a block, the whole block must be located and read, and the unused portion discarded. To write part of a block, the whole block must be located and read into memory, the data modified, and the whole block written back to the device.

The other class of I/O devices is character special files. A character device sends or receives a stream of characters one character at a time, without regard to any block structure; it is not addressable and has no seek operation. Common character devices include printers, network interfaces, mice, and most other devices that are not disk-like.

Every special file is associated with a device driver. Each driver is identified by a major device number; if a driver supports multiple devices, each device also gets a minor device number. Together, the major and minor device numbers uniquely identify a device and its driver.
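The device numbers can be read from a special file's i-node; for example, for /dev/null (a sketch using Python's os.stat wrapper, Linux-specific):

```python
import os
import stat

# /dev/null is a character special file; its i-node stores the device numbers.
st = os.stat("/dev/null")
is_char = stat.S_ISCHR(st.st_mode)
print(is_char)  # → True

# st_rdev packs the major and minor numbers that identify the driver.
print(os.major(st.st_rdev), os.minor(st.st_rdev))
```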

As we know, in a computer system the CPU does not interact with devices directly; a component called the device controller sits in between. The hard disk has a disk controller, a USB device has a USB controller, and the monitor has a video controller. These controllers act as agents: they know how to drive the hard drive, mouse, keyboard, or monitor.

Most character special files cannot be accessed randomly, and they need to be controlled in ways that block special files do not. For example, when you mistype a character on the keyboard, some users prefer backspace to delete it and others prefer del; to interrupt a running program, some systems used Ctrl-U, but Ctrl-C is now common.

Network

Another I/O concept is the network, also introduced by UNIX; its key abstraction is the socket. A socket lets a user attach to the network, just as a mailbox lets a user attach to the postal system, as shown in the diagram below

Sockets sit as shown in the figure above and can be created and destroyed dynamically. Creating a socket returns a file descriptor, which is used for establishing a connection, reading data, writing data, and tearing the connection down. Each socket supports a particular type of networking, specified when the socket is created. The most commonly used types are

  • Reliable connection-oriented byte stream
  • Reliable connection-oriented packets
  • Unreliable packet transmission

A reliable connection-oriented byte stream establishes, between two machines, the equivalent of a pipe: bytes arrive in the order they were sent, and the system guarantees that all bytes sent will arrive.

The second type is similar to the first, except that it preserves packet boundaries. If the sender performs three writes, a receiver using the first type may receive all the bytes at once, while a receiver using the second type gets them as three separate messages. The third type, unreliable packet transmission, is also available; its advantage is higher performance, which sometimes matters more than reliability (in streaming media, for example, performance is especially important).

The above corresponds to two transport protocols, namely TCP and UDP. TCP, the Transmission Control Protocol, carries a reliable byte stream; UDP, the User Datagram Protocol, carries only unreliable datagrams. Both belong to the TCP/IP protocol suite. The network protocol layering is shown below

TCP and UDP are both layered on top of the network layer, which runs IP, the Internet Protocol.

Once sockets have been created on the source and destination machines, a connection can be established between them. One party issues a listen system call on its local socket, which creates a buffer and blocks until data arrives; the other party issues a connect system call. If the listening party accepts the connection, the system establishes a connection between the two sockets.

Once a socket connection is established, it acts as a conduit from which a process can read and write data using the file descriptor of the local socket and close the connection when it is no longer needed using the close system call.
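The listen/connect/accept sequence can be sketched in a single process over the loopback interface (a minimal sketch, not production networking code):

```python
import socket

# Server side: create a socket for a reliable byte stream, bind, and listen.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))   # port 0 lets the kernel pick a free port
srv.listen(1)

# Client side: connect to the listening socket.
cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(srv.getsockname())
conn, _ = srv.accept()

# The connected pair now behaves like a pipe: write one end, read the other.
cli.sendall(b"hello")
data = b""
while len(data) < 5:
    data += conn.recv(5 - len(data))
print(data)  # → b'hello'

for s in (cli, conn, srv):
    s.close()
```

SOCK_STREAM selects the reliable byte-stream type (TCP); SOCK_DGRAM would select unreliable datagrams (UDP).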

Linux I/O system call

Every I/O device in a Linux system has a special file associated with it. What is a special file?

In an operating system, a special file is a file in the file system that is associated with a hardware device; special files are also called device files. The purpose of a special file is to expose a device as a file, providing an interface through which hardware can be accessed with ordinary file I/O tools. Because there are two classes of devices, there are also two classes of special files, namely character special files and block special files

For most I/O operations, using the appropriate file is enough and no special system calls are required. Sometimes, however, device-specific processing is needed. Before POSIX, most UNIX systems had a system call named ioctl that performed a large number of device-specific operations. Over time, POSIX split much of ioctl's functionality for terminal devices into separate function calls, which are now separate system calls.

Here are the system calls for a few management terminals

| System call | Description |
| --- | --- |
| tcgetattr | Get the attributes |
| tcsetattr | Set the attributes |
| cfgetispeed | Get the input speed |
| cfgetospeed | Get the output speed |
| cfsetispeed | Set the input speed |
| cfsetospeed | Set the output speed |

Linux IO implementation

I/O in Linux is implemented by a collection of device drivers, one per device type. Drivers present a standard interface to the rest of the operating system, masking the differences between the operating system and the hardware.

When a user accesses a special file, the file system determines the special file's major and minor device numbers and whether it is a block special file or a character special file. The major device number identifies the driver for the character or block device, and the minor device number is passed as a parameter to select the specific device.

Each driver has two parts, both of which belong to the Linux kernel and run in kernel mode. The top half runs in the context of the caller and interfaces with the rest of Linux; the bottom half runs in kernel context and interacts with the device. Drivers may invoke kernel procedures for memory allocation, timer management, DMA control, and so on. The set of kernel functions they may call is defined in the driver-kernel interface.

I/O implementation refers to the implementation of both character devices and block devices

Block device implementation

The goal of the part of the system that handles block special file I/O is to keep the number of disk transfers as small as possible. To this end, Linux places a cache between the disk drivers and the file system, as shown in the figure below

Prior to kernel 2.2, Linux maintained two separate caches, a page cache and a buffer cache, so a file stored in one disk block could end up in both. Since version 2.2, the kernel has a single unified cache, and a generic block layer ties things together, performing the necessary conversions among disks, data blocks, buffers, and data pages. So what is the generic block layer?

The generic block layer is the part of the kernel that handles requests for all block devices in the system. Its main functions are the following

  • Placing data buffers high in memory, so that a page is mapped into the kernel's linear address space only while the CPU is accessing its data, and unmapped afterwards
  • Implementing a zero-copy mechanism, so that disk data can be placed directly into user-mode address space without first being copied into kernel memory
  • Managing disk volumes, so that multiple partitions on different block devices can be treated as a single partition
  • Exploiting the advanced features of modern disk controllers, such as DMA

The cache is a powerful tool to improve system performance. No matter what purpose a data block is needed, the system searches for the data block in the cache first. If the data block is found, the system returns it directly, avoiding a disk access and greatly improving system performance.

If the page cache does not contain the block, the operating system fetches it from disk into memory and adds it to the cache for later accesses.

The cache serves writes as well as reads. When a program writes a block, the block goes to the cache first, not to disk; it is written out to disk once the number of dirty blocks in the cache reaches a certain threshold.
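The lookup-first, write-back behavior can be sketched with a toy cache (a sketch, not the kernel's actual page cache):

```python
class BlockCache:
    """Toy write-back cache: look up first, write to cache, flush later."""

    def __init__(self, read_from_disk):
        self.read_from_disk = read_from_disk
        self.blocks = {}       # block number -> data
        self.dirty = set()     # written but not yet flushed to disk

    def read(self, blockno):
        if blockno in self.blocks:            # hit: no disk access needed
            return self.blocks[blockno]
        data = self.read_from_disk(blockno)   # miss: fetch, then cache
        self.blocks[blockno] = data
        return data

    def write(self, blockno, data):
        self.blocks[blockno] = data           # goes to the cache, not disk
        self.dirty.add(blockno)

    def flush(self):
        flushed = sorted(self.dirty)          # hand these to the disk driver
        self.dirty.clear()
        return flushed

disk_reads = []
cache = BlockCache(lambda b: disk_reads.append(b) or f"block-{b}")
cache.read(7)
cache.read(7)                 # second read is served from the cache
cache.write(3, "new data")
flushed = cache.flush()
print(disk_reads, flushed)    # → [7] [3]
```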

Linux uses I/O schedulers to reduce repetitive head movement and the time it wastes. An I/O scheduler sorts the read and write operations on a block device and merges adjacent requests. Linux has many scheduler variants to meet different workload needs. The most basic is the traditional Linux elevator scheduler, which sorts requests by disk sector address and keeps them in a doubly linked list; new requests are inserted into the list in sorted order. This effectively prevents repeated head movement, but the elevator scheduler is prone to starvation. Linux therefore modified it to also maintain two lists of read and write operations sorted by deadline: by default, reads expire after 0.5 seconds and writes after 5 seconds. If the request at the head of a deadline list has waited past its deadline, it is serviced first.
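Stripped to its essentials, the elevator idea is just keeping pending requests sorted by sector and merging adjacent ones; a toy sketch:

```python
import bisect

class Elevator:
    """Toy elevator scheduler: sorted insertion plus back-merging."""

    def __init__(self):
        self.queue = []   # pending (start_sector, length), sorted by start

    def add(self, start, length):
        # Merge with a request that ends exactly where this one begins.
        for i, (s, n) in enumerate(self.queue):
            if s + n == start:
                self.queue[i] = (s, n + length)
                return
        bisect.insort(self.queue, (start, length))  # else insert in order

    def dispatch(self):
        """Serve all requests in ascending sector order (one elevator sweep)."""
        batch, self.queue = self.queue, []
        return batch

e = Elevator()
for start, length in [(900, 8), (100, 8), (108, 8), (500, 8)]:
    e.add(start, length)
order = e.dispatch()
print(order)  # → [(100, 16), (500, 8), (900, 8)]
```

The request at sector 108 was merged into the one at 100, and the head then sweeps 100 → 500 → 900 instead of jumping back and forth.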

Character device implementation

Interacting with character devices is relatively simple. Because character devices produce and consume streams of bytes, support for random access makes little sense. One exception is the use of line disciplines. A line discipline can be attached to a terminal device, represented by the tty_struct structure; it is the interpreter for the data exchanged with the terminal device and is, of course, part of the kernel. For example, a line discipline can edit input lines, map carriage returns to line feeds, and perform other transformations.

What is a line discipline?

A line discipline is a layer in some UNIX-like systems. A terminal subsystem typically consists of three layers: an upper layer that provides a character-device interface, a lower layer of hardware drivers that talk to the hardware or to a pseudo-terminal, and a middle layer of line disciplines that implement behavior common to terminal devices.
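To give a flavor of what the middle layer does, here is a toy "line discipline" that performs one of the behaviors mentioned above, mapping carriage returns to line feeds. It is purely illustrative; real line disciplines live in the kernel's tty layer and do far more.

```c
#include <stddef.h>

/* Toy input processing: translate '\r' to '\n' in place,
 * the ICRNL-style mapping a terminal line discipline performs. */
static void ldisc_process(char *buf, size_t len) {
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\r')
            buf[i] = '\n';
}
```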

Network device implementation

Network devices interact differently. Although they also produce streams of characters, their asynchronous nature makes it difficult to integrate them with other character devices behind the same interface. Network device drivers produce packets of data that reach user applications through the network protocol stack.

Modules in Linux

UNIX device drivers were traditionally linked statically into the kernel, so all of them were loaded into memory whenever the system booted. With the advent of Linux on personal computers, this model of configure-once static linking broke down. The number of I/O devices available on PCs grew by an order of magnitude relative to those available on minicomputers, and most users lack the ability to add a new device driver, relink the kernel, and reinstall it.

Linux addresses this problem with loadable modules. A loadable module is a chunk of code that is added to the kernel while the system is running.

When a module is loaded into the kernel, several things happen. First, the module is dynamically relocated during loading. Second, the system checks whether the resources the module requires are available and, if so, marks them as in use. Third, the required interrupt vectors are set up. Fourth, the driver switch table is updated to handle the new major device type. Finally, the device driver is run.

Once this is done, the driver is installed. Other modern UNIX systems also support loadable modules.

Linux security

Linux, as an operating system descended from MINIX and UNIX, was a multi-user system from the beginning. This means that Linux has had built-in security and access-control mechanisms since its early days. The following sections focus on some aspects of Linux security.

Basic Concepts of Linux security

The Linux user community consists of registered users, each with a unique UID (User ID). A UID is an integer between 0 and 65535. A file (or process, or other resource) is marked with the UID of its owner. By default, the owner of a file is the user who created it.

Users can be divided into groups, each marked by a 16-bit integer called a GID (Group ID). Assigning users to groups is done manually by the system administrator, by adding records to a database stating which users belong to which groups. A user may belong to several groups.

The basic security mechanism in Linux is easy to understand: each process carries the UID and GID of its owner. When a file is created, it is marked with the UID and GID of the creating process, and it also acquires a set of permissions determined by that process. These permissions specify what access the owner, other users in the owner's group, and everyone else have to the file. For each of these three classes of users, the potential accesses are read, write, and execute, marked r, w, and x respectively. Of course, execute permission is only meaningful if the file is an executable binary; attempting to execute a non-executable file that happens to have execute permission simply fails with an error.

Linux users fall into three categories

  • Root (the superuser), with UID 0. This user has extensive permissions that override most restrictions, including read, write, and execute checks.
  • System users, whose UIDs range from 1 to 499.
  • Ordinary users, whose UIDs generally range from 500 to 65534. Such users are constrained both by the basic permission bits and by the administrator. Note, however, that the nobody account, with UID 65534, has further restricted permissions and is generally used to implement guest access.

Each of these user classes is marked by three bits, so nine bits are enough to represent all the permissions.

Let’s look at some basic user and permission examples

Binary     Symbolic    Granted file access rights
111000000  rwx------   The owner can read, write, and execute
111111000  rwxrwx---   The owner and the group can read, write, and execute
111111111  rwxrwxrwx   Everyone can read, write, and execute
000000000  ---------   No one has any access
000000111  ------rwx   Only users outside the group can read, write, and execute
110100100  rw-r--r--   The owner can read and write; everyone else can read
110100000  rw-r-----   The owner can read and write; the group can read
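The symbolic permission strings shown above can be derived mechanically from the nine bits. A small sketch (the function name is our own, not a system API):

```c
/* Turn a 9-bit permission mask (e.g. octal 0644) into "rw-r--r--".
 * `out` must have room for 10 characters. */
static void mode_to_string(unsigned mode, char *out) {
    const char symbols[] = "rwx";
    for (int i = 0; i < 9; i++) {
        unsigned bit = 1u << (8 - i);       /* from owner-r down to other-x */
        out[i] = (mode & bit) ? symbols[i % 3] : '-';
    }
    out[9] = '\0';
}
```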

As we mentioned above, the special user whose UID is 0 is called the superuser (or root). The superuser can read and write any file on the system, regardless of who owns the file or how the file is protected. A process with UID 0 can also make a small number of protected system calls that ordinary processes cannot. Normally, only the system administrator knows the superuser's password.

On Linux, directories are also files and have the same protection mode as ordinary files. The difference is that the x bits of the directory represent search rather than execute permission. Therefore, if a directory is protected in rwxr-xr-x mode, it allows the owner to read, write, and find the directory, while others can only read and find, but not add or remove files from the directory.

Special I/O-related files have the same protection bits as ordinary files, and this mechanism can be used to restrict access to I/O devices. For example, the printer is the special file /dev/lp; it can be owned by root or by a special user called the daemon, with protection mode rw------- to keep everyone else from accessing the printer directly. After all, it would be chaos if everyone could write to the printer at once.

Of course, if /dev/lp is protected as rw-------, then nobody else can use the printer at all.

This problem is solved by adding a protection bit, SETUID, to the previous nine bits. When a program with its SETUID bit on is executed, the process's effective UID becomes the UID of the executable file's owner, rather than the UID of the user who ran it. Make the program that accesses the printer owned by the daemon and turn on its SETUID bit, and any user can execute the program with the daemon's privileges.

In addition to SETUID, there is a SETGID bit, which works similarly to SETUID. But this bit is rarely used.
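To make the effective-UID switch concrete, here is a sketch of the decision the kernel conceptually makes at exec time. The structure is invented for illustration (real credential handling is far more involved); only the `S_ISUID` constant is the real one from `<sys/stat.h>`.

```c
#include <sys/stat.h>   /* S_ISUID: the setuid bit, octal 04000 */

struct exec_file {
    unsigned mode;      /* permission bits plus SETUID/SETGID */
    int      owner_uid; /* UID of the file's owner            */
};

/* Decide the effective UID a process gets when exec'ing `f`:
 * with the SETUID bit on, it becomes the file owner's UID;
 * otherwise the caller keeps its own UID. */
static int effective_uid_after_exec(const struct exec_file *f, int caller_uid) {
    if (f->mode & S_ISUID)
        return f->owner_uid;
    return caller_uid;
}
```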

Linux security-related system calls

There are only a few security-related system calls in Linux, as the list below shows.

System call   Description
chmod         Change the file protection mode
access        Test access using the real UID and GID
chown         Change a file's owner and group
setuid        Set the UID
setgid        Set the GID
getuid        Get the real UID
getgid        Get the real GID
geteuid       Get the effective UID
getegid       Get the effective GID

chmod is the most commonly used of these system calls in daily development. Colloquially we say it changes a file's permissions; strictly speaking, it changes the file protection mode. It is used as follows

S = chmod(" pathname "," value ");Copy the code
  • 1

For example,

s = chmod("/usr/local/cxuan", 0777);   /* 0777 is octal */

This sets the protection mode of /usr/local/cxuan to rwxrwxrwx, so any user in any group can read, write, and execute it. Note that only the owner of a file and the superuser are allowed to change its protection mode.

The access system call is used to test whether the real UID and GID have a particular kind of access to a file. The four get* system calls simply retrieve the real and effective UIDs and GIDs.
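A minimal use of access might look like this. The wrapper names are our own; the key point, per POSIX, is that access checks against the *real* UID and GID, which matters inside setuid programs.

```c
#include <unistd.h>   /* access, F_OK, R_OK */

/* Does the file exist at all? */
static int file_exists(const char *path) {
    return access(path, F_OK) == 0;
}

/* Is the file readable by the real UID/GID (not the effective one)? */
static int really_readable(const char *path) {
    return access(path, R_OK) == 0;
}
```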

Note: chown may only be used by the superuser to change a file's owner, and setuid and setgid, which change the calling process's UID and GID, are likewise restricted for unprivileged processes.

Linux Security Implementation

When a user logs in, the login program asks for a user name and a password. It hashes the password and looks it up in /etc/passwd to see whether there is a match; hashing is used so that passwords are never stored in the system in unencrypted form. If the password is correct, login reads from /etc/passwd the name of the shell the user has chosen, which might be bash, csh, ksh, or some other shell. login then uses the setuid and setgid system calls to change its own UID and GID to the user's. It opens the keyboard as standard input (file descriptor 0) and the screen as standard output (file descriptor 1) and as standard error (file descriptor 2). Finally, it executes the user's chosen shell, terminating itself.
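The password check at the heart of login can be sketched as follows. The hash here is a deliberately toy stand-in, NOT cryptographic (real systems use crypt-style salted hashes), and the record structure is a simplified /etc/passwd line.

```c
#include <string.h>

/* Toy hash, NOT cryptographic: stands in for crypt(3). */
static unsigned long toy_hash(const char *s) {
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* One simplified password-file record. */
struct pw_entry {
    const char   *name;
    unsigned long hash;   /* stored hash of the password */
    int           uid, gid;
    const char   *shell;  /* shell to exec on success    */
};

/* Mimic login's check: hash the typed password and compare it
 * against the stored hash; return the entry on success. */
static const struct pw_entry *
check_login(const struct pw_entry *tab, int n,
            const char *user, const char *pass) {
    for (int i = 0; i < n; i++)
        if (strcmp(tab[i].name, user) == 0 &&
            tab[i].hash == toy_hash(pass))
            return &tab[i];
    return NULL;
}
```

On success, login would then call setuid/setgid with the entry's uid and gid and exec the entry's shell.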

When any process wants to open a file, the system first checks whether access is allowed by comparing the protection bits recorded in the file's i-node with the process's effective UID and GID. If access is allowed, the file is opened and a file descriptor is returned; otherwise the file is not opened and -1 is returned.
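That check can be sketched as follows, with made-up i-node fields; the real kernel logic also handles supplementary groups, ACLs, and capabilities.

```c
/* Requested access kinds, matching the r/w/x bit positions. */
#define ACC_R 4
#define ACC_W 2
#define ACC_X 1

struct inode_perm {
    int      uid, gid;   /* file owner and group  */
    unsigned mode;       /* 9-bit rwxrwxrwx mask  */
};

/* Return 1 if a process with (euid, egid) may access the file in
 * mode `want` (a mask of ACC_R/ACC_W/ACC_X), else 0. The process
 * is matched against exactly one class: owner, group, or others. */
static int may_open(const struct inode_perm *ip,
                    int euid, int egid, unsigned want) {
    unsigned bits;
    if (euid == 0)                       /* superuser bypasses the check */
        return 1;
    if (euid == ip->uid)
        bits = (ip->mode >> 6) & 7;      /* owner class  */
    else if (egid == ip->gid)
        bits = (ip->mode >> 3) & 7;      /* group class  */
    else
        bits = ip->mode & 7;             /* others class */
    return (bits & want) == want;
}
```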

The Linux security model and implementation are essentially the same as most traditional UNIX systems.
