Computer basics knowledge-building trilogy:

  • Part I: Foundation series – computer basics cheat sheet
    • Covering history, CPU, bus, memory, instruction systems, the controller, the arithmetic unit, bit operations, etc.
  • Part II: Foundation series – operating system basics cheat sheet
    • Covering process and thread synchronization management, job management, storage management, virtual memory, Linux, file management, and so on
  • Part III: Foundation series – computer network basics cheat sheet
    • Covering the OSI seven-layer model, the IP protocol, the TCP/IP protocol suite, the HTTP protocol, the DNS protocol, etc.

Contents

  • 1. Operating system overview
    • 1.1 What is an Operating system
    • 1.2 Why an operating system is needed
    • 1.3 Basic Operating System Functions
    • 1.4 Concepts related to operating systems
  • 2. Process management
    • 2.1 Process Overview
    • 2.2 Five-state model of process management
    • 2.3 Process Synchronization
    • 2.4 Linux Process Management
  • 3. Job management
    • 3.1 Process Scheduling
    • 3.2 Deadlock
  • 4. Storage management
    • 4.1 Necessity of computer storage management
    • 4.2 Memory Allocation Process
    • 4.3 Memory Reclamation Process
    • 4.4 Process Storage Management
    • 4.5 Virtual Memory
    • 4.6 Storage Management in Linux
  • 5. File management
    • 5.1 File Management in the Operating System
    • 5.2 Basic Operations on Linux Files
  • 6. Device management
  • 7. Practice
    • 7.1 Practice of Thread synchronization
    • 7.2 Process Synchronization Practice
  • 8. About me
  • 9. References

1. Operating system overview

1.1 What is an Operating system

  • Manages and configures memory, determines the order in which resource requests are served, and controls input and output devices
  • An operating system is a computer program that manages computer hardware and software resources
  • An operating system provides an interface that allows users to interact with the system
  • From mobile phones to supercomputers, operating systems can be simple or complex
  • There are many kinds of operating systems, not limited to computers
  • The operating system presents a variety of operations to the user on different devices
  • Android, iOS, HarmonyOS
  • Windows, Linux, macOS
  • Summary: Software systems that manage hardware and provide user interaction

1.2 Why an operating system is needed

  • It is impractical for users to operate computer hardware directly
  • The wide variety of devices requires a unified interface

1.3 Basic Operating System Functions

  • The operating system centrally manages computer resources
    • Processor resources
    • IO Device Resources
    • Memory resource
    • File resources
  • The operating system realizes the abstraction of computer resources
    • IO Device management software that provides read and write interfaces
    • Users do not need to program to a hardware interface
    • File management software, provide file operation interface
  • The operating system provides the interface between the user and the computer
    • Command-line form
    • Graphical window form
    • System call form

1.4 Concepts related to operating systems

  • Concurrency/parallelism
    • Multiprogramming
      • Multiprogramming is the practice of storing multiple programs simultaneously in the computer’s memory
      • Multiple programs run interleaved under the control of the computer's supervisory program
    • Parallelism means that two or more events can occur at the same time
    • Concurrency means that two or more events can occur at the same time interval
  • Sharing
    • Shareability means that resources in an operating system can be shared by multiple concurrent programs
    • This form of shared use is called resource sharing
    • Multiple programs can use main memory resources simultaneously
    • Resource sharing can be divided into two modes based on attributes
      • Mutually exclusive and shared
        • While a resource is occupied by process A, other processes that want to use it must wait
        • Other processes can use the resource only after process A has finished with it
        • Example: a printer
      • Simultaneous access form
        • A resource is accessed by multiple programs over the same period of time
        • This “simultaneity” is macroscopic; microscopically the programs still take turns accessing the resource
        • Example: writing data to a disk
  • Virtuality
    • Virtuality is the transformation of a physical entity into several logical entities
    • Physical entities are real, logical entities are virtual
    • Virtual technology mainly includes time division multiplexing technology and space division multiplexing technology
    • Time division multiplexing
      • Resources are reused over time and used concurrently by different programs
      • The time-sharing use of computer hardware resources by multiple programs
      • Improve resource utilization
      • Virtual processor technology
        • With the help of multiprogramming techniques
        • Create processes for each program
        • Multiprogram time-sharing multiplexing processor
      • Virtual device Technology
        • A physical device is virtualized as multiple logical devices
        • Each program occupies one logical device
        • Multiple programs are accessed concurrently through logical devices
    • Space division multiplexing technology
      • Space division multiplexing technology is used to realize virtual disk, virtual memory, etc
      • Improve resource utilization and programming efficiency
      • Virtual disk technology
        • The physical disk is virtualized as a logical disk
        • Logical disks such as C, D, and E
        • It is safer and more convenient to use
      • Virtual memory technology
        • Logically expand the storage capacity of a program
        • Use more capacity than actual memory
        • Greatly improve the efficiency of programming
  • Asynchrony
    • In a multiprogram environment, multiple processes are allowed to execute concurrently
    • Processes may need to wait or give up while using resources
    • A process does not run to completion in one go, but advances in a stop-and-go fashion
    • Processes move forward at unpredictable speeds

2. Process management

2.1 Process Overview

  • Why processes are needed

    • Resources belong to the currently running program before the OS is configured
    • After configuring the OS, the concept of multiprogramming is introduced
    • Isolate resources and operating environments to improve resource utilization
    • Process is the basic unit of resource allocation and scheduling
    • Process as the carrier of program independent operation to ensure the normal execution of the program
    • Processes greatly improve the utilization of operating system resources
  • Process Form in Main Memory – Process Control Block (PCB)

    • A generic data structure used to describe and control the operation of a process
    • Record the current status of the process and all the information that controls the running of the process
    • The PCB is what makes the process a basic unit able to run independently
    • The PCB is frequently read by the operating system for scheduling
    • The PCB is memory-resident, stored in a PCB area specially set aside by the system
    • identifier
      • An identifier uniquely identifies a process to distinguish it from other processes
    • state
      • Marks the process status of a process, such as running
    • priority
    • Program counter
      • The address of the next instruction to be executed by the process
    • Pointer to the memory
      • Pointers to program code and process data
    • Context data
      • The data stored by the processor while the process is executing
    • IO status information
      • List of files occupied by process IO operations
    • Accounting information
      • Total processor time, clock count, etc
  • Processes and threads

    • Relationship between
      • A process can have one or more threads
      • Process is the basic unit of resource allocation and scheduling
      • Thread is the smallest unit of operating system scheduling
      • Contained in a process, is the unit of work that actually runs in the process
      • A process can have multiple concurrent threads, each performing a different task
      • Threads of a process share process resources
    • The difference between processes and threads
                     Process                                   Thread
      Resources      The basic unit of resource allocation     Does not own resources
      Scheduling     The basic unit of independent scheduling  The smallest unit of independent scheduling
      Overhead       High system overhead                      Low overhead
      Communication  Interprocess communication (IPC)          Reads and writes shared process data
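The sharing contrast above can be seen in a few lines of Python (a sketch added here, not part of the original cheat sheet): threads of one process write directly to shared memory, while separate processes would each receive their own copy and need IPC.

```python
import threading

# Threads of one process share the process's memory: all workers
# append to the same list object, so the parent thread sees every result.
shared_results = []

def worker(task_id):
    # Each thread performs a different task but writes to shared memory.
    shared_results.append(task_id * task_id)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four threads wrote into the same list; no IPC was needed.
# Separate processes would each get their own copy of the list and
# would need IPC (pipes, queues, shared memory) to communicate.
print(sorted(shared_results))  # -> [0, 1, 4, 9]
```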

2.2 Five-state model of process management

  • The ready state
    • The process has been allocated all necessary resources except the CPU
    • As soon as it obtains the CPU, it can run immediately
    • Ready queue: in a system, the multiple ready processes usually wait in a queue
  • The blocked state
    • The process cannot continue for some reason, for example, a device it needs is not ready
    • The state in which the process consequently gives up the CPU is called the blocked state
    • Blocked queue: blocked processes also wait in a queue
  • The running state
    • A process that has obtained the CPU and whose program is executing is in the running state
    • On a single processor, only one process is running at any given time
  • The created state
    • Allocate a PCB -> insert into the ready queue
    • The state in which the PCB exists but other resources are not yet ready is called the created state
    • The operating system provides the fork function interface to create processes
  • The terminated state
    • System cleanup -> PCB returned
    • A process is terminated once the system cleans up after it and reclaims its PCB
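The five-state transitions above can be sketched as a transition table (the state and event names here are our own illustration, not from the original text):

```python
# Hypothetical sketch of the five-state model as a transition table.
TRANSITIONS = {
    ("created", "admit"): "ready",
    ("ready", "dispatch"): "running",
    ("running", "timeout"): "ready",     # time slice used up
    ("running", "io_wait"): "blocked",   # e.g. waiting for a device
    ("blocked", "io_done"): "ready",     # back to the ready queue, not the CPU
    ("running", "exit"): "terminated",
}

def step(state, event):
    """Return the next state, or raise if the transition is illegal."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {state} --{event}-->")

s = "created"
for e in ["admit", "dispatch", "io_wait", "io_done", "dispatch", "exit"]:
    s = step(s, e)
print(s)  # -> terminated
```

Note there is no (blocked, dispatch) entry: a blocked process must first return to the ready queue before it can be given the CPU again.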

2.3 Process Synchronization

  • Why do I need interprocess synchronization

    • Producer-consumer problems

      • The producer process provides the products produced to the consumer process for consumption
      • Producer and consumer processes can execute concurrently
      • A buffer pool with n buffers is set up between the two
      • The producer puts products into the buffer, and the consumer process takes products out of the buffer to consume

      • It is ok to look at the producer program or the consumer program alone
      • When the two are executed concurrently, something can go wrong
      • Critical resource: the buffer
    • The dining philosophers problem

      • There were five philosophers who lived by alternating between thinking and eating
      • The philosophers shared a round table and sat on five chairs around it
      • There are five bowls and five chopsticks on the round table. When they are hungry, they try to reach for the left and right chopsticks that are close to them
      • Only when he has both chopsticks can he eat
      • After eating, put down your left and right chopsticks and continue thinking
      • Five philosophers picked up the left chopsticks at the same time
      • The five philosophers waited for the right chopstick to release
      • Five philosophers starved to death
      • Critical resource: chopsticks
    • Root of the problem

      • The root of the problem is that they don’t communicate with each other
      • If the producer tells the consumer that I’ve finished a piece of production
      • A philosopher tells the philosopher next to him, “I want to eat”
    • The purpose of process synchronization

      • Coordinate the use sequence of competing resources among multiple processes
      • Enables efficient use of resources and cooperation between concurrent processes
    • Critical resources

      • Critical resources are shared resources that cannot be accessed by multiple threads at the same time.
      • When a process is using a critical resource, other processes must wait for the occupying process to release the shared resource according to the synchronization mechanism of the operating system before competing for the shared resource again.
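A minimal bounded-buffer sketch of the producer-consumer problem using Python semaphores (the buffer size and item count are made up): `empty` counts free slots, `full` counts products, and a mutex protects the critical resource, the buffer.

```python
import threading
from collections import deque

N = 3                            # n buffers in the buffer pool
buffer = deque()
empty = threading.Semaphore(N)   # free slots available initially
full = threading.Semaphore(0)    # no products yet
mutex = threading.Lock()         # protects the critical resource
consumed = []

def producer():
    for item in range(5):
        empty.acquire()          # wait for a free slot
        with mutex:              # enter the critical section
            buffer.append(item)
        full.release()           # signal: one more product

def consumer():
    for _ in range(5):
        full.acquire()           # wait for a product
        with mutex:
            item = buffer.popleft()
        empty.release()          # signal: one more free slot
        consumed.append(item)

p = threading.Thread(target=producer)
c = threading.Thread(target=consumer)
p.start(); c.start(); p.join(); c.join()
print(consumed)  # -> [0, 1, 2, 3, 4]
```

The semaphores do exactly the "communication" the text calls for: the producer announces each finished product, and the consumer announces each freed slot.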
  • Principles for interprocess synchronization

    • Idle yield: if the resource is free, a requesting process is allowed to use it
    • Busy waiting: if the resource is occupied, the requesting process waits
    • Bounded waiting: a process is guaranteed to obtain the resource within a finite waiting time
    • Yield while waiting: a waiting process must give up the CPU
  • Methods for synchronizing between processes

    • Message queues
    • Shared memory
    • Semaphores
  • Thread synchronization

    • Threads of a process share process resources
    • Synchronization is also required when multiple threads concurrently use process resources
    • Methods for synchronizing between threads
      • Mutexes
      • Read-write locks
      • Spinlocks
      • Condition variables
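A minimal mutex sketch (not from the original text) showing why a lock is needed when threads share process resources: `counter += 1` is a read-modify-write that two threads can interleave, and the lock makes it atomic.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:          # mutual exclusion on the shared counter
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # -> 40000
```

Without the `with lock:` line, the final count can come out below 40000 because updates get lost in the interleaving.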

2.4 Linux Process Management

  • Type of process

    • Foreground process
      • A foreground process is a process that has a terminal and can interact with the user
    • Background processes
      • In contrast to the foreground process, the background process does not occupy the terminal
      • Background programs basically do not interact with the user and have a lower priority than foreground processes
      • The command to be executed ends with an ampersand (&)
    • daemon
      • Daemons are special background processes
      • Many daemons start at boot time and run until the system shuts down
      • Linux has many typical daemons
      • Processes whose names end in “d” are usually daemons
        • crond
        • httpd
        • sshd
        • mysqld
  • Process marking

    • The process ID

      • A process ID is a unique identifier for a process, and each process has a different ID
      • The process ID appears as a non-negative integer, the maximum value of which is limited by the operating system
      • The top command displays information about all processes in the system
      • The idle process whose ID is 0 is the first process created in the system
      • The process whose ID is 1 is init and a child process of process 0. The process completes system initialization
      • The Init process is the ancestor of all user processes
    • The status flag of the process

      • Run man ps to view the help documentation for the command
      Status symbol  Meaning
      R   (TASK_RUNNING) the process is running
      S   (TASK_INTERRUPTIBLE) the process is sleeping
      D   (TASK_UNINTERRUPTIBLE) the process is in uninterruptible (I/O wait) sleep
      T   (TASK_STOPPED) the process is stopped
      Z   (TASK_DEAD or EXIT_ZOMBIE) the process is exiting, or is a zombie process
    • Parent and child processes

      • The operating system provides the fork function interface to create processes
      • Parent-child relationship: process A calls fork to create process B. Process A is the parent process of process B
      • You can run the pstree command to view the relationship between the parent and child processes
  • Commands for operating processes

    • The ps command
      • The ps command is used to display the status of the current process
      • The ps command is often combined with the aux or -ef options and grep to find a particular process
    • jobs
      • The jobs command lists the background jobs of the current shell
    • nohup
      • Run the command without hanging up
    • fg/bg commands
      • The fg command brings a background job to the foreground terminal to continue execution
      • The bg command resumes a paused job so it continues executing in the background
      • Ctrl+Z pauses the foreground job
    • kill
      • The kill command sends the specified signal to the process
      • kill -l lists the signals supported by the operating system
      • Only the SIGKILL (9) signal terminates a process unconditionally; processes may ignore other signals
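A POSIX-only Python sketch of these signal semantics (the sleep durations are arbitrary): a child that ignores SIGTERM keeps running, while SIGKILL cannot be caught or ignored.

```python
import os
import signal
import subprocess
import sys
import time

# Start a child that installs SIG_IGN for SIGTERM, reports readiness,
# then sleeps. POSIX-only sketch.
child = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "import signal, time;"
     "signal.signal(signal.SIGTERM, signal.SIG_IGN);"   # ignore SIGTERM
     "print('ready', flush=True);"
     "time.sleep(30)"],
    stdout=subprocess.PIPE)
child.stdout.readline()               # wait until the handler is installed

os.kill(child.pid, signal.SIGTERM)    # politely ask the child to terminate
time.sleep(0.5)                       # give the signal time to (not) act
still_alive = child.poll() is None    # True: SIGTERM was ignored

os.kill(child.pid, signal.SIGKILL)    # terminate unconditionally
child.wait()
killed = child.returncode == -signal.SIGKILL
print(still_alive, killed)  # -> True True
```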

3. Job management

3.1 Process Scheduling

  • Process scheduling is when the computer decides which ready process gets CPU usage

  • Save the running information of the old process and move it out (the old occupant packs up and leaves)

  • Select a new process, prepare its running environment, and allocate the CPU (the new occupant moves in)

  • Scheduling mechanism

    • Queuing mechanism for the ready queue
      • Ready processes are queued in a way that lets the scheduler find them as fast as possible
    • Selection mechanism for the running process
      • The scheduler selects a ready process according to a certain policy and allocates CPU resources to it
    • Context switching mechanism for old and new processes
      • Saves the context information of the current process and loads the context of the newly selected process
      • The context of the old process is saved in main storage
      • The context of the new process is loaded into the cache
  • Scheduling methods

    • Non-preemptive scheduling
      • Once the processor is assigned to a process, that process keeps using it
      • The scheduler does not preempt the processor being used for any reason
      • The processor is not relinquished until the process finishes its work or is blocked by IO
    • Preemptive scheduling
      • Allows the scheduler to suspend a currently running process with certain policies
      • Save the context information of the old process and assign the handler to the new process
                 Preemptive scheduling              Non-preemptive scheduling
    Overhead     Frequent switching, high overhead  Fewer switches, low overhead
    Fairness     Relatively fair                    Less fair
    Application  General-purpose systems            Dedicated systems
  • Scheduling algorithm

    • First come, first served scheduling algorithm
      • Schedule from the ready queue in order from the queue head
    • Shortest-process-first scheduling algorithm
      • The scheduler preferentially selects the process in the ready queue with the shortest estimated running time
      • Shortest-process-first scheduling is unfavorable to long processes
    • Highest-priority-first scheduling algorithm
      • Each process carries a priority, and the scheduler selects the process with the highest priority first
      • Highest-priority-first scheduling lets urgent tasks be processed first
      • Foreground processes have a higher priority than background processes
    • Round-robin time-slice algorithm
      • Ready processes are queued on a first-come, first-served basis
      • Each time, the process at the queue head is taken out and given one time slice to execute
      • It is a relatively fair scheduling algorithm, but cannot guarantee timely response to users
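The first two algorithms can be compared with a tiny simulation (the burst times are made up, and all processes are assumed to arrive at time 0):

```python
# Compare first-come-first-served with shortest-process-first on one
# batch of processes; each process's waiting time is when it starts.
def waiting_times(bursts):
    """Waiting time of each process when run in the given order."""
    waits, clock = [], 0
    for b in bursts:
        waits.append(clock)    # process waits until the clock reaches it
        clock += b             # then runs for its full burst
    return waits

bursts = [8, 1, 2]                     # arrival order = FCFS order
fcfs = waiting_times(bursts)           # [0, 8, 9]
sjf = waiting_times(sorted(bursts))    # shortest estimated run time first
avg_fcfs = sum(fcfs) / len(fcfs)
avg_sjf = sum(sjf) / len(sjf)
print(avg_fcfs, avg_sjf)  # SJF gives the lower average wait
```

The long process (8) is exactly the one disadvantaged under shortest-process-first: it waits longest while the short ones jump ahead.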

3.2 Deadlock

  • A deadlock is a phenomenon in which two or more processes are blocked during execution, either by competing for resources or by communicating with each other, and cannot proceed without external action. The system is said to be in a deadlock state or a deadlock occurs in the system. These processes that are always waiting for each other are called deadlocked processes.

  • Deadlock generation

    • The basic reason

      • Compete for resources
        • The number of shared resources does not meet the requirements of each process
        • Resource contention between processes results in a deadlock
      • An incorrect process scheduling order
        • A wrong scheduling order can leave shared resources held by multiple processes and never released; scheduling process A before process B, for example, can avoid this
    • Necessary conditions

      • Mutual exclusion conditions
        • A process’s use of resources is an exclusive use
        • A resource can be used by only one process, and other processes need to wait
      • Request hold condition
        • The process holds at least one resource and makes new resource requests
        • The new resource is occupied and the request is blocked
        • A blocked process does not release resources it holds
      • No-preemption condition
        • Resources acquired by a process cannot be taken away until they have been used
        • Acquired resources can only be released by the process itself
      • Circular-wait condition
        • When a deadlock occurs, there must be a process-resource circular chain
        • P1(R1) -> P2(R2) -> P3(R3) -> P4(R4) -> P1(R1)
  • Deadlock handling

    • Ways to prevent deadlocks – break the necessary conditions

      • Break the request-and-hold condition
        • The system requires a process to apply for all the resources it needs at once before it runs
        • The process makes no resource requests at runtime, thus breaking the request-and-hold condition
      • Break the no-preemption condition
        • When a process's request for a new resource is not met, it must release the resources it occupies
        • Resources held by a process at runtime can thus be freed, i.e. taken away
      • Break the circular-wait condition
        • Resources are linearly ordered, and processes must request them in increasing order
        • Linear requests can no longer form a loop, thus breaking the circular-wait condition (e.g. order A, B, C, D, E)
    • Banker’s algorithm

      • Is a well-known algorithm for avoiding deadlocks
      • An algorithm based on the allocation strategy of a bank lending system
      • Customers apply for a limited amount of loans; each must state the maximum funds needed
      • The banker lends money to a customer whenever the request can be satisfied
      • After using the loan, the customer repays it in time so other customers can be satisfied
      • Table of allocated resources
             A  B  C  D
        P1   0  0  1  4
        P2   1  4  3  2
        P3   1  3  5  4
        P4   1  0  0  0
      • Table of maximum required resources
             A  B  C  D
        P1   0  6  5  6
        P2   1  9  4  2
        P3   1  3  5  6
        P4   1  7  5  0
      • Table of allocatable resources
             A  B  C  D
             1  5  2  0
      • Table of resources still to be allocated (required minus allocated)
             A  B  C  D
        P1   0  6  4  2
        P2   0  5  1  0
        P3   0  0  0  2
        P4   0  7  5  0
      • Allocate resources by comparing each process's remaining need against the allocatable resources; a process whose need cannot be satisfied does not execute until enough resources are released
      • Only P2's remaining need (0 5 1 0) fits within the allocatable resources (1 5 2 0), so P2 gets its resources first
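The walkthrough above can be checked with a small banker's-algorithm safety test, a sketch using the exact tables from this section:

```python
# Banker's-algorithm safety check. Need = required maximum - allocated;
# `available` is the allocatable-resources row.
allocated = {"P1": [0, 0, 1, 4], "P2": [1, 4, 3, 2],
             "P3": [1, 3, 5, 4], "P4": [1, 0, 0, 0]}
maximum   = {"P1": [0, 6, 5, 6], "P2": [1, 9, 4, 2],
             "P3": [1, 3, 5, 6], "P4": [1, 7, 5, 0]}
available = [1, 5, 2, 0]

need = {p: [m - a for m, a in zip(maximum[p], allocated[p])] for p in allocated}

def safe_sequence(need, allocated, available):
    """Repeatedly pick a process whose remaining need fits in `work`;
    it then finishes and returns everything it holds."""
    work, done, order = available[:], set(), []
    while len(done) < len(need):
        for p in need:
            if p not in done and all(n <= w for n, w in zip(need[p], work)):
                work = [w + a for w, a in zip(work, allocated[p])]
                done.add(p)
                order.append(p)
                break
        else:
            return None          # no process fits: the state is unsafe
    return order

print(safe_sequence(need, allocated, available))  # -> ['P2', 'P1', 'P3', 'P4']
```

P2 is indeed the only process whose remaining need fits the available resources, so every safe sequence must start with P2.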

4. Storage management

4.1 Necessity of computer storage management

  • Early computer programming did not require much storage management
  • As computers and programs become more complex, storage management becomes necessary
  • Make sure your computer has enough memory to process data
  • Ensure that the program can get a portion of memory usage from available memory
  • Ensure that programs can return used memory for use by other programs

4.2 Memory Allocation Process

  • Allocation methods

  • Single continuous distribution

    • Single contiguous allocation is the simplest way to allocate memory
    • It can be used only in a single-user, single-process operating system
    • It is divided into system area and user area
  • Fixed partition allocation

    • Fixed partition allocation is the simplest storage allocation method that supports multiprogramming
    • Memory space is divided into a number of fixed-size regions
    • Each partition can only be used by one program without interference
  • Dynamic partition allocation (common)

    • Dynamically allocate memory space based on process requirements
    • Related data structure, allocation algorithm
    • Free-table data structure for dynamic partitioning (tag 1: used, 0: unused)
      Partition  1  2  3  4
      Tag        0  1  1  0
    • Dynamically partitioned free-link data structures
      • Using bidirectional linked lists to link blocks of memory, contiguous free areas can be merged into a linked list node
        • Each node records the storage capacity of its free area
    • Dynamic partition allocation algorithm
      • First adaptation algorithm (FF algorithm)
        • When allocating memory, free areas are searched in address order for the first one that fits
        • If there is no suitable free area, the allocation fails
        • Because each search starts at the head, the low-address space is repeatedly split into small pieces
        • The circular (next-fit) variant does not start each search from the head, but from where the last allocation ended
      • Best fit algorithm (BF algorithm)
        • The best-fit algorithm requires the free list to be sorted by capacity
        • Walk the free list to find the smallest free area that fits
      • Fast adaptation algorithm (QF algorithm)
        • The quick-fit algorithm maintains multiple free-area lists
        • Each free-area list stores free areas of a single capacity
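A first-fit sketch over a free list of (start, size) areas (the numbers are invented):

```python
# First-fit (FF) allocation: scan free areas in address order and
# split the first one large enough for the request.
free_list = [(0, 100), (200, 50), (300, 400)]

def first_fit(free_list, request):
    """Return the base address of the allocated area, or None on failure."""
    for i, (start, size) in enumerate(free_list):
        if size >= request:
            if size == request:
                free_list.pop(i)                       # exact fit: remove the area
            else:
                free_list[i] = (start + request, size - request)  # split it
            return start
    return None                                        # no suitable free area

addr = first_fit(free_list, 80)   # fits in the first (lowest-address) area
print(addr, free_list)  # -> 0 [(80, 20), (200, 50), (300, 400)]
```

Note how the low-address area shrinks to a 20-unit sliver: exactly the head-of-list fragmentation the next-fit variant tries to avoid.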

4.3 Memory Reclamation Process

  • There are four cases of memory reclamation
    • The reclaimed area is adjacent to a free area and follows it
      • No new free-list node is needed
      • Simply grow the preceding free area by the size of the reclaimed area
    • The reclaimed area is adjacent to a free area and precedes it
      • Merge the reclaimed area with the free area
      • The new free area uses the address of the reclaimed area
    • The reclaimed area lies between two free areas
      • Merge free area 1, free area 2, and the reclaimed area
      • The new free area uses the address of free area 1
    • The reclaimed area is not adjacent to any free area
      • Create a new free-list node for the reclaimed area
      • Insert it into the corresponding free list

4.4 Process Storage Management

  • Page storage management

    • A block is a definition of a relative physical device, and a page is a definition of a relative logical space
    • Divide the process logical space into pages of equal size
    • Divide the physical memory space into physical blocks corresponding to the page size
    • Load the process space into discrete physical blocks of memory, one page at a time
    • The page size should be moderate: too large makes allocation difficult, too small produces excessive memory fragmentation
    • The page size is usually 512B to 8KB
    • Page table: A mapping of the logical and physical space of a process, expressed as [page number, block number]
    • Page address: [address, in-page offset]
    • The problem: modern computer systems support very large logical address spaces (2^32 to 2^64), making the page table very large and memory-hungry. For example, in a paging system with a 32-bit logical address space and a 4KB page size, each process's page table can contain up to 1M (2^20) entries; if each page-table entry occupies 1 byte, every process needs 1MB of memory for its page table alone
      • 2^32 / 2^12 = 2^20 = 1M page-table entries
    • Multilevel page tables: The root page table blocks hold the addresses of the child page tables, which can be loaded on demand at run time
    • Disadvantage: Having a contiguous piece of logic spread over multiple pages greatly reduces execution efficiency
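The 1M-entry arithmetic and the [page number, in-page offset] split can be verified in a few lines (the toy page table below is invented):

```python
# Paging arithmetic for the 32-bit / 4KB example: a logical address
# splits into a page number (high bits) and an in-page offset (low bits).
PAGE_SIZE = 4 * 1024               # 4KB = 2^12
entries = 2**32 // PAGE_SIZE       # 2^20 = 1M page-table entries
page_table = {0: 5, 1: 9}          # toy page table: page number -> block number

def translate(logical_addr):
    page = logical_addr // PAGE_SIZE      # page number
    offset = logical_addr % PAGE_SIZE     # in-page offset
    return page_table[page] * PAGE_SIZE + offset

print(entries, translate(4100))   # address 4100 = page 1, offset 4 -> block 9
```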
  • Segment storage management

    • Divide the process logical space into segments (not equally)
    • The length of the segment is determined by the length of the continuous logic
    • MAIN function, subroutine segment X, sub function Y, etc
    • Segment table: [segment number, base address, segment length]
    • Segment address: [segment number, intra-segment offset]
  • Page storage management versus section storage management

    • Both segmented and paged storage manage the logical space of a process discretely
    • Pages are physical units and segments are logical units
    • Paging is for rational use of space, segmentation is to meet user requirements
    • Page size is fixed by hardware and segment length can change dynamically
    • A paged address space is one-dimensional; a segmented address space is two-dimensional
  • Segment-page storage management

    • Paging can greatly improve memory utilization (albeit with in-page fragmentation)
    • Segmentation is better for the user because the logic can be written by the user
    • The combination of the two forms segment-page storage management
    • First, the logical space is managed in segments
    • Then divide the space into several pages according to page management
    • Segment-page address: [segment number, page number, in-page offset]

4.5 Virtual Memory

  • Virtual Memory Overview
    • Some processes actually require more memory than physical memory can hold
    • Multiprogramming makes available physical memory per process scarcer
    • It is impossible to increase physical memory indefinitely; there will always be a time when physical memory will run out
    • Virtual memory is the key technology of memory management in operating system
    • Make multi-program running and large program running become a reality
    • A program's memory is partitioned, and temporarily unused parts are placed in secondary storage
    • Virtual memory is in effect a supplement to physical memory, with a speed close to memory and a cost close to secondary storage
  • The locality principle of programs
    • The principle of locality states that when a CPU accesses memory, both instructions and data, the accessed memory units tend to be clustered in a small contiguity area.
    • When the program runs, it does not need to be fully loaded into memory; loading only some parts is enough
    • If the page is not in memory, a page-missing interrupt is issued and a page replacement is initiated
    • At the user level, the program has a lot of space, which is virtual memory
  • Page replacement in virtual memory
  • Replacement policies occur at the cache-main memory level and the main-secondary storage level
    • The cache-main memory level replacement strategy is mainly to solve the speed problem
    • The main – secondary storage level is mainly to solve the capacity problem
    • Replacement timing
      • Cache replacement timing
        • The cache has no data and needs to load the required data from main memory
      • Replacement timing of main memory pages
        • Main memory is missing pages and needs to load page data from secondary memory
    • Replacement algorithm
      • First in first out (FIFO) algorithm
        • Think of main memory as a first-in, first-out queue
        • The first block to enter the queue is replaced first
      • Least Frequently Used (LFU) algorithm
        • Preferentially evicts the least frequently used blocks
        • Extra space is required to record how often each block is used
      • Least Recently Used (LRU) algorithm
        • Blocks that have not been used for a period of time are preferentially evicted
        • If the accessed block is in the cache, move it to the head of the list, so the head node is always the most recently used
        • There are many implementations; a doubly linked list is common
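An LRU sketch, using Python's OrderedDict in place of the doubly linked list mentioned above (the access pattern is made up):

```python
from collections import OrderedDict

# LRU cache: the most recently used entry sits at the end of the
# ordered dict; eviction pops from the front (the least recently used).
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block id -> data

    def access(self, key, value=None):
        if key in self.blocks:
            self.blocks.move_to_end(key)     # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[key] = value

cache = LRUCache(2)
cache.access("A")
cache.access("B")
cache.access("A")            # A becomes the most recently used block
cache.access("C")            # cache full: B is the LRU block and is evicted
print(list(cache.blocks))  # -> ['A', 'C']
```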

4.6 Storage Management in Linux

  • Internal (in-page) fragmentation

    • Internal fragmentation is memory inside an already-allocated region (one marked as belonging to a process) that exceeds what the request actually needed; this unusable memory is internal fragmentation.
  • External (out-of-page) fragmentation

    • External fragmentation is free memory that has not been allocated (belongs to no process) but is in pieces too small to satisfy a new process's request for memory space.
  • Buddy memory management algorithm

    • Buddy algorithm is a classical memory management algorithm
    • Algorithm based on the advantages of computer processing binary has high efficiency
    • The algorithm is mainly to solve the problem of out-of-memory fragmentation
    • In effect, the out-of-memory fragmentation problem is transferred to the in-memory fragmentation problem
    • Try to make memory allocation and adjacent memory merge fast
    • Memory allocation principle
      • Rounded up to a power of two
      • 70 k to 128 k
      • 129 k to 256 k
      • 666 k to 1024 k
    • Partner system
      • “Partner” refers to the “partner” of memory
      • The “partner” of a contiguous piece of memory is another contiguous piece of memory of the same size
    • Allocation process
      • Create a list of free block linked lists, each of which is a power of two
      • Assume that the storage space is 1 MB and 100K memory is allocated
      • 100k to the power of 2 is 128K
      • Is there a 128K free memory block?
      • No! Is there a 256K free memory block?
      • No! Check whether there are 512K free memory blocks.
      • No! Check whether there are 1 MB free memory blocks.
      • If yes, remove 1 MB free memory blocks and allocate them
      • Remove 512K and put it in 512K’s free linked list, and allocate the rest
      • Remove 256K and put it on the 256K free list, and allocate the rest
      • Unplug the 128K free list and allocate the rest
      • The end
    • The recycling process
      • Is the allocated memory partner on the free linked list?
      • In! Remove partner, merge to 256K free memory, judge
      • In! Remove partner, merge to 512K free memory, judge
      • In! Remove partner, merge to 1M free memory
      • Insert 1M free linked list and recycle is complete
  • Linux swap space

    • Swap is a partition (or file) on disk
    • When Linux physical memory is full, some memory pages are swapped out to swap space
    • Swap space is configured during system installation/initialization
    • The top command shows swap space usage
    • Main uses:
      • Pages touched only at cold start can be swapped out afterwards
      • System hibernation relies on swap to hold the memory image
      • Processes larger than physical memory rely on swap space
    • Swap space vs. virtual memory
      • Swap space is an operating-system-level concept
      • Swap space addresses the system's shortage of physical memory
      • Swap space lives on disk
      • Swap space exchanges pages with main memory
      • Virtual memory is a per-process concept
      • Virtual memory addresses a process's shortage of physical memory
      • Virtual memory is backed by disk
      • Virtual memory exchanges pages with main memory

5. File management

5.1 File Management in the Operating System

  • Logical structure of files

    • Types of logical structure
      • Structured files
        • Text files, documents, media files
        • File content consists of fixed-length records and variable-length records
        • Fixed-length records store structured items such as the file format and file description
        • Variable-length records store the file's actual data
        • Example: PNG header tag — PNG data chunks — end-of-file tag
      • Unstructured files
        • exe files, DLL link libraries, so files
        • Binaries, link libraries
        • Also known as stream files
        • File content is treated as a sequence of bytes, measured in bytes
    • Sequential files
      • Sequential files are stored consecutively on the storage medium
      • Tape's storage characteristics allow only sequential files
      • Sequential files have the highest sequential-access efficiency of all logical file types
      • Inserting, deleting, and modifying records in sequential files is inefficient
    • Index files
      • Variable-length records are not well suited to the sequential file format
      • The index file format was invented to store variable-length records
      • Index files are stored together with an index table
      • Index table entry: [key, logical address]
  • Secondary storage space allocation

    • Allocation methods
      • Contiguous allocation
        • Reading file content sequentially is very easy and fast
        • Demands a contiguous region of sufficient capacity, a high storage requirement
      • Linked allocation
        • Linked allocation stores a file in discrete disk blocks; extra space is needed to record the link order of the blocks
        • Implicit linking
          • Each disk block stores the link to the file's next disk block
          • Implicit linking suits sequential access; random access is inefficient
          • Poor reliability: a broken link anywhere affects the whole file
        • Explicit linking
          • FAT (File Allocation Table) entries: [physical block, next disk block]
          • Does not support efficient direct access (there are many FAT entries to traverse)
          • The FAT occupies a lot of storage space; to use it, the whole FAT must be loaded into memory
      • Indexed allocation
        • All of a file's disk-block pointers are stored centrally in an index
        • When a file is read, its index is first read into memory
        • Each file has an index block recording information about all its disk blocks
        • Indexed allocation supports direct access to any disk block
        • Its advantages are most obvious for large files
  • Secondary storage space management

    • Free table

      • Entry: [serial number, first free disk block number, number of free disk blocks]
      • Free disk allocation is similar to memory allocation
      • First-fit, next-fit (circular first-fit), and similar algorithms apply
      • The reclamation process is also similar to memory reclamation
    • Free linked list

      • The free-list method links all free extents into one linked list
      • Each list node stores a free extent's first block and its number of free blocks
    • Bitmap

      • Bitmaps have low maintenance costs
      • Bitmaps make it very easy to find free disk blocks
      • Bitmaps use one bit (0/1) per block and take up very little space
      • 0: unused; 1: used

      Disk block / track  1  2  3
      1                   0  0  0
      2                   0  1  0
      3                   1  0  1
  • Directory management

    • Directory tree:
      • There is only one path to any file or directory
    • File Description
      • File ID, file type, file permissions, physical address, file length, link count, access time, inode number, file status, access count, link pointer

5.2 Basic Operations on Linux Files

  • The Linux directory

    • Linux everything is a file
    • /bin /etc /home /usr /opt /proc /dev /mnt /lib /var …
    Directory Description
    /bin Stores binary executables (ls, cat, mkdir, etc.); common commands generally live here
    /etc Stores system management and configuration files
    /home The base point of users' home directories; for example, user user's home directory is /home/user
    /usr An important directory for system applications; /usr/local is where the local administrator installs software
    /opt Where optional, additionally installed application packages are placed
    /proc A virtual file system that maps system memory; read this directory directly to obtain system information
    /root The home directory of the superuser (system administrator)
    /sbin Stores binary executables accessible only by root
    /dev Stores device files
    /mnt Provided by the system so users can temporarily mount other file systems
    /boot Stores the files used for system boot
    /lib Stores the shared libraries and kernel modules that programs in the file system need to run
    /var Stores data files that change at run time
    • Relative path: a path interpreted starting from the current directory
    • Absolute path: a path starting from the root directory (/)
  • Common operations on Linux files

    • Create:
      • touch file
      • vim file2 (create and edit file2)
      • mkdir dir1 (create directory dir1)
    • Delete:
      • rm file
      • rm -r dir1/ (recursively delete directory dir1)
    • Read:
      • cat file2
    • Write:
      • vim file2 (edit file2)
  • File types

    • Regular file (-)
    • Directory (d)
    • Symbolic link (l)
    • Device files (b: block, c: character)
    • Socket (s)
    • FIFO (p)
  • Linux file system

    • File system overview

      • FAT
        • FAT (File Allocation Table)
        • FAT16, FAT32, etc.; the file systems used by Microsoft DOS/Windows
        • Uses a table to record disk block information
      • NTFS
        • NTFS (New Technology File System)
        • The file system of the Windows NT line
        • NTFS improved on FAT and replaced the older file system
      • Ext2/3/4
        • Extended File System (ext): the Linux file system; 2/3/4 is the generation number
    • Ext file system layout

      • Boot Sector: the boot sector, where the boot manager is installed
      • Block Group: where the data is stored; each block group contains:
        • SuperBlock
          • Records information about the entire file system
          • Block and inode usage
          • Time information, control information, etc.
        • Inode Bitmap
          • A bitmap of inodes
          • Records which inodes are allocated and which are not
        • Block Bitmap
          • Similar in function to the inode bitmap
          • Records the usage of data blocks
        • Inode Table
          • Where file inodes are stored
          • Each file (or directory) has an inode
          • The inode is the index node of the file (directory)
          • The inode contains:
            • File type, file permissions, physical address, file length, link count, access time, inode number, file status, access count, link pointer, …
            • File names are not stored in the inode but in the directory file's entries
            • Listing a directory's file names therefore does not require loading each file's inode
        • Data Block
          • Data blocks are where file contents are stored
          • Each block has a unique number
          • The blocks belonging to a file are recorded in that file's inode

6. Device management

  • IO devices in the broad sense

    • For the CPU, anything that inputs data to the CPU is an input device
    • For the CPU, anything that receives data output by the CPU is an output device
  • Broad IO device classification

    • By service characteristics
      • Storage devices: USB flash drives, disks
      • Interactive IO devices: keyboard, monitor, mouse
    • By unit of information exchange
      • Block devices: disks, SD cards
      • Character devices: printers, shell terminals
    • By sharing properties
      • Exclusive devices
      • Shared devices
      • Virtual devices
    • By transmission rate
      • Low-speed devices
      • Medium-speed devices
      • High-speed devices
  • IO device buffering

    • Background: the CPU's speed and the I/O devices' speeds are mismatched
    • Buffering reduces how frequently the CPU must handle I/O requests
    • It improves parallelism between the CPU and I/O devices
    • A dedicated buffer applies only to one specific I/O process
    • When there are many such I/O processes, memory consumption is high
    • The operating system therefore sets aside shared buffers, called buffer pools, that multiple processes can use
  • SPOOLing technology

    • A virtual device technology
    • Addresses how slow character devices exchange information with the host computer
    • A high-speed shared device is used to simulate low-speed exclusive devices
    • Logically, each user is assigned what appears to be a separate high-speed exclusive device
    • SPOOLing turns synchronous calls to slow devices into asynchronous calls
    • A queued staging area, the input well and output well (spool areas on disk), is added between input and output
    • SPOOLing schedules transfers between the wells and the low-speed devices
    • Logically, processes interact directly with the high-speed device, reducing process wait time

7. Practice

7.1 Practice of Thread synchronization

  • Mutex
    • Interleaved execution of instructions across two threads causes synchronization problems
    • A mutex guarantees serialized execution
    • Atomicity
      • A sequence of operations either completes entirely or not at all
      • Atomicity is the property that a series of operations cannot be interrupted
      • There is no "partly executed, partly not" state
    • A mutex is a variable in one of two states: unlocked and locked
    • The mutex is the simplest way to synchronize threads
    • The two states guarantee serial access to the resource
    • Developers can use the API directly to lock and unlock resources
    • The operating system provides a mutex API directly
      • C
        • pthread_mutex_lock
        • pthread_mutex_t
        • pthread_mutex_unlock
      • Java
        • synchronized
  • Spinlocks
    • How does a spinlock differ from a mutex?
    • A thread waiting on a spinlock repeatedly checks whether the lock variable is available
    • A spinlock is also a variable used for multi-thread synchronization
    • A spinning thread does not yield the CPU; it busy-waits
    • It loops until the lock is released
    • Spinlocks are used in many places inside the operating system
    • Spinlocks avoid the overhead of process or thread context switches
    • Spinlocks are unsuitable for single-core CPUs because they do not yield the CPU
    • API
      • pthread_spinlock_t
      • pthread_spin_lock
      • pthread_spin_unlock
  • Read-write locks
    • Reading does not change the value of a critical resource
    • Many critical resources are read often and written rarely
    • Is there a more efficient way to synchronize such access?
    • Allowing multiple readers to access the resource simultaneously improves read performance
    • A read-write lock is a special kind of spin lock
    • Writes remain mutually exclusive
    • API
      • pthread_rwlock_t
      • pthread_rwlock_rdlock (acquire a read lock)
      • pthread_rwlock_wrlock (acquire a write lock)
    • Synchronization flow for mutexes, spinlocks, and read-write locks: wait for unlock -> lock -> [critical resource] -> unlock
  • Condition variables
    • Condition variables let a thread sleep until some condition is met
    • Condition variables are a relatively complex thread synchronization method
    • When the condition is met, the sleeping thread can be signaled and woken
    • Producer-consumer problem
      • When the buffer is full, producers may not produce into it and must wait
      • When the buffer is empty, consumers may not consume and must wait
      • When a producer produces an item, it wakes any consumers that might be waiting
      • When a consumer consumes an item, it wakes any producers that might be waiting
    • API
      • pthread_cond_t, used together with a mutex
      • pthread_cond_wait (wait for the condition to be met)
      • pthread_cond_signal (wake up a waiting thread)
    • Condition-variable synchronization flow: wait for unlock -> lock -> wait on the condition until woken -> [critical resource] -> unlock
Synchronization method Description
Mutex The simplest thread synchronization method; blocks the waiting thread
Spinlock Avoids context switches by busy-waiting; does not yield the CPU
Read-write lock Designed for read-mostly resources; can significantly improve performance
Condition variable A relatively complex method with more flexible usage scenarios

7.2 Process Synchronization Practice

  • Creating processes with the fork system call

    • fork creates a child process whose initial state is identical to the parent's
    • The fork system call is used to create processes
    • The system allocates new resources to the forked process
    • The fork system call takes no parameters
    • fork "returns twice": once in the parent and once in the child
    • The parent's call returns the child's process ID; the child's call returns 0
  • Shared memory

    • The threads of a process share that process's resources
    • Processes share the computer's resources
    • In that sense, multiple processes already share physical memory to some extent
    • But because of the operating system's memory management, process address spaces are independent
    • By default, a process cannot access memory outside its own address space
    • Shared memory allows otherwise unrelated processes to access the same piece of physical memory
    • Shared memory is the fastest way to share and transfer data between two processes
    • Shared memory provides no synchronization itself; access must be managed by another mechanism, such as a flag variable controlling whether the region is readable or writable
    • Shared memory is one of the most common process synchronization methods in high-performance backend development
    • Shared memory usage steps
      • Apply for a shared memory segment
      • Attach it to the process address space
      • Use the shared memory
      • Detach it from the process address space & delete the segment
    • Code implementation
  • Unix domain sockets

    • Domain sockets are an advanced method of inter-process communication
    • Unix domain sockets enable communication between processes on the same machine
    • "Socket" was originally a network communication term
    • The domain sockets provided by Unix systems offer functionality similar to network sockets
    • Used by Nginx, uWSGI, etc.
    • Server side
      • Create a socket
      • Bind the socket
      • Listen on the socket
      • Receive & process messages
    • Client side
      • Create a socket
      • Connect the socket
      • Send messages
    • Code implementation
    • Provides simple, reliable inter-process communication and synchronization on a single machine
    • Works only on a single machine, not across machines

8. About me

A programmer focused on computer fundamentals, following a learning philosophy of basics, system, practice, and sharing: improving myself while sharing my experience, and iterating round after round.

Personal website basedev.cn

Github

So far the BaseDev series only organizes outlines of the knowledge without going deep; whoever wishes to know the why should return to books and practice.

9. The reference

Relearning the Operating System | Lagou Education

Fundamentals of network programming | necessary for class