An operating system sits between the hardware and the rest of the software, so operating system development largely follows what the hardware requires in order to deliver its own functionality. Microsoft, for example, develops according to the Intel manuals.

  • Intel Developer Manual official website: software.intel.com/en-us/artic…
  • CPU handbook: www.intel.com/content/www…

Start writing a minimal operating system right away

When a computer boots, it first runs the BIOS program on the motherboard for a self-check, and then executes the code in the main boot sector (the 512 bytes in sector 0 of the primary boot device), whose last two bytes are 0x55 and 0xAA. Based on this principle, you can write a piece of code that runs before any operating system, which we call the smallest operating system. We use the Bochs virtual machine for the experiment and start by writing our minimal “operating system” (its only function is to display Hello OS World!). We then assemble it, write the binary code to sector 0 of a Bochs virtual floppy disk, and set the floppy disk as the primary boot device of the Bochs virtual machine. Our operating system then displays its message.
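
The layout of such a boot sector can be sketched in Python. This is only an illustration of the 512-byte format, not the assembly source of the experiment: the two bytes of machine code (an infinite loop) are a placeholder rather than the Hello OS World! program, and the function names and the 1.44 MB floppy size are assumptions made for the example.

```python
SECTOR_SIZE = 512
FLOPPY_SECTORS = 2880  # assumption: a 1.44 MB floppy image (2880 * 512 bytes)


def make_boot_sector(code: bytes) -> bytes:
    """Pad boot code to one sector and append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("boot code must fit in 510 bytes")
    padding = bytes(SECTOR_SIZE - 2 - len(code))
    # Byte 510 is 0x55 and byte 511 is 0xAA
    # (read as a little-endian 16-bit word this is 0xAA55).
    return code + padding + b"\x55\xAA"


def make_floppy_image(boot_sector: bytes) -> bytes:
    """Place the boot sector in sector 0 and zero-fill the rest of the disk."""
    return boot_sector + bytes((FLOPPY_SECTORS - 1) * SECTOR_SIZE)


# Placeholder machine code: EB FE is "jmp $", an infinite loop.
sector = make_boot_sector(b"\xEB\xFE")
image = make_floppy_image(sector)
```

Writing `image` to a file would give a disk image whose sector 0 carries the signature the BIOS looks for.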

After the BIOS loads the main boot sector code into memory, it jumps to address 0x7C00. Why 0x7C00? The commonly given reason is that early machines had only 32 KB of memory, so the highest memory address was 0x7FFF. The boot sector code was placed near the end of that memory, at 0x7C00 to 0x7DFF, and the last 512 bytes (0x7E00 to 0x7FFF) were left as data storage space for the boot sector. Once the boot sector code has executed, the memory it occupied no longer needs to be kept and can be reused.
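
The arithmetic behind 0x7C00 can be checked in a few lines. The variable names are illustrative, and the 512-byte data area is the figure given in the text above.

```python
KIB = 1024

memory_top = 32 * KIB        # 0x8000, one past the highest address 0x7FFF
boot_sector_size = 512       # the boot sector code itself
data_area_size = 512         # space left for boot sector data (per the text)

load_address = memory_top - boot_sector_size - data_area_size
highest_address = memory_top - 1
```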

The following figure shows the minimum operating system experimental interface:

Protected mode

Protected mode is defined relative to real mode. The minimal operating system above runs in real mode, where memory is addressed as [segment address: offset address] and the physical address is segment * 16 + offset. The problem with this approach is that any program can access any memory, which is extremely insecure. Protected mode was therefore introduced with the Intel 80286, offering a larger addressing range and safer memory access. In protected mode, addressing changes to [selector: offset address]. The selector indexes into the GDT (Global Descriptor Table), and its lowest two bits indicate the privilege level. The GDT is a table of descriptors, each describing information such as a segment's base address, so a descriptor plays the role that the segment address plays in real mode.
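
The two addressing schemes can be compared with a small Python model. It is a sketch under simplifying assumptions: the GDT here is a plain list of dictionaries rather than the packed 8-byte descriptors the CPU uses, and the names `TOY_GDT`, `decode_selector`, and `protected_mode_address` are made up for the example.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Real mode: physical address = segment * 16 + offset."""
    return (segment << 4) + offset


def decode_selector(selector: int) -> dict:
    """Split a 16-bit selector into its three fields."""
    return {
        "index": selector >> 3,       # which descriptor in the table
        "ti": (selector >> 2) & 0x1,  # table indicator: 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,        # lowest two bits: requested privilege level
    }


# Toy GDT (assumption): entry 0 is the null descriptor, entry 1 is one segment.
TOY_GDT = [None, {"base": 0x00100000, "limit": 0xFFFF}]


def protected_mode_address(selector: int, offset: int, gdt=TOY_GDT) -> int:
    """Protected mode: look up the descriptor, check the limit, add the offset."""
    descriptor = gdt[decode_selector(selector)["index"]]
    if descriptor is None:
        raise ValueError("selector refers to the null descriptor")
    if offset > descriptor["limit"]:
        raise ValueError("offset exceeds the segment limit")
    return descriptor["base"] + offset
```

For instance, `real_mode_address(0x07C0, 0)` gives 0x7C00, while `protected_mode_address(0x08, 0x10)` goes through GDT entry 1 and gives 0x00100010. The limit check is what real mode lacks.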

To jump from real mode to protected mode, set bit 0 of CR0 (the PE bit) to 1, after the GDT has been loaded with lgdt. When bit 0 of CR0 is 1, the CPU is in protected mode. Then JMP to SelectorCode32:0. Here SelectorCode32 is interpreted as a selector rather than a segment address, so the CPU resolves it through the GDT.

The following figure shows the experimental interface after jumping from real mode to protected mode (there is a red P just right of the center):

LDT

The LDT and the GDT are similar: both are descriptor tables, and both store descriptors. A descriptor stored in the GDT can describe a segment directly, in which case the memory address is calculated from it, or it can describe an LDT, in which case it gives the location of that LDT. The LDT in turn holds descriptors, each pointing to a specific segment.
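
A simplified model of this two-step lookup is shown below. It assumes that bit 2 of the selector (the table indicator) chooses between the GDT and the current LDT, and that an LDTR value selects which GDT entry describes that LDT. The classes and the `resolve` function are hypothetical, and the LDT descriptor holds its entries directly instead of a base address and limit in memory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentDescriptor:
    base: int
    limit: int


@dataclass
class LdtDescriptor:
    # Simplification: hold the LDT entries directly instead of a base/limit.
    entries: List[SegmentDescriptor]


def resolve(selector: int, offset: int, gdt: list, ldtr: int) -> int:
    """Translate selector:offset using the GDT or the LDT named by ldtr."""
    index = selector >> 3
    use_ldt = (selector >> 2) & 0x1
    if use_ldt:
        ldt_descriptor = gdt[ldtr >> 3]  # first step: find the LDT via the GDT
        if not isinstance(ldt_descriptor, LdtDescriptor):
            raise TypeError("LDTR does not point at an LDT descriptor")
        table = ldt_descriptor.entries
    else:
        table = gdt
    segment = table[index]               # second step: find the segment
    if not isinstance(segment, SegmentDescriptor):
        raise TypeError("selector does not point at a segment descriptor")
    if offset > segment.limit:
        raise ValueError("offset exceeds the segment limit")
    return segment.base + offset


gdt = [
    None,                                            # entry 0: null descriptor
    SegmentDescriptor(base=0x1000, limit=0xFFF),     # entry 1: ordinary segment
    LdtDescriptor([SegmentDescriptor(0x8000, 0xFF)]),  # entry 2: describes an LDT
]
ldtr = 2 << 3                                        # selector of GDT entry 2

via_gdt = resolve(1 << 3, 0x10, gdt, ldtr)           # TI = 0: use the GDT
via_ldt = resolve((0 << 3) | 0b100, 0x10, gdt, ldtr)  # TI = 1: use the LDT
```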

Privilege transfer

In protected mode, different code segments have different privilege levels. These are enforced through CPL, RPL, DPL, and conforming code segments, which are recorded in selectors and descriptors. To transfer control between code at different privilege levels, you can use, for example, a gate descriptor or the TSS.
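
The basic checks can be written down as two small predicates. This is a sketch of the rules for data access and for direct jumps only (smaller numbers mean more privilege); transfers through gates and the TSS are not modeled, and the function names are illustrative.

```python
def can_access_data(cpl: int, rpl: int, dpl: int) -> bool:
    """A data segment is accessible when both CPL and RPL are at least
    as privileged as the segment's DPL."""
    return max(cpl, rpl) <= dpl


def can_jump_to_code(cpl: int, rpl: int, dpl: int, conforming: bool) -> bool:
    """Direct jmp/call to a code segment, without going through a gate."""
    if conforming:
        # Conforming code may be entered from equal or less privileged code.
        return dpl <= cpl
    # Non-conforming code requires exactly the same privilege level.
    return cpl == dpl and rpl <= cpl
```

For example, ring 3 code cannot read a ring 0 data segment (`can_access_data(3, 3, 0)` is false), and it cannot jump directly into non-conforming ring 0 code, which is the case gates exist for.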

Paging

Memory paging is the basis of virtual memory. Almost all modern computers manage memory with paging rather than segmentation. The difference between the two is that segmentation converts logical addresses into linear addresses, and paging converts linear addresses into physical addresses. If paging is not enabled, the linear address produced by segmentation is used directly as the physical address.

Memory paging generally uses multiple levels, which reduces the size of the page tables and saves memory; here we discuss two-level paging. The first level is the page directory, whose entries (PDEs) each specify the base address of a page table. The second level is the page table, whose entries (PTEs) each specify the base address of a page. The control register CR3 holds the base address of the page directory. PDEs and PTEs also contain permission information.

Paging works as follows. Take mov ax, [0x1234] as an example, with 0x1234 treated as a 32-bit linear address. The CPU reads the base address of the page directory from CR3, then uses the highest 10 bits of the address (bits 31 to 22) as an index into the page directory to obtain a PDE. The PDE contains the base address of a page table, and the next 10 bits of the address (bits 21 to 12) index into that page table to obtain a PTE, which contains the physical page base address. Adding the lowest 12 bits of the address, the page offset, to the physical page base gives the physical address.
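
The walk can be traced with a small Python model of 32-bit two-level paging with 4 KB pages. Physical memory is represented by a dictionary from the physical address of each 4-byte entry to its value; the concrete addresses (page directory at 0x100000, page table at 0x101000, page frame at 0x200000) are made up for the example.

```python
PRESENT = 0x1            # bit 0 of a PDE or PTE: entry is present
FRAME_MASK = 0xFFFFF000  # upper 20 bits: base address of a table or page


def split_linear(address: int):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    return (address >> 22) & 0x3FF, (address >> 12) & 0x3FF, address & 0xFFF


def translate(linear: int, cr3: int, memory: dict) -> int:
    """Walk the page directory and page table to find the physical address."""
    dir_index, table_index, offset = split_linear(linear)
    pde = memory.get((cr3 & FRAME_MASK) + dir_index * 4, 0)
    if not pde & PRESENT:
        raise LookupError("page table not present")
    pte = memory.get((pde & FRAME_MASK) + table_index * 4, 0)
    if not pte & PRESENT:
        raise LookupError("page not present")
    return (pte & FRAME_MASK) | offset


cr3 = 0x00100000
memory = {
    0x00100000: 0x00101000 | PRESENT,  # PDE 0 -> page table at 0x101000
    0x00101004: 0x00200000 | PRESENT,  # PTE 1 -> page frame at 0x200000
}
indices = split_linear(0x1234)
physical = translate(0x1234, cr3, memory)
```

For the address 0x1234, the directory index is 0, the table index is 1, and the offset is 0x234, so with these made-up tables the access lands at physical address 0x200234.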

Implementation of virtual memory

When we write assembly code, it operates on memory addresses such as 0x1234. Yet when two copies of the same code run at the same time, both operate on address 0x1234 without corrupting each other's data. This is due to virtual memory: the memory address is the same, but the page directory is different, so different physical addresses are accessed.

Here, the same code is called (with the same address), but because the page directory is switched, the physical address is different.
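
The effect of switching the page directory can be illustrated with a reduced model in which each page directory is a nested dictionary (directory index, then table index, then page frame base). The two directories and their frame addresses are invented for the example and stand in for two processes running the same code.

```python
def translate(linear: int, page_directory: dict) -> int:
    """Two-level lookup using nested dictionaries instead of real tables."""
    dir_index = (linear >> 22) & 0x3FF
    table_index = (linear >> 12) & 0x3FF
    offset = linear & 0xFFF
    frame_base = page_directory[dir_index][table_index]
    return frame_base + offset


# Hypothetical page directories for two copies of the same program.
page_directory_a = {0: {1: 0x00200000}}
page_directory_b = {0: {1: 0x00300000}}

same_address = 0x1234
physical_a = translate(same_address, page_directory_a)
physical_b = translate(same_address, page_directory_b)
```

Both lookups use the address 0x1234, but the first resolves to 0x200234 and the second to 0x300234, so a write by one copy does not touch the other copy's data.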

Interrupts

Interrupts in protected mode are completely different from those in real mode. Protected mode uses the IDT (Interrupt Descriptor Table), which, like the GDT, is a table of descriptors. Descriptors in the IDT are of three types: interrupt gate descriptors, trap gate descriptors, and task gate descriptors. A gate descriptor works like a door that has to be opened: the processor verifies that the required privilege is met before the handler is allowed to run.
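
As an illustration, an 8-byte 32-bit gate descriptor can be packed with the struct module. The field layout follows the IA-32 gate format (handler offset split into low and high halves around the selector and attribute byte); the handler address and selector in the example are arbitrary, and `make_gate` is a name chosen for this sketch.

```python
import struct

# Type field values for 32-bit gates in the IDT.
GATE_TYPES = {"task": 0x5, "interrupt": 0xE, "trap": 0xF}


def make_gate(offset: int, selector: int, gate_type: str, dpl: int = 0) -> bytes:
    """Pack one IDT entry: offset low, selector, reserved, attributes, offset high."""
    # Attribute byte: P (present) in bit 7, DPL in bits 6-5, type in bits 3-0.
    attributes = 0x80 | ((dpl & 0x3) << 5) | GATE_TYPES[gate_type]
    return struct.pack(
        "<HHBBH",
        offset & 0xFFFF,          # handler offset, bits 15..0
        selector,                 # code segment selector of the handler
        0,                        # reserved byte
        attributes,
        (offset >> 16) & 0xFFFF,  # handler offset, bits 31..16
    )


# Arbitrary example: handler at 0x12345678 in the segment selected by 0x08.
gate = make_gate(0x12345678, 0x08, "interrupt")
```

The DPL stored in the attribute byte is what the processor compares against the caller's privilege before letting a software interrupt through the gate.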

Put the operating system into protected mode

An intuitive idea: since booting executes the code in the main boot sector, couldn’t we simply load our operating system from the main boot sector? Don’t forget that the main boot sector is only 512 bytes, which is probably not enough to hold code that loads an operating system. We therefore need an intermediate module called the Loader. The main boot sector loads the Loader, and the Loader loads the operating system, so the 512-byte limit no longer applies.

Process

Process scheduling requires switching between processes. The state of a process consists of its registers and its memory. Modern operating systems use virtual memory, so each process has its own independent address space and there is no need to save the memory state. Only the register state therefore needs to be saved during a process switch. Instructions such as pushad and pusha save the general-purpose registers in one batch.
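
A process switch that saves only registers can be modeled as follows. The CPU is a dictionary of the eight general-purpose registers that pushad saves; the `Process` class and `switch` function are hypothetical, and a real kernel would also save segment registers, EIP, and EFLAGS and would switch the page directory.

```python
from dataclasses import dataclass, field

# The eight general-purpose registers, in the order pushad pushes them.
REGISTERS = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")


@dataclass
class Process:
    name: str
    saved: dict = field(default_factory=lambda: dict.fromkeys(REGISTERS, 0))


def switch(cpu: dict, outgoing: Process, incoming: Process) -> None:
    """Save the CPU registers into `outgoing`, then load those of `incoming`."""
    outgoing.saved = dict(cpu)
    cpu.clear()
    cpu.update(incoming.saved)


proc_a = Process("A")
proc_b = Process("B")
cpu = dict(proc_a.saved)      # process A is running

cpu["eax"] = 1                # A computes something
switch(cpu, proc_a, proc_b)   # B now runs with its own register values
eax_seen_by_b = cpu["eax"]
cpu["eax"] = 2                # B computes something else
switch(cpu, proc_b, proc_a)   # back to A
eax_seen_by_a = cpu["eax"]
```

After the second switch, process A sees its own value of eax again, even though process B used the same register in between.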

Classification of assembly

Machine instruction sets can be divided into complex instruction sets (CISC) and reduced instruction sets (RISC), which are two completely different designs. Intel (CISC) and ARM (RISC) are representative of the two kinds of instruction set. As Intel CPUs have become more powerful, the assembly language has grown with them: 8086 assembly (16-bit instructions), x86 assembly (32-bit), and x64 assembly (64-bit). There are also different assemblers, such as NASM and MASM, whose syntax differs in much the same way that high-level languages such as C and Java differ from each other.

Input/output system

Use interrupts to respond

Interprocess communication (IPC)

Microkernels and monolithic kernels. Minix is a microkernel design, while Linux is a monolithic kernel. The microkernel idea is to make the kernel do as little as possible and focus on process scheduling, leaving everything else, such as I/O and memory management, to system processes that communicate with other processes through messages. A monolithic kernel includes I/O and the rest inside the kernel itself, which avoids frequent switches between user mode and kernel mode.
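
The message-passing idea can be illustrated with a toy model in which the kernel only delivers messages and a separate file-system server does the actual work. All names here (`MicroKernel`, `fs_server_step`, the "fs" and "app" mailboxes, the file path) are invented for the sketch and do not correspond to any real kernel's API.

```python
from collections import deque


class MicroKernel:
    """Toy kernel whose only service is delivering messages between processes."""

    def __init__(self):
        self.mailboxes = {}

    def register(self, name: str) -> None:
        self.mailboxes[name] = deque()

    def send(self, sender: str, receiver: str, payload) -> None:
        self.mailboxes[receiver].append((sender, payload))

    def receive(self, name: str):
        box = self.mailboxes[name]
        return box.popleft() if box else None


def fs_server_step(kernel: MicroKernel, files: dict) -> None:
    """One iteration of a file-system server running as an ordinary process."""
    message = kernel.receive("fs")
    if message is None:
        return
    sender, (operation, path) = message
    if operation == "read":
        kernel.send("fs", sender, files.get(path))


kernel = MicroKernel()
kernel.register("fs")
kernel.register("app")

kernel.send("app", "fs", ("read", "/hello.txt"))           # request from the app
fs_server_step(kernel, {"/hello.txt": "Hello OS World!"})  # server handles it
reply = kernel.receive("app")                              # app gets the reply
```

In a monolithic design the read would be a single call into the kernel; here it takes two messages, which is the extra switching cost the text refers to.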

The file system

Hard disk drives

Memory management

fork() function