One: Preface

Learning about operating systems can be tedious: you have to read assembly, read the Intel manuals, and so on. But it is also fun, at least when the code I lost hair over all night finally runs on a physical machine.

Since last year I have been reading, off and on, several books about operating systems. Because I do not work on kernels professionally (and certainly do not have that ability), I tend to forget what I read soon afterward. As the saying goes, a good memory is no match for a poor pen, so I am organizing what I learned into these notes. If there are mistakes, corrections are welcome.

Two: Operating modes

When you press the power button, the BIOS program stored in ROM quietly starts running. Once it has finished the power-on self-test and other initialization, it looks on the disk for a boot program, which in turn loads the kernel. How does it find it? Nothing mysterious: it is a hard-coded convention. The BIOS checks whether the first sector of the disk (head 0, track 0, sector 1) ends with the two bytes 0x55 and 0xAA. If so, it treats that sector as the boot program, copies it to physical address 0x7C00, and jumps there to start executing. So all we have to do is place the compiled boot program in that sector and let the BIOS find it. Once control of the CPU has passed to our boot program, we are in the real mode described below.
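The signature check can be sketched in a few lines of Python. This is only an illustration of the convention, not how any BIOS is implemented; the names `SECTOR_SIZE` and `has_boot_signature` are made up for this example.

```python
SECTOR_SIZE = 512  # size in bytes of one legacy disk sector


def has_boot_signature(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == b"\x55\xAA"


# A fake boot sector: 510 filler bytes followed by the signature.
boot_sector = bytes(510) + b"\x55\xAA"
# A sector full of zeros has no signature.
blank_sector = bytes(SECTOR_SIZE)

if __name__ == "__main__":
    print(has_boot_signature(boot_sector))
    print(has_boot_signature(blank_sector))
```

The same check is handy when you build your own boot image: if the last two bytes of the first sector are wrong, the BIOS will simply refuse to boot from it.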

2.1 Real mode

In the early days there was no such term as real mode, because programmers did not expect there to be more than one mode. The name was only coined later, to distinguish it from the modes that followed.

To talk about real mode we have to talk about memory addressing, so let’s review the assembly class from college.

Our code is divided into a code segment, a data segment, and so on. Every memory access uses an address of the form segment base address : offset within the segment. An address in this form is called a logical address. Why so complicated? Because the registers of the early 8086 processor were only 16 bits wide, and a 16-bit register can address only 64 KB, while the 8086 has 20 address lines, which is enough to address 1 MB. A single 16-bit register clearly cannot cover that range. To solve this, the clever designers came up with the idea of extending the address space with the segment base address : offset scheme.

Physical address = (Segment base address << 4) + Segment offset

Before accessing memory, the CPU runs the address through its segmentation unit, which computes the physical address with the formula above. In this way two 16-bit registers together produce a 20-bit address.
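As a quick worked example, the conversion can be written out in Python. This is a minimal sketch; the function name is made up for illustration, and the 20-bit wrap-around models the original 8086 with its 20 address lines (later processors behave differently once the A20 line is enabled).

```python
ADDRESS_MASK_20_BIT = 0xFFFFF  # the 8086 has 20 address lines


def real_mode_physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment value and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the low 20 bits, as the 8086 address bus would.
    return ((segment << 4) + offset) & ADDRESS_MASK_20_BIT


if __name__ == "__main__":
    # Two different logical addresses that name the same physical byte.
    print(hex(real_mode_physical_address(0x07C0, 0x0000)))
    print(hex(real_mode_physical_address(0x0000, 0x7C00)))
```

Note that many different segment : offset pairs map to the same physical address, which is why boot code may see the load address written as either 0x07C0:0x0000 or 0x0000:0x7C00.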

In real mode, a segment register directly stores the segment base address (more precisely, its upper 16 bits). For example, the CS:IP register pair holds the address of the instruction the CPU is about to execute.

So what is wrong with real mode? First of all, it is not secure. A program can access any physical address at will, like strolling through a vegetable market. To stop misbehaving programs from wandering around, protected mode was born!

2.2 Protected mode

As time went on, CPUs evolved too. Processor manufacturers developed 32-bit registers to meet ever-increasing memory requirements. Protected mode itself first appeared on the 80286, and the 32-bit form discussed here arrived with the 80386.

In protected mode, it is important to note that a segment register no longer stores the segment base address directly. It stores a segment selector instead.

Segment selector? Let’s keep it simple!

Setting aside the privilege bits and so on, you can treat it as an index. An index into what? Into a table of segment descriptors.
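To make the "index plus a few extra bits" idea concrete, here is a small Python sketch that splits a 16-bit selector into its fields: the upper 13 bits are the index, bit 2 says which table to look in, and the lowest two bits carry a requested privilege level. The `Selector` type and `decode_selector` function are names invented for this example.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # which descriptor in the table
    table: str  # "GDT" or "LDT"
    rpl: int    # requested privilege level, 0 to 3


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into index, table indicator, and RPL."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return Selector(
        index=value >> 3,                         # bits 3-15
        table="LDT" if value & 0b100 else "GDT",  # bit 2
        rpl=value & 0b11,                         # bits 0-1
    )


if __name__ == "__main__":
    print(decode_selector(0x08))
    print(decode_selector(0x1B))
```

Because the low three bits are not part of the index, selectors for consecutive GDT entries at privilege level 0 are 0x08, 0x10, 0x18, and so on.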

A segment descriptor, as the name implies, describes a segment. It is 64 bits long: 32 bits hold the segment base address, and the remaining 32 bits hold the 20-bit segment limit plus attribute bits such as the segment type and privilege level.
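The base and limit are not stored contiguously inside the 64 bits; they are scattered for historical reasons. The following Python sketch packs and unpacks a descriptor according to the layout in the Intel manual. The helper names are invented for this example, and the `access` and `flags` values are passed in as raw bit patterns rather than explained field by field.

```python
def make_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack a 32-bit base, 20-bit limit, 8-bit access byte and 4 flag bits."""
    desc = limit & 0xFFFF                   # bits 0-15:  limit bits 0-15
    desc |= (base & 0xFFFFFF) << 16         # bits 16-39: base bits 0-23
    desc |= (access & 0xFF) << 40           # bits 40-47: access byte (type, DPL, present)
    desc |= ((limit >> 16) & 0xF) << 48     # bits 48-51: limit bits 16-19
    desc |= (flags & 0xF) << 52             # bits 52-55: flags (granularity, size, ...)
    desc |= ((base >> 24) & 0xFF) << 56     # bits 56-63: base bits 24-31
    return desc


def descriptor_base(desc: int) -> int:
    """Reassemble the 32-bit base from its two pieces."""
    return ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)


def descriptor_limit(desc: int) -> int:
    """Reassemble the 20-bit limit from its two pieces."""
    return (desc & 0xFFFF) | (((desc >> 48) & 0xF) << 16)


def descriptor_dpl(desc: int) -> int:
    """Extract the 2-bit descriptor privilege level from the access byte."""
    return (desc >> 45) & 0b11


if __name__ == "__main__":
    # A flat 4 GB ring-0 code segment, a value often seen in tutorials.
    print(hex(make_descriptor(0, 0xFFFFF, 0x9A, 0xC)))
```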

So where are the segment descriptors stored? In the GDT, the global descriptor table, which holds the segment descriptors. There is also the LDT, the local descriptor table, which I am not going to cover here.

OK, so if I give the CPU the index (the segment selector), how does it know where to find the GDT? You have to tell the CPU the address of the GDT in advance:

LGDT [GDT address]

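The operand of LGDT is the address of a 6-byte structure in memory that holds the size of the table (as a 16-bit limit) and its 32-bit base address. Once the instruction has executed, the CPU keeps both values in the GDTR register.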
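As a rough illustration of what sits at the address handed to LGDT in 32-bit code, the following Python sketch builds the 6-byte value: a 16-bit limit (table size in bytes minus one) followed by a 32-bit base, both little-endian. The function name and the example base address 0x7E00 are assumptions made for this sketch.

```python
import struct

DESCRIPTOR_SIZE = 8  # each segment descriptor is 64 bits


def gdtr_operand(gdt_base: int, entry_count: int) -> bytes:
    """Build the 6-byte limit-and-base value that LGDT reads from memory."""
    if not 1 <= entry_count <= 8192:
        raise ValueError("a GDT holds between 1 and 8192 descriptors")
    limit = entry_count * DESCRIPTOR_SIZE - 1  # size in bytes, minus one
    # "<" = little-endian without padding, "H" = 16-bit limit, "I" = 32-bit base.
    return struct.pack("<HI", limit, gdt_base)


if __name__ == "__main__":
    # A table with three entries (null, code, data) at a hypothetical address.
    print(gdtr_operand(0x7E00, 3).hex())
```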

With these terms in mind, let’s walk through how addressing works in protected mode:

  1. The segment register holds a segment selector.
  2. The CPU uses the segment selector to find the corresponding segment descriptor in the GDT.
  3. It reads the segment base address from the segment descriptor.
  4. It adds the offset to the segment base address to obtain the physical address. Unlike in real mode, the base is not shifted left by 4, because the descriptor already holds a full 32-bit base address.

The steps above describe addressing in protected mode when paging is not enabled.

If paging is enabled, the address computed in step 4 is a linear address, which the paging unit must then translate into a physical address.
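The lookup steps above can be sketched as a toy model in Python. Here the GDT is just a list of (base, limit) pairs rather than packed 64-bit descriptors, privilege checks are left out, and `TOY_GDT` and `logical_to_linear` are names invented for this example.

```python
# A toy GDT: entry 0 is the unusable null descriptor, entry 1 is a 64 KB
# segment starting at 1 MB, entry 2 is a flat segment covering 4 GB.
TOY_GDT = [
    (0x00000000, 0x00000000),
    (0x00100000, 0x0000FFFF),
    (0x00000000, 0xFFFFFFFF),
]


def logical_to_linear(gdt, selector: int, offset: int) -> int:
    """Translate selector:offset into a linear address using a toy GDT."""
    index = selector >> 3  # drop the table-indicator and RPL bits
    if index == 0 or index >= len(gdt):
        raise ValueError("selector does not name a usable descriptor")
    base, limit = gdt[index]
    if offset > limit:
        raise ValueError("offset is beyond the segment limit")
    # Plain addition: no shift by 4 as in real mode.
    return (base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(logical_to_linear(TOY_GDT, 0x08, 0x1234)))
    print(hex(logical_to_linear(TOY_GDT, 0x10, 0xDEADBEEF)))
```

With paging disabled, the returned value is already the physical address; with paging enabled, it would be the input to the page-table walk.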

How do we turn on protected mode? Again, nothing mysterious: set the PE flag (bit 0) in the CR0 control register. Besides flipping that switch, you need to prepare the data that protected mode depends on, such as the global descriptor table described above, and then perform a far jump through one of the code segment selectors you have set up. That jump completes the transition from real mode to protected mode.
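The flag manipulation itself is a single bit operation. The Python sketch below only models the value of CR0 as an integer; on real hardware the register is read and written with privileged `mov` instructions. The constant and function names are invented for this example.

```python
CR0_PE = 1 << 0   # Protection Enable: switches on protected mode
CR0_PG = 1 << 31  # Paging: switches on the paging unit


def enable_protected_mode(cr0: int) -> int:
    """Return the CR0 value with the PE bit set, leaving other bits untouched."""
    return cr0 | CR0_PE


def is_protected_mode(cr0: int) -> bool:
    """Report whether the PE bit is set in a CR0 value."""
    return bool(cr0 & CR0_PE)


def is_paging_enabled(cr0: int) -> bool:
    """Report whether the PG bit is set in a CR0 value."""
    return bool(cr0 & CR0_PG)


if __name__ == "__main__":
    cr0 = 0x00000010  # an example starting value
    print(hex(enable_protected_mode(cr0)))
```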

2.3 IA-32e mode

32-bit registers could already address 4 GB of memory, but that still was not enough, so 64-bit registers were developed, of which 48 bits are used for addressing. That is plenty, at least for now.
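The sizes mentioned so far all follow from the same rule: n address bits can name 2^n bytes. A short Python check, with function names invented for this example:

```python
def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses that fit in the given number of bits."""
    return 1 << address_bits


def format_size(size_in_bytes: int) -> str:
    """Format a power-of-two byte count using binary units (1 KB = 1024 B)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    unit_index = 0
    while size_in_bytes >= 1024 and unit_index < len(units) - 1:
        size_in_bytes //= 1024
        unit_index += 1
    return f"{size_in_bytes} {units[unit_index]}"


if __name__ == "__main__":
    # 16-bit register, 8086 address bus, 32-bit register, 48-bit addressing.
    for bits in (16, 20, 32, 48):
        print(bits, "bits:", format_size(addressable_bytes(bits)))
```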

Along with 64-bit processors came a new mode: IA-32e.

IA-32e mode is built on protected mode, so addressing still goes through segment selectors, segment descriptors, and so on. Unlike in 32-bit protected mode, however, in 64-bit code the base address of the code and data segments is treated as 0 (FS and GS are the exceptions) and the segment limit is not checked, so each segment covers the entire address space. Paging is always enabled in this mode, and the offset within the segment is directly equal to the linear address, with no formula needed.

Three: Privilege levels

Besides the changes to addressing, protected mode also introduces a new concept: privilege levels.

There are four privilege levels. Level 0 is the most privileged and is where kernel code runs; level 3 is the least privileged and is where user programs run.

The privilege level required to access a segment is recorded in its segment descriptor. To access a segment, a program first builds a segment selector. The lowest two bits of the selector hold the privilege level with which the program requests access to the target segment, called the RPL (Requested Privilege Level). Generally RPL = CPL, where the CPL (Current Privilege Level) is the privilege level of the code that is currently running. The CPL is held in the lowest two bits of the CS register (because CS holds the segment selector of the current code segment).

The privilege level of the target segment is called the DPL (Descriptor Privilege Level). When a program accesses a data segment, the access is denied if the DPL is more privileged (numerically smaller) than either the CPL or the RPL, which protects privileged code and data from less privileged programs.
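For data segments, the rule boils down to one comparison: the less privileged of CPL and RPL must still be at least as privileged as the DPL. A minimal Python sketch, with a function name invented for this example (code segments, gates, and stack segments follow different rules that are not modeled here):

```python
def can_access_data_segment(cpl: int, rpl: int, dpl: int) -> bool:
    """Apply the data-segment privilege check: max(CPL, RPL) <= DPL."""
    for level in (cpl, rpl, dpl):
        if level not in range(4):
            raise ValueError("privilege levels run from 0 to 3")
    # Numerically larger means less privileged, so take the weaker of
    # CPL and RPL and compare it with what the segment demands.
    return max(cpl, rpl) <= dpl


if __name__ == "__main__":
    print(can_access_data_segment(cpl=3, rpl=3, dpl=0))  # user code, kernel data
    print(can_access_data_segment(cpl=0, rpl=0, dpl=3))  # kernel code, user data
```

Taking the maximum explains why the RPL exists at all: kernel code (CPL 0) acting on behalf of a user program can put RPL 3 in the selector, so that it cannot be tricked into touching a segment the user program could not reach on its own.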

Four: Summary

In short, operating modes are the product of operating systems and processors developing together and complementing each other. Most of us are not kernel developers, but understanding the operating modes helps us understand how the operating system works underneath. After all, that is part of a programmer’s self-improvement :)

Five: Reference books

Operating System True Restore

Orange’s: The Implementation of an Operating System

Design and Implementation of a 64-bit Operating System

Incidentally, I also recommend several books on assembly and compilers:

Assembly Language

x86 Assembly Language: From Real Mode to Protected Mode

Modern Compilation Principles

Programmer Self-cultivation: Linking, Loading, and Libraries