Series directory

  • Introduction
  • The preparatory work
  • BIOS starts in real mode
  • GDT and Protected Mode
  • Exploration of Virtual Memory
  • Load and enter the kernel
  • Display and print
  • The global descriptor table GDT
  • Interrupt handling
  • Virtual memory improvement
  • Implement heap and malloc
  • The first kernel thread
  • Multithreading switch
  • Locks and multithreading synchronization
  • Implementation of a process
  • Enter user mode
  • A simple file system
  • Load the executable program
  • Implementation of system calls
  • The keyboard driver
  • Running a shell

From the MBR to the loader

This article continues from "BIOS starts in real mode" and begins writing the loader. Let's start by reviewing the disk image and memory layout:

For now we only need to look at memory below 1 MB, mainly the yellow MBR region and the blue loader region. In the previous article the MBR was loaded into memory, and its last instruction, jmp LOADER_BASE_ADDR (0x8000), jumps to the loader's entry point. Now we need to implement the loader.

What the loader does

In general, the main work of a loader is as follows:

  • Set up the GDT (Global Descriptor Table), initialize the kernel code and data segment registers, and switch the CPU into protected mode;
  • Set up the kernel page directory and page tables, enable virtual memory, and enter paging mode;
  • Load the kernel image into memory, then jump into the kernel code, handing control of the system over to the kernel;

As you can see, the loader does a lot of work and touches some core parts of the x86 architecture, so to understand and implement it you should be familiar with the following:

  • GDT, segmented memory addressing, segment registers, protected mode;
  • Virtual memory, page directory, page tables;
  • The ELF file format, because the kernel will be compiled and linked into a file of that format;

Loader implementation

As before, here is the link to my project code, src/boot/loader.s, for your reference.

The source file is fairly long, especially since it is written in assembly, and it also contains many utility and print-related functions. To avoid confusion, here are only the key functions, which correspond to the loader's tasks described above:

# init the GDT and enter protection mode
protection_mode_entry
# init the kernel page directory and the page tables
setup_page
# load and enter the kernel
init_kernel

Let's implement these functions one by one. In this article, we initialize the GDT and enter 32-bit protected mode.

Enter the loader

Before we start, let's look at the beginning of the loader code. As with the MBR, we first declare the loader's starting memory address, 0x8000. We settled on this earlier: the MBR loads the loader from disk to address 0x8000 and jumps there, so all addresses in the loader must be computed relative to that starting address.

; LOADER_BASE_ADDR = 0x8000
SECTION loader vstart=LOADER_BASE_ADDR

The first instruction of the loader is jmp loader_start. It is a simple jump to loader_start, where the real work of the loader begins:

loader_entry:
  jmp loader_start

; global data
; ...

loader_start:
  call clear_screen
  call setup_protection_mode

If you are not familiar with this style of assembly, you may wonder why the jmp is needed and what is being skipped. The skipped region holds the data we define, similar to global variables in a .c file: a number of strings used for printing and, crucially, the GDT.

As you are probably aware, instructions and data can be freely intermixed in assembly source, and the assembled binary lays them out in exactly the same order as the source. You can therefore place instructions and data wherever you like, as long as control flow never runs into data and tries to execute it. The one fixed point is the very start of the loader, 0x8000, which must be entry code because that is the jump target agreed with the MBR. Everything after it can be arranged freely.

Initialize the GDT

Looking at the global data section above, you can skip the strings I added for printing and go straight to the definition of the GDT. Four GDT entries are defined, each 8 bytes (64 bits). For the meaning and field layout of GDT entries, you can read these tutorials, as well as JamesM's kernel development tutorials, which I recommended earlier. This is historical baggage of the x86 architecture and I won't spend time explaining it again, but our code must still follow its rules.

The first entry in the GDT is reserved and must be left unused (the null descriptor). The fourth is the video segment descriptor for the display; it is not required, so you can ignore it. That leaves the second and third entries to focus on:

  • Kernel code segment descriptor;
  • Kernel data segment descriptor;

We define these two segment descriptors with dd directives:

CODE_DESC:
  dd DESC_CODE_LOW_32
  dd DESC_CODE_HIGH_32

DATA_DESC:
  dd DESC_DATA_LOW_32
  dd DESC_DATA_HIGH_32

DESC_CODE_LOW_32, DESC_CODE_HIGH_32, DESC_DATA_LOW_32, and DESC_DATA_HIGH_32 are defined in src/boot/boot.inc. You can check each bit against the documentation mentioned above. This is tedious, meticulous, but unavoidable work; it is not difficult, it just takes patience with the manual.
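To make the bit checking less abstract, here is a minimal sketch in C (rather than assembly, so it can be compiled and run on an ordinary host) of how the low and high 32-bit halves of a segment descriptor are packed. The function names gdt_desc_low and gdt_desc_high are hypothetical and do not come from the project; the actual constants in boot.inc are not reproduced here and may use different field values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the project. */

/* Low 32 bits of a descriptor:
 *   bits 31..16 = base[15:0], bits 15..0 = limit[15:0] */
static uint32_t gdt_desc_low(uint32_t base, uint32_t limit) {
    assert(limit <= 0xFFFFFu);            /* the limit field is 20 bits wide */
    return ((base & 0xFFFFu) << 16) | (limit & 0xFFFFu);
}

/* High 32 bits of a descriptor:
 *   bits 31..24 = base[31:24]
 *   bits 23..20 = flags nibble (G, D/B, L, AVL from high bit to low bit)
 *   bits 19..16 = limit[19:16]
 *   bits 15..8  = access byte (P, DPL, S, type)
 *   bits  7..0  = base[23:16] */
static uint32_t gdt_desc_high(uint32_t base, uint32_t limit,
                              uint8_t access, uint8_t flags) {
    assert(limit <= 0xFFFFFu);
    assert(flags <= 0xFu);
    return (base & 0xFF000000u)
         | ((uint32_t)flags << 20)
         | (limit & 0x000F0000u)
         | ((uint32_t)access << 8)
         | ((base >> 16) & 0xFFu);
}
```

As an assumed example, a flat ring-0 code segment with base 0, limit 0xFFFFF, access byte 0x9A (present, executable, readable) and flags 0xC (4 KiB granularity, 32-bit) packs to 0x0000FFFF for the low half and 0x00CF9A00 for the high half; the matching data segment with access byte 0x92 gives a high half of 0x00CF9200.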


For those not familiar with assembly, the dd directive deserves a word. dd stands for "define doubleword" (4 bytes); db (byte) and dw (word, 2 bytes) work the same way. Wherever one of these appears in the source, the assembler emits the given data at that exact position in the output binary. This again shows how close assembly source is to the resulting binary: it is almost a literal translation.
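The effect of db, dw and dd can be imitated in C to see exactly which bytes end up in the binary. The sketch below is only an illustration: the emit_* helpers are hypothetical names, and the one thing it relies on is that x86 stores multi-byte values least significant byte first (little-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers that mimic the assembler directives.
 * Each one writes its value at out[0..] and returns the byte count. */

/* db: one byte */
static size_t emit_db(uint8_t *out, uint8_t value) {
    assert(out != NULL);
    out[0] = value;
    return 1;
}

/* dw: one 16-bit word, low byte first */
static size_t emit_dw(uint8_t *out, uint16_t value) {
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)(value >> 8);
    return 2;
}

/* dd: one 32-bit doubleword, low byte first */
static size_t emit_dd(uint8_t *out, uint32_t value) {
    assert(out != NULL);
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)(value >> 24);
    return 4;
}
```

Two consecutive emit_dd calls therefore produce one 8-byte GDT entry. With the assumed values 0x0000FFFF and 0x00CF9A00, the bytes in memory are FF FF 00 00 00 9A CF 00.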

Enter protected mode

After setting the GDT, we can enter protected mode:

; enable A20
in al, 0x92
or al, 0000_0010b
out 0x92, al

; load GDT
lgdt [gdt_ptr]

; open protection mode - set cr0 bit 0
mov eax, cr0
or eax, 0x00000001
mov cr0, eax

; refresh pipeline
jmp dword SELECTOR_CODE:protection_mode_entry

Here the lgdt instruction loads the GDT, and setting the protection enable bit (bit 0) of the CR0 register switches the CPU into protected mode. The far jump that follows loads the CS segment register with the kernel code segment selector. Note that CS cannot be set directly with a mov instruction; it can only be changed implicitly by a control transfer such as this far jump.
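The operand of lgdt is a 6-byte structure in memory: a 16-bit limit (the size of the table in bytes minus one) followed by the 32-bit linear base address of the table. The definition of gdt_ptr is not shown in this article, so the following C sketch only illustrates that layout; the name make_gdt_ptr and the base address used in the example are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u   /* each descriptor is 8 bytes */

/* Hypothetical helper: fill the 6-byte lgdt operand.
 *   out[0..1] = limit = table size in bytes - 1 (little-endian)
 *   out[2..5] = linear base address of the GDT (little-endian) */
static void make_gdt_ptr(uint8_t out[6], uint32_t gdt_base,
                         unsigned entry_count) {
    /* the 16-bit limit allows at most 8192 descriptors */
    assert(entry_count >= 1u && entry_count <= 8192u);
    uint16_t limit = (uint16_t)(entry_count * GDT_ENTRY_SIZE - 1u);

    out[0] = (uint8_t)(limit & 0xFFu);
    out[1] = (uint8_t)(limit >> 8);
    out[2] = (uint8_t)(gdt_base & 0xFFu);
    out[3] = (uint8_t)((gdt_base >> 8) & 0xFFu);
    out[4] = (uint8_t)((gdt_base >> 16) & 0xFFu);
    out[5] = (uint8_t)(gdt_base >> 24);
}
```

For the four entries described earlier the limit is 4 * 8 - 1 = 31 (0x1F). With an assumed table address of 0x8003, the six bytes are 1F 00 03 80 00 00.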

After the jump, execution continues at protection_mode_entry, where the data segment registers are initialized with the kernel data segment:

protection_mode_entry:
  ; set data segments
  mov ax, SELECTOR_DATA
  mov ds, ax
  mov es, ax
  mov ss, ax

  ; set video segment
  mov ax, SELECTOR_VIDEO
  mov gs, ax
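The SELECTOR_CODE, SELECTOR_DATA and SELECTOR_VIDEO values loaded into the segment registers are not addresses but segment selectors: a GDT index in bits 15..3, a table indicator in bit 2 (0 for the GDT) and a requested privilege level in bits 1..0. Their definitions are not shown in this article, so the C sketch below only illustrates the encoding; make_selector and selector_index are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a segment selector.
 *   index : descriptor index in the table (0 is the null descriptor)
 *   table : 0 = GDT, 1 = LDT
 *   rpl   : requested privilege level, 0 (kernel) to 3 (user) */
static uint16_t make_selector(unsigned index, unsigned table, unsigned rpl) {
    assert(index < 8192u);   /* 13-bit index field */
    assert(table <= 1u);
    assert(rpl <= 3u);
    return (uint16_t)((index << 3) | (table << 2) | rpl);
}

/* Recover the descriptor index from a selector. */
static unsigned selector_index(uint16_t selector) {
    return (unsigned)(selector >> 3);
}
```

Assuming the entry order described earlier (null, code, data, video) and kernel privilege level 0, the three selectors would come out as 0x08, 0x10 and 0x18. Because the index is shifted left by 3, a selector with the table and privilege bits cleared is also the byte offset of its 8-byte descriptor within the GDT.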

This completes the initialization of protected mode. Next comes a key part of the loader, the setup_page function, which starts building the kernel's virtual memory. That is the topic of the next article.