Series directory

  • introductory
  • The preparatory work
  • BIOS starts in real mode
  • GDT and Protected Mode
  • Exploration of Virtual Memory
  • Load and enter the kernel
  • Display and print
  • The global descriptor table GDT
  • Interrupt handling
  • Virtual memory improvement
  • Implement heap and malloc
  • The first kernel thread
  • Multithreading switch
  • The lock is synchronized with multithreading
  • Implementation of a process
  • Enter user mode
  • A simple file system
  • Load the executable program
  • Implementation of system calls
  • The keyboard driver
  • To run a shell

Kernel disk image

Following up on the introduction to virtual memory, this article will load and launch the kernel, which is shown in green:

Of course, the kernel image is read and loaded from disk, so here’s an earlier picture of the relationship between disk and memory:

By the way, the diagonally shaded question mark in the image above is the kernel page tables from the previous chapter, the orange part of the first image. The 256 page tables take 4KB each, 1MB in total.

Write the kernel

Going back to the kernel, the green part of the figure: it doesn’t actually exist yet, so first we need to implement and compile a simple demo kernel. If you have never seen one, you might ask: what does a kernel look like?

The answer is simple: a kernel is almost the same as an ordinary executable you would write in C, starting with a main function.

Let’s implement our first kernel:

void main() {
  while (1) {}
}

It’s as simple as that, nothing more than a while loop, but it’s good enough for our demo here.

Compile the kernel

Compiling needs quite a few flags, for example to generate 32-bit code and to disable the C library and compiler builtins. (This is our own custom OS, so the host's C library cannot be used.)

gcc -m32 -nostdlib -nostdinc -fno-builtin -fno-stack-protector -no-pie -fno-pic -c main.c -o main.o

Link the kernel:

ld -m elf_i386 -Tlink.ld -o kernel main.o

A linker script, link.ld, is used here:

ENTRY(main)
SECTIONS
{
  .text 0xC0800000:
  {
    code = .; _code = .; __code = .;
    *(.text)
  }

  .data ALIGN(4096):
  {
     data = .; _data = .; __data = .;
     *(.data)
     *(.rodata)
  }

  .bss ALIGN(4096):
  {
    bss = .; _bss = .; __bss = .;
    *(.bss)
    . = ALIGN(4096);
  }

  end = .; _end = .; __end = .;
}

The most important part is the starting address of the text segment, 0xC0800000, which is also the starting address of the whole kernel. If you remember from the previous post, we mapped out the virtual memory layout of the kernel space:

0xC0800000 will be the entry address of the kernel, because the text segment is loaded here, followed by data, bss, etc. When the loader finishes, it jumps to this address.
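
To see where 0xC0800000 lands in the two-level page tables, the address can be split into a page directory index, a page table index and a page offset. The following is a minimal sketch, assuming the standard 32-bit x86 paging layout with 4KB pages; the helper names are made up for illustration and are not part of the project.

```c
#include <stdint.h>

/* Start address of .text, as given in link.ld. */
#define KERNEL_LOAD_VADDR 0xC0800000u

/* Top 10 bits: index into the page directory (hypothetical helper). */
uint32_t page_dir_index(uint32_t vaddr) { return vaddr >> 22; }

/* Middle 10 bits: index into the selected page table. */
uint32_t page_table_index(uint32_t vaddr) { return (vaddr >> 12) & 0x3FFu; }

/* Low 12 bits: byte offset inside the 4KB page. */
uint32_t page_offset(uint32_t vaddr) { return vaddr & 0xFFFu; }
```

For KERNEL_LOAD_VADDR this gives directory index 770, table index 0 and offset 0, so the kernel begins exactly at the start of a page table, 8MB above 0xC0000000.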

It also defines main as the entry function for the entire executable.

The linked output is an ELF binary, so we can disassemble and dump it:

objdump -dsx kernel

You can see that the address of the main function is 0xC0800000, which is the first instruction in the kernel.

Making a kernel image

dd if=kernel of=scroll.img bs=512 count=2048 seek=9 conv=notrunc

seek=9 because the MBR and loader already occupy the first 9 sectors of the disk. The kernel is given 2048 sectors (1MB) here, which is large enough for our project.
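
The numbers in the dd command can be checked with a little arithmetic. This is only a sketch of the layout described above, assuming 512-byte sectors as in bs=512; the constant and function names are invented for this example.

```c
#include <stdint.h>

enum {
  SECTOR_SIZE         = 512,   /* bs=512 */
  KERNEL_START_SECTOR = 9,     /* seek=9: sectors 0..8 hold the MBR and loader */
  KERNEL_SECTOR_COUNT = 2048   /* count=2048 */
};

/* Byte offset of the kernel inside the disk image. */
uint32_t kernel_image_offset(void) {
  return (uint32_t)KERNEL_START_SECTOR * SECTOR_SIZE;
}

/* Number of bytes reserved for the kernel on disk. */
uint32_t kernel_image_capacity(void) {
  return (uint32_t)KERNEL_SECTOR_COUNT * SECTOR_SIZE;
}

/* Sectors needed for a kernel file of the given size, rounded up. */
uint32_t sectors_needed(uint32_t file_size) {
  return (file_size + SECTOR_SIZE - 1) / SECTOR_SIZE;
}

/* Returns 1 if a kernel file of this size fits in the reserved area. */
int kernel_fits(uint32_t file_size) {
  return sectors_needed(file_size) <= KERNEL_SECTOR_COUNT;
}
```

The kernel therefore starts at byte 4608 of the image, and the reserved area is 1048576 bytes, i.e. 1MB.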

Now the disk image finally looks like this:

Read and load the kernel

With the image ready, the kernel can now be read and loaded. See init_kernel for reference.

Unlike the earlier loading of the MBR and the loader, "read" and "load" are separated here because they are two distinct steps:

  • Read: copy the raw binary of the kernel from the disk image to a spare place in memory. This binary is in ELF format.
  • Load: parse the ELF executable obtained in the previous step and copy each of its sections to the address where it is meant to be placed.

Let’s start with the first step, “read.” I chose the top 1MB of virtual memory, i.e. the range from 0xFFF00000 (0xFFFFFFFF - 1MB + 1) to 0xFFFFFFFF, as the place to store the binary image. This is a personal choice; I picked it because nothing else uses that range at the moment. It still needs physical page frames and mappings in the page table, so I also found 1MB of free space in the remaining physical memory and mapped it. After that, the kernel image can be read in just as the MBR and loader were read before.
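
The address range of this temporary buffer follows directly from its size. Below is a small sketch, assuming 4KB pages; the names are hypothetical and only illustrate the arithmetic.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES    4096u
#define KERNEL_IMAGE_SIZE  (1024u * 1024u)   /* 1MB reserved for the raw ELF image */

/* First virtual address of the top 1MB of the 32-bit address space. */
uint32_t kernel_image_buffer_base(void) {
  return 0xFFFFFFFFu - KERNEL_IMAGE_SIZE + 1u;
}

/* Number of 4KB pages that have to be mapped for the buffer. */
uint32_t kernel_image_buffer_pages(void) {
  return KERNEL_IMAGE_SIZE / PAGE_SIZE_BYTES;
}

/* Virtual address of the n-th page of the buffer (n starts at 0). */
uint32_t kernel_image_buffer_page(uint32_t n) {
  return kernel_image_buffer_base() + n * PAGE_SIZE_BYTES;
}
```

The buffer starts at 0xFFF00000 and spans 256 pages, the last of which begins at 0xFFFFF000. Each of these pages needs a physical frame and a page table entry before the disk read can write into it.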

Next comes the second step, “load.” This means parsing the image according to the ELF file format specification, which takes some time to read up on. The main job is to get, from the Program Header Table, the file offset and size of each section (strictly speaking, each segment), together with the virtual address it should be loaded at, and then copy the data there. The load address starts at 0xC0800000. Before copying, physical frames must already be allocated for that range and the mappings set up in the page table; all of this is done ahead of time in the allocate_pages_for_kernel function.
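
The load step can be sketched in C as follows. This is not the project's do_load_kernel (which is written in assembly), only a hosted illustration of the same idea: walk the program headers, copy every loadable segment to its virtual address, zero the part that exists only in memory (bss), and return e_entry. The struct layouts follow the ELF32 specification; the function name, the ELF_PT_LOAD macro and the dest/dest_vaddr parameters, which stand in for already-mapped kernel memory, are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

#define ELF_PT_LOAD 1u  /* program header type of a loadable segment */

typedef struct {
  uint8_t  e_ident[16];
  uint16_t e_type, e_machine;
  uint32_t e_version;
  uint32_t e_entry;      /* virtual address of the entry point */
  uint32_t e_phoff;      /* file offset of the program header table */
  uint32_t e_shoff, e_flags;
  uint16_t e_ehsize, e_phentsize, e_phnum;
  uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32Header;

typedef struct {
  uint32_t p_type, p_offset, p_vaddr, p_paddr;
  uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32ProgramHeader;

/* Copies all loadable segments of `image` into `dest`, which represents the
 * memory mapped at virtual address `dest_vaddr`. Returns e_entry, or 0 if the
 * image is malformed or does not fit (a simplification for this sketch). */
uint32_t load_elf32(const uint8_t *image, uint32_t image_size,
                    uint8_t *dest, uint32_t dest_vaddr, uint32_t dest_size) {
  Elf32Header eh;
  if (image_size < sizeof eh) return 0;
  memcpy(&eh, image, sizeof eh);
  if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0) return 0;

  for (uint32_t i = 0; i < eh.e_phnum; i++) {
    Elf32ProgramHeader ph;
    uint32_t off = eh.e_phoff + i * eh.e_phentsize;
    if (off > image_size || image_size - off < sizeof ph) return 0;
    memcpy(&ph, image + off, sizeof ph);
    if (ph.p_type != ELF_PT_LOAD) continue;

    /* Reject segments that fall outside the file or the destination. */
    if (ph.p_filesz > ph.p_memsz) return 0;
    if (ph.p_offset > image_size || image_size - ph.p_offset < ph.p_filesz) return 0;
    if (ph.p_vaddr < dest_vaddr) return 0;
    uint32_t rel = ph.p_vaddr - dest_vaddr;
    if (rel > dest_size || dest_size - rel < ph.p_memsz) return 0;

    memcpy(dest + rel, image + ph.p_offset, ph.p_filesz);            /* file contents */
    memset(dest + rel + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);   /* zero-fill bss */
  }
  return eh.e_entry;
}
```

In the real loader the destination is simply the virtual address itself, since the pages at 0xC0800000 have already been mapped, so no dest buffer is needed; it appears here only so the sketch can run as an ordinary program.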

Into the kernel

With everything in place, it’s time to actually enter the kernel:

init_kernel:
  call allocate_pages_for_kernel
  call load_hd_kernel_image
  call do_load_kernel
  
  ; init floating point unit before entering the kernel
  finit

  ; move stack to 0xF0000000
  mov esp, KERNEL_STACK_TOP - 16
  mov ebp, esp

  ; let's jump to kernel entry :)
  jmp eax
  ret

The CPU’s floating-point unit is initialized first to prevent later exceptions.

I then move the stack to the higher address 0xF0000000. This is not necessary, just a personal choice. The current stack location is perfectly fine too: it sits somewhere below 0xC0007B00, because the MBR set the stack at 0x7B00 and, with paging turned on, we access it as 0xC0000000 + 0x7B00, if you remember. I simply wanted the stack to start from a fresh location once we enter the kernel, hence the extra step. Stack placement is flexible, as long as it is an unused, undisturbed area.

Then, quite simply, a jmp eax instruction jumps to the kernel entry.

Why eax? It holds the return value of do_load_kernel, the function that parses the ELF binary of the kernel we just read. It returns the kernel’s entry address, which is the address of the main function. This address comes from the e_entry field of the ELF header. An ELF executable’s entry address is fixed at link time; here it was specified by ENTRY(main) in link.ld above.

If everything goes well, the results are as follows:

The program has successfully entered the kernel and is running at 0xC0800003, the location of the while loop. This will be the real beginning of the kernel’s journey 🙂