Write OS kernel from scratch - multithreading switch

Series directory

introductory
The preparatory work
BIOS starts in real mode
GDT and Protected Mode
Exploration of Virtual Memory
Load and enter the kernel
Display and print
The global descriptor table GDT
Interrupt handling
Virtual memory improvement
Implement heap and malloc
The first kernel thread
Multithreading switch
The lock is synchronized with multithreading
Implementation of a process
Enter user mode
A simple file system
Load the executable program
Implementation of system calls
The keyboard driver
To run a shell

To prepare

Following on from the previous post, we started the first Thread, as if we were just running a normal function on a completely new stack. To be a real operating system, you need to be able to run multiple threads. The process of creating multiple Threads is simple, but the problem is getting them to run alternately. Going back to the previous post about the two core components of Thread:

code
stack

This article will enter the world of multithreading. Here, we show how the two Threads complete the transformation of the stack, and the execution flow of the code is automatically switched as the stack is transformed. And based on the switching ability of multi-threads, we will preliminarily build a scheduler scheduler.

thread

To start with, the program runs in the light blue area of the stack:

If there is no external trigger, the program will run here forever, and no thread switch can occur, even if there is another thread somewhere else, with its own stack. So we need a trigger here, and that’s ClockInterrupt. At the end of the Interrupt Handling section, we tried to turn on ClockInterrupt, where the handler is simply a function that prints. Now we need to implement Threads toggle inside.

Imagine that a program is running on the stack in the image above, and when clock-interrupt occurs, the CPU automatically initiates interrupt handling. To recap from Interrupt Handling, the stack looks like this:

After a series of stack operations by CPU and kernel, the program execution process is roughly like this:

Isr32 (32 is clock interrupt number) --> isr_common_stub --> isr_handler --> timer_callback

Finally, the timer_callback function:

struct task_struct* crt_thread;
struct task_struct* next_thread;

static void timer_callback(isr_params_t regs) {
  // For only 2 threads switch.
  struct task_struct* old_thread = crt_thread;
  struct task_struct* new_thread = next_thread;
  
  // swap, for next time switch.
  crt_thread = new_thread;
  next_thread = old_thread;

  context_switch(old_thread, new_thread);
}

If you don’t remember the task_struct definition, this is the thread description structure.

struct task_struct {
  uint32 kernel_esp;
  uint32 kernel_stack;
  
  uint32 id;
  // ...
};

The above sample code should be easy to understand. Assuming we only have two threads, we will switch between them in timer_callback.

The key is the last line, context_switch, which switches between two threads. It actually splits the upper and lower parts of the thread:

Save the current thread’s context:

context_switch:
  push eax
  push ecx
  push edx
  push ebx
  push ebp
  push esi
  push edi

  mov eax, [esp + 32]
  mov [eax], esp

Complete the recovery of the next thread to be run:

  ; jump to next thread's kernel stack
  ; and recover its execution
  mov eax, [esp + 36]
  mov esp, [eax]

resume_thread:
  pop edi
  pop esi
  pop ebp
  pop ebx
  pop edx
  pop ecx
  pop eax

  sti
  ret

Esp + 32 and esp + 36 are the two arguments to the context_switch function, which are the new and old threads to switch:

void context_switch(old_thread, new_thread);

The task_struct parameter is a pointer to the task_struct parameter, and the first field in the task_struct parameter is kernel_esp. The next time the thread resumes, read Kernel_esp to find the ESP of the last moment before it was cut away, and continue running.

Moreover, after the stack transformation, the execution flow of the code is automatically restored to the original execution flow of the next thread, which is actually the same stream. Because the thread to be switched, when it was cut before, also through the context_switch function, push all the universal registers, save ESP, and then cut away, now it is restored in the same instruction location, continue to execute.

So the most important thing to understand here is that the execution of the context_switch function is split between the new and old threads. When the old thread finishes executing the first half, it is cut away and suspended. It has to wait until the next time it is scheduled, recover in place, continue to run the second half, together into a complete context_switch execution stream; The return then moves up the hierarchy, eventually returning to the place where the clock interrupted it.

Switch to the newly created Thread

In the above case, the next thread to be run is a previously run thread that has been cut and suspended, so its stack layout is the same as that of the current running thread, which comes to the running stack of the context_switch function. But what if the Thread to run is a newly created one? If you go back to the previous chapter, you’ll see that the design is seamless.

The figure above shows the newly created Thread stack, which happens to initialize a switch stack and points the kernel_esp field to that stack, so it immediately initializes the esp correctly by starting with the lower half of the context_switch function. And pop all universal registers, which is the same way that the first Thread was started.

That is, the first thread is started from Resume_thread; All of the subsequent Threads are started perfectly in the same way with context_swith.

Thread scheduler

With the ability to switch between two threads, we can actually create more threads and start building a scheduler scheduler. The simplest scheduling mode is for all threads to run in sequence and iterate over and over again, so this first requires a linked list to hold all threads.

SRC /utils/linked_list.c is a generic linked list. Each node holds a pointer to an actual object, which in our case is task_struct*. The specific implementation can refer to the code.

struct linked_list_node {
  type_t ptr;
  struct linked_list_node* prev;
  struct linked_list_node* next;
};
typedef struct linked_list_node linked_list_node_t;

struct linked_list {
  linked_list_node_t* head;
  linked_list_node_t* tail;
  uint32 size;
};
typedef struct linked_list linked_list_t;

So we can define the list of threads to run, namely ready_tasks. All threads in this list are in the TASK_READY state and can be run by the next scheduled task:

static linked_list_t ready_tasks;

Therefore, the scheduler logic is simple, as shown here in pseudocode:

next_thread = Fetch head from ready_tasks queue;
set next_thread.status = TASK_RUNNING;
set crt_thread.status = TASK_READY;
Enqueue crt_thread to tail of ready_tasks queue;

context_switch(crt_thread, next_thread);

Task time slice

Threads switches do not occur with each clock interrupt, but instead are cut when the task’s CPU time slice is exhausted. Each task structure records the current time slice consumed:

struct task_struct {
  // ...
  uint32 ticks;
  // ...
};

We organize the schedule logic into a single function, maybe_context_switch. Each time the clock interrupt ticks the field ticks + 1. When the ticks reach a specified limit, the thread slice runs out. Fire the do_context_switch, which actually performs the THREAD switch.

static void timer_callback(isr_params_t regs) {
  maybe_context_switch();
}

void maybe_context_switch() { disable_interrupt(); merge_ready_tasks(); bool need_context_switch = false; if (ready_tasks.size > 0) { tcb_t* crt_thread = get_crt_thread(); crt_thread->ticks++; // Current thread has run out of time slice. if (crt_thread->ticks >= crt_thread->priority) { crt_thread->ticks = 0; need_context_switch = true; } } if (need_context_switch) { do_context_switch(); }}

thread yield

In some cases, a thread may also voluntarily surrender the CPU to the next thread to end its run. This is called yield, for example, when a lock race fails and the thread ends. Note that the yield is not a block. The current thread does not block sleep. It simply gives up the CPU, putting itself back into the ready_tasks queue, and at some point it will be rescheduled to run.

So the yield implementation is simple, and you can simply call do_context_switch:

void schedule_thread_yield() { disable_interrupt(); merge_ready_tasks(); / /... if (ready_tasks.size > 0) { do_context_switch(); }}

conclusion

In this article, we have implemented the basic framework of multi-threading and scheduler. In fact, this kernel can be called a real OS. It already has the basic skeleton of the simplest operating system:

Memory management
Interrupt handling
Task management

In the future work, we will continue to enrich the kernel to make it more functional and better user interaction ability.

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

Write OS kernel from scratch – multithreading switch

Series directory

To prepare

thread

Switch to the newly created Thread

Thread scheduler

Task time slice

thread yield

conclusion

Write OS kernel from scratch – multithreading switch

Series directory

To prepare

thread

Switch to the newly created Thread

Thread scheduler

Task time slice

thread yield

conclusion

Related Posts

How does Luajit work – Explain the pattern

Experience pipelining/multi-launch of CPU

Write the OS kernel-process implementation from scratch