Startup

  • Public id: Rand_cs

I have written about booting before; that was my first article. This article builds on it and adds the multiprocessor boot case. Without further ado, let's get straight to it.

Startup comes in two kinds. A cold start, also called a hardware start, begins with the machine powered off and the POWER button pressed, as in an ordinary power-on. Before a cold start the computer has no power, and storage components such as memory need power to keep their contents, so whatever they held is lost. At the moment of power-on the values inside them are random, and the operating system initializes them.

A warm start, also called a software start, begins while the machine is powered, as in a restart. Power is maintained before and after this kind of start, so the values in memory and other storage components are not lost, but since it is still a startup, the operating system initializes them anyway.

Regardless of the kind of startup, the CPU receives a signal to start and the boot sequence begins. As in the first article, booting has five stages: BIOS -> MBR -> Bootloader -> OS -> Multiprocessor. Let's look at them one by one.

BIOS

The CS and IP registers are initialized at the moment of startup: CS = 0xF000, IP = 0xFFF0.

The CPU starts in real mode. In real mode the address bus uses only 20 lines, giving an address space of only $2^{20} = 1\text{M}$, that is, the low 1M of memory. Paging has not been set up at this point, so the CPU works with actual physical addresses.

But registers in real mode are only 16 bits wide. How can 16-bit registers address a 20-bit address space? Intel uses segmentation to access memory, in the form segment base : segment offset, with address = segment base + offset. Since a real-mode register holds only 16 bits, the segment base is first multiplied by 16, so in real mode address = segment base × 16 + offset.

So with CS = 0xF000 and IP = 0xFFF0, the resulting address = (0xF000 << 4) + 0xFFF0 = 0xFFFF0.
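The real-mode address calculation can be sketched in C. This is only an illustration: the function name `real_mode_addr` is hypothetical, and masking the result to 20 bits is my assumption for modelling a 20-line address bus.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * The result is masked to 20 bits to model a 20-line address bus
 * (an assumption of this sketch). */
uint32_t real_mode_addr(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFF;
}
```

With CS = 0xF000 and IP = 0xFFF0 this gives the reset address 0xFFFF0. Different segment:offset pairs can name the same byte, for example 0x0000:0x7C00 and 0x07C0:0x0000.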

What is at this address? Let's look at the layout of the low 1M of memory:

Look at the top two lines: the address 0xFFFF0 holds a jump instruction. The CPU executes it and jumps to the main body of the BIOS code. The BIOS mainly does a few things:

  • Runs a self-check, then does some simple initialization of the hardware
  • Builds the interrupt vector table and loads the interrupt service routines
  • Loads the MBR from the first sector of the hard disk (the boot device is usually the hard disk) to 0x7C00

MBR

I covered the MBR (Master Boot Record) in more detail in an earlier article on disks and partitions, so I will not repeat it here. Briefly, the MBR has this structure:

  1. Boot code and some parameters, 446 bytes
  2. Disk partition table (DPT), 64 bytes
  3. End signature, 0x55 and 0xAA, 2 bytes

The MBR code searches the partition table for a partition that holds a bootable operating system, that is, the active partition marked 0x80. It then loads that partition's boot block and executes the operating system's boot program, the bootloader, inside it.
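The layout above can be sketched in C over a raw 512-byte sector. The function and macro names are hypothetical. The sketch assumes the usual layout of four 16-byte partition entries whose first byte is the status byte.

```c
#include <stdint.h>

#define MBR_SIZE        512
#define MBR_CODE_SIZE   446   /* boot code and parameters */
#define MBR_PART_COUNT  4     /* 4 entries x 16 bytes = 64-byte partition table */
#define MBR_PART_SIZE   16
#define MBR_ACTIVE      0x80  /* status byte of the active partition */

/* Returns 1 if the sector ends with the 0x55 0xAA signature. */
int mbr_is_valid(const uint8_t *sector)
{
    return sector[MBR_SIZE - 2] == 0x55 && sector[MBR_SIZE - 1] == 0xAA;
}

/* Returns the index (0..3) of the first active partition, or -1 if none. */
int mbr_find_active(const uint8_t *sector)
{
    for (int i = 0; i < MBR_PART_COUNT; i++) {
        const uint8_t *entry = sector + MBR_CODE_SIZE + i * MBR_PART_SIZE;
        if (entry[0] == MBR_ACTIVE)
            return i;
    }
    return -1;
}
```

The sizes add up: 446 + 4 × 16 + 2 = 512 bytes, exactly one sector.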

Bootloader

The main purpose of the bootloader, whatever it is called, is to load the operating system into memory. The operating system is also a program and must be loaded into memory to run. On a running computer we can use the exec family of functions to load and run a program; in the same way, the bootloader loads and runs the program that is the operating system.

The bootloader does other things as well, such as entering protected mode, turning on paging, and setting up memory mappings. GRUB and U-Boot are bootloaders too, only more powerful.

OS

After the kernel is loaded into memory, it does initialization work to set up a working environment, such as initializing each piece of hardware and resetting the GDT and IDT. Once it has initialized itself, it starts the other processors (if there are more than one). I will not go into detail here; instead we will go straight to the xv6 example and see what it does and how.

Multiprocessor

The startup process above is for a single processor. The multiprocessor case is somewhat different and can be summarized in one sentence: start one CPU, then use it to start the other processors.

The CPU that starts first is called the BootStrap Processor (BSP), and the other CPUs are called Application Processors (APs). The BSP is determined by the system hardware or selected dynamically by the BIOS.

The multi-processor startup process is roughly divided into the following steps:

  1. The BSP starts as described above: BIOS -> MBR -> Bootloader -> OS
  2. The BSP obtains the multiprocessor configuration information from the MP Configuration Table
  3. The BSP starts the APs by sending INIT-SIPI-SIPI messages to them
  4. Once started, each AP has to set up the same mechanisms as the BSP, such as protected mode, paging, and interrupts

Here we focus on the second step, obtaining the multiprocessor configuration information. The machine provides a dedicated data structure, the MP Configuration Table, to describe it. Another data structure, the MP Floating Pointer Structure, points to the MP Configuration Table.

First, the layout of the floating pointer structure:

struct mp {             // floating pointer
  uchar signature[4];   // "_MP_"
  void *physaddr;       // physical address of the MP configuration table
  uchar length;         // structure length in 16-byte units, so 1
  uchar specrev;        // [14] MP spec version
  uchar checksum;       // all bytes must add up to 0
  uchar type;           // MP system config type; 0 means a configuration table is present
  uchar imcrp;          // only bit 7 is used: 1 means the IMCR is present (PIC mode), 0 means virtual wire mode
  uchar reserved[3];
};

This structure can appear in only three places, so search for the floating pointer in the following order:

  1. The first 1 KB of the Extended BIOS Data Area (EBDA)
  2. The last 1 KB of system base memory (639K-640K for 640 KB of base memory, 511K-512K for 512 KB of base memory)
  3. The BIOS ROM area between 0xF0000 and 0xFFFFF
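Each of these three areas is searched the same way: step through it 16 bytes at a time and accept a candidate only if it starts with "_MP_" and all 16 of its bytes sum to 0. Below is a minimal sketch over an ordinary byte buffer rather than physical memory. The names `byte_sum` and `find_floating_pointer` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Sum of len bytes, truncated to 8 bits. A valid structure sums to 0. */
uint8_t byte_sum(const uint8_t *p, int len)
{
    uint8_t s = 0;
    for (int i = 0; i < len; i++)
        s = (uint8_t)(s + p[i]);
    return s;
}

/* Scan [buf, buf+len) in 16-byte steps for a valid floating pointer.
 * Returns the byte offset of the structure, or -1 if none is found. */
int find_floating_pointer(const uint8_t *buf, int len)
{
    for (int off = 0; off + 16 <= len; off += 16)
        if (memcmp(buf + off, "_MP_", 4) == 0 && byte_sum(buf + off, 16) == 0)
            return off;
    return -1;
}
```

The checksum byte exists to make the sum come out to 0. Whoever builds the structure fills in every other field, then stores the negated 8-bit sum in the checksum field.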

Next is the MP Configuration Table header:

struct mpconf {         // configuration table header
  uchar signature[4];   // "PCMP"
  ushort length;        // total table length
  uchar version;        // [14] MP spec version
  uchar checksum;       // all bytes must add up to 0
  uchar product[20];    // product ID
  uint *oemtable;       // OEM table pointer (optional, 0 if absent)
  ushort oemlength;     // OEM table length
  ushort entry;         // number of entries
  uint *lapicaddr;      // address of the local APIC
  ushort xlength;       // extended table length
  uchar xchecksum;      // extended table checksum
  uchar reserved;       // reserved
};

The MP Configuration Table has many types of entries. We list only the processor entry:

struct mpproc {         // processor table entry
  uchar type;           // entry type (0 = processor)
  uchar apicid;         // local APIC ID
  uchar version;        // local APIC version
  uchar flags;          // CPU flags; 0x02 marks the BSP
    #define MPBOOT 0x02     // This proc is the bootstrap processor.
  uchar signature[4];   // CPU signature
  uint feature;         // feature flags from the CPUID instruction
  uchar reserved[8];
};

That covers the data structures (admittedly, some of their fields are not clear to me). Their layout is shown below:

These structures are mainly used to find the number of CPUs. That is all for the multiprocessor configuration data structures; how they are used is explained with a concrete example below.

Xv6

So much for theory. Now let's look at how a real operating system, xv6, boots. First, the overall flow chart of xv6 startup, to get a general picture:

Don't be intimidated by this diagram. xv6's startup is relatively simple. Startup, like the whole operating system, is heavily simplified; otherwise xv6 would not be only a few thousand lines of code. Because of the simplification, the stages may not be as clear-cut as described above, but they are similar.

The BIOS lives in read-only ROM, so the operating system has no control over it, but we know how it executes. Execution starts at 0xFFFF0 and jumps into the BIOS code, which then loads sector 0 of the disk (LBA addressing), that is, the first sector, to 0x7C00 and executes it.

The code from here on is under the operating system's control. xv6 does not actually construct an MBR structure; from the Makefile we know that bootblock is what gets written to the first sector. bootblock is produced from bootasm.S and bootmain.c through compiling, assembling, linking, and format conversion.

bootblock: bootasm.S bootmain.c
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
dd if=bootblock of=xv6.img conv=notrunc

These are two lines taken from the Makefile. bootblock is generated from bootasm.S and bootmain.c and written to xv6.img with the dd command. xv6.img can be regarded as a disk image.

Description of the dd command:

  • if=FILE: the file to read from
  • of=FILE: the file to write to
  • bs=BYTES: the block size. dd performs I/O in units of one block; the default is 512 bytes
  • count=BLOCKS: the number of blocks to copy
  • seek=BLOCKS: the number of blocks to skip at the start of the output file before writing
  • conv=CONVS: how to convert the file; usually notrunc, which means do not truncate the output file

This dd command writes bootblock to xv6.img with no seek option, so nothing is skipped: it writes to sector 0, the first sector of the disk.

bootasm.S

This section analyzes bootasm.S, which mainly does one thing: enter protected mode. It breaks down into four steps: enable A20 -> build and load the GDT -> set the CR0 register -> start32 calls bootmain. Booting involves a lot behind the scenes, such as the hard disk, the APIC, and the setup of various mechanisms. Some of those details are left for later articles. Let's go through the steps one by one:

Enable A20

In a previous article I described enabling A20 through system port 0x92. That method is simple but risky: it can conflict with other hardware and force a shutdown. xv6 uses another approach, enabling A20 through the keyboard controller. Here is the code:

seta20.1:
  inb     $0x64,%al               # read the keyboard controller status from port 0x64
  testb   $0x2,%al                # test whether the controller is busy
  jnz     seta20.1                # if busy, keep waiting

  movb    $0xd1,%al               # send 0xd1 to port 0x64: the next byte is a write command
  outb    %al,$0x64

seta20.2:
  inb     $0x64,%al               # wait for not busy, same as above
  testb   $0x2,%al
  jnz     seta20.2

  movb    $0xdf,%al               # write 0xdf to port 0x60: enable A20
  outb    %al,$0x60

I have also written an earlier article on the keyboard that you can refer to, so I will not repeat the details here. The comments above should be understandable. In short, A20 is enabled by writing commands to specific ports.

Once A20 is enabled, address line 20 is no longer forced to zero, so all 32 address lines can be used and the addressable range reaches $2^{32} = 4\text{G}$.

Build and load the GDT

I. Build the GDT (bootasm.S; the macros come from asm.h)

#define SEG_NULLASM                                             \
        .word 0, 0;                                             \
        .byte 0, 0, 0, 0

#define SEG_ASM(type,base,lim)                                  \
        .word (((lim) >> 12) & 0xffff), ((base) & 0xffff);      \
        .byte (((base) >> 16) & 0xff), (0x90 | (type)),         \
                (0xC0 | (((lim) >> 28) & 0xf)), (((base) >> 24) & 0xff)

# Build the GDT
gdt:
  SEG_NULLASM                             # null seg: the first GDT descriptor is unused
  SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff)   # code seg: executable and readable
  SEG_ASM(STA_W, 0x0, 0xffffffff)         # data seg: writable

Segment indices in the GDT (mmu.h):

#define SEG_KCODE 1  // kernel code
#define SEG_KDATA 2  // kernel data+stack

Two segment descriptors are built with the SEG_ASM macro: a code segment descriptor and a data segment descriptor. The code segment is built first because its index in the GDT is defined as 1. The first GDT descriptor (index 0) is never used, so it is set to 0.
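To see what SEG_ASM actually emits, the same bit layout can be packed into a 64-bit value in C. This is a sketch: the function name `seg_descriptor` is hypothetical, and the STA_* values are the ones I believe xv6's asm.h uses.

```c
#include <stdint.h>

#define STA_X 0x8   /* executable segment */
#define STA_W 0x2   /* writable (data segments) */
#define STA_R 0x2   /* readable (code segments) */

/* Pack a descriptor the way SEG_ASM lays it out:
 * 4K granularity, 32-bit segment, present, privilege level 0. */
uint64_t seg_descriptor(uint32_t type, uint32_t base, uint32_t lim)
{
    uint64_t d = 0;
    d |= (uint64_t)((lim >> 12) & 0xffff);               /* limit bits 15..0, in 4K units */
    d |= (uint64_t)(base & 0xffff) << 16;                /* base bits 15..0 */
    d |= (uint64_t)((base >> 16) & 0xff) << 32;          /* base bits 23..16 */
    d |= (uint64_t)(0x90 | type) << 40;                  /* present, code/data, type */
    d |= (uint64_t)(0xC0 | ((lim >> 28) & 0xf)) << 48;   /* G=1, D=1, limit bits 19..16 */
    d |= (uint64_t)((base >> 24) & 0xff) << 56;          /* base bits 31..24 */
    return d;
}
```

For base 0 and limit 0xffffffff, the code segment comes out as 0x00CF9A000000FFFF and the data segment as 0x00CF92000000FFFF. Both cover the full 4G flat address space.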

II. Construct the GDTR data

The CPU needs to know where the GDT is, so the start address and limit of the GDT must be loaded into the GDTR register.

gdtdesc:                          # the 6 bytes of data loaded into GDTR
  .word   (gdtdesc - gdt - 1)     # sizeof(gdt) - 1: limit = size - 1
  .long   gdt                     # start address of the GDT

gdtdesc is the 48-bit value that GDTR requires; it holds the start address and limit of the GDT.

III. Load the GDT

  lgdt    gdtdesc                 # load the GDT

The GDT is loaded with a dedicated instruction, lgdt, which is easy to use, as shown above.

Set the CR0 register

Set the PE bit of register CR0 to 1 to enter protected mode:

  movl    %cr0, %eax     
  orl     $CR0_PE, %eax 
  movl    %eax, %cr0

At this point the CPU enters protected mode and goes from a 16-bit CPU to a 32-bit CPU. The instruction format differs before and after this moment: 16-bit instructions were used before, and 32-bit instructions are used afterwards. The 16 and 32 here do not refer to the length of an instruction but to two different instruction encodings, which means the machine code of the same instruction may differ between the two modes.

As we know, the CPU speeds up execution with pipelining, which simply means loading several instructions into the pipeline and working on different stages of each at the same time. The problem is that instructions decoded as 16-bit may still be in the pipeline after the switch to protected mode, so the pipeline must be flushed. An unconditional jump can be used to flush it:

  ljmp    $(SEG_KCODE<<3), $start32   # jump to CS = SEG_KCODE<<3, EIP = start32; the segment base is 0
                                      # everything after this runs as 32-bit protected-mode code

In addition, after entering protected mode a segment register no longer holds a segment base address but a segment selector. The high 13 bits of the selector serve as an index into the GDT to fetch the corresponding descriptor and its segment base address, and base plus offset gives the final address. Because of this extra step, and because the selector and the descriptor carry attribute bits, memory accesses are restricted, and that is the protection. (So protecting a computer means limiting its freedom?)
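The selector layout can be made concrete with a small C sketch. The function names are hypothetical. The bit positions are the standard x86 ones: index in bits 15..3, table indicator (TI) in bit 2, requested privilege level (RPL) in bits 1..0.

```c
#include <stdint.h>

#define SEG_KCODE 1  /* kernel code */
#define SEG_KDATA 2  /* kernel data+stack */

/* Build a selector: index in bits 15..3, TI in bit 2, RPL in bits 1..0. */
uint16_t make_selector(uint16_t index, uint16_t ti, uint16_t rpl)
{
    return (uint16_t)((index << 3) | ((ti & 1) << 2) | (rpl & 3));
}

/* The high 13 bits of a selector are the descriptor index. */
uint16_t selector_index(uint16_t sel)
{
    return (uint16_t)(sel >> 3);
}

/* Byte offset of the descriptor inside the GDT: each descriptor is 8 bytes. */
uint32_t descriptor_offset(uint16_t sel)
{
    return (uint32_t)selector_index(sel) * 8;
}
```

With TI and RPL both 0, a selector equals index << 3, which is why the code uses SEG_KCODE<<3 and SEG_KDATA<<3 directly. That value is also the byte offset of the descriptor in the GDT.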

Many segments can share one descriptor, and the segment base address in the descriptor is usually 0. This works because the address bus and the commonly used registers (except the segment registers) are extended to 32 bits, so the addressable range is $2^{32} = 4\text{G}$ and an offset alone can reach every address. In real mode, by contrast, a single 16-bit register cannot cover a 20-bit address space, so the base in the segment register must be shifted left by 4 bits and added to the offset.

start32

The long jump above lands at the following code:

  movw    $(SEG_KDATA<<3), %ax    # Our data segment selector
  movw    %ax, %ds                # -> DS: Data Segment
  movw    %ax, %es                # -> ES: Extra Segment
  movw    %ax, %ss                # -> SS: Stack Segment
  movw    $0, %ax                 # Zero segments not ready for use
  movw    %ax, %fs                # -> FS
  movw    %ax, %gs                # -> GS

Nothing much to say: this sets the segment registers. In SEG_KDATA<<3, the shift left by three leaves room for the attribute bits, which are all set to 0. For what each bit means, see my previous article on going from real mode to protected mode. CS is not set above because the long jump instruction already set it.

The last thing bootasm.S does:

  movl    $start, %esp            # set the stack top to start, that is 0x7c00
  call    bootmain                # call bootmain

This sets the top of the stack to 0x7C00 and then calls bootmain. Stack changes in an operating system are always confusing, so keep track of how the stack changes.

bootmain

bootmain plays the role of the bootloader; its main job is to load the kernel. The whole kernel is an ELF file. For ELF, refer to my earlier article, which I will not repeat here. To load the kernel, where is the kernel? On disk, so the disk must be read first. bootmain.c has three functions that operate on the disk. It does not matter if the details are unclear; we first learn what each function does and leave the implementation details for a later article.

void waitdisk(void);                                // wait for the disk to be ready
void readsect(void *dst, uint offset);              // read the single sector numbered offset into dst
void readseg(uchar *pa, uint count, uint offset);   // read count bytes into pa, starting at byte offset of the kernel image;
                                                    // the sector number is offset / SECTSIZE + 1 because the kernel starts at sector 1

With these three functions in mind, here is bootmain:

void bootmain(void)
{
  struct elfhdr *elf;
  struct proghdr *ph, *eph;
  void (*entry)(void);
  uchar* pa;

  elf = (struct elfhdr*)0x10000;  // scratch space; the start of the kernel image is read here

  // Read 1st page off disk
  readseg((uchar*)elf, 4096, 0);   // read 4096 bytes (8 sectors), starting at sector 1, to 0x10000

  // Is this an ELF executable?
  if(elf->magic != ELF_MAGIC)      // check whether it is an ELF file
    return;  // let bootasm.S handle error

  // Load each program segment (ignores ph flags).
  ph = (struct proghdr*)((uchar*)elf + elf->phoff);    // the first program header
  eph = ph + elf->phnum;        // one past the last program header
  for(; ph < eph; ph++){        // the for loop reads each program segment
    pa = (uchar*)ph->paddr;     // where the segment goes in memory
    readseg(pa, ph->filesz, ph->off);   // off is the offset within the ELF file and filesz the size of the segment in the file: read filesz bytes from offset off to memory address pa
    if(ph->memsz > ph->filesz)     // because of the .bss section, which takes no space in the ELF file but needs space in memory, memsz may be larger
      stosb(pa + ph->filesz, 0, ph->memsz - ph->filesz); // call stosb to zero the rest of the segment
  }

  // Call the entry point from the ELF header.
  // Does not return!
  entry = (void(*)(void))(elf->entry);   // entry is the entry point of the program
  entry();   // call entry
}

If you are familiar with ELF files, the program above should be easy to follow with the detailed comments, so I will not explain further. If anything is confusing, refer to my article on ELF.
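The two pieces of arithmetic in the loading loop can be tried outside the kernel. The sketch below is hypothetical: `load_segment` and `kernel_sector` are illustrative names, and ordinary buffers stand in for the disk and physical memory. It shows the filesz/memsz zero fill and the mapping from a byte offset in the kernel image to a disk sector.

```c
#include <stdint.h>
#include <string.h>

#define SECTSIZE 512

/* Copy the filesz bytes present in the file, then zero the remaining
 * memsz - filesz bytes, which is where .bss lives. */
void load_segment(uint8_t *dst, const uint8_t *file_data,
                  uint32_t filesz, uint32_t memsz)
{
    memcpy(dst, file_data, filesz);
    if (memsz > filesz)
        memset(dst + filesz, 0, memsz - filesz);
}

/* Sector holding byte `offset` of the kernel image.
 * The +1 is because sector 0 holds the boot block. */
uint32_t kernel_sector(uint32_t offset)
{
    return offset / SECTSIZE + 1;
}
```

For example, offset 0 of the kernel lives in sector 1, and the first 4096 bytes occupy sectors 1 through 8, so offset 4096 starts sector 9.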

So bootmain does one thing: it loads the kernel into memory and calls entry. After the kernel is loaded, the layout in memory looks like this:

entry

entry also does one main thing: turn on paging and jump to main. There are four steps: build the page table -> load the page directory address into CR3 -> set the PG bit of CR0 -> jump to main

.globl _start
_start = V2P_WO(entry)      # _start is the default entry symbol, converted here to a physical address

.globl entry
entry:
  # Set the PSE bit in register CR4 to allow 4M pages
  movl    %cr4, %eax
  orl     $(CR4_PSE), %eax
  movl    %eax, %cr4
  # Load the physical address of the page directory into CR3
  movl    $(V2P_WO(entrypgdir)), %eax
  movl    %eax, %cr3
  # Set the PG bit in register CR0 to turn on paging
  movl    %cr0, %eax
  orl     $(CR0_PG|CR0_WP), %eax
  movl    %eax, %cr0

  # Set up the stack pointer
  movl    $(stack + KSTACKSIZE), %esp

  # Jump to main
  mov     $main, %eax
  jmp     *%eax

.comm stack, KSTACKSIZE     # if no other definition of stack is found, reserve KSTACKSIZE bytes of uninitialized memory for it

This code should also be easy to follow. A few points:

  1. The page table, entrypgdir, is defined in main.c and maps only the low 4M of physical memory. This article barely touches virtual memory; it is covered in a later article.
  2. As for the address of the stack, you can look up stack in kernel.asm: it is 0x8010b5c0. There seems to be nothing special about it; a suitable piece of memory is simply picked as the stack. Of course, where it lands is decided by the linker, which I am not familiar with. There may be something more to it that I have not looked into; if so, I would appreciate it if someone who knows could tell me.
  3. jmp *%eax is an indirect jump that takes the absolute address of the target directly from eax. A direct jump would instead be encoded with relative addressing, that is, as the difference between the address of the target instruction and the address of the instruction immediately following the jump.
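Point 3 matters because entry is linked at high virtual addresses but is still executing at low physical addresses when it jumps. A relative displacement would therefore land on the low-address copy of main. The arithmetic can be sketched as follows. The function names are hypothetical, and so are the addresses in the example that follows, which are chosen only to show the effect.

```c
#include <stdint.h>

/* Displacement the assembler encodes for a direct near jump:
 * link-time target minus link-time address of the next instruction. */
int32_t rel32(uint32_t link_target, uint32_t link_next)
{
    return (int32_t)(link_target - link_next);
}

/* Where a direct jump lands at run time: the address of the next
 * instruction as actually executed, plus the encoded displacement. */
uint32_t direct_jump_lands(uint32_t exec_next, int32_t disp)
{
    return exec_next + (uint32_t)disp;
}
```

Suppose main is linked at 0x80103000 and the instruction after the jump is linked at 0x80100040 but executes at 0x00100040. A direct jump would land at 0x00103000, the low-address alias of main, whereas the indirect jump loads the full 0x80103000 into EIP.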

main

Finally we reach the main function, which initializes the various mechanisms. We look at the three calls related to CPU startup and leave the others for later:

int main(void)
{
  mpinit();        // detect other cpus

  startothers();   // start other processors
    
  mpmain();        // Finish this processor's setup
}


mpinit

The mpinit() function finds the floating pointer and, through it, the MP Configuration Table, from which it obtains the CPU configuration information. The floating pointer can only be in one of those three places, so they are searched one by one. mpinit() is in mp.c; we go through the functions from top to bottom.

static uchar sum(uchar *addr, int len);  // sum the len bytes starting at addr

static struct mp* mpsearch1(uint a, int len) // look for the floating pointer in [a, a+len)
{
  uchar *e, *p, *addr;
  addr = P2V(a);   // convert to a virtual address
  e = addr+len;    // end of the range
  for(p = addr; p < e; p += sizeof(struct mp))
    if(memcmp(p, "_MP_", 4) == 0 && sum(p, sizeof(struct mp)) == 0)   // signature and checksum both match: this is the floating pointer
      return (struct mp*)p;
  return 0;
}

static struct mp* mpsearch(void)     // find the MP floating pointer
{
  uchar *bda;
  uint p;
  struct mp *mp;

  bda = (uchar *) P2V(0x400);     // address of the BIOS Data Area

  if((p = ((bda[0x0F]<<8)| bda[0x0E]) << 4)){   // search the first 1K of the EBDA
    if((mp = mpsearch1(p, 1024)))
      return mp;
  } else {                                      // no EBDA: search the last 1K of base memory
    p = ((bda[0x14]<<8)|bda[0x13])*1024;
    if((mp = mpsearch1(p-1024, 1024)))
      return mp;
  }
  return mpsearch1(0xF0000, 0x10000);   // search 0xF0000~0xFFFFF
}

Searching for the floating pointer requires the locations of the EBDA and of the end of base memory, and both are recorded in the BDA. The BDA is the BIOS Data Area, fixed at 0x400, and it contains the information we need.

Look at the two items I've highlighted. The two bytes starting at address 0x040E hold the address of the EBDA shifted right by four bits. bda[0x0E] is the low 8 bits of that value and bda[0x0F] is the high 8 bits, so ((bda[0x0F] << 8) | bda[0x0E]) << 4 gives the address of the EBDA.

The two bytes starting at address 0x0413 give the number of kilobytes of memory below the EBDA. This is the size of base memory, so multiplied by 1024 it is also the end address of base memory. The value is assembled from the two bytes in the same way, so I will not explain it again.
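The two BDA lookups are plain little-endian reads followed by a scale. Here is a sketch that works on a copy of the BDA in an ordinary buffer. The function names are hypothetical, and `bda` is assumed to point at the byte that would sit at physical address 0x400.

```c
#include <stdint.h>

/* EBDA address: the 16-bit word at BDA offset 0x0E is the EBDA's
 * real-mode segment, so shift it left by 4 to get the address. */
uint32_t ebda_address(const uint8_t *bda)
{
    return (uint32_t)((bda[0x0F] << 8) | bda[0x0E]) << 4;
}

/* End of base memory: the 16-bit word at BDA offset 0x13 is the size
 * of base memory in kilobytes. */
uint32_t base_memory_end(const uint8_t *bda)
{
    return (uint32_t)((bda[0x14] << 8) | bda[0x13]) * 1024;
}
```

For instance, if the word at 0x40E is 0x9FC0, the EBDA starts at 0x9FC00. If the word at 0x413 is 639, base memory ends at 639 × 1024 = 0x9FC00 as well, which is the 639K-640K case from the search list.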

After finding the floating pointer structure, the MP Configuration Table can be located through its physaddr field. The table consists of two parts, the header and the entries. Many of the entries are not needed for now, so we focus only on the processor entries. In short, the processor-related part of mpinit counts the processor entries, which gives the number of processors, and fills that information into the global CPU data structure:

struct cpu cpus[NCPU];        // global CPU data structure; NCPU is the maximum number of CPUs supported
int ncpu;                     // number of CPUs found

for(p=(uchar*)(conf+1), e=(uchar*)conf+conf->length; p<e; ){   // skip the header and loop from the first entry
    switch(*p){      // dispatch on the type of the current entry
    case MPPROC:     // a processor entry
      proc = (struct mpproc*)p;
      if(ncpu < NCPU) {
        cpus[ncpu].apicid = proc->apicid;  // the APIC ID identifies a CPU
        ncpu++;          // one more CPU entry found
      }
      p += sizeof(struct mpproc);    // step over this processor entry and continue the loop
      continue;
    // ... the other entry types are omitted here
    }
}

xv6 defines a global CPU data structure. mpinit finds out how many CPUs there are and initializes part of that structure. This involves the advanced interrupt controller, the APIC; see my article: Retalking interrupts (APIC). Each CPU has a LAPIC, and the LAPIC ID uniquely identifies a CPU.
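The entry walk can be tried on an ordinary byte buffer. The sketch below is a standalone approximation, not xv6's code: `count_processors` is a hypothetical name, and the entry sizes (20 bytes for a processor entry, 8 bytes for the other base entry types) follow my reading of the MP specification.

```c
#include <stdint.h>

enum { MPPROC = 0, MPBUS = 1, MPIOAPIC = 2, MPIOINTR = 3, MPLINTR = 4 };

/* Walk the entries that follow the table header. Stores up to max_ids
 * local APIC IDs and returns the number stored, or -1 on an unknown
 * entry type. */
int count_processors(const uint8_t *entries, int len,
                     uint8_t *apic_ids, int max_ids)
{
    int ncpu = 0;
    const uint8_t *p = entries, *e = entries + len;

    while (p < e) {
        switch (*p) {               /* first byte of every entry is its type */
        case MPPROC:
            if (ncpu < max_ids)
                apic_ids[ncpu++] = p[1];   /* byte 1 is the local APIC ID */
            p += 20;
            break;
        case MPBUS:
        case MPIOAPIC:
        case MPIOINTR:
        case MPLINTR:
            p += 8;
            break;
        default:
            return -1;
        }
    }
    return ncpu;
}
```

Because entries have different sizes, the walk can only advance by reading each type byte in turn. That is why the loop in mpinit has no increment expression in its for header.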

startothers

Now that we know how many CPUs there are and have the ID of each one, we can start them. Straight to the startothers code:

static void
startothers(void)
{
  extern uchar _binary_entryother_start[], _binary_entryother_size[];
  uchar *code;
  struct cpu *c;
  char *stack;

  // entryother.S is the code the APs run when they start; the linker places its image at _binary_entryother_start
  // Here it is copied to 0x7000
  code = P2V(0x7000);
  memmove(code, _binary_entryother_start, (uint)_binary_entryother_size);

  for(c = cpus; c < cpus+ncpu; c++){  // The for loop starts APs
    if(c == mycpu())  // Exclude yourself
      continue;

    // Tell entryother.S what stack to use, where to enter, and what
    // pgdir to use. We cannot use kpgdir yet, because the AP processor
    // is running in low memory, so we use entrypgdir for the APs too.
    stack = kalloc();                                 // allocate a stack for each AP
    *(void**)(code-4) = stack + KSTACKSIZE;           // store the stack top at code-4
    *(void(**)(void))(code-8) = mpenter;              // store the address of mpenter at code-8
    *(int**)(code-12) = (void *) V2P(entrypgdir);     // store the page directory address at code-12

    lapicstartap(c->apicid, V2P(code));   // call lapicstartap to start the AP, passing its APIC ID and the address of the code to execute

    // wait for cpu to finish mpmain()
    while(c->started == 0)    // wait for this AP to start before the next iteration
      ;
  }
}

entryother.S is the startup code that the APs run. The BSP went through bootasm.S and entry.S; entryother.S is roughly a combination of the two. Its main job is to enter protected mode, turn on paging, and then call the mpenter() function, which does the final startup work, as we'll see later.

Next comes a for loop that starts the APs. It skips the BSP itself and iterates over the number of CPUs that mpinit() recorded earlier. Each CPU has its own stack: the BSP's is allocated by the linker through the .comm directive, while the APs' are allocated by kalloc(). kalloc is described later under memory management; for now it is enough to know that kalloc allocates one physical page and returns its starting virtual address.

The loop also passes parameters to entryother.S by storing them just below the code: the stack top at code-4, the address of mpenter at code-8, and the page directory address at code-12.

Finally, the lapicstartap() function is called to start the AP. Let's look at this function.

lapicstartap

The BSP starts the APs essentially by sending INIT-SIPI-SIPI signals to them. As I mentioned in the APIC article, a CPU communicates with other CPUs by writing the ICR register of its LAPIC.

void lapicstartap(uchar apicid, uint addr)
{
  int i;
  ushort *wrv;

  // The BSP must set the CMOS shutdown code to 0x0A and point the warm reset
  // vector (a DWORD at 40:67) at the AP startup code
  outb(CMOS_PORT, 0xF);  // offset 0xF is shutdown code
  outb(CMOS_PORT+1, 0x0A);
  // Writing the warm reset vector amounts to filling in the address of the startup code
  wrv = (ushort*)P2V((0x40<<4 | 0x67));  // Warm reset vector
  wrv[0] = 0;
  wrv[1] = addr >> 4;

  // Send the INIT message
  lapicw(ICRHI, apicid<<24);
  lapicw(ICRLO, INIT | LEVEL | ASSERT);
  microdelay(200);
  lapicw(ICRLO, INIT | LEVEL);
  microdelay(100);    // should be 10ms, but too slow in Bochs!

  // Send two STARTUP IPI messages
  for(i = 0; i < 2; i++){
    lapicw(ICRHI, apicid<<24);
    lapicw(ICRLO, STARTUP | (addr>>12));
    microdelay(200);
  }
}

The code above is the BSP setting the ICR register to send the INIT-SIPI-SIPI messages to an AP. There is no deeper reason for doing it this way or for these particular settings; it is simply the protocol Intel defines. The registers and helper functions used above (CMOS, APIC, and so on) are described in the interrupt articles. For now it is enough to understand the process.

When an AP receives the three messages from the BSP, it goes to 40:67 to fetch the address of its startup code, which is the argument V2P(code) that the BSP passed in lapicstartap(c->apicid, V2P(code)), namely 0x7000. This is a physical address, because the AP is not yet in protected mode and paging has not been enabled to set up virtual memory.

In addition, it is not clear to me exactly how the AP ends up jumping to 0x7000 after this code runs, and I have not found material on it. My guess is that it comes from setting the CMOS shutdown code and the warm reset vector, and from the vector carried in the final SIPI, which corresponds to 0x7000.
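Both places where the code stores the start address can be checked with a little arithmetic. As far as I understand the Intel manual, an AP that receives a STARTUP IPI with vector VV begins executing in real mode at physical address 0x000VV000, and the warm reset vector at 40:67 is an ordinary real-mode offset:segment pair. A sketch with hypothetical function names:

```c
#include <stdint.h>

/* Vector field of a STARTUP IPI: the 4K page number of the start address. */
uint8_t sipi_vector(uint32_t addr)
{
    return (uint8_t)(addr >> 12);
}

/* Address where an AP begins executing for a given SIPI vector. */
uint32_t sipi_start_address(uint8_t vector)
{
    return (uint32_t)vector << 12;
}

/* Physical address named by the warm reset vector: wrv[0] is the
 * offset and wrv[1] is the segment. */
uint32_t warm_reset_target(uint16_t offset, uint16_t segment)
{
    return ((uint32_t)segment << 4) + offset;
}
```

For addr = 0x7000 the SIPI vector is 0x07, which maps back to 0x7000. The warm reset vector holds offset 0 and segment 0x700, which also resolves to 0x7000. This also shows why the startup code must sit on a 4K boundary below 1M.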

Once it has the code address, the AP executes the code in entryother.S. This assembly is mostly the same as bootasm.S and entry.S above, so we look at only two instructions:

  movl    (start-4), %esp     # load the stack top into esp
  call    *(start-8)          # call mpenter()

Now the AP has its own stack and runs mpenter() to finish starting up.

mpenter

static void mpenter(void)
{
  switchkvm();  // switch to the kernel page table
  seginit();    // reset and load the GDT
  lapicinit();  // initialize the local APIC
  mpmain();     // see below
}
static void mpmain(void)
{
  cprintf("cpu%d: starting %d\n", cpuid(), cpuid());
  idtinit();       // load the IDT register
  xchg(&(mycpu()->started), 1); // set started to 1
  scheduler();     // start scheduling and running processes
}

As you can see, most of the work is initializing and setting up the environment. At the end, the started field of the CPU structure is set to 1 to mark the CPU as started, which tells startothers that it can start the next AP. Finally scheduler() is called to begin scheduling and running processes.

After startothers() returns, all the APs have been started, and finally the BSP itself executes mpmain(). By this point all the CPUs have started, that is, the computer has booted, with the environment set up and ready to run programs.

Finally, look at the xv6 startup flowchart again:

That is all on startup for this article. As you can see, booting is a big undertaking that touches many parts. This article focused on the boot flow and left out the hardware operations, which I plan to cover in detail, part by part, later.

Reference for this article:

Intel® 64 and IA-32 Architectures Software Developer Manuals

Memory Map (x86) – OSDev Wiki

MultiProcessor Specification (cmu.edu)

The first is Intel's developer manual; its Chapter 8 covers multiprocessor management and the startup protocol. The second describes the memory map of the low 1M in real mode. The third is the multiprocessor specification, which explains the configuration table and related structures in detail.

This is the first article in my xv6 series. The earlier articles covered the main parts of an operating system, well, actually still far from all of them, but I think some things are best explained together with real source code; otherwise it is empty talk with little effect. So I am going to cover the rest directly with xv6. The MIT operating system course is said to be a superb course for learning operating systems, so let's analyze xv6 slowly and, along the way, tie together what was covered before. However, an operating system touches a very wide range of things, and I really do not have the time and energy to understand every part thoroughly. As this article shows, I am not clear on some hardware details or on linking.

Human energy is limited, and for anyone not doing specialized research in a given area, some details really do not need to be pursued too far. It is enough to grasp the main line, and for the main line I can promise a correct and complete picture. That is all for this article. If there are any mistakes, corrections are welcome, and everyone is welcome to discuss and learn together with me.