As Moore’s law slows, growth in computing power is slowing with it. I previously studied 32-bit x86 assembly, and I have to say that the x86 instruction set carries a heavy historical burden. Its shortcomings become more obvious as the demand for computing power grows: high power consumption, few general-purpose registers, low hardware utilization, and a limited addressing range all make it hard to keep pace. Meanwhile, the ARM architecture is thriving in today’s mobile Internet. So I recently looked at the RISC-V architecture, which like ARM is a RISC design, and reviewed the related material on the principles of computer organization.

Assembly environment setup

Recent posts have mostly covered infrastructure details, low-level management, and programming techniques. As mentioned above, given the baggage of x86, the open-source RISC-V instruction set is better suited to learning. This article uses MIPS, another RISC instruction set, together with MARS. MARS is a lightweight interactive development environment (an IDE containing a MIPS assembler and runtime simulator) for programming in MIPS assembly language. It is designed for educational use alongside Patterson and Hennessy’s Computer Organization and Design. Its website is: courses.missouristate.edu/kenvollmar/…

If you have a Java runtime environment (JRE), you can use MARS directly. If you do not, download a JRE or JDK first.

The C code corresponding to the assembly code in the figure above is:

#include <stdio.h>

int main(){
    int r, x;
    x = 2;
    r = x + 3;
    printf("%d", r);
    return 0;
}

Assembly related concepts

Assembly language features

ASM is the file extension of assembly language source programs, and programmers also refer to assembly language simply as ASM. Assembly language is a machine-oriented programming language. It is very powerful: it can use every hardware feature of the computer and control the hardware directly.

In assembly language, mnemonics replace opcodes, and symbols or labels replace address codes. Replacing the binary codes of machine language with symbols in this way turns machine language into assembly language, which is why assembly language is also called symbolic language. The machine cannot directly recognize a program written in assembly language; a program must translate it into machine language. That translator is called the assembler, which is system software for language processing, and the process of translating assembly language into machine language is called assembling. Assembly language is easier to read, write, debug, and modify than machine language, while keeping all of machine language’s advantages. However, for complex programs the code is much longer than the equivalent high-level language code, and because assembly language depends on a specific processor architecture, it cannot be ported directly between different architectures.

To sum up, the characteristics of assembly language are as follows:

1. It is a machine-oriented low-level language, usually designed for a particular computer or family of computers.

2. It keeps the advantages of machine language: it is direct and simple.

3. It can efficiently access and control computer hardware such as disks, memory, the CPU, and I/O.

Disassembly and pseudo-instructions

Disassembly: translating the binary code in an executable file back into assembly code.

Decompilation: translating an executable program back into high-level language source code. A complete conversion is generally not possible because of compiler optimizations and other factors.

Pseudo-instruction: an instruction that tells the assembler how to assemble. It neither controls the operation of the machine nor is translated directly into machine code. It is recognized only by the assembler and guides how the assembler works. One example is a pseudo-instruction that loads an address, relative to the program counter or a register, into a register.

Preprocessing directives (pseudo-compilation directives): directives such as #define and #ifdef make source code easy to modify or to compile for different execution environments. They tell the preprocessor to perform specific operations, such as replacing particular text in the source code.

Comparing the two pairs makes the meaning clear: disassembly relates to pseudo-instructions in the same way that decompilation relates to preprocessing directives.

Instruction decoding and mnemonics

Let’s look at the data addresses and instruction addresses in the example of the CPU computing 5000×0.2:

The instructions are essentially a binary string:

Below is a general preview of the instruction set. ADD, SUB, MUL, AND, and OR are basic arithmetic or logical operations; LOAD, STORE, MOV, and similar instructions load and move data between registers and memory; and BRANCH, BREQ, BRNE, BRIO, and similar instructions control program flow.

The program counter, also known as the PC, is a special register that stores the memory address of the next instruction to execute:

Therefore, the whole calculation process is shown in the figure below:

So the key things to know are:

1. A program directs the computer’s work through instructions.

2. The CPU is clock-driven: it repeatedly reads the instruction the PC points to from memory, advances the PC, executes the instruction, and starts over.

3. Different CPU architectures use different instructions; the most widely used family is RISC (Reduced Instruction Set Computer).
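The three points above can be sketched as a toy fetch-decode-execute loop in C. This is only an illustration: the instruction format, the opcode values (`OP_LOADI`, `OP_ADD`, `OP_HALT`), and the `run` function are invented here and belong to no real architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy machine: every name below is an assumption for illustration. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADD = 2 };

typedef struct {
    uint8_t op;   /* which operation to perform */
    uint8_t dst;  /* destination register index */
    uint8_t a;    /* first source register index (unused by OP_LOADI) */
    int32_t b;    /* second source register index, or the immediate for OP_LOADI */
} ToyInstr;

/* Runs the program until OP_HALT and returns the value left in register 0. */
int32_t run(const ToyInstr *program, size_t length) {
    int32_t regs[4] = {0};
    size_t pc = 0;                      /* the PC holds the index of the next instruction */
    for (;;) {
        assert(pc < length);            /* the PC must stay inside the program */
        ToyInstr in = program[pc];      /* fetch the instruction the PC points to */
        pc += 1;                        /* advance the PC */
        switch (in.op) {                /* decode by opcode, then execute */
        case OP_LOADI:
            regs[in.dst & 3] = in.b;
            break;
        case OP_ADD:
            regs[in.dst & 3] = regs[in.a & 3] + regs[in.b & 3];
            break;
        case OP_HALT:
        default:
            return regs[0];             /* halt (unknown opcodes also stop the machine) */
        }
    }
}
```

Running a four-instruction program that loads 2 and 3 into two registers, adds them into register 0, and halts leaves 5 in register 0, which mirrors the small C program shown earlier.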

Instruction decoding process: in this RISC design, all instructions have the same length. Examples of instructions for the MIPS-32 architecture (a RISC processor architecture):

Different opcodes correspond to different ways of splitting the instruction into fields:

Or, if opcode = 25:

So what you need to know is that the CPU uses the opcode to decide how an instruction should be decoded and executed.

Mnemonics: in the 32-bit RISC instruction set, the opcode is a 6-bit number, which is too abstract to remember, so we use mnemonics instead; hence the various assembly instructions above.
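As a minimal sketch of decoding by opcode, the C code below packs and unpacks the fields of a MIPS I-type instruction (6-bit opcode, two 5-bit register numbers, 16-bit immediate). The names `ITypeFields`, `encode_i_type`, and `decode_i_type` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of a 32-bit MIPS I-type instruction:
   opcode (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits) */
typedef struct {
    uint32_t opcode;
    uint32_t rs;
    uint32_t rt;
    uint32_t imm;
} ITypeFields;

/* Packs the four fields into one instruction word. */
uint32_t encode_i_type(uint32_t opcode, uint32_t rs, uint32_t rt, uint32_t imm) {
    assert(opcode < 64 && rs < 32 && rt < 32 && imm < 65536);
    return (opcode << 26) | (rs << 21) | (rt << 16) | imm;
}

/* Splits an instruction word back into its fields with shifts and masks. */
ITypeFields decode_i_type(uint32_t word) {
    ITypeFields f;
    f.opcode = word >> 26;           /* top 6 bits */
    f.rs = (word >> 21) & 0x1F;      /* next 5 bits */
    f.rt = (word >> 16) & 0x1F;      /* next 5 bits */
    f.imm = word & 0xFFFF;           /* low 16 bits */
    return f;
}
```

For example, `addi $t0, $zero, 1000` has opcode 8, rs 0, rt 8 (the number of `$t0`), and immediate 1000, so `encode_i_type(8, 0, 8, 1000)` yields `0x200803E8`, and decoding that word recovers the same four fields.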

Four addressing modes

Addressing modes are the part of an instruction set that determines how many operands an instruction has and how their addresses are computed. You do not need to memorize them, since different instruction sets have different addressing modes. What studying them shows is how to make good use of the bits in a binary instruction.

1. Register addressing: the operand is a register, and n bits can address 2^n registers. For example, add $r10, $r1, $r2 adds the contents of register 1 and register 2 and stores the result in register 10.

2. Immediate addressing: the operand is a value stored in the instruction itself, for example addi $r1, $zero, 1000. The size of the number is limited; with a 16-bit immediate field on a 32-bit machine the range is [-2^15, 2^15 - 1].

3. Base plus offset addressing: the final address is the base address plus an offset. For example, lw $r0, 8($sp) uses $sp as the base register and 8 as the offset, computes the memory address from them, and loads the value at that address into the target register.

4. PC-relative addressing: the next PC value is computed relative to the current one, from the distance between the current position and a label (the difference between the current line of code and the line the label is on). For example, beq $r3, $r9, LABEL jumps to LABEL when registers R3 and R9 hold the same value, much like a goto statement.
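The immediate range and the base plus offset calculation can be sketched in C as follows. The helper names `sign_extend16` and `effective_address` are hypothetical; the sketch assumes a 16-bit signed immediate field, as in MIPS.

```c
#include <assert.h>
#include <stdint.h>

/* Interprets a 16-bit immediate field as a signed number in [-2^15, 2^15 - 1]. */
int32_t sign_extend16(uint32_t imm16) {
    assert(imm16 <= 0xFFFF);
    return imm16 >= 0x8000 ? (int32_t)imm16 - 0x10000 : (int32_t)imm16;
}

/* Base plus offset addressing: effective address = base register + signed offset. */
uint32_t effective_address(uint32_t base, uint32_t imm16) {
    /* unsigned arithmetic wraps modulo 2^32, so a negative offset subtracts */
    return base + (uint32_t)sign_extend16(imm16);
}
```

With a base of `0x7FFFEFFC` and an offset field of 8, as in `lw $r0, 8($sp)`, the effective address is `0x7FFFF004`. An offset field of `0xFFFC` is read as -4, so the access lands four bytes below the base.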

Memory read and write instructions

The load/store instructions read from and write to memory. They usually come in several variants, with these mnemonics:

Load: lw, lb, lh; store: sw, sb, sh

Arithmetic instructions

Immediate addressing: addi, subi, divi, multi, e.g. addi $sp, $sp, 4

Register addressing: add, sub, div, mult, e.g. add $d, $rs, $rt

Bitwise instructions

and, or, xor, etc.

Jump instructions

Direct addressing: j LABEL

Register indirect addressing: jr $a0

jal LABEL combines the two steps of a function call: it stores the current PC + 4 (on a 32-bit machine) in the $ra register and then executes j LABEL

The opcode identifies the type of instruction; it also determines the addressing mode.

Do not memorize direct, indirect, and offset addressing; understand them.

32 registers for MIPS

MIPS has 32 general purpose registers ($0-$31). The functions of each register and the conventions used in the assembler are as follows. The following table describes the aliases and uses of 32 general purpose registers:

REGISTER | NAME | USAGE
$0 | $zero | Constant value 0
$1 | $at | Reserved for the assembler
$2-$3 | $v0-$v1 | Values for results and expression evaluation
$4-$7 | $a0-$a3 | Function call arguments
$8-$15 | $t0-$t7 | Temporaries
$16-$23 | $s0-$s7 | Saved registers (saved and restored by the callee if used)
$24-$25 | $t8-$t9 | Temporaries
$26-$27 | $k0-$k1 | Reserved for the operating system and exception handling
$28 | $gp | Global pointer
$29 | $sp | Stack pointer
$30 | $fp | Frame pointer
$31 | $ra | Return address

$0: $zero always reads as zero, which gives the useful constant 0 a compact encoding. For example, move $t0, $t1 is actually add $t0, $0, $t1. Pseudo-instructions like this simplify the programmer’s task: the assembler offers a richer instruction set than the hardware does.

$1: $at is reserved for the assembler. Because the immediate field of an I-type instruction is only 16 bits, the compiler or assembler has to split a large constant apart and reassemble it in a register. For example, loading a 32-bit immediate takes lui and addi. The assembler needs a temporary register for reassembling large constants, which is one of the reasons $at is reserved for it.
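The split-and-reassemble step can be modeled in C. This sketch uses a bitwise OR for the lower half (as an ori instruction would), rather than the addi mentioned above, because OR needs no correction when bit 15 of the constant is set. The function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits: the operand an assembler could hand to lui. */
uint32_t upper_half(uint32_t value) {
    return value >> 16;
}

/* Lower 16 bits: the operand for a following ori. */
uint32_t lower_half(uint32_t value) {
    return value & 0xFFFF;
}

/* Models the two-step load: lui fills the upper half, ori fills the lower half. */
uint32_t load_constant(uint32_t value) {
    uint32_t at = upper_half(value) << 16;        /* lui $at, upper */
    uint32_t result = at | lower_half(value);     /* ori $reg, $at, lower */
    assert(result == value);                      /* the halves rebuild the constant */
    return result;
}
```

For the constant `0x12345678`, the upper half is `0x1234` and the lower half is `0x5678`; combining them reproduces the original 32-bit value.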

$2..$3: ($v0-$v1) hold a subroutine’s non-floating-point results or return values. MIPS has a set of conventions for how subroutines receive parameters and return results: the contents of the first few stack positions are placed in CPU registers, and the corresponding memory locations are left undefined. When these two registers are not enough to hold the return value, the compiler returns it through memory.

$4..$7: ($a0-$a3) pass the first four parameters to a subroutine; any further parameters go on the stack. $a0-$a3, $v0-$v1, and $ra together support subroutine calls by passing parameters, returning results, and storing the return address, respectively. When more registers are needed, the stack is used, and the MIPS compiler always leaves room on the stack for the parameters in case they need to be stored.

$8..$15: ($t0-$t7) temporary registers, which a subroutine can use without saving and restoring them.

$16..$23: ($s0-$s7) saved registers, whose values must be preserved across procedure calls (the callee saves and restores them, along with $fp and $ra). Providing both temporary and saved registers reduces register spilling (moving less frequently used variables to memory). When compiling a leaf procedure (one that calls no other procedure), the compiler uses the saved registers only after the temporary registers have all been allocated.

$24..$25: ($t8-$t9) same as $t0-$t7.

$26..$27: ($k0, $k1) reserved for the operating system and exception handling; at least one must be reserved. An exception (or interrupt) is a procedure that runs without an explicit call in the program. MIPS has a register called the exception program counter (EPC), which belongs to the CP0 registers and holds the address of the instruction that caused the exception. The only way to read a control register is to copy it to a general-purpose register: the mfc0 (move from system control) instruction copies the address in EPC to a general-purpose register, and a jump (jr) then returns to the instruction that caused the exception so that execution can continue. MIPS programmers must leave $k0 and $k1 for the operating system. Their values are not restored after an exception, and compilers do not use them. The exception handler can place the return address in either of the two registers and then use jr to jump back to the instruction that caused the exception.

$28: ($gp) To simplify access to static data, MIPS software reserves a register, the global pointer $gp, which points to an address in the static data area that is determined at run time. Data within 32 KB above or below the $gp value can be accessed with a single $gp-relative instruction, so at compile time the data must lie within the 64 KB range around $gp.

$29: ($sp) MIPS hardware does not support the stack directly, so you could use this register for other purposes. But to use other people’s programs, or to let others use yours, you should follow the convention, even though it has nothing to do with the hardware.

$30: ($fp) The GNU MIPS C compiler uses a frame pointer, whereas the SGI C compiler does not and treats this register as a saved register ($s8). This saves call and return overhead but makes code generation more complex.

$31: ($ra) holds the return address. MIPS has a jal (jump-and-link) instruction that, when jumping to an address, puts the address of the next instruction in $ra. This supports subroutines: the caller puts the parameters in $a0-$a3 and uses jal X to jump to procedure X; when the called procedure finishes, it puts the result in $v0-$v1 and returns with jr $ra.

Registers in the MIPS architecture are 32 bits wide, and a group of 32 bits is called a word. Comments in MIPS assembly begin with #:

Name | Example | Notes
32 registers | $s0-$s7, $t0-$t9, $zero, $a0-$a3, $v0-$v1, $k0-$k1, $gp, $fp, $sp, $ra, $at | Register names begin with $. Registers are used for fast data access. $zero is always 0; $at is reserved by the assembler for handling large constants.
2^30 memory words | Memory[0], Memory[4], …, Memory[4294967292] | Memory can only be accessed by data transfer instructions. Addresses of consecutive words differ by 4. Memory holds data structures, arrays, and spilled registers.

Arithmetic instructions:

Instruction | Notes
add $s1, $s2, $s3 | Addition; $s1 = $s2 + $s3
sub $s1, $s2, $s3 | Subtraction; $s1 = $s2 - $s3
addi $s1, $s2, 20 | Add immediate, used to add a constant; $s1 = $s2 + 20

Data transfer:

Instruction | Notes
lw $s1, 20($s2) | Load word; $s1 = Memory[$s2 + 20]
sw $s1, 20($s2) | Store word; Memory[$s2 + 20] = $s1
lh $s1, 20($s2) | Load halfword
lhu $s1, 20($s2) | Load halfword unsigned
sh $s1, 20($s2) | Store halfword
lb $s1, 20($s2) | Load byte
lbu $s1, 20($s2) | Load byte unsigned
sb $s1, 20($s2) | Store byte
ll $s1, 20($s2) | Load linked word, the first half of an atomic swap
sc $s1, 20($s2) | Store conditional word, the second half of an atomic swap; Memory[$s2 + 20] = $s1; $s1 = 0 or 1
lui $s1, 20 | Load upper immediate; $s1 = 20 * 2^16, placing the immediate in the upper 16 bits

A MIPS word is 4 bytes, unlike an x86 word, which is 2 bytes.
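The difference between the signed and unsigned byte loads, and the 4-byte word size, can be illustrated with a small C model of memory. The function names are hypothetical, and the word load assumes little-endian byte order, which is an assumption of this sketch (MIPS hardware can be configured either way).

```c
#include <assert.h>
#include <stdint.h>

/* lb: read one byte and sign-extend it to 32 bits. */
int32_t load_byte(const uint8_t *memory, uint32_t address) {
    uint8_t raw = memory[address];
    return raw >= 0x80 ? (int32_t)raw - 0x100 : (int32_t)raw;
}

/* lbu: read one byte and zero-extend it to 32 bits. */
uint32_t load_byte_unsigned(const uint8_t *memory, uint32_t address) {
    return memory[address];
}

/* lw: read a 4-byte word; the address must be a multiple of 4.
   Little-endian order is assumed here for illustration. */
uint32_t load_word(const uint8_t *memory, uint32_t address) {
    assert(address % 4 == 0);
    return (uint32_t)memory[address]
         | ((uint32_t)memory[address + 1] << 8)
         | ((uint32_t)memory[address + 2] << 16)
         | ((uint32_t)memory[address + 3] << 24);
}
```

A byte holding `0xF0` reads as -16 through `load_byte` but as 240 through `load_byte_unsigned`, which is the distinction between lb and lbu in the table above.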

Logic:

Instruction | Notes
and $s1, $s2, $s3 | And; $s1 = $s2 & $s3
or $s1, $s2, $s3 | Or; $s1 = $s2 | $s3
nor $s1, $s2, $s3 | Nor; $s1 = ~($s2 | $s3)
andi $s1, $s2, 20 | And immediate; $s1 = $s2 & 20
ori $s1, $s2, 20 | Or immediate; $s1 = $s2 | 20
sll $s1, $s2, 10 | Shift left logical; $s1 = $s2 << 10
srl $s1, $s2, 10 | Shift right logical; $s1 = $s2 >> 10

Conditional branch:

Instruction | Notes
beq $s1, $s2, 25 | Branch on equal; if ($s1 == $s2) goto PC + 4 + 100
bne $s1, $s2, 25 | Branch on not equal; if ($s1 != $s2) goto PC + 4 + 100
slt $s1, $s2, $s3 | Set on less than; $s1 = ($s2 < $s3 ? 1 : 0)
sltu $s1, $s2, $s3 | Set on less than unsigned
slti $s1, $s2, 20 | Set on less than immediate; $s1 = ($s2 < 20 ? 1 : 0)
sltiu $s1, $s2, 20 | Set on less than immediate unsigned

Unconditional jump:

Instruction | Notes
j 2500 | Jump; goto 10000
jr $ra | Jump register; goto $ra, for switch statements and procedure returns
jal 2500 | Jump and link; $ra = PC + 4; goto 10000, for procedure calls
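The numbers in the branch and jump tables (25 becoming an offset of 100, and 2500 becoming address 10000) follow from the fields counting words rather than bytes. A minimal C sketch, with hypothetical function names, assuming the standard MIPS rule that a jump keeps the top 4 bits of PC + 4:

```c
#include <assert.h>
#include <stdint.h>

/* beq/bne: the 16-bit field counts instructions (words) relative to PC + 4. */
uint32_t branch_target(uint32_t pc, int32_t word_offset) {
    assert(word_offset >= -32768 && word_offset <= 32767);
    /* multiply by 4 to turn a word count into a byte offset */
    return pc + 4 + (uint32_t)(word_offset * 4);
}

/* j/jal: the 26-bit field is a word address; the top 4 bits come from PC + 4. */
uint32_t jump_target(uint32_t pc, uint32_t word_address) {
    assert(word_address < (1u << 26));
    return ((pc + 4) & 0xF0000000u) | (word_address << 2);
}
```

So `beq $s1, $s2, 25` at address `0x00400000` branches to `0x00400000 + 4 + 100`, and `j 2500` in the low 256 MB of the address space jumps to byte address 10000.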

Computing a factorial with a recursive function

int fact(int i){
    if(i == 0){
        return 1;
    }
    return i * fact(i - 1);
}

This recursive function will be implemented using MIPS assembly below:

For if-else constructs:

The bne instruction compares the contents of registers R3 and R4, falls through if they are equal, and jumps to the ELSE label if they are not.

For a for loop:

We are mainly concerned with parameter passing and return value fetching:

Notice how recursive calls map onto the actual control flow:
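One way to see how recursion maps onto a stack is to replace the recursive calls with an explicit stack in C. This is only a rough model under simplifying assumptions: the array here grows upward, while the MIPS stack grows downward, and only arguments are pushed (no return addresses). The name `fact_with_stack` and the `STACK_WORDS` limit are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64   /* arbitrary capacity chosen for this sketch */

/* Computes n! with an explicit stack instead of recursive calls.
   The result wraps around for n > 12 because it is a 32-bit value. */
uint32_t fact_with_stack(uint32_t n) {
    uint32_t stack[STACK_WORDS];
    int sp = 0;                      /* index of the next free slot */
    assert(n < STACK_WORDS);

    /* Descent: each "call" pushes its argument, like fact(n) calling fact(n - 1). */
    while (n != 0) {
        stack[sp++] = n;
        n -= 1;
    }

    /* Base case: fact(0) returns 1. */
    uint32_t result = 1;

    /* Unwinding: each "return" pops its argument and multiplies it in. */
    while (sp > 0) {
        result *= stack[--sp];
    }
    return result;
}
```

Calling `fact_with_stack(5)` pushes 5, 4, 3, 2, 1 on the way down and multiplies them back in on the way up, giving 120.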

Here is the complete assembly code:

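# Reconstructed listing: instructions missing from the source text follow the stack layout
# described in the comments below (an assumption)
# Assumes $sp already holds a valid stack address (simulators such as MARS set one at startup)
# Push convention: store to 0($sp), then move $sp down by 4
addiu $s0, $0, 5          # n = 5
sw $s0, 0($sp)            # push the argument
addiu $sp, $sp, -4
jal FACT
nop
j END
nop

FACT:
sw $ra, 0($sp)            # push the return address
addiu $sp, $sp, -4
lw $s0, 8($sp)            # read the argument n
sw $0, 0($sp)             # push a slot for the return value
addiu $sp, $sp, -4
# if (n == 0) { return 1 }
bne $s0, $0, RECURSION
nop
lw $t1, 8($sp)            # read the return address
addiu $sp, $sp, 8         # pop the return address and the slot
addiu $s0, $zero, 1
sw $s0, 0($sp)            # push the return value 1
addiu $sp, $sp, -4
jr $t1
nop

RECURSION:
# return fact(n-1) * n
addiu $s1, $s0, -1
sw $s1, 0($sp)            # push the argument n - 1
addiu $sp, $sp, -4
jal FACT
nop
# What does the stack look like now?
# argument | return address | return value slot | child's argument | child's return value | current SP
lw $s0, 20($sp)           # current argument
lw $s1, 4($sp)            # child's return value
lw $t1, 16($sp)           # return address
mult $s1, $s0
mflo $s2                  # $s2 = fact(n-1) * n
addiu $sp, $sp, 16        # pop four words
sw $s2, 0($sp)            # push the return value
addiu $sp, $sp, -4
jr $t1
nop

END:

The result is now on the stack, in the word just above the address held in $sp: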

Other MIPS examples

If the recursive call above is too complicated, you can look at these examples first:

$sp: the stack pointer register, register 29, adjusted in units of words. By convention, stacks grow from high addresses to low addresses, so the stack pointer value decreases when data is pushed.

int leaf_example(int g, int h, int i, int j) {
  return (g + h) - (i + j);
}

We can write code like this:

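leaf_example:
addi $sp, $sp, -12        # make room on the stack for three words
sw $t1, 8($sp)            # save the registers used below
sw $t0, 4($sp)
sw $s0, 0($sp)
add $t0, $a0, $a1         # $t0 = g + h
add $t1, $a2, $a3         # $t1 = i + j
sub $s0, $t0, $t1         # $s0 = (g + h) - (i + j)
add $v0, $s0, $zero       # return value
lw $s0, 0($sp)            # restore the saved registers
lw $t0, 4($sp)
lw $t1, 8($sp)
addi $sp, $sp, 12         # release the stack space
jr $ra
nop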

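For a simple summation function:

# Assumes $sp already holds a valid stack address (simulators such as MARS set one at startup)
addiu $s0, $0, 5          # a = 5
sw $s0, 0($sp)            # push a
addiu $sp, $sp, -4
addiu $s1, $zero, 9       # b = 9
sw $s1, 0($sp)            # push b
addiu $sp, $sp, -4
j ADD
nop
ADD:
lw $t0, 4($sp)            # b, the most recently pushed word
lw $t1, 8($sp)            # a
#addiu $t2, $zero, 0
add $t2, $t1, $t0
move $a0, $t2
li $v0, 1                 # MARS service 1: print integer
syscall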

Interrupts and interrupt vectors

An interrupt tells the CPU that the outside world has changed and that a signal (event) needs its attention. The program the CPU is currently executing is interrupted, its execution state is saved, and the interrupt handler runs.

Brief process of interrupt triggering and handling:

1. (When the OS is loaded) The interrupt vector table is written

2. An interrupt request is generated and sent to the CPU

3. The CPU looks up the interrupt vector in the interrupt vector table

4. The interrupt handler is located according to the interrupt vector

5. The OS takes over and handles the interrupt

Interrupt request

Sent by hardware attached to the motherboard (printer, keyboard, mouse, etc.)

Hardware interrupts: CPU exceptions (division by 0), clock signals, etc.

Software interrupts: raised by software (exceptions, switching to kernel mode, etc.)

Interrupt vector table: an area (usually in memory) that stores the mapping between interrupt types and interrupt handlers

Each row is called an interrupt vector. Interrupt 01 in the following table is usually used for single-stepping through a program while debugging.
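The idea of a table that maps interrupt numbers to handlers can be sketched in C as an array of function pointers. Everything here is a simplified model under stated assumptions: the table size, the handler names, and the `last_handled` variable are invented for illustration, and a real CPU performs the lookup in hardware rather than through a function call.

```c
#include <assert.h>
#include <stddef.h>

#define VECTOR_COUNT 8   /* arbitrary table size for this sketch */

typedef void (*InterruptHandler)(void);

/* The vector table: entry i holds the handler for interrupt number i. */
InterruptHandler vector_table[VECTOR_COUNT];

/* Records which handler ran last, so the effect is observable. */
int last_handled = -1;

void handle_debug_step(void) { last_handled = 1; }   /* hypothetical handler */
void handle_timer(void)      { last_handled = 2; }   /* hypothetical handler */

/* Step 1: the OS fills in the table when it is loaded. */
void register_handler(int number, InterruptHandler handler) {
    assert(number >= 0 && number < VECTOR_COUNT);
    vector_table[number] = handler;
}

/* Steps 3 to 5: look up the vector and run the handler.
   Returns 1 if a handler ran, 0 if none is registered. */
int dispatch_interrupt(int number) {
    if (number < 0 || number >= VECTOR_COUNT || vector_table[number] == NULL) {
        return 0;
    }
    vector_table[number]();
    return 1;
}
```

After registering `handle_debug_step` for interrupt 1, calling `dispatch_interrupt(1)` runs it, while dispatching an interrupt number with no registered handler returns 0.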

Why interrupts matter:

1. They improve efficiency (recall the problems with polling)

2. They allow fault recovery (exception handling, emergencies, etc.)

3. They simplify the programming model (try-catch, timers, etc.)