Related articles in the "Learn ARM from 0" series:

Should embedded engineers learn ARM assembly instructions?

What are the ARM architecture, ARMv8, the ARM instruction set, and an SoC?

For the IDE used with the ARM instructions in this series, see the first article below:

1. Learn ARM from 0: installing the Keil MDK uVision integrated development environment

2. Learn ARM from 0: CPU principles and ARM-based SoCs explained

3. Learn ARM from 0: ARM modes, registers, and the pipeline

4. Learn ARM from 0: ARM instructions, shifts, data processing, BL, and machine code

5. Learn ARM from 0: MRS, MSR, addressing operations, and the principle of atomic operations

6. Learn ARM from 0: exception and interrupt handling, the exception vector table, and SWI

First, the differences between MDK and GNU pseudo-instructions

When we learn assembly code, we will see the following two styles of code:

GNU-style code looks like this:

    .global _start
    _start:                     @ assembly entry
        ldr sp, =0x41000000
    .end                        @ end of the assembly program

MDK-style code looks like this:

        AREA Example, CODE, READONLY    ; declare the code section Example
        ENTRY                           ; program entry point
    Start
        MOV R0, #0
    OVER
        END

The two styles are written for different assemblers; the example code in our earlier articles was MDK style.

So which style should we beginners learn? The answer is GNU-style assembly, because Linux driver development requires a good grasp of the Linux kernel and U-Boot, and both are written in GNU style.

In order not to waste too much energy on temporarily useless knowledge, we will focus on GNU style assembly.

Second, GNU assembly writing format:

1. Symbols used in a line of code:

Comment within a line of code: '@'
Whole-line comment: '#' (at the start of the line)
Statement separator: ';'
Immediate operand prefix: '#' or '$'

2. Global label:

A label may contain only the characters a to z, A to Z, 0 to 9, the period (.), and the underscore (_). Except for local labels, it must not start with a digit, and it must be followed by a colon (:).

The address of a label inside the current section is determined at assembly time; the address of a label outside the section is determined at link time.

3. Local label:

Local labels are used within a limited scope, and the same local label may be defined more than once. A local label is a number from 0 to 99 followed directly by ":".

A reference to a local label adds a direction suffix to the number:

    f: the assembler searches forward only, in the direction of increasing line numbers
    b: the assembler searches backward only, in the direction of decreasing line numbers

Note that jumps to local labels follow the nearest-match principle: a reference resolves to the closest definition in the given direction. For real examples, see this file in the Linux kernel:

    arch/arm/kernel/entry-armv.S
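To make the nearest-match rule concrete, here is a minimal C sketch of how a reference such as 1f or 1b could be resolved. It is only a model of the rule, not the assembler's real implementation: the function name resolve_local_label and the idea of describing definitions by line number are assumptions made for illustration, and the reference is assumed to sit on a different line from every definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of local-label lookup.
 * def_lines[] holds, in ascending order, the line numbers where the same
 * local label (for example "1:") is defined. ref_line is the line of the
 * reference, dir is 'f' (forward) or 'b' (backward).
 * Returns the line of the definition the reference resolves to,
 * or -1 if there is no definition in that direction. */
int resolve_local_label(const int *def_lines, size_t count, int ref_line, char dir)
{
    assert(dir == 'f' || dir == 'b');
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (dir == 'f' && def_lines[i] > ref_line) {
            return def_lines[i];    /* first definition after the reference */
        }
        if (dir == 'b' && def_lines[i] < ref_line) {
            best = def_lines[i];    /* remember the latest definition before it */
        }
    }
    return best;
}
```

With definitions on lines 10, 20 and 30, a reference on line 25 resolves to line 30 for 1f and to line 20 for 1b.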

Third, pseudo-operations:

1. Symbol definition pseudo-operations

    Name         Meaning
    .global      Makes the symbol visible to the linker, so that it becomes a global symbol usable across the whole project
    _start       The default entry point of an assembly program is the _start label; another entry point can be specified with ENTRY in the linker script
    .local       Marks the symbol as not visible externally; it is visible only within this file

2. Data Definition pseudo-operation

Data definition pseudo operations are used to allocate storage units to specific data and initialize allocated storage units. Common data definition pseudo-operations are as follows:

    Directive                  Meaning
    .byte                      Defines single-byte data, e.g. .byte 0x12, 'a', 23 [the source says the number of values must be even; this is an alignment precaution, not an assembler requirement]
    .short                     Defines 2-byte data, e.g. .short 0x1234, 65535
    .long / .word              Defines 4-byte data, e.g. .word 0x12345678
    .quad                      Defines 8-byte data, e.g. .quad 0x1234567812345678
    .float                     Defines a floating-point number, e.g. .float 0f3.2
    .string / .asciz / .ascii  Defines a string, e.g. .ascii "abcd\0". Note: a string defined with .ascii needs an explicit trailing '\0'; .string and .asciz append it automatically
    .space / .skip             Allocates a contiguous block of storage and initializes it to the specified value; if the fill value is omitted, the block is filled with 0
    .rept                      Repeats the lines that follow; the block starts with .rept and ends with .endr

[for example]

.word

    val: .word 0x11223344
    ldr r1, val             @ load the value 0x11223344 into register r1

.space

    label: .space size, expr    @ expr is the fill value; every byte of the block is set to it
    a: .space 8, 0x1

.rept

    .rept cnt               @ cnt is the number of repetitions
    .endr

Note:

  1. Variables are defined after the stop label and before .end.

  2. Labels are mnemonics for addresses and occupy no storage. They can be placed anywhere before .end.
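To see what these data definitions actually put into memory, the following C sketch writes values into a byte buffer the way .byte, .short, .word, .quad and .space would on a little-endian ARM target. The helper names emit_le and emit_space are made up for this illustration, and little-endian byte order is an assumption about the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Appends the `size` low-order bytes of `value` to buf in little-endian
 * order: size 1 models .byte, 2 models .short, 4 models .word and 8
 * models .quad. Returns the offset just past the emitted data. */
size_t emit_le(uint8_t *buf, size_t offset, uint64_t value, size_t size)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);
    for (size_t i = 0; i < size; i++) {
        buf[offset + i] = (uint8_t)(value >> (8 * i));
    }
    return offset + size;
}

/* Models ".space size, fill": every byte of the block is set to fill. */
size_t emit_space(uint8_t *buf, size_t offset, size_t size, uint8_t fill)
{
    for (size_t i = 0; i < size; i++) {
        buf[offset + i] = fill;
    }
    return offset + size;
}
```

Emitting 0x11223344 as a word stores the bytes 0x44, 0x33, 0x22, 0x11 in that order, and a following 8-byte space with fill 0x1 stores eight bytes of 0x01.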

3. .if conditional assembly

Syntax:

    .if logical-expression
        ...
    .else
        ...
    .endif

Similar to conditional compilation in C.

[for example]

    .if val2 == 1
        mov r1, #val2
    .endif
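Since the text compares .if with conditional compilation in C, here is the C side of that comparison as a small sketch. The constant VAL2 and the function conditional_value are hypothetical names chosen to mirror val2 and r1 in the assembly example.

```c
#include <assert.h>

#define VAL2 1      /* hypothetical constant, mirrors val2 in the assembly example */

/* Returns the value the assembly example would leave in r1,
 * or 0 when the conditional block is not compiled in. */
int conditional_value(void)
{
    int r1 = 0;
#if VAL2 == 1
    r1 = VAL2;      /* compiled only when VAL2 == 1, like the body of .if */
#endif
    return r1;
}
```

In both languages the decision is made while the code is being translated, not while it runs: if the condition is false, the enclosed line never reaches the output.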

4. Macro

.macro and .endm define a macro, similar to a macro function in C.

The macro pseudo-operation defines a piece of code as a unit, called a macro instruction. The program can then invoke this code as many times as needed through the macro instruction.

Syntax format:

    .macro {$label} name {$parameter{,$parameter}...}
        ... code ...
    .endm

Here $label is replaced with a user-defined symbol when the macro is expanded.

A macro can take one or more parameters, which are replaced with the corresponding values when the macro is expanded.

Note: a macro must be defined before it is used.

For example:

[Example 1]: a macro without parameters that returns from a subroutine

    .macro MOV_PC_LR
        MOV PC, LR
    .endm

It is invoked as follows:

    MOV_PC_LR

[Example 2]: a macro with a parameter that returns from a subroutine

    .macro MOV_PC_LR, param
        mov r1, \param
        MOV PC, LR
    .endm

It is invoked as follows:

    MOV_PC_LR #12
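For readers who know C better than assembly, the parameter substitution can be compared with a C preprocessor macro. The sketch below is only an analogy; LOAD_PARAM and load_twice are hypothetical names, and a C variable stands in for register r1.

```c
#include <assert.h>

/* C counterpart of a parameterised assembly macro: the macro body is
 * pasted at every use, with the arguments substituted textually. */
#define LOAD_PARAM(reg, param) ((reg) = (param))

int load_twice(void)
{
    int r1 = 0;
    LOAD_PARAM(r1, 12);         /* expands to ((r1) = (12)) */
    LOAD_PARAM(r1, r1 + 1);     /* expands to ((r1) = (r1 + 1)) */
    return r1;
}
```

As with .macro, no call takes place: each use is replaced by the macro body before the code is translated.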

5. Miscellaneous pseudo-operations

    Directive    Meaning
    .global      Declares a global symbol
    .arm         The code that follows is assembled with the ARM instruction set
    .thumb       The code that follows is assembled with the Thumb instruction set
    .section     .section expr defines a section; expr can be .text, .data, or .bss
    .text        .text {subsection} places what follows in the code section
    .data        .data {subsection} places what follows in the data section (initialized data)
    .bss         .bss {subsection} places variables in the .bss section (uninitialized data)
    .align       .align {alignment}{,fill}{,max} aligns the current position to the specified boundary by padding with zeros or with the specified fill value
                 .align 4: 16-byte alignment (2 to the power of 4)
                 .balign 4: 4-byte alignment (.balign takes the boundary in bytes)
    .org         .org offset{,expr} advances the location counter to offset, measured from the start of the current section, and fills the skipped memory with zeros or with the specified data
    .extern      Declares an external symbol; kept for compatibility with other assemblers
    .code 32     Same as .arm
    .code 16     Same as .thumb
    .weak        Declares a weak symbol; if the symbol is not defined anywhere, it is ignored without an error
    .end         Marks the end of the file
    .include     .include "filename" includes the specified header file, which can hold assembly constant definitions
    .equ         Format: .equ symbol, expression. Defines symbol as the value of expression; it allocates no storage, similar to #define in C
    .set         Assigns a value to a symbol; same as .equ

For example, .set:

    .set start, 0x40
    mov r1, #start          @ r1 now holds 0x40

For example, .equ:

    .equ start, 0x40
    mov r1, #start          @ r1 now holds 0x40

In C, the definition

    #define PI 3.1415

corresponds roughly to

    .equ PI, 31415

Note that .equ takes an integer expression, so the value has to be scaled to an integer.
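Going back to the .align row in the table above, the arithmetic behind alignment is easy to get wrong, so here is a small C sketch of it. The function names are made up for this illustration; the sketch relies on two facts about GNU as for ARM: .balign takes the boundary in bytes, and .align n means a boundary of 2 to the power n bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds addr up to the next multiple of boundary (a power of two),
 * which is how ".balign boundary" moves the location counter. */
uint32_t balign_up(uint32_t addr, uint32_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    return (addr + boundary - 1) & ~(boundary - 1);
}

/* On ARM, ".align n" asks for a boundary of 2^n bytes. */
uint32_t align_pow2_up(uint32_t addr, unsigned n)
{
    assert(n < 32);
    return balign_up(addr, (uint32_t)1 << n);
}
```

Starting from address 0x40008001, a 16-byte boundary gives 0x40008010 and a 4-byte boundary gives 0x40008004; an address that is already aligned is left unchanged.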

Fourth, GNU pseudo-instructions

Key point: at assembly time, a pseudo-instruction is converted into the corresponding ARM instruction or instructions.

  1. ADR pseudo-instruction: loads the address of a label into a register. ADR is a short-range address-load pseudo-instruction. Relative offset range: when the address is only byte-aligned, -255 to 255 bytes; when the address is word-aligned, -1020 to 1020 bytes. Syntax format:

    ADR{cond} register, label
    ADR R0, label

  2. ADRL pseudo-instruction: loads a medium-range address into a register.

ADRL is a medium-range address-load pseudo-instruction. Relative offset range: when the address is only byte-aligned, -64 KB to 64 KB; when the address is word-aligned, -256 KB to 256 KB.

Syntax format:

    ADRL{cond} register, label
    ADRL R0, label
  3. LDR pseudo-instruction: loads a 32-bit constant or an address into a register. Syntax format:

    LDR{cond} register, =[expr | label-expr]
    LDR R0, =0xFFFF0000     @ compare with: MOV R1, #0x12

Note: (1) Distinguish the LDR pseudo-instruction from the LDR instruction.

The following is the pseudo-instruction:

    ldr r1, =val            @ r1 = val: pseudo-instruction, puts the address of val into r1

The following is the LDR instruction:

    ldr r2, val             @ r2 = *val: ARM instruction, loads the contents at label val into r2
    val: .word 0x11223344

(2) How to use the LDR pseudo-instruction for a long jump:

    ldr pc, =32-bit address

(3) The LDR pseudo-instruction solves the problem of constants that cannot be encoded as an immediate:

    ldr r0, =0x999          @ 0x999 is not a valid immediate
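Two of the rules in this section can be checked with a few lines of C. The sketch below is an illustration under stated assumptions, not what the assembler does internally: adr_in_range simply applies the offset limits quoted above (in ARM state the PC is word-aligned, so the offset is word-aligned exactly when the label address is), and is_arm_immediate uses the ARM data-processing encoding rule that an immediate must be an 8-bit value rotated right by an even number of bits. Both function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ADR reach as quoted in the text: +/-1020 bytes for word-aligned
 * offsets, +/-255 bytes otherwise. offset = label address - PC value. */
bool adr_in_range(int32_t offset)
{
    int32_t limit = (offset % 4 == 0) ? 1020 : 255;
    return offset >= -limit && offset <= limit;
}

/* True if value can be written as an 8-bit constant rotated right by an
 * even amount (0, 2, ..., 30), i.e. it fits in a MOV immediate. */
bool is_arm_immediate(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a rotate-right by rot. */
        uint32_t undone = (rot == 0)
            ? value
            : (value << rot) | (value >> (32 - rot));
        if (undone <= 0xFFu) {
            return true;
        }
    }
    return false;
}
```

For example, 0x12 and 0xFF000000 pass the immediate test, while 0x999 and 0xFFFF0000 do not, which is why the text loads them with the LDR pseudo-instruction.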

Fifth, compiling GNU assembly

1. Compiling without an lds file

Suppose we have the following code, consisting of one main.c file and one start.s file:

start.s

    .global _start
    _start:                     @ assembly entry
        ldr sp, =0x41000000
        b main

    .global mystrcopy
    .text
    mystrcopy:
        ldrb r2, [r1], #1       @ load one byte from the source, then r1 += 1
        strb r2, [r0], #1       @ store it to the destination, then r0 += 1
        cmp r2, #0              @ was it the terminating '\0'?
        bne mystrcopy           @ no: copy the next byte
        mov pc, lr              @ yes: return to the caller
    stop:
        b stop                  @ infinite loop to keep the program from running away, equivalent to while(1)
    .end                        @ end of the assembly program

main.c

    extern void mystrcopy(char *d, const char *s);

    int main(void)
    {
        const char *src = "yikoulinux";
        char dest[20] = {};

        mystrcopy(dest, src);   // call the mystrcopy function implemented in assembly
        while (1);
        return 0;
    }
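If the post-increment addressing in start.s is hard to follow, it may help to read the routine as C. The function below is a reference sketch of what mystrcopy is meant to do, assuming the usual ARM calling convention in which the first argument (d) arrives in r0 and the second (s) in r1; the name mystrcopy_ref is made up so that it does not clash with the assembly symbol.

```c
#include <assert.h>
#include <string.h>

/* C reference for the assembly routine: copies bytes from s to d,
 * including the terminating '\0', one byte per iteration. */
void mystrcopy_ref(char *d, const char *s)
{
    char c;
    do {
        c = *s++;       /* LDRB r2, [r1], #1 : load a byte, then advance the source */
        *d++ = c;       /* STRB r2, [r0], #1 : store the byte, then advance the destination */
    } while (c != 0);   /* CMP r2, #0 : stop once the '\0' has been copied */
}
```

Calling it with a 20-byte destination and the string "yikoulinux" leaves an identical, NUL-terminated copy in the destination, which is the result main.c expects from the assembly version.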

The Makefile is written as follows:

    1.  TARGET=start
    2.  TARGETC=main
    3.  all:
    4.      arm-none-linux-gnueabi-gcc -O0 -g -c -o $(TARGETC).o $(TARGETC).c
    5.      arm-none-linux-gnueabi-gcc -O0 -g -c -o $(TARGET).o $(TARGET).s
    6.      #arm-none-linux-gnueabi-gcc -O0 -g -S -o $(TARGETC).s $(TARGETC).c
    7.      arm-none-linux-gnueabi-ld $(TARGETC).o $(TARGET).o -Ttext 0x40008000 -o $(TARGET).elf
    8.      arm-none-linux-gnueabi-objcopy -O binary -S $(TARGET).elf $(TARGET).bin
    9.  clean:
    10.     rm -rf *.o *.elf *.dis *.bin

The lines of the Makefile have the following meanings:

  1. Defines the variable TARGET=start, where start is the name of the assembly file.

  2. Defines the variable TARGETC=main, where main is the name of the C file.

  3. The target all; lines 4 to 8 are the commands of this target.

  4. Compiles main.c to generate main.o; $(TARGETC) is replaced with main.

  5. Assembles start.s to generate start.o; $(TARGET) is replaced with start.

  6. A commented-out command that uses -S to translate main.c into the assembly source main.s instead of an object file.

  7. Runs ld to link main.o and start.o into start.elf. -Ttext 0x40008000 sets the start address of the code section to 0x40008000.

  8. Uses objcopy to convert start.elf into start.bin. -O binary (or --output-target=binary) produces a raw binary; -S (or --strip-all) leaves out relocation and symbol information, which reduces the file size.

  9. The clean target.

  10. The command of the clean target, which removes the temporary files generated by the build.

[supplementary]

  1. GCC optimization levels: the compile commands in the Makefile accept four levels, -O0 to -O3. The higher the number, the more aggressive the optimization; -O3 is the highest.

  2. volatile: accesses to a variable declared volatile are not optimized away by the compiler; every access really reads or writes the memory address.
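The point about volatile matters most when polling hardware registers. The following C sketch shows the typical pattern; the function poll_until_set and its parameters are hypothetical, and in a real driver reg would point at a device status register rather than an ordinary variable.

```c
#include <assert.h>
#include <stdint.h>

/* Reads *reg until (value & mask) is non-zero or max_polls reads have
 * been made. Because reg points to volatile data, every iteration
 * performs a real load from memory instead of reusing an earlier value.
 * Returns the number of reads performed. */
unsigned poll_until_set(const volatile uint32_t *reg, uint32_t mask, unsigned max_polls)
{
    unsigned reads = 0;
    while (reads < max_polls) {
        reads++;
        if ((*reg & mask) != 0) {
            break;
        }
    }
    return reads;
}
```

Without volatile, an optimizing compiler would be allowed to read the location once and then spin on the stale value, which is exactly the behavior the qualifier prevents.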

2. Compiling with an lds file

Real projects have far more complex section layouts than ours. The Linux kernel in particular has tens of thousands of files and an extremely complex section layout, so we define the memory layout with the help of an lds (linker script) file.

File list

main.c and start.s are identical to the previous section.

map.lds

    OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm")
    /*OUTPUT_FORMAT("elf32-arm", "elf32-arm", "elf32-arm")*/
    OUTPUT_ARCH(arm)
    ENTRY(_start)
    SECTIONS
    {
        . = 0x40008000;
        . = ALIGN(4);
        .text :
        {
            start.o(.text)
            *(.text)
        }
        . = ALIGN(4);
        .rodata :
        { *(.rodata) }
        . = ALIGN(4);
        .data :
        { *(.data) }
        . = ALIGN(4);
        .bss :
        { *(.bss) }
    }

The example above is explained as follows:

  1. OUTPUT_FORMAT("elf32-littlearm", "elf32-littlearm", "elf32-littlearm") specifies the default binary format of the output object file. You can use objdump -i to list the supported binary formats.

  2. OUTPUT_ARCH(arm) specifies ARM as the output platform. You can use objdump -i to list the supported platforms.

  3. ENTRY(_start): sets the value of the symbol _start as the entry address.

  4. . = 0x40008000: sets the location counter to 0x40008000 (if it is not set, its initial value is 0).

  5. .text : { start.o(.text) *(.text) }: the first part places the .text section of start.o at the very beginning of the text section; the second part merges the .text sections of all input files (* matches any input file).

  6. .rodata : { *(.rodata) }: merges the .rodata sections of all input files into one .rodata section.

  7. .data : { *(.data) }: merges the .data sections of all input files into one .data section.

  8. .bss : { *(.bss) }: merges the .bss sections of all input files into one .bss section; this section usually holds uninitialized global variables.

  9. . = ALIGN(4); aligns the section that follows to 4 bytes.

Each time the linker reads a section description, it increases the value of the location counter by the size of that section.
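The way the location counter moves through SECTIONS can be imitated with a short C function. This is a simplified model only: the name layout_sections is made up, the section sizes are hypothetical inputs, and a real linker also honors each input section's own alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models the location counter "." from map.lds: before each output
 * section it is aligned to 4 bytes (". = ALIGN(4);"), the section starts
 * at that address, and the counter then grows by the section size.
 * starts[] receives the start address of each section.
 * Returns the final value of the counter. */
uint32_t layout_sections(uint32_t base, const uint32_t *sizes, size_t count, uint32_t *starts)
{
    uint32_t dot = base;
    for (size_t i = 0; i < count; i++) {
        dot = (dot + 3u) & ~3u;     /* . = ALIGN(4); */
        starts[i] = dot;            /* the section begins here */
        dot += sizes[i];            /* counter increases by the section size */
    }
    return dot;
}
```

With a base of 0x40008000 and made-up sizes of 0x66, 0x0b, 0x04 and 0x10 bytes for .text, .rodata, .data and .bss, the sections start at 0x40008000, 0x40008068, 0x40008074 and 0x40008078.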

Let’s see how a Makefile should be written:

    # cortex-a9 PERI DRIVER CODE
    # VERSION 1.0
    # AUTHOR Linux
    # MODIFY DATE
    # 2020.11.17 Makefile
    #=================================================#
    CROSS_COMPILE = arm-none-linux-gnueabi-
    NAME = start
    CFLAGS = -mfloat-abi=softfp -mfpu=vfpv3 -mabi=apcs-gnu -fno-builtin -fno-builtin-function -g -O0 -c
    LD = $(CROSS_COMPILE)ld
    CC = $(CROSS_COMPILE)gcc
    OBJCOPY = $(CROSS_COMPILE)objcopy
    OBJDUMP = $(CROSS_COMPILE)objdump
    OBJS = start.o main.o
    #================================================#
    all: $(OBJS)
    	$(LD) $(OBJS) -T map.lds -o $(NAME).elf
    	$(OBJCOPY) -O binary $(NAME).elf $(NAME).bin
    	$(OBJDUMP) -D $(NAME).elf > $(NAME).dis
    %.o: %.S
    	$(CC) $(CFLAGS) -c -o $@ $<
    %.o: %.s
    	$(CC) $(CFLAGS) -c -o $@ $<
    %.o: %.c
    	$(CC) $(CFLAGS) -c -o $@ $<
    clean:
    	rm -rf $(OBJS) *.elf *.bin *.dis *.o

The build finally generates start.bin, and this file can be burned to the development board for testing. Because this example produces no visible effect, we will add other functions and test again in a later article.

[note]

  1. The cross-compilation toolchain (arm-none-linux-gnueabi- here) should be chosen according to the actual platform. This example uses the toolchain for Samsung's Exynos 4412.

  2. The address 0x40008000 was not chosen arbitrarily. Readers can find the corresponding address in the SoC manual of the development board they are using.

Linux kernel exception vector table

The memory layout of the Linux kernel is also defined by an lds file. Building the kernel is not covered here; after the build, the lds file is generated at the following location:

    arch/arm/kernel/vmlinux.lds

Let's take a look at part of that file:

  1. OUTPUT_ARCH(arm) specifies the target processor.

  2. ENTRY(stext) indicates that the entry point of the program is stext.

We can also see that the memory layout of Linux is more complex; we will continue to analyze this file when we discuss the Linux kernel.

3. Differences between ELF and bin files:

1) ELF

The ELF file format is an open standard used by executables on various UNIX systems and comes in three different types:

  • Relocatable file (relocatable object file)

  • Executable file

  • Shared object (shared library)

The ELF format provides two different perspectives. The linker sees the ELF file as a collection of sections, and the loader sees the ELF file as a collection of segments.

2) bin

The bin file is a raw binary file with no internal address markers. Its contents are laid out according to the physical addresses of the code and data sections. A programmer usually burns it starting from address 0; if the file is downloaded into memory to run, it must be downloaded to the address chosen at build time.

On a Linux OS, executables follow the ELF format. For example, gcc -o test test.c generates the file test in ELF format, and it is ready to run. When an ELF file is executed, the kernel uses the loader to parse the file and run it.

In embedded development, if the program runs at power-up without an OS, burning the ELF file will fail, because the ELF file also contains sections such as the symbol table and the string table. Using objcopy to generate a pure binary file removes these sections and keeps only the code and data sections, so the program can be executed instruction by instruction.

ELF files contain symbol tables and similar information. The bin file is an image of the code, data, and custom sections extracted from the ELF file.

Moreover, the position of the code and data sections inside the ELF file is not their actual physical location; the actual physical location is recorded in tables inside the file.
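A quick way to tell the two kinds of file apart is to look at the first four bytes: every ELF file begins with the magic bytes 0x7F 'E' 'L' 'F', while a raw bin image starts directly with code or data. The C sketch below checks for that signature; the function name looks_like_elf is made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the buffer starts with the ELF magic number
 * 0x7F 'E' 'L' 'F'. A raw .bin image has no such header. */
bool looks_like_elf(const unsigned char *image, size_t length)
{
    return length >= 4 &&
           image[0] == 0x7F && image[1] == 'E' &&
           image[2] == 'L'  && image[3] == 'F';
}
```

A loader on an OS performs this kind of check before parsing the headers; a bare-metal CPU does no parsing at all and simply starts executing whatever bytes sit at the boot address, which is why it needs the bin file.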
