1: Specification of ARM assembly

In ARM assembly, every label must start in the first column of its line and, unlike a label in a high-level language, takes no trailing colon. Instructions, by contrast, must not start in the first column. The ARM assembler is case-sensitive for identifiers, so labels must be written with the same case wherever they are used. An ARM instruction mnemonic, directive, or register name may be written entirely in uppercase or entirely in lowercase, but not in mixed case. A comment begins with “;” (semicolon) and runs from the semicolon to the end of the line; a comment may start in the first column.

The standard format of an ARM assembly source line is: [label] <instruction | condition | S> <operands> [;comment]. Blank lines are allowed in the source program, and inserting them where appropriate improves the readability of the code. If a single line is too long, it can be split using the character “\”, which must not be followed by any other character, including spaces and tabs. For variable settings and constant definitions, the identifier must start in the first column.
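The layout rules above can be sketched as a small checker. This is only an illustration written in Python, not part of any ARM tool: the word list `KNOWN_WORDS`, the function name `check_line`, and the sample source lines are all hypothetical, and the sketch assumes that no “;” appears inside a string.

```python
# Hypothetical, deliberately tiny sample of mnemonics and directives;
# a real assembler recognises many more.
KNOWN_WORDS = {"MOV", "ADD", "SUB", "B", "BL", "LDR", "STR", "AREA", "END"}

def check_line(line):
    """Return a list of layout problems found in one source line."""
    problems = []
    code = line.split(";", 1)[0].rstrip()    # drop the comment part
    if not code.strip():
        return problems                       # blank or comment-only line
    tokens = code.split()
    if not code[0].isspace():                 # something starts in column 1
        first = tokens[0]
        if first.endswith(":"):
            problems.append("label must not end with a colon")
        if first.upper() in KNOWN_WORDS:
            problems.append("instruction must not start in column 1")
        tokens = tokens[1:]                   # what follows the label
    if tokens:
        word = tokens[0]
        if word.upper() in KNOWN_WORDS and word not in (word.upper(), word.lower()):
            problems.append("mnemonic must not mix upper and lower case")
    return problems

samples = [
    "start   MOV r0, #0      ; label in column 1, no colon",
    "        mov r0, #0",
    "loop:   ADD r1, r1, #1",
    "MOV r0, #0",
    "        Mov r0, #0",
]
for text in samples:
    print(repr(text), "->", check_line(text))
```

The first two sample lines follow the rules and produce no problems; the last three each break one rule (trailing colon, instruction in column 1, mixed-case mnemonic).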

Examples of correct and incorrect assembly instructions are as follows:

Correct example:

Examples of mistakes:

2: Symbols

In ARM assembly, symbols can stand for addresses, variables, and numeric constants. A symbol that stands for an address is also called a label. In other words, a symbol is the name of a variable, the name of a numeric constant, or a label. Symbol names follow these rules:

1. Symbols are made up of uppercase and lowercase letters, digits, and underscores;

2. Symbols cannot begin with a digit, except for local labels, which do begin with a digit;

3. Symbols are case-sensitive and all characters are significant;

4. A symbol must be unique within its scope;

5. A symbol must not have the same name as a symbol built into or predefined by the system;

6. A symbol must not have the same name as an instruction mnemonic or a directive (pseudo-instruction).
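As a rough illustration, the six rules can be expressed as a validation function. This is a minimal sketch in Python under stated assumptions: the reserved-name sets are hypothetical and far from complete, the function name `is_valid_symbol` is invented here, a local label is modelled simply as a name that begins with a digit, and reserved names are rejected in any case variant, which is stricter than the rules require.

```python
import re

# Hypothetical, tiny samples; the real sets of reserved names are much larger.
MNEMONICS_AND_DIRECTIVES = {"MOV", "ADD", "B", "AREA", "END", "EQU"}
PREDEFINED = {"R0", "R1", "PC", "LR", "SP"}

_SYMBOL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")      # rules 1 and 2
_LOCAL_LABEL = re.compile(r"[0-9][A-Za-z0-9_]*\Z")     # exception in rule 2

def is_valid_symbol(name, already_defined=(), local_label=False):
    """Check a candidate symbol name against the rules listed above."""
    if local_label:
        return bool(_LOCAL_LABEL.match(name))
    if not _SYMBOL.match(name):
        return False                 # bad character or leading digit
    if name in already_defined:
        return False                 # rule 4; the comparison is case-sensitive (rule 3)
    if name.upper() in PREDEFINED | MNEMONICS_AND_DIRECTIVES:
        return False                 # rules 5 and 6 (conservative: any case variant)
    return True

print(is_valid_symbol("loop_1"))                              # ordinary symbol
print(is_valid_symbol("1loop"))                               # leading digit
print(is_valid_symbol("Count", already_defined={"count"}))    # differs only in case
```

Because symbols are case-sensitive, `Count` and `count` are treated as two different names in the last call.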

3: Constants

There are three kinds of constants: numeric constants, character constants, and Boolean constants.

1. There are three ways to represent a numeric constant:

Decimal numbers, such as 12, 5, 876, 0;

Hexadecimal numbers, such as 0x4387, 0xFF0, 0x1;

Base-n numbers, written as n_XXX, where n is a base from 2 to 9 and XXX is the digit sequence in that base, such as 2_010111 and 8_4363156.

2. Character constants

A character constant (a string) is written as a pair of double quotes enclosing the string. Standard C language escape characters can also be used. To include a double quote or a $ in the string, write "" or $$ instead. For example:

Hello SETS "Hello World!"

Errorl SETS "The parameter ""VFH"" error$$2"

3. Boolean constants

A Boolean constant is either logical true, written {TRUE}, or logical false, written {FALSE}. For example: testno SETL {FALSE}
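To make the constant notations concrete, here is a minimal Python sketch that converts them to ordinary values. It assumes the base-n form is written with an underscore between the base and the digits (n_XXX, as in the ARM assembler), and it handles only the two string escapes named above, not the C escape sequences. The function names `parse_numeric` and `expand_string` are hypothetical.

```python
def parse_numeric(text):
    """Parse a decimal, 0x-prefixed hexadecimal, or n_XXX (base 2 to 9) constant."""
    if text.lower().startswith("0x"):
        return int(text[2:], 16)
    if "_" in text:
        base_text, digits = text.split("_", 1)
        base = int(base_text)
        if not 2 <= base <= 9:
            raise ValueError("base must be between 2 and 9")
        return int(digits, base)     # raises ValueError on a digit outside the base
    return int(text, 10)

def expand_string(body):
    """Undo the two escapes described above: "" for a quote and $$ for a dollar sign."""
    return body.replace('""', '"').replace("$$", "$")

print(parse_numeric("876"))          # decimal
print(parse_numeric("0xFF0"))        # hexadecimal
print(parse_numeric("2_010111"))     # base 2
print(expand_string('The parameter ""VFH"" error$$2'))
```

Here `body` is the text between the outer double quotes of the string constant.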

4: Basic format and description of instructions

The basic format of an ARM instruction is:

<opcode>{<cond>}{S} <Rd>, <Rn>{, <operand2>}

Items in <> are required; items in {} are optional.

The descriptions are as follows:

opcode: the instruction mnemonic

cond: the execution condition

S: whether the instruction affects the value of the CPSR register

Rd: the destination register

Rn: the register holding the first operand

operand2: the second operand

Using the condition code “cond” makes logical operations more efficient, because it saves branches and conditional statements and so improves code efficiency. Almost all ARM instructions can be executed conditionally, whereas in the Thumb instruction set only the B (branch) instruction can be. If an instruction does not specify a condition code, it executes unconditionally (AL) by default.

5: ARM instruction condition field

ARM instructions are generally divided into five fields:

The first field is cond, the condition code field associated with conditional execution;

The second field is the instruction code field, opcode;

The third field is the address base Rn, the register holding the first operand;

The fourth field is the destination or source register Rd;

The fifth field is the address offset or operand register, that is, the operand field operand2 (op2).

The five fields of the above instruction are: 0000 0010 1001 0001 0000 0000 0000 1000

Conditional execution: the cond field has 4 bits, giving 16 combinations in total. Together with the flags in the CPSR, it determines whether the instruction is executed, as shown in Table 2.1.

For example, the instruction ADDEQ R4, R3, #1 carries the EQ condition and is executed only when the Z flag in the CPSR is set (Z = 1).
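To see how the 32-bit pattern quoted above splits into the five fields, here is a minimal Python sketch. It assumes the standard ARM data-processing layout (cond in bits 31 to 28, opcode in bits 24 to 21, the S bit in bit 20, Rn in bits 19 to 16, Rd in bits 15 to 12, operand2 in bits 11 to 0); the function name `decode_fields` is hypothetical.

```python
# Condition mnemonics indexed by the 4-bit cond value.
# 1111 is shown as NV; it is reserved or has a special meaning in most architecture versions.
COND_NAMES = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]

def decode_fields(word):
    """Split a 32-bit data-processing instruction word into its fields."""
    return {
        "cond": COND_NAMES[(word >> 28) & 0xF],
        "opcode": (word >> 21) & 0xF,
        "S": (word >> 20) & 0x1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "operand2": word & 0xFFF,
    }

bits = "0000 0010 1001 0001 0000 0000 0000 1000"
word = int(bits.replace(" ", ""), 2)
print(hex(word), decode_fields(word))
```

Under this layout the pattern decodes to cond = EQ, opcode = 4 (which is ADD), S = 1, Rn = 1, Rd = 0 and operand2 = 8, that is, an ADD that executes only when Z = 1 and updates the flags.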

The relationship between the condition suffix and the S suffix is as follows:

1. If an instruction has both a condition suffix and an S suffix, the S is written last, as in ADDEQS R1,R0,R2. This instruction executes when Z=1, placing the value of R0+R2 in R1 and updating the condition flags.

2. The condition suffix tests the condition flags, while the S suffix updates them.

3. The condition suffix tests the flags before the instruction executes, while the S suffix changes the flags according to the result of the instruction.
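The order of events, test first and update afterwards, can be modelled in a few lines of Python. This is a simplified sketch, not an emulator: it covers only ADD and only three conditions, it assumes both operands are unsigned 32-bit values, and the names `CONDITIONS` and `add` are hypothetical.

```python
MASK32 = 0xFFFFFFFF

# A subset of the condition table, written as predicates over the flags.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,
    "NE": lambda f: f["Z"] == 0,
    "AL": lambda f: True,
}

def add(cond, set_flags, rn, op2, flags):
    """Model ADD{cond}{S}: return (result, flags); result is None if skipped."""
    if not CONDITIONS[cond](flags):      # the condition suffix is tested first
        return None, flags               # skipped: no register or flag changes
    total = rn + op2
    result = total & MASK32
    if set_flags:                        # the S suffix updates the flags afterwards
        flags = {
            "N": result >> 31,                              # sign bit of the result
            "Z": int(result == 0),                          # result is zero
            "C": int(total > MASK32),                       # unsigned carry out
            "V": ((rn ^ result) & (op2 ^ result)) >> 31,    # signed overflow
        }
    return result, flags

# ADDEQS R1,R0,R2 with R0 = 2, R2 = 3, once with Z = 1 and once with Z = 0.
print(add("EQ", True, 2, 3, {"N": 0, "Z": 1, "C": 0, "V": 0}))
print(add("EQ", True, 2, 3, {"N": 0, "Z": 0, "C": 0, "V": 0}))
```

In the first call the instruction runs and clears Z, because the result 5 is not zero; in the second call it is skipped and the flags are returned unchanged.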

Summary:

In this lesson we covered the ARM assembly writing specification, the instruction condition field, and the format of an ARM instruction: <opcode>{<cond>}{S} <Rd>, <Rn>{, <operand2>}, where items in <> are required and items in {} are optional.