
In the previous article on the data structure of classes, we analyzed the structure of OC classes fairly thoroughly. This article continues by analyzing how the methods in a class are called, recording the reasoning along the way. (This article contains about 2396 words.)

This article focuses on

  1. Some common arm64 assembly instructions
  2. Analysis of the objc_msgSend quick lookup process

Preparation

  1. The objc4-818 source code.
  2. This static analysis of the assembly logic is based on the arm64 architecture branch.
  3. The detailed meaning of the macro definitions used in the analysis can be found in the article on common macro definitions in the objc source code.

1. Simple assembly instructions

Before starting the analysis, let's review some arm64 assembly instructions to avoid confusion later when analyzing the objc_msgSend quick lookup flow.

1.1 Data operations

  • mov: copies the value of one register into another register (used only between registers, or between a register and a constant; it cannot access memory addresses)

    mov x1, x0 // copy the value of x0 into x1

  • add: adds the value of one register to the value of another register and saves the result in a third register

    add x0, x1, x2 // add the values of x1 and x2 and save the result in x0

  • sub: subtracts the value of one register from the value of another register and saves the result in a third register

    sub x0, x1, x2 // subtract x2 from x1 and save the result in x0

  • and: bitwise AND of the value of one register with the value of another register (or a constant), saving the result in a register

    and x0, x0, #0x1 // bitwise AND of x0 and the constant 1, saved in x0

  • orr: bitwise OR of the value of one register with the value of another register (or a constant), saving the result in a register

    orr x0, x0, #0x1 // bitwise OR of x0 and the constant 1, saved in x0

  • eor: bitwise XOR of the value of one register with the value of another register (or a constant), saving the result in a register

    eor x0, x0, #0x1 // bitwise XOR of x0 and the constant 1, saved in x0

  • lsr: logical shift right, >>
  • lsl: logical shift left, <<

    lsr x0, x0, #48 // x0 = x0 >> 48; LSR and LSL can also shift the last operand of add, and, eor, etc.
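
To make the semantics of these data-operation instructions concrete, here is a minimal C sketch that models each one on 64-bit values. The helper names (a64_add, a64_lsr, and so on) are hypothetical and exist only for illustration. Real A64 instructions also have flag-setting and shifted-operand variants that this sketch ignores.

```c
#include <stdint.h>

/* Hypothetical helpers: each one models a single A64 data-operation
   instruction on 64-bit register values. They are not a real API. */
uint64_t a64_add(uint64_t x1, uint64_t x2) { return x1 + x2; } /* add x0, x1, x2 */
uint64_t a64_sub(uint64_t x1, uint64_t x2) { return x1 - x2; } /* sub x0, x1, x2 */
uint64_t a64_and(uint64_t x1, uint64_t x2) { return x1 & x2; } /* and x0, x1, x2 */
uint64_t a64_orr(uint64_t x1, uint64_t x2) { return x1 | x2; } /* orr x0, x1, x2 */
uint64_t a64_eor(uint64_t x1, uint64_t x2) { return x1 ^ x2; } /* eor x0, x1, x2 */

/* The shift amount n must be in 0..63; larger shifts are undefined in C. */
uint64_t a64_lsr(uint64_t x1, unsigned n) { return x1 >> n; }  /* lsr x0, x1, #n */
uint64_t a64_lsl(uint64_t x1, unsigned n) { return x1 << n; }  /* lsl x0, x1, #n */
```

Arithmetic on uint64_t wraps modulo 2^64, which matches how a 64-bit register behaves, so a64_sub(2, 3) yields the all-ones value rather than a negative number.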

1.2 Load and store

  • str: writes the value in a register to memory

    str x0, [x0, x8] // write the value of register x0 to the memory at address x0 + x8

  • ldr: reads a value from memory into a register

    ldr x0, [x0, x8] // read the value at memory address x0 + x8 into register x0

1.3 Compare

  • cmp: comparison instruction; compares the values of two registers and sets the condition flags.

    cmp x0, x1 // compare x0 with x1; used together with a conditional jump instruction

  • ccmp: conditional comparison instruction; compares the values of two registers only if the given condition holds, otherwise sets the flags to the given constant.

    ccmp x13, x12, #0x0, ne // if the ne condition holds, compare x13 with x12; otherwise set the NZCV flags to #0x0

  • cbz: compares the register with 0 and jumps if it is 0

  • cbnz: compares the register with 0 and jumps if it is not 0

    cbz x0, address // jump to address if x0 == 0

  • tbz: tests one bit of the register and jumps if the bit is 0

  • tbnz: tests one bit of the register and jumps if the bit is not 0

    tbz x1, #0x6, address // jump to address if bit 6 of x1 is 0
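
The compare-and-branch instructions can be read as simple predicates. The following C sketch models when each branch would be taken. All function names are hypothetical. The last helper models the three-instruction sequence cmp a, #0 / ccmp x, y, #0, ne / b.hi, which as far as I can tell is the pattern used at the end of the cache scan loop in the arm64 source; treat that attribution as an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: each returns true when the branch would be taken. */
bool cbz_taken(uint64_t reg)  { return reg == 0; }  /* cbz  reg, label */
bool cbnz_taken(uint64_t reg) { return reg != 0; }  /* cbnz reg, label */

/* bit must be in 0..63 */
bool tbz_taken(uint64_t reg, unsigned bit)  { return ((reg >> bit) & 1u) == 0; }
bool tbnz_taken(uint64_t reg, unsigned bit) { return ((reg >> bit) & 1u) != 0; }

/* cmp a, #0 ; ccmp x, y, #0, ne ; b.hi label
   If a == 0, ccmp writes NZCV = 0, so the carry flag is clear and b.hi
   is not taken. Otherwise x is compared with y, and b.hi is taken when
   x > y as unsigned values. */
bool cmp_ccmp_hi_taken(uint64_t a, uint64_t x, uint64_t y) {
    return a != 0 && x > y;
}
```

Written this way, the ccmp sequence is just a short-circuit && of two conditions evaluated without an intermediate branch.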

1.4 Jump

  • b: the simplest jump instruction. On a b instruction, the processor immediately jumps to the given destination address and continues execution from there.

    b address // jump to address

  • bl: also a jump instruction, but the address of the next instruction is saved in the LR register before the jump. Execution can therefore return to the instruction after the bl by loading the contents of LR back into the PC.

  • blr: similar to bl, but the jump address is taken from a register.

  • ret: returns from a subroutine by jumping to the address held in LR (x30 by default).

1.5 Condition codes

  • b.le: signed <=
  • b.ge: signed >=
  • b.lt: signed <
  • b.gt: signed >
  • b.eq: ==
  • b.ne: !=
  • b.hi: unsigned >
  • b.hs: unsigned >=
  • b.lo: unsigned <
  • b.ls: unsigned <=

What is the meaning of the b and f after the label number in jump instructions such as b.le 1b and b.gt 4f?

  1. b: backward, f: forward
  2. Taking the objc source code as an example: an assembly source file may define the numeric local labels 1, 2, and 3 several times. The suffix on the jump target gives the direction in which to search for the nearest label with that number.

The instructions above are the ones needed to analyze the source code, and they are among the more common ones. For more arm64 assembly instructions, see the Arm Developer documentation, A64 General Instructions.

2. Quick lookup process

Because the quick lookup in objc_msgSend is written in assembly, the instructions were reviewed first to make the assembly easier to analyze, and they are annotated again during the analysis below. Now let's get down to business and analyze OC's message mechanism, starting with the quick lookup process of objc_msgSend.

An OC-level call such as [xx method] is ultimately converted into a runtime API call. Methods such as alloc and init are all invoked by sending a message through _objc_msgSend(id self, SEL op, ...).

2.1 Finding the objc_msgSend source code

Assembly files have the suffix .s, so search globally for objc_msgSend and collapse all the results to find the source file for the target architecture, or use the filter at the bottom to show only .s files.

When you find the file, expand it, locate ENTRY _objc_msgSend, and start analyzing:

As shown in the figure, the ENTRY code is quite short. Based on cmp p0, #0, there are three logical cases:

  1. LNilOrTagged: the receiver is a tagged pointer; optimized isa -> class -> CacheLookup.
  2. LReturnZero: receiver == nil.
  3. GetClassFromIsa_p16: the receiver is an ordinary object; raw isa -> class -> CacheLookup.

In case 3 (red box 3 in the figure), the isa is used to find the class, and the lookup then continues in the class's cache.

instance -> isa -> class -> cache -> lookup
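
The three-way split at the entry point can be sketched in C as follows. This is only a model of the branch logic, not the runtime's code: the enum and function names are hypothetical, and it assumes that tagged pointers have their most significant bit set, which is why a signed comparison with 0 (b.le) can catch both nil and tagged receivers in one test.

```c
#include <stdint.h>

/* Hypothetical names for the three paths taken after `cmp p0, #0`. */
typedef enum { PATH_RETURN_ZERO, PATH_TAGGED, PATH_RAW_ISA } entry_path_t;

/* Assumption: a tagged pointer sets the most significant bit, so it
   looks negative when p0 is treated as a signed 64-bit value. */
entry_path_t classify_receiver(uint64_t p0) {
    if (p0 == 0) {
        return PATH_RETURN_ZERO;   /* nil receiver: LReturnZero */
    }
    if ((p0 >> 63) != 0) {
        return PATH_TAGGED;        /* MSB set: LNilOrTagged, tagged case */
    }
    return PATH_RAW_ISA;           /* ordinary object: load isa, then CacheLookup */
}
```

For example, classify_receiver(0) selects the return-zero path, while an ordinary heap address such as 0x100004000 selects the isa path.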

2.2 GetClassFromIsa_p16

Next, let's look at GetClassFromIsa_p16 and see how the class is found from the isa and stored in p16:

Note that __LP64__ is a data model, not the CPU architecture of macOS. Besides LP64 there are LLP64, ILP64, SILP64, ILP32, and so on. More information on these models can be found in the article on common macro definitions in the objc source code.
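
On the 64-bit path, getting the class from the isa comes down to one bitwise AND with ISA_MASK. Here is a minimal C sketch of that step. The mask value is my assumption of what objc4-818 defines for arm64 devices (arm64e and simulators use a different value), and the function name is hypothetical.

```c
#include <stdint.h>

/* Assumption: the arm64 device value of ISA_MASK in objc4-818.
   arm64e and simulator builds use a different mask. */
#define SKETCH_ISA_MASK 0x0000000ffffffff8ULL

/* Models `and p16, isa, #ISA_MASK`: keep only the class-pointer bits
   and drop the other fields packed into a non-pointer isa. */
uint64_t class_from_isa(uint64_t isa) {
    return isa & SKETCH_ISA_MASK;
}
```

With a made-up isa value of 0x011d800100008365, the result is 0x100008360: the high bits and the low three flag bits are cleared, and only the class address remains.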

2.3 CacheLookup analysis

After analyzing the ENTRY of objc_msgSend, let's analyze the implementation of CacheLookup to see how the IMP is looked up once the class has been found. This is also the core logic of the quick lookup process.

There is much more code for finding the IMP than for the isa -> class step in ENTRY, and the analysis is correspondingly harder. Fortunately, the logic has been analyzed line by line and the meaning of each instruction is marked in the figure. Here is a simplified summary of the overall logic (pay attention to the key parts):

class -> cache_t -> (buckets -> mask -> index) -> do ... while (buckets[index] down to buckets[0]) -> do ... while (buckets[max] down to buckets[index])
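
The flow summarized above can be sketched in C. This is a simplified model under several assumptions, not the runtime's implementation: the struct and function names are hypothetical, the first word of cache_t is assumed to pack the buckets address in the low 48 bits and the mask in the high 16 bits, and the start index is taken as _cmd & mask (some configurations hash _cmd further before masking).

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified bucket: on arm64 the imp is stored first, then the sel. */
typedef struct { uintptr_t imp; uintptr_t sel; } sketch_bucket_t;

/* Assumed packing: low 48 bits = buckets address, high 16 bits = mask. */
uint64_t cache_buckets_bits(uint64_t packed) { return packed & 0x0000ffffffffffffULL; }
uint64_t cache_mask_bits(uint64_t packed)    { return packed >> 48; }

/* Returns the matching bucket, or NULL on a miss (the real code would
   jump to __objc_msgSend_uncached instead of returning NULL). */
const sketch_bucket_t *cache_lookup(const sketch_bucket_t *buckets,
                                    uintptr_t mask, uintptr_t cmd) {
    uintptr_t first = cmd & mask;   /* index of the first probed bucket */
    uintptr_t i = first;

    /* First loop: scan backward from buckets[index] to buckets[0]. */
    for (;;) {
        if (buckets[i].sel == cmd) return &buckets[i];  /* hit */
        if (buckets[i].sel == 0)   return NULL;         /* empty slot: miss */
        if (i == 0) break;
        i--;
    }

    /* Second loop: wrap to the last bucket and scan backward until the
       first probed bucket is reached again. */
    for (i = mask; i > first; i--) {
        if (buckets[i].sel == cmd) return &buckets[i];
        if (buckets[i].sel == 0)   return NULL;
    }
    return NULL;   /* every bucket was probed without a match */
}
```

The backward scan mirrors the way colliding entries are placed at lower indexes on arm64, so a selector that collides at index 2 would be found at index 1, and one that collides at index 0 would be found after wrapping to the end of the array.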

The lookup ends in one of two ways: CacheHit (a match was found in the cache) or __objc_msgSend_uncached (no match, discussed in the next article).

2.4 CacheHit analysis

Since the sel has been found, the cache has been hit, and the logic is simple: the imp in the {imp, sel} pair was encoded when it was stored, so it is decoded according to the CPU architecture to obtain the real imp:
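
As a rough illustration of the encode and decode step, here is a minimal C sketch. It assumes arm64 without pointer authentication, where (as far as I can tell from the source) the cached imp is XORed with the class pointer; arm64e signs the pointer instead, which this sketch does not model. The function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: arm64 without pointer authentication, where the cached
   imp is stored XORed with the class pointer. */
uint64_t encode_imp(uint64_t imp, uint64_t cls) {
    return imp ^ cls;       /* applied when the bucket is written */
}

uint64_t decode_imp(uint64_t stored, uint64_t cls) {
    return stored ^ cls;    /* models `eor imp, imp, cls` on a cache hit */
}
```

Because XOR is its own inverse, decoding with the same class pointer restores the original imp, while the value stored in the bucket is not directly callable.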

2.5 Viewing the assembly on a real device

At this point, the static analysis of the objc_msgSend quick lookup logic is complete. Next, run a simple piece of code on a real device to check whether the assembly seen during dynamic debugging follows the same logical path as the static analysis. This verification also helps consolidate the understanding of the quick lookup process.

    UIPerson *p = [[UIPerson alloc] init];
    [p superclass];

  1. Run to the breakpoint at [p superclass]
  2. Xcode menu -> Debug -> Debug Workflow -> Always Show Disassembly
  3. Hold Ctrl and click Step Into

My macOS version is too old and my iOS version is too new, so Xcode cannot be upgraded and debugging on the device is not possible. The screenshots will be added later.

3. Summary

The above is a comprehensive analysis of the assembly source code for the objc_msgSend quick lookup. Besides the text analysis, most of the interpretation of the source logic is in the attached screenshots, so the pictures also need to be read carefully.

Up to this point, the IMP has been found in the cache. What if it is not found? Then the flow jumps to __objc_msgSend_uncached and enters the slow lookup, lookUpImpOrForward.

If this article helped you, please like 👍, save ✨, and comment ✍️. If anything is wrong, corrections are welcome 🙆🏻♂️.