The main purpose of this article is to understand the method lookup flow of objc_msgSend

In iOS- Underlying Principles 11: Cache principles in objc_class, we examined the cache write process. Before the write process, there are two cache read paths: objc_msgSend and cache_getImp

Before analyzing it, let's first look at what the Runtime is

Introduction to the Runtime

"Runtime" means run time, as opposed to compile time

  • Run time is the dynamic phase in which the code is running and has been loaded into memory; if something goes wrong at this point, the program crashes

  • Compile time is when the compiler translates source code into code the machine can recognize. It mainly performs the most basic language checks and reports errors, i.e. lexical analysis, syntax analysis, and so on. It is a static phase

The Runtime can be used in the following three ways. The relationship between these three ways, the compiler layer, and the underlying library is shown in the figure

  • Through OC code, such as [person sayNB]

  • Through NSObject methods, such as isKindOfClass

  • Through the Runtime API, such as class_getInstanceSize

The compiler layer is the compiler we know, i.e. LLVM; for example, alloc in OC corresponds to objc_alloc underneath. The Runtime System Library is the underlying library

Explore the nature of the method

The nature of a method

In iOS- Underlying Principles 07: Isa and class association principle, we used the source code produced by clang to understand the nature of OC objects. In the same way, use clang to generate the main.cpp file and look at how the method calls in the main function are implemented, as shown below

// Method calls as written in main
LGPerson *person = [LGPerson alloc];
[person sayNB];
[person sayHello];

// Underlying implementation after clang compilation
LGPerson *person = ((LGPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("LGPerson"), sel_registerName("alloc"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("sayNB"));
((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("sayHello"));

As you can see from the code above, the essence of a method call is sending a message with objc_msgSend

To verify this, make the same call as [person sayNB] directly with objc_msgSend and check whether the two print the same result

Note: two things are needed to call objc_msgSend directly

  • Import the header that declares it: #import <objc/message.h>

  • Select the target -> Build Settings -> search for msg -> change Enable Strict Checking of objc_msgSend Calls from YES to NO. This turns off the strict checking; otherwise the objc_msgSend call is reported as an error

LGPerson *person = [LGPerson alloc];   
objc_msgSend(person,sel_registerName("sayNB"));
[person sayNB];

[person sayNB] is equivalent to objc_msgSend(person, sel_registerName("sayNB"))

Object method calls where the implementation that actually runs belongs to the superclass

Beyond the verification above, we can also try to make a call on person execute the superclass implementation, through objc_msgSendSuper

  • Define two classes, LGPerson and LGTeacher; the parent class (LGTeacher, judging by the super_class assigned in the code below) implements the sayHello method

  • Make the calls in main

LGPerson *person = [LGPerson alloc];
LGTeacher *teacher = [LGTeacher alloc];
[person sayHello];

struct objc_super lgsuper;
lgsuper.receiver = person; // the receiver of the message is still person
lgsuper.super_class = [LGTeacher class]; // tells the runtime which class is the parent

objc_msgSendSuper(&lgsuper, sel_registerName("sayHello"));

objc_msgSendSuper takes two parameters (a struct and a SEL). The struct is of the type defined by objc_super, and its two members, receiver and super_class, must be set. The source implementation and definition are as follows

  • objc_msgSendSuper method parameters

  • objc_super source code definition

The print result is as follows

Both [person sayHello] and objc_msgSendSuper execute the parent class's sayHello implementation, so we can make a guess: a method call first looks in the class itself and, if the method is not found there, looks in the parent class
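The guess above can be sketched as a toy model in plain C. Everything here (toy_class, toy_method, toy_lookup, and the two toy classes) is a hypothetical stand-in invented for illustration, not the runtime's own data structures; the sketch only shows the idea of searching the class first and then walking up the superclass chain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; the real runtime uses objc_class, method_t, SEL and IMP. */
typedef const char *(*toy_imp)(void);

typedef struct {
    const char *sel;   /* selector name */
    toy_imp imp;       /* implementation */
} toy_method;

typedef struct toy_class {
    const char *name;
    const struct toy_class *superclass;   /* NULL for the root class */
    const toy_method *methods;
    size_t method_count;
} toy_class;

/* Search cls first, then each superclass in turn.
   Returns NULL when no class in the chain implements sel. */
static toy_imp toy_lookup(const toy_class *cls, const char *sel) {
    assert(sel != NULL);
    for (const toy_class *c = cls; c != NULL; c = c->superclass) {
        for (size_t i = 0; i < c->method_count; i++) {
            if (strcmp(c->methods[i].sel, sel) == 0) {
                return c->methods[i].imp;
            }
        }
    }
    return NULL;
}

static const char *teacher_sayHello(void) { return "LGTeacher sayHello"; }
static const char *person_sayNB(void) { return "LGPerson sayNB"; }

static const toy_method teacher_methods[] = { { "sayHello", teacher_sayHello } };
static const toy_method person_methods[]  = { { "sayNB", person_sayNB } };

/* Assumption taken from the example above: LGTeacher acts as the parent of LGPerson. */
static const toy_class LGTeacherToy = { "LGTeacher", NULL, teacher_methods, 1 };
static const toy_class LGPersonToy  = { "LGPerson", &LGTeacherToy, person_methods, 1 };
```

With these definitions, toy_lookup(&LGPersonToy, "sayHello") does not find sayHello in the LGPerson table and falls through to the LGTeacher table, which is the behaviour the guess describes.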

With that in mind, let’s explore the source code implementation of objc_msgSend

objc_msgSend fast lookup process analysis

In the objc4-781 source code, search for objc_msgSend. Since everyday development targets the arm64 architecture, look for the implementation in the file with the arm64 suffix. It turns out to be implemented in assembly, and its overall execution flow is as follows

objc_msgSend assembly source code

objc_msgSend is the entry point for message sending and is implemented in assembly. The source of _objc_msgSend is as follows

//---- Message sending -- assembly entry -- objc_msgSend mainly obtains the receiver's isa
    ENTRY _objc_msgSend
//---- No frame
    UNWIND _objc_msgSend, NoFrame

//---- Compare p0 with 0 to check whether the receiver exists; p0 is the first argument of objc_msgSend, the message receiver
    cmp p0, #0          // nil check and tagged pointer check
//---- le (less than or equal): tagged pointer (small object) support
#if SUPPORT_TAGGED_POINTERS
    b.le    LNilOrTagged        //  (MSB tagged pointer looks negative)
#else
//---- p0 == 0: return nil directly
    b.eq    LReturnZero
#endif
//---- From here on p0 (the receiver) is known to exist
//---- Load isa from the object, i.e. from the address held in x0, into p13
    ldr p13, [x0]       // p13 = isa
//---- On 64-bit architectures p16 = isa (p13) & ISA_MASK, i.e. the class
    GetClassFromIsa_p16 p13     // p16 = class
LGetIsaDone:
    // calls imp or objc_msgSend_uncached
//---- With the isa in hand, go to the cache lookup CacheLookup
    CacheLookup NORMAL, _objc_msgSend

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
//---- Equal to nil: return nil
    b.eq    LReturnZero     // nil check

    // tagged
    adrp    x10, _objc_debug_taggedpointer_classes@PAGE
    add x10, x10, _objc_debug_taggedpointer_classes@PAGEOFF
    ubfx    x11, x0, #60, #4
    ldr x16, [x10, x11, LSL #3]
    adrp    x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGE
    add x10, x10, _OBJC_CLASS_$___NSUnrecognizedTaggedPointer@PAGEOFF
    cmp x10, x16
    b.ne    LGetIsaDone

    // ext tagged
    adrp    x10, _objc_debug_taggedpointer_ext_classes@PAGE
    add x10, x10, _objc_debug_taggedpointer_ext_classes@PAGEOFF
    ubfx    x11, x0, #52, #8
    ldr x16, [x10, x11, LSL #3]
    b   LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
    // x0 is already zero
    mov x1, #0
    movi    d0, #0
    movi    d1, #0
    movi    d2, #0
    movi    d3, #0
    ret

    END_ENTRY _objc_msgSend

There are mainly the following steps

  • [Step 1] Check whether receiver, the first argument of objc_msgSend, is nil
    • If tagged pointers are supported and p0 <= 0 (nil or a tagged pointer), jump to LNilOrTagged
      • If the small object is nil, return nil directly, i.e. LReturnZero
      • If the small object is not nil, process the small object's isa and go to [Step 2]
    • If the receiver is neither a small object nor nil, there are two steps
      • Load isa from receiver and store it in register p13
      • Through GetClassFromIsa_p16, on the arm64 architecture isa & ISA_MASK yields the class information held in the shiftcls bit field, i.e. the class. The assembly of GetClassFromIsa_p16 is as follows; then go to [Step 2]
.macro GetClassFromIsa_p16 /* src */
//---- Used on watchOS
#if SUPPORT_INDEXED_ISA
    // Indexed isa
//---- Store the isa value in p16
    mov p16, $0         // optimistically set dst = src
//---- Check whether it is a nonpointer isa
    tbz p16, #ISA_INDEX_IS_NPI_BIT, 1f  // done if not non-pointer isa
    // isa in p16 is indexed
//---- Load the base address of the page containing _objc_indexed_classes into x10
    adrp    x10, _objc_indexed_classes@PAGE
//---- x10 = x10 + offset of _objc_indexed_classes within the page
    add x10, x10, _objc_indexed_classes@PAGEOFF
//---- Starting at bit ISA_INDEX_SHIFT of p16, extract ISA_INDEX_BITS bits into p16
    ubfx    p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index
    ldr p16, [x10, p16, UXTP #PTRSHIFT] // load class from array
1:

//---- For 64-bit systems
#elif __LP64__
    // 64-bit packed isa
//---- p16 = class = isa & ISA_MASK (the shiftcls bits)
    and p16, $0, #ISA_MASK

#else
    // 32-bit raw isa
//---- For 32-bit systems
    mov p16, $0

#endif

.endmacro
  • [Step 2] Once the isa has been obtained, enter the fast lookup process CacheLookup NORMAL
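The entry checks described in the steps above can be mirrored in a few lines of C. This is only a sketch: the function names and the receiver_kind enum are hypothetical, the receiver is modelled as a raw 64-bit value, and TOY_ISA_MASK is assumed to be the arm64 value of ISA_MASK from objc4-781 (other architectures use a different mask).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: arm64 value of ISA_MASK in objc4-781. */
#define TOY_ISA_MASK 0x0000000ffffffff8ULL

typedef enum { RECEIVER_NIL, RECEIVER_TAGGED, RECEIVER_OBJECT } receiver_kind;

/* Mirrors `cmp p0, #0` followed by `b.le LNilOrTagged` / `b.eq LReturnZero`:
   zero is nil, and a value with the most significant bit set ("looks negative")
   is a tagged pointer. */
static receiver_kind classify_receiver(uint64_t receiver) {
    if (receiver == 0) return RECEIVER_NIL;
    if ((receiver >> 63) != 0) return RECEIVER_TAGGED;
    return RECEIVER_OBJECT;
}

/* Mirrors `ubfx x11, x0, #60, #4`: the top 4 bits index the tagged pointer class table. */
static unsigned tagged_class_index(uint64_t receiver) {
    assert(classify_receiver(receiver) == RECEIVER_TAGGED);
    return (unsigned)((receiver >> 60) & 0xF);
}

/* Mirrors `and p16, $0, #ISA_MASK`: keep only the class bits of a packed isa. */
static uint64_t class_from_isa(uint64_t isa_bits) {
    return isa_bits & TOY_ISA_MASK;
}
```

For example, with a made-up packed isa of 0x000021a100008345, class_from_isa keeps bits 3 through 35 and yields 0x0000000100008340.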

CacheLookup assembly source code

//!!!!!!!!! Key part !!!!!!!!!!!!
.macro CacheLookup 
    //
    // Restart protocol:
    //
    //   As soon as we're past the LLookupStart$1 label we may have loaded
    //   an invalid cache pointer or mask.
    //
    //   When task_restartable_ranges_synchronize() is called,
    //   (or when a signal hits us) before we're past LLookupEnd$1,
    //   then our PC will be reset to LLookupRecover$1 which forcefully
    //   jumps to the cache-miss codepath which have the following
    //   requirements:
    //
    //   GETIMP:
    //     The cache-miss is just returning NULL (setting x0 to 0)
    //
    //   NORMAL and LOOKUP:
    //   - x0 contains the receiver
    //   - x1 contains the selector
    //   - x16 contains the isa
    //   - other registers are set as per calling conventions
    //
LLookupStart$1:

//---- p1 = SEL, p16 = isa --- #define CACHE (2 * __SIZEOF_POINTER__), where __SIZEOF_POINTER__ is the size of a pointer, i.e. 2*8 = 16
//---- p11 = mask|buckets -- offset 16 bytes from x16 (the isa) and load cache into p11 -- cache sits exactly 16 bytes in: isa (8 bytes) - superclass (8 bytes) - cache (mask in the high 16 bits + buckets in the low 48 bits)
    ldr p11, [x16, #CACHE]              
//---- 64-bit real device
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 
//--- p11 (cache) & 0x0000ffffffffffff clears the high 16 bits (the mask) and leaves buckets, which is stored in p10
    and p10, p11, #0x0000ffffffffffff   // p10 = buckets 

//--- p11 (cache) shifted right by 48 bits gives mask (the shift applies to the operand only; p11 itself is unchanged); mask & p1 (_cmd, the second argument of objc_msgSend) gives the index of the sel-imp bucket (the search index), stored in p12 (cache insert computes the hash index with sel & mask, so reads must do the same)
    and p12, p1, p11, LSR #48       // x12 = _cmd & mask 

//--- Not a 64-bit real device
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4 
    and p10, p11, #~0xf         // p10 = buckets
    and p11, p11, #0xf          // p11 = maskShift
    mov p12, #0xffff
    lsr p11, p12, p11               // p11 = mask = 0xffff >> p11
    and p12, p1, p11                // x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif

//--- p12 is the index and p10 is the start address of the buckets array; index << 4 (i.e. index * 16) is the byte offset, and adding it to the start address of buckets gives the bucket, stored in p12
//--- LSL #(1+PTRSHIFT) multiplies by the memory size of one bucket -- since mask = capacity - 1, _cmd & mask is equivalent to taking the remainder
    add p12, p10, p12, LSL #(1+PTRSHIFT)   
                     // p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT)) -- PTRSHIFT is 3

//--- Load the bucket at x12 (p12): imp goes into p17 and sel goes into p9
    ldp p17, p9, [x12]      // {imp, sel} = *bucket 

//--- Compare sel with p1 (the _cmd argument)
1:  cmp p9, p1          // if (bucket->sel != _cmd) 
//--- If they are not equal (not found), jump to 2f
    b.ne    2f          //     scan more 
//--- If they are equal it is a cache hit; call or return imp directly
    CacheHit $0         // call or return imp 

2:  // not hit: p12 = not-hit bucket
//--- If the slot is empty (sel == 0) it is a miss; since the mode is NORMAL, jump to __objc_msgSend_uncached
    CheckMiss $0            // miss if bucket->sel == 0 
//--- Check whether p12 (the current bucket) equals p10 (the first element of buckets); if so, jump to 3
    cmp p12, p10        // wrap if bucket == buckets 
//--- At the first bucket: go and wrap around to the last element
    b.eq    3f 
//--- Move x12 (p12) back by BUCKET_SIZE to reach the previous bucket and load its imp and sel into p17 and p9, i.e. search backwards
    ldp p17, p9, [x12, #-BUCKET_SIZE]!  // {imp, sel} = *--bucket 
//--- Jump back to 1 and compare sel with _cmd again
    b   1b          // loop 

3:  // wrap: p12 = first bucket, w11 = mask
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
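//--- Manually move to the last element
//--- p11 (mask|buckets) shifted right by 44 bits amounts to mask shifted left by 4 bits (mask * 16); adding it to p12 lands on the last element of buckets, because the cache is searched backwards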
    add p12, p12, p11, LSR #(48 - (1+PTRSHIFT)) 
                    // p12 = buckets + (mask << 1+PTRSHIFT) 
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
    add p12, p12, p11, LSL #(1+PTRSHIFT)
                    // p12 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif

    // Clone scanning loop to miss instead of hang when cache is corrupt.
    // The slow path may detect any corruption and halt later.
//--- Scan the cache once more
//--- Load imp and sel from the bucket at x12 (p12) into p17 and p9
    ldp p17, p9, [x12]      // {imp, sel} = *bucket 

//--- Compare sel with p1 (the _cmd argument)
1:  cmp p9, p1          // if (bucket->sel != _cmd) 
//--- If they are not equal, go to 2
    b.ne    2f          //     scan more 
//--- If they are equal it is a hit; call or return imp directly
    CacheHit $0         // call or return imp  

2:  // not hit: p12 = not-hit bucket
//--- If the slot is empty (sel == 0), CheckMiss takes the miss path
    CheckMiss $0            // miss if bucket->sel == 0 
//--- Check whether p12 (the current bucket) equals p10 (the first element of buckets) -- nothing is left before it and the selector still has not been found
    cmp p12, p10        // wrap if bucket == buckets 
    b.eq    3f // if equal, jump to 3
//--- Move x12 (p12) back by BUCKET_SIZE to reach the previous bucket and load its imp and sel into p17 and p9, i.e. search backwards
    ldp p17, p9, [x12, #-BUCKET_SIZE]!  // {imp, sel} = *--bucket 
//--- Jump back to 1 and compare sel with _cmd again
    b   1b          // loop 

LLookupEnd$1:
LLookupRecover$1:
3:  // double wrap
//--- Jump to JumpMiss; since the mode is NORMAL, jump to __objc_msgSend_uncached

    JumpMiss $0 
.endmacro

// The following are the macros that are jumped to at the end
.macro CacheHit
.if $0 == NORMAL
    TailCallCachedImp x17, x12, x1, x16 // authenticate and call imp
.elseif $0 == GETIMP
    mov p0, p17
    cbz p0, 9f          // don't ptrauth a nil imp
    AuthAndResignAsIMP x0, x12, x1, x16 // authenticate imp and re-sign as IMP
9:  ret             // return IMP
.elseif $0 == LOOKUP
    // No nil check for ptrauth: the caller would crash anyway when they
    // jump to a nil IMP. We don't care if that jump also fails ptrauth.
    AuthAndResignAsIMP x17, x12, x1, x16    // authenticate imp and re-sign as IMP
    ret             // return imp via x17
.else
.abort oops
.endif
.endmacro

.macro CheckMiss
    // miss if bucket->sel == 0
.if $0 == GETIMP 
//--- For GETIMP, jump to LGetImpMiss
    cbz p9, LGetImpMiss
.elseif $0 == NORMAL 
//--- For NORMAL, jump to __objc_msgSend_uncached
    cbz p9, __objc_msgSend_uncached
.elseif $0 == LOOKUP 
//--- For LOOKUP, jump to __objc_msgLookup_uncached
    cbz p9, __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

.macro JumpMiss
.if $0 == GETIMP
    b   LGetImpMiss
.elseif $0 == NORMAL
    b   __objc_msgSend_uncached
.elseif $0 == LOOKUP
    b   __objc_msgLookup_uncached
.else
.abort oops
.endif
.endmacro

It is mainly divided into the following steps

  • [Step 1] Offset the start address of objc_class by 16 bytes to get cache: isa takes the first 8 bytes and superclass the next 8 bytes, so cache is exactly 16 bytes from the start. In cache, the high 16 bits store mask and the low 48 bits store buckets, i.e. p11 = cache

  • [Step 2] Extract buckets and mask from cache, then use mask to compute the hash index according to the hash algorithm

    • AND cache with the bit mask 0x0000ffffffffffff to clear the high 16 bits (the mask) and get the address of the buckets pointer, i.e. p10 = buckets

    • Shift cache right by 48 bits to get mask. The shift is applied as part of the and instruction, so p11 itself still holds cache

    • AND the second argument of objc_msgSend, p1 (_cmd), with mask to get the index of the bucket that stores sel-imp, i.e. p12 = index = _cmd & mask. When sel-imp is stored, the hash index is computed with the same hash algorithm, so it has to be read in the same way, as shown below

  • [Step 3] Use the hash index and the start address of buckets to fetch the bucket at that index

    • In bucket_t, sel takes 8 bytes and imp takes 8 bytes; PTRSHIFT = 3, so index << (1+PTRSHIFT) equals index * 16

    • Multiply the hash index by the memory size of one bucket to get the offset from the start address of buckets in actual memory

    • The bucket for the hash index is at start address + offset

  • [Step 4] From the bucket obtained, load imp into p17 (p17 = imp) and sel into p9 (p9 = sel)

  • [Step 5] The first loop

    • Compare the sel in the bucket with _cmd (p1), the second argument of objc_msgSend

    • If they are equal, jump to CacheHit: the cache is hit and imp is called or returned

    • If they are not equal, there are three cases

      • CheckMiss runs first: if the bucket's sel is 0 (an empty slot), the lookup is a miss and, since $0 is NORMAL, it jumps to __objc_msgSend_uncached

      • If the current bucket is the first element of buckets, set the bucket to the last element of buckets (add p11 shifted right by 44 bits, which amounts to mask shifted left by 4 bits, to the start address of buckets), then continue with the second loop, i.e. [Step 6]

      • If the current bucket is not the first element of buckets, move back one bucket and repeat the first loop

  • [Step 6] The second loop: it repeats the comparison of [Step 5]; the only difference is that if the current bucket is again the first element of buckets, it jumps to JumpMiss. Since $0 is NORMAL, this jumps to __objc_msgSend_uncached and enters the slow lookup process
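The two scanning loops in Steps 5 and 6 can be approximated in C as follows. This is a hedged sketch rather than the runtime's code: toy_bucket and toy_cache_lookup are hypothetical names, buckets and mask are passed separately instead of packed into one word, and a return value of 0 stands for the miss path (where NORMAL mode would jump to __objc_msgSend_uncached).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for bucket_t: imp first, then sel, as loaded by ldp p17, p9. */
typedef struct {
    uintptr_t imp;
    uintptr_t sel;   /* 0 marks an empty slot */
} toy_bucket;

/* Returns the cached imp, or 0 on a miss.
   mask is assumed to be capacity - 1, with capacity a power of two. */
static uintptr_t toy_cache_lookup(const toy_bucket *buckets, uintptr_t mask, uintptr_t cmd) {
    assert(buckets != NULL && cmd != 0);
    const toy_bucket *bucket = buckets + (cmd & mask);   /* Steps 2-3: index, then bucket */
    int wrapped = 0;

    for (;;) {
        if (bucket->sel == cmd) return bucket->imp;   /* CacheHit */
        if (bucket->sel == 0) return 0;               /* CheckMiss: empty slot */
        if (bucket == buckets) {                      /* reached the first bucket */
            if (wrapped) return 0;                    /* JumpMiss: second wrap */
            wrapped = 1;
            bucket = buckets + mask;                  /* wrap to the last bucket */
        } else {
            bucket--;                                 /* keep scanning backwards */
        }
    }
}
```

As a usage example with made-up values: in a 4-slot table (mask = 3) where selector 0x20 occupies slot 0 and selector 0x10, which also hashes to slot 0, was placed in slot 3, looking up 0x10 misses at slot 0, wraps to slot 3, and hits there. The single wrapped flag plays the role of the duplicated loop in the assembly, which exists so that a corrupt cache produces a miss instead of an endless scan.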

Here’s how the values change throughout the quick lookup process
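As a numeric companion, the register values of the HIGH_16 path can be traced with ordinary integer arithmetic. The struct, the function name, and the sample addresses below are all made up for illustration; the sketch assumes the buckets address fits in 44 bits, which is what lets a single right shift by 44 produce mask << 4.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PTRSHIFT 3   /* log2 of the pointer size on a 64-bit target */

/* Hypothetical record of the values held in the registers during the lookup. */
typedef struct {
    uint64_t buckets;   /* p10: address of the first bucket */
    uint64_t mask;      /* p11 >> 48 */
    uint64_t index;     /* _cmd & mask */
    uint64_t bucket;    /* p12: address of the first bucket probed */
    uint64_t last;      /* address the scan wraps to */
} toy_trace;

static toy_trace toy_trace_lookup(uint64_t cache_word, uint64_t cmd) {
    toy_trace t;
    t.buckets = cache_word & 0x0000ffffffffffffULL;          /* and p10, p11, #0x0000ffffffffffff */
    assert((t.buckets >> 44) == 0);                          /* assumed: top 4 pointer bits are zero */
    t.mask    = cache_word >> 48;                            /* p11, LSR #48 */
    t.index   = cmd & t.mask;                                /* _cmd & mask */
    t.bucket  = t.buckets + (t.index << (1 + TOY_PTRSHIFT)); /* index * 16 bytes */
    t.last    = t.buckets + (cache_word >> (48 - (1 + TOY_PTRSHIFT))); /* buckets + (mask << 4) */
    return t;
}
```

For instance, with the made-up cache word 0x0003000100a04080 and selector address 0x00000001000031a6, the trace gives buckets = 0x100a04080, mask = 3, index = 2, a first probe at 0x100a040a0, and a wrap target of 0x100a040b0.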