What to learn from this article:

  • Why explore objc_msgSend?
  • How the method cache is read and written
  • Objective-C at compile time and at runtime
  • Runtime versions
  • The ways a method can be called
  • The call chain behind a method call
  • Playing with objc_msgSend at the code level
  • objc_msgSend assembly analysis
  • How the objc_msgSend assembly finds the imp for a SEL
  • objc_msgSend assembly flow chart
  • The source code

Why explore objc_msgSend

objc_msgSend* is one of the readers of the method cache, so exploring it follows naturally from studying how the cache is read and written.

Cache read and write flow

Lines 62-77 of objc-cache.mm describe the cache readers and writers as follows:

  • Cache readers (PC-checked by collecting_in_critical())
    • objc_msgSend*
    • cache_getImp
  • Cache readers/writers (hold cacheUpdateLock during access; not PC-checked)
    • cache_t::copyCacheNolock (caller must hold the lock)
    • cache_t::eraseNolock (caller must hold the lock)
    • cache_t::collectNolock (caller must hold the lock)
    • cache_t::insert (acquires lock)
    • cache_t::destroy (acquires lock)

Objective-C from compile time to runtime

Before going further, it is worth looking at what Objective-C does at compile time and at runtime:

Compile time:

Compile time, as the name suggests, is when the code is being compiled. What is compilation? The compiler translates the source code into code the machine can recognize, and performs compile-time type checking (static type checking) along the way. If it finds problems during translation it reports errors or warnings, and it also applies optimizations to improve execution efficiency. All of this is done by LLVM.

Runtime:

Runtime is when the program has been loaded into memory and is executing. Type checking at this stage is runtime type checking, which differs from compile-time type checking: it is not a simple scan of types and static analysis.

Runtime version

Objective-C Runtime Programming Guide

  • Legacy version
    • The earlier version, for Objective-C 1.0 on 32-bit Mac OS X platforms
  • Modern version
    • The current version, for Objective-C 2.0: iPhone applications and 64-bit programs on Mac OS X v10.5 and later

Exploring objc_msgSend through the Runtime API

objc_msgSend is a Runtime API whose job is essentially to send a message, so before exploring it I have a few questions about who the message is sent to and how it is sent:

  1. Who calls objc_msgSend, and for what purpose?
  2. Why is Objective-C code translated into objc_msgSend calls at compile time, and what are the advantages?
  3. Are parameters passed? If so, how are they stored?
  4. How is objc_msgSend implemented in the source? Why is it written in assembly rather than C/C++?
  5. What is objc_msgSendSuper?

How methods are called

Everyday code is written at the Objective-C code layer, together with the many frameworks and services that live on it. The Runtime System Library is the underlying library, and the compiler sits in the middle layer, supporting both the frameworks above and the runtime below.

  • First: method calls at the Objective-C level

  • Second: the related APIs called through NSObject

  • Third: the underlying APIs provided by objc

Method call chain:

Case code:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        FFPerson *person = [FFPerson alloc];
        // Call style 1: Objective-C message syntax
        [person likeGirls];
        // Call style 2: Framework API (NSObject)
        [person performSelector:@selector(likeGirls)];
    }
    return 0;
}

Compiled to a .cpp file:

int main(int argc, const char * argv[]) {
    /* @autoreleasepool */ { __AtAutoreleasePool __autoreleasepool; 

        FFPerson *person = ((FFPerson *(*)(id, SEL))(void *)objc_msgSend)((id)objc_getClass("FFPerson"), sel_registerName("alloc"));
        ((void (*)(id, SEL))(void *)objc_msgSend)((id)person, sel_registerName("likeGirls"));
        ((id (*)(id, SEL, SEL))(void *)objc_msgSend)((id)person, sel_registerName("performSelector:"), sel_registerName("likeGirls"));
    }
    return 0;
}

As the .cpp file shows, the upper-level Objective-C code is translated into objc_msgSend calls at the lower level, including alloc. The alloc call on FFPerson becomes objc_msgSend((id)objc_getClass("FFPerson"), sel_registerName("alloc")) in the .cpp file; simplified, this is objc_msgSend(FFPerson, alloc). In other words, objc_msgSend takes two parameters: the receiver (the receiver of the message) and the SEL (the method selector). The performSelector: call passes a receiver, a SEL, and a second SEL as its argument, which shows that objc_msgSend can take more than two parameters.
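The casts in the generated .cpp can look odd at first, so here is a minimal sketch in plain C of the same idea: look up an untyped implementation pointer by selector, cast it to the exact signature, then call it with the receiver and selector as the first two arguments. Everything here (toy_id, toy_SEL, toy_lookup, the addLikes: method) is a hypothetical toy model and not the real objc runtime API.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins for the runtime types; none of this is the real objc API. */
typedef struct toy_object *toy_id;
typedef const char *toy_SEL;
typedef void (*toy_IMP)(void);   /* untyped entry point, cast before calling */

struct toy_method { toy_SEL sel; toy_IMP imp; };
struct toy_class  { const struct toy_method *methods; int count; };
struct toy_object { const struct toy_class *isa; int likes; };

/* Find the implementation registered for sel in the receiver's class. */
static toy_IMP toy_lookup(toy_id receiver, toy_SEL sel) {
    for (int i = 0; i < receiver->isa->count; i++) {
        if (strcmp(receiver->isa->methods[i].sel, sel) == 0)
            return receiver->isa->methods[i].imp;
    }
    return 0;
}

/* Every implementation takes (self, _cmd) first, then its own arguments. */
static int imp_likeGirls(toy_id self, toy_SEL _cmd) {
    (void)_cmd;
    return ++self->likes;
}
static int imp_addLikes(toy_id self, toy_SEL _cmd, int n) {
    (void)_cmd;
    self->likes += n;
    return self->likes;
}

static const struct toy_method person_methods[] = {
    { "likeGirls", (toy_IMP)imp_likeGirls },
    { "addLikes:", (toy_IMP)imp_addLikes },
};
static const struct toy_class FFPersonToy = { person_methods, 2 };

/* Two parameters: receiver and selector. */
static int send_likeGirls(toy_id p) {
    return ((int (*)(toy_id, toy_SEL))toy_lookup(p, "likeGirls"))(p, "likeGirls");
}

/* Three parameters: receiver, selector, and one method argument. */
static int send_addLikes(toy_id p, int n) {
    return ((int (*)(toy_id, toy_SEL, int))toy_lookup(p, "addLikes:"))(p, "addLikes:", n);
}
```

The cast to a concrete function pointer type plays the same role as ((void (*)(id, SEL))(void *)objc_msgSend) in the generated file: it tells the compiler how many arguments to pass and of which types.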

Playing with objc_msgSend at the code level

The Demo address

Preparations:

  1. Xcode settings: target -> Build Settings -> Apple Clang - Preprocessing -> Enable Strict Checking of objc_msgSend Calls, set to NO. This turns off the compiler's strict checking of objc_msgSend calls so that objc_msgSend can be called freely in your code.
  2. Import the runtime header in the file, that is, #import <objc/message.h>

Case code

#import <Foundation/Foundation.h>
#import "FFPerson.h"
#import "FFBoys.h"
#import <objc/message.h>

// Method calls at the Objective-C layer
void methodCall(void) {
    FFPerson *person = [FFPerson alloc];
    // Call style 1: Objective-C message syntax
    [person likeGirls];
    // Call style 2: Framework API (NSObject)
    [person performSelector:@selector(likeGirls)];
}

// Calls through objc_msgSend
void objc_msgSendCall(void) {
    // Allocate memory for FFPerson
    FFPerson *person = objc_msgSend(objc_getClass("FFPerson"), sel_registerName("alloc"));
    // Call the method
    objc_msgSend(person, sel_registerName("likeGirls"));
}

// Calls through objc_msgSendSuper
void objc_msgSendSuperCall(void) {
    // Allocate memory for FFBoys
    FFBoys *boys = objc_msgSend(objc_getClass("FFBoys"), sel_registerName("alloc"));
    struct objc_super boysSuper;
    boysSuper.receiver = boys;
    // super_class can be the current class FFBoys or the parent class FFPerson.
    // It only specifies the first class in which the method is searched.
    // boysSuper.super_class = objc_getClass("FFBoys");
    boysSuper.super_class = objc_getClass("FFPerson");
    // Call through super
    objc_msgSendSuper(&boysSuper, sel_registerName("likeGirls"));
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        methodCall();
        objc_msgSendCall();
        objc_msgSendSuperCall();
    }
    return 0;
}

Print the result

2021-06-25 23:03:20.412797+0800 001-…[10528:1084950] -[FFPerson likeGirls]
2021-06-25 23:03:20.413186+0800 … -[FFPerson likeGirls]
2021-06-25 23:03:20.413534+0800 001-… -[FFPerson likeGirls]
Program ended with exit code: 0

I defined three functions: methodCall, objc_msgSendCall, and objc_msgSendSuperCall. The first uses plain Objective-C method calls. The second sends the message to FFPerson through objc_msgSend. The last uses objc_msgSendSuper: FFBoys is a class that inherits from FFPerson and declares likeGirls without implementing it, so objc_msgSendSuper finds the method implementation in the parent class, and -[FFPerson likeGirls] is printed as well.
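To make the role of super_class concrete, here is a small sketch in plain C of a lookup that starts at a chosen class and walks up the superclass chain. It is a toy model under stated assumptions: sup_class, sup_super, sup_lookup_from, and sup_send_super are hypothetical names, and the real objc_msgSendSuper also consults the method cache, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef const char *(*sup_imp)(void);

struct sup_method { const char *sel; sup_imp imp; };
struct sup_class {
    const char *name;
    const struct sup_class *superclass;
    const struct sup_method *methods;
    int count;
};
struct sup_object { const struct sup_class *isa; };

/* Toy counterpart of struct objc_super: a receiver plus the class to search first. */
struct sup_super {
    struct sup_object *receiver;
    const struct sup_class *super_class;
};

static const char *person_likeGirls(void) { return "-[FFPerson likeGirls]"; }

/* FFPerson implements likeGirls; FFBoys inherits from it and implements nothing. */
static const struct sup_method person_methods[] = { { "likeGirls", person_likeGirls } };
static const struct sup_class Person = { "FFPerson", NULL, person_methods, 1 };
static const struct sup_class Boys   = { "FFBoys", &Person, NULL, 0 };

/* Search start first, then each superclass in turn. */
static sup_imp sup_lookup_from(const struct sup_class *start, const char *sel) {
    for (const struct sup_class *cls = start; cls != NULL; cls = cls->superclass) {
        for (int i = 0; i < cls->count; i++) {
            if (strcmp(cls->methods[i].sel, sel) == 0)
                return cls->methods[i].imp;
        }
    }
    return NULL;
}

/* Returns the method's result, or NULL when no class in the chain implements sel. */
static const char *sup_send_super(const struct sup_super *sup, const char *sel) {
    sup_imp imp = sup_lookup_from(sup->super_class, sel);
    return imp ? imp() : NULL;
}
```

Starting the search at FFBoys or at FFPerson ends at the same implementation here, which matches what the demo prints: only the starting point of the search changes, not the receiver.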

Conclusion:

  1. When the upper-level code is compiled, each statement has a corresponding translation in the middle layer (C++); a method call corresponds to an objc_msgSend message send.
  2. An upper-level Objective-C method call is translated in the middle layer into objc_msgSend or objc_msgSendSuper.
  3. The shape of the call is objc_msgSend(message receiver, message body (SEL + arguments)).
  4. In objc_super, whichever class super_class is set to is the first class searched when the method is called.
  5. This exploration shows that the essence of a method call is sending a message.

objc_msgSend assembly analysis

How the objc_msgSend assembly finds the imp for a SEL

  • Step 1: cmp p0, #0. p0 is the message receiver; it is compared with 0, because without a receiver objc_msgSend has nothing to do.
  • Step 2: Check whether SUPPORT_TAGGED_POINTERS is set, that is, whether tagged pointers are supported. If so, execute b.le LNilOrTagged, and inside LNilOrTagged execute b.eq LReturnZero for a nil receiver. If not, go straight to b.eq LReturnZero. Reaching LReturnZero means the send is invalid, and objc_msgSend stops sending the message.
  • Step 3: If p0 exists, ldr p13, [x0] loads the value stored at the address in x0 into p13. x0 is the receiver, and the first 8 bytes at its address hold its isa, so p13 = isa.
  • Step 4: Enter GetClassFromIsa_p16 with src=p13, needs_auth=1, auth_address=x0. It first checks SUPPORT_INDEXED_ISA (indexed isa, used on watchOS-class targets); that condition is not met here, so execution continues into the __LP64__ (64-bit) branch.
  • Step 5: Because needs_auth=1, enter the branch ExtractISA p16, \src, \auth_address. ExtractISA is a macro that ANDs \src (the isa) with #ISA_MASK to obtain the Class and saves the result in p16.
  • Step 6: LGetIsaDone: the isa has been obtained. Next, execute CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached:
  1. mov x15, x16 stashes the isa by copying register x16 into x15.
  2. We are exploring the arm64 assembly, so the CACHE_MASK_STORAGE_HIGH_16 branch is taken. It executes ldr p11, [x16, #CACHE]. CACHE is defined as (2 * __SIZEOF_POINTER__), which is 16 on 64-bit, so p11 is loaded from the address x16 + 0x10, that is, 16 bytes past the start of the class, which is where cache_t lives. So p11 = cache_t (its first field, _bucketsAndMaybeMask).
  3. Enter the CONFIG_USE_PREOPT_CACHES branch. __has_feature(ptrauth_calls) applies to A12-class and later chips (pointer authentication); this walkthrough does not take that #if branch and enters the else: and p10, p11, #0x0000fffffffffffe. ANDing p11 with #0x0000fffffffffffe (preoptBucketsMask) yields the buckets address, which is stored in p10. Then tbnz p11, #0, LLookupPreopt\Function tests bit 0 of p11: if it is not 0, jump to LLookupPreopt; if it is 0, continue with the normal buckets lookup.
  4. eor p12, p1, p1, LSR #7. Register p0 is the receiver and p1 is the second parameter, SEL _cmd, so p1 = _cmd and the instruction gives p12 = (_cmd >> 7) ^ _cmd.
  5. and p12, p12, p11, LSR #48. p11 = cache_t = _bucketsAndMaybeMask, and shifting it right by 48 bits leaves the mask, so p12 = p12 & (_bucketsAndMaybeMask >> 48), which is the index into buckets.
  6. add p13, p10, p12, LSL #(1+PTRSHIFT). PTRSHIFT = 3, p10 = buckets, and p12 = index, so p13 = p10 + (p12 << (1+3)). Shifting index left by 4 bits multiplies it by 16, the size of a bucket_t, and adding that offset to the start address of buckets gives the bucket_t to probe.
  7. 1: ldp p17, p9, [x13], #-BUCKET_SIZE. x13 points to a bucket_t whose first value is the imp and whose second value is the sel, so p17 = imp and p9 = sel; x13 is then moved back by one bucket.
  8. cmp p9, p1 compares the sel fetched from the cache (p9) with _cmd, the second argument of objc_msgSend (p1). If they are not equal, b.ne 3f jumps to label 3, which executes three instructions:
    • First: cbz p9, \MissLabelDynamic. If the cached sel is 0, the slot is empty and the sel is not in the cache, so jump to the miss label.
    • Second: cmp p13, p10 is the loop condition: keep looking while the current bucket_t address is greater than or equal to the first address of buckets.
    • Third: b.hs 1b jumps back to label 1 to compare the next sel. If the comparison in cmp p9, p1 is equal, execution falls through to 2: CacheHit, meaning the method was found in the cache.
  • Step 7: CacheHit has three modes: #define NORMAL 0, #define GETIMP 1, and #define LOOKUP 2. Whichever mode is used, the end result is that the imp corresponding to the sel is found and then called or returned.
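The bit arithmetic in sub-steps 3 to 6 is easier to follow with concrete numbers. The sketch below redoes it in plain C for the non-ptrauth CACHE_MASK_STORAGE_HIGH_16 layout described above. The function names and the sample values are assumptions made up for illustration; only the constants come from the instructions in the walkthrough.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BUCKETS_MASK 0x0000fffffffffffeULL /* immediate of: and p10, p11, #... */
#define TOY_MASK_SHIFT   48                    /* the mask lives in the top 16 bits */
#define TOY_PTRSHIFT     3                     /* log2 of the pointer size on 64-bit */

/* p10: the buckets address packed in the low bits of _bucketsAndMaybeMask. */
static uint64_t toy_buckets(uint64_t packed) {
    return packed & TOY_BUCKETS_MASK;
}

/* p11 LSR #48: the mask packed in the high 16 bits. */
static uint64_t toy_mask(uint64_t packed) {
    return packed >> TOY_MASK_SHIFT;
}

/* p12: the hashed index, (_cmd ^ (_cmd >> 7)) & mask. */
static uint64_t toy_index(uint64_t cmd, uint64_t packed) {
    return (cmd ^ (cmd >> 7)) & toy_mask(packed);
}

/* p13: buckets + index * 16, the address of the first bucket to probe. */
static uint64_t toy_bucket_addr(uint64_t cmd, uint64_t packed) {
    return toy_buckets(packed) + (toy_index(cmd, packed) << (1 + TOY_PTRSHIFT));
}
```

For example, with a hypothetical packed value of (7ULL << 48) | 0x100003000 and a selector address of 0x1F5A, the mask is 7, the index is 4, and the bucket address is 0x100003040.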

The source code

_objc_msgSend

	ENTRY _objc_msgSend
	UNWIND _objc_msgSend, NoFrame

	cmp	p0, #0			// nil check and tagged pointer check
#if SUPPORT_TAGGED_POINTERS
	b.le	LNilOrTagged		//  (MSB tagged pointer looks negative)
#else
	b.eq	LReturnZero
#endif
	ldr	p13, [x0]		// p13 = isa
	GetClassFromIsa_p16 p13, 1, x0	// p16 = class
LGetIsaDone:
	// calls imp or objc_msgSend_uncached
	CacheLookup NORMAL, _objc_msgSend, __objc_msgSend_uncached

#if SUPPORT_TAGGED_POINTERS
LNilOrTagged:
	b.eq	LReturnZero		// nil check
	GetTaggedClass
	b	LGetIsaDone
// SUPPORT_TAGGED_POINTERS
#endif

LReturnZero:
	// x0 is already zero
	mov	x1, #0
	movi	d0, #0
	movi	d1, #0
	movi	d2, #0
	movi	d3, #0
	ret
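A single cmp p0, #0 serves both the nil check and the tagged pointer check because b.le is a signed comparison: a receiver whose most significant bit is set looks negative. The sketch below shows that three-way split in plain C. It assumes the MSB tagging scheme mentioned in the source comment, and toy_classify and the enum are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

enum toy_receiver_kind { TOY_NIL, TOY_TAGGED, TOY_NORMAL };

/* Mirrors cmp p0, #0 followed by b.le LNilOrTagged and b.eq LReturnZero. */
static enum toy_receiver_kind toy_classify(uint64_t receiver) {
    if (receiver == 0)
        return TOY_NIL;      /* equal to zero: LReturnZero */
    if (receiver >> 63)
        return TOY_TAGGED;   /* top bit set, negative as a signed value: GetTaggedClass */
    return TOY_NORMAL;       /* positive: fall through to ldr p13, [x0] */
}
```

Checking the top bit directly avoids relying on how a particular compiler converts a large unsigned value to a signed one, while giving the same answer as the signed comparison in the assembly.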

GetClassFromIsa_p16

.macro GetClassFromIsa_p16 src, needs_auth, auth_address /* note: auth_address is not required if !needs_auth */

#if SUPPORT_INDEXED_ISA
	// Indexed isa
	mov	p16, \src			// optimistically set dst = src
	tbz	p16, #ISA_INDEX_IS_NPI_BIT, 1f	// done if not non-pointer isa
	// isa in p16 is indexed
	adrp	x10, _objc_indexed_classes@PAGE
	add	x10, x10, _objc_indexed_classes@PAGEOFF
	ubfx	p16, p16, #ISA_INDEX_SHIFT, #ISA_INDEX_BITS  // extract index
	ldr	p16, [x10, p16, UXTP #PTRSHIFT]	// load class from array
1:

#elif __LP64__
.if \needs_auth == 0 // _cache_getImp takes an authed class already
	mov	p16, \src
.else
	// 64-bit packed isa
	ExtractISA p16, \src, \auth_address
.endif
#else
	// 32-bit raw isa
	mov	p16, \src

#endif

.endmacro

ExtractISA

.macro ExtractISA
	and	$0, $1, #ISA_MASK
#if ISA_SIGNING_AUTH_MODE == ISA_SIGNING_STRIP
	xpacd	$0
#elif ISA_SIGNING_AUTH_MODE == ISA_SIGNING_AUTH
	mov	x10, $2
	movk	x10, #ISA_SIGNING_DISCRIMINATOR, LSL #48
	autda	$0, x10
#endif
.endmacro
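The and $0, $1, #ISA_MASK line is the whole of the class extraction when no pointer authentication is involved. The sketch below shows that masking in plain C. The mask value is an assumption: it is the arm64 ISA_MASK as I recall it from non-ptrauth objc4 builds, and it differs between platforms and runtime versions. toy_extract_class and the sample isa values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value (arm64, no ptrauth); check isa.h in your objc4 version. */
#define TOY_ISA_MASK 0x0000000ffffffff8ULL

/* Strip the non-pointer isa bit fields and keep only the class address bits. */
static uint64_t toy_extract_class(uint64_t isa_bits) {
    return isa_bits & TOY_ISA_MASK;
}
```

A packed isa such as 0x001d800100008209 masks down to 0x100008208, and a raw class pointer that is already 8-byte aligned passes through unchanged.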

CacheLookup

.macro CacheLookup Mode, Function, MissLabelDynamic, MissLabelConstant
	//
	// Restart protocol:
	//
	//   As soon as we're past the LLookupStart\Function label we may have
	//   loaded an invalid cache pointer or mask.
	//
	//   When task_restartable_ranges_synchronize() is called,
	//   (or when a signal hits us) before we're past LLookupEnd\Function,
	//   then our PC will be reset to LLookupRecover\Function which forcefully
	//   jumps to the cache-miss codepath which have the following
	//   requirements:
	//
	//   GETIMP:
	//     The cache-miss is just returning NULL (setting x0 to 0)
	//
	//   NORMAL and LOOKUP:
	//   - x0 contains the receiver
	//   - x1 contains the selector
	//   - x16 contains the isa
	//   - other registers are set as per calling conventions
	//

	mov	x15, x16			// stash the original isa
LLookupStart\Function:
	// p1 = SEL, p16 = isa
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
	ldr	p10, [x16, #CACHE]				// p10 = mask|buckets
	lsr	p11, p10, #48			// p11 = mask
	and	p10, p10, #0xffffffffffff	// p10 = buckets
	and	w12, w1, w11			// x12 = _cmd & mask
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	ldr	p11, [x16, #CACHE]			// p11 = mask|buckets
#if CONFIG_USE_PREOPT_CACHES
#if __has_feature(ptrauth_calls)
	tbnz	p11, #0, LLookupPreopt\Function
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
#else
	and	p10, p11, #0x0000fffffffffffe	// p10 = buckets
	tbnz	p11, #0, LLookupPreopt\Function
#endif
	eor	p12, p1, p1, LSR #7
	and	p12, p12, p11, LSR #48		// x12 = (_cmd ^ (_cmd >> 7)) & mask
#else
	and	p10, p11, #0x0000ffffffffffff	// p10 = buckets
	and	p12, p1, p11, LSR #48		// x12 = _cmd & mask
#endif // CONFIG_USE_PREOPT_CACHES
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	ldr	p11, [x16, #CACHE]				// p11 = mask|buckets
	and	p10, p11, #~0xf			// p10 = buckets
	and	p11, p11, #0xf			// p11 = maskShift
	mov	p12, #0xffff
	lsr	p11, p12, p11			// p11 = mask = 0xffff >> p11
	and	p12, p1, p11			// x12 = _cmd & mask
#else
#error Unsupported cache mask storage for ARM64.
#endif

	add	p13, p10, p12, LSL #(1+PTRSHIFT)
						// p13 = buckets + ((_cmd & mask) << (1+PTRSHIFT))

						// do {
1:	ldp	p17, p9, [x13], #-BUCKET_SIZE	//     {imp, sel} = *bucket--
	cmp	p9, p1				//     if (sel != _cmd) {
	b.ne	3f				//         scan more
						//     } else {
2:	CacheHit \Mode				// hit:    call or return imp
						//     }
3:	cbz	p9, \MissLabelDynamic		//     if (sel == 0) goto Miss;
	cmp	p13, p10			// } while (bucket >= buckets)
	b.hs	1b

	// wrap-around:
	//   p10 = first bucket
	//   p11 = mask (and maybe other bits on LP64)
	//   p12 = _cmd & mask
	//
	// A full cache can happen with CACHE_ALLOW_FULL_UTILIZATION.
	// So stop when we circle back to the first probed bucket
	// rather than when hitting the first bucket again.
	//
	// Note that we might probe the initial bucket twice
	// when the first probed slot is the last entry.


#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16_BIG_ADDRS
	add	p13, p10, w11, UXTW #(1+PTRSHIFT)
						// p13 = buckets + (mask << 1+PTRSHIFT)
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
	add	p13, p10, p11, LSR #(48 - (1+PTRSHIFT))
						// p13 = buckets + (mask << 1+PTRSHIFT)
						// see comment about maskZeroBits
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
	add	p13, p10, p11, LSL #(1+PTRSHIFT)
						// p13 = buckets + (mask << 1+PTRSHIFT)
#else
#error Unsupported cache mask storage for ARM64.
#endif
	add	p12, p10, p12, LSL #(1+PTRSHIFT)
						// p12 = first probed bucket

						// do {
4:	ldp	p17, p9, [x13], #-BUCKET_SIZE	//     {imp, sel} = *bucket--
	cmp	p9, p1				//     if (sel == _cmd)
	b.eq	2b				//         goto hit
	cmp	p9, #0				// } while (sel != 0 &&
	ccmp	p13, p12, #0, ne		//     bucket > first_probed)
	b.hi	4b

LLookupEnd\Function:
LLookupRecover\Function:
	b	\MissLabelDynamic

#if CONFIG_USE_PREOPT_CACHES
#if CACHE_MASK_STORAGE != CACHE_MASK_STORAGE_HIGH_16
#error config unsupported
#endif
LLookupPreopt\Function:
#if __has_feature(ptrauth_calls)
	and	p10, p11, #0x007ffffffffffffe	// p10 = buckets
	autdb	x10, x16			// auth as early as possible
#endif

	// x12 = (_cmd - first_shared_cache_sel)
	adrp	x9, _MagicSelRef@PAGE
	ldr	p9, [x9, _MagicSelRef@PAGEOFF]
	sub	p12, p1, p9

	// w9  = ((_cmd - first_shared_cache_sel) >> hash_shift & hash_mask)
#if __has_feature(ptrauth_calls)
	// bits 63..60 of x11 are the number of bits in hash_mask
	// bits 59..55 of x11 is hash_shift

	lsr	x17, x11, #55			// w17 = (hash_shift, ...)
	lsr	w9, w12, w17			// >>= shift

	lsr	x17, x11, #60			// w17 = mask_bits
	mov	x11, #0x7fff
	lsr	x11, x11, x17			// p11 = mask (0x7fff >> mask_bits)
	and	x9, x9, x11			// &= mask
#else
	// bits 63..53 of x11 is hash_mask
	// bits 52..48 of x11 is hash_shift
	lsr	x17, x11, #48			// w17 = (hash_shift, hash_mask)
	lsr	w9, w12, w17			// >>= shift
	and	x9, x9, x11, LSR #53		// &=  mask
#endif

	ldr	x17, [x10, x9, LSL #3]		// x17 == sel_offs | (imp_offs << 32)
	cmp	x12, w17, uxtw

.if \Mode == GETIMP
	b.ne	\MissLabelConstant		// cache miss
	sub	x0, x16, x17, LSR #32		// imp = isa - imp_offs
	SignAsImp x0
	ret
.else
	b.ne	5f				// cache miss
	sub	x17, x16, x17, LSR #32		// imp = isa - imp_offs
.if \Mode == NORMAL
	br	x17
.elseif \Mode == LOOKUP
	orr x16, x16, #3 // for instrumentation, note that we hit a constant cache
	SignAsImp x17
	ret
.else
.abort  unhandled mode \Mode
.endif

5:	ldursw	x9, [x10, #-8]			// offset -8 is the fallback offset
	add	x16, x16, x9			// compute the fallback isa
	b	LLookupStart\Function		// lookup again with a new isa
.endif
#endif // CONFIG_USE_PREOPT_CACHES

.endmacro
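The two scanning loops in CacheLookup (labels 1 and 4) can be hard to picture from the assembly alone. Below is a sketch of the same probing order in plain C: start at the hashed slot, walk down to slot 0, then wrap around to the last slot and stop at the first probed slot. It is a simplified model, not the runtime code: toy_bucket and toy_cache_lookup are hypothetical names, the preoptimized cache path is ignored, and a miss is reported as 0 instead of jumping to MissLabelDynamic.

```c
#include <assert.h>
#include <stdint.h>

/* arm64 bucket order as loaded by ldp p17, p9: imp first, then sel. */
struct toy_bucket { uintptr_t imp; uintptr_t sel; };

/* Returns the cached imp for cmd, or 0 on a cache miss. */
static uintptr_t toy_cache_lookup(const struct toy_bucket *buckets,
                                  uintptr_t mask, uintptr_t cmd) {
    uintptr_t first = (cmd ^ (cmd >> 7)) & mask;   /* first probed slot */

    /* Loop at label 1: probe from the hashed slot down to slot 0. */
    for (uintptr_t i = first + 1; i-- > 0; ) {
        if (buckets[i].sel == cmd) return buckets[i].imp;  /* label 2: CacheHit */
        if (buckets[i].sel == 0) return 0;                 /* label 3: empty slot, miss */
    }

    /* Loop at label 4: wrap to the last slot and stop above the first probed slot.
       The assembly may probe one slot twice when first == mask; this model skips that. */
    for (uintptr_t i = mask; i > first; i--) {
        if (buckets[i].sel == cmd) return buckets[i].imp;
        if (buckets[i].sel == 0) return 0;
    }
    return 0;  /* fell through to LLookupEnd: miss */
}
```

With a four-slot table (mask 3), a selector whose hashed slot is already taken by another selector is found only after the wrap-around, which is the case the second loop exists for.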

CacheHit

#define NORMAL 0
#define GETIMP 1
#define LOOKUP 2

// CacheHit: x17 = cached IMP, x10 = address of buckets, x1 = SEL, x16 = isa
.macro CacheHit
.if $0 == NORMAL
	TailCallCachedImp x17, x10, x1, x16	// authenticate and call imp
.elseif $0 == GETIMP
	mov	p0, p17
	cbz	p0, 9f			// don't ptrauth a nil imp
	AuthAndResignAsIMP x0, x10, x1, x16	// authenticate imp and re-sign as IMP
9:	ret				// return IMP
.elseif $0 == LOOKUP
	// No nil check for ptrauth: the caller would crash anyway when they
	// jump to a nil IMP. We don't care if that jump also fails ptrauth.
	AuthAndResignAsIMP x17, x10, x1, x16	// authenticate imp and re-sign as IMP
	cmp	x16, x15
	cinc	x16, x16, ne			// x16 += 1 when x15 != x16 (for instrumentation ; fallback to the parent class)
	ret				// return imp via x17
.else
.abort oops
.endif
.endmacro
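The difference between the three modes comes down to what happens to the imp once it is found. As a rough sketch in plain C, NORMAL calls the imp right away, while GETIMP and LOOKUP hand it back to the caller. Pointer authentication and the register details are left out, and toy_cache_hit, toy_hit_result, and toy_double are hypothetical names used only for illustration.

```c
#include <assert.h>

enum toy_mode { TOY_NORMAL = 0, TOY_GETIMP = 1, TOY_LOOKUP = 2 };
typedef int (*toy_imp_fn)(int);

struct toy_hit_result {
    int called;       /* 1 if the imp was invoked */
    int value;        /* result of the call when called == 1 */
    toy_imp_fn imp;   /* imp handed back when it was not called */
};

static struct toy_hit_result toy_cache_hit(enum toy_mode mode, toy_imp_fn imp, int arg) {
    struct toy_hit_result r = { 0, 0, 0 };
    switch (mode) {
    case TOY_NORMAL:              /* TailCallCachedImp: jump straight into the imp */
        r.called = 1;
        r.value = imp(arg);
        break;
    case TOY_GETIMP:              /* return the imp in x0 */
    case TOY_LOOKUP:              /* return the imp in x17 */
        r.imp = imp;
        break;
    }
    return r;
}

/* A stand-in method implementation. */
static int toy_double(int x) { return 2 * x; }
```

objc_msgSend uses the NORMAL mode, which is why a cache hit goes directly into the method without returning to the lookup code.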

objc_msgSend assembly flow chart

Thoughts on exploring low-level assembly

While exploring the objc_msgSend assembly, I found that there is not much essential code and that the logic is quite simple. Compared with upper-level code, assembly is very direct. The difficulty is that the meaning of some instructions is unfamiliar, which discourages some readers. There is no need for that: with some time and effort, even readers with no background can understand it. The most important thing when reading assembly is to stay calm and not rush, not rush, not rush. It is important enough to say three times. I hope every reader comes to move through assembly like a duck in water.