This article uses the objc source code, version 818.2.

1. Introduction to Clang

Clang is a C/C++/Objective-C/Objective-C++ compiler front end, written in C++, whose development is led by Apple. It is built on LLVM and released under the LLVM project's open-source license. It is almost completely compatible with the GNU C dialect (with some incompatibilities, including differences in compiler command-line options) and adds extra syntactic features such as function overloading in C, enabled by marking functions with __attribute__((overloadable)). One of its goals is to surpass GCC.

By April 2013, Clang fully supported the C++11 standard and had begun implementing C++1y features (C++1y became C++14, the next minor update to C++), including generic lambda expressions, simplified handling of return types, and better handling of the constexpr keyword.

1.1 Basic use of Clang

To inspect how our code is implemented under the hood, we usually convert the Objective-C source file into a C++ file:

clang -rewrite-objc main.m -o main.cpp
  • main.m: the source file to convert
  • main.cpp: the converted output file

1.2 UIKit error

When the file uses UIKit, the command above fails with an error. Run the following command instead:

clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-14.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator14.3.sdk ViewController.m

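If iPhoneSimulator14.3.sdk is not found, open the package contents of Xcode.app, look in the SDKs directory from the path above for the SDK that is actually installed, and use its name instead.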

1.3 xcrun

The xcrun command is installed together with Xcode. It adds a thin wrapper around clang that makes it easier to use.

  • Simulator: use the following command

xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

  • Real device: use the following command

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

2. Class

Create a Person class in main.m and use the clang command above to generate the C++ file we need.

@interface Person : NSObject
// Add a property to make sure this is the class we're looking for
@property (nonatomic, copy) NSString *name;
@end

@implementation Person
@end

After the conversion, we find the following structures in the C++ file.

2.1 Class declaration

// @interface Person: NSObject
struct Person_IMPL {
    struct NSObject_IMPL NSObject_IVARS;
    NSString *_name;
};

struct NSObject_IMPL {
    Class isa;
};

We find that an object is itself a struct, and that it contains a member Class isa.

/// Represents an instance of a class.
struct objc_object {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;
};

/// A pointer to an instance of a class.
typedef struct objc_object *id;

In the objc source code we find the definition of objc_object, whose only member is Class isa. This matches the NSObject_IMPL that clang generated, so NSObject_IVARS is what we usually call the isa pointer.

We can declare variables of type id without writing *, because id is already a typedef for struct objc_object *.
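To make the layout concrete, here is a minimal C++ sketch with mock types. MockClassData, MockObject, mock_id, MockNSObject_IMPL, and MockPerson_IMPL are hypothetical stand-ins, not the runtime's definitions. The only point is that isa sits at offset 0, so the class of any instance can be read from its first pointer-sized field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for the runtime types; not the real objc4 definitions.
struct MockClassData { const char *name; };
typedef MockClassData *MockClass;

struct MockObject { MockClass isa; };   // plays the role of objc_object
typedef MockObject *mock_id;            // plays the role of id: already a pointer

struct MockNSObject_IMPL { MockClass isa; };
struct MockPerson_IMPL {
    MockNSObject_IMPL NSObject_IVARS;   // the inherited isa comes first
    const char *_name;
};

// isa is the first field, so it lives at offset 0 of every instance.
static_assert(offsetof(MockPerson_IMPL, NSObject_IVARS) == 0, "isa must come first");
static_assert(sizeof(MockNSObject_IMPL) == sizeof(MockObject), "same shape as the generic object");

// Read the isa of any instance: it is the first pointer-sized field.
// A mock_id converts to const void * implicitly, so it can be passed directly.
inline MockClass classOf(const void *instance) {
    return *static_cast<const MockClass *>(instance);
}

// Build a Person-shaped instance and look up its class name through isa.
inline const char *demoClassName() {
    static MockClassData personClass = { "Person" };
    static MockPerson_IMPL person = { { &personClass }, "Tom" };
    return classOf(&person)->name;
}
```

Calling demoClassName() from a small main function returns the name stored in the mock class, without the caller knowing the concrete instance type.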

2.2 Implementation of class

// @implementation Person

static NSString * _I_Person_name(Person * self, SEL _cmd) 
{ return (*(NSString **)((char *)self + OBJC_IVAR_$_Person$_name)); }
extern "C" __declspec(dllimport) void objc_setProperty (id, SEL, long, id, bool, bool);

static void _I_Person_setName_(Person * self, SEL _cmd, NSString *name) 
{ objc_setProperty (self, _cmd, __OFFSETOFIVAR__(struct Person, _name), (id)name, 0, 1); }
// @end

In the code above, we see two methods:

  1. _I_Person_name: the get method, which returns the instance variable directly.
  2. _I_Person_setName_: the set method, which calls objc_setProperty.

2.2.1 set method

Using objc’s source code, we look for the objc_setProperty method.

void objc_setProperty(id self, SEL _cmd, ptrdiff_t offset, id newValue, BOOL atomic, signed char shouldCopy)
{
    bool copy = (shouldCopy && shouldCopy != MUTABLE_COPY);
    bool mutableCopy = (shouldCopy == MUTABLE_COPY);
    reallySetProperty(self, _cmd, newValue, offset, atomic, copy, mutableCopy);
}

Internally it determines whether the property is copy or mutableCopy, and then calls reallySetProperty.

static inline void reallySetProperty(id self, SEL _cmd, id newValue, ptrdiff_t offset, bool atomic, bool copy, bool mutableCopy)
{
    if (offset == 0) {
        object_setClass(self, newValue);
        return;
    }

    id oldValue;
    id *slot = (id*) ((char*)self + offset);

    if (copy) {
        newValue = [newValue copyWithZone:nil];
    } else if (mutableCopy) {
        newValue = [newValue mutableCopyWithZone:nil];
    } else {
        if (*slot == newValue) return;
        newValue = objc_retain(newValue);
    }

    if (!atomic) {
        oldValue = *slot;
        *slot = newValue;
    } else {
        spinlock_t& slotlock = PropertyLocks[slot];
        slotlock.lock();
        oldValue = *slot;
        *slot = newValue;
        slotlock.unlock();
    }

    objc_release(oldValue);
}

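The core of this function: the new value is copied or retained, stored into the instance variable slot (under a spinlock if the property is atomic), and then the old value is released.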
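The ordering in the non-atomic, non-copy path (retain the new value, swap it in, release the old one) can be sketched with a mock reference count. MockValue, MockOwner, mockRetain, mockRelease, and mockSetProperty are hypothetical; the sketch only counts references and never frees anything.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference-counted value; stands in for an Objective-C object.
struct MockValue {
    int refCount = 1;
};

// Mock retain/release: adjust the counter only, nothing is ever freed here.
inline MockValue *mockRetain(MockValue *v) { if (v) ++v->refCount; return v; }
inline void mockRelease(MockValue *v) { if (v) --v->refCount; }

struct MockOwner {
    void *isa = nullptr;
    MockValue *_value = nullptr;
};

// Same order as the non-atomic, non-copy path: retain new, swap, release old.
inline void mockSetProperty(MockOwner *self, std::ptrdiff_t offset, MockValue *newValue) {
    MockValue **slot =
        reinterpret_cast<MockValue **>(reinterpret_cast<char *>(self) + offset);
    if (*slot == newValue) return;      // same value: nothing to do
    newValue = mockRetain(newValue);    // take ownership of the new value first
    MockValue *oldValue = *slot;
    *slot = newValue;                   // store the new value in the slot
    mockRelease(oldValue);              // only now give up the old value
}
```

Retaining before releasing means the slot never points at a value that has already been given up, even for a moment.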

A common interview question asks what @property does behind the scenes:

  1. It automatically creates an instance variable prefixed with _.
  2. It automatically implements the set and get methods.

Apple’s approach to design is worth learning from. It provides an external interface for the upper layer to call, and it calls the underlying methods internally. In this way, no matter how the upper layer changes, it does not affect the underlying interface and implementation.

3. isa

Recall from the earlier look at alloc, init, and new that allocation goes through callAlloc, and that one of the steps along that path associates the new object with its class.

obj->initInstanceIsa(cls, hasCxxDtor);

inline void
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());

    initIsa(cls, true, hasCxxDtor);
}

Now let's see what happens inside initIsa. The code has been simplified; check the source if you need the full version.

inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());

    // Create an isa_t. What is isa_t? We look inside it below.
    isa_t newisa(0);

    // The rest is easier to follow after reading the isa_t section; come back to it then.
    // Assign the default value to bits.
    newisa.bits = ISA_MAGIC_VALUE;
    // isa.magic is part of ISA_MAGIC_VALUE
    // isa.nonpointer is part of ISA_MAGIC_VALUE
    newisa.has_cxx_dtor = hasCxxDtor;

    // This is where the object is associated with its class, the focus of this section.
    // We look at how setClass is implemented below.
    newisa.setClass(cls, this);
    newisa.extra_rc = 1;

    isa = newisa;
}

So let’s look at isa_t

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // cls is private: it is never assigned directly, but takes its value
    // from assignments to the other members (bits).
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};

The code above has been simplified to make it easier to read. isa_t is a union that contains a bit-field struct: all members share the same 8 bytes, and the struct gives names to individual bit ranges within them. The technique saves memory and is useful when space is tight. Here is an example:

Suppose we declare a car type with four flags: driving forward, back, left, and right. If each flag is an int, the type needs 4 * 4 = 16 bytes, i.e. 128 bits. With a bit field each flag takes a single bit, so only 4 bits are needed.

union car {
    struct {
        char forward : 1;   // 1 bit
        char back    : 1;   // 1 bit
        char left    : 1;   // 1 bit
        char right   : 1;   // 1 bit
    };
};

So in the four bits 0000, the first bit stands for forward, the second for back, and so on.
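The space saving can be checked at compile time. The following is a minimal sketch: CarInts, CarBits, and demoFlags are hypothetical names, the outer union is left out for brevity, and the sizes assume a 4-byte int and a compiler that packs the four 1-bit fields into one byte, as mainstream compilers do.

```cpp
#include <cassert>

// Four direction flags stored as full ints.
struct CarInts {
    int forward;
    int back;
    int left;
    int right;
};

// The same four flags as 1-bit fields sharing a single byte.
struct CarBits {
    unsigned char forward : 1;
    unsigned char back    : 1;
    unsigned char left    : 1;
    unsigned char right   : 1;
};

// Assumption: 4-byte int, and bit fields packed into one unsigned char.
static_assert(sizeof(CarInts) == 16, "four 4-byte ints take 16 bytes");
static_assert(sizeof(CarBits) == 1, "four 1-bit flags fit in one byte");

// Set two flags and report whether exactly those two are on.
inline bool demoFlags() {
    CarBits car = {};
    car.forward = 1;
    car.left = 1;
    return car.forward && !car.back && car.left && !car.right;
}
```

Each flag is still read and written by name; only the storage shrinks.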

Now that we know what a union with a bit field looks like, let's take a look at ISA_BITFIELD.

#     define ISA_MASK        0x0000000ffffffff8ULL
#     define ISA_MAGIC_MASK  0x000003f000000001ULL
#     define ISA_MAGIC_VALUE 0x000001a000000001ULL
#     define ISA_HAS_CXX_DTOR_BIT 1
#     define ISA_BITFIELD                                                      \
        uintptr_t nonpointer        : 1;                                       \
        uintptr_t has_assoc         : 1;                                       \
        uintptr_t has_cxx_dtor      : 1;                                       \
        uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
        uintptr_t magic             : 6;                                       \
        uintptr_t weakly_referenced : 1;                                       \
        uintptr_t unused            : 1;                                       \
        uintptr_t has_sidetable_rc  : 1;                                       \
        uintptr_t extra_rc          : 19
#     define RC_ONE   (1ULL<<45)
#     define RC_HALF  (1ULL<<18)

Note: this is the arm64 layout. A Mac without an M1 chip runs on x86_64, where the fields sit at different bit positions.

In particular, shiftcls is 33 bits wide on arm64 and 44 bits wide on x86_64, so magic starts at bit 36 and bit 47 respectively (counting from 0). This will come in handy later.

  • nonpointer: 0 means a pure isa pointer. 1 means isa holds not only the address of the class object but also class information and the object's reference count. On iOS, nonpointer is usually 1.
  • has_assoc: flag for associated objects. 0 means none exist, 1 means they exist.
  • has_cxx_dtor: whether the object has a C++ or Objective-C destructor. If it does, the destructor logic must run; if not, the object can be freed faster. This flag is checked in dealloc.
  • shiftcls: stores the value of the class pointer. With pointer optimization turned on, 33 bits are used to store the class pointer on arm64.
  • magic: used by the debugger to tell whether the current object is a real object or uninitialized space.
  • weakly_referenced: records whether the object is, or has been, pointed to by an ARC weak variable. Objects without weak references can be released faster.
  • unused: named deallocating in other versions of the source, where it indicates whether the object is in the middle of freeing its memory.
  • has_sidetable_rc: set when the reference count is too large to fit in extra_rc; the overflow is then stored in a side table.
  • extra_rc: holds the reference count of the object. In older versions of the runtime it held the count minus 1, so a reference count of 10 was stored as 9; the initIsa code above sets extra_rc = 1 for a new object, which suggests this version stores the count itself. When the count no longer fits, has_sidetable_rc above is used.
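The bit positions follow from adding up the field widths in ISA_BITFIELD. The sketch below assumes that bit fields are allocated from the lowest bit upward, as on arm64, and checks the offsets against the ISA_MASK, ISA_MAGIC_VALUE, and RC_ONE constants from the listing. The helper field and the k-prefixed names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Extract `width` bits starting at bit `offset` (bit 0 is the lowest bit).
constexpr std::uint64_t field(std::uint64_t bits, unsigned offset, unsigned width) {
    return (bits >> offset) & ((std::uint64_t{1} << width) - 1);
}

// arm64 offsets, derived by summing the widths in ISA_BITFIELD in order.
// Assumption: fields are allocated from the lowest bit upward.
constexpr unsigned kNonpointerOff = 0;                  // width 1
constexpr unsigned kShiftclsOff   = 3;                  // after 3 one-bit flags, width 33
constexpr unsigned kMagicOff      = kShiftclsOff + 33;  // 36, width 6
constexpr unsigned kExtraRcOff    = kMagicOff + 6 + 3;  // 45, after 3 more flags, width 19

// Constants copied from the arm64 listing.
constexpr std::uint64_t kIsaMask       = 0x0000000ffffffff8ULL;
constexpr std::uint64_t kIsaMagicValue = 0x000001a000000001ULL;
constexpr std::uint64_t kRcOne         = 1ULL << 45;

static_assert(kExtraRcOff + 19 == 64, "the fields fill one 64-bit word");
static_assert(kIsaMask == ((std::uint64_t{1} << 33) - 1) << kShiftclsOff,
              "ISA_MASK covers exactly the shiftcls bits");
static_assert(kRcOne == std::uint64_t{1} << kExtraRcOff,
              "RC_ONE is the lowest bit of extra_rc");
static_assert(field(kIsaMagicValue, kNonpointerOff, 1) == 1,
              "ISA_MAGIC_VALUE sets nonpointer");
static_assert(field(kIsaMagicValue, kMagicOff, 6) == 0x1a,
              "ISA_MAGIC_VALUE sets magic to 0x1a on arm64");
```

If the file compiles, the offsets agree with the constants, so the same field helper can decode any arm64 isa value.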

Now that we know what isa is, let's go back to how an object is associated with its class. With the code above in mind, let's see how setClass is implemented.

// Simplified code
inline void
isa_t::setClass(Class newCls, UNUSED_WITHOUT_PTRAUTH objc_object *obj)
{
    shiftcls = (uintptr_t)newCls >> 3;
}

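Surprisingly simple: newCls is just shifted right by 3 bits. Why 3 bits? shiftcls stores the value of the class pointer, and it starts at bit 3 of isa, after the three flag bits nonpointer, has_assoc, and has_cxx_dtor. The shift only drops the three lowest bits of the address, which loses nothing as long as class addresses are 8-byte aligned (the ISA_MASK check later in this article recovers the same address). A class name cannot be stored in these bits directly, so the numeric address is stored instead.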
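As a rough sketch of the same idea with explicit shifts and masks instead of a bit field: setClassBits and getClassBits below are hypothetical helpers, the mask is the arm64 ISA_MASK from the listing, and the addresses are arbitrary examples rather than real class addresses.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kIsaMask = 0x0000000ffffffff8ULL;  // arm64 ISA_MASK from the listing

// Store a class address into an isa word: shift right by 3 to get the
// shiftcls value, then place it at bit 3. All other fields are left untouched.
inline std::uint64_t setClassBits(std::uint64_t isaBits, std::uint64_t clsAddress) {
    std::uint64_t shiftcls = clsAddress >> 3;
    return (isaBits & ~kIsaMask) | ((shiftcls << 3) & kIsaMask);
}

// Recover the class address by masking off everything except shiftcls.
inline std::uint64_t getClassBits(std::uint64_t isaBits) {
    return isaBits & kIsaMask;
}

// True when the three lowest bits are zero, i.e. the address is 8-byte aligned.
inline bool isAligned8(std::uint64_t address) {
    return (address & 7) == 0;
}
```

For an 8-byte-aligned address the round trip is lossless. An unaligned address would lose its three lowest bits, which is why the scheme depends on alignment.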

3.1 Procedure for verifying isa pointer association

Run the objc source project with the following line and set a breakpoint in objc_object::initIsa:

Person *p = [[Person alloc] init];

inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());

    // ① Create newisa
    isa_t newisa(0);

    // ② Set the default value of bits
    newisa.bits = ISA_MAGIC_VALUE;
    // isa.magic is part of ISA_MAGIC_VALUE
    // isa.nonpointer is part of ISA_MAGIC_VALUE

    // ③
    newisa.has_cxx_dtor = hasCxxDtor;

    // ④ setClass
    newisa.setClass(cls, this);
    newisa.extra_rc = 1;

    isa = newisa;
}

When the breakpoint reaches ②, print the contents of newisa:

(lldb) p newisa
(isa_t) $1 = {
  bits = 0
  cls = nil
   = {
    nonpointer = 0
    has_assoc = 0
    has_cxx_dtor = 0
    shiftcls = 0
    magic = 0
    weakly_referenced = 0
    unused = 0
    has_sidetable_rc = 0
    extra_rc = 0
  }
}

Step over to the next line and print newisa again:

(lldb) p newisa
(isa_t) $5 = {
  bits = 8303511812964353
  cls = 0x001d800000000001
   = {
    nonpointer = 1
    has_assoc = 0
    has_cxx_dtor = 0
    shiftcls = 0
    magic = 59
    weakly_referenced = 0
    unused = 0
    has_sidetable_rc = 0
    extra_rc = 0
  }
}

bits now has an initial value, and with it cls and magic show values too. These are the defaults. As mentioned above, isa is 64 bits of data. Let's paste the value of cls into a programmer's calculator and look at its binary form. The lowest bit is 1, which corresponds to nonpointer = 1.

In the binary form, the six bits starting at bit 47 (counting from 0) are 111011 when written with the highest bit first. What is this binary number? Exactly 59.
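The same check can be done without a calculator. This sketch takes the value LLDB printed for cls (0x001d800000000001) and reads the six bits that start at bit 47, the x86_64 position of magic. bitString, fieldValue, and the constant names are hypothetical. Written with the most significant bit first the six bits are 111011; read from bit 47 upward they are 1, 1, 0, 1, 1, 1.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Value of newisa after the ISA_MAGIC_VALUE assignment, as printed by LLDB above.
constexpr std::uint64_t kBitsAfterMagic = 0x001d800000000001ULL;
constexpr unsigned kMagicOffsetX86 = 47;  // 3 flag bits + 44 bits of shiftcls
constexpr unsigned kMagicWidth = 6;

// Numeric value of the `width` bits starting at bit `offset`.
inline std::uint64_t fieldValue(std::uint64_t bits, unsigned offset, unsigned width) {
    return (bits >> offset) & ((std::uint64_t{1} << width) - 1);
}

// Render the same bits as text, most significant bit first.
inline std::string bitString(std::uint64_t bits, unsigned offset, unsigned width) {
    std::string out;
    for (unsigned i = width; i-- > 0;) {
        out += ((bits >> (offset + i)) & 1) ? '1' : '0';
    }
    return out;
}
```

Here fieldValue(kBitsAfterMagic, kMagicOffsetX86, kMagicWidth) evaluates to 59, the magic value shown in the LLDB output.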

Next, continue to the setClass call (step ④) and step into the setClass method. There we evaluate the shift ourselves to see the value that will be stored.

(lldb) po (uintptr_t)newCls
(uintptr_t) $15 = 4295000320
(lldb) po (uintptr_t)newCls >> 3
536875040

Then proceed to the next step and print newisa

(lldb) p newisa
(isa_t) $11 = {
  bits = 8303516107964673
  cls = Person
   = {
    nonpointer = 1
    has_assoc = 0
    has_cxx_dtor = 0
    shiftcls = 536875040
    magic = 59
    weakly_referenced = 0
    unused = 0
    has_sidetable_rc = 0
    extra_rc = 0
  }
}

As expected, shiftcls now holds the class pointer value shifted right by 3, and cls prints as Person. This confirms what we said above: the class pointer is stored starting at bit 3 of isa, and it is stored as a number because a class name cannot be stored there directly.

Return to the _class_createInstanceFromZone function and pause there. Next we verify the association in the other direction with object_getClass.

3.2 Reverse Verification of ISA_MASK

We verify where isa points in reverse, using object_getClass. The code below is simplified; see the source for the full version.

Class object_getClass(id obj)
{
    if (obj) return obj->getIsa();
    else return Nil;
}

inline Class
objc_object::getIsa()
{
    if (fastpath(!isTaggedPointer())) return ISA();
}

inline Class
objc_object::ISA(bool authenticated)
{
    ASSERT(!isTaggedPointer());
    return isa.getDecodedClass(authenticated);
}

inline Class
isa_t::getClass(MAYBE_UNUSED_AUTHENTICATED_PARAM bool authenticated)
{
    uintptr_t clsbits = bits;
    clsbits &= ISA_MASK;
    return (Class)clsbits;
}

In the end the class is obtained as bits & ISA_MASK. Remember what bits is? Looking back, bits is the first member of isa_t, the raw 64-bit value of isa. Let's apply this & ourselves and check that the result is Person:

(lldb) x/4gx obj
0x10060d9b0: 0x011d800100008101 0x0000000000000000
0x10060d9c0: 0x0000000000000000 0x86c8f7c495bce30f
// We are on x86_64, so use the x86_64 value of ISA_MASK
(lldb) po 0x011d800100008101 & 0x00007ffffffffff8ULL
Person
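The arithmetic behind this LLDB session can be replayed in a few lines. The constants below are copied from the outputs in this article, and classAddress is a hypothetical helper. The comparison with the earlier bits value assumes that the only change after setClass was newisa.extra_rc = 1 landing in the top byte on x86_64.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kIsaMaskX86 = 0x00007ffffffffff8ULL;  // mask used in the LLDB session
constexpr std::uint64_t kIsaWord    = 0x011d800100008101ULL;  // first word of obj from x/4gx
constexpr std::uint64_t kNewCls     = 4295000320ULL;          // (uintptr_t)newCls printed earlier
constexpr std::uint64_t kBitsAfterSetClass = 8303516107964673ULL;  // bits printed after setClass

// Masking the isa word leaves only the class address.
constexpr std::uint64_t classAddress(std::uint64_t isaWord) {
    return isaWord & kIsaMaskX86;
}

static_assert(classAddress(kIsaWord) == kNewCls,
              "masking recovers the address that setClass stored");
static_assert(kBitsAfterSetClass == 0x001d800100008101ULL,
              "decimal and hex forms of the earlier output agree");
// Assumption: the remaining difference is extra_rc = 1 in the top byte (x86_64).
static_assert(kIsaWord - kBitsAfterSetClass == (std::uint64_t{1} << 56),
              "the final isa differs from the earlier bits by one unit at bit 56");
```

The masked value equals the newCls address that was shifted in earlier, so the forward and reverse directions agree.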

That's all for isa. But are the fields in isa really used? They are, and the implementation of dealloc gives us a clue.

4. Supplement: dealloc

Find dealloc in objc source code.

- (void)dealloc {
    _objc_rootDealloc(self);
}

void
_objc_rootDealloc(id obj)
{
    ASSERT(obj);

    obj->rootDealloc();
}

All right, it’s time for the miracle.

inline void
objc_object::rootDealloc()
{
    if (isTaggedPointer()) return;  // fixme necessary?

    if (fastpath(isa.nonpointer                     &&
                 !isa.weakly_referenced             &&
                 !isa.has_assoc                     &&
#if ISA_HAS_CXX_DTOR_BIT
                 !isa.has_cxx_dtor                  &&
#else
                 !isa.getClass(false)->hasCxxDtor() &&
#endif
                 !isa.has_sidetable_rc))
    {
        assert(!sidetable_present());
        free(this);
    }
    else {
        object_dispose((id)this);
    }
}

In objc_object::rootDealloc, the values of the isa fields decide whether the object is simply released with free or goes through object_dispose. The free function needs no further explanation, so let's look at object_dispose.

id
object_dispose(id obj)
{
    if (!obj) return nil;

    objc_destructInstance(obj);
    free(obj);

    return nil;
}

void *objc_destructInstance(id obj)
{
    if (obj) {
        // Read all of the flags at once for performance.
        bool cxx = obj->hasCxxDtor();
        bool assoc = obj->hasAssociatedObjects();

        // This order is important.
        if (cxx) object_cxxDestruct(obj);
        if (assoc) _object_remove_assocations(obj, /*deallocating*/true);
        obj->clearDeallocating();
    }

    return obj;
}

That is the whole dealloc process. Reading the source is a good way to reinforce our understanding of it.
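The decision that rootDealloc makes from the isa flags can be summarized as a small predicate. MockIsaFlags, DeallocPath, and choosePath are hypothetical names; the sketch copies only the boolean condition (in its ISA_HAS_CXX_DTOR_BIT form) and does not free anything.

```cpp
#include <cassert>

// Hypothetical plain-struct copy of the flags that rootDealloc reads from isa.
struct MockIsaFlags {
    bool nonpointer;
    bool weakly_referenced;
    bool has_assoc;
    bool has_cxx_dtor;
    bool has_sidetable_rc;
};

enum class DeallocPath { FreeDirectly, ObjectDispose };

// Mirrors the condition in rootDealloc: free directly only when there is
// nothing to clean up, otherwise go through object_dispose.
inline DeallocPath choosePath(const MockIsaFlags &f) {
    const bool fast = f.nonpointer && !f.weakly_referenced && !f.has_assoc &&
                      !f.has_cxx_dtor && !f.has_sidetable_rc;
    return fast ? DeallocPath::FreeDirectly : DeallocPath::ObjectDispose;
}
```

Any single set flag (a weak reference, an associated object, a C++ destructor, or a side-table count) is enough to send the object down the slower object_dispose path, which is why the field descriptions say that objects without them are freed faster.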