Because Objective-C is a superset of C and its underlying runtime is written in C and C++, understanding what an Objective-C object really is requires reading that lower-level code. So we first introduce a tool, Clang. Using Clang to rewrite .m files into .cpp files lets us learn more about the underlying implementation.

The compiler Clang

1. What is Clang

Clang is a lightweight compiler for C, C++, Objective-C, and Objective-C++. It is written in C++, built on LLVM, and its source code is published under the LLVM BSD-style license. In April 2013, Clang fully supported the C++11 standard and began implementing C++1y features (that is, C++14, the next minor update to C++), including generic lambda expressions, simplified handling of return types, and better handling of the constexpr keyword. Clang is almost completely compatible with the GNU C language specification (with some incompatibilities, including differences in compiler command options) and adds syntactic features of its own, such as C function overloading, enabled by marking functions with __attribute__((overloadable)). One of its goals is to surpass GCC.

2. Use Clang

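First, the command

clang -rewrite-objc main.m -o main.cpp

rewrites the Objective-C source file into a C++ file. If the file uses UIKit, this reports a UIKit error, which can be fixed either by passing the SDK options explicitly or by using xcrun.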

clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk main.m

Xcode ships with the xcrun command, a thin wrapper around clang that makes it easier to use.

Simulator:

xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

Device:

xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

The nature of the object

Run the clang command on main.m to obtain the rewritten C++ file. The content of main.m is as follows:

#import <Foundation/Foundation.h>  // NSObject, NSString, NSLog
#import <objc/runtime.h>

@interface XJPerson : NSObject
@property (nonatomic, strong) NSString *name;
@end

@implementation XJPerson
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}

The generated main.cpp file is very long, so we only look at the parts we need.

1. XJPerson object

A global search of XJPerson yields the following core code:

typedef struct objc_object XJPerson;
typedef struct {} _objc_exc_XJPerson;
#endif

// XJPerson_IMPL
struct XJPerson_IMPL {
    struct NSObject_IMPL NSObject_IVARS;
    NSString *_name;
};

/* @end */

// @implementation XJPerson
// @end

Code interpretation:

1. XJPerson is defined as an alias for the struct objc_object type.
2. The implementation struct XJPerson_IMPL has a member NSObject_IVARS, which comes from the inherited struct and holds isa. The other member, _name, is the XJPerson property defined at the Objective-C level.

This shows that, underneath, an object is a struct.
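To make the layout concrete, here is a minimal plain-C sketch of the two structs. The names NSObject_IMPL_sketch and XJPerson_IMPL_sketch and both helper functions are hypothetical stand-ins, not generated code; void * replaces Class and const char * replaces NSString * so that the example compiles without the Objective-C runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for NSObject_IMPL: void * replaces Class. */
struct NSObject_IMPL_sketch {
    void *isa;
};

/* Hypothetical stand-in for XJPerson_IMPL: const char * replaces NSString *. */
struct XJPerson_IMPL_sketch {
    struct NSObject_IMPL_sketch NSObject_IVARS; /* inherited part comes first */
    const char *name;                           /* plays the role of _name */
};

/* The inherited members sit at the very start of the subclass struct. */
static_assert(offsetof(struct XJPerson_IMPL_sketch, NSObject_IVARS) == 0,
              "inherited ivars must be the first member");

/* Byte offset of the inherited isa inside the subclass struct. */
size_t xjperson_isa_offset(void) {
    return offsetof(struct XJPerson_IMPL_sketch, NSObject_IVARS.isa);
}

/* Byte offset of the subclass's own member, right after the inherited part. */
size_t xjperson_name_offset(void) {
    return offsetof(struct XJPerson_IMPL_sketch, name);
}
```

Because the inherited struct is embedded as the first member, a pointer to the subclass struct is also a valid pointer to its isa, which is why every object can be treated as starting with isa.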

2. NSObject object

A global search for NSObject_IMPL yields the declaration and implementation code of the NSObject class.

typedef struct objc_object NSObject;
typedef struct {} _objc_exc_NSObject;

struct NSObject_IMPL {
    Class isa;
};

Code interpretation:

1. NSObject is defined as an alias that also points to the struct objc_object type.
2. The NSObject implementation struct has a single member variable, isa, of type Class.

3. Low-level analysis of object attributes

static NSString * _I_XJPerson_name(XJPerson * self, SEL _cmd) {
    return (*(NSString **)((char *)self + OBJC_IVAR_$_XJPerson$_name));
}

static void _I_XJPerson_setName_(XJPerson * self, SEL _cmd, NSString *name) {
    (*(NSString **)((char *)self + OBJC_IVAR_$_XJPerson$_name)) = name;
}

Member variable access is essentially pointer arithmetic: take the object's base address, add the member variable's offset, and read or write the memory at the resulting address. Conceptually, a strong setter also retains the new value and releases the old one, although the rewritten setter shown here is a plain assignment.

(char *)self: the base address of the object

OBJC_IVAR_$_XJPerson$_name: the offset of the member variable name

(*(NSString **)((char *)self + OBJC_IVAR_$_XJPerson$_name)): the memory of the member variable name
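The same base-plus-offset access can be reproduced in plain C. This is only a sketch: struct PersonSketch, kNameOffset, person_get_name, and person_set_name are hypothetical names, and kNameOffset merely plays the role that OBJC_IVAR_$_XJPerson$_name plays in the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout: isa first, then one member variable. */
struct PersonSketch {
    void *isa;
    const char *name;
};

/* Plays the role of OBJC_IVAR_$_XJPerson$_name: the member's byte offset. */
static const size_t kNameOffset = offsetof(struct PersonSketch, name);

/* Getter: base address + offset, then read through the resulting pointer. */
const char *person_get_name(void *self) {
    assert(self != NULL);
    return *(const char **)((char *)self + kNameOffset);
}

/* Setter: same address computation, then write through the pointer. */
void person_set_name(void *self, const char *name) {
    assert(self != NULL);
    *(const char **)((char *)self + kNameOffset) = name;
}
```

The cast to char * matters: pointer arithmetic on char * moves in single bytes, so adding the offset lands exactly on the member.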

Bit fields and unions

Before analyzing isa, you need to understand bit fields and unions.

1. Bit fields

The XJCar1 struct has four BOOL members, each of which takes one byte, so XJCar1 occupies 4 bytes. XJCar2 specifies the number of bits for each member; four bits fit in a single byte. XJCar1 therefore uses four times the space of XJCar2, wasting 3 bytes.
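A minimal C sketch of the two structs follows. The member names front, back, left, and right are assumptions for illustration, and bool stands in for BOOL. On Clang and GCC the sizes typically come out as 4 bytes and 1 byte, but bit-field packing is implementation-defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One full byte per flag. */
struct XJCar1 {
    bool front;
    bool back;
    bool left;
    bool right;
};

/* One bit per flag; all four flags share a single byte. */
struct XJCar2 {
    bool front : 1;
    bool back  : 1;
    bool left  : 1;
    bool right : 1;
};

/* The bit-field version is strictly smaller. */
static_assert(sizeof(struct XJCar2) < sizeof(struct XJCar1),
              "bit fields should save space");

/* Sizes as seen by the compiler. */
size_t xjcar1_size(void) { return sizeof(struct XJCar1); }
size_t xjcar2_size(void) { return sizeof(struct XJCar2); }
```

Members of XJCar2 are read and written like ordinary members (car.left = true); the compiler generates the masking and shifting.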

2. Unions

Let’s start with a code example to see the difference between a structure and a union

The output shows the difference. The three members of the struct coexist, and all of them print their values. In the union, only age prints correctly, and the previously assigned name is empty. This is the mutual-exclusion property of a union: assigning one member overwrites whatever was stored before, so only the most recently assigned member holds a valid value. The height of XJPlay2 was never assigned, yet it prints a strange number, because it is reading memory that was never written as a height.

All members of a struct coexist. The advantage is that a struct is accommodating and complete; the disadvantage is that its memory allocation is coarse: every member gets its own space whether it is used or not. The members of a union are mutually exclusive. The disadvantage is that a union is less accommodating; the advantage is that memory use is finer-grained and more flexible, which saves space.
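The contrast can be sketched in plain C as follows. The member types (const char *, int, double) and the helper functions are assumptions for illustration; only the type names XJPlay1 and XJPlay2 and the member names come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Struct: every member has its own storage. */
struct XJPlay1 {
    const char *name;
    int age;
    double height;
};

/* Union: all members share the same storage. */
union XJPlay2 {
    const char *name;
    int age;
    double height;
};

/* The union only needs room for its largest member. */
static_assert(sizeof(union XJPlay2) < sizeof(struct XJPlay1),
              "a union is smaller than the equivalent struct");

/* Assign name, then age; only the last assignment is still valid. */
int xjplay2_age_after_two_writes(void) {
    union XJPlay2 p;
    p.name = "XJ";
    p.age = 18;      /* overwrites bytes that previously held name */
    return p.age;
}

/* All union members start at the same address. */
int xjplay2_members_share_address(void) {
    union XJPlay2 p;
    return (void *)&p.name == (void *)&p.age &&
           (void *)&p.age == (void *)&p.height;
}

size_t xjplay1_size(void) { return sizeof(struct XJPlay1); }
size_t xjplay2_size(void) { return sizeof(union XJPlay2); }
```

After the second assignment, reading name or height would reinterpret the bytes of age, which is why such reads produce meaningless values.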

Exploring isa

1. The isa_t union

The examples above showed the difference between a union and a struct, and how bit fields save memory. isa encapsulates its data by combining the two: a union that contains bit fields. The source code is as follows:

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };
#endif
};

isa_t is a union with two members, Class cls and uintptr_t bits. The two members are mutually exclusive, and the union occupies 8 bytes of memory.

  • Class cls: a class pointer; Class is declared as typedef struct objc_class *Class;
  • uintptr_t bits: used for a nonpointer isa. It is interpreted through the struct's bit fields, whose layout rules differ between the arm64 and x86_64 architectures.
#if SUPPORT_PACKED_ISA

// arm64: real iOS devices
# if __arm64__
#   define ISA_MASK        0x0000000ffffffff8ULL
#   define ISA_MAGIC_MASK  0x000003f000000001ULL
#   define ISA_MAGIC_VALUE 0x000001a000000001ULL
#   define ISA_BITFIELD                                                 \
      uintptr_t nonpointer        : 1;                                  \
      uintptr_t has_assoc         : 1;                                  \
      uintptr_t has_cxx_dtor      : 1;                                  \
      uintptr_t shiftcls          : 33; /*MACH_VM_MAX_ADDRESS 0x1000000000*/ \
      uintptr_t magic             : 6;                                  \
      uintptr_t weakly_referenced : 1;                                  \
      uintptr_t deallocating      : 1;                                  \
      uintptr_t has_sidetable_rc  : 1;                                  \
      uintptr_t extra_rc          : 19
#   define RC_ONE   (1ULL<<45)
#   define RC_HALF  (1ULL<<18)

// x86_64: macOS
# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL
#   define ISA_MAGIC_MASK  0x001f800000000001ULL
#   define ISA_MAGIC_VALUE 0x001d800000000001ULL
#   define ISA_BITFIELD                                                 \
      uintptr_t nonpointer        : 1;                                  \
      uintptr_t has_assoc         : 1;                                  \
      uintptr_t has_cxx_dtor      : 1;                                  \
      uintptr_t shiftcls          : 44; /*MACH_VM_MAX_ADDRESS 0x7fffffe00000*/ \
      uintptr_t magic             : 6;                                  \
      uintptr_t weakly_referenced : 1;                                  \
      uintptr_t deallocating      : 1;                                  \
      uintptr_t has_sidetable_rc  : 1;                                  \
      uintptr_t extra_rc          : 8
#   define RC_ONE   (1ULL<<56)
#   define RC_HALF  (1ULL<<7)

# else
#   error unknown architecture for packed isa
# endif

// SUPPORT_PACKED_ISA
#endif
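To see how the union and the bit fields fit together, here is a plain-C sketch that uses the x86_64 field widths listed above. The names isa_sketch, fields, ISA_MAGIC_VALUE_X86_64, and isa_sketch_make_nonpointer are hypothetical, and void * stands in for Class. Bit-field order is implementation-defined; the sketch assumes Clang or GCC on a little-endian 64-bit target, where the first field occupies the least significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of isa_t: three views of the same 8 bytes. */
union isa_sketch {
    void *cls;        /* pure isa: just a class pointer */
    uint64_t bits;    /* the raw 64-bit value */
    struct {          /* nonpointer isa: x86_64 field widths */
        uint64_t nonpointer        : 1;
        uint64_t has_assoc         : 1;
        uint64_t has_cxx_dtor      : 1;
        uint64_t shiftcls          : 44;
        uint64_t magic             : 6;
        uint64_t weakly_referenced : 1;
        uint64_t deallocating      : 1;
        uint64_t has_sidetable_rc  : 1;
        uint64_t extra_rc          : 8;
    } fields;
};

/* All views overlap, so the union is no bigger than one pointer. */
static_assert(sizeof(union isa_sketch) == 8, "isa fits in 8 bytes");

/* Same value as ISA_MAGIC_VALUE in the x86_64 branch above. */
#define ISA_MAGIC_VALUE_X86_64 0x001d800000000001ULL

/* Write through bits, then read the result through the bit fields. */
union isa_sketch isa_sketch_make_nonpointer(void) {
    union isa_sketch isa;
    isa.bits = ISA_MAGIC_VALUE_X86_64;
    return isa;
}
```

Writing the magic value through bits sets nonpointer to 1 and magic to 0x3b in a single store, which illustrates why the runtime can initialize all the fields at once.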

2. Nonpointer isa

  • nonpointer: whether pointer optimization is enabled for the isa pointer. 0: a pure isa pointer. 1: isa holds not only the address of the class object but also class information and the object's reference count
  • has_assoc: associated-object flag. 0: the object has no associated objects; 1: it has
  • has_cxx_dtor: whether the object has a C++ or Objective-C destructor. If it does, the destructor logic must run; if not, the object can be freed faster
  • shiftcls: stores the value of the class pointer. With pointer optimization enabled, 33 bits are used to store the class pointer on the arm64 architecture (44 bits on x86_64)
  • magic: used by the debugger to determine whether the current object is a real object or uninitialized memory
  • weakly_referenced: whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be freed faster
  • unused: whether the bit is unused (the macro listing above names this bit deallocating, which marks an object that is being deallocated)
  • has_sidetable_rc: set when the object's reference count is too large to fit in extra_rc, so the overflow has to be stored in a side table (a hash table)
  • extra_rc: the object's reference count minus 1. For example, if the object's reference count is 10, extra_rc is 9. If the reference count is too large for extra_rc, has_sidetable_rc above is used
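The flag bits and the reference count can also be read with plain shifts and masks. The sketch below uses the x86_64 positions from the macro listing (has_assoc at bit 1, weakly_referenced at bit 53, extra_rc in the top 8 bits); all function and macro names are hypothetical. The "extra_rc + 1" rule follows the description in the list above and may differ between runtime versions.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as RC_ONE in the x86_64 branch: the lowest bit of extra_rc. */
#define RC_ONE_X86_64 (1ULL << 56)

static_assert((RC_ONE_X86_64 >> 56) == 1, "RC_ONE is the lowest bit of extra_rc");

/* has_assoc is the second-lowest bit. */
int isa_has_assoc(uint64_t bits) {
    return (int)((bits >> 1) & 1);
}

/* weakly_referenced follows 1 + 1 + 1 + 44 + 6 = 53 lower bits. */
int isa_weakly_referenced(uint64_t bits) {
    return (int)((bits >> 53) & 1);
}

/* extra_rc occupies the top 8 bits. */
uint64_t isa_extra_rc(uint64_t bits) {
    return bits >> 56;
}

/* Reference count as described above: extra_rc stores the count minus 1. */
uint64_t isa_retain_count_sketch(uint64_t bits) {
    return isa_extra_rc(bits) + 1;
}

/* Adding RC_ONE to the raw bits increments extra_rc by one. */
uint64_t isa_retain_sketch(uint64_t bits) {
    return bits + RC_ONE_X86_64;
}
```

Keeping extra_rc in the most significant bits means an increment is a single addition on the whole word, and an overflow of the field shows up as a carry out of the word.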

3. Bit operations on isa

shiftcls can be extracted from isa with bit operations. On macOS (x86_64), shiftcls sits inside the 64-bit value with 3 bits to its right and 17 bits to its left, and is itself 44 bits wide. Code validation:

The result of the bit operations equals the address of the class object, which shows that the bit operations successfully extracted shiftcls.
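The shift sequence can be sketched in C and compared against masking with the x86_64 ISA_MASK value. The function names are hypothetical, and any isa value passed in is made up for illustration rather than taken from a real object.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as ISA_MASK in the x86_64 branch: selects bits 3..46. */
#define ISA_MASK_X86_64 0x00007ffffffffff8ULL

static_assert(3 + 44 + 17 == 64, "flag bits + shiftcls + high bits fill the word");

/* Isolate shiftcls with shifts: drop the low 3 bits, drop the high 17 bits,
   then move the class bits back to their original position. */
uint64_t isa_class_by_shifts(uint64_t isa_bits) {
    uint64_t v = isa_bits >> 3;  /* discard nonpointer, has_assoc, has_cxx_dtor */
    v <<= 20;                    /* 3 + 17: push the 17 high bits off the top */
    v >>= 17;                    /* class bits return to bits 3..46 */
    return v;
}

/* The same result with a single mask. */
uint64_t isa_class_by_mask(uint64_t isa_bits) {
    return isa_bits & ISA_MASK_X86_64;
}
```

Both functions return the class address with the surrounding flag and reference-count bits cleared, so the mask is simply the one-step form of the three shifts.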

Conclusion

Underneath, an object is a struct. Its isa uses the mutual exclusion of a union to be either a pure isa pointer or a nonpointer isa, and uses bit fields to save space.