Preface

"Object" is a word we say almost every day, in life and at work. If you don't have a partner in life, cheer up, brother: if all else fails, you can always new one. With that joke out of the way, let's turn to the objects we meet at work. You can't talk about objects without talking about isa, because isa tells us which class an object belongs to. Let's explore the nature of objects and isa.

Preparation

Union

Like a struct, a union is composed of members of different types. Let's explore unions in code.

union LWPerson {
    int   a; // 4 bytes
    short b; // 2 bytes
    char  c; // 1 byte
};

int main(int argc, char * argv[]) {
    @autoreleasepool {
        union LWPerson person;

        person.a = 8;
        NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);

        person.b = 2;
        NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);

        person.c = 'd';
        NSLog(@"a=%d---b=%d---c=%c",person.a,person.b,person.c);

        NSLog(@"%lu---%lu",sizeof(person),sizeof(union LWPerson));
    }
    return 0;
}

2021-06-10 15:50:45.789591+0800 ISA inquiry[78582:10551210] a=8---b=8---c=
2021-06-10 15:50:45.790087+0800 ISA inquiry[78582:10551210] a=2---b=2---c=
2021-06-10 15:50:45.790107+0800 ISA inquiry[78582:10551210] a=100---b=100---c=d
2021-06-10 15:50:45.790125+0800 ISA inquiry[78582:10551210] 4---4

Conclusion:

  • A union can define multiple members of different types; the memory size of the union is determined by the size of its largest member.
  • Modifying one member of a union overwrites the values of the other members.
  • All members of a union share the same memory, so the members are mutually exclusive.

Advantages and disadvantages of unions

  • Advantages: more flexible memory usage; saves memory.
  • Disadvantage: weak inclusiveness, since only one member can hold a valid value at a time.

Bit field

Some information does not need a whole byte; one or a few binary bits are enough. For example, a member that has only two states, 0 and 1, can be stored in a single bit. The goal is to save storage space and make handling easier. Let's explore bit fields in code.

struct OldCar {
    BOOL front;
    BOOL back;
    BOOL left;
    BOOL right;
};

struct NewCar {
    BOOL front: 1;
    BOOL back : 1;
    BOOL left : 1;
    BOOL right: 1;
};

int main(int argc, char * argv[]) {
  
    @autoreleasepool {
        
    struct OldCar oldCar;
    struct NewCar newCar;
    NSLog(@"----%lu----%lu",sizeof(oldCar),sizeof(newCar));
        
    }
    return 0;
}

2021-06-10 16:25:04.266985+0800 ISA inquiry[78608:10560097] ----4----1

Summary: oldCar occupies 4 bytes and newCar occupies 1 byte. One byte has 8 bits, and all four members of newCar are stored bit by bit in that single byte. They are laid out from right to left in the low four bits of 0000 1111: front, back, left and right.
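To make the "one bit per flag" idea concrete, here is a minimal C sketch. The names OldCarSketch, NewCarSketch, pack_car and car_has are hypothetical, and unsigned char stands in for BOOL so the code builds without Foundation. The exact placement of bit fields inside a byte is up to the compiler, so the sketch also packs the four flags by hand, which is roughly what the compiler does for you.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for OldCar / NewCar, using unsigned char instead of
   BOOL so the sketch builds without Foundation. */
struct OldCarSketch {
    unsigned char front, back, left, right;   /* one whole byte per flag */
};

struct NewCarSketch {
    unsigned char front : 1;                  /* one bit per flag */
    unsigned char back  : 1;
    unsigned char left  : 1;
    unsigned char right : 1;
};

/* Bit positions as described above: front is bit 0, right is bit 3. */
enum {
    CAR_FRONT = 1 << 0,
    CAR_BACK  = 1 << 1,
    CAR_LEFT  = 1 << 2,
    CAR_RIGHT = 1 << 3
};

/* Pack four 0/1 flags into the low four bits of a single byte by hand. */
uint8_t pack_car(int front, int back, int left, int right) {
    uint8_t byte = 0;
    if (front) byte |= CAR_FRONT;
    if (back)  byte |= CAR_BACK;
    if (left)  byte |= CAR_LEFT;
    if (right) byte |= CAR_RIGHT;
    return byte;
}

/* Test one flag without disturbing the others. */
int car_has(uint8_t byte, uint8_t flag) {
    return (byte & flag) != 0;
}

/* Small self-check that can be called from main(). */
void car_bits_demo(void) {
    assert(sizeof(struct NewCarSketch) < sizeof(struct OldCarSketch));
    assert(pack_car(1, 1, 1, 1) == 0x0F);     /* 0000 1111 */
    assert(car_has(pack_car(0, 0, 1, 0), CAR_LEFT));
}
```

Calling car_bits_demo() from a main function runs the checks. The hand-written masks are the portable way to get a fixed layout; the bit-field struct is shorter to write but leaves the layout to the compiler.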

Nature of objects

Before diving into the nature of objects, let's get to know the compiler Clang.

Clang

  • Clang is a lightweight compiler for C, C++ and Objective-C, developed under Apple's lead.
  • Clang is mainly used here to compile source files into lower-level files, for example main.m into main.cpp, main.o or an executable. This makes the underlying logical structure easy to observe and explore.

Clang terminal commands

// 1. Rewrite main.m as main.cpp
clang -rewrite-objc main.m -o main.cpp

// 2. If UIKit causes errors, pass the SDK path
clang -x objective-c -rewrite-objc -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk main.m

// 3. Compile with xcrun for the simulator
xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o <output file>

// 4. Compile with xcrun for a real device
xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp

Nature of objects

To explore the nature of objects, we use the following example code:

@interface LWPerson : NSObject
{
    NSString * height;
}
@property(nonatomic,  copy)NSString   *LWname;
@property(nonatomic,assign)NSInteger  age;
@end

@implementation LWPerson

@end

int main(int argc, char * argv[]) {
  
    @autoreleasepool {
    }
    return 0;
}

Use the clang command to compile main.m into main.cpp:

#ifndef _REWRITER_typedef_NSObject
#define _REWRITER_typedef_NSObject
typedef struct objc_object NSObject;
typedef struct {} _objc_exc_NSObject;
#endif

struct NSObject_IMPL {
	Class isa;
};

extern "C" unsigned long OBJC_IVAR_$_LWPerson$_LWname;
extern "C" unsigned long OBJC_IVAR_$_LWPerson$_age;
struct LWPerson_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
	NSString *height;
	NSString *_LWname;
	NSInteger _age;
};

// @property(nonatomic, copy)NSString *LWname;
// @property(nonatomic,assign)NSInteger age;
/* @end */

// @implementation LWPerson

The LWPerson_IMPL struct has four members. height, _LWname and _age come from the custom ivar and properties. NSObject_IVARS is the isa from NSObject: LWPerson inherits from NSObject, so it also carries all of NSObject's member variables.

Underneath, NSObject is NSObject_IMPL, which has a single member, Class isa. isa is usually called the isa pointer, so Class should be a pointer type. A global search of main.cpp for *Class finds the following code:

typedef struct objc_class *Class;

struct objc_object {
    Class _Nonnull isa __attribute__((deprecated));
};

struct NSObject_IMPL {
    Class isa;
};

Source code analysis:

  • Class is actually a pointer to an objc_class struct, and objc_class is the underlying implementation of all classes. So we can guess that isa is closely tied to class information, which is explored below.
  • What is the difference between the underlying implementation of NSObject and objc_object? Both structs have the same single member, Class isa, so the two must be related by inheritance. At the bottom layer every object derives from objc_object. In OC almost every object inherits from NSObject, but the real underlying implementation is the objc_object struct.
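The "inheritance" seen in main.cpp is plain struct embedding: the parent struct is the first member of the child struct, so a pointer to the child can be read as a pointer to the parent. A minimal C sketch of that idea follows. All names (class_sketch, object_sketch, person_sketch, class_name_of) are hypothetical stand-ins, not the runtime's real types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in the Objective-C runtime. */
struct class_sketch { const char *name; };
typedef struct class_sketch *ClassSketch;

struct object_sketch {            /* plays the role of NSObject_IMPL */
    ClassSketch isa;
};

struct person_sketch {            /* plays the role of LWPerson_IMPL */
    struct object_sketch base;    /* like NSObject_IVARS: must come first */
    long age;
};

/* Because the embedded parent sits at offset 0, any "object" pointer can be
   treated as a pointer to object_sketch to reach isa. */
const char *class_name_of(void *obj) {
    struct object_sketch *o = (struct object_sketch *)obj;
    return o->isa->name;
}

/* Small self-check that can be called from main(). */
void embedding_demo(void) {
    static struct class_sketch person_class = { "LWPerson" };
    struct person_sketch p = { { &person_class }, 18 };

    assert(offsetof(struct person_sketch, base) == 0);
    assert(class_name_of(&p) == person_class.name);
}
```

This is why isa is always the first member: code that knows nothing about LWPerson can still find the class of any object by reading the first pointer-sized slot.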

More on the nature of objects

The global search for *Class in main.cpp also found the following lines of code:

typedef struct objc_object *id;
typedef struct objc_selector *SEL;

Source code analysis: here are the familiar id and SEL. id is a pointer to an objc_object struct, which explains why no * is needed when id is used to declare variables and return values. Surprisingly, SEL is also a pointer to a struct.

In main.cpp, the underlying implementation of the property methods was found as follows.

As shown in the figure:

  • How do a property's getter and setter reach the ivar? The underlying implementation is the start address of the current object plus the ivar's offset.
  • The code declares one custom ivar, height, and the underlying struct simply gains one member for it. For each declared property, the compiler automatically adds an underscore-prefixed ivar as well as the getter and setter implementations. This is a common interview question, and the .cpp file answers it nicely.
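The "start address + offset" access can be sketched in plain C. This is only an illustration of the addressing scheme, not the code Clang emits: person_ivars_sketch, AGE_OFFSET_SKETCH, get_age_sketch and set_age_sketch are hypothetical names, and the offset constant plays the role of the OBJC_IVAR_$_LWPerson$_age variable seen in main.cpp.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout mirroring LWPerson_IMPL: isa first, then the ivars. */
struct person_ivars_sketch {
    void *isa;
    void *height;
    void *name;
    long  age;
};

/* Plays the role of OBJC_IVAR_$_LWPerson$_age: the byte offset of the ivar. */
const size_t AGE_OFFSET_SKETCH = offsetof(struct person_ivars_sketch, age);

/* Getter: start address of the object + offset of the ivar. */
long get_age_sketch(void *self) {
    return *(long *)((char *)self + AGE_OFFSET_SKETCH);
}

/* Setter: same address computation, then a store. */
void set_age_sketch(void *self, long age) {
    *(long *)((char *)self + AGE_OFFSET_SKETCH) = age;
}

/* Small self-check that can be called from main(). */
void ivar_offset_demo(void) {
    struct person_ivars_sketch p = { 0, 0, 0, 0 };
    set_age_sketch(&p, 18);
    assert(p.age == 18);              /* the offset write hit the age field */
    assert(get_age_sketch(&p) == 18);
}
```

The accessors never name the field directly; they only need the object's address and a number, which is why the offsets can be stored as separate variables.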

Summary of the nature of objects

  • An object is, in essence, a struct
  • LWPerson's isa is inherited from NSObject
  • NSObject has only one member variable, isa

How isa is associated with a class

The first member of every object is isa. In the earlier post on alloc in the iOS Underlying Principles series, we saw that alloc creates an object in three core steps: cls->instanceSize computes the memory size, (id)calloc(1, size) allocates the memory and returns its address, and obj->initInstanceIsa initializes isa and associates it with the class. Since so much points to isa, let's explore the structure of isa and how it is associated with a class.

Structure of isa

Follow the call chain alloc --> _objc_rootAlloc --> callAlloc --> _objc_rootAllocWithZone --> _class_createInstanceFromZone, set a breakpoint at obj->initInstanceIsa, and step into it:

inline void
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());

    initIsa(cls, true, hasCxxDtor);
}

Set a breakpoint and step into initIsa:

inline void
objc_object::initIsa(Class cls, bool nonpointer, UNUSED_WITHOUT_INDEXED_ISA_AND_DTOR_BIT bool hasCxxDtor)
{
    ASSERT(!isTaggedPointer());

    isa_t newisa(0);  // isa initialization

    if (!nonpointer) {
        newisa.setClass(cls, this);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());

#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
#   if ISA_HAS_CXX_DTOR_BIT
        newisa.has_cxx_dtor = hasCxxDtor;
#   endif
        newisa.setClass(cls, this);
#endif
        newisa.extra_rc = 1;
    }

    isa = newisa;
}

We can see that the type of isa is isa_t. Step into isa_t:

union isa_t {
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    uintptr_t bits;

private:
    // Accessing the class requires custom ptrauth operations, so
    // force clients to go through setClass/getClass by making this
    // private.
    Class cls;

public:
#if defined(ISA_BITFIELD)
    struct {
        ISA_BITFIELD;  // defined in isa.h
    };

    bool isDeallocating() {
        return extra_rc == 0 && has_sidetable_rc == 0;
    }
    void setDeallocating() {
        extra_rc = 0;
        has_sidetable_rc = 0;
    }
#endif

    void setClass(Class cls, objc_object *obj);
    Class getClass(bool authenticated);
    Class getDecodedClass(bool authenticated);
};

Source code analysis: isa_t is a union with two members, bits and cls. Union members are mutually exclusive, which means there are two ways to initialize isa:

  • bits is assigned, and cls has no value of its own or is overwritten
  • cls is assigned, and bits has no value of its own or is overwritten

isa_t also has a struct member whose fields come from the ISA_BITFIELD macro. The macro is defined per architecture: one version for __arm64__ (iOS) and one for __x86_64__ (macOS). ISA_BITFIELD stores its information in bit fields, so let's see what that information is.

The iOS simulator definition is omitted; only the real-device and macOS macros are kept.

64-bit storage layout of bits

Meaning of each field:

  • nonpointer: indicates whether pointer optimization is enabled for isa. 0 means a pure pointer that holds only the class address; 1 means isa holds more than the class address: class information, object state, reference count and so on
  • has_assoc: associated-object flag. 0 means no associated objects, 1 means the object has associated objects
  • has_cxx_dtor: whether the object has a C++ or Objc destructor. If it does, the destructor logic must run; if not, the object can be freed faster
  • shiftcls: stores the value of the class pointer. With pointer optimization enabled, 33 bits store the class pointer on arm64 and 44 bits on x86_64
  • magic: used by the debugger to tell whether the current object is a real object or uninitialized space
  • weakly_referenced: whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be freed faster
  • deallocating: whether the object is being deallocated
  • has_sidetable_rc: set when the object's reference count no longer fits in extra_rc; the excess is then kept in a side table, and this bit records that
  • extra_rc: holds the object's reference count. Older runtime versions stored the count minus 1 here (a count of 10 gave extra_rc = 9), while the initIsa shown above sets extra_rc = 1 for a new object. When the count no longer fits, has_sidetable_rc above comes into play

Summary of the isa structure analysis

  • isa comes in two kinds: nonpointer and non-nonpointer. A non-nonpointer isa is just a pure pointer; a nonpointer isa also contains information about the class and the object
  • isa stores its information with a union plus bit fields. The benefit is a large memory saving. Everything is an object, and every object has an isa pointer, so many isa pointers add up to a lot of memory. The union shares one piece of memory, and the bit fields pack information into that same memory, so every bit of the isa pointer is put to use.
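The union plus bit field combination can be sketched in C as below. This is a simplified illustration, not the runtime's definition: isa_sketch is a hypothetical name, shiftcls (44 bits from bit 3) and magic (6 bits from bit 47) follow the x86_64 numbers used in this post, and the widths of the remaining high fields are assumptions chosen so the total is 64 bits. Bit-field layout is compiler-defined; the checks assume Clang or GCC on a little-endian 64-bit target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified picture of a nonpointer isa on x86_64. */
union isa_sketch {
    uint64_t bits;                        /* the whole 64-bit word */
    struct {
        uint64_t nonpointer        : 1;   /* bit 0 */
        uint64_t has_assoc         : 1;
        uint64_t has_cxx_dtor      : 1;
        uint64_t shiftcls          : 44;  /* bits 3..46: class address >> 3 */
        uint64_t magic             : 6;   /* bits 47..52 */
        uint64_t weakly_referenced : 1;   /* widths below are assumptions */
        uint64_t deallocating      : 1;
        uint64_t has_sidetable_rc  : 1;
        uint64_t extra_rc          : 8;
    } fields;
};

/* Small self-check that can be called from main(). */
void isa_sketch_demo(void) {
    union isa_sketch isa;

    /* Writing bits changes what the fields read back: same memory. */
    isa.bits = UINT64_C(0x001d800000000001);
    assert(sizeof(isa) == 8);             /* all of it fits in one pointer */
    assert(isa.fields.nonpointer == 1);
    assert(isa.fields.magic == 59);

    /* Writing a field changes bits: shiftcls lands at bits 3..46. */
    isa.fields.shiftcls = 536873007;
    assert(((isa.bits >> 3) & ((UINT64_C(1) << 44) - 1)) == 536873007);
}
```

Nine separate pieces of information share the 8 bytes a plain pointer would occupy, which is the memory saving described above.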

Exploring the isa association

First define an LWPerson class and call [LWPerson alloc]. The call chain is alloc --> _objc_rootAlloc --> callAlloc --> _objc_rootAllocWithZone --> _class_createInstanceFromZone --> obj->initInstanceIsa --> initIsa.

isa_t newisa(0) assigns 0 to bits, so every field in bits is 0.

Next comes newisa.bits = ISA_MAGIC_VALUE. ISA_MAGIC_VALUE is a macro equal to 0x001d800000000001. After the assignment the values are bits = 8303511812964353, cls = 0x001d800000000001, nonpointer = 1 and magic = 59. Let's check them against the assignment:

  • 0x001d800000000001 converted to decimal is 8303511812964353
  • cls = 0x001d800000000001 because assigning bits overwrites cls: isa_t is a union
  • Bit 0 is 1, which gives nonpointer = 1
  • magic starts at bit 47 of bits and is 6 bits long. Those six bits are 111011; read as a number starting from bit 0, 111011 is 59 in decimal
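The values in this list can be re-derived with a shift and a mask. The sketch below is a small C helper; extract_bits is a hypothetical name, and the start positions and lengths (bit 0 for nonpointer, 44 bits from bit 3 for shiftcls, 6 bits from bit 47 for magic) are the x86_64 numbers used in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Return `length` bits of `value`, starting at bit `start` (bit 0 = lowest). */
uint64_t extract_bits(uint64_t value, unsigned start, unsigned length) {
    assert(length >= 1 && length < 64 && start + length <= 64);
    return (value >> start) & ((UINT64_C(1) << length) - 1);
}

/* Small self-check that can be called from main(). */
void magic_value_demo(void) {
    const uint64_t isa_magic_value = UINT64_C(0x001d800000000001);

    /* Same number, written in decimal. */
    assert(isa_magic_value == UINT64_C(8303511812964353));

    /* nonpointer: bit 0. */
    assert(extract_bits(isa_magic_value, 0, 1) == 1);

    /* magic: 6 bits starting at bit 47, binary 111011. */
    assert(extract_bits(isa_magic_value, 47, 6) == 59);

    /* shiftcls: 44 bits starting at bit 3, still empty at this point. */
    assert(extract_bits(isa_magic_value, 3, 44) == 0);
}
```

The last check shows why setClass still has work to do: ISA_MAGIC_VALUE sets nonpointer and magic but leaves the class bits at zero.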

Set a breakpoint and step into setClass

shiftcls = 536873007

The figure shows:

  • shiftcls = 536873007 matches the result of the formula shiftcls = (uintptr_t)newCls >> 3.
  • The address of the LWPerson class, shifted right by 3 and written in decimal, is assigned to shiftcls. At this point isa is associated with the LWPerson class, and the cls member is overwritten: cls = LWPerson.

Question: why does (uintptr_t)newCls >> 3 shift right by 3 bits?

  • MACH_VM_MAX_ADDRESS is the maximum virtual memory address. On __arm64__ MACH_VM_MAX_ADDRESS = 0x1000000000, so the addressable space is 36 bits; on __x86_64__ MACH_VM_MAX_ADDRESS = 0x7fffffe00000, so the addressable space is 47 bits
  • Memory is aligned to 8 bytes, so a pointer address is always a multiple of 8 and its last three bits are always 0. For example, 0x8, 0x18 and 0x30 all end in three 0 bits in binary.

Given these two facts, the last three bits, which are always 0, are dropped to save space: 36 - 3 = 33 bits for shiftcls on __arm64__ and 47 - 3 = 44 bits on __x86_64__. That is why the class address is shifted right by 3 bits, i.e. (uintptr_t)newCls >> 3. isa is optimized to the extreme.
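A short C sketch of the shift, under the 8-byte alignment described above. The function names are hypothetical, and the address 0x100004178 is not taken from the debugger output; it is simply 536873007 shifted back left by 3, used here to show the round trip.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the three low bits, which are always 0 for an 8-byte aligned address. */
uint64_t class_to_shiftcls(uint64_t class_address) {
    assert((class_address & 0x7) == 0);   /* must be 8-byte aligned */
    return class_address >> 3;
}

/* Put the three zero bits back to recover the original address. */
uint64_t shiftcls_to_class(uint64_t shiftcls) {
    return shiftcls << 3;
}

/* Small self-check that can be called from main(). */
void shiftcls_demo(void) {
    const uint64_t class_address = UINT64_C(0x100004178);  /* derived value */

    uint64_t shiftcls = class_to_shiftcls(class_address);
    assert(shiftcls == 536873007);
    assert(shiftcls_to_class(shiftcls) == class_address);  /* nothing lost */

    /* The sample addresses from the text all end in three 0 bits. */
    assert((0x8 & 0x7) == 0 && (0x18 & 0x7) == 0 && (0x30 & 0x7) == 0);
}
```

Because the dropped bits are known to be zero, the shift loses no information, and the three bits it frees are used for nonpointer, has_assoc and has_cxx_dtor.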

Summary of isa class association

isa is associated with a class through shiftcls, which stores the class information inside the isa pointer.

Conclusion

Exploring isa and the nature of objects shows how necessary and important it is to dig into the underlying layer. The process is complicated and tedious, but the results are exciting, and that is the charm of exploring the bottom layer.

Supplement

Several ways isa is associated with a class

  • shiftcls = (uintptr_t)newCls >> 3
  • Bit operations on isa
  • isa & ISA_MASK

shiftcls = (uintptr_t)newCls >> 3

This is the method already explored above.

Bit operations on isa

The class information is stored in the isa pointer: shiftcls occupies a run of bits inside isa. On macOS it starts at bit 3 and is 44 bits long. The bit operations keep only the shiftcls bits and clear every other bit, while leaving shiftcls at its original position. The steps are:

  • isa is 0x011d8001000041a1. 0x011d8001000041a1 >> 3 equals 0x0023b00020000834
  • 0x0023b00020000834 << 20 equals 0x0002000083400000
  • 0x0002000083400000 >> 17 equals 0x00000001000041a0
  • po 0x00000001000041a0 prints LWPerson, so isa is associated with the class

Diagram of the isa bit operations
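The three shifts can be replayed in C with 64-bit unsigned arithmetic. This is a sketch for the x86_64 layout used in this post (3 low bits, 44 shiftcls bits, 17 high bits); class_from_isa_by_shifts is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the 44 shiftcls bits of an x86_64 isa, in their original place. */
uint64_t class_from_isa_by_shifts(uint64_t isa) {
    uint64_t v = isa >> 3;   /* drop the 3 low flag bits */
    v <<= 20;                /* 3 + 17: push the 17 high bits off the top */
    v >>= 17;                /* slide shiftcls back to bits 3..46 */
    return v;
}

/* Small self-check that can be called from main(). */
void isa_shift_demo(void) {
    const uint64_t isa = UINT64_C(0x011d8001000041a1);

    /* Intermediate values from the list above. */
    assert((isa >> 3) == UINT64_C(0x0023b00020000834));
    assert((UINT64_C(0x0023b00020000834) << 20) == UINT64_C(0x0002000083400000));
    assert((UINT64_C(0x0002000083400000) >> 17) == UINT64_C(0x00000001000041a0));

    /* All three steps together give the class address. */
    assert(class_from_isa_by_shifts(isa) == UINT64_C(0x00000001000041a0));
}
```

The left shift is 20 rather than 17 because the value has already moved 3 bits to the right; the final right shift of 17 then leaves three zero bits at the bottom, restoring the aligned address.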

isa & ISA_MASK

ISA_MASK is a macro. On __x86_64__ it equals 0x00007ffffffffff8ULL, and on __arm64__ it equals 0x0000000ffffffff8ULL. With isa = 0x011d8001000041a1, verify the result of 0x011d8001000041a1 & 0x00007ffffffffff8ULL.

po 0x011d8001000041a1 & 0x00007ffffffffff8ULL prints LWPerson, so isa is associated with the class. Printing ISA_MASK in binary with p/t 0x00007ffffffffff8ULL shows that the high 17 bits are 0, the low 3 bits are 0, and the middle 44 bits are 1, which exposes exactly the shiftcls bits of isa. ISA_MASK works like a mask: it shows what it exposes and hides everything else.
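The mask approach takes one AND instead of three shifts. The C sketch below uses the x86_64 mask value quoted above; ISA_MASK_X86_64_SKETCH, build_field_mask and class_from_isa_by_mask are hypothetical names, and build_field_mask only illustrates how a mask of 44 one-bits starting at bit 3 produces the same constant.

```c
#include <assert.h>
#include <stdint.h>

/* The x86_64 mask value quoted in the text. */
#define ISA_MASK_X86_64_SKETCH UINT64_C(0x00007ffffffffff8)

/* Build a mask with `width` one-bits starting at bit `start`. */
uint64_t build_field_mask(unsigned start, unsigned width) {
    assert(width >= 1 && width < 64 && start + width <= 64);
    return ((UINT64_C(1) << width) - 1) << start;
}

/* Keep the shiftcls bits in place and clear everything else. */
uint64_t class_from_isa_by_mask(uint64_t isa) {
    return isa & ISA_MASK_X86_64_SKETCH;
}

/* Small self-check that can be called from main(). */
void isa_mask_demo(void) {
    const uint64_t isa = UINT64_C(0x011d8001000041a1);

    /* 17 zero bits, 44 one bits, 3 zero bits. */
    assert(build_field_mask(3, 44) == ISA_MASK_X86_64_SKETCH);

    /* Same class address as the shift-based method. */
    assert(class_from_isa_by_mask(isa) == UINT64_C(0x00000001000041a0));
}
```

Both methods select the same 44 bits, so they always agree; the mask version is the one a single macro can express.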

Exploring init and new

The exploration of init and new has been added to the alloc post in the iOS Underlying Principles series.