The nature of an object

Clang is a lightweight C/C++/Objective-C/Objective-C++ compiler written in C++, based on LLVM and distributed under the LLVM BSD license. Here is the main.m code:

#import <Foundation/Foundation.h>

@interface OCPeople : NSObject {
    NSString *nickName;
}
@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) int age;
@end

@implementation OCPeople
@end

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        NSLog(@"Hello, World!");
    }
    return 0;
}

Here is a clang command: clang -rewrite-objc main.m -o main.cpp. It converts the main.m file into a main.cpp file. Run the command in a terminal from the folder that contains main.m, and a main.cpp file is generated in the same folder. Analyzing the generated lower-level code, we find that the object type OCPeople is compiled into a struct. The struct OCPeople_IMPL contains a nested struct, NSObject_IMPL NSObject_IVARS, because OCPeople inherits from NSObject.

NSObject_IVARS holds the member variable isa.
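The nesting described above can be modelled in plain C. This is only a sketch: Class and NSString are replaced by opaque stand-in types, and the backing ivar names _age and _name and their order are assumptions about what the rewriter emits for the two properties, not text copied from the generated main.cpp.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the Objective-C types; only pointers to them are used. */
typedef struct objc_class *Class;
typedef struct NSString_stub NSString;

struct NSObject_IMPL {
    Class isa;                            /* the only ivar NSObject contributes */
};

struct OCPeople_IMPL {
    struct NSObject_IMPL NSObject_IVARS;  /* inherited part comes first */
    NSString *nickName;                   /* ivar declared in the @interface */
    int _age;                             /* assumed backing ivar of `age` */
    NSString *_name;                      /* assumed backing ivar of `name` */
};

/* Offset of isa inside an OCPeople_IMPL instance. */
size_t isa_offset(void) {
    return offsetof(struct OCPeople_IMPL, NSObject_IVARS.isa);
}

/* The parent's ivars lead the struct, so isa is at offset 0 and
   nickName follows directly after the isa pointer. */
void check_layout(void) {
    assert(isa_offset() == 0);
    assert(offsetof(struct OCPeople_IMPL, nickName) == sizeof(Class));
}
```

Because the inherited struct is the first member, a pointer to the object is also a pointer to its isa, which is what lets any object be treated as an objc_object.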

It is worth noting this line in the generated code: typedef struct objc_object OCPeople;. Why is the underlying type of the OCPeople class objc_object? Because OCPeople inherits from NSObject, and at the lower level NSObject is objc_object. And what type is Class? Searching the generated file shows that it is objc_class *, a pointer to a struct. Why can id, which we use so often, point to any object without a *? Because id is itself objc_object *. In Objective-C development almost all objects inherit from NSObject, and underneath they are all objc_object structs.

The generated main.cpp also contains the getter and setter methods of the properties name and age. The lower-level code shows that the system adds a getter and a setter for these two properties automatically. For the instance variable nickName that we declared ourselves, the system does not generate a getter or setter; this difference is worth remembering. OCPeople * self and SEL _cmd are the two hidden parameters that every method receives at the lower level. The expression ((char *)self + OBJC_IVAR_$_OCPeople$_name) can be read as follows: an OCPeople object occupies a block of heap memory that stores isa, name, age and the other variables. Adding the offset OBJC_IVAR_$_OCPeople$_name to the start address of the object gives the address of name, and reading through that address gives the value of name.
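The "start address plus offset" access can be tried out in plain C. The sketch below is an illustration under assumptions: struct People is a simplified stand-in for the instance layout with char * in place of NSString *, kNameOffset plays the role of OBJC_IVAR_$_OCPeople$_name, and the function names are hypothetical. It does not reproduce the generated accessors, which also take the hidden SEL _cmd parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the instance layout. */
struct People {
    void *isa;
    const char *nickName;
    int age;
    const char *name;
};

/* Plays the role of OBJC_IVAR_$_OCPeople$_name: byte offset of the ivar. */
static const size_t kNameOffset = offsetof(struct People, name);

/* Getter: object start address + ivar offset, then read the value there. */
const char *People_name(struct People *self) {
    assert(self != NULL);
    return *(const char **)((char *)self + kNameOffset);
}

/* Setter: same address computation, but store through the pointer. */
void People_setName(struct People *self, const char *name) {
    assert(self != NULL);
    *(const char **)((char *)self + kNameOffset) = name;
}
```

After People_setName(&p, "Jones"), People_name(&p) returns the same pointer that a direct read of p.name would, which shows that the offset arithmetic and the member access reach the same memory.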

Summary: at the lower level an object is an objc_object struct. Every object contains an isa of type Class, inherited from NSObject. Class itself is a pointer to a struct, objc_class *.

Bit-fields and unions

Before we get into bitfields, let’s look at a bit of code:

struct OCBike1 {
    BOOL front;
    BOOL back;
    BOOL left;
    BOOL right;
};

Instantiate a bike1 from this struct and print its size: it is 4. bike1 occupies 4 bytes; a byte is 8 bits, so 4 bytes is 32 bits. Every member of OCBike1 is a Boolean, and a Boolean is either 0 or 1, so storing the four members really needs only 4 bits, which is half a byte. Memory cannot be handed out in half bytes, so 1 byte would be enough, and the remaining 3 bytes are wasted space.

Let’s modify the structure OCBike1 to OCBike2 as follows:

struct OCBike1 {
    BOOL front;
    BOOL back;
    BOOL left;
    BOOL right;
};

struct OCBike2 {
    BOOL front : 1;
    BOOL back  : 1;
    BOOL left  : 1;
    BOOL right : 1;
};

Compared with OCBike1, the struct OCBike2 only adds a bit-field to each member. Each bit-field is set to 1, meaning the member occupies 1 bit. For a BOOL member the bit-field width cannot be greater than 8; otherwise the compiler reports an error. Now instantiate OCBike2 as well and print the sizes of both instances: bike1 takes 4 bytes and bike2 takes 1 byte. bike2 uses far less space, which is a useful optimization when writing code. In this example we are modelling a bike that moves in only one direction at a time: if it goes forward it cannot go back, and if it goes left it cannot go right. The members are mutually exclusive.
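The size difference can be reproduced outside Xcode with a few lines of C. This is a sketch under assumptions: the C99 bool type stands in for BOOL, the struct names are shortened, and the sizes mentioned are what mainstream compilers such as GCC and Clang produce rather than something the language standard guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One full byte per flag. */
struct Bike1 { bool front; bool back; bool left; bool right; };

/* One bit per flag; all four share a single byte. */
struct Bike2 { bool front : 1; bool back : 1; bool left : 1; bool right : 1; };

size_t bike1_size(void) { return sizeof(struct Bike1); }
size_t bike2_size(void) { return sizeof(struct Bike2); }

/* Each 1-bit member still holds its own independent value. */
bool bike2_roundtrip(void) {
    struct Bike2 b = {0};
    b.front = true;
    b.left = true;
    assert(bike2_size() <= bike1_size());
    return b.front && !b.back && b.left && !b.right;
}
```

With these definitions bike1_size() is 4 and bike2_size() is 1 on the usual desktop toolchains, matching the sizes printed in the article.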

Let’s look at another structure with the following code:

struct OCProfessor1 {
    char *name;
    int age;
    double height;
};

Instantiate the struct OCProfessor1 as professor1, assign values to its members, and print professor1 while stepping through the code. At first all members of professor1 are empty, and afterwards every member is assigned successfully. This brings us to the term union. What is the difference between a union and a struct? Let's look at the union OCProfessor2:

union OCProfessor2 {
    char *name;
    int age;
    double height;
};

The union OCProfessor2 has the same members as the struct OCProfessor1 and looks almost identical, but the two behave differently: the members of a struct coexist, whereas the members of a union are mutually exclusive. Let's instantiate professor2 from the union OCProfessor2, assign its members, and print professor2 while stepping through the code to see whether the output still matches that of professor1.

When execution stops at line 77, no member of professor2 has been assigned yet. Printing it with p professor2 shows name, age and height all empty, as expected. After stepping to line 78 and printing again, name shows the value we just assigned, Jones, but age and height have changed from 0 to the odd-looking values 16296 and 2.12. This happens because age and height occupy the same memory as name, so what we see is the stored pointer reinterpreted as an int and a double, in other words dirty data. After stepping to line 79 and printing once more, the value of name is gone and age is 13. At any moment only one of name and age can be used, because they share memory.

  • In a struct, all members coexist. The drawback is that storage is allocated coarsely: space is reserved for every member whether or not it is used.
  • In a union, the members are mutually exclusive. The advantage is finer, more flexible memory usage that saves space. Unions are commonly used together with bit-fields.

All members of a union share one block of memory, so the offset of each member's start address from the base address of the union variable is 0; that is, every member starts at the same address. For all members to share that memory, it must be large enough to hold the widest member.
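Both properties, the common start address and the size of the widest member, can be checked directly. The following is a minimal sketch; the type names are shortened versions of the ones in the article and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct Professor1 { char *name; int age; double height; };
union  Professor2 { char *name; int age; double height; };

/* Every union member starts at the address of the union itself. */
int union_members_share_address(void) {
    union Professor2 p;
    return (void *)&p.name == (void *)&p
        && (void *)&p.age == (void *)&p
        && (void *)&p.height == (void *)&p;
}

/* Writing one member replaces whatever another member held before. */
int union_last_write_wins(void) {
    union Professor2 p;
    p.height = 1.80;
    p.age = 13;   /* overwrites the first bytes that stored height */
    assert(sizeof p >= sizeof p.height);
    return p.age == 13;
}
```

On a typical 64-bit platform the union is 8 bytes (the size of double and of a pointer), while the struct needs room for all three members plus padding.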

The main purpose of bit-fields is to compress storage. The general rule is as follows: if adjacent bit-fields have the same type and the sum of their widths does not exceed the size of that type in bits (sizeof the type times 8), each field is packed directly after the previous one until the storage unit can hold no more. If adjacent bit-fields have the same type but the sum of their widths exceeds the size of the type in bits, the field that does not fit starts in a new storage unit, at an offset that is an integer multiple of the type's size.
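A small experiment makes the packing rule concrete. Bit-field layout is implementation-defined, so this sketch assumes a 32-bit unsigned int and the behaviour of mainstream compilers such as GCC and Clang; the struct and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* 20 + 10 = 30 bits: both fields fit in one unsigned int. */
struct Packed {
    unsigned int a : 20;
    unsigned int b : 10;
};

/* 20 + 10 + 10 = 40 bits: c does not fit in the 2 bits left over,
   so it starts a second unsigned int. */
struct Spilled {
    unsigned int a : 20;
    unsigned int b : 10;
    unsigned int c : 10;
};

size_t packed_size(void)  { return sizeof(struct Packed); }
size_t spilled_size(void) { return sizeof(struct Spilled); }

/* The struct that spills over needs one more storage unit. */
void check_packing(void) {
    assert(spilled_size() > packed_size());
}
```

Under those assumptions packed_size() equals sizeof(unsigned int) and spilled_size() is twice that.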

isa

initIsa is reached from the _class_createInstanceFromZone function, which binds the pointer to the newly allocated heap memory to cls:

_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    ASSERT(cls->isRealized());

    // Read class's info bits all at once for performance
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    bool fast = cls->canAllocNonpointer();
    size_t size;

    size = cls->instanceSize(extraBytes);
    if (outAllocatedSize) *outAllocatedSize = size;

    ...

    if (!zone && fast) {
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        obj->initIsa(cls);
    }

    ...
}

Enter the initIsa function as follows:

inline void 
objc_object::initIsa(Class cls)
{
    initIsa(cls, false, false);
}

Stepping further into the lower-level initIsa, we see that the function declares an isa_t at the top. Looking at isa_t itself, we find that it is a union: lines 81 and 82 of its definition are the isa_t constructors, and bits is a member.

nonPointerIsa: an isa that is not a plain pointer. A pointer to a class is 8 bytes, and 8 * 8 = 64 bits. Using all 64 bits just to store a class address wastes a lot of space, and since almost every object carries an isa the waste adds up, so Apple optimized this space. Information that is consulted all the time can be written into the isa as well: whether the object is being deallocated, its reference count, whether it is weakly referenced, whether it has associated objects, and whether it has a destructor. (Objective-C sits on top of C/C++, so releasing an object at the Objective-C level does not by itself free the memory; the memory is only truly freed by the lower-level C/C++ functions.) All of this information can be stored in the 64 bits, which is what makes the isa a nonPointerIsa.

What is inside the union isa_t? Looking at its definition, we find ISA_BITFIELD. The version examined here is the ISA_BITFIELD for x86_64 (macOS); there is also an ISA_BITFIELD definition for the arm64 architecture (iOS).

  • nonpointer: 0 means a pure isa pointer; 1 means the isa holds not only the address of the class object but also class information and the object's reference count.
  • has_assoc: the associated-object flag. 0 means the object has no associated objects; 1 means it has.
  • has_cxx_dtor: whether the object has a C++ or Objective-C destructor. If it does, the destructor logic must run; if not, the object can be freed faster.
  • shiftcls: stores the value of the class pointer. With pointer optimization enabled, the x86_64 architecture uses 44 bits to store the class pointer and the arm64 architecture uses 33 bits.
  • magic: used by the debugger to tell whether the object is a real, initialized object or uninitialized memory.
  • weakly_referenced: whether the object is, or has ever been, pointed to by an ARC weak variable. Objects without weak references can be freed faster.
  • unused: a bit that is not currently used.
  • has_sidetable_rc: set when the object's reference count is too large to fit in extra_rc; the overflow is then stored in a side table (a hash table).
  • extra_rc: the reference count of the object. In older runtime versions this is the count minus 1 (a reference count of 10 is stored as 9); in the version stepped through later in this article, a freshly allocated object already has extra_rc = 1. When the count no longer fits in extra_rc, has_sidetable_rc is used.

The storage layout of ISA_BITFIELD on x86_64 is as follows.

There are 64 bits in total. Bit 0 stores nonpointer, bit 1 stores has_assoc, bit 2 stores has_cxx_dtor, bits 3 to 46 store shiftcls, bits 47 to 52 store magic, bit 53 stores weakly_referenced, bit 54 stores unused, bit 55 stores has_sidetable_rc, and bits 56 to 63 store extra_rc. A pure pointer isa stores only the address of the class; by default a nonPointerIsa is created. What we mainly read from a nonPointerIsa is shiftcls.
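To see how a union and bit-fields combine into this layout, here is a model in plain C. It is a sketch, not the runtime's real isa_t: the names isa_model, isa_model_make and isa_model_class are hypothetical, the field widths are the x86_64 ones listed above, and 64-bit bit-fields rely on compiler support that GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the x86_64 layout; bit positions hold on little-endian GCC/Clang. */
union isa_model {
    uint64_t bits;                        /* the whole 64-bit word */
    struct {
        uint64_t nonpointer        : 1;   /* bit 0 */
        uint64_t has_assoc         : 1;   /* bit 1 */
        uint64_t has_cxx_dtor      : 1;   /* bit 2 */
        uint64_t shiftcls          : 44;  /* bits 3-46 */
        uint64_t magic             : 6;   /* bits 47-52 */
        uint64_t weakly_referenced : 1;   /* bit 53 */
        uint64_t unused            : 1;   /* bit 54 */
        uint64_t has_sidetable_rc  : 1;   /* bit 55 */
        uint64_t extra_rc          : 8;   /* bits 56-63 */
    } f;
};

/* Build an isa word for a class address (the low 3 bits are dropped). */
uint64_t isa_model_make(uint64_t class_address) {
    union isa_model isa;
    isa.bits = 0;
    isa.f.nonpointer = 1;
    isa.f.shiftcls = class_address >> 3;
    assert(sizeof isa == sizeof(uint64_t));  /* 1+1+1+44+6+1+1+1+8 = 64 */
    return isa.bits;
}

/* Recover the class address from an isa word. */
uint64_t isa_model_class(uint64_t bits) {
    union isa_model isa;
    isa.bits = bits;
    return (uint64_t)isa.f.shiftcls << 3;
}
```

The union lets the same 8 bytes be read either as one integer (bits) or field by field, which is the design the article describes: the flags cost no extra memory because they live in bits the class address does not need.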

To analyze isa with real code, create a project demo1 whose main.m contains the following code:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        OCPeople *p = [OCPeople alloc];
        NSLog(@"%@",p);
    }
    return 0;
}

Run to a breakpoint and print p.

Print the first 8 bytes at p's address (the isa) in binary, and print the address of the OCPeople class. The isa is 0x011d8001000080e9. Printed in binary, it shows that the 64 bits are not fully used: 7 bits are still unused. How is the isa of the object p related to the address of its class OCPeople, 0x00000001000080e8? This is where ISA_MASK comes in.

ISA_MASK

Next to the x86_64 (macOS) ISA_BITFIELD definition there is a macro: # define ISA_MASK 0x00007ffffffffff8ULL. ISA_MASK is the mask for the class. ANDing 0x011d8001000080e9 with ISA_MASK gives the information we want, the address of the class, 0x00000001000080e8. The isa of the object p is not a pure pointer, but one mask operation recovers cls from it. Why does this work? Because the value is simply read back the same way it was stored. Normally we go from the class to the object; here we go the other way, from the object to its class.
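The mask operation is easy to verify with the two values quoted above. In this sketch ISA_MASK is the x86_64 value from the article, while class_from_isa and check_demo_values are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define ISA_MASK 0x00007ffffffffff8ULL   /* x86_64 value quoted in the article */

/* Keep only bits 3-46 (the shiftcls field), which form the class address. */
uint64_t class_from_isa(uint64_t isa_bits) {
    return isa_bits & ISA_MASK;
}

/* The isa of p and the address of OCPeople from the demo run. */
void check_demo_values(void) {
    assert(class_from_isa(0x011d8001000080e9ULL) == 0x00000001000080e8ULL);
}
```

The mask clears the three flag bits at the bottom and the 17 bits above shiftcls, leaving exactly the class address.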

Let's go back to the initIsa function mentioned above and run the objc source code, with main.m as follows:

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        OCPeople *ocp = [OCPeople alloc];
        NSLog(@"%@",ocp);
    }
    return 0;
}

Step into the initIsa function. At this point nonpointer is true, so let's print newisa.

newisa is created from the union isa_t, and at first every member of newisa is 0 or nil. The code continues to line 363, where we print newisa again. Now newisa.bits has a value: it was assigned ISA_MAGIC_VALUE, a macro defined as 0x001d800000000001ULL. The printed value of bits is 8303511812964353, which is the same number as ISA_MAGIC_VALUE, one in base 10 and the other in base 16. After bits is assigned, the value of cls changes too, because cls and bits share the same memory in the union. p/t 0x001d800000000001 prints the binary form, 0b0000 0000 0001 1101 1000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001, which is 64 bits. Bit 0 is 1 and is assigned to nonpointer, meaning this is not a pure isa pointer: it holds not only the address of the class object but also class information and the reference count. Bits 47 through 52 are 111011, and p 0b111011 gives exactly 59, which is the value assigned to magic.
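The same decoding can be done with shifts and masks instead of lldb. The sketch below assumes the x86_64 bit positions given earlier (nonpointer in bit 0, magic in bits 47 to 52); the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ISA_MAGIC_VALUE 0x001d800000000001ULL   /* x86_64 value from the article */

/* Bit 0: 1 means the isa is not a pure pointer. */
unsigned isa_nonpointer(uint64_t bits) {
    return (unsigned)(bits & 1u);
}

/* Bits 47-52: six bits, so mask with 0x3f after shifting. */
unsigned isa_magic(uint64_t bits) {
    return (unsigned)((bits >> 47) & 0x3f);
}

/* The values observed in the debugger session. */
void check_magic_value(void) {
    assert(isa_nonpointer(ISA_MAGIC_VALUE) == 1);
    assert(isa_magic(ISA_MAGIC_VALUE) == 59);
}
```

Assigning ISA_MAGIC_VALUE to bits therefore sets nonpointer and magic in a single store, leaving every other field at 0.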

The code continues to the statement newisa.setClass(cls, this);, and we set a breakpoint inside the setClass function. The function has many if...else branches, so I took the crudest approach: put a breakpoint in every branch and see which one is hit. Execution goes straight to line 213. With the command p/x newCls we print newCls in hexadecimal, and then print that address shifted right by 3 bits: 536875229. shiftcls is therefore equal to 536875229.

The code continues to line 367, where we print newisa. The printed shiftcls is 536875229, the same value we calculated above: the address of the OCPeople class shifted right by 3 bits is the value of shiftcls. At this point newisa is already associated with the class OCPeople, and printing cls gives OCPeople.

Run further to line 376 and print newisa again. Now extra_rc also has a value, 1, and bits has changed to 80361110145894121. The value of shiftcls is unchanged; the association was already made in the previous step. To summarize: cls is associated with isa by storing the class information in the shiftcls field of the isa.
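The final value of bits can be rebuilt from the three pieces seen during the session. This is a sketch assuming the x86_64 field positions (shiftcls starting at bit 3, extra_rc starting at bit 56); compose_isa is a hypothetical helper, not a runtime function.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the magic value, the shiftcls field and the extra_rc field. */
uint64_t compose_isa(uint64_t magic_value, uint64_t shiftcls, uint64_t extra_rc) {
    return magic_value | (shiftcls << 3) | (extra_rc << 56);
}

/* ISA_MAGIC_VALUE, shiftcls = 536875229 and extra_rc = 1 from the debugger. */
void check_final_bits(void) {
    assert(compose_isa(0x001d800000000001ULL, 536875229ULL, 1ULL)
           == 80361110145894121ULL);
}
```

The three pieces occupy disjoint bit ranges, so OR-ing them together loses nothing, and the result matches the bits value printed at line 376.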

Next, go back to demo1, run to the breakpoint, and print the object p with x/4gx p. The isa of the object p is 0x011d8001000080e9. Shifting 0x011d8001000080e9 right by 3 bits gives 0x0023b0002000101d. Shifting 0x0023b0002000101d left by 20 bits gives 0x0002000101d00000. Shifting 0x0002000101d00000 right by 17 bits gives 0x00000001000080e8. The printed address of the class OCPeople is 0x00000001000080e8, the same as the result of the shift operations on the isa. So the isa of the object p is indeed associated with the class OCPeople.

[Figure: the isa shift operations, step by step]

To summarize the association between isa and the class:

  • The address of the class shifted right by 3 bits is the value of shiftcls, and shiftcls is stored in the ISA_BITFIELD part of the isa.
  • Applying the shift operations to the isa yields the address of the class.
  • ANDing the isa with ISA_MASK also yields the address of the class.
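The shift route and the mask route can be compared side by side. The sketch below uses the x86_64 shift counts from the walkthrough (3, 20, 17) and the ISA_MASK value quoted earlier; class_by_shifting and class_by_mask are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define ISA_MASK 0x00007ffffffffff8ULL   /* x86_64 value quoted in the article */

/* Recover the class address by shifting the unwanted bits out. */
uint64_t class_by_shifting(uint64_t isa_bits) {
    uint64_t v = isa_bits >> 3;  /* drop nonpointer, has_assoc, has_cxx_dtor */
    v <<= 20;                    /* push the 17 bits above shiftcls off the top */
    v >>= 17;                    /* move shiftcls back down to bits 3-46 */
    return v;
}

/* Recover the class address with a single AND. */
uint64_t class_by_mask(uint64_t isa_bits) {
    return isa_bits & ISA_MASK;
}

/* Both routes give the OCPeople address for the isa of p. */
void check_both_routes(void) {
    uint64_t isa = 0x011d8001000080e9ULL;
    assert(class_by_shifting(isa) == 0x00000001000080e8ULL);
    assert(class_by_shifting(isa) == class_by_mask(isa));
}
```

The counts follow from the layout: 3 flag bits sit below shiftcls and 17 bits sit above it, so the left shift must undo the first 3 and clear the upper 17, which is 20 in total.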

So much for exploring the nature of objects and isa. This article was written with reference to cooci's explanations and fellow students' blogs. Thank you for sharing! If there are any mistakes in this article, please point them out.