iOS underlying principles + reverse engineering: article summary

The main purpose of this article is to analyze classes and the class structure; the whole article is a series of explorations built around a single class.

Class analysis

Class analysis mainly examines where isa points and how inheritance works.

Preparation

Define two classes

  • The class CJLPerson, which inherits from NSObject
@interface CJLPerson : NSObject
{
    NSString *hobby;
}
@property (nonatomic, copy) NSString *cjl_name;
- (void)sayHello;
+ (void)sayBye;
@end

@implementation CJLPerson
- (void)sayHello
{}
+ (void)sayBye
{}
@end
  • The class CJLTeacher, which inherits from CJLPerson
@interface CJLTeacher : CJLPerson
@end

@implementation CJLTeacher
@end
  • Define two objects in main: person & teacher
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        //ISA_MASK  0x00007ffffffffff8ULL
        CJLPerson *person = [CJLPerson alloc];
        CJLTeacher *teacher = [CJLTeacher alloc];
        NSLog(@"Hello, World! %@ - %@",person,teacher);  
    }
    return 0;
}

The metaclass

First, let's introduce metaclasses through an LLDB debugging example

  • Set a breakpoint on the CJLTeacher line in main and run the program
  • Start LLDB debugging. The following figure shows the debugging process

Based on the debugging process, a question arises: why do both p/x 0x001d8001000022dd & 0x00007ffffffffff8ULL and p/x 0x00000001000022b0 & 0x00007ffffffffff8ULL in the figure print CJLPerson as the class information?

  • 0x001d8001000022dd is the isa value of the person object. After & with ISA_MASK, the result is the class that person was created from, CJLPerson
  • 0x00000001000022b0 is the isa value of the class obtained from that first isa, i.e. the isa of the CJLPerson class. Apple calls the class of the CJLPerson class its metaclass
  • So both commands print CJLPerson because the second result is the metaclass, which carries the same name as the class
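The masking step in those two lldb commands is plain bit arithmetic, so it can be checked outside the debugger. The following is a minimal C++ sketch; the names kIsaMask and class_address_from_isa are made up for illustration, and the mask is the x86_64 value quoted in the comment in main.

```cpp
#include <cassert>
#include <cstdint>

// ISA_MASK as quoted in the comment in main (x86_64 value).
constexpr std::uint64_t kIsaMask = 0x00007ffffffffff8ULL;

// Hypothetical helper: strip the non-pointer bits from a raw isa value,
// leaving only the address of the class (or metaclass) it points to.
constexpr std::uint64_t class_address_from_isa(std::uint64_t raw_isa) {
    return raw_isa & kIsaMask;
}

// isa of the person object -> address of the CJLPerson class.
static_assert(class_address_from_isa(0x001d8001000022ddULL) == 0x00000001000022d8ULL,
              "person isa masked");
// isa of the CJLPerson class -> address of the CJLPerson metaclass.
static_assert(class_address_from_isa(0x00000001000022b0ULL) == 0x00000001000022b0ULL,
              "class isa masked");
```

The first value loses its high bits and its low three bits; the second is already a clean address, so the mask leaves it unchanged. The two results are different addresses, which is consistent with one being the class and the other its metaclass even though both print the name CJLPerson.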

Description of metaclasses

The main points about metaclasses are:

  • We know that the isa of an object points to its class. A class is itself an object, which we call a class object, and the bit field in its isa points to the metaclass defined by Apple

  • Metaclasses are provided by the system. They are defined and created by the compiler. In this scheme, a class belongs to its metaclass

  • A metaclass is the class of a class object. Each class has a unique metaclass, which stores information about class methods.

  • A metaclass has no name of its own; because it is associated with a class, it uses that class's name

Next, use lldb commands to explore where metaclasses point, that is, the isa chain. As the figure below shows, the chain is: object -> class -> metaclass -> NSObject, and NSObject points to itself

Conclusion

As the figure shows:

  • The isa of an object points to its class (also called the class object)
  • The isa of a class points to its metaclass
  • The isa of a metaclass points to the root metaclass, i.e. NSObject
  • The isa of the root metaclass points to itself
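The four rules above can be modelled with a tiny linked structure. This is only a toy sketch in C++: ToyNode, ToyIsaChain and follow_isa are hypothetical names, and nothing here touches the real runtime.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for anything that has an isa: a label plus an isa pointer.
struct ToyNode {
    std::string label;
    const ToyNode* isa;
};

// Follow isa `hops` times starting from `start`.
const ToyNode* follow_isa(const ToyNode* start, int hops) {
    const ToyNode* current = start;
    for (int i = 0; i < hops; ++i) {
        current = current->isa;
    }
    return current;
}

// object -> class -> metaclass -> root metaclass -> (itself)
struct ToyIsaChain {
    ToyNode root_meta{"NSObject (root metaclass)", nullptr};
    ToyNode person_meta{"CJLPerson (metaclass)", &root_meta};
    ToyNode person_class{"CJLPerson (class)", &person_meta};
    ToyNode person{"person (object)", &person_class};

    ToyIsaChain() { root_meta.isa = &root_meta; }  // close the loop
    ToyIsaChain(const ToyIsaChain&) = delete;      // copies would dangle
};
```

Starting from person, three hops reach the root metaclass, and any further hop stays there, which is the closed loop described in the last rule.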

How many NSObjects are there?

As you can see from the figure, the last root metaclass is NSObject. Is this NSObject the same as the NSObject we know from development?

The following two verification methods are available

  • [Method 1] Verify with lldb commands
  • [Method 2] Verify with code

[Method 1] Verify using the LLDB command

We again use LLDB debugging to verify whether the two NSObjects are the same, as shown in the figure below

As the figure shows, the metaclass at the end of the chain for the NSObject class is also NSObject, and it is the same metaclass as the root metaclass (NSObject) of CJLPerson above. We can therefore conclude that only one root metaclass, NSObject, exists in memory, and that the metaclass of the root metaclass is itself

[Method 2] Code verification

Get the class in three different ways and see if they print the same address

#import <objc/runtime.h>

// Get the class in three different ways
void classNum(void) {
    Class class1 = [CJLPerson class];
    Class class2 = [CJLPerson alloc].class;
    Class class3 = object_getClass([CJLPerson alloc]);
    NSLog(@"\n%p-\n%p-\n%p", class1, class2, class3);
}

Here is the result of the code running

As the result shows, all three addresses are the same, so only one copy of a class object exists in memory, and the same holds for NSObject

[Interview question] How many copies of a class object exist?

Because class information only ever exists once in memory, there is only one copy of a class object

The famous isa chain & inheritance diagram

Based on the exploration and tests above, the relationship between objects, classes, metaclasses, and the root metaclass is shown in the following figure

isa chain

The isa chain follows these rules:

  • The isa of an instance object (Instance of Subclass) points to its class (class)

  • The isa of a class object (class) points to its metaclass (Meta class)

  • The isa of a metaclass (Meta class) points to the root metaclass (Root meta class)

  • The isa of the root metaclass (Root meta class) points to itself, forming a closed loop; here the root metaclass is NSObject

superclass chain

The superclass chain (i.e. the inheritance relationship) follows these rules:

  • Inheritance between classes:
    • SubClass inherits from SuperClass

    • SuperClass inherits from RootClass; here the root class is NSObject

    • The root class inherits from nil, so the root class NSObject can be thought of as the origin of everything, something created out of nothing

  • Metaclasses also have inheritance. The inheritance relationship between metaclasses is as follows:
    • The metaclass of a subclass (meta SubClass) inherits from the metaclass of its superclass (meta SuperClass)

    • The metaclass of the superclass (meta SuperClass) inherits from the root metaclass (Root meta Class)

    • The root metaclass (Root meta Class) inherits from the root class (Root Class); here the root class is NSObject

  • [Note] There is no inheritance between instance objects; inheritance exists only between classes

Example

Taking the CJLTeacher class and its object teacher, and the CJLPerson class and its object person mentioned earlier, the following figure shows an example

  • isa chains (two)

    • isa chain of teacher: teacher (subclass object) –> CJLTeacher (subclass) –> CJLTeacher (child metaclass) –> NSObject (root metaclass) –> NSObject (root metaclass, itself)

    • isa chain of person: person (superclass object) –> CJLPerson (superclass) –> CJLPerson (parent metaclass) –> NSObject (root metaclass) –> NSObject (root metaclass, itself)

  • superclass chains (two)

    • Class inheritance chain: CJLTeacher (subclass) –> CJLPerson (superclass) –> NSObject (root class) –> nil

    • Metaclass inheritance chain: CJLTeacher (child metaclass) –> CJLPerson (parent metaclass) –> NSObject (root metaclass) –> NSObject (root class) –> nil
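The two superclass chains can also be written down as a toy model and walked in code. This is a sketch under the assumption that a plain pointer is enough to stand in for superclass; ToyClass, ToyHierarchy and superclass_chain are hypothetical names, and nullptr plays the role of nil.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a class or metaclass: a label plus a superclass pointer.
struct ToyClass {
    std::string label;
    const ToyClass* superclass;  // nullptr plays the role of nil
};

// Collect the labels along the superclass chain, ending with "nil".
std::vector<std::string> superclass_chain(const ToyClass* cls) {
    std::vector<std::string> labels;
    for (const ToyClass* c = cls; c != nullptr; c = c->superclass) {
        labels.push_back(c->label);
    }
    labels.push_back("nil");
    return labels;
}

struct ToyHierarchy {
    // Class inheritance chain.
    ToyClass root_class{"NSObject (root class)", nullptr};
    ToyClass person_class{"CJLPerson (superclass)", &root_class};
    ToyClass teacher_class{"CJLTeacher (subclass)", &person_class};
    // Metaclass inheritance chain; the root metaclass inherits from the root class.
    ToyClass root_meta{"NSObject (root metaclass)", &root_class};
    ToyClass person_meta{"CJLPerson (parent metaclass)", &root_meta};
    ToyClass teacher_meta{"CJLTeacher (child metaclass)", &person_meta};
};
```

Walking from teacher_class visits three classes before nil, while walking from teacher_meta visits four, because the root metaclass adds one extra step through the root class.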

objc_class & objc_object

Now that the isa chain is clear, a new question arises: why do both objects and classes have an isa property? Two struct types have to be mentioned here: objc_class and objc_object

On the basis of these two structures, the above questions are explored below.

In the previous article, iOS underlying principles 07: how isa is associated with the class, the main.m file was compiled into main.cpp using clang. The compiled C++ file contains the source code below

  • The underlying compiled form of NSObject is the NSObject_IMPL struct,
    • where Class, the type of the isa pointer, is a type defined in terms of objc_class,
    • and objc_class is a struct. In iOS, every Class is created with objc_class as its template
struct NSObject_IMPL {
	Class isa;
};


typedef struct objc_class *Class;
  • Search for the definition of objc_class in the objc4 source code. The source contains two versions of the definition

    • The old version is located in runtime.h and has been deprecated

    • The new version is located in objc-runtime-new.h. This is the newly optimized definition in objc4-781, and our later analysis of the class structure is based on it.

From the new definition, you can see that the objc_class struct type inherits from objc_object.

  • Search the objc4 source code for objc_object (or objc_object {). This type also has two versions
    • One is located in objc.h and has not been deprecated. As the compiled main.cpp shows, this is the version of objc_object that is used

    • The other is located in objc-private.h

The following is the definition of objc_object in the compiled main.cpp

struct objc_object {
    Class _Nonnull isa __attribute__((deprecated));
};

What does objc_class have to do with objc_object?

From the source search above and the compiled main.cpp, the following points can be made:

  • The struct type objc_class inherits from objc_object, which is also a struct and has an isa property, so objc_class also has an isa property

  • In the compiled main.cpp file, the isa in NSObject is declared with the type Class, and Class is ultimately a pointer to objc_class, so NSObject also has the isa property

  • NSObject is a class. An instance object initialized from it, objc, satisfies the characteristics of objc_object (that is, it has the isa property), mainly because the isa in NSObject comes from objc_class, and objc_class inherits from objc_object, which has the isa property. So every object has an isa, and the isa expresses where the object points, which comes from the current objc_object

  • objc_object (a struct) is the current root object. All objects share the characteristics of objc_object, that is, they have the isa property
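The relationship in these points can be reproduced with three small structs. This is a toy C++ sketch, not the runtime's definitions: toy_object, toy_class, ToyClassRef, ToyNSObject_IMPL and isa_of are hypothetical names that mirror objc_object, objc_class, Class and NSObject_IMPL in shape only.

```cpp
#include <cassert>
#include <type_traits>

struct toy_class;                // forward declaration
typedef toy_class* ToyClassRef;  // mirrors: typedef struct objc_class *Class;

struct toy_object {              // stands in for objc_object
    ToyClassRef isa;
};

struct toy_class : toy_object {  // stands in for objc_class : objc_object
    ToyClassRef superclass;
};

struct ToyNSObject_IMPL {        // mirrors NSObject_IMPL: a single isa member
    ToyClassRef isa;
};

// Because toy_class derives from toy_object, a class is also an object.
static_assert(std::is_base_of<toy_object, toy_class>::value,
              "a class inherits the isa member");

// Read isa from anything that is a toy_object, whether instance or class.
inline ToyClassRef isa_of(const toy_object& obj) {
    return obj.isa;
}
```

The same isa_of call accepts a class as readily as an instance, which is the sense in which classes and objects both have an isa: the member is declared once in the base struct and inherited.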

[Baidu interview question] The relationship between objc_object and objects

  • All objects inherit from objc_object

  • All objects come from NSObject (OC), but at the bottom level each one is an objc_object (C/C++) struct type

[Conclusion] The relationship between objc_object and objects is inheritance

Conclusion

  • All objects, classes, and metaclasses have the isa property

  • All objects inherit from objc_object

  • Put simply, everything is an object, and everything comes from objc_object. Two conclusions follow:

    • All objects created with objc_object as the template have the isa property

    • All classes created with objc_class as the template have the isa property

  • At the structural level, the connection between the upper layer (OC) and the bottom layer can be understood informally as follows:

    • The bottom layer defines templates as structs, e.g. objc_class and objc_object
    • The upper layer creates types from those bottom-layer templates, e.g. CJLPerson

The overall relationship between objc_class, objc_object, isa, object, and NSObject is shown in the figure below

Class structure analysis

The main goal is to analyze what is stored in the class information

Supplementary knowledge – Memory offset

Before you can analyze the class structure, you need to understand memory offsets because they are used when accessing the class information

[Ordinary pointer]

// Ordinary pointers
int a = 10;
int b = 10;
NSLog(@"%d -- %p", a, &a);
NSLog(@"%d -- %p", b, &b);

The print result is shown in the following figure

  • a and b both hold the value 10, but the addresses of a and b are different. This is a value copy, also known as a deep copy

  • The addresses of a and b differ by 4 bytes, which depends on the type of a and b

How the addresses point is shown in the figure

[Object pointer]

// Objects
CJLPerson *p1 = [CJLPerson alloc];
CJLPerson *p2 = [CJLPerson alloc];
NSLog(@"%p -- %p", p1, &p1);
NSLog(@"%p -- %p", p2, &p2);

The print result is shown in the figure

  • p1 and p2 are pointers: p1 holds the address of the space created by [CJLPerson alloc], that is, the object's memory address, and the same goes for p2

  • &p1 and &p2 are the addresses of the pointers p1 and p2 themselves. Such a pointer to a pointer is a second-level pointer

How the pointers point is shown in the following figure

[Array pointer]

// Array pointers
int c[4] = {1, 2, 3, 4};
int *d = c;
NSLog(@"%p -- %p - %p", &c, &c[0], &c[1]);
NSLog(@"%p -- %p - %p", d, d+1, d+2);

The print result is as follows

  • &c and &c[0] both yield the first address, i.e. the array name is equivalent to the first address
  • &c and &c[1] differ by 4 bytes; the number of bytes between addresses depends on the stored data type
  • Other elements of the array can be fetched via first address + offset, where the offset is the array index. The actual number of bytes moved from the first address in memory equals offset * number of bytes of the data type
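The rule "bytes moved = offset * size of the data type" can be checked directly. The following C++ sketch uses two hypothetical helpers, byte_offset_of_element and element_at, to measure the byte distance and to fetch an element through first address + offset.

```cpp
#include <cassert>
#include <cstddef>

// Byte distance from the first element to element `index` of an int array.
std::ptrdiff_t byte_offset_of_element(const int* first, std::size_t index) {
    const char* base = reinterpret_cast<const char*>(first);
    const char* elem = reinterpret_cast<const char*>(first + index);
    return elem - base;
}

// Fetch element `index` via first address + offset.
int element_at(const int* first, std::size_t index) {
    return *(first + index);
}
```

For int c[4] = {1, 2, 3, 4}, the offset of c[1] is sizeof(int) bytes (4 on the platforms discussed here), and first + 2 lands on the value 3. The pointer arithmetic counts in elements, and the compiler scales it to bytes.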

How the pointers point is shown in the following figure

Explore what’s in the class information

When exploring what's in the class information, we don't know the structure of the class in advance, but we can get a first address from the class and then reach all the values inside by offsetting from that address

According to the new objc_class definition (objc4-781 version) mentioned above, it has the following properties

struct objc_class : objc_object {
    // Class ISA;            // 8 bytes, inherited from objc_object
    Class superclass;        // Class type, 8 bytes
    cache_t cache;           // formerly cache pointer and vtable
    class_data_bits_t bits;  // class_rw_t * plus custom rr/alloc flags

    // ... methods omitted
};
  • isa property: inherited from objc_object, of type Class, 8 bytes

  • superclass property: of type Class, which is defined in terms of objc_class; it is a pointer and takes 8 bytes

  • cache property: its type cache_t is a struct type, so its size cannot be read off directly. The memory size of a struct is determined by its internal properties; only a struct pointer would be 8 bytes

  • bits property: bits can be reached only by offsetting the first address by the sum of the memory sizes of the preceding three properties

Calculate the memory size of cache

Looking at the definition of cache_t (only the non-static properties of the struct are shown, because static properties are not stored in the struct's memory), we have the following properties

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets; // a struct pointer, 8 bytes
    explicit_atomic<mask_t> _mask;               // mask_t is an alias for unsigned int, 4 bytes
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    explicit_atomic<uintptr_t> _maskAndBuckets;  // pointer-sized, 8 bytes
    mask_t _mask_unused;                         // 4 bytes
    // ... other storage variants omitted
#endif
#if __LP64__
    uint16_t _flags;                             // uint16_t is an alias for unsigned short, 2 bytes
#endif
    uint16_t _occupied;                          // 2 bytes
    // ... remaining members and methods omitted
};
  • Calculate the memory size of the first two properties. In both cases below they add up to 12 bytes

    • [Case 1] the if branch
      • _buckets is of type struct bucket_t *, a struct pointer, which takes 8 bytes

      • _mask is of type mask_t, an alias for unsigned int, which takes 4 bytes

    • [Case 2] the elif branch
      • _maskAndBuckets is of type uintptr_t; it is pointer-sized and takes 8 bytes

      • _mask_unused is of type mask_t, an alias for uint32_t, which takes 4 bytes

  • _flags is of type uint16_t, an alias for unsigned short, which takes 2 bytes

  • _occupied is of type uint16_t, which also takes 2 bytes

Summary: the memory size of cache = 12 + 2 + 2 = 16 bytes

To obtain bits

Therefore, to obtain the contents of bits, we only need to offset the first address of the class by 32 bytes: 8 (isa) + 8 (superclass) + 16 (cache)
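The 16-byte and 32-byte figures can be checked with a layout sketch. The C++ below is a toy model that assumes a 64-bit target and uses uint64_t in place of every 8-byte pointer; toy_cache_t, toy_objc_class and read_bits_by_offset are hypothetical names, and the cache fields follow the second case above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// 64-bit layout sketch; uint64_t stands in for every 8-byte pointer.
struct toy_cache_t {
    std::uint64_t maskAndBuckets;  // 8 bytes
    std::uint32_t mask_unused;     // 4 bytes
    std::uint16_t flags;           // 2 bytes
    std::uint16_t occupied;        // 2 bytes
};

struct toy_objc_class {
    std::uint64_t isa;         // 8 bytes
    std::uint64_t superclass;  // 8 bytes
    toy_cache_t cache;         // 16 bytes
    std::uint64_t bits;        // expected at offset 32
};

static_assert(sizeof(toy_cache_t) == 16, "12 + 2 + 2");
static_assert(offsetof(toy_objc_class, bits) == 32, "8 + 8 + 16");

// Read bits the way the lldb session does: first address + 32 bytes.
std::uint64_t read_bits_by_offset(const toy_objc_class* cls) {
    const unsigned char* first = reinterpret_cast<const unsigned char*>(cls);
    std::uint64_t value = 0;
    std::memcpy(&value, first + 32, sizeof value);
    return value;
}
```

In lldb the same step is done by adding 0x20 to the class's first address, since 32 in decimal is 0x20 in hexadecimal.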

The following shows debugging with lldb commands

  • There are two ways to get the first address of a class
    • Get the first address directly with p/x CJLPerson.class

    • Print the memory information with x/4gx CJLPerson.class

  • The data() call used to get the data is a method provided by objc_class

  • From the printout of the $2 pointer, you can see that the data stored in bits is of type class_rw_t, which is also a struct type. But we still don't see the property list, method list, and so on, so we need to keep exploring

Explore the property list, namely property_list

By looking at the source definition of class_rw_t, we find that the struct provides corresponding methods to get the property list, method list, and so on, as shown below

Building on the bits information obtained and printed in the previous section, we continue to explore the property list in bits using the methods that class_rw_t provides. The following figure shows the LLDB exploration process

  • The properties method in the p $8.properties() command is provided by class_rw_t, and the actual type it returns is property_array_t

  • Since list is of type property_list_t *, a pointer, the contents in memory are obtained with p *$10. This also shows that property_list, the property list, is stored in bits

  • p $11.get(1) tries to get the member variable hobby in CJLPerson, but it reports an out-of-bounds error, which shows that property_list contains only one property, cjl_name

[Problem] Explore the storage of member variables

The difference between a property and a member variable is whether set and get methods exist: with them it is a property; without them it is a member variable.

So the question is, where are member variables stored? Why does this happen? See the analysis at the end of the article

Explore the method list, i.e. methods_list

Preparation: add two methods (an instance method & a class method) to the CJLPerson class mentioned earlier

//CJLPerson.h
@property (nonatomic, copy) NSString *cjl_name;
- (void)sayHello;
+ (void)sayBye;
@end

//CJLPerson.m
@implementation CJLPerson
- (void)sayHello
{}
+ (void)sayBye
{}
@end

LLDB debugging is also used to get the list of methods, as shown in the figure

  • Get the concrete method list structure with p $4.methods(); methods is also a method provided by class_rw_t

  • The printed count = 4 tells us that four methods are stored. A single method can be fetched via memory offset with p $7.get(i), where i ranges from 0 to 3

  • If you print p $7.get(4) to get the fifth method, you will also see an error indicating that the array is out of bounds
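Both the property list and the method list behave like a counted list whose get(i) is valid only for indices 0 to count - 1. The following C++ sketch models that behaviour; ToyEntryList is a hypothetical type, and throwing an exception is this sketch's own choice for signalling an out-of-bounds index, not a statement about how the runtime reports it.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy counterpart of a counted list such as the property or method list.
class ToyEntryList {
public:
    explicit ToyEntryList(std::vector<std::string> names)
        : names_(std::move(names)) {}

    std::uint32_t count() const {
        return static_cast<std::uint32_t>(names_.size());
    }

    // Valid indices are 0 .. count() - 1; anything else is out of bounds.
    const std::string& get(std::uint32_t i) const {
        if (i >= count()) {
            throw std::out_of_range("index out of bounds");
        }
        return names_[i];
    }

private:
    std::vector<std::string> names_;
};
```

With a single entry, cjl_name, get(0) succeeds and get(1) fails, which matches what the property list exploration showed. With four entries, get(4) fails in the same way.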

Exploring the new problems

[Problem] Explore the storage of member variables

Property_list contains only properties and no member variables. Where are the member variables stored? Why does this happen?

By looking at the definition of class_rw_t, the type of the data stored in the bits property of objc_class, we find that in addition to the methods, properties, and protocols methods there is also a ro method, whose return type is class_ro_t. Looking at that struct, we can guess that member variables are stored in its ivars property, which is of type ivar_list_t.

The following is the debugging process for LLDB

  • The properties of the class_ro_t struct are shown below. To get ivars, the first address of ro must be offset by 48 bytes
struct class_ro_t {
    uint32_t flags;                   // 4
    uint32_t instanceStart;           // 4
    uint32_t instanceSize;            // 4
#ifdef __LP64__
    uint32_t reserved;                // 4
#endif
    const uint8_t * ivarLayout;       // 8
    const char * name;                // 8
    method_list_t * baseMethodList;   // 8
    protocol_list_t * baseProtocols;  // 8
    const ivar_list_t * ivars;
    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    // ... methods omitted
};
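The 48-byte figure is the sum of the fields that precede ivars: four 4-byte integers plus four 8-byte pointers. The C++ sketch below checks that arithmetic on a toy copy of the layout; toy_class_ro_t and ivars_offset are hypothetical names, a 64-bit target is assumed, and uint64_t stands in for each pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 64-bit layout sketch of the fields listed above.
struct toy_class_ro_t {
    std::uint32_t flags;           // 4
    std::uint32_t instanceStart;   // 4
    std::uint32_t instanceSize;    // 4
    std::uint32_t reserved;        // 4 (present on 64-bit only)
    std::uint64_t ivarLayout;      // 8
    std::uint64_t name;            // 8
    std::uint64_t baseMethodList;  // 8
    std::uint64_t baseProtocols;   // 8
    std::uint64_t ivars;           // expected at offset 48
    std::uint64_t weakIvarLayout;
    std::uint64_t baseProperties;
};

static_assert(offsetof(toy_class_ro_t, ivars) == 48, "4 * 4 + 4 * 8");

// Offset of ivars from the first address of the struct, in bytes.
constexpr std::size_t ivars_offset() {
    return offsetof(toy_class_ro_t, ivars);
}
```

In hexadecimal the offset is 0x30, which is the value to add to the first address of ro when reading ivars by hand in lldb.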

As the figure shows, the ivars list obtained has a count of 2. Printing it shows that the member list contains not only hobby but also the member variable generated for the cjl_name property, so the following conclusions can be drawn:

  • Member variables defined in {} are stored in the bits property of the class. Follow bits –> data() –> ro() –> ivars to get the member variable list, which also includes the member variables generated for properties

  • Properties defined via @property are also stored in the bits property. Follow bits –> data() –> properties() –> list to get the property list, which contains only properties

[Problem] Explore the storage of class methods

From the exploration above we can see that the methods list contains only instance methods and no class methods. So the question is, where are class methods stored? Why does this happen? Let's analyze it in detail

In the first half of the article we introduced the metaclass: the isa of a class object points to its metaclass, and the metaclass is used to store class-related information. So we guess: are class methods stored in the bits of the metaclass? We can verify this guess with lldb commands. The following figure shows the lldb debugging process

By printing the metaclass method list in the figure, we can see that our guess is correct, so we can draw the following conclusions:

  • The instance methods of a class are stored in the bits property of the class. For example, the instance method sayHello of the CJLPerson class is stored in the bits property of the CJLPerson class. The method list of a class contains not only instance methods but also the set and get methods of its properties

  • Class methods are stored in the bits property of the metaclass. To get the list of class methods, follow metaclass bits –> methods() –> list. For example, the class method sayBye of CJLPerson is stored in the bits property of the metaclass of the CJLPerson class (which is also named CJLPerson)
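These two conclusions can be summarized in a toy model where each class keeps its own method names and reaches its metaclass through isa. This is a C++ sketch only: ToyClassInfo, lists_method, has_instance_method and has_class_method are hypothetical names, superclass traversal is left out, and the method names listed are the ones this article mentions rather than a full dump of the real list.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy class record: the method names it stores and its isa (class -> metaclass).
struct ToyClassInfo {
    std::vector<std::string> methods;
    const ToyClassInfo* isa;
};

bool lists_method(const ToyClassInfo& cls, const std::string& selector) {
    return std::find(cls.methods.begin(), cls.methods.end(), selector)
           != cls.methods.end();
}

// Instance methods are looked for on the class itself.
bool has_instance_method(const ToyClassInfo& cls, const std::string& selector) {
    return lists_method(cls, selector);
}

// Class methods are looked for on the metaclass, reached through isa.
bool has_class_method(const ToyClassInfo& cls, const std::string& selector) {
    return cls.isa != nullptr && lists_method(*cls.isa, selector);
}
```

Seen this way, a class method is simply an instance method of the metaclass: sayBye is absent from the class's own list and present in the list one isa hop away.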