I. Analysis of the relationship between instance object, class and metaclass

1. Instance object, class, metaclass diagram analysis

You are probably familiar with the classic diagram above, which shows how instance objects, classes and metaclasses relate to one another. Let's analyze it, starting by creating an instance of a new FXPerson class:

FXPerson *person   = [FXPerson alloc];

The mask ISA_MASK is defined separately for arm64 and x86_64:

# if __arm64__
#   define ISA_MASK        0x0000000ffffffff8ULL
...
# elif __x86_64__
#   define ISA_MASK        0x00007ffffffffff8ULL

The expression isa & ISA_MASK yields the address of the class that isa points to.

Debugging with LLDB commands

x/4gx: reads memory starting at the given address and prints four 8-byte values in hexadecimal

p/x: prints a value in hexadecimal. p is short for expression, and po is short for expression -O (--object-description), which prints the result of the object's description method

(lldb) x/4gx person
0x6000000044a0: 0x001d800100003c31 0x0000000000000000
0x6000000044b0: 0x0000000000000000 0x00000000000007fb
(lldb) p/x 0x001d800100003c31 & 0x00007ffffffffff8ULL
(unsigned long long) $9 = 0x0000000100003c30
(lldb) po 0x0000000100003c30
FXPerson

(lldb) x/4gx 0x0000000100003c30
0x100003c30: 0x0000000100003c08 0x00007fff92740118
0x100003c40: 0x0000600002c30f00 0x0004801000000007
(lldb) p/x 0x0000000100003c08 & 0x00007ffffffffff8ULL
(unsigned long long) $11 = 0x0000000100003c08
(lldb) po 0x0000000100003c08
FXPerson

(lldb) x/4gx 0x0000000100003c08
0x100003c08: 0x00007fff927400f0 0x00007fff927400f0
0x100003c18: 0x0000600002c2c880 0x0003e03100000007
(lldb) p/x 0x00007fff927400f0 & 0x00007ffffffffff8ULL
(unsigned long long) $13 = 0x00007fff927400f0
(lldb) po 0x00007fff927400f0
NSObject

(lldb) x/4gx 0x00007fff927400f0
0x7fff927400f0: 0x00007fff927400f0 0x00007fff92740118
0x7fff92740100: 0x0000600003e10700 0x000ae0310000000f
(lldb) p/x 0x00007fff927400f0 & 0x00007ffffffffff8ULL
(unsigned long long) $15 = 0x00007fff927400f0
(lldb) po 0x00007fff927400f0
NSObject

(lldb) x/4gx teacher
0x600000010810: 0x001d800100003be1 0x0000000000000000
0x600000010820: 0x00007fff899ff560 0x0000000000000000
(lldb) p/x 0x001d800100003be1 & 0x00007ffffffffff8ULL
(unsigned long long) $25 = 0x0000000100003be0
(lldb) po 0x0000000100003be0
FXTeacher

(lldb) x/4gx 0x0000000100003be0
0x100003be0: 0x0000000100003bb8 0x0000000100003c30
0x100003bf0: 0x00007fff6afaa140 0x0000801000000000
(lldb) p/x 0x0000000100003bb8 & 0x00007ffffffffff8ULL
(unsigned long long) $27 = 0x0000000100003bb8
(lldb) po 0x0000000100003bb8
FXTeacher

(lldb) x/4gx 0x0000000100003bb8
0x100003bb8: 0x00007fff927400f0 0x0000000100003c08
0x100003bc8: 0x0000600002c04080 0x0003e03100000007
(lldb) p/x 0x00007fff927400f0 & 0x00007ffffffffff8ULL
(unsigned long long) $29 = 0x00007fff927400f0
(lldb) po 0x00007fff927400f0
NSObject

(lldb) x/4gx 0x00007fff927400f0
0x7fff927400f0: 0x00007fff927400f0 0x00007fff92740118
0x7fff92740100: 0x0000600003e10700 0x000ae0310000000f
(lldb) p/x 0x00007fff927400f0 & 0x00007ffffffffff8ULL
(unsigned long long) $15 = 0x00007fff927400f0
(lldb) po 0x00007fff927400f0
NSObject
From this output we can read off the following:
  • The isa of the person instance points to the FXPerson class
  • The isa of the FXPerson class object points to the FXPerson metaclass
  • The isa of the FXPerson metaclass points to the NSObject metaclass (the root metaclass, which po also prints as NSObject)
  • The isa of the NSObject metaclass points to itself
  • The isa of the teacher instance points to the FXTeacher class
  • The isa of the FXTeacher class object points to the FXTeacher metaclass
  • The isa of the FXTeacher metaclass points to the NSObject metaclass
  • The isa of the NSObject metaclass points to itself

The results obtained through LLDB debugging match the diagram above exactly, so the diagram is verified.

2. Description of metaclasses

A metaclass can be explained with the following points:

  • The isa of an instance points to its class. A class is itself an object, called a class object, and the class bits of its isa point to a metaclass defined by Apple

  • Metaclasses are provided by the system. They are defined and created by the compiler, and every class object belongs to its metaclass

  • A metaclass is the class of a class object. Each class has a unique metaclass, which stores the class methods of that class

  • A metaclass has no name of its own. Because it is tied to its class, it uses the same name as the class

II. Class structure definition

The command for generating a .cpp file was covered in the earlier analysis of the underlying structure of isa. In the generated .cpp file we can see the following line of code:

typedef struct objc_class *Class;

From this we can conclude that Class is a pointer to the objc_class structure. Next, we look up the objc_class structure in the objc source code.

typedef struct objc_class *Class;
typedef struct objc_object *id;

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() const {
        return bits.data();
    }
    ...
}

struct objc_object {
private:
    isa_t isa;

public:
    // ISA() assumes this is NOT a tagged pointer object
    Class ISA();
    ...
}

From the objc source we can see that Class is a pointer to objc_class, and that objc_class inherits from objc_object. objc_object has an isa member variable, so objc_class has one too: it is inherited from the parent structure objc_object. This is what makes a class an object in its own right, that is, a class object.

[Baidu interview question] What is the relationship between objc_object and objects?

All objects are built from the objc_object template.

At the Objective-C level every object derives from NSObject, but at the bottom layer each one is an objc_object struct (C/C++).

[Summary] The relationship between objc_object and objects is inheritance.

Conclusion

All objects, classes and metaclasses have an isa member.

All objects derive from objc_object.

In a nutshell, everything is an object and everything comes from objc_object. Two conclusions follow:

All objects created with objc_object as their template have an isa member.

All classes created with objc_class as their template have an isa member.

At the structural level, this can be understood as the link between the upper Objective-C layer and the underlying layer:

The underlying layer consists of templates defined as structs, such as objc_class and objc_object. The upper layer consists of types created from those templates, such as FXPerson.

The relationship between objc_object and objc_class is shown in the following diagram:

III. Memory offset

Before analyzing the properties and methods of a class, let's introduce the concept of memory offset, which makes the class structure discussed later easier to understand.

int c[4] = {1, 2, 3};
int *d = c; // define a pointer d that points to c
NSLog(@"%p - %p - %p - %p", &c, &c[0], &c[1], &c[2]);
NSLog(@"%p - %p - %p", d, d+1, d+2);

Result:
0x7ffeefbff4a0 - 0x7ffeefbff4a0 - 0x7ffeefbff4a4 - 0x7ffeefbff4a8
0x7ffeefbff4a0 - 0x7ffeefbff4a4 - 0x7ffeefbff4a8

The address of c is the same as the address of c[0], and d holds the first address of c. Through d+1 and d+2 we can also reach the corresponding elements of the array, so offsetting a pointer moves it to the next consecutive memory address.

IV. Class attribute and method analysis

1. objc_class analysis

Let's look at the definition of objc_class again:

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() const {
        return bits.data();
    }
    ...
}

1.1 The first member, Class ISA, is commented out, which means it is inherited from the parent structure. Looking inside objc_object, we can see the isa member, which takes 8 bytes.

struct objc_object {
private:
    isa_t isa;
    ...
}

1.2 The second member, Class superclass, is the parent class. It is a pointer, so it takes 8 bytes.

typedef struct objc_class *Class;

1.3 The third member, cache_t cache, is a structure of 16 bytes.

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;

    // How much the mask is shifted by.
    static constexpr uintptr_t maskShift = 48;
    ... // the rest are static variables
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
    // _maskAndBuckets stores the mask shift in the low 4 bits, and
    // the buckets pointer in the remainder of the value. The mask
    // shift is the value where (0xffff >> shift) produces the correct
    // mask. This is equal to 16 - log2(cache_size).
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;

    static constexpr uintptr_t maskBits = 4;
    ...
#else
#error Unknown cache mask storage type.
#endif

#if __LP64__
    uint16_t _flags;
#endif
    uint16_t _occupied;
    ... // methods and static variables do not count toward the structure size
}

Now we can see that only four member variables remain in the structure: _buckets, _mask, _flags and _occupied.

① _buckets: a pointer, 8 bytes

② _mask: uint32_t type, 4 bytes

#if __LP64__
typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits
#else
typedef uint16_t mask_t;
#endif
typedef unsigned int uint32_t;

③ _flags: uint16_t type, 2 bytes

typedef unsigned short uint16_t;

④ _occupied: uint16_t type, 2 bytes

Adding these up gives 8 + 4 + 2 + 2 = 16 bytes. For reference, the sizes in bytes of common C and OC types on 32-bit and 64-bit platforms are:

C type           OC type                                          32-bit  64-bit
bool             BOOL (64-bit)                                    1       1
signed char      (__signed char) int8_t, BOOL (32-bit)            1       1
unsigned char    Boolean                                          1       1
short            int16_t                                          2       2
unsigned short   unichar                                          2       2
int              int32_t, NSInteger (32-bit), boolean_t (32-bit)  4       4
unsigned int     boolean_t (64-bit), NSUInteger (32-bit)          4       4
long             NSInteger (64-bit)                               4       8
unsigned long    NSUInteger (64-bit)                              4       8
long long        int64_t                                          8       8
float            CGFloat (32-bit)                                 4       4
double           CGFloat (64-bit)                                 8       8

1.4 The fourth member, class_data_bits_t bits, is a structure. It has a method bits.data(), which returns a pointer of type class_rw_t. Looking at class_rw_t, we find the properties and methods we are looking for.

struct class_rw_t {
    ...
    const method_array_t methods() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->methods;
        } else {
            return method_array_t{v.get<const class_ro_t *>()->baseMethods()};
        }
    }

    const property_array_t properties() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->properties;
        } else {
            return property_array_t{v.get<const class_ro_t *>()->baseProperties};
        }
    }

    const protocol_array_t protocols() const {
        auto v = get_ro_or_rwe();
        if (v.is<class_rw_ext_t *>()) {
            return v.get<class_rw_ext_t *>()->protocols;
        } else {
            return protocol_array_t{v.get<const class_ro_t *>()->baseProtocols};
        }
    }
};

1.4.1 Structure analysis of class_data_bits_t

struct class_data_bits_t {
    friend objc_class;

    // Values are the FAST_ flags above.
    uintptr_t bits;

public:

    class_rw_t* data() const {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
    void setData(class_rw_t *newData)
    {
        ASSERT(!data()  ||  (newData->flags & (RW_REALIZING | RW_FUTURE)));
        // Set during realization or construction only. No locking needed.
        // Use a store-release fence because there may be concurrent
        // readers of data and data's contents.
        uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
        atomic_thread_fence(memory_order_release);
        bits = newBits;
    }
}

On 64-bit CPUs, bits 3 through 46 of bits (not bytes) store the pointer to class_rw_t. class_rw_t in turn stores flags, witness, firstSubclass, nextSiblingClass and a reference to class_rw_ext_t.

struct class_rw_ext_t {
    const class_ro_t *ro;
    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;
    char *demangledName;
    uint32_t version;
};

class_rw_ext_t stores a pointer to class_ro_t together with methods, properties and protocols. class_ro_t in turn stores baseMethodList (the method list), baseProperties (the property list), baseProtocols (the protocol list), the instance variables, the class name, the instance size and so on.

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
};

When the class is compiled, the binary class is represented on disk as follows:

First there is the class object itself, which contains the most frequently accessed information: pointers to the metaclass (isa), the superclass (superclass) and the method cache (cache). It also has a pointer to class_ro_t, a structure that holds more data, including the class name, methods, protocols, instance variables and other compile-time information. Here ro stands for read-only.

Classes are always laid out like this when they are first loaded from disk into memory, but they change once they are used:

To understand the changes that occur when a class is loaded by the Runtime, we need to know two concepts:

  • Clean Memory: a block of memory that does not change after loading. class_ro_t is Clean Memory because it is read-only.
  • Dirty Memory: a block of memory that changes at run time. Once a class is loaded, its class structure becomes Dirty Memory because the runtime writes new data into it. For example, we can add methods to a class dynamically through the Runtime.

Dirty Memory is much more expensive than Clean Memory, because it has to be kept around for as long as the process is running. Clean Memory, on the other hand, can be evicted to free up space: if it is needed again, the system can reload it from disk.

Dirty Memory is the reason this class data is split into two parts. For us, more Clean Memory is obviously better, because it can be reclaimed when memory is tight. By separating out the data that never changes, most of the class data can be kept as Clean Memory. How is that done? Before we get into the optimization, let's look at what a class looks like after it has been loaded.

When a class is first used, the Runtime allocates additional storage for it: a class_rw_t structure, which holds read/write data. This structure stores new information that is only generated at run time. For example, all classes are linked into a tree through the firstSubclass and nextSiblingClass pointers, which lets the Runtime traverse every class currently in use. But why does it also need a method list and a property list? Because they can change at run time. When a category is loaded, it can add new methods to the class, and programmers can also add them dynamically through the Runtime API.

class_ro_t is read-only and stores the information determined at compile time. class_rw_t is created at run time. It starts from the contents of class_ro_t and then adds the properties, methods, protocols and so on that are attached later. Because Objective-C is a dynamic language, methods and properties can be changed at run time, and categories can add new methods to a class without changing the class itself.

Because class_ro_t is read-only, these changes have to be tracked in class_rw_t, and that takes up quite a bit of memory. As it turns out, class_rw_t takes up more memory than class_ro_t: in Apple's measurements, class_rw_t structures took up about 30 MB across the system on an iPhone. How can this memory be reduced? Measurements on real devices showed that only about 10% of classes actually change dynamically (methods added at run time, categories loaded and so on). So Apple's engineers moved the dynamic parts into a separate structure called class_rw_ext_t, which cut the size of class_rw_t in half. The resulting structure looks something like this:

About 90% of classes never need this extended data, so most of their data can stay in Clean Memory. System-wide, Apple measured a saving of about 14 MB, memory that is now available for more productive use.

2. Class attribute and method analysis

Let's use LLDB to debug the following code:

@interface FXPerson : NSObject

@property (nonatomic,copy) NSString *name;     // XuPengfei
@property (nonatomic,copy) NSString *nickName; // FX
@property (nonatomic) int height; // 180

- (void)sayHello;

+ (void)sayBye;

@end

FXPerson *person = [[FXPerson alloc] init];
person.name     = @"XuPengfei";
person.nickName  = @"FX";
person.height    = 180;

2.1 Print the FXPerson class information

(lldb) x/4gx FXPerson.class
0x1000022b0: 0x0000000100002288 0x0000000100334140
0x1000022c0: 0x00000001038be0d0 0x0001802c00000007

2.2 As shown in "1. objc_class analysis", a class consists of Class ISA, Class superclass, cache_t cache and class_data_bits_t bits, and the first three take 8, 8 and 16 bytes respectively. So offsetting the first address of the class by 8 + 8 + 16 = 32 bytes gives the address of bits, of type class_data_bits_t: 0x1000022b0 + 0x20 (32 bytes) = 0x1000022d0.

(lldb) p (class_data_bits_t *)0x1000022d0
(class_data_bits_t *) $1 = 0x00000001000022d0

2.3 Next we print the bits.data() information.

(lldb) p $1->data()
(class_rw_t *) $2 = 0x00000001038bdd00
(lldb) p *$2
(class_rw_t) $3 = {
  flags = 2148007936
  witness = 1
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = 4294975648
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
}

Bits.data () is of class_rw_t type. Class_rw_t has member variables such as methods(), properties(), protocols(), etc. We can print the properties() information

(lldb) p $3.properties()
(const property_array_t) $4 = {
  list_array_tt<property_t, property_list_t> = {
     = {
      list = 0x0000000100002218
      arrayAndFlag = 4294976024
    }
  }
}
(lldb) p $4.list
(property_list_t *const) $5 = 0x0000000100002218
(lldb) p *$5
(property_list_t) $6 = {
  entsize_list_tt<property_t, property_list_t, 0> = {
    entsizeAndFlags = 16
    count = 3
    first = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
  }
}
(lldb) p $6.get(0)
(property_t) $7 = (name = "name", attributes = "T@\"NSString\",C,N,V_name")
(lldb) p $6.get(1)
(property_t) $8 = (name = "nickName", attributes = "T@\"NSString\",C,N,V_nickName")
(lldb) p $6.get(2)
(property_t) $9 = (name = "height", attributes = "Ti,N,V_height")
(lldb) p $6.get(3)
Assertion failed: (i < count), function get, file /Users/xxx/xxx, line 438.

2.5 Next we look at the methods() information

(lldb) p $3.methods()
(const method_array_t) $10 = {
  list_array_tt<method_t, method_list_t> = {
     = {
      list = 0x00000001000020e8
      arrayAndFlag = 4294975720
    }
  }
}
(lldb) p $10.list
(method_list_t *const) $11 = 0x00000001000020e8
(lldb) p *$11
(method_list_t) $12 = {
  entsize_list_tt<method_t, method_list_t, 3> = {
    entsizeAndFlags = 26
    count = 8
    first = {
      name = "sayHello"
      types = 0x0000000100000f6a "v16@0:8"
      imp = 0x0000000100000cb0 (KCObjc`-[FXPerson sayHello])
    }
  }
}
(lldb) p $12.get(0)
(method_t) $13 = {
  name = "sayHello"
  types = 0x0000000100000f6a "v16@0:8"
  imp = 0x0000000100000cb0 (KCObjc`-[FXPerson sayHello])
}
(lldb) p $12.get(1)
(method_t) $14 = {
  name = ".cxx_destruct"
  types = 0x0000000100000f6a "v16@0:8"
  imp = 0x0000000100000dc0 (KCObjc`-[FXPerson .cxx_destruct])
}
(lldb) p $12.get(2)
(method_t) $15 = {
  name = "name"
  types = 0x0000000100000f80 "@16@0:8"
  imp = 0x0000000100000cc0 (KCObjc`-[FXPerson name])
}
(lldb) p $12.get(3)
(method_t) $16 = {
  name = "height"
  types = 0x0000000100000f93 "i16@0:8"
  imp = 0x0000000100000d80 (KCObjc`-[FXPerson height])
}
(lldb) p $12.get(4)
(method_t) $17 = {
  name = "setName:"
  types = 0x0000000100000f88 "v24@0:8@16"
  imp = 0x0000000100000cf0 (KCObjc`-[FXPerson setName:])
}
(lldb) p $12.get(5)
(method_t) $18 = {
  name = "setHeight:"
  types = 0x0000000100000f9b "v20@0:8i16"
  imp = 0x0000000100000da0 (KCObjc`-[FXPerson setHeight:])
}
(lldb) p $12.get(6)
(method_t) $19 = {
  name = "setNickName:"
  types = 0x0000000100000f88 "v24@0:8@16"
  imp = 0x0000000100000d50 (KCObjc`-[FXPerson setNickName:])
}
(lldb) p $12.get(7)
(method_t) $20 = {
  name = "nickName"
  types = 0x0000000100000f80 "@16@0:8"
  imp = 0x0000000100000d20 (KCObjc`-[FXPerson ...
Assertion failed: (i < count), function get, file /Users/xxx/xxx

2.6 But we have not printed the class method yet. Why does the index go out of bounds before we reach it? The reason is simple: as mentioned earlier, each class has a unique metaclass that stores its class methods, and the isa of a class object points to that metaclass. So we take the first address of the metaclass (0x100002288, the isa of FXPerson printed in 2.1) and offset it by 8 + 8 + 16 = 32 bytes to get the bits of the metaclass, of type class_data_bits_t: 0x100002288 + 0x20 = 0x1000022a8.

(lldb) p (class_data_bits_t *)0x00000001000022a8
(class_data_bits_t *) $21 = 0x00000001000022a8

2.7 Next we print the bits.data() information for the metaclass.

(lldb) p $21->data()
(class_rw_t *) $22 = 0x00000001038bdce0
(lldb) p *$22
(class_rw_t) $23 = {
  flags = 2684878849
  witness = 1
  ro_or_rw_ext = {
    std::__1::atomic<unsigned long> = 4294975544
  }
  firstSubclass = nil
  nextSiblingClass = 0x00007fff88e04cd8
}

2.8 Next we look at the methods() information for metaclasses

(lldb) p $23.methods()
(const method_array_t) $24 = {
  list_array_tt<method_t, method_list_t> = {
     = {
      list = 0x0000000100002080
      arrayAndFlag = 4294975616
    }
  }
}
(lldb) p $24.list
(method_list_t *const) $25 = 0x0000000100002080
(lldb) p *$25
(method_list_t) $26 = {
  entsize_list_tt<method_t, method_list_t, 3> = {
    entsizeAndFlags = 26
    count = 1
    first = {
      name = "sayBye"
      types = 0x0000000100000f6a "v16@0:8"
      imp = 0x0000000100000ca0 (KCObjc`+[FXPerson sayBye])
    }
  }
}
(lldb) p $26.get(0)
(method_t) $27 = {
  name = "sayBye"
  types = 0x0000000100000f6a "v16@0:8"
  imp = 0x0000000100000ca0 (KCObjc`+[FXPerson sayBye])
}
(lldb) p $26.get(1)
Assertion failed: (i < count), function get, file /Users/xxx/xxx