iOS low-level exploration series

  • iOS low-level exploration – alloc & init
  • iOS low-level exploration – calloc and isa
  • iOS low-level exploration – classes
  • iOS low-level exploration – cache_t
  • iOS low-level exploration – methods
  • iOS low-level exploration – message lookup
  • iOS low-level exploration – message forwarding
  • iOS low-level exploration – app loading
  • iOS low-level exploration – class loading
  • iOS low-level exploration – category loading
  • iOS low-level exploration – class extensions and associated objects
  • iOS low-level exploration – KVC
  • iOS low-level exploration – KVO

iOS gap-filling series

  • iOS gap-filling – PerformSelector
  • iOS gap-filling – threads
  • iOS gap-filling – RunLoop
  • iOS gap-filling – LLVM & Clang

We explored the principles of objects in iOS earlier, and there’s a famous saying in object-oriented programming:

Everything is an object

So where do objects come from? Anyone with a grounding in object-oriented programming knows that objects are created from classes, so today we will explore the underlying principles of classes.

1. What exactly is a class in iOS?

In most of our daily development, we derive the classes we need from the base class NSObject. So what does a class actually compile down to underneath OC?

Let’s create a new macOS console project and create an Animal class.

// Animal.h
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface Animal : NSObject

@end

NS_ASSUME_NONNULL_END

// Animal.m
#import "Animal.h"

@implementation Animal

@end

// main.m
#import <Foundation/Foundation.h>
#import "Animal.h"
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        Animal *animal = [[Animal alloc] init];
        NSLog(@"%p", animal);
    }
    return 0;
}

We execute the clang command at the terminal:

clang -rewrite-objc main.m -o main.cpp

This command rewrites our main.m as main.cpp. We open this file and search for Animal:

We found Animal in multiple places:

// 1
typedef struct objc_object Animal;

// 2
struct Animal_IMPL {
	struct NSObject_IMPL NSObject_IVARS;
};

// 3
objc_getClass("Animal")

We start with a global search for the first one, typedef struct objc_object, and find 843 results.

Stepping through them with the Command + G shortcut, we find the Class definition at line 7626:

typedef struct objc_class *Class;

From this line of code we can conclude that the Class type is, underneath, a pointer to a structure type called objc_class.

The definition of struct objc_class can be found in objc-runtime-new.h in the objc4 source:

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
};

At this point, the observant reader may notice that objc_object, which we encountered earlier when exploring the principles of objects, appears again, this time as the parent of objc_class. This brings us back to the classic saying that everything is an object: a class is itself an object.

From this, we can briefly summarize the definitions of classes and objects in C and OC respectively

C            OC
objc_object  NSObject
objc_class   NSObject (Class)

2. What is the structure of a class?

Through the exploration above, we already know that classes are objects in nature, and that the member variables, properties, methods, protocols and so on that we use in daily development all belong to classes. So can we guess that, in iOS, the class itself is what stores these things?

We can verify our guess by analyzing the source code.

From the objc_class definition in the previous section, we can pick out four members of Class:

  • the isa pointer
  • the superclass pointer
  • cache
  • bits

Note that the isa pointer is a hidden member here: it does not appear in the struct body (only as the commented-out Class ISA line) because objc_class inherits it from objc_object.

2.1 The isa pointer

The first is the isa pointer, which we have explored before. It is easy to understand that an object is associated with its class through isa when it is initialized. But why is there an isa in the class structure too? Those of you who read the previous article already know the answer: metaclasses. Just as an object needs isa to associate it with its class, a class needs isa to associate it with its metaclass.

2.2 The superclass pointer

As the name implies, the superclass pointer indicates which parent class the current class inherits from. In general, the root of a class's inheritance chain is NSObject. The parent of the root metaclass is also the NSObject class.
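The isa and superclass relationships from 2.1 and 2.2 can be sketched with plain C structs. This is only a toy model under our own assumptions: toy_class, toy_hierarchy, build_hierarchy, and root_of are hypothetical names, not the runtime's real types, and only the two pointers discussed here are modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for objc_class: only the two pointers discussed above. */
struct toy_class {
    struct toy_class *isa;        /* class -> metaclass, metaclass -> root metaclass */
    struct toy_class *superclass; /* NULL for the root class */
    const char *name;
};

/* Four nodes: NSObject, its metaclass, Person, and Person's metaclass. */
struct toy_hierarchy {
    struct toy_class root, root_meta, person, person_meta;
};

/* Wire the nodes the way a class derived from NSObject is wired. */
void build_hierarchy(struct toy_hierarchy *h) {
    h->root        = (struct toy_class){ &h->root_meta,   NULL,          "NSObject" };
    h->root_meta   = (struct toy_class){ &h->root_meta,   &h->root,      "NSObject (meta)" };
    h->person      = (struct toy_class){ &h->person_meta, &h->root,      "Person" };
    h->person_meta = (struct toy_class){ &h->root_meta,   &h->root_meta, "Person (meta)" };
}

/* Follow superclass pointers until the class that has no superclass. */
const struct toy_class *root_of(const struct toy_class *cls) {
    while (cls->superclass != NULL) cls = cls->superclass;
    return cls;
}

/* Usage: the assertions restate the claims made in 2.1 and 2.2. */
void check_hierarchy(void) {
    struct toy_hierarchy h;
    build_hierarchy(&h);
    assert(h.person.isa == &h.person_meta);       /* class -> metaclass via isa */
    assert(root_of(&h.person) == &h.root);        /* class chain ends at NSObject */
    assert(h.root_meta.superclass == &h.root);    /* root metaclass's parent is NSObject */
    assert(root_of(&h.person_meta) == &h.root);   /* so the metaclass chain ends there too */
}
```

Calling check_hierarchy() from a main function runs the checks. The point of the sketch is that both inheritance chains, the class one and the metaclass one, end at the same root.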

2.3 cache

The cache data structure is cache_t, which is defined as follows:

struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;
    // ... code omitted ...
};

What does the class's cache hold? Properties? Instance variables? Or methods? We can answer this question by reading the objc-cache.mm source file.

  • objc-cache.m
  • Method cache management
  • Cache flushing
  • Cache garbage collection
  • Cache instrumentation
  • Dedicated allocator for large caches

Above is the header comment of the objc-cache.mm source file. It mentions Method cache management, so the cache member is what caches methods. We have not explored methods in OC yet, so for now we only touch on them briefly and take the underlying principles for granted.

The methods we write in a class take the form SEL + IMP underneath. SEL is the method selector, and IMP is the concrete method implementation. We can compare this to the table of contents and the body of a book. When we look for an article, we first need to know its title (SEL), then check whether the table of contents lists that title. If it does, we turn to the corresponding page (IMP) and finally find the content we want. Of course, methods in iOS are a little more complicated than the book example, but this simple picture will do for now, and we will explore methods in depth later.
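The book analogy can be sketched in C as a small lookup table. This is a deliberately simplified model and not the runtime's implementation: toy_sel, toy_imp, toy_method, and lookup_imp are hypothetical names, selectors are compared as strings here (we assume nothing about how the real runtime compares them), and the implementations return a string only so the result is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplifications: a selector is a name, an IMP is a
   function pointer that returns a string. */
typedef const char *toy_sel;
typedef const char *(*toy_imp)(void);

struct toy_method {
    toy_sel name; /* the "title" listed in the table of contents */
    toy_imp imp;  /* the "page" where the content lives */
};

const char *say_hello_impl(void) { return "hello"; }
const char *say_happy_impl(void) { return "happy"; }

/* Linear search of a method table; NULL means the selector is not listed. */
toy_imp lookup_imp(const struct toy_method *methods, size_t count, toy_sel sel) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(methods[i].name, sel) == 0) return methods[i].imp;
    }
    return NULL;
}

/* Usage: find the implementation by name, then call through the pointer. */
void check_lookup(void) {
    const struct toy_method table[] = {
        { "sayHello", say_hello_impl },
        { "sayHappy", say_happy_impl },
    };
    toy_imp imp = lookup_imp(table, 2, "sayHello");
    assert(imp != NULL && strcmp(imp(), "hello") == 0);
    assert(lookup_imp(table, 2, "missing") == NULL);
}
```

A linear scan like this is the kind of repeated work that a method cache exists to avoid, which is why the cache member sits directly in the class structure.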

2.4 The bits member

The type of bits is class_data_bits_t, which is also a structure type. Reading the objc_class source code, we find bits used in many places, for example:

class_rw_t *data() { 
    return bits.data();
}

bool hasCustomRR() {
    return !bits.hasDefaultRR();
}

bool canAllocFast() {
    assert(!isFuture());
    return bits.canAllocFast();
}

The data() method of objc_class simply returns the result of the data() method of bits. Searching for data(), we find it behind features such as byte alignment, ARC, and metaclass checks, which indirectly indicates that bits is a large container: things related to memory management, C++ destructors, and so on are defined in it.

An important point to make here is that the return value of the data() method is a pointer to class_rw_t. We will focus on this later in this article.
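The comment on bits in the struct definition, "class_rw_t * plus custom rr/alloc flags", suggests a pointer and a few flags packed into one word. The sketch below shows that general technique. It rests on assumptions: the mask value is the one we believe 64-bit builds of objc4-756 use for FAST_DATA_MASK, the three low flag bits are treated as opaque, and pack_bits and data_from_bits are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: pointer mask as used on 64-bit targets in objc4-756;
   other versions and platforms may use a different value. */
#define TOY_DATA_MASK 0x00007ffffffffff8ULL

/* Assumption: flags occupy the low 3 bits, which an 8-byte-aligned
   pointer never uses. */
#define TOY_FLAG_BITS 0x7ULL

/* Pack an 8-byte-aligned pointer value and up to 3 flag bits into one word. */
uint64_t pack_bits(uint64_t data_ptr, uint64_t flags) {
    return (data_ptr & TOY_DATA_MASK) | (flags & TOY_FLAG_BITS);
}

/* Counterpart of bits.data(): mask the flags off to recover the pointer. */
uint64_t data_from_bits(uint64_t bits) {
    return bits & TOY_DATA_MASK;
}

/* Usage: the flags change the stored word, and masking restores the pointer. */
void check_bits(void) {
    uint64_t rw_address = 0x0000000100f43ac0ULL; /* made-up, 8-byte aligned */
    uint64_t bits = pack_bits(rw_address, 0x5);
    assert(bits != rw_address);
    assert(data_from_bits(bits) == rw_address);
}
```

This is why the raw value of bits read from memory is not necessarily the class_rw_t address itself, and why going through data() is the safer way to get the pointer.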

3. Where are a class's properties stored?

In the last section, we got a basic understanding of the class structure in OC, but we still don't know where the thing we deal with most often, properties, is stored. Next, we need to create a target inside the objc4-756 source project. Why not just use the macOS command-line project from above? Because we are going to print the internal information of some classes with LLDB, we need a target that depends on the objc4-756 source project. Again, we choose a macOS command-line tool as our target.

Next we create a new class Person and add some instance variables and properties.

// Person.h
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface Person : NSObject
{
    NSString *hobby;
}
@property (nonatomic, copy) NSString *nickName;
@end

NS_ASSUME_NONNULL_END

// main.m
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import "Person.h"
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        Person *p = [[Person alloc] init];
        Class pClass = object_getClass(p);
        NSLog(@"%p", p);
    }
    return 0;
}

We set a breakpoint on the NSLog statement in main.m and run the target we just created.

Once the target is running, we print the contents of pClass on the console:

3.1 Memory layout of a class

We need to explore by offsetting pointers. For the memory layout of a class, first look at the following table:

Class member  Size (bytes)
isa           8
superclass    8
cache         16

The first two sizes are easy to understand: isa and superclass are both structure pointers, and on a 64-bit architecture such as arm64 a structure pointer occupies 8 bytes. The third member, cache, needs to be broken down.

cache_t cache;

struct cache_t {
    struct bucket_t *_buckets; // 8
    mask_t _mask;              // 4
    mask_t _occupied;          // 4
};

typedef uint32_t mask_t;

As the code above shows, the cache member is a cache_t structure containing one 8-byte structure pointer and two 4-byte mask_t values, which adds up to 16 bytes. The first three members therefore occupy 8 + 8 + 16 = 32 bytes in total, so bits sits at offset 32 in decimal, which is 0x20 in hexadecimal.
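The arithmetic can be double-checked by letting the compiler compute the layout. The structs below are simplified mirrors of the ones quoted above: the toy_ names are hypothetical, pointers are untyped, bits is reduced to one 64-bit word, and the figures assume a target with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t mask_t;

/* Simplified mirror of cache_t as quoted above. */
struct toy_cache {
    void  *buckets;  /* 8 bytes on a 64-bit target */
    mask_t mask;     /* 4 bytes */
    mask_t occupied; /* 4 bytes */
};

/* Simplified mirror of objc_class with the hidden isa written out. */
struct toy_objc_class {
    void            *isa;        /* offset 0 */
    void            *superclass; /* offset 8 */
    struct toy_cache cache;      /* offset 16 */
    uint64_t         bits;       /* offset 32 = 0x20 */
};

/* Offset of bits from the start of the class, computed by the compiler. */
size_t bits_offset(void) {
    return offsetof(struct toy_objc_class, bits);
}

/* Usage: these checks only hold where pointers are 8 bytes wide. */
void check_layout(void) {
    assert(sizeof(void *) == 8);
    assert(sizeof(struct toy_cache) == 16);
    assert(bits_offset() == 0x20);
}
```

If a later runtime version changes the size of cache_t, the same check would show the new offset of bits, so the 0x20 figure should be treated as specific to the layout shown in this article.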

3.2 Exploring the bits member

We just printed the contents of the pClass object on the console. Let's sketch it in a simple diagram:

The memory address of the class's bits member is therefore the class's starting address, where isa lives, plus 0x20. That is:

0x1000021c8 + 0x20 = 0x1000021e8
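The same pointer offsetting can be mimicked in C on a block of memory we build ourselves. This is an illustration only: fake_class is made-up data standing in for a class, and bits_address and read_bits_word are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Address of bits = address of the class + 0x20
   (isa 8 + superclass 8 + cache 16). */
uint64_t bits_address(uint64_t class_address) {
    return class_address + 0x20;
}

/* Read the 8-byte word stored 0x20 bytes past base. The caller must make
   sure base points at 40 or more readable bytes. */
uint64_t read_bits_word(const void *base) {
    uint64_t word;
    memcpy(&word, (const unsigned char *)base + 0x20, sizeof word);
    return word;
}

/* Usage: five 8-byte words stand in for isa, superclass, the two halves
   of cache, and bits. */
void check_offset(void) {
    uint64_t fake_class[5] = { 0x11, 0x22, 0x33, 0x44, 0xabcdef };
    assert(bits_address(0x1000021c8ULL) == 0x1000021e8ULL);
    assert(read_bits_word(fake_class) == 0xabcdef);
}
```

The first assertion reproduces the address computed above; the second shows that stepping 0x20 bytes from the start of the block lands exactly on the fifth word.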

Let's try to print this address. Note that we need a cast here:

The print fails because our target is not linked against the libobjc.A.dylib dynamic library. We link it and rerun the project.

Then we repeat the process above:

This time it works. Recall that the objc_class source code contains:

class_rw_t *data() { 
    return bits.data();
}

We might as well print it:

It returns a pointer to class_rw_t. Let's search for class_rw_t in the objc4-756 source code:

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;
    // ... code omitted ...
};

Clearly, class_rw_t is also a structure type, with familiar things such as methods, properties, and protocols inside. Our first guess is that our property is stored in the properties member of class_rw_t. To verify this, we keep printing in LLDB:

Let's go ahead and print properties:

properties is empty. Is that a bug? No: we have overlooked a very important member, ro. Here is its definition:

struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;
    // ... code omitted ...
};

The type of ro is the class_ro_t structure, which contains members such as baseMethodList, baseProtocols, ivars, and baseProperties. These names line up with the methods, protocols, and properties members we just saw in class_rw_t.

Judging by the name, we guess that the property should be in baseProperties. Let's print it:

Bingo! Our nickName is found. What about our instance variable hobby? From the count of $8 we know it is definitely not in baseProperties. Judging by the name, we guess it should be in ivars.

The hobby instance variable is found too, but why is count 2? Let's print the second element:

The result is _nickName. This confirms that the compiler generates an instance variable prefixed with an underscore, _nickName, for the nickName property.

So far, we can draw the following conclusions:

class_ro_t is determined at compile time and stores the class's member variables, properties, methods, and protocols. class_rw_t holds the properties, methods, and protocols that can be extended at run time.

4. Where are a class's methods stored?

Having studied how the properties of a class are stored, let's look at its methods.

We start by adding an instance method sayHello and a class method sayHappy to our Person class.

// Person.h
- (void)sayHello;
+ (void)sayHappy;

// Person.m
- (void)sayHello
{
    NSLog(@"%s", __func__);
}

+ (void)sayHappy
{
    NSLog(@"%s", __func__);
}

Following the same approach as above, we print the contents of baseMethodList in class_ro_t:

sayHello is printed, which indicates that baseMethodList is where instance methods are stored. Let's go ahead and print the rest:

We can see that, besides our instance method sayHello, baseMethodList holds the getter and setter of the nickName property and a C++ destructor. But our class method sayHappy is not printed.

5. Where are class methods stored?

We have now located properties and instance methods, but how class methods are stored is still a puzzle, so let's use the Runtime API to test it.

// main.m
void testInstanceMethod_classToMetaclass(Class pClass){
    
    const char *className = class_getName(pClass);
    Class metaClass = objc_getMetaClass(className);
    
    Method method1 = class_getInstanceMethod(pClass, @selector(sayHello));
    Method method2 = class_getInstanceMethod(metaClass, @selector(sayHello));

    Method method3 = class_getInstanceMethod(pClass, @selector(sayHappy));
    Method method4 = class_getInstanceMethod(metaClass, @selector(sayHappy));
    
    NSLog(@"%p-%p-%p-%p",method1,method2,method3,method4);
    NSLog(@"%s",__func__);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        Person *p = [[Person alloc] init];
        Class pClass = object_getClass(p);
        
        testInstanceMethod_classToMetaclass(pClass);
        NSLog(@"%p", p);
    }
    return 0;
}

After running, the print result is as follows:

testInstanceMethod_classToMetaclass looks up an instance method and a class method in the class and in the metaclass respectively. From the printed result, we can tell:

  • For the class object, sayHello is an instance method and is stored in the class object's memory, not in the metaclass object. sayHappy is a class method and is stored in the metaclass object's memory, not in the class object.
  • For the metaclass object, sayHello is an instance method of the class object and has nothing to do with the metaclass; sayHappy is an instance method of the metaclass object, so it exists in the metaclass.

Let's run another test:

// main.m
void testClassMethod_classToMetaclass(Class pClass){
    
    const char *className = class_getName(pClass);
    Class metaClass = objc_getMetaClass(className);
    
    Method method1 = class_getClassMethod(pClass, @selector(sayHello));
    Method method2 = class_getClassMethod(metaClass, @selector(sayHello));

    Method method3 = class_getClassMethod(pClass, @selector(sayHappy));
    Method method4 = class_getClassMethod(metaClass, @selector(sayHappy));
    
    NSLog(@"%p-%p-%p-%p",method1,method2,method3,method4);
    NSLog(@"%s",__func__);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        Person *p = [[Person alloc] init];
        Class pClass = object_getClass(p);
        
        testClassMethod_classToMetaclass(pClass);
        NSLog(@"%p", p);
    }
    return 0;
}

After running, the print result is as follows:

From the result, we can see that for the class object, class_getClassMethod returns a value for sayHappy and nothing for sayHello. For the metaclass object, class_getClassMethod also returns a value for sayHappy and nothing for sayHello. The first point is easy to understand, but the second is a little confusing. Aren't class methods stored as instance methods of the metaclass? So how can class_getClassMethod find sayHappy when it is given the metaclass? Let's look at the source of class_getClassMethod:

Method class_getClassMethod(Class cls, SEL sel)
{
    if (!cls || !sel) return nil;

    return class_getInstanceMethod(cls->getMeta(), sel);
}

Class getMeta() {
    if (isMetaClass()) return (Class)this;
    else return this->ISA();
}

cls->getMeta() returns the class itself if it is already a metaclass; otherwise it returns the class's isa, which is its metaclass. That explains why sayHappy showed up in the last value printed above.
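The control flow of getMeta() and class_getClassMethod can be replayed in a small C model. Everything prefixed mini_ is hypothetical: a list of selector names stands in for the method list, and the lookup only inspects the class it is given, whereas we would expect the real class_getInstanceMethod to consult superclasses as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy class: a short list of names stands in for the method list. */
struct mini_class {
    struct mini_class *isa;
    bool is_meta;
    const char *methods[4]; /* NULL-terminated selector names */
};

/* Mirrors getMeta(): a metaclass returns itself, a class returns its isa. */
struct mini_class *mini_get_meta(struct mini_class *cls) {
    return cls->is_meta ? cls : cls->isa;
}

/* Stand-in for class_getInstanceMethod: true if cls itself lists sel. */
bool mini_has_instance_method(const struct mini_class *cls, const char *sel) {
    if (cls == NULL || sel == NULL) return false;
    for (size_t i = 0; i < 4 && cls->methods[i] != NULL; i++) {
        if (strcmp(cls->methods[i], sel) == 0) return true;
    }
    return false;
}

/* Mirrors class_getClassMethod: an instance-method lookup on getMeta(). */
bool mini_has_class_method(struct mini_class *cls, const char *sel) {
    if (cls == NULL || sel == NULL) return false;
    return mini_has_instance_method(mini_get_meta(cls), sel);
}

/* Usage: reproduces the pattern of hits and misses from both tests. */
void check_class_methods(void) {
    struct mini_class meta   = { NULL,  true,  { "sayHappy", NULL } };
    struct mini_class person = { &meta, false, { "sayHello", NULL } };
    /* Instance-method lookups find a selector only where it is stored. */
    assert(mini_has_instance_method(&person, "sayHello"));
    assert(!mini_has_instance_method(&person, "sayHappy"));
    assert(mini_has_instance_method(&meta, "sayHappy"));
    /* Class-method lookups give the same answer for class and metaclass. */
    assert(mini_has_class_method(&person, "sayHappy"));
    assert(mini_has_class_method(&meta, "sayHappy"));
    assert(!mini_has_class_method(&person, "sayHello"));
    assert(!mini_has_class_method(&meta, "sayHello"));
}
```

Because mini_get_meta maps both the class and its metaclass to the same metaclass, the two class-method lookups for sayHappy necessarily agree, which is the behavior observed in the second test.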

Besides the LLDB printing above, we can also verify through isa that class methods are stored in the metaclass:

  • Find the metaclass from the class object through isa
  • Print the baseMethodList of the metaclass

The specific steps are the same as before, so I will not repeat them.

6. When are classes and metaclasses created?

While exploring classes and metaclasses, we never made clear when they are created, so here is the conclusion first:

  • Classes and metaclasses are created at compile time; that is, the compiler has already created them before alloc is ever called.

So how do we prove it? There are two ways:

  • Print the pointers to the class and the metaclass in LLDB

  • After building the project, open the binary executable with MachOView and inspect it:

7. Summary

  • Classes and metaclasses are created at compile time. This can be checked by printing the pointers to the class and metaclass in LLDB, or by inspecting the binary executable with MachOView.
  • Everything is an object: classes are by nature objects.
  • The class_ro_t structure of a class stores the properties, member variables, methods, and protocols determined at compile time.
  • Instance methods are stored in classes.
  • Class methods are stored in metaclasses.

We’ve completed a low-level exploration of classes in iOS. Stay tuned for the next chapter, which takes a deeper look at class caching.