What you will learn from this article

  • How is a category represented in the compiled .cpp file?
  • How is the category structure defined in the objc4 source code?
  • How are categories loaded?
  • When is rwe, the core factor in category loading, assigned?
  • Why does a class get an rwe during loading, and what does it do?
  • When are categories loaded?
  • Core source code analysis of attachCategories and attachLists
  • Diagram of the attachLists core algorithm
  • Experiments on category loading timing

Exploration overview

Structure of a category

Ways to explore

  • Compile the file to .cpp and inspect the result
  • Read the objc4 source code directly

Sample code (main.m)

@interface FFPerson (AA)

@property (nonatomic, copy) NSString * cate_name;
@property (nonatomic, assign) int cate_age;

- (void)cate_instanceMethod1;
- (void)cate_instanceMethod2;
+ (void)cate_classMethod3;
@end

@implementation FFPerson (AA)

- (void)cate_instanceMethod1 {
    NSLog(@"%s",__func__);
}
- (void)cate_instanceMethod2 {
    NSLog(@"%s",__func__);
}
+ (void)cate_classMethod3 {
    NSLog(@"%s",__func__);
}

@end


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"Hello, World!");
    }
}

Compile main.m to main.cpp

Terminal command

clang -rewrite-objc main.m -o main.cpp

main.cpp

Excerpt of the category-related code in the .cpp file

static struct _category_t *L_OBJC_LABEL_CATEGORY_$ [1] __attribute__((used, section ("__DATA, __objc_catlist,regular,no_dead_strip")))= {
	&_OBJC_$_CATEGORY_FFPerson_$_AA,
};
static struct _category_t _OBJC_$_CATEGORY_FFPerson_$_AA __attribute__ ((used, section ("__DATA,__objc_const"))) = 
{
	"FFPerson",
	0, // &OBJC_CLASS_$_FFPerson,
	(const struct _method_list_t *)&_OBJC_$_CATEGORY_INSTANCE_METHODS_FFPerson_$_AA,
	(const struct _method_list_t *)&_OBJC_$_CATEGORY_CLASS_METHODS_FFPerson_$_AA,
	0,
	(const struct _prop_list_t *)&_OBJC_$_PROP_LIST_FFPerson_$_AA,
};
static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[2];
} _OBJC_$_CATEGORY_INSTANCE_METHODS_FFPerson_$_AA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	2,
	{{(struct objc_selector *)"cate_instanceMethod1", "v16@0:8", (void *)_I_FFPerson_AA_cate_instanceMethod1},
	{(struct objc_selector *)"cate_instanceMethod2", "v16@0:8", (void *)_I_FFPerson_AA_cate_instanceMethod2}}
};

static struct /*_method_list_t*/ {
	unsigned int entsize;  // sizeof(struct _objc_method)
	unsigned int method_count;
	struct _objc_method method_list[1];
} _OBJC_$_CATEGORY_CLASS_METHODS_FFPerson_$_AA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_objc_method),
	1,
	{{(struct objc_selector *)"cate_classMethod3", "v16@0:8", (void *)_C_FFPerson_AA_cate_classMethod3}}
};

static struct /*_prop_list_t*/ {
	unsigned int entsize;  // sizeof(struct _prop_t)
	unsigned int count_of_properties;
	struct _prop_t prop_list[2];
} _OBJC_$_PROP_LIST_FFPerson_$_AA __attribute__ ((used, section ("__DATA,__objc_const"))) = {
	sizeof(_prop_t),
	2,
	{{"cate_name","T@\"NSString\",C,N"},
	{"cate_age","Ti,N"}}
};

category_t

struct _category_t {
	const char *name;
	struct _class_t *cls;
	const struct _method_list_t *instance_methods;
	const struct _method_list_t *class_methods;
	const struct _protocol_list_t *protocols;
	const struct _prop_list_t *properties;
};

category_t in the objc4 source code

struct category_t {
    const char *name;
    classref_t cls;
    WrappedPtr<method_list_t, PtrauthStrip> instanceMethods;
    WrappedPtr<method_list_t, PtrauthStrip> classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;
    // Fields below this point are not always present on disk.
    struct property_list_t* _classProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta, struct header_info *hi);
    
    protocol_list_t *protocolsForMeta(bool isMeta) {
        if (isMeta) return nullptr;
        else return protocols;
    }
};

Comparing the compiled intermediate .cpp code with the objc4 source leads to the following conclusions:

  1. name holds the category name.
  2. A category keeps instance methods and class methods in separate lists because it has no metaclass of its own: instance methods are attached to the class, and class methods are attached to the metaclass.
  3. Properties declared in a category do not get automatically generated getters and setters. This can be seen in the .cpp file, where the method lists contain no accessors.

How are categories loaded

The starting point is methodizeClass, where rwe is read from rw through ext():

static void methodizeClass(Class cls, Class previously)
{
    // ... (excerpt)
    auto rwe = rw->ext();
    // ...
}

class_rw_ext_t *ext() const {
    return get_ro_or_rwe().dyn_cast<class_rw_ext_t *>(&ro_or_rw_ext);
}

When is rwe assigned

The key function:

class_rw_ext_t *extAllocIfNeeded() {
    auto v = get_ro_or_rwe();
    if (fastpath(v.is<class_rw_ext_t *>())) {
        return v.get<class_rw_ext_t *>(&ro_or_rw_ext);
    } else {
        return extAlloc(v.get<const class_ro_t *>(&ro_or_rw_ext));
    }
}

Searching for the callers of this function gives:

  • static void attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count, int flags)
  • static void addMethods_finish(Class cls, method_list_t *newlist)
  • BOOL class_addProtocol(Class cls, Protocol *protocol_gen)
  • static bool _class_addProperty(Class cls, const char *name, const objc_property_attribute_t *attrs, unsigned int count, bool replace)
  • Class objc_duplicateClass(Class original, const char *name, size_t extraBytes)

extAllocIfNeeded is triggered when categories are attached, when methods, protocols, or properties are added to a class, and when a class is duplicated. In other words, rwe is only allocated when a class needs to be modified dynamically at runtime.
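The allocate-on-first-mutation idea can be sketched in a few lines of C++. This is a simplified model, not objc4 code: ReadOnlyData, ExtData, ReadWriteData, and addMethod are hypothetical names, and copying the base methods into the extension is an assumption made to keep the sketch small.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for class_ro_t: compile-time data that is never modified.
struct ReadOnlyData {
    std::vector<std::string> baseMethods;
};

// Hypothetical stand-in for class_rw_ext_t: data that changes at runtime.
struct ExtData {
    const ReadOnlyData* ro;
    std::vector<std::string> methods;  // assumption: starts as a copy of the base methods
};

class ReadWriteData {
    // Holds either the read-only pointer or the lazily created extension.
    std::variant<const ReadOnlyData*, std::unique_ptr<ExtData>> roOrExt_;

public:
    explicit ReadWriteData(const ReadOnlyData* ro) : roOrExt_(ro) {}

    // Mirrors ext(): returns nullptr until something has forced the allocation.
    ExtData* ext() const {
        auto p = std::get_if<std::unique_ptr<ExtData>>(&roOrExt_);
        return p ? p->get() : nullptr;
    }

    // Mirrors extAllocIfNeeded(): allocate the extension on first use only.
    ExtData* extAllocIfNeeded() {
        if (ExtData* existing = ext()) return existing;
        const ReadOnlyData* ro = std::get<const ReadOnlyData*>(roOrExt_);
        assert(ro != nullptr);
        auto created = std::make_unique<ExtData>(ExtData{ro, ro->baseMethods});
        ExtData* raw = created.get();
        roOrExt_ = std::move(created);
        return raw;
    }

    // Any runtime mutation goes through extAllocIfNeeded().
    void addMethod(const std::string& name) {
        extAllocIfNeeded()->methods.push_back(name);
    }
};
```

A class that is never modified at runtime keeps ext() == nullptr and pays nothing for the extension; the first call to addMethod creates it, and later calls reuse it.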

Why rwe

Once rw->set_ro(ro) and cls->setData(rw) have run, the class itself is already fully loaded. rwe exists because rw lives in dirty memory and is dynamic, which makes rw expensive to use. This also explains why Ben said that rw is very expensive: the cost is memory, and memory is money. That is why the split matters: the dynamic parts of rw are moved out into rwe.

Before the split:

After the split:

So the main thing to explore in category loading is rwe.

When categories are loaded

For rwe to be assigned during category loading, the attachCategories function must be triggered.

  • So the question becomes: when is attachCategories called?
  • Who calls attachCategories?
  • What is the call order?

Global search for attachCategories

Path 1: attachToClass -> attachCategories

Path 2: load_categories_nolock -> attachCategories

Global search for attachToClass

The only caller of attachToClass is methodizeClass. At this point the trail leads back to the class loading flow and closes the loop, which should look familiar.

Key source code:

// Attach categories.
    if (previously) {
        if (isMeta) {
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_METACLASS);
        } else {
            // When a class relocates, categories with class methods
            // may be registered on the class itself rather than on
            // the metaclass. Tell attachToClass to look for those.
            objc::unattachedCategories.attachToClass(cls, previously,
                                                     ATTACH_CLASS_AND_METACLASS);
        }
    }
    objc::unattachedCategories.attachToClass(cls, cls,
                                             isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

According to the source, attachToClass has three call sites. Two of them are guarded by the previously variable, a parameter passed into methodizeClass. Tracing that parameter back up the call chain, the conclusion is that it is only used for objc4 internal debugging. That narrows the relevant call sites to one: objc::unattachedCategories.attachToClass(cls, cls, isMeta ? ATTACH_METACLASS : ATTACH_CLASS);

Core source code analysis

attachCategories

Attaches the method lists, properties, and protocols from categories to a class. The categories in cats_list are assumed to be already loaded and sorted by load order, with the oldest category first.

static void
attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    if (slowpath(PrintReplacedMethods)) {
        printReplacements(cls, cats_list, cats_count);
    }
    if (slowpath(PrintConnecting)) {
        _objc_inform("CLASS: attaching %d categories to%s class '%s'%s",
                     cats_count, (flags & ATTACH_EXISTING) ? " existing" : "",
                     cls->nameForLogging(), (flags & ATTACH_METACLASS) ? " (meta)" : "");
    }

    /*
     * Only a few classes have more than 64 categories during launch.
     * This uses a little stack, and avoids malloc.
     *
     * Categories must be added in the proper order, which is back
     * to front. To do that with the chunking, we iterate cats_list
     * from front to back, build up the local buffers backwards,
     * and call attachLists on the chunks. attachLists prepends the
     * lists, so the final result is in the expected order.
     */
    constexpr uint32_t ATTACH_BUFSIZ = 64;
    method_list_t   *mlists[ATTACH_BUFSIZ];
    property_list_t *proplists[ATTACH_BUFSIZ];
    protocol_list_t *protolists[ATTACH_BUFSIZ];

    uint32_t mcount = 0;
    uint32_t propcount = 0;
    uint32_t protocount = 0;
    bool fromBundle = NO;
    bool isMeta = (flags & ATTACH_METACLASS);
    auto rwe = cls->data()->extAllocIfNeeded();

    // Debug print inserted for this article: fires only for FFPerson (non-meta)
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        if (!isMeta) {
            printf("%s -FFPerson....\n", __func__);
        }
    }

    for (uint32_t i = 0; i < cats_count; i++) {
        auto& entry = cats_list[i];

        // Attach methods: the focus of this exploration
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            if (mcount == ATTACH_BUFSIZ) {
                // Sort the methods
                prepareMethodLists(cls, mlists, mcount, NO, fromBundle, __func__);
                // The core algorithm for method insertion
                rwe->methods.attachLists(mlists, mcount);
                mcount = 0;
            }
            mlists[ATTACH_BUFSIZ - ++mcount] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        // Attach properties: same pattern as methods
        property_list_t *proplist =
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            if (propcount == ATTACH_BUFSIZ) {
                rwe->properties.attachLists(proplists, propcount);
                propcount = 0;
            }
            proplists[ATTACH_BUFSIZ - ++propcount] = proplist;
        }

        // Attach protocols: same pattern as methods
        protocol_list_t *protolist = entry.cat->protocolsForMeta(isMeta);
        if (protolist) {
            if (protocount == ATTACH_BUFSIZ) {
                rwe->protocols.attachLists(protolists, protocount);
                protocount = 0;
            }
            protolists[ATTACH_BUFSIZ - ++protocount] = protolist;
        }
    }

    if (mcount > 0) {
        prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount,
                           NO, fromBundle, __func__);
        rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
        if (flags & ATTACH_EXISTING) {
            flushCaches(cls, __func__, [](Class c){
                // constant caches have been dealt with in prepareMethodLists
                // if the class still is constant here, it's fine to keep
                return !c->cache.isConstantOptimizedCache();
            });
        }
    }

    rwe->properties.attachLists(proplists + ATTACH_BUFSIZ - propcount, propcount);

    rwe->protocols.attachLists(protolists + ATTACH_BUFSIZ - protocount, protocount);
}
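The chunking described in the comment inside attachCategories is easier to follow with a toy model. The sketch below is not objc4 code: categories are plain integers, prependChunk is a hypothetical stand-in for attachLists, and the buffer holds 4 entries instead of 64. It walks the input from oldest to newest, fills the buffer from the back, and prepends each full chunk.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for attachLists: puts the new chunk in front of
// everything that is already attached.
static void prependChunk(std::vector<int>& attached, const int* chunk, uint32_t count) {
    attached.insert(attached.begin(), chunk, chunk + count);
}

// cats is ordered oldest first, as attachCategories assumes for cats_list.
// existing is whatever was attached before (for example the class's own list).
std::vector<int> attachInChunks(const std::vector<int>& cats, std::vector<int> existing) {
    constexpr uint32_t BUFSIZE = 4;  // objc4 uses 64; 4 keeps the example small
    int buffer[BUFSIZE];
    uint32_t count = 0;

    for (int cat : cats) {
        if (count == BUFSIZE) {
            // Buffer is full: flush it, then start filling from the back again.
            prependChunk(existing, buffer, count);
            count = 0;
        }
        // Fill the buffer backwards so that newer entries sit in front.
        buffer[BUFSIZE - ++count] = cat;
    }
    assert(count <= BUFSIZE);
    if (count > 0) {
        // Flush the partially filled tail of the buffer.
        prependChunk(existing, buffer + BUFSIZE - count, count);
    }
    return existing;
}
```

With one existing entry 0 and categories 1 through 10 (oldest first), the result is ordered 10, 9, ..., 1, 0: the newest category ends up first and the original list stays last.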

attachLists

There are three cases:

  • 0 lists -> 1 list
  • 1 list -> many lists
  • many lists -> many lists

void attachLists(List* const * addedLists, uint32_t addedCount) {
    if (addedCount == 0) return;

    if (hasArray()) {
        // many lists -> many lists
        uint32_t oldCount = array()->count;
        uint32_t newCount = oldCount + addedCount;
        array_t *newArray = (array_t *)malloc(array_t::byteSize(newCount));
        newArray->count = newCount;
        array()->count = newCount;

        for (int i = oldCount - 1; i >= 0; i--)
            newArray->lists[i + addedCount] = array()->lists[i];
        for (unsigned i = 0; i < addedCount; i++)
            newArray->lists[i] = addedLists[i];
        free(array());
        setArray(newArray);
        validate();
    }
    else if (!list  &&  addedCount == 1) {
        // 0 lists -> 1 list
        list = addedLists[0];
        validate();
    }
    else {
        // 1 list -> many lists
        Ptr<List> oldList = list;
        uint32_t oldCount = oldList ? 1 : 0;
        uint32_t newCount = oldCount + addedCount;
        setArray((array_t *)malloc(array_t::byteSize(newCount)));
        array()->count = newCount;
        if (oldList) array()->lists[addedCount] = oldList;
        for (unsigned i = 0; i < addedCount; i++)
            array()->lists[i] = addedLists[i];
        validate();
    }
}

LLDB debugging: 1 list -> many lists

Conclusion: in both the 1 list -> many lists case and the many lists -> many lists case, the array structure itself does not change. It is still a one-dimensional array of pointers. What differs is what the stored pointers point to: some point directly to method_t, others point to method_list_t.

Core algorithm diagram

Conclusion:

  • 0 lists -> 1 list: if the current methodList does not exist and addedCount is 1, then list = addedLists[0], and the current methodList holds only that one element.
  • 1 list -> many lists: if one methodList already exists, that list is treated as a single element and merged with the category's method lists. The category lists are stored starting at position 0 of the newly built array, and the old list comes last. Among the stored pointers, only the last one is a second-level pointer to a method_list_t; the others all point to method_t.
  • many lists -> many lists: if an array of method lists already exists, it is merged with the category's lists. The category lists are stored starting at position 0 of the new array, and the old lists are copied over by iterating from the end backwards. The result is still an array of pointers, only with more elements.
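The three cases can be modelled with a small C++ class. This is a simplified sketch rather than the objc4 implementation: MethodList, ListArray, and findOwner are hypothetical names, std::vector replaces the manual malloc and free, and lookup is reduced to a front-to-back scan so that the effect of prepending is visible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for method_list_t: an owner name plus its selectors.
struct MethodList {
    std::string owner;
    std::vector<std::string> selectors;
};

// Simplified model of the list container: empty, one list, or an array of lists.
class ListArray {
    const MethodList* single_ = nullptr;    // the "1 list" state
    std::vector<const MethodList*> array_;  // the "many lists" state

public:
    bool hasArray() const { return !array_.empty(); }

    void attachLists(const MethodList* const* added, uint32_t addedCount) {
        if (addedCount == 0) return;
        assert(added != nullptr);
        if (hasArray()) {
            // many lists -> many lists: new lists go in front, old ones shift back
            array_.insert(array_.begin(), added, added + addedCount);
        } else if (!single_ && addedCount == 1) {
            // 0 lists -> 1 list
            single_ = added[0];
        } else {
            // 1 list -> many lists: new lists first, the old single list last
            array_.assign(added, added + addedCount);
            if (single_) array_.push_back(single_);
            single_ = nullptr;
        }
    }

    // Scans front to back, so the first list containing the selector wins.
    const MethodList* findOwner(const std::string& sel) const {
        auto has = [&](const MethodList* l) {
            return std::find(l->selectors.begin(), l->selectors.end(), sel)
                   != l->selectors.end();
        };
        if (hasArray()) {
            for (const MethodList* l : array_) {
                if (has(l)) return l;
            }
            return nullptr;
        }
        return (single_ && has(single_)) ? single_ : nullptr;
    }
};
```

Attaching the class's own list first and a category list second leaves the category in front, so in this model a selector defined in both places resolves to the category's entry.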

Experiment on category loading timing

Experimental code

main.m

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"%s",__func__);
        FFPerson *objc = [FFPerson alloc];
        [objc likeGirls];
    }
}

FFPerson.h

@interface FFPerson : NSObject

- (void)likeFood;
- (void)likeLive;
- (void)likeSleep;
- (void)likeGirls;
+ (void)enjoyLife;

@end

FFPerson.m

@implementation FFPerson

+ (void)load {
    NSLog(@"%s",__func__);
}

- (void)likeFood {
    NSLog(@"%s",__func__);
}
- (void)likeLive {
    NSLog(@"%s",__func__);
}
- (void)likeSleep {
    NSLog(@"%s",__func__);
}
- (void)likeGirls{
    NSLog(@"%s",__func__);
}

+ (void)enjoyLife {
    NSLog(@"%s",__func__);
}

@end

FFPerson+BBLv.h

@interface FFPerson (BBLv)
- (void)likeGirls;
- (void)cate_bblv1;
- (void)cate_bblv2;
- (void)cate_bblv3;
- (void)cate_bblv4;
@end

FFPerson+BBLv.m

@implementation FFPerson (BBLv)
+ (void)load {
    NSLog(@"%s",__func__);
}

- (void)cate_bblv1 {
    NSLog(@"%s",__func__);
}
- (void)cate_bblv2 {
    NSLog(@"%s",__func__);
}
- (void)cate_bblv3 {
    NSLog(@"%s",__func__);
}
- (void)cate_bblv4 {
    NSLog(@"%s",__func__);
}

- (void)likeGirls{
    NSLog(@"%s",__func__);
}
@end

Target FFPerson and insert print statements into some key functions

_read_images

void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    // ... (excerpt: only the inserted print code is shown)
    const char *mangledName = cls->nonlazyMangledName();
    const char *bblvPersonName = "FFPerson";
    if (strcmp(mangledName, bblvPersonName) == 0) {
        printf("Realize non-lazy classes -- %s -- BBLv -- %s\n", __func__, mangledName);
    }
}

realizeClassWithoutSwift

static Class realizeClassWithoutSwift(Class cls, Class previously)
{
    auto ro = (const class_ro_t *)cls->data();
    auto isMeta = ro->flags & RO_META;
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        if (!isMeta) {
            printf("%s -FFPerson....\n", __func__);
        }
    }
}

methodizeClass

static void methodizeClass(Class cls, Class previously)
{
    bool isMeta = cls->isMetaClass();
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        if (!isMeta) {
            printf("%s -FFPerson....\n", __func__);
        }
    }
}

attachToClass

void attachToClass(Class cls, Class previously, int flags)
{
    bool isMeta = cls->isMetaClass();
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        if (!isMeta) {
            printf("%s -FFPerson....\n", __func__);
        }
    }
}

load_categories_nolock(operator())

static void load_categories_nolock(header_info *hi) 
{
    // ... (excerpt: the print sits inside the function's lambda, which is why __func__ shows up as operator())
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        printf("%s -FFPerson....\n", __func__);
    }
}

attachCategories

static void
attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                 int flags)
{
    bool isMeta = (flags & ATTACH_METACLASS);
    const char *mangledName = cls->nonlazyMangledName();
    if (strcmp(mangledName, "FFPerson") == 0)
    {
        if (!isMeta) {
            printf("%s -FFPerson....\n", __func__);
        }
    }
}

These print statements monitor the functions involved in class and category loading, including but not limited to _read_images, realizeClassWithoutSwift, methodizeClass, attachToClass, load_categories_nolock, and attachCategories. With that, all the preparation is done.

Validation phase

Whether a class implements the load method is the key factor in whether it is lazily loaded, so the tests vary which of the class and the category implement load.

Verification 1: both FFPerson and FFPerson+BBLv implement the load method

readClass -- BBLv -- FFPerson
Realize non-lazy classes -- _read_images -- BBLv -- FFPerson
realizeClassWithoutSwift -FFPerson....
methodizeClass -FFPerson....
attachToClass -FFPerson....
operator() -FFPerson....
attachCategories -FFPerson....
2021-07-28 16:03:39.456705+0800 KCObjcBuild[6195:211286] +[FFPerson load]
2021-07-28 16:03:39.457140+0800 KCObjcBuild[6195:211286] +[FFPerson(BBLv) load]
2021-07-28 16:03:39.457257+0800 KCObjcBuild[6195:211286] main

Conclusion 1: if both the class and the category implement load, the key function attachCategories is executed and the category is loaded non-lazily, that is, it is attached to the class before main runs.

Verification 2: FFPerson implements the load method, FFPerson+BBLv does not

readClass -- BBLv -- FFPerson
Realize non-lazy classes -- _read_images -- BBLv -- FFPerson
realizeClassWithoutSwift -FFPerson....
methodizeClass -FFPerson....
attachToClass -FFPerson....
2021-07-28 17:29:27.798125+0800 KCObjcBuild[6737:254965] +[FFPerson load]
2021-07-28 17:29:27.798714+0800 KCObjcBuild[6737:254965] main
2021-07-28 17:29:30.243474+0800 KCObjcBuild[6737:254965] -[FFPerson(BBLv) likeGirls]

LLDB debugging:

Conclusion 2: if the main class FFPerson implements load and the category FFPerson+BBLv does not, the loading mode is fixed at compile time, and the method lists are already merged at compile time: the category's methods and the main class's methods are both stored in the Mach-O file, so attachCategories is not called to load the category. If the category and the main class define a method with the same name, the category's method comes first.

Verification 3: FFPerson does not implement the load method, FFPerson+BBLv does

readClass -- BBLv -- FFPerson
Realize non-lazy classes -- _read_images -- BBLv -- FFPerson
realizeClassWithoutSwift -FFPerson....
methodizeClass -FFPerson....
attachToClass -FFPerson....
2021-07-28 18:10:19.204251+0800 KCObjcBuild[6923:272796] +[FFPerson(BBLv) load]
2021-07-28 18:10:19.204738+0800 KCObjcBuild[6923:272796] main

LLDB debugging:

Conclusion 3: if the main class FFPerson does not implement load and the category FFPerson+BBLv does, it is decided at compile time that the class is non-lazy. Here the main class is forced into non-lazy loading: because the category implements load, the main class has to follow along. The method list is also completed at compile time and is read directly from the Mach-O file.

Verification 4: neither FFPerson nor FFPerson+BBLv implements the load method

readClass -- BBLv -- FFPerson
2021-07-28 18:24:31.842952+0800 KCObjcBuild[7005:280092] main
realizeClassWithoutSwift -FFPerson....
methodizeClass -FFPerson....
attachToClass -FFPerson....
2021-07-28 18:24:42.963068+0800 KCObjcBuild[7005:280092] -[FFPerson(BBLv) likeGirls]

Conclusion 4: if neither FFPerson nor FFPerson+BBLv implements load, the class is lazily loaded, and all loading is deferred until after main starts, that is, until the class receives its first message.
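The four verification runs can be condensed into a small decision function. The sketch below only encodes what the logs above show for one class with one category; LoadOutcome and observedOutcome are hypothetical names, and the function is a summary of these experiments, not a general rule of the runtime.

```cpp
#include <cassert>

// Outcome observed in the four verification runs (one class, one category).
struct LoadOutcome {
    bool realizedBeforeMain;      // class realized in _read_images, before main
    bool attachCategoriesCalled;  // category attached at runtime via attachCategories
};

LoadOutcome observedOutcome(bool classHasLoad, bool categoryHasLoad) {
    LoadOutcome outcome{};
    // A load method on either side made the class non-lazy in the logs.
    outcome.realizedBeforeMain = classHasLoad || categoryHasLoad;
    // attachCategories only appeared when both sides implemented load.
    outcome.attachCategoriesCalled = classHasLoad && categoryHasLoad;
    return outcome;
}
```

In the three runs where attachCategories never appears, the category's methods were already present in the method list read from the Mach-O file.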