Before introducing this process, let’s think about the following questions:

  • 1. What form do the classes we write in an app take, and how are they loaded into memory when the program runs?
  • 2. What exactly does a class contain, and when is all of it put together?
  • 3. Can the contents of a class be modified? Which parts, and why?

With these questions in mind, let’s talk about the loading of classes.

Starting point: void _objc_init(void), where it all begins

void environ_init(void) reads the environment variables that affect the runtime. Take a look at the source code

/***********************************************************************
* environ_init
* Read environment variables that affect the runtime.
* Also print environment variable help, if requested.
**********************************************************************/
void environ_init(void)
{
    if (issetugid()) {
        // All environment variables are silently ignored when setuid or setgid
        // This includes OBJC_HELP and OBJC_PRINT_OPTIONS themselves.
        return;
    }

    // Turn off autorelease LRU coalescing by default for apps linked against
    // older SDKs. LRU coalescing can reorder releases and certain older apps
    // are accidentally relying on the ordering.
    // rdar://problem/63886091
//    if (!dyld_program_sdk_at_least(dyld_fall_2020_os_versions))
//        DisableAutoreleaseCoalescingLRU = true;

    bool PrintHelp = false;
    bool PrintOptions = false;
    bool maybeMallocDebugging = false;

    // Scan environ[] directly instead of calling getenv() a lot.
    // This optimizes the case where none are set.
    for (char **p = *_NSGetEnviron(); *p != nil; p++) {
        if (0 == strncmp(*p, "Malloc", 6)  ||  0 == strncmp(*p, "DYLD", 4)  ||
            0 == strncmp(*p, "NSZombiesEnabled", 16))
        {
            maybeMallocDebugging = true;
        }

        if (0 != strncmp(*p, "OBJC_", 5)) continue;

        if (0 == strncmp(*p, "OBJC_HELP=", 10)) {
            PrintHelp = true;
            continue;
        }
        if (0 == strncmp(*p, "OBJC_PRINT_OPTIONS=", 19)) {
            PrintOptions = true;
            continue;
        }

        if (0 == strncmp(*p, "OBJC_DEBUG_POOL_DEPTH=", 22)) {
            SetPageCountWarning(*p + 22);
            continue;
        }

        const char *value = strchr(*p, '=');
        if (!*value) continue;
        value++;

        for (size_t i = 0; i < sizeof(Settings)/sizeof(Settings[0]); i++) {
            const option_t *opt = &Settings[i];
            if ((size_t)(value - *p) == 1+opt->envlen  &&
                0 == strncmp(*p, opt->env, opt->envlen))
            {
                *opt->var = (0 == strcmp(value, "YES"));
                break;
            }
        }
    }

    // Special case: enable some autorelease pool debugging
    // when some malloc debugging is enabled
    // and OBJC_DEBUG_POOL_ALLOCATION is not set to something other than NO.
    if (maybeMallocDebugging) {
        const char *insert = getenv("DYLD_INSERT_LIBRARIES");
        const char *zombie = getenv("NSZombiesEnabled");
        const char *pooldebug = getenv("OBJC_DEBUG_POOL_ALLOCATION");
        if ((getenv("MallocStackLogging")
             || getenv("MallocStackLoggingNoCompact")
             || (zombie && (*zombie == 'Y' || *zombie == 'y'))
             || (insert && strstr(insert, "libgmalloc")))
            &&
            (!pooldebug || 0 == strcmp(pooldebug, "YES")))
        {
            DebugPoolAllocation = true;
        }
    }

//    if (!os_feature_enabled_simple(objc4, preoptimizedCaches, true)) {
//        DisablePreoptCaches = true;
//    }

    // Print OBJC_HELP and OBJC_PRINT_OPTIONS output.
    if (PrintHelp  ||  PrintOptions) {
        if (PrintHelp) {
            _objc_inform("Objective-C runtime debugging. Set variable=YES to enable.");
            _objc_inform("OBJC_HELP: describe available environment variables");
            if (PrintOptions) {
                _objc_inform("OBJC_HELP is set");
            }
            _objc_inform("OBJC_PRINT_OPTIONS: list which options are set");
        }
        if (PrintOptions) {
            _objc_inform("OBJC_PRINT_OPTIONS is set");
        }

        for (size_t i = 0; i < sizeof(Settings)/sizeof(Settings[0]); i++) {
            const option_t *opt = &Settings[i];
            if (PrintHelp) _objc_inform("%s: %s", opt->env, opt->help);
            if (PrintOptions && *opt->var) _objc_inform("%s is set", opt->env);
        }
    }
}
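The core of the scan loop, matching one NAME=value entry against a table of known options, can be sketched in isolation. This is a minimal sketch rather than the runtime's code: the Option struct, the two-entry kOptions table and applyEntry are names invented for illustration, and the sketch checks the strchr result for null before using it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Flags that the options switch on or off (names chosen for this sketch).
static bool PrintImages = false;
static bool PrintLoadMethods = false;

// Hypothetical, cut-down option table standing in for the runtime's Settings[].
struct Option {
    bool *var;        // flag to write
    const char *env;  // variable name
};

static const Option kOptions[] = {
    { &PrintImages,      "OBJC_PRINT_IMAGES" },
    { &PrintLoadMethods, "OBJC_PRINT_LOAD_METHODS" },
};

// Apply one "NAME=value" entry. Returns true if NAME is a known option.
bool applyEntry(const char *entry) {
    assert(entry != nullptr);
    if (std::strncmp(entry, "OBJC_", 5) != 0) return false;  // cheap early exit
    const char *eq = std::strchr(entry, '=');
    if (eq == nullptr) return false;                          // no value part
    const std::size_t nameLen = static_cast<std::size_t>(eq - entry);
    for (const Option &opt : kOptions) {
        // The name must match exactly: same length and same bytes.
        if (nameLen == std::strlen(opt.env) &&
            std::strncmp(entry, opt.env, nameLen) == 0) {
            *opt.var = (std::strcmp(eq + 1, "YES") == 0);     // only "YES" enables
            return true;
        }
    }
    return false;
}
```

Comparing the name length before comparing bytes is what keeps a longer name with the same prefix from matching a shorter option; the quoted source does the same with envlen.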

As a small experiment, I took the two _objc_inform calls in the final loop out of their if conditions and ran the program. The result was a surprise: there turn out to be a great many options that can be printed

  • OBJC_PRINT_IMAGES: log image and library names as they are loaded
  • OBJC_PRINT_IMAGE_TIMES: measure duration of image loading steps
  • OBJC_PRINT_LOAD_METHODS: log calls to class and category +load methods
  • OBJC_PRINT_INITIALIZE_METHODS: log calls to class +initialize methods
  • OBJC_PRINT_RESOLVED_METHODS: log methods created by +resolveClassMethod: and +resolveInstanceMethod:
  • OBJC_PRINT_CLASS_SETUP: log progress of class and category setup
  • OBJC_PRINT_PROTOCOL_SETUP: log progress of protocol setup
  • OBJC_PRINT_IVAR_SETUP: log processing of non-fragile ivars
  • OBJC_PRINT_VTABLE_SETUP: log processing of class vtables
  • OBJC_PRINT_VTABLE_IMAGES: print vtable images showing overridden methods
  • OBJC_PRINT_CACHE_SETUP: log processing of method caches
  • OBJC_PRINT_FUTURE_CLASSES: log use of future classes for toll-free bridging
  • OBJC_PRINT_PREOPTIMIZATION: log preoptimization courtesy of dyld shared cache
  • OBJC_PRINT_CXX_CTORS: log calls to C++ ctors and dtors for instance variables
  • OBJC_PRINT_EXCEPTIONS: log exception handling
  • OBJC_PRINT_EXCEPTION_THROW: log backtrace of every objc_exception_throw()
  • OBJC_PRINT_ALT_HANDLERS: log processing of exception alt handlers
  • OBJC_PRINT_REPLACED_METHODS: log methods replaced by category implementations
  • OBJC_PRINT_DEPRECATION_WARNINGS: warn about calls to deprecated runtime functions
  • OBJC_PRINT_POOL_HIGHWATER: log high-water marks for autorelease pools
  • OBJC_PRINT_CUSTOM_CORE: log classes with custom core methods
  • OBJC_PRINT_CUSTOM_RR: log classes with custom retain/release methods
  • OBJC_PRINT_CUSTOM_AWZ: log classes with custom allocWithZone methods
  • OBJC_PRINT_RAW_ISA: log classes that require raw pointer isa fields
  • OBJC_DEBUG_UNLOAD: warn about poorly-behaving bundles when unloaded
  • OBJC_DEBUG_FRAGILE_SUPERCLASSES: warn about subclasses that may have been broken by subsequent changes to superclasses
  • OBJC_DEBUG_NIL_SYNC: warn about @synchronized(nil), which does no synchronization
  • OBJC_DEBUG_NONFRAGILE_IVARS: capriciously rearrange non-fragile ivars
  • OBJC_DEBUG_ALT_HANDLERS: record more info about bad alt handler use
  • OBJC_DEBUG_MISSING_POOLS: warn about autorelease with no pool in place, which may be a leak
  • OBJC_DEBUG_POOL_ALLOCATION: halt when autorelease pools are popped out of order, and allow heap debuggers to track autorelease pools
  • OBJC_DEBUG_DUPLICATE_CLASSES: halt when multiple classes with the same name are present
  • OBJC_DEBUG_DONT_CRASH: halt the process by exiting instead of crashing
  • OBJC_DEBUG_POOL_DEPTH: log fault when at least a set number of autorelease pages has been allocated
  • OBJC_DISABLE_VTABLES: disable vtable dispatch
  • OBJC_DISABLE_PREOPTIMIZATION: disable preoptimization courtesy of dyld shared cache
  • OBJC_DISABLE_TAGGED_POINTERS: disable tagged pointer optimization of NSNumber et al.
  • OBJC_DISABLE_TAG_OBFUSCATION: disable obfuscation of tagged pointers
  • OBJC_DISABLE_NONPOINTER_ISA: disable non-pointer isa fields
  • OBJC_DISABLE_INITIALIZE_FORK_SAFETY: disable safety checks for +initialize after fork
  • OBJC_DISABLE_FAULTS: disable os faults
  • OBJC_DISABLE_PREOPTIMIZED_CACHES: disable preoptimized caches
  • OBJC_DISABLE_AUTORELEASE_COALESCING: disable coalescing of autorelease pool pointers
  • OBJC_DISABLE_AUTORELEASE_COALESCING_LRU: disable coalescing of autorelease pool pointers using look back N strategy

void tls_init(void) prepares thread-local storage: it sets up the per-thread key together with the destructor that cleans up a thread's data

void tls_init(void)
{
#if SUPPORT_DIRECT_THREAD_KEYS
    pthread_key_init_np(TLS_DIRECT_KEY, &_objc_pthread_destroyspecific);
#else
    _objc_pthread_key = tls_create(&_objc_pthread_destroyspecific);
#endif
}

static_init() calls the C++ static constructors. These are libobjc's own, system-level constructors, not the ones in our app

/***********************************************************************
* static_init
* Run C++ static constructor functions.
* libc calls _objc_init() before dyld would call our static constructors,
* so we have to do it ourselves.
**********************************************************************/
static void static_init()
{
    size_t count;
    auto inits = getLibobjcInitializers(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        inits[i]();
    }
    auto offsets = getLibobjcInitializerOffsets(&_mh_dylib_header, &count);
    for (size_t i = 0; i < count; i++) {
        UnsignedInitializer init(offsets[i]);
        init();
    }
}
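The first loop above is simply "walk an array of function pointers and call each one". A minimal sketch of that pattern follows; the kInits table, initA and initB are made up for illustration and stand in for the list the runtime reads out of its Mach-O header.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Initializer = void (*)();

static std::vector<int> gOrder;   // records the call order, for illustration only

static void initA() { gOrder.push_back(1); }
static void initB() { gOrder.push_back(2); }

// Hypothetical stand-in for the initializer list stored in the image.
static const Initializer kInits[] = { &initA, &initB };

// Call every initializer once, in table order. Returns how many were run.
std::size_t runInitializers() {
    const std::size_t count = sizeof(kInits) / sizeof(kInits[0]);
    for (std::size_t i = 0; i < count; i++) {
        assert(kInits[i] != nullptr);
        kInits[i]();
    }
    return count;
}
```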

runtime_init() initializes two tables: unattachedCategories, which holds categories that have not yet been attached to their classes, and allocatedClasses, which holds the classes that have been allocated

void runtime_init(void)
{
    objc::unattachedCategories.init(32);
    objc::allocatedClasses.init();
}

void exception_init(void) initializes exception handling. Reading the code together with its comments, it registers a terminate callback: if the uncaught exception is an Objective-C object, it is handed to the registered uncaught-exception handler before the previous terminate handler runs. The registration itself is nothing more than assigning a C function pointer

/***********************************************************************
* exception_init
* Initialize libobjc's exception handling system.
* Called by map_images().
**********************************************************************/
void exception_init(void)
{
    old_terminate = std::set_terminate(&_objc_terminate);
}


/***********************************************************************
* _objc_terminate
* Custom std::terminate handler.
*
* The uncaught exception callback is implemented as a std::terminate handler.
* 1. Check if there's an active exception
* 2. If so, check if it's an Objective-C exception
* 3. If so, call our registered callback with the object.
* 4. Finally, call the previous terminate handler.
**********************************************************************/
static void (*old_terminate)(void) = nil;
static void _objc_terminate(void)
{
    if (PrintExceptions) {
        _objc_inform("EXCEPTIONS: terminating");
    }

    if (! __cxa_current_exception_type()) {
        // No current exception.
        (*old_terminate)();
    }

    else {
        // There is a current exception. Check if it's an objc exception.
        @try {
            __cxa_rethrow();
        } @catch (id e) {
            // It's an objc object. Call Foundation's handler, if any.
            (*uncaught_handler)((id)e);
            (*old_terminate)();
        } @catch (...) {
            // It's not an objc object. Continue to C++ terminate.
            (*old_terminate)();
        }
    }
}
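The chaining idea (install a handler, remember the previous one, call it at the end) can be shown with standard C++ alone. This is a sketch under assumptions: gPrevious plays the role of old_terminate, and myTerminate, installTerminateHandler and handlerIsInstalled are names invented here.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

static std::terminate_handler gPrevious = nullptr;  // plays the role of old_terminate

// Our handler: do our own work first, then defer to whoever was installed before.
static void myTerminate() {
    // ... inspect the current exception here ...
    if (gPrevious) gPrevious();
    std::abort();  // a terminate handler must not return
}

// Install the handler once and remember the previous one.
void installTerminateHandler() {
    if (std::get_terminate() == &myTerminate) return;  // already installed
    gPrevious = std::set_terminate(&myTerminate);
    assert(gPrevious != &myTerminate);                 // never chain to ourselves
}

bool handlerIsInstalled() {
    return std::get_terminate() == &myTerminate;
}
```

Saving the return value of std::set_terminate is what makes the chain possible; without it, installing a handler would silently discard whatever was registered earlier.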

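void cache_t::init() initializes the method cache. Where HAVE_TASK_RESTARTABLE_RANGES is defined, it registers the runtime's restartable ranges (objc_restartableRanges) with the kernel; otherwise it does nothing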

void cache_t::init()
{
#if HAVE_TASK_RESTARTABLE_RANGES
    mach_msg_type_number_t count = 0;
    kern_return_t kr;
    while (objc_restartableRanges[count].location) {
        count++;
    }
    kr = task_restartable_ranges_register(mach_task_self(),
                                          objc_restartableRanges, count);
    if (kr == KERN_SUCCESS) return;
    _objc_fatal("task_restartable_ranges_register failed (result 0x%x: %s)",
                kr, mach_error_string(kr));
#endif // HAVE_TASK_RESTARTABLE_RANGES

}

_imp_implementationWithBlock_init() initializes the trampoline machinery. Normally it does nothing, because everything is set up lazily, but certain processes need the trampolines dylib loaded earlier, so it is done here. Judging by the comments, this exists purely to work around bugs in specific programs. It seems even Apple's engineers need their share of special cases and patches; no program escapes them.

/// Initialize the trampoline machinery. Normally this does nothing, as
/// everything is initialized lazily, but for certain processes we eagerly load
/// the trampolines dylib.
void _imp_implementationWithBlock_init(void)
{
#if TARGET_OS_OSX
    // Eagerly load libobjc-trampolines.dylib in certain processes. Some
    // programs (most notably QtWebEngineProcess used by older versions of
    // embedded Chromium) enable a highly restrictive sandbox profile which
    // blocks access to that dylib. If anything calls
    // imp_implementationWithBlock (as AppKit has started doing) then we'll
    // crash trying to load it. Loading it here sets it up before the sandbox
    // profile is enabled and blocks it.
    //
    // This fixes EA Origin (rdar://problem/50813789)
    // and Steam (rdar://problem/55286131)
    if (__progname &&
        (strcmp(__progname, "QtWebEngineProcess") == 0 ||
         strcmp(__progname, "Steam Helper") == 0)) {
        Trampolines.Initialize();
    }
#endif
}

_dyld_objc_notify_register(&map_images, load_images, unmap_image) registers three callback functions with dyld; the implementation of this function lives in the dyld source. Why is this needed? Loading images is dyld's job, and the objc library cannot do it, yet the objc library needs to know when each image is loaded. So it registers C callbacks, asking dyld to call map_images after images have been mapped, and to call load_images and unmap_image at the appropriate times. Look at the comment on the declaration

//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded.  During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images.  During any later dlopen() call,
// dyld will also call the "mapped" function.  Dyld will call the "init" function when dyld would be called
// initializers in that image.  This is when objc calls any +load methods in that image.

//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    log_apis("_dyld_objc_notify_register(%p, %p, %p)\n", mapped, init, unmapped);

    gAllImages.setObjCNotifiers(mapped, init, unmapped);
}
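The registration pattern described in the comment, including the replay of images that were loaded before the callbacks were registered, can be sketched as follows. Everything here is hypothetical and far simpler than dyld: ImageLoader, the callback signatures and the string image names are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-simplified callback types; dyld's real signatures differ.
using MappedFn   = void (*)(const std::string &image);
using InitFn     = void (*)(const std::string &image);
using UnmappedFn = void (*)(const std::string &image);

class ImageLoader {
public:
    // Store the callbacks, then replay images loaded before registration.
    void registerNotifiers(MappedFn mapped, InitFn init, UnmappedFn unmapped) {
        assert(mapped && init && unmapped);
        mapped_ = mapped; init_ = init; unmapped_ = unmapped;
        for (const auto &image : loaded_) mapped_(image);
    }
    // Load an image and notify the registered client, if there is one.
    void load(const std::string &image) {
        loaded_.push_back(image);
        if (mapped_) mapped_(image);
        if (init_) init_(image);
    }
private:
    std::vector<std::string> loaded_;
    MappedFn mapped_ = nullptr;
    InitFn init_ = nullptr;
    UnmappedFn unmapped_ = nullptr;
};

static std::vector<std::string> gLog;  // records callback order, for illustration
static void onMapped(const std::string &i)   { gLog.push_back("map:" + i); }
static void onInit(const std::string &i)     { gLog.push_back("init:" + i); }
static void onUnmapped(const std::string &i) { gLog.push_back("unmap:" + i); }
```

The loader owns the loading; the client only learns about it through the callbacks, which is the division of labour between dyld and the objc library described above.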
Putting the steps together, here is _objc_init in full:

void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    runtime_init();
    exception_init();
#if __OBJC2__
    cache_t::init();
#endif
    _imp_implementationWithBlock_init();
    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
#if __OBJC2__
    didCallDyldNotifyRegister = true;
#endif
}

map_images: mapping all the image files loaded by dyld

The source and its comments show that map_images takes the ABI-specific runtime lock and then hands all the Mach-O images to the ABI-agnostic middle layer, map_images_nolock

/***********************************************************************
* map_images
* Process the given images which are being mapped in by dyld.
* Calls ABI-agnostic code after taking ABI-specific locks.
*
* Locking: write-locks runtimeLock
**********************************************************************/
void
map_images(unsigned count, const char * const paths[],
           const struct mach_header * const mhdrs[])
{
    mutex_locker_t lock(runtimeLock);
    return map_images_nolock(count, paths, mhdrs);
}

Now for the main flow of map_images_nolock, which essentially maps all the Mach-O images and reads them. The main functions it calls are:

  • preopt_init
  • _getObjcSelectorRefs
  • _read_images

The one directly related to class loading is _read_images

/***********************************************************************
* map_images_nolock
* Process the given images which are being mapped in by dyld.
* All class registration and fixups are performed (or deferred pending
* discovery of missing superclasses etc), and +load methods are called.
*
* info[] is in bottom-up order i.e. libobjc will be earlier in the
* array than any library that links to libobjc.
*
* Locking: loadMethodLock(old) or runtimeLock(new) acquired by map_images.
**********************************************************************/
#if __OBJC2__
#include "objc-file.h"
#else
#include "objc-file-old.h"
#endif

void
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    static bool firstTime = YES;
    header_info *hList[mhCount];
    uint32_t hCount;
    size_t selrefCount = 0;

    if (firstTime) {
        preopt_init();
    }

    hCount = 0;

    int totalClasses = 0;
    int unoptimizedTotalClasses = 0;
    {
        uint32_t i = mhCount;
        while (i--) {
            const headerType *mhdr = (const headerType *)mhdrs[i];

            auto hi = addHeader(mhdr, mhPaths[i], totalClasses, unoptimizedTotalClasses);
            if (!hi) {
                continue;
            }

            if (mhdr->filetype == MH_EXECUTE) {
#if __OBJC2__
                if (!hi->hasPreoptimizedSelectors()) {
                    size_t count;
                    _getObjc2SelectorRefs(hi, &count);
                    selrefCount += count;
                    _getObjc2MessageRefs(hi, &count);
                    selrefCount += count;
                }
#else
                _getObjcSelectorRefs(hi, &selrefCount);
#endif
            }

            hList[hCount++] = hi;
        }
    }

    if (firstTime) {
        sel_init(selrefCount);
        arr_init();
    }

    if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }

    firstTime = NO;

    // Call image load funcs after everything is set up.
    for (auto func : loadImageFuncs) {
        for (uint32_t i = 0; i < mhCount; i++) {
            func(mhdrs[i]);
        }
    }
}

Breaking down _read_images

doneOnce: work that is done only once

With the logging and error-checking logic stripped out, this block becomes much clearer. It does two things

  • It initializes the tagged pointer obfuscator with initializeTaggedPointerObfuscator. According to the comment, a random value is mixed into tagged pointers to make it harder for an attacker to forge a tagged-pointer ("small") object.
  • It creates the master class table, gdb_objc_realized_classes, so that later lookups and insertions are easy. This differs from earlier published versions of the source: the table of allocated classes used to be initialized here too, but that has moved earlier, into runtime_init.
    if (!doneOnce) {
        doneOnce = YES;
        launchTime = YES;

        initializeTaggedPointerObfuscator();

        int namedClassesSize =
            (isPreoptimized() ? unoptimizedTotalClasses : totalClasses) * 4 / 3;
        gdb_objc_realized_classes =
            NXCreateMapTable(NXStrValueMapPrototype, namedClassesSize);
    }
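The sizing line is plain integer arithmetic: the expected number of classes is scaled by 4/3 before the table is created. A small worked sketch, assuming the 4/3 factor corresponds to a table that is meant to run at about three-quarters full (the function name initialCapacity is made up here):

```cpp
#include <cassert>

// Initial capacity for a table expected to hold `expected` entries.
// Integer arithmetic, as in the snippet above: multiply first, then divide.
int initialCapacity(int expected) {
    assert(expected >= 0);
    return expected * 4 / 3;
}
```

For example, 300 expected classes give a capacity of 400, and 10 give 13, because the division truncates.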

Fixing up selector references. The source is below. It registers every @selector reference in an image with the runtime and rewrites any reference that does not already point at the canonical selector

    // Fix up @selector references
    static size_t UnfixedSelectors;
    {
        mutex_locker_t lock(selLock);
        for (EACH_HEADER) {
            if (hi->hasPreoptimizedSelectors()) continue;

            bool isBundle = hi->isBundle();
            SEL *sels = _getObjc2SelectorRefs(hi, &count);
            UnfixedSelectors += count;
            for (i = 0; i < count; i++) {
                const char *name = sel_cname(sels[i]);
                SEL sel = sel_registerNameNoLock(name, isBundle);
                if (sels[i] != sel) {
                    sels[i] = sel;
                }
            }
        }
    }
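The point of the fix-up is uniquing: after it runs, two references to the same selector name hold the same pointer, so selectors can be compared by address. A minimal sketch of that idea, with a hypothetical SelectorTable standing in for the runtime's selector table and plain C strings standing in for SEL:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical selector table: keeps one canonical copy of each name.
class SelectorTable {
public:
    // Return the canonical pointer for a name, inserting it on first use.
    const char *registerName(const char *name) {
        auto it = names_.insert(name).first;   // no-op if already present
        return it->c_str();
    }
private:
    std::unordered_set<std::string> names_;
};

// Rewrite each reference so that equal names share one pointer.
// Returns how many references had to be changed.
std::size_t fixupSelectorRefs(SelectorTable &table,
                              std::vector<const char *> &refs) {
    std::size_t changed = 0;
    for (const char *&ref : refs) {
        const char *canonical = table.registerName(ref);
        assert(std::string(canonical) == ref);   // same name, maybe other address
        if (ref != canonical) {
            ref = canonical;
            ++changed;
        }
    }
    return changed;
}
```

Running the fix-up a second time changes nothing, which mirrors the `if (sels[i] != sel)` check in the source.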

Discover classes and fix unresolved future classes.

  • First, check with mustReadClasses whether dyld has already optimized the image well enough that its classes do not need to be read right now; if so, skip the image.
  • Otherwise, read the class list from the image's Mach-O sections with _getObjc2ClassList and collect it in classlist.
  • Walk the list and pass each class to readClass, which adds it to the master class table. The logic of this function is broken down in detail later.
  • Handle the special case of future classes: when readClass returns a different, non-nil class, a future class has been resolved, so the resolvedFutureClasses array is grown with realloc and the new class is recorded in it.
    // Discover classes. Fix up unresolved future classes. Mark bundle classes.
    bool hasDyldRoots = dyld_shared_cache_some_image_overridden();

    for (EACH_HEADER) {
        if (!mustReadClasses(hi, hasDyldRoots)) {
            // Image is sufficiently optimized that we need not call readClass()
            continue;
        }

        classref_t const *classlist = _getObjc2ClassList(hi, &count);

        bool headerIsBundle = hi->isBundle();
        bool headerIsPreoptimized = hi->hasPreoptimizedClasses();

        for (i = 0; i < count; i++) {
            Class cls = (Class)classlist[i];
            Class newCls = readClass(cls, headerIsBundle, headerIsPreoptimized);

            if (newCls != cls  &&  newCls) {
                // Class was moved but not deleted. Currently this occurs
                // only when the new class resolved a future class.
                // Non-lazily realize the class below.
                resolvedFutureClasses = (Class *)
                    realloc(resolvedFutureClasses,
                            (resolvedFutureClassCount+1) * sizeof(Class));
                resolvedFutureClasses[resolvedFutureClassCount++] = newCls;
            }
        }
    }

Marking bundle classes. The headerIsBundle variable puzzled me for a while, until I found the place inside Class readClass(Class cls, bool headerIsBundle, bool headerIsPreoptimized) where it is used. The comment on that code is a note for future reference: the shared cache never contains MH_BUNDLE images, so classes that come from a bundle are flagged here.

// for future reference: shared cache never contains MH_BUNDLEs
    if (headerIsBundle) {
        cls->data()->flags |= RO_FROM_BUNDLE;
        cls->ISA()->data()->flags |= RO_FROM_BUNDLE;
    }

The comment on the RO_FROM_BUNDLE definition says that the flag marks a class living in an unloadable bundle and that the compiler must never set it. What does that mean? The bit is not baked in at build time; the runtime sets it itself, in readClass, once it sees that the image the class came from is a bundle.

// class is in an unloadable bundle - must never be set by compiler
#define RO_FROM_BUNDLE        (1<<29)
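Setting and testing a single flag bit, as readClass does for RO_FROM_BUNDLE, looks like this in isolation. ClassData, markIfFromBundle and isFromBundle are hypothetical names for this sketch; only the bit position is taken from the definition quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Same bit position as the RO_FROM_BUNDLE definition quoted above.
constexpr std::uint32_t kFromBundle = 1u << 29;
static_assert(kFromBundle == 0x20000000u, "bit 29");

// Hypothetical, minimal stand-in for a class's flags word.
struct ClassData {
    std::uint32_t flags = 0;
};

// Set the flag on the class and its metaclass, but only for bundle images.
void markIfFromBundle(ClassData &cls, ClassData &meta, bool headerIsBundle) {
    if (headerIsBundle) {
        cls.flags  |= kFromBundle;
        meta.flags |= kFromBundle;
    }
}

bool isFromBundle(const ClassData &cls) {
    return (cls.flags & kFromBundle) != 0;
}
```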

Fixing up old objc_msgSend_fixup call sites, where fix-up is supported (SUPPORT_FIXUP)

There is not much to say about this part: it only repairs call sites that were compiled for an older dispatch mechanism

#if SUPPORT_FIXUP
    // Fix up old objc_msgSend_fixup call sites
    for (EACH_HEADER) {
        message_ref_t *refs = _getObjc2MessageRefs(hi, &count);
        if (count == 0) continue;
        if (PrintVtables) {
            _objc_inform("VTABLES: repairing %zu unsupported vtable dispatch "
                         "call sites in %s", count, hi->fname());
        }
        for (i = 0; i < count; i++) {
            fixupMessageRef(refs+i);
        }
    }
#endif

Discovering protocols and fixing up protocol references

This part reads the protocols in each image

  • First it fetches the protocol hash table; at launch, any image whose protocols are preoptimized is skipped
  • Then it walks the image's protocol list and inserts each protocol into the hash table with readProtocol

This raises a question: aren't protocols needed at launch? According to Apple's notes, reading is skipped only for images that come from the shared cache, and only at launch. Protocols are certainly used at launch, but those ones are already set up in the shared cache. After launch the list still has to be walked, because a shared cache protocol is marked isCanonical(), and that may no longer hold if some binary outside the shared cache was chosen as the canonical definition.

    // Discover protocols. Fix up protocol refs.
    for (EACH_HEADER) {
        extern objc_class OBJC_CLASS_$_Protocol;
        Class cls = (Class)&OBJC_CLASS_$_Protocol;
        ASSERT(cls);
        NXMapTable *protocol_map = protocols();
        bool isPreoptimized = hi->hasPreoptimizedProtocols();

        if (launchTime && isPreoptimized) {
            if (PrintProtocols) {
                _objc_inform("PROTOCOLS: Skipping reading protocols in image: %s",
                             hi->fname());
            }
            continue;
        }

        bool isBundle = hi->isBundle();

        protocol_t * const *protolist = _getObjc2ProtocolList(hi, &count);
        for (i = 0; i < count; i++) {
            readProtocol(protolist[i], cls, protocol_map,
                         isPreoptimized, isBundle);
        }
    }

Fixing up @protocol references

The idea is similar to fixing up class references. At launch we know that the references in preoptimized images already point at the shared cache definition of each protocol, so the check can be skipped at launch. For shared cache images loaded later, however, the @protocol refs still have to be visited and remapped.

    for (EACH_HEADER) {
        // At launch time, we know preoptimized image refs are pointing at the
        // shared cache definition of a protocol.  We can skip the check on
        // launch, but have to visit @protocol refs for shared cache images
        // loaded later.
        if (launchTime && hi->isPreoptimized())
            continue;
        protocol_t **protolist = _getObjc2ProtocolRefs(hi, &count);
        for (i = 0; i < count; i++) {
            remapProtocolRef(&protolist[i]);
        }
    }

Discovering categories

This is where categories are discovered and loaded. According to the comments in Apple's source, this is done only after the initial category attachment has taken place: for categories present at startup, discovery is deferred until the first load_images call after the _dyld_objc_notify_register call completes. Whether the loop runs is controlled by the global static variable didInitialAttachCategories. As for the timing: category discovery has to come late to avoid potential races when other threads call the new category code before this thread has finished its fix-ups.

    if (didInitialAttachCategories) {
        for (EACH_HEADER) {
            load_categories_nolock(hi);
        }
    }

Realizing non-lazy classes (classes that implement +load, and static instances)

This loops over the non-lazy classes, makes sure each one has been added to the class table, and then realizes it. Swift classes get one extra check first.

    for (EACH_HEADER) {
        classref_t const *classlist = hi->nlclslist(&count);
        for (i = 0; i < count; i++) {
            Class cls = remapClass(classlist[i]);
            if (!cls) continue;

            addClassTableEntry(cls);
            if (cls->isSwiftStable()) {
                if (cls->swiftMetadataInitializer()) {
                    _objc_fatal("Swift class %s with a metadata initializer "
                                "is not allowed to be non-lazy",
                                cls->nameForLogging());
                }
                // fixme also disallow relocatable classes
                // We can't disallow all Swift classes because of
                // classes like Swift.__EmptyArrayStorage
            }
            realizeClassWithoutSwift(cls, nil);
        }
    }

Walk the non-lazy class list and make sure every non-lazy class is in the class table

This is where the loop that realizes the non-lazy classes begins

Check whether the class is a Swift class

A fatal error is reported if a stable Swift class also has a metadata initializer

Realize every non-lazy class with realizeClassWithoutSwift

realizeClassWithoutSwift realizes a non-Swift class. It is the key function in class realization: it sets up rw and ro, recursively realizes the superclass and metaclass, links the class to its superclass and subclasses, and sets the other related flag bits. It is the most important data-processing step in a class's life. The counterpart for Swift classes is _objc_realizeClassFromSwift

Check whether the class is already realized

If the class has already been realized, it is returned directly; otherwise processing continues

Determine whether the class is a future class
  • If it is a future class, rw has already been allocated, so rw and ro are taken directly from data()
  • If it is not a future class, a writable rw structure is allocated and pointed at the ro data.
  • If the superclass and metaclass are not yet realized, they are realized recursively.
  • Where SUPPORT_NONPOINTER_ISA is enabled: a metaclass is marked as requiring a raw isa (FAST_CACHE_REQUIRES_RAW_ISA); for any other class, instancesRequireRawIsa() is consulted together with the superclass (via supercls->getSuperclass()) to decide whether the raw-isa requirement has to be set, recursively for subclasses

Keep the superclass and metaclass up to date

Update the superclass and metaclass pointers in case they were remapped

Adjust the memory offsets of instance variables

Reconcile the instance variable offsets. The superclass's ivars may have changed, so the offsets of this class's instance variables may need to be shifted to make room for them

Set the instance size used for fast allocation

Set the fast instance size if it has not been set already. The point is to make instance allocation more efficient

Copy some flag bits from ro to rw

Copy some flag bits, such as the C++ constructor and destructor flags, from ro to rw

Propagate the flag that forbids associated objects

Copy the associated-objects-forbidden flag from ro or from the superclass

Add the current class to its superclass's subclass list

addSubclass(supercls, cls) adds the current class to its superclass's list of subclasses. If there is no superclass, the class is a root class and is added to the root class list instead.

Attaching categories

methodizeClass(cls, previously); this is where the categories of non-lazy classes are attached
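The overall shape of the steps above (return early if realized, realize the superclass first, create rw on top of ro, link into the superclass's subclass list) can be sketched with a toy model. All types and names here are invented for illustration; metaclasses, flags, ivar layout and categories are left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in for class_ro_t: fixed at build time, never modified.
struct ReadOnlyData {
    std::string name;
    std::vector<std::string> baseMethods;
};

// Stand-in for class_rw_t: created at run time, free to change.
struct ReadWriteData {
    const ReadOnlyData *ro = nullptr;
    std::vector<std::string> methods;   // starts as a copy of the base methods
};

// Hypothetical, heavily simplified class record.
struct ClassModel {
    explicit ClassModel(const ReadOnlyData *r, ClassModel *s = nullptr)
        : ro(r), superclass(s) {}
    const ReadOnlyData *ro;
    ClassModel *superclass;
    std::unique_ptr<ReadWriteData> rw;      // empty until the class is realized
    std::vector<ClassModel *> subclasses;
    bool isRealized() const { return rw != nullptr; }
};

// Realize a class: superclass first, then rw, then the subclass link.
ClassModel *realize(ClassModel *cls) {
    if (cls == nullptr) return nullptr;
    if (cls->isRealized()) return cls;            // already done
    ClassModel *super = realize(cls->superclass); // recurse up the hierarchy
    cls->rw.reset(new ReadWriteData);
    cls->rw->ro = cls->ro;                        // rw points at the ro data
    cls->rw->methods = cls->ro->baseMethods;
    if (super != nullptr) super->subclasses.push_back(cls);
    assert(cls->isRealized());
    return cls;
}
```

Realizing a subclass therefore realizes its whole superclass chain as a side effect, and calling realize again on the same class is a no-op.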

Further thoughts

That covers the non-lazy classes, but what about the lazy ones? Are they never handled? What triggers the loading of a lazy class? Lazy loading means a class is loaded only when it is first used, and using a class means calling one of its methods, which must go through lookUpImpOrForward. Inside that function we find realizeAndInitializeIfNeeded_locked, and following it further down we find this

    if (slowpath(!cls->isRealized())) {
        cls = realizeClassMaybeSwiftAndLeaveLocked(cls, runtimeLock);
        // runtimeLock may have been dropped but is now locked again
    }

From this it is clear that the next step loads the class. Passing through one intermediate layer, realizeClassMaybeSwiftMaybeRelock, we arrive once again at the familiar realizeClassWithoutSwift

    if (!cls->isSwiftStable_ButAllowLegacyForNow()) {
        // Non-Swift class. Realize it now with the lock still held.
        // fixme wrong in the future for objc subclasses of swift classes
        realizeClassWithoutSwift(cls, nil);
        if (!leaveLocked) lock.unlock();
    } else {
        // Swift class. We need to drop locks and call the Swift
        // runtime to initialize it.
        lock.unlock();
        cls = realizeSwiftClass(cls);
        ASSERT(cls->isRealized());    // callback must have provoked realization
        if (leaveLocked) lock.lock();
    }
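The lazy path boils down to "realize on first lookup, then search". A minimal sketch of that idea, with invented names (LazyClass, realizeLazily, lookUpImp) and no locking or caching:

```cpp
#include <cassert>
#include <map>
#include <string>

using Imp = int (*)();

// Hypothetical, simplified class record.
struct LazyClass {
    bool realized = false;
    int realizeCount = 0;                // how many times realization ran
    std::map<std::string, Imp> methods;  // filled in when the class is realized
};

static int answer() { return 42; }

// One-time setup, standing in for the realization step.
static void realizeLazily(LazyClass &cls) {
    cls.methods["answer"] = &answer;
    cls.realized = true;
    ++cls.realizeCount;
}

// Slow-path lookup: realize on first use, then search the method table.
Imp lookUpImp(LazyClass &cls, const std::string &selector) {
    if (!cls.realized) realizeLazily(cls);
    assert(cls.realized);
    auto it = cls.methods.find(selector);
    return it == cls.methods.end() ? nullptr : it->second;
}
```

A class that is never messaged never pays the realization cost, which is the whole point of leaving most classes lazy.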

Realizing newly resolved future classes, in case CF manipulates them

Iterate over all the resolved future classes

  • Check whether the class is a stable Swift class. If it is, report a fatal error: a Swift class is not allowed to be a future class
  • Realize the class
  • Mark the class and all of its subclasses as requiring raw isa pointers

Debug output

There are two main steps

  • If non-fragile ivar debugging is enabled (OBJC_DEBUG_NONFRAGILE_IVARS), realize all classes
  • If preoptimization output is requested (OBJC_PRINT_PREOPTIMIZATION), print the statistics: whether dyld's optimizations are in use, how many selector references were not pre-optimized, the percentage of method lists that were pre-sorted, the percentage of classes that were pre-registered, and how many protocol references were not pre-optimized.

Conclusion

Let's return to the questions from the beginning.

How did our code get loaded in?

Following this series of steps, we find that the code we write is packaged into a Mach-O file at compile time, and that at launch the class information is read directly out of that Mach-O file. Readers interested in this process may want to read the book on linking, loading and libraries, "Programmer's Self-Cultivation". Once the data has been read, it is stored in the corresponding tables, which are hash tables.

What exactly does a class contain, and when was it put together?

Just the main point here: class_data_bits_t bits holds the main data of a class, including properties, methods, protocols, member variables and so on. It is assembled in realizeClassWithoutSwift. To learn more, see my other article introducing classes: juejin.cn/post/684490…

Can things inside the class be modified? What can be modified? Why is that?

In realizeClassWithoutSwift we saw that the ro reached through rw is read directly from the Mach-O file and is read-only, whereas rw is created at run time on top of ro and is both readable and writable. So things can be modified in rw, but not in ro. But doesn't the runtime offer an API for creating classes dynamically? Can't ivars simply be added that way? To answer this, let's look at what happens when a class is registered dynamically, and at what happens when an ivar is added.

void objc_registerClassPair(Class cls)
{
    mutex_locker_t lock(runtimeLock);

    checkIsKnownClass(cls);

    if ((cls->data()->flags & RW_CONSTRUCTED)  ||
        (cls->ISA()->data()->flags & RW_CONSTRUCTED))
    {
        _objc_inform("objc_registerClassPair: class '%s' was already "
                     "registered!", cls->data()->ro->name);
        return;
    }

    if (!(cls->data()->flags & RW_CONSTRUCTING)  ||
        !(cls->ISA()->data()->flags & RW_CONSTRUCTING))
    {
        _objc_inform("objc_registerClassPair: class '%s' was not "
                     "allocated with objc_allocateClassPair!",
                     cls->data()->ro->name);
        return;
    }

    // Clear "under construction" bit, set "done constructing" bit
    cls->ISA()->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);
    cls->changeInfo(RW_CONSTRUCTED, RW_CONSTRUCTING | RW_REALIZING);

    // Add to named class table.
    addNamedClass(cls, cls->data()->ro->name);
}

Registration clears the RW_CONSTRUCTING flag and sets RW_CONSTRUCTED, and once a class is in the RW_CONSTRUCTED state there is no going back. The code that adds an ivar checks for RW_CONSTRUCTING first, which is why an ivar cannot be added after registration.

    // Can only add ivars to in-construction classes.
    if (!(cls->data()->flags & RW_CONSTRUCTING)) {
        return NO;
    }
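The two checks fit together as a small one-way state machine: a class starts out "constructing", registration moves it to "constructed", and ivars are accepted only in the first state. A sketch under assumptions: the flag values, DynamicClass, addIvar and registerClass are all invented here and do not match the runtime's bit positions or signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag values; the runtime's actual bit positions are not shown above.
constexpr std::uint32_t kConstructing = 1u << 0;
constexpr std::uint32_t kConstructed  = 1u << 1;

struct DynamicClass {
    std::uint32_t flags = kConstructing;   // state right after allocation
    std::vector<std::string> ivars;
};

// Ivars may only be added while the class is still under construction.
bool addIvar(DynamicClass &cls, const std::string &name) {
    if (!(cls.flags & kConstructing)) return false;
    cls.ivars.push_back(name);
    return true;
}

// Registration clears "constructing" and sets "constructed"; there is no way back.
bool registerClass(DynamicClass &cls) {
    if (cls.flags & kConstructed) return false;       // already registered
    if (!(cls.flags & kConstructing)) return false;   // not freshly allocated
    cls.flags = (cls.flags | kConstructed) & ~kConstructing;
    assert(!(cls.flags & kConstructing));
    return true;
}
```

Because the instance layout is fixed at registration, refusing ivars afterwards keeps every existing instance the same size; methods are a different matter, since they live in the writable part.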