To understand how the weak keyword works, we build up a complete picture starting from the underlying data structures.

template class DisguisedPtr

DisguisedPtr is a template utility class defined in Project Headers/objc-private.h. Its main job is to convert a T pointer (the address of a variable of type T) to an unsigned long, mapping pointers to integers and back. This disguises the pointer so it stays hidden from system tools such as the Leaks tool. A global search for DisguisedPtr in objc4-781 shows the template parameter T used only as objc_object and objc_object *. When T is objc_object *, DisguisedPtr is used to hide the address of a __weak variable.

DisguisedPtr acts like pointer type T*, except the stored value is disguised to hide it from tools like Leaks. nil is disguised as itself so that zero-filled memory works as expected (a nil pointer behaves as a nil pointer, without crashing the program); this means 0x80..00 is also disguised as itself, but we don't care. Note that weak_entry_t knows about this encoding.

template <typename T>
class DisguisedPtr {
    // Value of type unsigned long is sufficient to hold the memory address to convert to integer
    uintptr_t value;

    static uintptr_t disguise(T* ptr) {
        // Convert the address of T to unsigned long and take a negative value
        return -(uintptr_t)ptr;
    }

    static T* undisguise(uintptr_t val) {
        // Convert an unsigned long val back to a pointer
        return (T*)-val;
    }

 public:
    DisguisedPtr() { } // Default constructor
    
    // Initialization list PTR initializes the value member variable
    DisguisedPtr(T* ptr) : value(disguise(ptr)) { }
    
    // Copy the constructor
    DisguisedPtr(const DisguisedPtr<T>& ptr) : value(ptr.value) { }

    // Overloaded operators:
    // T* assignment: assigning a T pointer to a DisguisedPtr<T>
    // performs the address-to-integer conversion directly
    DisguisedPtr<T>& operator = (T* rhs) {
        value = disguise(rhs);
        return *this;
    }
    
    // DisguisedPtr<T>& copy-assignment operator
    DisguisedPtr<T>& operator = (const DisguisedPtr<T>& rhs) {
        value = rhs.value;
        return *this;
    }

    // T* conversion operator
    operator T* () const {
        // Unsigned long value returns the pointer
        return undisguise(value);
    }
    
    // ->
    T* operator-> () const { 
        // Unsigned long value returns the pointer
        return undisguise(value);
    }
    
    // * dereference operator
    T& operator * () const { 
        // Convert to a pointer and fetch what the pointer points to
        return *undisguise(value);
    }
    
    // [] subscript operator
    T& operator [] (size_t i) const {
        // Undisguise value to a pointer and take the element at index i
        return undisguise(value)[i];
    }
    }

    // pointer arithmetic operators omitted 
    // because we don't currently use them anywhere
};

// fixme type id is weird and not identical to objc_object*
// (id is typedef struct objc_object *id)

// ==
static inline bool operator == (DisguisedPtr<objc_object> lhs, id rhs) {
    return lhs == (objc_object *)rhs;
}

// !=
static inline bool operator != (DisguisedPtr<objc_object> lhs, id rhs) {
    return lhs != (objc_object *)rhs;
}

template class StripedMap

StripedMap is a map of void* -> T, appropriately sized for cache-friendly lock striping. For example, this may be used as StripedMap<spinlock_t> or as StripedMap<SomeStruct> where SomeStruct stores a spin lock.

Cache-friendly here means the access pattern has good locality: a program with good locality hits the cache more often and, in that sense, runs faster.

StripedMap is a hash table whose key is a void * and whose value is a T. A global search for StripedMap in objc4-781 finds T used as SideTable and as spinlock_t:

StripedMap<SideTable> SideTables: used as SideTable *table = &SideTables()[obj]. It computes a hash from the objc_object pointer and finds the SideTable for obj in the global hash table SideTables.

StripedMap<spinlock_t> PropertyLocks: when atomic properties are used, objc_getProperty internally takes a lock via PropertyLocks[slot] to make id value = objc_retain(*slot) thread-safe.

StripedMap<spinlock_t> StructLocks: provides the lock that keeps objc_copyStruct thread-safe when it is called with the atomic argument true.

StripedMap<spinlock_t> CppObjectLocks: keeps calls to objc_copyCppObjectAtomic thread-safe.

From the lock-related implementation in the source below, the abstract type T must support the lock, unlock, forceReset, and lockdebug_lock_precedes_lock interfaces. struct SideTable provides all of them.

enum { CacheLineSize = 64 };

template<typename T>
class StripedMap {

#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    enum { StripeCount = 8 }; // iPhone: SideTables contains only 8 SideTables
#else
    enum { StripeCount = 64 }; // Mac/simulator: 64 SideTables
#endif
#endif

    struct PaddedT {
        // The CacheLineSize value is set to 64
        // T value 64 bytes aligned
        T value alignas(CacheLineSize);
    };
    
    // An array of 8/64 PaddedT. PaddedT is a struct whose only member is 64-byte aligned.
    // (This means each SideTable is 64-byte aligned; without the PaddedT wrapper, the array could be a plain SideTable array.)
    PaddedT array[StripeCount];
    
    // The hash function (computes the hash of an objc_object pointer)
    static unsigned int indexForPointer(const void *p) {
        // reinterpret_cast<uintptr_t>(expression) is a C++ cast;
        // here it converts the p pointer to an unsigned long
        uintptr_t addr = reinterpret_cast<uintptr_t>(p);
        
        // XOR addr shifted right 4 bits with addr shifted right 9 bits,
        // then take it modulo StripeCount (8/64) to keep the index in bounds
        return ((addr >> 4) ^ (addr >> 9)) % StripeCount;
    }

 public:
    // Hash value (get the SideTable where the object resides)
    T& operator[] (const void *p) { 
        return array[indexForPointer(p)].value; 
    }
    
    // const_cast<type_id>(expression) removes const or volatile from a type;
    // type_id and expression are the same type apart from const/volatile:
    // 1. a pointer to const converts to a pointer to non-const, still pointing at the original object;
    // 2. a const reference converts to a non-const reference, still referring to the original object;
    // 3. const_cast is usually used on pointers, e.g. const char *p.
    
    // Cast away const on this, then call the non-const operator[] above to get T&
    const T& operator[] (const void *p) const {
        return const_cast<StripedMap<T>>(this)[p]; 
    }

    // Shortcuts for StripedMaps of locks.
    
    // Lock every element of the array in a loop.
    // In the iOS SideTables case, this locks all 8 SideTables;
    // struct SideTable's member spinlock_t slock provides void lock() { slock.lock(); }
    void lockAll() {
        for (unsigned int i = 0; i < StripeCount; i++) {
            array[i].value.lock();
        }
    }
    
    // Unlock all
    void unlockAll() {
        for (unsigned int i = 0; i < StripeCount; i++) {
            array[i].value.unlock();
        }
    }
    
    // As above, reset each lock
    void forceResetAll() {
        for (unsigned int i = 0; i < StripeCount; i++) {
            array[i].value.forceReset();
        }
    }
    
    // Define the lock ordering for the array elements' locks
    void defineLockOrder() {
        for (unsigned int i = 1; i < StripeCount; i++) {
            lockdebug_lock_precedes_lock(&array[i-1].value, &array[i].value);
        }
    }
    
    void precedeLock(const void *newlock) {
        // assumes defineLockOrder is also called
        lockdebug_lock_precedes_lock(&array[StripeCount-1].value, newlock);
    }

    void succeedLock(const void *oldlock) {
        // assumes defineLockOrder is also called
        lockdebug_lock_precedes_lock(oldlock, &array[0].value);
    }

    // Return the address of the lock at index i in the array (or nil when i is out of range)
    const void *getLock(int i) {
        if (i < StripeCount) return &array[i].value;
        else return nil;
    }
    
    // The constructor, which in DEBUG mode verifies that T is 64 bytes aligned
#if DEBUG
    StripedMap() {
        // Verify alignment expectations.
        // Verify that value (T) is aligned with the CacheLineSize (value 64) memory
        
        uintptr_t base = (uintptr_t)&array[0].value;
        uintptr_t delta = (uintptr_t)&array[1].value - base;
        ASSERT(delta % CacheLineSize == 0);
        ASSERT(base % CacheLineSize == 0);
    }
#else
    constexpr StripedMap() {}
#endif
};

weak_referrer_t

Used to store, in disguised form, the address of a __weak variable, i.e. a disguised objc_object **.

The address of a __weak variable. These pointers are stored disguised so memory analysis tools don't see lots of interior pointers from the weak table into objects.

// Where T is objc_object*, then T* in DisguisedPtr is objc_object**, which is the pointer of the pointer
typedef DisguisedPtr<objc_object *> weak_referrer_t;

PTR_MINUS_2

Used to set the bit-field width on different platforms. Here it is the width of num_refs in struct weak_entry_t.

// out_of_line_ness and num_refs together share one 64-bit (or 32-bit) word
uintptr_t        out_of_line_ness : 2; 
uintptr_t        num_refs : PTR_MINUS_2; // num_refs occupies the high 62 bits or high 30 bits depending on the platform
#if __LP64__
#define PTR_MINUS_2 62
#else
#define PTR_MINUS_2 30
#endif

WEAK_INLINE_COUNT

The internal structure stored in the weak references table. It maintains and stores a hash set of weak references (weak_referrer_t) pointing to an object. If out_of_line_ness != REFERRERS_OUT_OF_LINE (0b10), then the set is instead a small inline array (a weak_referrer_t array of length 4).

#define WEAK_INLINE_COUNT 4

REFERRERS_OUT_OF_LINE

The out_of_line_ness field overlaps with the low two bits of inline_referrers[1]. inline_referrers[1] is a DisguisedPtr of a pointer-aligned address. The low two bits of a pointer-aligned DisguisedPtr will always be 0b00 (disguised nil or 0x80..00) or 0b11 (any other address). Therefore out_of_line_ness == 0b10 can be used to mark the out-of-line state, i.e. that weak_entry_t is storing weak_referrer_t in the hash array rather than in the inline weak_referrer_t array of length 4.

#define REFERRERS_OUT_OF_LINE 2 // binary 0b10

struct weak_entry_t

weak_entry_t holds the addresses of all the weak reference variables that point to a given object.

The data stored in the weak_entry_t hash array is of type weak_referrer_t (typedef DisguisedPtr<objc_object *>), which is essentially the address of a weak reference variable, objc_object **new_referrer. By operating on this pointer to a pointer, the runtime can make the weak reference variable point to nil after the object is destroyed: it has to hold the address of the weak reference variable in order to set it to nil.

struct weak_entry_t {
    // referent holds the address of an objc_object instance disguised as an integer; the weak reference variables below point to this objc_object instance
    DisguisedPtr<objc_object> referent;
    
    // When the number of weak references to the referent is less than or equal to 4, the inline_referrers array stores their addresses;
    // when it is greater than 4, the referrers hash array is used.
    
    // Union with 32 bytes of memory space
    union {
        struct {
            weak_referrer_t *referrers; // hash array storing weak_referrer_t
            
            // out_of_line_ness and num_refs share one word as bit-fields
            uintptr_t        out_of_line_ness : 2; // marks whether the hash array or inline_referrers stores the weak_referrer_t
            uintptr_t        num_refs : PTR_MINUS_2; // number of weak_referrer_t stored in referrers
            uintptr_t        mask; // total length of the referrers hash array minus 1; participates in the hash calculation
            
            // The maximum hash displacement seen, used to detect logic errors
            // (no lookup in the hash array can ever collide more times than this).
            // It is updated when a new weak_referrer_t is inserted; it always records the maximum displacement.
            uintptr_t        max_hash_displacement;
        };
        struct {
            // out_of_line_ness overlaps the low two bits of inline_referrers[1]
            // weak_referrer_t (DisguisedPtr<objc_object *>) array of length 4
            weak_referrer_t  inline_referrers[WEAK_INLINE_COUNT];
        };
    };
    
    // Returns true if the referrers hash array is in use; false if inline_referrers stores the weak_referrer_t
    bool out_of_line(a) {
        return (out_of_line_ness == REFERRERS_OUT_OF_LINE);
    }
    
    // Assignment copies the other entry's memory with memcpy
    // rather than using member-wise copy assignment.
    weak_entry_t& operator= (const weak_entry_t& other) {
        memcpy(this, &other, sizeof(other));
        return *this;
    }

    // Constructor of weak_entry_t.
    // newReferent is a pointer to the original object;
    // newReferrer is a pointer to a weak reference variable that points to newReferent.
    // The initialization list referent(newReferent) invokes DisguisedPtr's
    // DisguisedPtr(T* ptr) : value(disguise(ptr)) {} constructor,
    // which converts newReferent to an integer and assigns it to value.
    weak_entry_t(objc_object *newReferent, objc_object **newReferrer)
        : referent(newReferent)
    {
        // Put newReferrer at index 0; DisguisedPtr's constructor converts it to an integer and stores it
        inline_referrers[0] = newReferrer;
        // Set the remaining three slots of inline_referrers to nil
        for (int i = 1; i < WEAK_INLINE_COUNT; i++) {
            inline_referrers[i] = nil;
        }
    }
};

The reason weak_entry_t uses a fixed-length array before switching to a hash array is presumably that most objects have only a few weak references. The fixed-length array needs no dynamically allocated memory (the two structs in the union share the same 32 bytes, allocated once when the weak_entry_t is created), which makes operations on it faster.

struct weak_table_t

Stores object ids as keys, and weak_entry_t structs as their values.

weak_table_t is a global hash table holding weak references, with object ids as keys and weak_entry_t as values.

struct weak_table_t {
    weak_entry_t *weak_entries; // hash array storing weak_entry_t
    size_t    num_entries; // number of weak_entry_t currently stored in weak_entries
    
    uintptr_t mask; // total length of the hash array minus 1; participates in the hash calculation
    
    // Records the maximum displacement of all entries, i.e. the largest number of hash collisions seen.
    // It is used to detect logic errors: no lookup in the table can ever collide more times than this.
    // The member is used in the weak_entry_t lookup functions.
    // weak_table_t resolves hash collisions with open addressing (linear probing),
    // so a weak_entry_t's actual storage slot is not necessarily the one computed by the hash function.
    uintptr_t max_hash_displacement;
};

struct SideTable

struct SideTable is defined in NSObject.mm. It manages two very important things for us: RefcountMap refcnts manages objects' reference counts, and weak_table_t weak_table manages objects' weak reference variables. refcnts is not covered in this article; we will study it in depth later together with objc_object. Here we focus on weak_table.

// Template parameters
enum HaveOld { DontHaveOld = false, DoHaveOld = true }; // Whether there is an old value
enum HaveNew { DontHaveNew = false, DoHaveNew = true }; // Whether there is a new value

struct SideTable {
    spinlock_t slock; // each SideTable has its own lock, matching the lock interface StripedMap requires of its abstract type T
    RefcountMap refcnts; // manages objects' reference counts
    weak_table_t weak_table; // hash table with object ids as keys and weak_entry_t as values; if an object has weak references, its weak_entry_t can be found here
    
    // The constructor does only one thing: zero weak_table
    SideTable() {
        // Set sizeof(weak_table) bytes starting at &weak_table to 0
        memset(&weak_table, 0, sizeof(weak_table));
    }
    }

    // Destructor: SideTable must never be destructed
    ~SideTable() {
        // Deleting a SideTable terminates the process immediately
        _objc_fatal("Do not delete SideTable.");
    }
    }
    
    // These three functions satisfy StripedMap's interface requirements on T;
    // each simply forwards to the corresponding function of slock
    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void forceReset() { slock.forceReset(); }

    // Address-ordered lock discipline for a pair of side tables.
    
    // HaveOld and HaveNew indicate whether lock1 and lock2 exist,
    // indicates whether __weak refers to the old value and the new value to which it is currently pointing.
    
    // Lock1 represents the SideTable where the old value object resides
    // Lock2 represents the SideTable where the new value object resides
    
    // lockTwo locks via slock (the variant is chosen by C++ template overloading);
    // if both tables exist, the one at the lower address is locked first, then the other
    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    
    // Unlock slock
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

// Below in the source file are the lockTwo and unlockTwo overloads based on the template parameters.

Looking at the three member variables of struct SideTable:

  1. spinlock_t slock: a spin lock that keeps operations on the SideTable thread-safe. Looking back at weak_table_t and weak_entry_t, you will notice that their operation functions all carry a no_lock suffix, a reminder that they do no locking at all: they delegate thread safety to SideTable. As shown above, the functions SideTable provides are thread-safe, and slock is what makes that possible.
  2. RefcountMap refcnts: a hash table with DisguisedPtr<objc_object> as key and size_t as value, used to store objects' reference counts (used only when isa optimization is off, or when the reference count stored in an optimized isa_t overflows; the uintptr_t has_sidetable_rc and uintptr_t extra_rc fields of isa_t finally come into play here, and rc turns out to be short for refcount, i.e. reference count). As a hash table it probes with quadratic probing, whereas weak_table_t uses linear probing (open addressing). (RefcountMap is left to the reference counting article for detailed analysis.)
  3. weak_table_t weak_table: the hash table that stores an object's weak references; it is the core data structure of the weak mechanism.

using spinlock_t = mutex_tt

spinlock_t was originally an unfair spin lock of type uint32_t (the underlying implementation has since been replaced with the mutex os_unfair_lock because of safety issues). Unfair means the order in which threads acquire the lock is unrelated to the order in which they requested it: the first thread to request the lock may be the last to acquire it, or a thread that has just released the lock may immediately reacquire it, leaving other threads busy-waiting.

The os_unfair_lock member variable _os_unfair_lock_opaque records information about the thread holding the lock; only the thread that acquired the lock can unlock it.

OS_UNFAIR_LOCK_AVAILABILITY
typedef struct os_unfair_lock_s {
    uint32_t _os_unfair_lock_opaque;
} os_unfair_lock, *os_unfair_lock_t;

_os_unfair_lock_opaque holds a uint32_t value. If the value is greater than 0 the lock is available; if it is less than or equal to 0 the lock has been acquired by another thread and not yet released, and the current thread must wait (or block) until it can acquire the lock.

template class ExplicitInit

We cannot use a C++ static initializer to initialize certain globals because libc calls us before our C++ initializers run. We also don’t want a global pointer to some globals because of the extra indirection. ExplicitInit / LazyInit wrap doing it the hard way.


template <typename Type>
class ExplicitInit {

    // typedef unsigned char uint8_t; a 1-byte integer type whose underlying type is unsigned char
    
    // alignas(Type) gives _storage the same alignment as the abstract type Type;
    // _storage is a uint8_t array of sizeof(Type) bytes

    alignas(Type) uint8_t _storage[sizeof(Type)];

public:

    // C++11 variadic templates: Ts is the plural of T.
    // std::forward preserves the lvalue/rvalue category of each argument
    // so they reach Type's constructor unchanged (perfect forwarding).
    
    // Initialization
    template <typename... Ts>
    void init(Ts &&... Args) {
        new (_storage) Type(std::forward<Ts>(Args)...);
    }
    
    Type &get() {
        // Reinterpret the start address of _storage as Type *
        return *reinterpret_cast<Type *>(_storage);
    }
};

static StripedMap& SideTables()

SideTables is a static global hash table of type StripedMap<SideTable>. It is a hash array of fixed length 8 on iPhone and 64 on the Mac. It has a simple hash function that computes a hash from a void * input, then fetches the corresponding T from the hash array; in SideTables, T is SideTable.

// The _storage array inside ExplicitInit has alignment alignas(StripedMap<SideTable>)
// and length sizeof(StripedMap<SideTable>)
       
static objc::ExplicitInit<StripedMap<SideTable>> SideTablesMap;

static StripedMap<SideTable>& SideTables() {
    return SideTablesMap.get();
}

Below SideTables(), the source defines several lock-related global functions, implemented by calling the interface functions of StripedMap's template type T; for SideTables, T is SideTable, and each SideTable function in turn calls its spinlock_t slock member. A lock-striping scheme is used here, one lock per SideTable, to reduce blocking when many objects are processed in parallel.

weak_entry_for_referent

Return the weak reference table entry for the given referent. If there is no entry for referent, return NULL. Performs a lookup.

Given a referent (our object) and a weak_table_t hash table, look up the weak_entry_t (the hash set holding the addresses of all weak reference variables pointing at referent) and return it, or NULL if it is not found.

/**
 * @param weak_table The weak table, found in the global SideTables via &SideTables()[referent] (SideTable->weak_table).
 * @param referent The object. Must not be nil.
 * @return The table of weak referrers to this object, i.e. a pointer to the weak_entry_t
 *         that holds the addresses of all weak reference variables of referent.
 */
static weak_entry_t *
weak_entry_for_referent(weak_table_t *weak_table, objc_object *referent)
{
    ASSERT(referent);
    
    // The weak_entry_t hash array in weak_table_t
    weak_entry_t *weak_entries = weak_table->weak_entries;
    
    if (!weak_entries) return nil;
    
    // hash_pointer's return value is ANDed with mask to keep the index in bounds
    size_t begin = hash_pointer(referent) & weak_table->mask;
    
    size_t index = begin;
    size_t hash_displacement = 0;
    
    // Linear probing: start at weak_table->weak_entries[index]
    while (weak_table->weak_entries[index].referent != referent) {
        // On a hash collision, advance to the next slot (open addressing)
        index = (index+1) & weak_table->mask;
        
        // If index wraps all the way back to begin, the table is corrupt: bad_weak_table fatal error
        if (index == begin) bad_weak_table(weak_table->weak_entries);
        
        // Record how far we have probed
        hash_displacement++;
        
        // If the displacement exceeds weak_table's recorded max_hash_displacement,
        // there is no weak_entry_t for referent in the table: return nil
        if (hash_displacement > weak_table->max_hash_displacement) {
            return nil;
        }
    }
    
    // Found the weak_entry_t; return its address
    return &weak_table->weak_entries[index];
}

hash_pointer

// The hash result is ANDed with mask to keep the index in bounds
size_t begin = hash_pointer(referent) & weak_table->mask;

hash_pointer(referent) calls the general pointer hash function, and the subsequent & weak_table->mask ensures the resulting begin stays in bounds. It has the same effect as the everyday modulo operation (%), but done as a bit operation, which is faster.

Why the & with mask ensures begin does not go out of bounds:

First, mask is always 2^n - 1. From the weak_grow_maybe function we can see that the hash array (weak_entry_t *weak_entries) has a length of at least 64 and always grows by doubling, so its length is 2^n (n >= 6) and mask is 2^n - 1: 0b0111111 (64 - 1, n = 6), 0b01111111 (128 - 1, n = 7), and so on. In binary, the low n bits of mask are all 1 and the bits above them are 0, so ANDing any value with mask yields a result in [0, mask]. For example, ANDing any number with 0b0111111 (64 - 1, n = 6) always gives a result in [0, 63], which is exactly the index range of the weak_entry_t *weak_entries array.

Take a look at the hash_pointer function:

/**
 * Unique hash function for object pointers only.
 * @param key The object pointer
 * @return Size unrestricted hash of pointer.
 */
static inline uintptr_t hash_pointer(objc_object *key) {

    // typedef unsigned long uintptr_t;
    // Convert the pointer to an unsigned long and call the ptr_hash function
    
    return ptr_hash((uintptr_t)key);
}

// The ptr_hash function distinguishes between 64-bit and 32-bit cases.

#if __LP64__
static inline uint32_t ptr_hash(uint64_t key)
{
    key ^= key >> 4; // XOR key with itself shifted right 4 bits
    key *= 0x8a970be7488fda55; // multiply key by a large constant
    key ^= __builtin_bswap64(key); // XOR key with its byte-swapped (reversed byte order) value
    return (uint32_t)key; // truncate to uint32_t
}
#else
static inline uint32_t ptr_hash(uint32_t key)
{
    key ^= key >> 4;
    key *= 0x5052acdb;
    key ^= __builtin_bswap32(key);
    
    return key;
}
#endif

Declarations of the functions that add and remove a referrer (the address of a weak variable) in a weak_entry_t, and that set weak pointers to nil

Below are the declarations of four weak_table_t functions. Here we only look at what they do; the detailed analysis is in "iOS Weak underlying implementation principle (two): objc-weak function list full analysis".

weak_register_no_lock

Adds an (object, weak pointer) pair to the weak reference table. That is: when the first weak variable comes to point at an object, the object is registered in the weak_table_t hash table and the address of that first weak variable is saved into the object's weak_entry_t hash set; if the weak variable is not the first, the object already exists in weak_table_t, and only the address of the new weak variable needs to be added to the object's weak_entry_t hash set.

/// Adds an (object, weak pointer) pair to the weak table.
id weak_register_no_lock(weak_table_t *weak_table, id referent, 
                         id *referrer, bool crashIfDeallocating);

weak_unregister_no_lock

Removes an (object, weak pointer) pair from the weak reference table, i.e., removes the address of a weak variable from the object's weak_entry_t.

/// Removes an (object, weak pointer) pair from the weak table.
void weak_unregister_no_lock(weak_table_t *weak_table, id referent, id *referrer);

weak_is_registered_no_lock

Returns true if the object is somewhere in the weak reference table, i.e., the object is stored there (the object has weak references).

#if DEBUG
/// Returns true if an object is weakly referenced somewhere.
bool weak_is_registered_no_lock(weak_table_t *weak_table, id referent);
#endif

weak_clear_no_lock

This function is called when an object is destroyed. It sets all remaining __weak variables to nil, which is exactly the behavior we talk about every day: a __weak variable is set to nil when the object it points to is destroyed.

/// Called on object destruction. Sets all remaining weak pointers to nil.
void weak_clear_no_lock(weak_table_t *weak_table, id referent);

Adjusting the length of the weak_table_t hash array

Taking a weak_table_t as parameter, the two functions weak_grow_maybe and weak_compact_maybe adjust the length of the weak_table_t hash array in time when it is too full or too empty, optimizing memory usage and improving hash lookup efficiency. Both functions do the actual resizing by calling the weak_resize function.

weak_grow_maybe

This function is called between creating a weak_entry_t and adding new_entry to a weak_table_t hash array. Let’s look at its implementation below.

// Grow the given zone's table of weak references if it is full. (If the weak reference table for a given zone is full, it is expanded.)
static void weak_grow_maybe(weak_table_t *weak_table)
{
    // Get the total length of the current hash array.
    // #define TABLE_SIZE(entry) (entry->mask ? entry->mask + 1 : 0)
    // mask + 1 is the total length of the current weak_table hash array.
    
    size_t old_size = TABLE_SIZE(weak_table);

    // Grow if at least 3/4 full.
    // If the current number of weak_entry_t stored in the hash array exceeds three-quarters of the total length, expand the capacity.
    
    if (weak_table->num_entries >= old_size * 3 / 4) {
        // If the hash array length is 0, start with an initial length of 64; otherwise double the previous length (old_size * 2).
        weak_resize(weak_table, old_size ? old_size * 2 : 64);
    }
}

This function grows weak_entry_t *weak_entries of weak_table_t, provided num_entries exceeds 3/4 of mask + 1. We can see that the initial length of weak_entries is 64, and each expansion doubles mask + 1. After expansion, every weak_entry_t in the old hash array is re-hashed and inserted into the new space, and all member variables of weak_table_t are updated. The total memory occupied is (mask + 1) * sizeof(weak_entry_t) bytes, so mask + 1 is always 2^n. (Initially n is 6: 2^6 = 64; thereafter n >= 6.)

weak_compact_maybe

This function is called in the weak_entry_remove function. After a weak_entry_t is removed from the weak_table_t hash array, if the hash array's usage is low, the length of weak_entry_t *weak_entries is reduced to optimize memory usage and improve hash efficiency. Let's see the implementation below:

// Shrink the table if it is mostly empty. That is, when most of the weak_entry_t *weak_entries array of weak_table_t is empty, reduce the length of weak_entries.
static void weak_compact_maybe(weak_table_t *weak_table)
{
    // Count the total length of the current hash array.
    // #define TABLE_SIZE(entry) (entry->mask ? entry->mask + 1 : 0)
    
    size_t old_size = TABLE_SIZE(weak_table);

    // Shrink if larger than 1024 buckets and at most 1/16 full.
    // Shrink if the total length is at least 1024 and at most 1/16 of the slots are in use.
    
    if (old_size >= 1024  && old_size / 16 >= weak_table->num_entries) {
        // Reduce the size to 1/8 of old_size
        weak_resize(weak_table, old_size / 8);
        
        // Shrinking to 1/8 while at most 1/16 full guarantees the new table is no more than 1/2 full.
        // leaves new table no more than 1/2 full
    }
}

The condition for shrinking weak_entry_t *weak_entries is that the current total length is at least 1024 and the usage is at most 1/16; the space is then reduced to 1/8 of the current size.

weak_resize

The common function weak_resize is called for both growing and shrinking. Its parameters are a weak_table_t and the new length.

static void weak_resize(weak_table_t *weak_table, size_t new_size)
{
    // Get the total length of the current hash array.
    // old_size = mask + 1; 
    size_t old_size = TABLE_SIZE(weak_table);
    
    // Get the starting address of the old weak_entries hash array.
    weak_entry_t *old_entries = weak_table->weak_entries;
    
    // Allocate space of the specified length for the new weak_entries hash array and return its starting address.
    // The total memory capacity is new_size * sizeof(weak_entry_t) bytes.
    weak_entry_t *new_entries = (weak_entry_t *)calloc(new_size, sizeof(weak_entry_t));
        
    // Update the mask, still with the total length minus 1
    weak_table->mask = new_size - 1;
    // Update the start address of the hash array
    weak_table->weak_entries = new_entries;
    
    // The maximum hash collision offset, which defaults to 0
    weak_table->max_hash_displacement = 0;
    // The current number of occupied hash arrays, which defaults to 0
    weak_table->num_entries = 0;  // restored by weak_entry_insert below
    
    // Re-hash the data from the old hash array into the new space.
    // The two member variables of weak_table_t reset to 0 above will be updated in the weak_entry_insert function below.
    
    // If there is an old weak_entry_t that needs to be put into the new space
    if (old_entries) {
        weak_entry_t *entry;
        // The end of the old hash array
        weak_entry_t *end = old_entries + old_size;
        
        // Loop over the old hash array, calling weak_entry_insert to insert each weak_entry_t into the new hash array
        for (entry = old_entries; entry < end; entry++) {
            if (entry->referent) {
                weak_entry_insert(weak_table, entry);
            }
        }
        // Finally free the memory of the old hash array.
        free(old_entries);
    }
}

weak_entry_insert

Adds a weak_entry_t to weak_table_t->weak_entries.

/**
 * Add new_entry to the object's table of weak references.
 * (That is, add new_entry to the hash array that holds the addresses of the object's weak variables.)
 * Does not check whether the referent is already in the table.
 */
static void weak_entry_insert(weak_table_t *weak_table, weak_entry_t *new_entry)
{
    // The starting address of the hash array
    weak_entry_t *weak_entries = weak_table->weak_entries;
    ASSERT(weak_entries != nil);
    
    // Use the hash function to find new_entry's position in the weak_table_t hash array; hash collisions may occur. The & mask trick works the same way as above.
    size_t begin = hash_pointer(new_entry->referent) & (weak_table->mask);
    
    size_t index = begin;
    size_t hash_displacement = 0;
    
    while (weak_entries[index].referent != nil) {
        // On a hash collision, probe the next slot (+1)
        index = (index+1) & weak_table->mask;
        
        // If index wraps all the way back to begin without finding an empty slot, trigger a fatal error.
        if (index == begin) bad_weak_table(weak_entries);
        
        // Record the displacement; it is used below to update max_hash_displacement
        hash_displacement++;
    }
    
    // Copy new_entry into the hash array
    weak_entries[index] = *new_entry;
    
    // Update num_entries
    weak_table->num_entries++;
    
    // Record the maximum offset seen when hash collisions occur in the weak_table_t hash array
    if (hash_displacement > weak_table->max_hash_displacement) {
        weak_table->max_hash_displacement = hash_displacement;
    }
}

Combining this with the weak_entry_insert function, we can see the overall job of weak_resize. When growing or shrinking the hash array, it first allocates memory of the size given by new_size, with the new_entries pointer pointing to the newly allocated memory, and sets the mask of weak_table to new_size - 1. The mask records the memory boundary of the weak_table's total capacity; it is also used in the hash function to ensure that index never crosses the boundary of the hash array.

Hash collisions may occur in the weak_table_t hash array, and weak_table_t handles them with open addressing: on a collision, the next adjacent slot is tried (wrapping back to the beginning when the end is reached). max_hash_displacement records the maximum displacement seen so far in the weak_table. This value is used elsewhere; for example, when the weak_entry_for_referent function looks up the entry for a given referent in the weak reference table, if hash_displacement exceeds weak_table->max_hash_displacement, the desired weak_entry_t does not exist.

This article mainly covered the weak-related data structures, how the weak_entry_t that stores the addresses of an object's weak references is found starting from the global SideTable->weak_table, and the length adjustment mechanism of the weak_table_t->weak_entries hash array. In the next part we formally enter the implementation of weak.

Reference links 🔗

  • Use intptr_t and uintptr_t
  • Object Runtime — Weak
  • OC Runtime Weak (2) – weak_entry_t
  • iOS Associated object – DisguisedPtr
  • Objective-C runtime – Dynamic features
  • Objective-C Runtime mechanism (7) – SideTables, SideTable, weak_table, weak_entry_t
  • An interesting phenomenon (Apple bug or pit?) about distinguishing real-device and simulator precompiled macros
  • iOS data structures and algorithms for managing object memory – SideTables, RefcountMap, weak_table_t (2)
  • C++11 variadic templates (function templates, class templates)
  • std::forward
  • LLVM data structures and memory allocation strategy – DenseMap
  • In Runtime: SideTables, SideTable, weak_table, weak_entry_t