OC Basic principles of exploration document summary

We usually use associated objects in categories so that a setter and a getter can operate on the same property value. But what do associated objects actually do, and how does the runtime store and retrieve the value underneath? This article explores the underlying implementation of associated objects.

Main Contents:

  1. Use of associated objects
  2. Underlying analysis of associated objects

1. Use of associated objects

1.1 Introduction to Associated Objects

We usually use associated objects in categories. Properties cannot be used directly in a category: declaring a property there does not generate the setter and getter implementations or the member variable, and even if we write the methods by hand there is no member variable for them to use. An associated object lets different methods refer to the same stored value, so that what the setter stores is exactly what the getter retrieves. That is how a property becomes usable in a category.

1.2 Why use associated Objects

  • A property declared in a category gets no member variable and no setter/getter implementations
  • The setter and getter can be written by hand, but a member variable cannot be added, because member variables are stored in the class's ro data (class_ro_t), which is read-only
  • Without a member variable, the setter and getter have nothing to operate on
  • So to tie together the value that the two methods operate on, that is, to make sure the value stored and the value fetched are the same, we need an associated object
  • If there is only a getter and no setter, no association is needed
  • More generally, whenever different functions must operate on the same value attached to an object, an associated object can be used, not only for setters and getters
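The idea behind these points can be sketched in plain C++. This is only an illustration, not the runtime's implementation: Person, g_cateNames, setCateName, and cateName are hypothetical names, and a std::unordered_map stands in for the runtime's storage. The sketch shows two free functions sharing one value for an object whose layout has no field for it.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a class whose layout we cannot extend.
struct Person {
    int age = 0;
};

// Side table: object address -> extra value. It is not part of Person's layout.
static std::unordered_map<const Person *, std::string> g_cateNames;

// "Setter": stores the value in the side table under the object's address.
void setCateName(const Person *p, const std::string &name) {
    assert(p != nullptr);
    g_cateNames[p] = name;
}

// "Getter": reads the same slot; returns an empty string if nothing was stored.
std::string cateName(const Person *p) {
    auto it = g_cateNames.find(p);
    return it == g_cateNames.end() ? std::string() : it->second;
}
```

Because both functions go through the same table and the same object address, the getter returns exactly what the setter stored, which is the role the associated object plays in a category.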

1.3 Simple use of associated objects

Set properties

@interface WYPerson (cate)

@property (nonatomic,copy) NSString *cate_name;
@end

Description:

  • A property declared this way in a category only gets setter/getter declarations. There is no implementation and no _cate_name member variable
  • So we cannot use the setter, the getter, or _cate_name directly
  • Even if we implement the methods by hand, there is no member variable for them to use
  • The next step is therefore to use an associated object inside the custom methods

Custom setter implementation

- (void)setCate_name:(NSString *)cate_name {
    /* 1: object  2: identifier  3: value  4: policy */
    objc_setAssociatedObject(self, "cate_name", cate_name, OBJC_ASSOCIATION_COPY_NONATOMIC);
}

Description:

  • objc_setAssociatedObject takes four arguments: the object, the identifier (key), the value, and the policy
  • The value is the property value that we store and retrieve
  • The object and the identifier together are used to look the value up
  • The identifier only serves to tell associations apart; it is usually written to match the property name
  • The policy describes how the value is held, much like the attributes of a property
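One detail worth illustrating is that the identifier is a const void *, so it is matched by address rather than by content. The sketch below assumes a simplified per-object table; AssocTable, setValue, hasValue, kNameKey, and kAgeKey are hypothetical names. It shows that two keys with identical contents are still different keys because their addresses differ.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical keys: only their addresses matter, not their contents.
static const char kNameKey = 0;
static const char kAgeKey = 0;

// Sketch of a per-object table keyed by pointer identity.
using AssocTable = std::map<const void *, std::string>;

void setValue(AssocTable &table, const void *key, const std::string &value) {
    assert(key != nullptr);
    table[key] = value;   // keys are compared by address, never by pointed-to bytes
}

bool hasValue(const AssocTable &table, const void *key) {
    return table.count(key) != 0;
}
```

This is why the setter and the getter must pass the same pointer. A common way to guarantee that is to take the address of one static variable and use it in both methods.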

Custom getter implementation

- (NSString *)cate_name {
    /* 1: object  2: identifier */
    return objc_getAssociatedObject(self, "cate_name");
}

Description:

  • The value is fetched directly with the object and the identifier

Policies:

/**
 * Policies related to associative references.
 * These are options to objc_setAssociatedObject()
 */
typedef OBJC_ENUM(uintptr_t, objc_AssociationPolicy) {
    OBJC_ASSOCIATION_ASSIGN = 0,           /**< Specifies a weak reference to the associated object. */
    OBJC_ASSOCIATION_RETAIN_NONATOMIC = 1, /**< Specifies a strong reference to the associated object. 
                                            *   The association is not made atomically. */
    OBJC_ASSOCIATION_COPY_NONATOMIC = 3,   /**< Specifies that the associated object is copied. 
                                            *   The association is not made atomically. */
    OBJC_ASSOCIATION_RETAIN = 01401,       /**< Specifies a strong reference to the associated object.
                                            *   The association is made atomically. */
    OBJC_ASSOCIATION_COPY = 01403          /**< Specifies that the associated object is copied.
                                            *   The association is made atomically. */
};
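  • From top to bottom, the policies correspond to assign, retain/nonatomic, copy/nonatomic, retain (atomic), and copy (atomic). The header comments describe each one. Note that OBJC_ASSOCIATION_ASSIGN does not zero the reference when the associated object is deallocated, so despite the word "weak" in its comment it behaves like assign, not like a zeroing weak property.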
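The numeric values in the enum are easier to read once the octal literals are converted. The sketch below copies the five values from the enum and checks how they relate; the split into a low byte and "extra flags" (storageKind and extraFlags) is an assumption made for illustration, not something taken from the header.

```cpp
#include <cstdint>

// Values copied from the enum above (01401 and 01403 are octal literals).
constexpr std::uintptr_t kAssign          = 0;
constexpr std::uintptr_t kRetainNonatomic = 1;
constexpr std::uintptr_t kCopyNonatomic   = 3;
constexpr std::uintptr_t kRetain          = 01401;
constexpr std::uintptr_t kCopy            = 01403;

// Assumed split, for illustration only: the low byte says how the value is
// stored, and the remaining bits are extra flags set by the atomic variants.
constexpr std::uintptr_t storageKind(std::uintptr_t policy) {
    return policy & 0xFF;
}
constexpr std::uintptr_t extraFlags(std::uintptr_t policy) {
    return policy & ~std::uintptr_t{0xFF};
}

static_assert(kRetain == 0x301, "octal 01401 is hex 0x301");
static_assert(kCopy == 0x303, "octal 01403 is hex 0x303");
static_assert(storageKind(kRetain) == kRetainNonatomic, "same low byte as retain/nonatomic");
static_assert(storageKind(kCopy) == kCopyNonatomic, "same low byte as copy/nonatomic");
static_assert(extraFlags(kRetainNonatomic) == 0, "nonatomic variants set no extra flags");
static_assert(extraFlags(kAssign) == 0, "assign sets no extra flags");
```

Read this way, the atomic policies are the nonatomic ones plus the same two extra bits (0x300).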

Removing an associated object: set the value to nil.

Note: the underlying analysis in section 2 shows why this removes the association.

Source:

- (void)removeAssociation{
    objc_setAssociatedObject(self, "cate_name", nil, OBJC_ASSOCIATION_COPY_NONATOMIC);
}

Code verification:

WYPerson *person = [[WYPerson alloc] init];
person.cate_name = @" ";
NSLog(@"Before removing the association %@", person.cate_name);
// Remove the associated object
[person removeAssociation];
NSLog(@"After removing the association %@", person.cate_name);

Output:

2021-10-21 10:27:50.071130+0800 Associated object [54602:771902] After removing the association (null)

2. Low-level analysis of set process of associated objects

The goal is to find out where the value is stored and how it is stored.

2.1 objc_setAssociatedObject

As shown above, the upper layer sets the property through objc_setAssociatedObject, so we start there and work down to the bottom.

In the objc source it looks like this:

void
objc_setAssociatedObject(id object, const void *key, id value, objc_AssociationPolicy policy)
{
    SetAssocHook.get()(object,key,value,policy);
    
}

This is not a direct function call. SetAssocHook.get() is called first and whatever it returns is then invoked with the four arguments, so we need to see what get() returns.

Continue with the source:

static ChainedHookFunction<objc_hook_setAssociatedObject> SetAssocHook{_base_objc_setAssociatedObject};

template <typename Fn>
class ChainedHookFunction {
    std::atomic<Fn> hook{nil};

public:
    ChainedHookFunction(Fn f) : hook{f} { };

    Fn get() {
        return hook.load(std::memory_order_acquire);
    }

    void set(Fn newValue, Fn *oldVariable)
    {
        Fn oldValue = hook.load(std::memory_order_relaxed);
        do {
            *oldVariable = oldValue;
        } while (!hook.compare_exchange_weak(oldValue, newValue,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
    }
};

get() returns the stored function, which is initialized to _base_objc_setAssociatedObject.

So the call can be written like this:

void
objc_setAssociatedObject(id object, const void *key, id value, objc_AssociationPolicy policy)
{
//    SetAssocHook.get()(object,key,value,policy);
    _base_objc_setAssociatedObject(object, key, value, policy);
}

Description:

  • At the bottom layer, objc_setAssociatedObject calls SetAssocHook.get() internally
  • Like a setter, this is an interface-style design: the external interface stays the same while the implementation behind it can be replaced, because the call goes through whatever function the hook currently holds
  • That is, SetAssocHook.get() returns a function, and that function is what gets executed
  • So we look inside SetAssocHook to see what get() returns
  • By default, SetAssocHook.get() returns _base_objc_setAssociatedObject, so the two calls are equivalent
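The hook pattern described above can be tried out in isolation. The following is a simplified sketch modeled on the ChainedHookFunction shown earlier, not the runtime code itself: HookSlot, SetHook, baseSet, wrappingSet, publicSet, and installWrappingHook are hypothetical names, and the hook signature is reduced to a single int so the effect is easy to observe.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical hook signature; the real one takes (object, key, value, policy).
using SetHook = int (*)(int value);

static int baseSet(int value) { return value; }   // default implementation
static SetHook g_previous = nullptr;              // filled in when a hook is installed
static int wrappingSet(int value) {               // replacement that chains to the old hook
    return g_previous(value) + 1000;
}

class HookSlot {
    std::atomic<SetHook> hook;
public:
    explicit HookSlot(SetHook f) : hook(f) {}
    SetHook get() const { return hook.load(std::memory_order_acquire); }
    // Install newValue and hand the previous hook back through oldVariable.
    void set(SetHook newValue, SetHook *oldVariable) {
        assert(oldVariable != nullptr);
        SetHook oldValue = hook.load(std::memory_order_relaxed);
        do {
            *oldVariable = oldValue;
        } while (!hook.compare_exchange_weak(oldValue, newValue,
                                             std::memory_order_release,
                                             std::memory_order_relaxed));
    }
};

static HookSlot g_setHook{baseSet};

// Public entry point: always calls whatever the slot currently holds.
int publicSet(int value) { return g_setHook.get()(value); }

void installWrappingHook() { g_setHook.set(wrappingSet, &g_previous); }
```

Before installWrappingHook() is called, publicSet behaves exactly like baseSet. Afterwards the same public function runs the replacement, which can still reach the original through the saved pointer.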

2.2 _base_objc_setAssociatedObject

This function simply forwards to _object_set_associative_reference, which holds the actual implementation. The next step is to analyze that function.

static void
_base_objc_setAssociatedObject(id object, const void *key, id value, objc_AssociationPolicy policy)
{
  _object_set_associative_reference(object, key, value, policy);
}

2.3 _object_set_associative_reference

This is the most important code for setting the value. The comments are fairly clear, so the logic is organized below for explanation.

Source:

/*
 Find the bucket: update the key-value pair if it exists, insert it if not.
 The first parameter, object, is the key of the first-level map; the second
 parameter, key, is the key of the second-level map; the third and fourth
 parameters together form the value stored in the second-level map.
 */
void
_object_set_associative_reference(id object, const void *key, id value, uintptr_t policy)
{
    // This code used to work when nil was passed for object and key. Some code
    // probably relies on that to not crash. Check and handle it explicitly.
    // rdar://problem/44094390
    if (!object && !value) return;

    if (object->getIsa()->forbidsAssociatedObjects())
        _objc_fatal("objc_setAssociatedObject called on instance (%p) of class %s which does not allow associated objects", object, object_getClassName(object));

    // Wrap the objc_object in a DisguisedPtr; it is used as the first-level key.
    DisguisedPtr<objc_object> disguised{(objc_object *)object};
    // Wrap the policy and the value together; this is what finally gets stored.
    ObjcAssociation association{policy, value};

    // retain the new value (if any) outside the lock.
    association.acquireValue();

    {
        // The manager is not unique and can be created many times. Its
        // constructor takes a lock and its destructor releases it, so only one
        // manager is in use at a time across threads.
        AssociationsManager manager;
        // The hash map records every associated object in the process. It is
        // unique and is obtained through the manager.
        AssociationsHashMap &associations(manager.get());

        if (value) {
            // First argument: the disguised object; second: an empty ObjectAssociationMap.
            // Returns the existing bucket, or inserts a new one and returns it.
            auto refs_result = associations.try_emplace(disguised, ObjectAssociationMap{});
            // second is true when the bucket was newly inserted.
            if (refs_result.second) {
                /* it's the first association we make */
                // Mark the object in isa.
                object->setHasAssociatedObjects();
            }

            /* establish or replace the association */
            auto &refs = refs_result.first->second;
            // Second-level lookup, keyed by the const void *key.
            auto result = refs.try_emplace(key, std::move(association));
            if (!result.second) {
                // The key already existed: swap in the new association.
                association.swap(result.first->second);
            }
        } else {
            // value is nil: remove the association.
            auto refs_it = associations.find(disguised);
            if (refs_it != associations.end()) {
                auto &refs = refs_it->second;
                auto it = refs.find(key);
                if (it != refs.end()) {
                    association.swap(it->second);
                    refs.erase(it);
                    if (refs.size() == 0) {
                        associations.erase(refs_it);
                    }
                }
            }
        }
    }

    // release the old value (outside of the lock).
    association.releaseHeldValue();
}

Description:

  1. Initialization
    1. The object is wrapped in a DisguisedPtr. The details of the wrapper do not matter here; what matters is that the first level is looked up by object and the second level by key.
    2. The ObjcAssociation is what finally gets stored. It holds the value and the policy together.
    3. The object and key that we pass in locate the final slot, and the value and policy are stored there.
    4. acquireValue() processes the value according to the policy; only the retain and copy policies get special handling.
  2. Get the hash table
    1. The manager class is not globally unique; the hash map is. Keep the two apart.
    2. Manager objects can be created many times. The constructor takes a lock and the destructor releases it, so only one manager is alive at a time, but that does not stop new managers from being created later.
    3. The lock does not make anything unique. It only prevents several threads from operating on the table at once; outside the locked scope, managers can be created again and again.
    4. The hash map is a global table that records every associated object in the process. It is obtained from a static variable, so it is unique.
    5. The hash map stores key-value pairs: the key is the DisguisedPtr for the object, and the value is an ObjectAssociationMap, the second-level hash table.
    6. All we do with each hash table is find the bucket and store the data.
  3. Look up the first-level bucket
    1. The bucket is looked up with the DisguisedPtr as the key.
    2. An empty ObjectAssociationMap is passed in as well, because a new bucket has to be created with it when none is found.
    3. Of the return value, only the bucket and the bool matter here.
    4. first points to the bucket, whose second member is the ObjectAssociationMap.
    5. second is a bool that reports whether the bucket was newly inserted. If it was, this is the object's first association, and the object is marked in isa.
  4. Insert or update the key-value pair
    1. If the key is not yet in the ObjectAssociationMap, a new association is inserted.
    2. If the key already exists, the stored association is swapped with the new one, and the old value is released after the lock is dropped.
    3. The call used is the same try_emplace as in the first-level lookup.
  5. Passing nil removes the association
    1. The object's ObjectAssociationMap is found in the AssociationsHashMap through the DisguisedPtr, and the key is looked up in it.
    2. If an entry is found, it is erased; if the ObjectAssociationMap becomes empty, it is erased from the AssociationsHashMap as well.
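The flow above can be sketched with standard containers. This is a simplified model under stated assumptions, not the runtime code: std::unordered_map replaces the runtime's hash map, the value is a std::string, the policy and locking are left out, and InnerMap, OuterMap, and setAssociation are hypothetical names. It needs C++17 for try_emplace.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical simplified types: the value is a string and the policy is omitted.
using InnerMap = std::unordered_map<const void *, std::string>;   // key -> value
using OuterMap = std::unordered_map<const void *, InnerMap>;      // object -> InnerMap

// Pass a null `value` to remove. Returns true when this call created the
// object's first association (the point where the runtime marks isa).
bool setAssociation(OuterMap &table, const void *object, const void *key,
                    const std::string *value) {
    assert(object != nullptr);
    if (value) {
        auto outer = table.try_emplace(object, InnerMap{});   // find or create level 1
        bool firstForObject = outer.second;
        InnerMap &refs = outer.first->second;
        auto inner = refs.try_emplace(key, *value);           // find or create level 2
        if (!inner.second) inner.first->second = *value;      // key existed: replace
        return firstForObject;
    }
    auto outerIt = table.find(object);                        // null value: remove
    if (outerIt != table.end()) {
        outerIt->second.erase(key);
        if (outerIt->second.empty()) table.erase(outerIt);    // drop the empty inner map
    }
    return false;
}
```

Setting two keys on one object and then removing both leaves the outer table empty again, which mirrors the erase of the ObjectAssociationMap once its size reaches zero.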

Schematic diagram:

2.4 AssociationsManager

Source:

// class AssociationsManager manages a lock / hash table singleton pair.
// Allocating an instance acquires the lock

class AssociationsManager {
    using Storage = ExplicitInitDenseMap<DisguisedPtr<objc_object>, ObjectAssociationMap>;
    // Obtained through a static variable, so it is unique
    static Storage _mapStorage;

public:
    // Constructor: takes the lock
    AssociationsManager()   { AssociationsManagerLock.lock(); }
    // Destructor: releases the lock
    ~AssociationsManager()  { AssociationsManagerLock.unlock(); }

    AssociationsHashMap &get() {
        return _mapStorage.get();
    }

    static void init() {
        _mapStorage.init();
    }
};

Description:

  • Creating an AssociationsManager takes the lock, and the lock is released when the manager is destroyed. While one manager is alive, no other manager can operate on the hash table
  • The AssociationsHashMap is obtained from _mapStorage, which is a static variable and therefore globally unique
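The split between a throwaway manager and a single shared table can be sketched as follows. This is an illustration under assumptions, not the runtime code: Manager, Table, store, and tableAddressSeenByNewManager are hypothetical names, std::mutex replaces AssociationsManagerLock, and a std::map replaces the hash map.

```cpp
#include <map>
#include <mutex>

using Table = std::map<const void *, int>;   // hypothetical stand-in for the hash map

class Manager {
    static std::mutex s_lock;    // shared by every Manager instance
    static Table s_storage;      // one table for the whole program
public:
    Manager()  { s_lock.lock(); }     // constructing a manager takes the lock
    ~Manager() { s_lock.unlock(); }   // destroying it releases the lock
    Manager(const Manager &) = delete;
    Manager &operator=(const Manager &) = delete;
    Table &get() { return s_storage; }
};

std::mutex Manager::s_lock;
Table Manager::s_storage;

// Each call builds a fresh Manager, yet all of them expose the same table.
const Table *tableAddressSeenByNewManager() {
    Manager manager;
    return &manager.get();
}

void store(const void *object, int value) {
    Manager manager;                  // lock held until the end of this scope
    manager.get()[object] = value;
}
```

The manager only scopes the lock. The table's uniqueness comes from the static member, so a value written through one manager is visible through the next one. Note that two Manager objects must not be alive on the same thread at once in this sketch, since std::mutex is not recursive.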

2.5 try_emplace

Source:

// Inserts key,value pair into the map if the key isn't already in the map.
// The value is constructed in-place if the key is not in the map, otherwise
// it is not moved.
template <typename... Ts>
std::pair<iterator, bool> try_emplace(KeyT &&Key, Ts &&... Args) {
  BucketT *TheBucket;
  // Look up the bucket; if it exists, return it together with false
  if (LookupBucketFor(Key, TheBucket))
    return std::make_pair(
             makeIterator(TheBucket, getBucketsEnd(), true),
             false); // Already in map.

  // Otherwise, insert the new element and return true
  TheBucket =
      InsertIntoBucket(TheBucket, std::move(Key), std::forward<Ts>(Args)...);
  return std::make_pair(
           makeIterator(TheBucket, getBucketsEnd(), true),
           true);
}

// Inserts key,value pair into the map if the key isn't already in the map.
// The value is constructed in-place if the key is not in the map, otherwise
// it is not moved.
/*
 1. Look up the bucket by key; if found, return the bucket and false
 2. If not found, create a bucket, insert it, and return it with true
 */
template <typename... Ts>
std::pair<iterator, bool> try_emplace(const KeyT &Key, Ts &&... Args) {
  BucketT *TheBucket;
  // Find the bucket for Key
  if (LookupBucketFor(Key, TheBucket))
    return std::make_pair(
             makeIterator(TheBucket, getBucketsEnd(), true),
             false); // Already in map.

  // Otherwise, insert the new element.
  TheBucket = InsertIntoBucket(TheBucket, Key, std::forward<Ts>(Args)...);
  return std::make_pair(
           makeIterator(TheBucket, getBucketsEnd(), true),
           true); // A new bucket was inserted
}

Description:

  • try_emplace is overloaded; the two versions are distinguished by the type of the key parameter
  • The first version takes the key as KeyT && (an rvalue that can be moved into the bucket), and the second takes it as const KeyT & (the key is copied). In _object_set_associative_reference both the first-level key and the second-level key are passed as named variables, so both lookups resolve to the const KeyT & version
  • The two implementations are otherwise basically the same
  • LookupBucketFor is called to check whether a bucket for the key exists. It returns true and the bucket if so, or false and an empty bucket (or nullptr if the table has no buckets) if not
  • If the bucket exists, make_pair returns an iterator to it together with false, meaning nothing new was added
  • If not, InsertIntoBucket fills in a bucket with the key and value, and make_pair returns an iterator to it together with true, meaning a new bucket was added
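The standard library offers a try_emplace with the same return convention, which makes the behavior easy to check. The sketch below uses std::unordered_map (C++17) rather than the runtime's map, and insertIfAbsent is a hypothetical helper name.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

// Returns true if `key` was newly inserted; an existing value is left untouched.
bool insertIfAbsent(std::unordered_map<int, std::string> &map, int key,
                    std::string value) {
    auto result = map.try_emplace(key, std::move(value));
    // result.first  : iterator to the element for `key` (new or existing)
    // result.second : true only when a new element was inserted
    return result.second;
}
```

A second call with the same key returns false and keeps the first value. That is why the set flow has to swap the new association in explicitly when try_emplace reports that the key already existed.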

2.6 LookupBucketFor

Source:

/*
 Find the bucket corresponding to Val and return it through FoundBucket.
 Returns true if the bucket contains the key and a value, otherwise false.
 */
template<typename LookupKeyT>
bool LookupBucketFor(const LookupKeyT &Val,
                     const BucketT *&FoundBucket) const {
  const BucketT *BucketsPtr = getBuckets();
  const unsigned NumBuckets = getNumBuckets();

  if (NumBuckets == 0) {
    FoundBucket = nullptr;
    return false;
  }

  // FoundTombstone - Keep track of whether we find a tombstone while probing.
  const BucketT *FoundTombstone = nullptr;
  const KeyT EmptyKey = getEmptyKey();
  const KeyT TombstoneKey = getTombstoneKey();
  assert(!KeyInfoT::isEqual(Val, EmptyKey) &&
         !KeyInfoT::isEqual(Val, TombstoneKey) &&
         "Empty/Tombstone value shouldn't be inserted into map!");

  // Hash Val to get the starting bucket index
  unsigned BucketNo = getHashValue(Val) & (NumBuckets-1);
  unsigned ProbeAmt = 1;
  while (true) {
    const BucketT *ThisBucket = BucketsPtr + BucketNo;
    // Found Val's bucket?  If so, return it.
    if (LLVM_LIKELY(KeyInfoT::isEqual(Val, ThisBucket->getFirst()))) {
      FoundBucket = ThisBucket;
      return true;
    }

    // If we found an empty bucket, the key doesn't exist in the set.
    // Insert it and return the default value.
    if (LLVM_LIKELY(KeyInfoT::isEqual(ThisBucket->getFirst(), EmptyKey))) {
      // If we've already seen a tombstone while probing, fill it in instead
      // of the empty bucket we eventually probed to.
      FoundBucket = FoundTombstone ? FoundTombstone : ThisBucket;
      return false;
    }

    // If this is a tombstone, remember it.  If Val ends up not in the map, we
    // prefer to return it than something that would require more probing.
    // Ditto for zero values.
    if (KeyInfoT::isEqual(ThisBucket->getFirst(), TombstoneKey) &&
        !FoundTombstone)
      FoundTombstone = ThisBucket;  // Remember the first tombstone found.
    if (ValueInfoT::isPurgeable(ThisBucket->getSecond())  &&  !FoundTombstone)
      FoundTombstone = ThisBucket;

    // Otherwise, it's a hash collision or a tombstone, continue quadratic
    // probing.
    if (ProbeAmt > NumBuckets) {
      FatalCorruptHashTables(BucketsPtr, NumBuckets);
    }
    BucketNo += ProbeAmt++;
    BucketNo &= (NumBuckets-1);
  }
}

// Returns two things: the bucket through FoundBucket, and a bool telling
// whether the key and value are present. Here BucketT is not const, which
// matches the BucketT * passed in by the outside caller.
template <typename LookupKeyT>
bool LookupBucketFor(const LookupKeyT &Val, BucketT *&FoundBucket) {
  // Calls the const overload above through a const BucketT * temporary,
  // then casts the result back for the caller.
  const BucketT *ConstFoundBucket;
  bool Result = const_cast<const DenseMapBase *>(this)
    ->LookupBucketFor(Val, ConstFoundBucket);
  FoundBucket = const_cast<BucketT *>(ConstFoundBucket);
  return Result;
}

Description:

  • LookupBucketFor is also overloaded; the parameter types show that the outside call, which passes a non-const BucketT *, reaches the second function
  • The second function simply calls the first, so the first is the one to analyze
  • The first function hashes the Val passed in to get a starting index, probes from there until it finds the bucket, and returns the bucket through FoundBucket. It returns true if the bucket holds the key and a value, false otherwise
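The probing step at the end of the loop (BucketNo += ProbeAmt++, then masking with NumBuckets-1) can be examined on its own. The sketch below reproduces just that index arithmetic; probeSequence is a hypothetical helper, and it assumes, as the masking implies, that the bucket count is a power of two.

```cpp
#include <cassert>
#include <vector>

// Returns the bucket indices visited, in order, for a table of `numBuckets`
// slots (must be a power of two), starting from `hash`.
std::vector<unsigned> probeSequence(unsigned hash, unsigned numBuckets) {
    assert(numBuckets != 0 && (numBuckets & (numBuckets - 1)) == 0);
    std::vector<unsigned> visited;
    unsigned bucketNo = hash & (numBuckets - 1);   // same masking as the source
    unsigned probeAmt = 1;
    for (unsigned i = 0; i < numBuckets; ++i) {
        visited.push_back(bucketNo);
        bucketNo += probeAmt++;                    // the step grows by one each time
        bucketNo &= (numBuckets - 1);              // wrap around the table
    }
    return visited;
}
```

For eight buckets starting at index 0 the sequence is 0, 1, 3, 6, 2, 7, 5, 4: every bucket is visited exactly once, so a lookup either finds the key or reaches an empty bucket within NumBuckets probes.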

2.7 Summary

  • The outermost layer is a globally unique static hash table that stores all associated objects. It has many buckets, each holding a key-value pair of an object and its association table
  • Each association table also has many buckets, each holding a key-value pair whose key is the identifier
  • So there are two levels of hashing, much like a two-dimensional array: the first dimension selects the object and the second selects the property
  • Setting the value of an associated object to nil removes the association

Storage structure diagram:

3. The underlying analysis of the GET process of associated objects

Reading goes through objc_getAssociatedObject, so we analyze the get flow starting from that function. It is essentially a two-level lookup: the first level is searched by object and the second by key.

3.1 objc_getAssociatedObject

The underlying implementation is _object_get_associative_reference

id
objc_getAssociatedObject(id object, const void *key)
{
    return _object_get_associative_reference(object, key);
}

3.2 _object_get_associative_reference

Source:

/* key is the identifier. */
id
_object_get_associative_reference(id object, const void *key)
{
    // Start with an empty association
    ObjcAssociation association{};

    {
        // Get the unique hash map
        AssociationsManager manager;
        AssociationsHashMap &associations(manager.get());
        // Iterator to the first-level bucket for this object
        AssociationsHashMap::iterator i = associations.find((objc_object *)object);
        if (i != associations.end()) {
            // The object's association map, which is the second level
            ObjectAssociationMap &refs = i->second;
            ObjectAssociationMap::iterator j = refs.find(key);
            if (j != refs.end()) {
                association = j->second;
                // Retain the value if the policy requires it
                association.retainReturnedValue();
            }
        }
    }

    // Return the value
    return association.autoreleaseReturnedValue();
}

Description:

  1. As shown above, associated objects are stored in a two-level hash table, so two lookups are needed to reach the final value
  2. The first lookup uses object and the second uses key
  3. Finally, the value is taken from the association and returned

3.3 Summary

  • Get the globally unique hash table, then use find to locate the bucket at each level
  • First, find the ObjectAssociationMap at the first level through the object
  • Then use the identifier to find the association in it
  • The association at the second level holds the value and the policy
  • Finally, return the value held by the association
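The get flow summarized above amounts to two find calls. Here is a minimal sketch with standard containers, under the same simplifications as before: std::unordered_map replaces the runtime's map, the value is a std::string, and retain/autorelease and locking are left out. InnerMap, OuterMap, and getAssociation are hypothetical names.

```cpp
#include <string>
#include <unordered_map>

using InnerMap = std::unordered_map<const void *, std::string>;   // key -> value
using OuterMap = std::unordered_map<const void *, InnerMap>;      // object -> InnerMap

// Returns a pointer to the stored value, or nullptr when either level misses.
const std::string *getAssociation(const OuterMap &table, const void *object,
                                  const void *key) {
    auto i = table.find(object);          // level 1: look up by object
    if (i == table.end()) return nullptr;
    const InnerMap &refs = i->second;
    auto j = refs.find(key);              // level 2: look up by key
    if (j == refs.end()) return nullptr;
    return &j->second;
}
```

A miss at either level returns nullptr, which corresponds to the getter returning nil for an object or key that has no association.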

4. Verify the uniqueness of AssociationsHashMap

Many people think that the lock is what makes the AssociationsManager unique, so we remove the lock and check whether the AssociationsHashMap obtained is still unique.

As shown below, the AssociationsHashMap is still unique after the lock is removed. It is unique not because of the lock but because it comes from a static variable; the lock only keeps multiple threads from using managers at the same time.

Remove the lock:

Define a manager and a Map

Debug to see if two are unique

Conclusion:

  1. The table is still unique after the lock is removed, so uniqueness has nothing to do with the lock. The lock prevents multiple threads from operating on the hash table at the same time. Managers can be created multiple times.
  2. The hash table is unique because it is obtained from _mapStorage, which is a static variable

5. Summary

  • Associated objects tie together the value that two functions operate on. They are most often used for setters and getters in categories
  • Underneath, associated objects are stored in a global hash table with two levels: the first level uses the object as the key and a second hash table as the value, and the second level uses the identifier as the key and the value plus policy as the value
  • When setting an associated object, passing nil as the value removes the association from the hash table
  • Associated objects are not limited to categories; they can be used wherever a value needs to be stored for an object
  • When an object first gets an associated object, a mark is set in isa. When the object is destroyed, the mark is checked: if it is set, the object's associated objects are removed from the hash table before the object itself is released.
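The last point, marking the object and cleaning up when it goes away, can be sketched as follows. This is an illustration under assumptions rather than the runtime code: Object, hasAssoc, and g_table are hypothetical names, a bool member stands in for the bit kept in isa, and a C++ destructor stands in for object deallocation.

```cpp
#include <string>
#include <unordered_map>

using InnerMap = std::unordered_map<const void *, std::string>;
static std::unordered_map<const void *, InnerMap> g_table;   // hypothetical global table

struct Object {
    bool hasAssoc = false;     // stands in for the flag kept in isa

    void associate(const void *key, const std::string &value) {
        g_table[this][key] = value;
        hasAssoc = true;       // mark the object once an association exists
    }

    ~Object() {
        // Only touch the global table when the flag says there is something to remove.
        if (hasAssoc) g_table.erase(this);
    }
};
```

The flag lets objects that never had an association skip the table lookup entirely on destruction, while objects that did have one never leave stale entries behind.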