In previous articles, we looked at ISA and class_data_bits_t in Class structures. Today, we’ll look at cache_t

What is cache_t

In previous articles, objc_class had a cache_t member cache inside it. What exactly does it do? Today we’re going to explore that. The cache is used to cache the collection of recently invoked methods for faster and more efficient reading and improved performance. How does that work? Let’s look at the source code now

The structure of cache_t

Struct cache_t {#if CACHE_MASK_STORAGE == cache_mask_storage_statements //explicit_atomic is an operation to convert ordinary Pointers to atomic Pointers for thread-safety. explicit_atomic<struct bucket_t *> _buckets; explicit_atomic<mask_t> _mask; #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 explicit_atomic<uintptr_t> _maskAndBuckets; mask_t _mask_unused; // How much the mask is shifted by. static constexpr uintptr_t maskShift = 48; // Additional bits after the mask which must be zero. msgSend // takes advantage of these additional bits to construct the value // `mask << 4` from `_maskAndBuckets` in a single instruction. static constexpr uintptr_t maskZeroBits = 4; // The largest mask value we can store. static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1; // The mask applied to `_maskAndBuckets` to retrieve the buckets pointer. static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1; // Ensure we have enough bits for the buckets pointer. static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers."); #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4 // _maskAndBuckets stores the mask shift in the low 4 bits, and // the buckets pointer in the remainder of the value. The mask // shift is the value where (0xffff >> shift) produces the correct // mask. This is equal to 16 - log2(cache_size). explicit_atomic<uintptr_t> _maskAndBuckets; mask_t _mask_unused; static constexpr uintptr_t maskBits = 4; static constexpr uintptr_t maskMask = (1 << maskBits) - 1; static constexpr uintptr_t bucketsMask = ~maskMask; #else #error Unknown cache mask storage type. #endif #if __LP64__ uint16_t _flags; #endif uint16_t _occupied; public: static bucket_t *emptyBuckets(); struct bucket_t *buckets(); mask_t mask(); mask_t occupied(); void incrementOccupied(); void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask); void initializeToEmpty(); unsigned capacity(); bool isConstantEmptyCache(); bool canBeFreed(); #if __LP64__ bool getBit(uint16_t flags) const { return _flags & flags; } void setBit(uint16_t set) { __c11_atomic_fetch_or((_Atomic(uint16_t) *)&_flags, set, __ATOMIC_RELAXED); } void clearBit(uint16_t clear) { __c11_atomic_fetch_and((_Atomic(uint16_t) *)&_flags, ~clear, __ATOMIC_RELAXED); } #endif #if FAST_CACHE_ALLOC_MASK bool hasFastInstanceSize(size_t extra) const { if (__builtin_constant_p(extra) && extra == 0) { return _flags & FAST_CACHE_ALLOC_MASK16; } return _flags & FAST_CACHE_ALLOC_MASK; } size_t fastInstanceSize(size_t extra) const { ASSERT(hasFastInstanceSize(extra)); if (__builtin_constant_p(extra) && extra == 0) { return _flags & FAST_CACHE_ALLOC_MASK16; } else { size_t size = _flags & FAST_CACHE_ALLOC_MASK; // remove the FAST_CACHE_ALLOC_DELTA16 that was added // by setFastInstanceSize return align16(size + extra - FAST_CACHE_ALLOC_DELTA16); } } void setFastInstanceSize(size_t newSize) { // Set during realization or construction only. No locking needed. uint16_t newBits = _flags & ~FAST_CACHE_ALLOC_MASK; uint16_t sizeBits; // Adding FAST_CACHE_ALLOC_DELTA16 allows for FAST_CACHE_ALLOC_MASK16 // to yield the proper 16byte aligned allocation size with a single mask sizeBits = word_align(newSize) + FAST_CACHE_ALLOC_DELTA16; sizeBits &= FAST_CACHE_ALLOC_MASK; if (newSize <= sizeBits) { newBits |= sizeBits; } _flags = newBits; } #else bool hasFastInstanceSize(size_t extra) const { return false; } size_t fastInstanceSize(size_t extra) const { abort(); } void setFastInstanceSize(size_t extra) { // nothing } #endif static size_t bytesForCapacity(uint32_t cap); static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap); void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld); void insert(Class cls, SEL sel, IMP imp, id receiver); static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn, cold)); };Copy the code

Buckets is a collection of the bucket_t struct type. Buckets is a collection of bucket_t struct types

struct bucket_t { private: // IMP-first is better for arm64e ptrauth and no worse for arm64. // SEL-first is better for armv7* and i386 and x86_64.  #if __arm64__ explicit_atomic<uintptr_t> _imp; explicit_atomic<SEL> _sel; #else explicit_atomic<SEL> _sel; explicit_atomic<uintptr_t> _imp; #endif // Compute the ptrauth signing modifier from &_imp, newSel, and cls. uintptr_t modifierForSEL(SEL newSel, Class cls) const { return (uintptr_t)&_imp ^ (uintptr_t)newSel ^ (uintptr_t)cls; } // omit part...... public: inline SEL sel() const { return _sel.load(memory_order::memory_order_relaxed); } inline IMP imp(Class cls) const { uintptr_t imp = _imp.load(memory_order::memory_order_relaxed); if (! imp) return nil; // omit part...... template <Atomicity, IMPEncoding> void set(SEL newSel, IMP newImp, Class cls); }; }Copy the code

Here we see that the order of the arm64 and x86 structures is different, but the function is the same. We have distilled the key information to get the following figureAmong themcache_tThe meanings are as follows:

  • Buckets/storage
  • Mask Mask/mask
  • Occupied; Actual size in use
  • Capacity Allocated capacity

A partial method of cache_t

  • Bucket () gets all SEL/IMP information in the cache
  • Capacity () Obtains the allocated capacity
  • Occupied () Gets the actual occupied size
  • IncrementOccupied () increments the actual occupied space by 1
  • Reallocate () opens up memory space
  • Insert SEL/IMP into insect()

Buckets is a bucket_T structure, which contains SEL and IMP

In field

Next we’ll prepare some examples. First we’ll create a Person class, as follows


@interface Person : NSObject

- (void)sayHello;
- (void)sayHello1;
- (void)sayHello2;
- (void)sayHello3;
- (void)sayHello4;
- (void)sayHello5;

@end

@implementation Person
- (void)sayHello {
    NSLog(@"%s", __func__);
}

- (void)sayHello1 {
    NSLog(@"%s", __func__);
}

- (void)sayHello2 {
    NSLog(@"%s", __func__);
}

- (void)sayHello3 {
    NSLog(@"%s", __func__);
}

- (void)sayHello4 {
    NSLog(@"%s", __func__);
}

- (void)sayHello5 {
    NSLog(@"%s", __func__);
}

@end
Copy the code

Create a Person object in the main function and call the method

   Person *person = [Person alloc];
   Class p = [Person class];
   [person sayHello];
   [person sayHello1];
   [person sayHello2];
   [person sayHello3];
   [person sayHello4];
   [person sayHello5];
Copy the code

Then run the program and make a breakpoint before calling the methodSo let’s exploreisainsidecacheFirst print the address of the Person class

(lldb) p/x p
Copy the code

The output

(Class) $0 = 0x00000001000021b8 Person
Copy the code

Then find the cache by offsetting 0x10 (0x00000001000021C8)

(lldb) p (cache_t *) 0x00000001000021c8
Copy the code

The output

(cache_t *) $1 = 0x00000001000021c8
Copy the code

Under the print cache_t

(lldb) p *$1
Copy the code

The output

(cache_t) $2 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x000000010032e420 {
      _sel = {
        std::__1::atomic<objc_selector *> = (null)
      }
      _imp = {
        std::__1::atomic<unsigned long> = 0
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 0
  }
  _flags = 32784
  _occupied = 0
}
Copy the code

You see it here_buckets,_mask,_flags,_occupied _occupied is 0. Let’s go ahead and executesayHellomethodsAt this point we’re printingThe $1

(lldb) p *$1
Copy the code

The output

(cache_t) $3 = {
  _buckets = {
    std::__1::atomic<bucket_t *> = 0x0000000100692b60 {
      _sel = {
        std::__1::atomic<objc_selector *> = ""
      }
      _imp = {
        std::__1::atomic<unsigned long> = 11496
      }
    }
  }
  _mask = {
    std::__1::atomic<unsigned int> = 3
  }
  _flags = 32784
  _occupied = 1
}
Copy the code

The buckets method is called to fetch the address

(lldb) p $3.buckets()
Copy the code

The output

(bucket_t *) $4 = 0x0000000100692b60
Copy the code

Read the contents of $4

(lldb) p *$4
Copy the code

The output

(bucket_t) $5 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 11496
  }
}
Copy the code

Get the SEL in Buckets from the previous source

(lldb) p $5.sel()
Copy the code

The output

(SEL) $6 = "sayHello"
Copy the code

Get the IMP in Buckets from the previous source code, passing the address of the Person class

(lldb) p $5.imp(p)
Copy the code

The output

(IMP) $7 = 0x0000000100000d50 (KCObjc`-[Person sayHello])
Copy the code

Here we see that the method we called before has been cached. Following the above step, we call a method again, this time we want to get the second SEL, whose debugging LLDB is as follows

(lldb) p *($4+1)
Copy the code

The output

(bucket_t) $8 = {
  _sel = {
    std::__1::atomic<objc_selector *> = ""
  }
  _imp = {
    std::__1::atomic<unsigned long> = 11304
  }
}
Copy the code

Call SEL and IMP for $8

(lldb) p $8.sel()
(SEL) $9 = "say666"
(lldb) p $8.imp(p)
(IMP) $10 = 0x0000000100000d90 (KCObjc`-[Person say666])
Copy the code

How do we make sure that the address we print is ok? We can verify it with machOViewWe see that the say666 method here is exactly the same as the address above. How to add methods to the cache? Taking a look at the source code, we find the source code for the CACHE_t insert method

#### cache_t source code and insert process ALWAYS_INLINE void cache_t::insert(Class CLS, SEL SEL, IMP IMP, id receiver) { #if CONFIG_USE_CACHE_LOCK cacheUpdateLock.assertLocked(); #else runtimeLock.assertLocked(); #endif ASSERT(sel ! = 0 && cls->isInitialized()); // Use the cache as-is if it is less than 3/4 full; Mask_t new_occupied = occupied(); unsigned oldCapacity = capacity(), capacity = oldCapacity; //slowpath compiler optimization, low probability events. When the capacity () = 0, Create Cache if (slowPath (isConstantEmptyCache())) {// Cache is read-only. Replace it.capacity The default value is 4.INIT_CACHE_SIZE= (1<<2) if (! capacity) capacity = INIT_CACHE_SIZE; reallocate(oldCapacity, capacity, /* freeOld */false); Else if (fastPath (newOccupied + CACHE_END_MARKER <= capacity /4 * 3)) {Cache is less than 3/4 full. Use it as-is.} else {// Capacity2, i.e. 4, 8, 16... capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE; if (capacity > MAX_CACHE_SIZE) { capacity = MAX_CACHE_SIZE; } reallocate(oldCapacity, capacity, true); } bucket_t *b = buckets(); mask_t m = capacity - 1; //mask=capacity - 1 mask_t begin = cache_hash(sel, m); Mask_t I = begin; mask_t I = begin; // Scan for the first unused slot and insert there. // There is guaranteed to be an empty slot because the // minimum Size is 4 and we resized at 3/4 full.do {// If (fastPath (b[I].sel() == 0)) { // Final +1; // Final +1; Set <Atomic, Encoded>(sel, imp, CLS); return; If (b[I].sel() == sel) {// The entry was added to The cache by some other thread // before we grabbed the cacheUpdateLock. return; } } while (fastpath((i = cache_next(i, m)) ! = begin)); Cache_t ::bad_cache(receiver, (sel)sel, CLS) cache_t::bad_cache(receiver, (sel)sel, CLS); }Copy the code

Through the above source code. We can see that the insertion process is divided into several steps:

  • Calculate the current occupied space

  • Create space or expand

    If no space is available, create a cache space with a default value of 4. If you’ve already opened up space. See if it’s more than 3/4. If it’s more than 3/4, create a new space twice the size of the previous one and assign it to it. Set the old buckets free

  • Find storage location

    The hash algorithm function is used to calculate the hash value of SEL and search according to the current subscript. If the current subscript does not have SEL, it indicates that the current subscript does not store SEL. Then run the set method. Returns if the current subscript already has sel and is equal to about to insert SEL. If the current subscript already has sel and is not equal to the sel to be inserted, recalculate the hash value to get a new subscript and repeat the above steps

thinking

Now let’s see how well you know by asking questions

1. When is it stored in the cache?

Objc_msgSend The first time a message is sent triggers a method lookup, which is then cached by calling cache_fill()Copy the code

2. When is the insect() method called?

Init Initializes objects, property get/set methods, method callsCopy the code

3. Why is bucket data lost?

The reason is that during capacity expansion, the original memory is cleared, and then the memory is applied for againCopy the code

4. How is the method cache cache_t implemented?

The hash table technology key-value is used to cache the method that has been called, which can improve the method search speedCopy the code

5. Why is Capacity 1 needed for mask

The index of the cache location, the index of the cache method, must be less than or equal to the length of the hash table, minus 1, so that no bounds are crossedCopy the code

The basic principles of cache_t are covered here, but the code and implementation are not posted here. If you are interested, download the source code and play around with it.