Preface

YYCache is a high-performance cache framework developed by ibireme, and it is the caching scheme used in our project. Below I walk through how YYCache is implemented: why it is so fast, how its LRU algorithm works, which locks it uses, when caches are deleted, and so on. I also point out a few issues I think the framework might have.

YYMemoryCache implementation mechanism

Remove low frequency elements first

Apple has its own caching scheme, NSCache, which combines several automatic eviction policies to ensure it does not take up too much system memory. When the system runs low on memory, these policies kick in and remove objects from the cache, but the order in which objects are removed is undefined.

YYCache uses a deletion mechanism that preferentially deletes low-frequency data. How does it do this?

YYCache is divided into YYMemoryCache and YYDiskCache. Let’s look at YYMemoryCache first.

YYMemoryCache internally maintains a doubly linked list, _YYLinkedMap. Each time data is stored, it is placed at the head of the list; when memory is tight, data is deleted from the tail. This guarantees that the evicted elements are the least recently used ones, which improves cache efficiency.

_YYLinkedMap is a doubly linked list, but the actual storage is still a dictionary.

@interface _YYLinkedMap : NSObject {
    @package
    CFMutableDictionaryRef _dic; // do not set object directly
    NSUInteger _totalCost;
    NSUInteger _totalCount;
    _YYLinkedMapNode *_head; // MRU, do not change it directly
    _YYLinkedMapNode *_tail; // LRU, do not change it directly
    BOOL _releaseOnMainThread;
    BOOL _releaseAsynchronously;
}
// ... node insert / bring-to-head / remove methods ...
@end

The actual carrier is _dic, a CFMutableDictionaryRef, the C-level mutable dictionary corresponding to NSMutableDictionary.
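For reference, a sketch of how such a CF dictionary is typically created; the kCFTypeDictionary callbacks make it retain keys and values just like NSMutableDictionary:

CFMutableDictionaryRef dic = CFDictionaryCreateMutable(
    CFAllocatorGetDefault(), 0,
    &kCFTypeDictionaryKeyCallBacks,    // retain/release keys like ObjC objects
    &kCFTypeDictionaryValueCallBacks); // retain/release values like ObjC objects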

Each node of the linked list is an object of type _YYLinkedMapNode. The structure of this object is as follows:

@interface _YYLinkedMapNode : NSObject {
    @package
    __unsafe_unretained _YYLinkedMapNode *_prev; // retained by dic
    __unsafe_unretained _YYLinkedMapNode *_next; // retained by dic
    id _key;
    id _value;
    NSUInteger _cost;
    NSTimeInterval _time;
}
@end

_prev and _next point to the previous and next nodes respectively. Note that they are __unsafe_unretained: the dictionary retains every node, so the list pointers do not need to.
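To make the LRU mechanics concrete, here is a simplified sketch of promoting a node to the head of the list on access (the real method is -bringNodeToHead: in _YYLinkedMap; some bookkeeping is trimmed here):

- (void)bringNodeToHead:(_YYLinkedMapNode *)node {
    if (_head == node) return;              // already the most recently used
    if (_tail == node) {
        _tail = node->_prev;                // detach from the tail
        _tail->_next = nil;
    } else {
        node->_next->_prev = node->_prev;   // detach from the middle
        node->_prev->_next = node->_next;
    }
    node->_next = _head;                    // relink at the head
    node->_prev = nil;
    _head->_prev = node;
    _head = node;
}

Every cache hit calls something like this, so the tail always holds the least recently used entry.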

The timing of cache deletion

YYMemoryCache provides three dimensions to control the cache:

// The maximum number of cached objects; default is NSUIntegerMax, i.e. no limit.
@property NSUInteger countLimit;
// The maximum total cost of the cache; default is NSUIntegerMax, i.e. no limit.
@property NSUInteger costLimit;
// The maximum cache age; default is DBL_MAX, i.e. no limit.
@property NSTimeInterval ageLimit;

You can set these values yourself. For example, if you set the maximum object count to 5, then whenever the cache holds more than 5 objects, YYCache automatically deletes entries until at most 5 remain.
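A minimal usage sketch (the stored values here are arbitrary):

YYMemoryCache *cache = [[YYMemoryCache alloc] init];
cache.countLimit = 5;
for (int i = 0; i < 10; i++) {
    [cache setObject:[NSString stringWithFormat:@"value-%d", i] forKey:@(i)];
}
// After trimming, only the 5 most recently used entries remain;
// eviction starts from the tail of the linked list.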

How does this work?

Right after these properties there is one more:

// The auto-trim interval; the default is 5s.
@property NSTimeInterval autoTrimInterval;

After YYMemoryCache is initialized, it keeps a recurring check alive (a recursive dispatch_after rather than an actual timer) that executes the following methods every autoTrimInterval seconds:

- (void)_trimRecursively {
    __weak typeof(self) _self = self;
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(_autoTrimInterval * NSEC_PER_SEC)),
                   dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0), ^{
        __strong typeof(_self) self = _self;
        if (!self) return;
        [self _trimInBackground];
        [self _trimRecursively];
    });
}

- (void)_trimInBackground {
    dispatch_async(_queue, ^{
        [self _trimToCost:self->_costLimit];
        [self _trimToCount:self->_countLimit];
        [self _trimToAge:self->_ageLimit];
    });
}

This polling checks whether the cache currently exceeds any of the configured limits and, if so, frees entries until it is back within them.
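The trim methods themselves pop nodes off the tail of the list until the limit is satisfied. A much-simplified sketch of the idea (the real _trimToCount: in YYMemoryCache.m additionally handles locking via trylock and releases the removed nodes asynchronously):

- (void)_trimToCount:(NSUInteger)countLimit {
    while (_lru->_totalCount > countLimit) {
        [_lru removeTailNode]; // evict the least recently used entry
    }
}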

Besides the polling, the cache is also trimmed at two other moments, corresponding to two notifications:

    [[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(_appDidReceiveMemoryWarningNotification) name:UIApplicationDidReceiveMemoryWarningNotification object:nil];
    [[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(_appDidEnterBackgroundNotification) name:UIApplicationDidEnterBackgroundNotification object:nil];

One fires when the system is running low on memory, the other when the app enters the background. On receiving either notification, the cache is cleared.

By the way, unless you explicitly ask for the cache to be released on the main thread, all cache-clearing work is performed on background threads.

Tips for asynchronously releasing objects

YYCache keeps the calling thread responsive by handing object release off to another queue: an object captured by a block dispatched asynchronously is released on that queue instead of on the current thread. Here is a small example:

NSArray *tmp = self.array;
self.array = nil;
dispatch_async(queue, ^{
    [tmp class]; // this call only exists to avoid a compiler warning
});
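Because tmp is captured by the block, the array (and everything it holds) is released on queue rather than on the current thread. YYMemoryCache applies the same pattern when evicting nodes, roughly like this (simplified from the eviction path in YYMemoryCache.m; the real code uses a dedicated release queue):

if (_lru->_releaseAsynchronously) {
    dispatch_queue_t queue = _lru->_releaseOnMainThread
        ? dispatch_get_main_queue()
        : dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0);
    dispatch_async(queue, ^{
        [node class]; // hold and release the evicted node on the target queue
    });
}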

A small costLimit bug

If no cost is specified when an object is stored, its cost defaults to 0:

- (void)setObject:(id)object forKey:(id)key {
    [self setObject:object forKey:key withCost:0];
}

So consider this case: I store 10 objects, each with an actual cost of 10, and set costLimit to 50. I would expect the cache to stay under 50, but because I never passed a cost when storing, totalCost remains 0 and the trim condition is never met. This looks like a small bug to me; if I have misunderstood, please point it out in the comments.
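A minimal illustration of the scenario (the sizes here are hypothetical):

YYMemoryCache *cache = [[YYMemoryCache alloc] init];
cache.costLimit = 50;
for (int i = 0; i < 10; i++) {
    NSData *blob = [NSMutableData dataWithLength:10]; // really "costs" 10
    // No cost is passed, so the entry is recorded with cost 0;
    // _totalCost stays 0, never exceeds costLimit, and nothing is trimmed.
    [cache setObject:blob forKey:@(i)];
}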

YYMemoryCache lock

YYMemoryCache and YYDiskCache are both thread-safe, and each uses a different lock to guarantee it. YYMemoryCache was first implemented with OSSpinLock, a spin lock. When the desired resource is occupied, a spin lock keeps polling, with the CPU asking over and over, "is it free yet?", until the resource becomes available, which is why this is also called busy waiting. This spinning can cause priority-inversion problems, so OSSpinLock is no longer recommended on iOS; ibireme has written a separate article on the issue, so I won't go into it here.

OSSpinLock was replaced with pthread_mutex, a mutex. The main difference from a spin lock is that a mutex suspends the thread while it waits for an occupied resource instead of spinning.

iOS provides a number of other locks too, but NSLock, which wraps pthread_mutex, is not as fast as pthread_mutex itself because of the extra object call. YYMemoryCache therefore uses pthread_mutex directly, which is now among the best locks available on iOS. I won't expand on locks here; if you are interested, see my earlier article on thread-synchronization schemes in iOS.
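A minimal sketch of the pthread_mutex pattern (YYMemoryCache guards every public method this way; the class name here is illustrative):

#import <pthread.h>

@interface MyCache : NSObject
@end

@implementation MyCache {
    pthread_mutex_t _lock;
}

- (instancetype)init {
    if (self = [super init]) {
        pthread_mutex_init(&_lock, NULL);
    }
    return self;
}

- (void)setObject:(id)object forKey:(id)key {
    pthread_mutex_lock(&_lock);   // sleeps rather than spins if the lock is held
    // ... update the dictionary and linked list ...
    pthread_mutex_unlock(&_lock);
}

- (void)dealloc {
    pthread_mutex_destroy(&_lock);
}
@end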

YYDiskCache implementation mechanism

Data storage mode

The disk cache is actually split into two parts: a database cache and a file cache. A threshold is specified at initialization, 20KB by default: data smaller than 20KB goes into the database, while data larger than 20KB is stored as a file. Each object in the disk cache is wrapped as a YYKVStorageItem:

@interface YYKVStorageItem : NSObject
@property (nonatomic, strong) NSString *key;                ///< key
@property (nonatomic, strong) NSData *value;                ///< value
@property (nullable, nonatomic, strong) NSString *filename; ///< filename (nil if inline)
@property (nonatomic) int size;                             ///< value's size in bytes
@property (nonatomic) int modTime;                          ///< modification unix timestamp
@property (nonatomic) int accessTime;                       ///< last access unix timestamp
@property (nullable, nonatomic, strong) NSData *extendedData; ///< extended data (nil if no extended data)
@end

When a value is larger than 20KB, the value itself is cached in a file while its metadata, the YYKVStorageItem, is stored in the database. On retrieval, the metadata is fetched from the database first, and then the corresponding file is read.
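The threshold can be customized at initialization (initWithPath:inlineThreshold: is part of YYDiskCache's public interface; the path here is illustrative):

NSString *path = [NSHomeDirectory() stringByAppendingPathComponent:@"Library/Caches/my.cache"];
// Values up to 20KB go into SQLite; anything larger is written to a file.
YYDiskCache *cache = [[YYDiskCache alloc] initWithPath:path inlineThreshold:20 * 1024];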

Why use a database at all instead of plain file storage, and why is the boundary between database and file set at 20KB? Both come from the author's own tests:

A purely file-based approach, one value per file, caches data through file reads and writes. Its drawbacks: it is hard to extend, has no metadata, makes a good eviction algorithm difficult to implement, and is slow at data statistics.

A database-based cache supports metadata well, is easy to extend, is faster at data statistics, and also makes LRU or other eviction algorithms easy to implement. The only uncertainty is the read/write performance of the database itself, so the author benchmarked SQLite on real devices. On a 64 GB iPhone 6, SQLite writes faster than writing files directly, but read performance depends on the data size: below roughly 20K per record, SQLite reads are faster; above that, reading a file directly is faster.

Therefore, disk caching is best done by combining SQLite with file storage: key-value metadata lives in SQLite, while the value data goes into SQLite or a file depending on its size.

That design article is also listed in the references at the end.

YYDiskCache implements this database with the system's built-in SQLite3. I won't expand on the database access code here; it lives in YYKVStorage.m, and you can download the YYCache source from GitHub.

LRU algorithm

The disk cache is trimmed at the same moments as the memory cache, except that its automatic polling interval is 60 seconds. It likewise has its own costLimit, countLimit, and ageLimit.

YYMemoryCache evicts from the tail of its linked list, but YYDiskCache is backed by a database, so it works differently: each metadata row carries a last_access_time field, every operation updates it to the current time, and eviction deletes the rows with the earliest access times first.

You can see this in the database query:

- (NSMutableArray *)_dbGetItemSizeInfoOrderByTimeAscWithLimit:(int)count {
    // Read metadata from the database, oldest access time first.
    NSString *sql = @"select key, filename, size from manifest order by last_access_time asc limit ?1;";
    // ...
}
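Correspondingly, every access bumps last_access_time so this ordering stays current; the update statement looks roughly like this (see _dbUpdateAccessTimeWithKey: in YYKVStorage.m):

NSString *sql = @"update manifest set last_access_time = ?1 where key = ?2;";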

YYDiskCache lock

It uses a semaphore to implement the lock:

static dispatch_semaphore_t _globalInstancesLock;

In simple terms, a semaphore is a counter whose value is the number of currently available signals. It supports two operations, up (increment) and down (decrement), described as follows:

Down (decrement):

  1. Check whether the semaphore's value is greater than or equal to 1.
  2. If so, subtract 1 from the value and continue.
  3. Otherwise, wait on the semaphore (the thread is suspended).

Up (increment):

  1. Add 1 to the semaphore's value (this can wake a thread waiting on the semaphore).
  2. The thread continues to execute.

Note that although the down and up operations consist of multiple steps, each is an atomic unit; the steps cannot be interleaved.

If you restrict the semaphore's value to 0 and 1, you get a lock, also known as a binary semaphore. It works as follows:

Down (decrement):

  1. Wait for the semaphore's value to become 1.
  2. Set the value to 0.
  3. Continue executing.

Up (increment):

  1. Set the semaphore's value to 1.
  2. Wake the first thread waiting on the semaphore.
  3. That thread continues to execute.

Since a binary semaphore's value is only ever 0 or 1, the scheme above prevents two threads from entering the critical section at the same time, which is exactly what a lock does. YYDiskCache uses such a binary semaphore, although pthread_mutex could be used instead. By comparison, a semaphore performs better than pthread_mutex when there is no contention, but once threads have to wait, its performance drops considerably. On the other hand, a semaphore consumes no CPU while waiting, which suits a disk cache well.
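A minimal sketch of using a GCD semaphore as a binary lock, mirroring the pattern in YYDiskCache:

dispatch_semaphore_t lock = dispatch_semaphore_create(1); // binary semaphore, starts at 1

dispatch_semaphore_wait(lock, DISPATCH_TIME_FOREVER); // down: take the lock, block if it is held
// ... critical section ...
dispatch_semaphore_signal(lock); // up: release the lock, waking one waiter if any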

References

YYCache design roadmap

OSSpinLock is no longer safe

Conclusion

If anything here is lacking or mistaken, feel free to point it out.