YYCache source

YYCache design roadmap

The article on YYCache's design ideas is highly recommended. I read it long ago, and every time I revisit it I understand something new. In it the author compares the world's mainstream cache frameworks and then explains the framework he wrote on that basis.

YYCache

The official API, with its annotations, is basically the same as NSCache's:

/** The name of the cache */
@property (copy, readonly) NSString *name;
/** The underlying memory cache */
@property (strong, readonly) YYMemoryCache *memoryCache;
/** The underlying disk cache */
@property (strong, readonly) YYDiskCache *diskCache;
/** Set */
- (void)setObject:(nullable id<NSCoding>)object forKey:(NSString *)key;
/** Delete */
- (void)removeObjectForKey:(NSString *)key;
/** Check */
- (BOOL)containsObjectForKey:(NSString *)key;
/** Get */
- (nullable id<NSCoding>)objectForKey:(NSString *)key;

Each of these operations has a corresponding block-callback variant. Stored objects must implement the NSCoding protocol, mainly because of archiving when values are written to disk.

1. Most of the block variants are called back when objects are saved, fetched, or deleted.

2. All operations hit the memory cache first; the disk cache is operated on afterwards.
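The memory-first flow can be sketched in C++ (all names here are illustrative, not YYCache's API; a real implementation would do the disk write asynchronously and evict from the memory level):

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical two-level cache modeling the flow described above: reads hit
// memory first and fall back to disk; a disk hit is promoted back into memory.
// Writes go to memory first, then to disk.
class TwoLevelCache {
public:
    void set(const std::string& key, const std::string& value) {
        memory_[key] = value;  // memory first
        disk_[key]   = value;  // then disk (a real impl would do this async)
    }
    std::optional<std::string> get(const std::string& key) {
        if (auto it = memory_.find(key); it != memory_.end()) return it->second;
        if (auto it = disk_.find(key); it != disk_.end()) {
            memory_[key] = it->second;  // promote disk hit into memory
            return it->second;
        }
        return std::nullopt;
    }
    bool contains(const std::string& key) {
        return memory_.count(key) > 0 || disk_.count(key) > 0;
    }
    void remove(const std::string& key) {
        memory_.erase(key);
        disk_.erase(key);
    }

private:
    std::unordered_map<std::string, std::string> memory_;  // stands in for YYMemoryCache
    std::unordered_map<std::string, std::string> disk_;    // stands in for YYDiskCache
};
```

Promoting a disk hit back into the memory level is what keeps hot data in the fast path.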

YYMemoryCache

#pragma mark - Attribute
///=============================================================================
/// @name Attribute
///=============================================================================
/** Cache name, default nil */
@property (nullable, copy) NSString *name;
/** Total number of objects currently cached (readonly) */
@property (readonly) NSUInteger totalCount;
/** Total cost of objects currently cached (readonly) */
@property (readonly) NSUInteger totalCost;

///=============================================================================
/// @name Limit
///=============================================================================
/** Maximum number of cached objects. Defaults to NSUIntegerMax, i.e. no limit. */
@property NSUInteger countLimit;
/** Maximum total cost before the eviction policy runs. Not a strict limit:
    when exceeded, some cached objects are evicted on a background thread. */
@property NSUInteger costLimit;
/** Maximum age of cached objects. */
@property NSTimeInterval ageLimit;
/** Interval of the auto-trim timer. */
@property NSTimeInterval autoTrimInterval;
/** Whether to remove all objects on a system memory warning. Default YES. */
@property BOOL shouldRemoveAllObjectsOnMemoryWarning;
/** Callback executed on a memory warning. */
@property (nullable, copy) void(^didReceiveMemoryWarningBlock)(YYMemoryCache *cache);
/** Whether to remove all objects when the app enters the background. Default YES. */
@property BOOL shouldRemoveAllObjectsWhenEnteringBackground;
/** Callback executed when the app enters the background. */
@property (nullable, copy) void(^didEnterBackgroundBlock)(YYMemoryCache *cache);
/** If YES, key-value pairs are released on the main thread, otherwise on a
    background thread. Default NO. Set to YES if the cached objects must be
    deallocated on the main thread (e.g. UIView/CALayer). */
@property BOOL releaseOnMainThread;
/** If YES, key-value pairs are released asynchronously so access methods are
    not blocked; otherwise they are released inside the access method
    (e.g. removeObjectForKey:). Default YES. */
@property BOOL releaseAsynchronously;

///=============================================================================
/// @name Trim
///=============================================================================
/** Remove objects with LRU until totalCount is less than or equal to the given count. */
- (void)trimToCount:(NSUInteger)count;
/** Remove objects with LRU until totalCost is less than or equal to the given cost. */
- (void)trimToCost:(NSUInteger)cost;
/** Remove objects with LRU until all remaining objects are younger than the given age. */
- (void)trimToAge:(NSTimeInterval)age;

Memory cache

Author’s words:

YYMemoryCache is a memory cache I developed. Compared with PINMemoryCache, I removed the asynchronous access interface, optimized synchronous access performance as much as possible, and used OSSpinLock to ensure thread safety. Internally, the cache uses a doubly linked list plus an NSDictionary to implement the LRU eviction algorithm, which is a small step forward compared to the frameworks above.

After OSSpinLock's priority-inversion problem (threads of different priorities contending for the same resource) came to light, the author stated that unless all accesses are guaranteed to happen at the same thread priority, spin locks are basically no longer usable. In a later article he replaced OSSpinLock with pthread_mutex_t.

YYMemoryCache access method

Internally, YYMemoryCache defines a doubly linked list, _YYLinkedMap, structured as follows:

@implementation YYMemoryCache {
    pthread_mutex_t _lock;
    _YYLinkedMap *_lru;
    dispatch_queue_t _queue;
}
@end

/** _YYLinkedMap */
@interface _YYLinkedMap : NSObject {
    @package
    CFMutableDictionaryRef _dic; // do not set this value directly
    NSUInteger _totalCost;       // total cost
    NSUInteger _totalCount;      // total count
    _YYLinkedMapNode *_head;     // MRU (head node), do not modify directly
    _YYLinkedMapNode *_tail;     // LRU (tail node), do not modify directly
    BOOL _releaseOnMainThread;   // whether to release on the main thread, default NO
    BOOL _releaseAsynchronously; // whether to release asynchronously
}
@end

@interface _YYLinkedMapNode : NSObject {
    @package // @package: accessible within the framework, but not outside it
    __unsafe_unretained _YYLinkedMapNode *_prev; // retained by dic
    __unsafe_unretained _YYLinkedMapNode *_next; // retained by dic
    id _key;
    id _value;
    NSUInteger _cost;
    NSTimeInterval _time;
}
@end
  • _YYLinkedMap's access methods and node manipulation follow the classic LRU idea. The author combines a dictionary with a doubly linked list: the dictionary handles lookup, the head of the list is always the most recently accessed data, and the tail is the data untouched for the longest time. The eviction strategy removes nodes recursively from the tail forward, terminating once _totalCount or _totalCost drops below the limit.

  • _dic is a CFMutableDictionaryRef whose keys are the stored keys and whose values are _YYLinkedMapNode nodes; _dic retains both keys and values. The _prev and _next pointers of every node in the list are __unsafe_unretained ("retained by dic"), so the links themselves do not retain the neighboring nodes.

All dashed lines in the diagram are __unsafe_unretained references that do not retain their targets; since every node's links point either to nil or to a node retained by _dic, there is no dangling-pointer problem.

By comparison, NSCache uses an NSMapTable for the _dic role and an NSMutableArray for the _YYLinkedMap role, and both hold the actually stored objects.

YYMemoryCache's add, delete, change, and check operations:

Add: if the key is not present, insert the key and node into _dic and attach the node at the head of the list. O(1)

Change: look the node up directly; if found, update it, unlink it, and move it to the head. O(1)

Delete: look the key up, remove the key-value pair from _dic, then relink the node's previous and next neighbors in the list. O(1)

Check: look the key up in _dic. O(1)

From the above, only _dic retains the keys and values, and all operations are O(1): excellent indeed. An alternative would be to use an NSMapTable with weak values for lookup and let the doubly linked list retain each node.

At first glance, all of the above could be implemented with _dic alone, without a doubly linked list. But the list is the whole point: its head always represents the most recently used entry and its tail the entry untouched for the longest time, so when memory fills up or a memory warning arrives, the cache can be trimmed from the tail forward.
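As a rough illustration of the dictionary-plus-linked-list idea (not YYCache's actual code), here is a minimal LRU sketch where std::list stands in for the hand-written doubly linked list and the map stores list iterators, much as _dic stores _YYLinkedMapNode pointers:

```cpp
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal LRU sketch: hash map for O(1) lookup, doubly linked list ordered by
// recency. Head = most recently used, tail = least recently used; eviction
// pops from the tail, mirroring the _YYLinkedMap description above.
class LRUCache {
public:
    explicit LRUCache(std::size_t countLimit) : countLimit_(countLimit) {}

    void set(const std::string& key, int value) {
        if (auto it = map_.find(key); it != map_.end()) {
            it->second->second = value;
            order_.splice(order_.begin(), order_, it->second);  // move to head
        } else {
            order_.emplace_front(key, value);                   // insert at head
            map_[key] = order_.begin();
            if (map_.size() > countLimit_) {                    // evict from tail
                map_.erase(order_.back().first);
                order_.pop_back();
            }
        }
    }

    std::optional<int> get(const std::string& key) {
        auto it = map_.find(key);
        if (it == map_.end()) return std::nullopt;
        order_.splice(order_.begin(), order_, it->second);      // head = most recent
        return it->second->second;
    }

private:
    std::size_t countLimit_;
    std::list<std::pair<std::string, int>> order_;              // head: MRU, tail: LRU
    std::unordered_map<std::string,
                       std::list<std::pair<std::string, int>>::iterator> map_;
};
```

Every operation is O(1): the map gives O(1) lookup, and splice moves a node in O(1) because the list is doubly linked.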

YYMemoryCache has three member variables:

_lru: the LRU bookkeeping object, of type _YYLinkedMap

_queue: a serial queue, of type dispatch_queue_t

_lock: a mutex, of type pthread_mutex_t

_queue guarantees that all trim policies run on a serial queue, and every operation takes the mutex to guarantee thread safety.

The _trimRecursively method reschedules itself via dispatch_after with a 5-second delay on _queue. Interested readers can look up the differences between dispatch_after, performSelector:, and timers.
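The self-rescheduling pattern can be modeled without GCD. In this hedged sketch, `schedule` stands in for dispatch_after on the serial queue, and a countdown bounds the recursion so it terminates (all names are illustrative):

```cpp
#include <functional>

// Hypothetical cache with an autoTrimInterval, mirroring the description above:
// each pass trims, then schedules the next pass after the interval.
struct Cache {
    int trimCount = 0;           // how many trim passes ran (for illustration)
    double autoTrimInterval = 5.0;

    void trimInBackground() { ++trimCount; }  // stand-in for count/cost/age trims

    // `schedule(delay, task)` models dispatch_after on the serial queue;
    // `remaining` bounds the recursion so the sketch terminates.
    void trimRecursively(const std::function<void(double, std::function<void()>)>& schedule,
                         int remaining) {
        if (remaining == 0) return;
        schedule(autoTrimInterval, [this, schedule, remaining] {
            trimInBackground();
            trimRecursively(schedule, remaining - 1);
        });
    }
};
```

With a real delayed scheduler the chain never ends; the cache trims itself every `autoTrimInterval` seconds for as long as it lives.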

Locks in memory cache

To find a lock to replace OSSpinLock, the author ran a series of performance tests. The result: apart from OSSpinLock in the single-threaded case, dispatch_semaphore and pthread_mutex performed best. Reportedly Apple has optimized pthread_mutex in newer systems, so it no longer lags far behind OSSpinLock, which is why the author adopted it. The lock-test code is also well worth reading.

All add, delete, change, and check operations in the memory cache take the mutex, so concurrent access cannot corrupt data: the cache is thread safe.

Multithreading in memory cache

The author uses a GCD serial queue, and everything submitted to it runs asynchronously; it drives the eviction algorithm that trims memory by count, cost, and age. See the _trimInBackground method.

_lru has the properties _releaseAsynchronously (default YES) and _releaseOnMainThread (default NO). Released nodes are handed to dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_LOW, 0), a low-priority global queue, so you can control whether resources are released on the main thread or a background thread. A design well worth learning.

The first time I saw it, I thought it was redundant code. That was embarrassing.

- (void)_trimToCost:(NSUInteger)costLimit {
    BOOL finish = NO;
    pthread_mutex_lock(&_lock);
    if (costLimit == 0) {
        [_lru removeAll];
        finish = YES;
    } else if (_lru->_totalCost <= costLimit) {
        finish = YES;
    }
    pthread_mutex_unlock(&_lock);
    if (finish) return;

    NSMutableArray *holder = [NSMutableArray new];
    while (!finish) {
        if (pthread_mutex_trylock(&_lock) == 0) {
            if (_lru->_totalCost > costLimit) {
                _YYLinkedMapNode *node = [_lru removeTailNode];
                if (node) [holder addObject:node];
            } else {
                finish = YES;
            }
            pthread_mutex_unlock(&_lock);
        } else {
            usleep(10 * 1000); // 10 ms
        }
    }
    // The local holder array keeps the removed nodes alive; releasing it on the
    // chosen queue means the nodes are deallocated there instead of here.
    if (holder.count) {
        dispatch_queue_t queue = _lru->_releaseOnMainThread ? dispatch_get_main_queue() : YYMemoryCacheGetReleaseQueue();
        dispatch_async(queue, ^{
            [holder count]; // release in queue
        });
    }
}

Disk cache

Abstract

Disk caches generally fall into three types: based on file read/write, based on mmap file memory mapping, and based on a database.

  • TMDiskCache, PINDiskCache, SDWebImage, and similar caches are based on the file system: one value corresponds to one file, and data is cached through file reads and writes. Their implementations are relatively simple and their performance similar, and they share the same disadvantages: poor scalability, no metadata, difficulty implementing a good eviction algorithm, and slow data statistics.

  • FastImageCache uses mmap to map files into memory. Anyone who has used MongoDB will be familiar with mmap's pitfalls: the hot data set should not exceed physical memory, or page swapping will severely degrade performance; and since in-memory data is flushed to the file only periodically, a crash before synchronization corrupts the data. Despite these flaws, mmap performance is very high.
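To make the mechanism concrete, here is a minimal POSIX mmap round trip (a sketch, not FastImageCache's code): the file is mapped into memory and written through a pointer, and the msync call is precisely the flush step that is lost if the process dies first.

```cpp
#include <cstring>
#include <fcntl.h>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Map a file into memory, write through the mapping, flush, and read back.
// Error handling is minimal; returns an empty string on failure.
std::string mmapRoundTrip(const char* path, const std::string& payload) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return "";
    if (ftruncate(fd, payload.size()) != 0) { close(fd); return ""; }

    void* mem = mmap(nullptr, payload.size(), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { close(fd); return ""; }

    std::memcpy(mem, payload.data(), payload.size());  // "write" is a memory copy
    msync(mem, payload.size(), MS_SYNC);               // force dirty pages to disk

    std::string readBack(static_cast<char*>(mem), payload.size());
    munmap(mem, payload.size());
    close(fd);
    return readBack;
}
```

Without the msync, the kernel flushes the pages on its own schedule, which is exactly the window where a crash can leave the file inconsistent.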

  • YYDiskCache uses a combination of SQLite and file storage. Benchmark results on an iPhone 6 (64G) are shown in the figure below: YYDiskCache performs much better than the file-based libraries when accessing small data (NSNumber), and for larger data the access performance is close. Moreover, thanks to the metadata kept in SQLite, YYDiskCache implements an LRU eviction algorithm, faster data statistics, and more capacity-control options.

The disk cache's eviction strategy is likewise driven by the database's last_access_time field: entries are cleared recursively until the cache size drops below the limit, essentially the same scheme as the memory cache.

Looking over the framework as a whole, my most direct impression is how much solid computer-science fundamentals shape a program: the many careful details, the research into locks and multithreading for different scenarios, and the depth of database understanding behind the disk cache are beyond most developers. So try first to understand the overall design; the framework's structure can be learned and imitated directly, and the details can be studied until the principles are clear. Improvement comes slowly.

YYDiskCache

The overall flow of YYDiskCache is as follows: if the stored object's value is larger than _inlineThreshold (20 KB by default), a file is created for the data and its metadata is written to the database; otherwise the binary data is written directly into the database.

Data written to a local file is archived, and unarchived on the way back, accordingly.
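The dispatch rule can be sketched as follows (the threshold and all names are assumed from the description above, not taken from YYCache's source):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical threshold mirroring _inlineThreshold (20 KB by default).
constexpr std::size_t kInlineThreshold = 20 * 1024;

struct StorageDecision {
    bool useFile;       // true: write a separate file, DB keeps only metadata
    std::string where;  // "file" or "inline_data", for illustration
};

// Values strictly larger than the threshold go to a file (the database row
// records the filename); values at or below it are stored inline in the row.
StorageDecision decideStorage(const std::vector<unsigned char>& value) {
    if (value.size() > kInlineThreshold) {
        return {true, "file"};      // filename column set, inline_data NULL
    }
    return {false, "inline_data"};  // blob written into the database row
}
```

The split exists because SQLite is faster than the file system for small blobs, while large blobs are cheaper to read and write as plain files.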

YYKVStorage

YYKVStorage performs the actual database storage; the internally defined YYKVStorageItem object maps almost one-to-one onto a database row, including the last access time, data size, and other fields. When the eviction policy runs, the matching rows (generally an array sorted by access time) are fetched straight from the database and removed recursively until the termination condition is met.

  • If the data is larger than 20 KB, the YYKVStorageItem's filename is the name of the file on disk. By default YYDiskCache derives it with _YYNSStringMD5, a 128-bit (16-byte) MD5 digest rendered as the filename.
  • If the data is 20 KB or smaller, the binary data is written directly to the database's inline_data column for higher read/write performance. The database performance tests appear both in the code and in the author's articles.

All of YYDiskCache's access methods are carried out by the corresponding YYKVStorage. YYDiskCache acts like a facade, while YYKVStorage is the class that does the actual work, and YYKVStorage can also be used on its own.

Good practical design

Lock in disk cache

dispatch_semaphore, initialized with a count of 1

OSSpinLock is a spin lock, the highest-performing lock. The principle is very simple: just busy-wait in a loop. The downside is that waiting burns CPU, so it is unsuitable for longer tasks, but ideal for memory-cache access. (It was later replaced with pthread_mutex for safety reasons.)

dispatch_semaphore is a semaphore, but with the count set to 1 it can also serve as a lock. With no contention it performs even better than pthread_mutex, but it is much worse under contention. Its advantage over OSSpinLock is that it does not burn CPU while waiting, which makes it suitable for the disk cache.
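The idea of "a semaphore with an initial count of 1 used as a lock" maps directly onto POSIX semaphores. A single-threaded sketch of the discipline (analogous to dispatch_semaphore_create(1) with wait/signal, but not Apple's API):

```cpp
#include <semaphore.h>

// A binary semaphore wrapped as a lock: wait (1 -> 0) to enter the critical
// section, post (0 -> 1) to leave it. A second waiter blocks instead of
// spinning, which is why this suits longer disk operations.
struct SemaphoreLock {
    sem_t sem;
    SemaphoreLock()  { sem_init(&sem, 0, 1); }  // initial count 1 == unlocked
    ~SemaphoreLock() { sem_destroy(&sem); }
    void lock()      { sem_wait(&sem); }        // blocks while count is 0
    bool tryLock()   { return sem_trywait(&sem) == 0; }
    void unlock()    { sem_post(&sem); }
};
```

Unlike a spin lock, a blocked sem_wait sleeps in the kernel rather than consuming CPU, at the cost of a slower wakeup.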

Multithreading in disk cache

The disk cache has an asynchronous concurrent queue, used mainly for the block-callback variants of the access methods; since property access is protected by the lock, multithreaded access is thread safe. The non-block methods run entirely on the calling thread.

An NSMapTable acts as the manager of YYDiskCache instances:
  • In _globalInstances the key is the path and the value is the YYDiskCache instance, which guarantees that at most one object exists in memory for a given path. _globalInstances holds keys strongly and values weakly.

  • A dispatch_semaphore lock protects access to _globalInstances when setting and getting instances. It's a nice design: if you have path-keyed singletons, doing it this way is better in terms of memory.

static NSMapTable *_globalInstances;
static dispatch_semaphore_t _globalInstancesLock;

static void _YYDiskCacheInitGlobal() {
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        _globalInstancesLock = dispatch_semaphore_create(1);
        _globalInstances = [[NSMapTable alloc] initWithKeyOptions:NSPointerFunctionsStrongMemory
                                                     valueOptions:NSPointerFunctionsWeakMemory
                                                         capacity:0];
    });
}

// Get a YYDiskCache instance; the key in _globalInstances is the path.
static YYDiskCache *_YYDiskCacheGetGlobal(NSString *path) {
    if (path.length == 0) return nil;
    _YYDiskCacheInitGlobal();
    dispatch_semaphore_wait(_globalInstancesLock, DISPATCH_TIME_FOREVER);
    id cache = [_globalInstances objectForKey:path];
    dispatch_semaphore_signal(_globalInstancesLock);
    return cache;
}

// Store an instance.
static void _YYDiskCacheSetGlobal(YYDiskCache *cache) {
    if (cache.path.length == 0) return;
    _YYDiskCacheInitGlobal();
    dispatch_semaphore_wait(_globalInstancesLock, DISPATCH_TIME_FOREVER);
    [_globalInstances setObject:cache forKey:cache.path];
    dispatch_semaphore_signal(_globalInstancesLock);
}

- (instancetype)initWithPath:(NSString *)path inlineThreshold:(NSUInteger)threshold {
    self = [super init];
    if (!self) return nil;
    YYDiskCache *globalCache = _YYDiskCacheGetGlobal(path);
    if (globalCache) return globalCache;
    // ... set up the cache for this path ...
    _YYDiskCacheSetGlobal(self);
    return self;
}
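The same strong-key/weak-value registry pattern can be expressed in C++ with weak_ptr (illustrative names, not a translation of the Objective-C above): the registry never keeps a cache alive by itself, yet the same path always yields the same live instance.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Stand-in for YYDiskCache, reduced to its path.
struct DiskCache {
    explicit DiskCache(std::string p) : path(std::move(p)) {}
    std::string path;
};

// Registry keyed by path with weak values: a cache dies when no one else
// retains it, but while alive it is shared by everyone asking for its path.
class CacheRegistry {
public:
    std::shared_ptr<DiskCache> getOrCreate(const std::string& path) {
        std::lock_guard<std::mutex> guard(lock_);       // plays the semaphore's role
        if (auto existing = instances_[path].lock()) {  // weak -> strong if alive
            return existing;
        }
        auto fresh = std::make_shared<DiskCache>(path);
        instances_[path] = fresh;                       // stored weakly
        return fresh;
    }

private:
    std::mutex lock_;
    std::map<std::string, std::weak_ptr<DiskCache>> instances_;
};
```

This is the same memory-friendly alternative to a permanent singleton that the bullet above praises: the registry shares instances without extending their lifetime.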

Add

The core logic of storing data, including how filename is recorded in the database, is shown in the figure below:

The folder holding the large-data files looks like this:

Small data is stored as a binary blob in the database's inline_data column, as shown below:

Delete

Deletion follows the YYKVStorage database design: the matching items are found via SQL and processed accordingly.

In removeAllItemsWithProgressBlock:endBlock:, the items fetched are sorted by time in ascending order, so during removal, even if it fails partway, the data that remains is still the most recently used, consistent with the LRU algorithm. A nice detail.

The removeAllItems method is a little more brutal: it simply resets the database. For the deletion logic, interested readers can check the concrete SQL in the source.

Query

getItemForKey and getItemInfoForKey are both query methods:

  • getItemForKey fetches the item without extended data; the extended_data column in the database is NULL
  • getItemInfoForKey fetches the item with its extended data, but leaves the value out; the inline_data column is not read

The value-retrieval details and logic are basically the same as in the memory cache. The nice part is that the database can return data directly ordered by the relevant SQL columns, playing essentially the same role as the doubly linked list does in memory. That is how the disk cache's LRU is realized.

Conclusion

YYCache is full of good detail work, and every reading of the source reveals something new. Even something as simple as the doubly linked list's insert-at-head method: unless you have hand-written head and tail insertion a dozen times, getting every logical detail right in one pass is basically impossible.

Difficulties in my opinion:

Personally, I think the difficulty of YYCache lies in YYKVStorage. Writing it is basically impossible for anyone who is not familiar with databases and not solid in C. So the best approach is to study it and learn from it: even if you cannot write it yourself, at least understand it well enough to modify and use it.

The second difficulty is framework-design ability, which takes a strong foundation and a lot of practice to hone; there is no shortcut (the exceptionally gifted may ignore this).

Key points in my opinion:

  1. The design ideas behind YYCache: before designing, the author surveyed and compared the mainstream frameworks on the market, understood all their pros and cons, and used that to set the direction of his own design.
  2. The implementation of YYCache: many details in different places show strong architectural design ability and thought. Also, each file exposes its own API and interface properties, so every class can be used independently without affecting the whole.
  3. For the different cache layers, different combinations of multithreading and locks were tried and benchmarked, so the best-performing solution could be chosen after testing.
  4. The research and comparison of disk caches led to the database-plus-file combination, something few people can do even today.
  5. Read the author's performance-test code and articles; the benchmark code is full of subtle details that give a deeper understanding of the language's internal mechanisms.

Finally, there are a lot of details, so try to read the source several times; each time, a small detail may surprise you. YYCache should be the simplest part of YYKit by comparison. Learn it slowly; there is no rush.

Corrections and suggestions are welcome.