Cache design is a basic topic in computer science, and it is also one of the fundamental skills of a programmer. Caches are almost everywhere: the CPU's L1/L2 caches, the clean page and dirty page mechanism in iOS, the HTTP ETag mechanism, and so on. Behind each of these lies the Cache design idea.

Why Cache is needed

The purpose of a Cache is to deliver a faster experience. A Cache arises from the difference in cost and performance between two ways of reading the same data.

Before you can start designing a Cache, you need to understand the media on which data is stored. As client developers, the storage scenarios we care about include:

  • Data is initially stored on the Server and is retrieved through network requests.

  • When we get data from the Server, we go through various intermediate network nodes (such as proxies) that sometimes cache our data.

  • Once the data is downloaded locally, we cache a copy of it on our local disk so that we may not have to go back to the server every time.

  • Once the data is on disk, the way it is stored affects read speed. SQLite, backed by a B+ tree, is much faster to query than an NSArray serialized directly into a file.

  • When the App starts, the system loads the data downloaded from the Server from disk to memory. The read and write performance of Memory is much faster than that of Disk.

  • In Memory, different data structures also differ in speed. An NSDictionary (hash table) offers better read and write performance for keyed lookups than an NSArray, but costs more space. And although memory's read and write performance is far higher than disk's, large data sets can still hit bottlenecks.

  • Registers and the L1/L2 caches are faster than main memory, but iOS App development rarely explores optimization at this level.

Every link in the chain above has its own performance and cost trade-offs. The Server's data is naturally the most timely and accurate, but for an App to turn Server data into an NSArray, that data has to travel a "long" road, and at every step along the way the Cache design idea appears.

The prerequisite for understanding and practicing Cache is a deep understanding of storage media and the differences between different data structures.

Most App performance optimizations that involve caching happen in Memory. Storing data that would otherwise require a Disk read or an expensive CPU computation in an appropriate in-memory data structure covers most of the Cache requirements in App development. Even at this level, Cache design comes in different shapes; let's start with the simple version.

A simple, usable Cache

Thanks to Foundation's encapsulation of NSDictionary, we can use a hash table as the data structure behind a simple, usable cache mechanism. Here's an example:

```objectivec
- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
    if (phone == nil) {
        return nil;
    }
    return [PhoneFormatLib formatPhoneNumber:phone]; // CPU-intensive operation
}
```

This is a simple function that formats a phone number. Inside it, formatPhoneNumber is a CPU-intensive call, and in this business scenario the formatted NSString for the same phone number is needed frequently. Recomputing it every time is obviously a waste of CPU resources, and performance suffers. We can optimize this by adding a simple Cache:

```objectivec
static NSMutableDictionary* gPhoneCache = nil;
static NSLock* gPhoneLock = nil; // static, so all instances guard the one shared cache

- (NSString*)getFormattedPhoneNumber:(NSNumber*)phone
{
    if (phone == nil) {
        return nil;
    }

    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        gPhoneLock = [NSLock new];
        gPhoneCache = [NSMutableDictionary new];
    });

    [gPhoneLock lock];
    NSString* phoneNumberStr = [gPhoneCache objectForKey:phone];
    if (phoneNumberStr == nil) {
        phoneNumberStr = [PhoneFormatLib formatPhoneNumber:phone];
        [gPhoneCache setObject:phoneNumberStr forKey:phone];
    }
    [gPhoneLock unlock];
    return phoneNumberStr;
}
```

By introducing an NSMutableDictionary, we no longer need to call formatPhoneNumber every time. So Easy: a fast cache design, ready to be submitted for testing and waved in the product manager's face. The speed-up comes from the hash table's O(1) lookup. The memory impact is small for small data sets: modern hash tables do not allocate a large amount of space up front, but grow gradually as entries are added.

The biggest problem with this simple, usable Cache design is that the code is too scattered and uncontrollable. Small, scattered cache designs are practically traps waiting to be stepped in. The cache may be designed against a small data set today, but nothing guarantees the memory overhead stays negligible as the business changes. Worse, this kind of memory cost is hard to detect, cleverly hidden away in some .m file; when you later try to control the memory footprint of the whole App, it feels like there are pits everywhere and nowhere to start. As you may have noticed, the Cache code above never releases the cache at all.

Any code that can adversely affect the whole App should be managed centrally, so it can be understood and located at the architectural level. How do you define a side effect here? It can be abstracted as a "write": adding a new record to the Cache is a write operation, and each write is extra memory overhead. Trading space for time is the essence of a Cache, so the space cost is our side effect, and one side effect can trigger more side effects. Untangling them usually means combing through a lot of code over and over again. Better to centralize the side-effecting code from the start.

Elegant and controllable Cache

One way to avoid Cache code sprawl is to design an elegant, controllable Cache module. An App may have many kinds of data worth caching: phoneNumberCache, avatarCache, spaceshipCache, and so on. We need a single source that tracks these caches. The intuitive approach is to generate and hold the various caches through a factory class:

```objectivec
// CacheFactory.h
@interface CacheFactory : NSObject

+ (instancetype)sharedInstance;

- (id<MyCacheProtocol>)getPhoneNumberCache;
- (void)clearPhoneNumberCache;

- (id<MyCacheProtocol>)getAvatarCache;
- (void)clearAvatarCache;

@end
```
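The original does not show MyCacheProtocol itself; a minimal sketch of what such a protocol might declare could look like this (the method names here are assumptions, not part of the factory interface above):

```objectivec
// MyCacheProtocol.h -- hypothetical sketch; method names are illustrative assumptions.
@protocol MyCacheProtocol <NSObject>

- (nullable id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
- (void)removeObjectForKey:(id<NSCopying>)key;
- (void)removeAllObjects; // gives the factory a uniform way to clear any cache

@end
```

Having every cache conform to one protocol is what lets CacheFactory hand them out and clear them uniformly.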

Then, when we need to evaluate how the various caches affect the App's overall memory overhead, we can simply start from the CacheFactory code and debug from there; the engineers who take over your code will be grateful.

Separating the cache's declaration from its implementation through a protocol is also good practice. The other important piece of cache knowledge is the eviction strategy, which can take different forms: FIFO, LRU, 2Q, and so on. There are many mature third-party cache frameworks to choose from, as well as NSCache, whose eviction policy is undocumented. If you have never written an eviction strategy, I recommend trying to build one yourself, or at the very least reading the source of an existing implementation. When doing deep optimization, you need to understand these eviction strategies and choose according to the situation at hand.
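To make the eviction idea concrete, here is a minimal LRU sketch built on NSMutableOrderedSet. It is illustrative only (not thread-safe, class name and capacity handling are my own assumptions): every access moves the key to the front, and when capacity is exceeded the least recently used key at the back is evicted.

```objectivec
// Minimal LRU cache sketch (illustrative; not thread-safe).
@interface LRUCache : NSObject
- (instancetype)initWithCapacity:(NSUInteger)capacity;
- (nullable id)objectForKey:(id<NSCopying>)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
@end

@implementation LRUCache {
    NSUInteger _capacity;
    NSMutableDictionary* _store;
    NSMutableOrderedSet* _recency; // front = most recently used
}

- (instancetype)initWithCapacity:(NSUInteger)capacity {
    if (self = [super init]) {
        _capacity = capacity;
        _store = [NSMutableDictionary new];
        _recency = [NSMutableOrderedSet new];
    }
    return self;
}

- (id)objectForKey:(id<NSCopying>)key {
    id value = _store[key];
    if (value) { // a hit refreshes the key's recency
        [_recency removeObject:key];
        [_recency insertObject:key atIndex:0];
    }
    return value;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key {
    _store[key] = object;
    [_recency removeObject:key];
    [_recency insertObject:key atIndex:0];
    if (_recency.count > _capacity) { // evict the least recently used entry
        id lruKey = _recency.lastObject;
        [_recency removeObject:lruKey];
        [_store removeObjectForKey:lruKey];
    }
}
@end
```

A production version would add locking and possibly cost-based limits; the point here is only the recency bookkeeping.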

In fact, every operation that touches data should consider that data's life cycle. In business code the Controller is usually the basic unit, and in some scenarios the chance of a Controller being re-entered after it exits is very low; cleaning up its cache promptly improves the overall performance of the App.

Immutable Cache

What is in the Cache? Data. And when it comes to Data, the Immutability that peak keeps going on about has a great deal to do with the stability of our code. Immutability is the elephant in the room: important, and easily overlooked.

When practicing Immutability, classify the Data first, then work out how each class of Data achieves Immutability. The most important classification is value types versus reference types. Passing a value produces a fresh copy of memory, so value types are mostly safe; passing a pointer shares the same memory, which is a large part of why pointers are dangerous. Primitive types such as BOOL, int, and long are value types and can be passed around safely, while object types are passed as pointers. This is also why Swift turned many of Objective-C's base classes into value types: it strengthens Immutability and makes our code safer.
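The difference can be seen in a few lines (a small sketch; the variable names are mine):

```objectivec
int a = 1;
int b = a;   // value type: b is an independent copy
b = 2;       // a is still 1

NSMutableString* s1 = [NSMutableString stringWithString:@"peak"];
NSMutableString* s2 = s1;               // reference type: s2 points at the same object
[s2 appendString:@"!"];                 // s1 is now "peak!" too -- shared state
NSMutableString* s3 = [s1 mutableCopy]; // independent copy; mutating s3 leaves s1 alone
```

Everything that follows is about making the pointer case behave as safely as the value case.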

Let’s look at reading and writing different types of data in the Cache.

Value type – Read

A value type can be returned safely:

```objectivec
- (int)getSpaceshipCount
{
    //...
    return _shipCount;
}
```

Value type – Write

Value types can also be safely written:

```objectivec
- (void)setSpaceshipCount:(int)count
{
    _shipCount = count;
}
```

Object type – Read

A pointer type needs to return a fresh copy:

```objectivec
- (User*)getLuckyUser
{
    //...
    return [_luckyUser copy];
}
```

The copy method of an object class requires us to implement the NSCopying protocol by hand. It is tedious early in a project, but pays off handsomely later. And the copy must be a deep copy: every property held by User needs to be copied recursively.
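A deep-copy NSCopying implementation for User might look like this (the properties shown are assumptions for illustration; the original does not define the class):

```objectivec
// User.h -- properties are illustrative assumptions
@interface User : NSObject <NSCopying>
@property (nonatomic, copy) NSString* name;
@property (nonatomic, strong) User* referrer; // an object-typed property
@end

// User.m
@implementation User

- (id)copyWithZone:(NSZone*)zone {
    User* copy = [[[self class] allocWithZone:zone] init];
    copy.name = [_name copy];         // NSString copy is cheap and safe
    copy.referrer = [_referrer copy]; // recurse: object properties get deep-copied too
    return copy;
}

@end
```

If referrer were copied by mere pointer assignment, the "copy" would still share mutable state with the original, defeating the purpose.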

Object type – Write

Writes of object types are dangerous because the function's parameter is itself an object type, i.e. a shared reference being passed in, so the setter must take its own copy:

```objectivec
- (void)setLuckyUser:(User*)user
{
    //...
    _luckyUser = [user copy];
}
```

Set type – read

Collection classes also need copy; shared mutable collections are a major source of bugs and crashes:

```objectivec
- (NSArray*)getHotDishes
{
    //...
    return [_hotDishes copy];
}
```

Set type – write

```objectivec
- (void)setHotDishes:(NSArray*)dishes
{
    //...
    _hotDishes = [dishes copy];
}
```

As you may have noticed, the principle is fairly simple: make sure business modules always get an independent copy of the data from the Cache, avoiding the hidden risks of shared data. The Cache module then behaves a bit like a pure function in functional programming: it neither depends on external state nor mutates it, and each call is defined by its input and output alone.

Multithreading safety

Whenever data processing is involved, there is no escaping the topic of thread safety. You can take a look at my previous articles on the subject:

What’s wrong with iOS multithreading?

Correct use of @synchronized()

How to solve multithreading problems with Xcode8

For a Cache, thread safety is mostly about handling the collection classes, since a Cache spends most of its time managing collections of data. Note that NSString should be treated as an honorary collection class here: in terms of data reading, writing, and thread safety, NSString behaves much like NSArray. Mature third-party Cache libraries already take care of thread safety for us; if you build your own wheel, make sure above all that reads and writes are atomic.
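If you do build your own wheel, one common GCD pattern (a sketch; the class and queue names are my own) is a concurrent queue with barrier writes: reads run in parallel, while each write gets exclusive access to the store.

```objectivec
@interface ThreadSafeCache : NSObject
- (nullable id)objectForKey:(id)key;
- (void)setObject:(id)object forKey:(id<NSCopying>)key;
@end

@implementation ThreadSafeCache {
    NSMutableDictionary* _store;
    dispatch_queue_t _queue; // concurrent queue; barriers serialize the writes
}

- (instancetype)init {
    if (self = [super init]) {
        _store = [NSMutableDictionary new];
        _queue = dispatch_queue_create("com.example.cache", DISPATCH_QUEUE_CONCURRENT);
    }
    return self;
}

- (id)objectForKey:(id)key {
    __block id value = nil;
    dispatch_sync(_queue, ^{ value = self->_store[key]; }); // reads may run concurrently
    return value;
}

- (void)setObject:(id)object forKey:(id<NSCopying>)key {
    // The barrier waits for in-flight reads, runs alone, then lets reads resume.
    dispatch_barrier_async(_queue, ^{ self->_store[key] = object; });
}
@end
```

A single serial queue or an NSLock also works; the barrier variant simply keeps reads cheap when reads vastly outnumber writes.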

Conclusion

The key to understanding Cache is understanding the design idea behind it, so that we can grasp our App's behavior more completely and understand the data-processing bottleneck behind each business flow. As we write more code and the business grows more complex, sooner or later we will need to apply Cache design.

Welcome to follow the public account: MrPeakTech.