1. Goals

Understand the data persistence approaches available on mobile devices and the scenarios each one suits, and survey the candidate technologies as a basis for future technology selection.

2. The purpose of data persistence

  1. Fast display for a better experience
    • Data that has already been loaded is shown to the user directly next time, without being reloaded from the network (or disk)
  2. Save user traffic (and server resources)
    • Large resources are cached, so displaying them again does not require another download or consume traffic
    • The number of server requests also drops, which saves server resources (for example, images)
  3. Offline use
    • Content the user has already browsed can be viewed again without a network connection
    • Some features can drop their dependence on the network entirely (Baidu offline maps, book readers)
    • Users can perform operations while offline, and the operations are synchronized to the server the next time a network connection is available
  4. Recording user operations
    • Drafts: for costly operations, each step the user takes is cached, so that after an interruption the user can resume from the last step
    • Caching read markers helps the user identify content that has already been read
    • Caching search history…

3. Classification of data persistence methods

Data persistence on mobile terminals can be generally divided into the following two categories:

1. Memory cache

  • Definition

    Frequently used data is loaded from the network or disk into memory and is not destroyed immediately after use. The next time it is needed, it is read directly from memory.

  • Examples

    • iOS image loading: [UIImage imageNamed:@"imageName"]
    • Third-party network image loading library: SDWebImage

2. Disk caching

  • Definition

    Data loaded from the network or generated by user operations is written to disk. The next time the user views it or resumes the operation, the data is loaded directly from disk.

  • Examples

    • User input draft cache (e.g., comments, text editing)
    • Third-party network image loading library: SDWebImage
    • Search history cache

4. Cache Strategy (Common Cache Algorithm)

In cache design, the storage space of the hardware is finite, so we want to cache only a limited amount of data without taking up too much space, while still achieving a high hit ratio. This is usually done with the help of a cache replacement algorithm.

1. FIFO (First in First out)

Implementation principle:

The core idea of FIFO is that the data that entered the cache first should be evicted first. The cache is managed like a queue ordered by insertion time, and the entry that was inserted earliest is the one that is evicted.

Schematic diagram:

Drawback:

FIFO considers neither recency nor access frequency. If users are likely to revisit recently accessed data, the hit ratio will be low.
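The FIFO policy described above can be sketched in a few lines. This is a minimal illustration rather than any library's implementation; the class name `FifoCache` and its `put`/`get` interface are hypothetical, and values are plain `int`s to keep the example short.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical FIFO cache: eviction follows insertion order only.
class FifoCache {
public:
    explicit FifoCache(std::size_t capacity) : capacity_(capacity) {}

    void put(const std::string& key, int value) {
        auto it = values_.find(key);
        if (it != values_.end()) {          // overwrite: queue position unchanged
            it->second = value;
            return;
        }
        if (capacity_ == 0) return;
        if (values_.size() == capacity_) {  // full: evict the oldest insertion
            values_.erase(order_.front());
            order_.pop_front();
        }
        order_.push_back(key);
        values_.emplace(key, value);
    }

    // Returns true and fills `out` on a hit; a hit never reorders the queue.
    bool get(const std::string& key, int& out) const {
        auto it = values_.find(key);
        if (it == values_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<std::string> order_;               // front = oldest insertion
    std::unordered_map<std::string, int> values_;
};
```

Because `get` does not touch `order_`, an entry that was just read can still be the next one evicted, which is exactly the drawback noted above.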

2. LFU (Least Frequently Used)

Implementation principle:

LFU is based on the idea that "if a piece of data has rarely been used in the recent period, it is unlikely to be used in the future". The number of times each item is accessed is recorded, and the items are kept in a container ordered by access count. The item with the fewest accesses is evicted.

Drawback:

LFU only maintains the access frequency of each item. If an item was accessed very frequently in the past but is rarely accessed now, it is still hard to evict when the cache is full, which lowers the hit ratio.
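A minimal LFU sketch follows. It is an illustration under simplifying assumptions: the class name `LfuCache` is hypothetical, eviction scans all entries linearly instead of keeping them sorted, and ties between equal counts are broken arbitrarily.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical LFU cache: evicts the entry with the lowest access count.
class LfuCache {
public:
    explicit LfuCache(std::size_t capacity) : capacity_(capacity) {}

    bool get(const std::string& key, int& out) {
        auto it = entries_.find(key);
        if (it == entries_.end()) return false;
        ++it->second.hits;                 // every hit raises the frequency
        out = it->second.value;
        return true;
    }

    void put(const std::string& key, int value) {
        auto it = entries_.find(key);
        if (it != entries_.end()) {
            it->second.value = value;
            ++it->second.hits;
            return;
        }
        if (capacity_ == 0) return;
        if (entries_.size() == capacity_) evictLeastFrequent();
        entries_[key] = Entry{value, 1UL};
    }

private:
    struct Entry { int value; unsigned long hits; };

    // Linear scan keeps the sketch short; ties are broken arbitrarily.
    void evictLeastFrequent() {
        auto victim = entries_.begin();
        for (auto it = entries_.begin(); it != entries_.end(); ++it)
            if (it->second.hits < victim->second.hits) victim = it;
        entries_.erase(victim);
    }

    std::size_t capacity_;
    std::unordered_map<std::string, Entry> entries_;
};
```

Note that the counts never decay in this sketch, so an item that was popular long ago keeps its high count and outlives newer items.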

3. LRU (Least Recently Used)

Implementation principle:

LRU is a widely used cache algorithm. It maintains a queue of cache items sorted by the time each item was last accessed. When the cache is full, the item at the tail of the queue, that is, the item that has gone longest without being accessed, is deleted, and the new item is placed at the head of the queue.

Schematic diagram:

Drawback:

LRU only maintains the last access time of each cache item and ignores access frequency and other factors. When there is hot data, LRU works very well, but occasional or periodic batch operations cause its hit ratio to drop sharply. For example, in a VoD (video on demand) system, data a user has already accessed will not be accessed again, so recency is a poor predictor of future use.
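The queue described above is commonly built from a doubly linked list plus a hash map, so that both lookup and reordering take constant time. The following is a minimal sketch of that combination; `LruCache` and its interface are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache: head of the list = most recently used.
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    bool get(const std::string& key, int& out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        items_.splice(items_.begin(), items_, it->second);  // move to head
        out = it->second->second;
        return true;
    }

    void put(const std::string& key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (items_.size() == capacity_) {   // full: evict the tail (LRU item)
            index_.erase(items_.back().first);
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

private:
    typedef std::list<std::pair<std::string, int> > List;
    std::size_t capacity_;
    List items_;
    std::unordered_map<std::string, List::iterator> index_;
};
```

`std::list::splice` relinks a node without invalidating its iterator, which is why the map can keep pointing at the same node while it moves to the head.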

4. LRU-K (Least Recently Used K)

Implementation principle:

Compared with LRU, the core idea is to extend the criterion from "recently used once" to "recently used K times". Specifically, it maintains a queue that records the access history of all data. Data is only put into the cache once it has been accessed K times. When data needs to be evicted, LRU-K evicts the item whose K-th most recent access is furthest from the current time.

Schematic diagram:

Drawback:

Additional space is required to store the access history, and maintaining two queues increases algorithm complexity and CPU consumption.
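The admission and eviction rules of LRU-K can be sketched as follows. This is an illustration under assumptions: `LruKCache`, `get`, and `offer` are hypothetical names, a logical counter stands in for wall-clock time, eviction uses a linear scan, and the access history is never pruned.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical LRU-K cache. Time is a logical counter, not wall-clock time.
class LruKCache {
public:
    LruKCache(std::size_t capacity, std::size_t k)
        : capacity_(capacity), k_(k == 0 ? 1 : k), clock_(0) {}

    // Records an access; returns true and fills `out` only on a cache hit.
    bool get(const std::string& key, int& out) {
        touch(key);
        auto it = cache_.find(key);
        if (it == cache_.end()) return false;
        out = it->second;
        return true;
    }

    // Offers a value loaded from the backing store after a miss. It is
    // admitted only once the key has been accessed at least K times.
    void offer(const std::string& key, int value) {
        auto it = cache_.find(key);
        if (it != cache_.end()) { it->second = value; return; }
        if (capacity_ == 0 || history_[key].size() < k_) return;
        if (cache_.size() == capacity_) evict();
        cache_[key] = value;
    }

private:
    void touch(const std::string& key) {
        std::deque<unsigned long>& times = history_[key];
        times.push_back(++clock_);
        if (times.size() > k_) times.pop_front();  // keep the last K accesses
    }

    // Evicts the cached key whose K-th most recent access is the oldest.
    void evict() {
        auto victim = cache_.begin();
        for (auto it = cache_.begin(); it != cache_.end(); ++it)
            if (history_[it->first].front() < history_[victim->first].front())
                victim = it;
        cache_.erase(victim);
    }

    std::size_t capacity_;
    std::size_t k_;
    unsigned long clock_;
    std::unordered_map<std::string, int> cache_;
    std::unordered_map<std::string, std::deque<unsigned long> > history_;  // never pruned here
};
```

The `history_` map grows with every distinct key ever seen, which makes the extra space cost mentioned above concrete; a real implementation would bound it.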

5. 2Q (Two Queues)

Implementation principle:

The 2Q algorithm is similar to LRU-2. The difference is that 2Q replaces the access history queue of LRU-2 (which only records history and does not cache data) with a FIFO cache queue. In other words, 2Q has two cache queues: one FIFO queue and one LRU queue.

Schematic diagram:

Drawback:

Two queues are required, although each of them is relatively simple. 2Q and LRU-2 have similar hit ratios and memory consumption. However, because the FIFO queue in 2Q already holds the data itself, promoting an item on its second access saves one read or recomputation from the original storage compared with LRU-2.
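A simplified 2Q sketch is shown below: new data enters a FIFO queue, and a second access promotes it to an LRU queue. The class name `TwoQueueCache` and the separate capacities for the two queues are assumptions made for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical 2Q cache: FIFO queue for first-time data, LRU queue for reused data.
class TwoQueueCache {
public:
    TwoQueueCache(std::size_t fifoCapacity, std::size_t lruCapacity)
        : fifoCapacity_(fifoCapacity), lruCapacity_(lruCapacity) {}

    void put(const std::string& key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) { it->second.pos->second = value; return; }
        if (fifoCapacity_ == 0) return;
        if (fifo_.size() == fifoCapacity_) {   // first-time data leaves in FIFO order
            index_.erase(fifo_.front().first);
            fifo_.pop_front();
        }
        fifo_.emplace_back(key, value);
        index_[key] = Slot{std::prev(fifo_.end()), false};
    }

    bool get(const std::string& key, int& out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        Slot& slot = it->second;
        out = slot.pos->second;
        if (slot.inLru) {
            lru_.splice(lru_.begin(), lru_, slot.pos);   // refresh recency
        } else {
            lru_.splice(lru_.begin(), fifo_, slot.pos);  // second access: promote
            slot.inLru = true;
            if (lru_.size() > lruCapacity_) {            // LRU queue over capacity
                index_.erase(lru_.back().first);
                lru_.pop_back();
            }
        }
        return true;
    }

private:
    typedef std::list<std::pair<std::string, int> > List;
    struct Slot { List::iterator pos; bool inLru; };

    std::size_t fifoCapacity_;
    std::size_t lruCapacity_;
    List fifo_;   // front = oldest insertion
    List lru_;    // front = most recently used
    std::unordered_map<std::string, Slot> index_;
};
```

Because the FIFO queue stores the value itself, promotion only relinks a list node; nothing has to be reloaded from the original storage.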

6. MQ (Multi Queue)

Implementation principle:

The MQ algorithm divides data into multiple LRU queues based on priority (access frequency). The core idea is to give priority to keeping data that has been accessed more often.

Schematic diagram:

Drawback:

Multiple queues require extra space. Maintaining them also increases the complexity of the algorithm and CPU consumption.
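The multi-queue idea can be sketched as follows. This is a simplified illustration, not the full MQ algorithm: the class name `MultiQueueCache` is hypothetical, the queue level is assumed to be roughly log2 of the access count, and the expiry-based demotion and history queue of the complete algorithm are left out.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical simplified MQ cache: one LRU queue per frequency level.
class MultiQueueCache {
public:
    MultiQueueCache(std::size_t capacity, std::size_t levels)
        : capacity_(capacity), queues_(levels == 0 ? 1 : levels) {}

    bool get(const std::string& key, int& out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        touch(it->second);
        out = it->second->value;
        return true;
    }

    void put(const std::string& key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->value = value;
            touch(it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (index_.size() == capacity_) evict();
        queues_[0].push_front(Node{key, value, 1UL, 0});
        index_[key] = queues_[0].begin();
    }

private:
    struct Node { std::string key; int value; unsigned long hits; std::size_t level; };
    typedef std::list<Node> Queue;

    // Assumed mapping: level = floor(log2(hits)), capped at the top queue.
    std::size_t levelFor(unsigned long hits) const {
        std::size_t level = 0;
        while (hits > 1 && level + 1 < queues_.size()) { hits >>= 1; ++level; }
        return level;
    }

    // Counts the access and moves the node to the head of its target queue.
    void touch(Queue::iterator node) {
        ++node->hits;
        std::size_t target = levelFor(node->hits);
        queues_[target].splice(queues_[target].begin(), queues_[node->level], node);
        node->level = target;
    }

    // Evicts the LRU item of the lowest non-empty queue.
    void evict() {
        for (std::size_t i = 0; i < queues_.size(); ++i) {
            if (queues_[i].empty()) continue;
            index_.erase(queues_[i].back().key);
            queues_[i].pop_back();
            return;
        }
    }

    std::size_t capacity_;
    std::vector<Queue> queues_;
    std::unordered_map<std::string, Queue::iterator> index_;
};
```

Eviction always starts from the lowest level, so an item that has been accessed several times is protected from a burst of one-off insertions.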

5. Data persistence schemes available on iOS

1. Memory cache

Memory caching can be implemented with Apple's official NSURLCache and NSCache, or with the open source cache libraries YYCache and PINCache, which have advantages in performance and API design.

2. Disk cache

  • NSUserDefaults

    Suitable for caching small amounts of data that are only loosely tied to business logic.

  • Keychain

    Keychain is a reversible encrypted storage mechanism provided by Apple. It is widely used to store user names and passwords. In addition, Keychain is system-level storage and can be synchronized through iCloud. Even if the app is deleted, the Keychain data is retained, and the next time the user installs the app the data can be accessed directly, so it is commonly used to store a unique identifier for the user. Small pieces of sensitive data that need encryption and iCloud synchronization are therefore generally stored in the Keychain.

  • File storage

    • Plist: general structured data can be persisted as a plist
    • Archive: objects that conform to the NSCoding protocol can be archived. This makes it convenient to store and load data as objects, but serialization and deserialization have a performance cost. Use it when you want to read and write objects on disk.
    • Stream: plain file storage, usually used for data such as pictures and video files
  • Database storage

    A database is suitable for relational data, and for cases that require a lot of conditional querying and sorting.

    • Core Data: Object Relational Mapping (ORM)
    • FMDB: one of the most popular open source SQLite wrappers for iOS on GitHub
    • WCDB: an open source SQLite wrapper used by the WeChat team, with Object Relational Mapping (ORM) features, supporting iOS and Android
    • Realm: a cross-platform (iOS, Android) mobile database developed by a Y Combinator startup team

3. Which caching scheme should be used

Select according to demand:

  • For simple data storage, write files directly or use key-value file access.
  • If you need to search and sort by conditions, use relational storage such as SQLite.
  • Sensitive data should be stored encrypted.
  • Small pieces of data (user names, passwords, tokens) that should not be cleared when the app is deleted belong in the Keychain.

6. Comparison of memory and disk data persistence schemes

1. Detailed explanation of optional schemes

1.1 NSCache

NSCache is a simple memory cache provided by Apple. Its API is similar to NSDictionary, except that it is thread-safe, does not copy keys, and handles memory pressure internally (a portion of the cache is removed only while the app is in the background).

1.1.1 Features

  • Properties
    • name: the name of the cache
    • delegate: notified when an object is removed from the cache
    • countLimit: the maximum number of objects stored
    • totalCostLimit: the maximum total cost of stored objects (imprecise)
    • evictsObjectsWithDiscardedContent: whether objects whose content has been discarded are evicted automatically (no usage scenario for this property was found)
  • Methods
    • Synchronously store, fetch, and delete objects by key
    • Delete everything

1.1.2 Implementation

  • NSCacheEntry: an internal class that wraps each key-value pair in an entry to implement a doubly linked list storage structure
    • key: the key
    • value: the value
    • cost: the cost
    • prevByCost: the previous node
    • nextByCost: the next node
  • NSCacheKey: wraps the key so that objects that do not support the NSCopying protocol can be used as keys
    • value: the underlying key value
  • _entries: an NSDictionary that stores NSCacheEntry instances as key-value pairs
  • _head: the head node of the doubly linked list, which is sorted by cost in ascending order; when setObject triggers a costLimit/countLimit trim, entries are removed starting from the head
  • NSLock: provides thread safety for reads and writes

1.2 TMCache

TMCache was originally developed by Tumblr but is no longer maintained. TMMemoryCache implements many features that NSCache lacks, such as count limits, total cost limits, lifetime limits, and clearing the cache on a memory warning or when the app enters the background. For thread safety, TMMemoryCache places all read and write operations on the same concurrent queue and uses dispatch_barrier_async to ensure that tasks execute in order. Its mistake is implementing the access methods with a large number of asynchronous block callbacks, which causes significant performance and deadlock problems. Because the library has been unmaintained for a long time, no detailed comparison is made here.

1.3 PINCache

After Tumblr announced that it would stop maintaining TMCache, Pinterest took over and continues to maintain and improve the library as PINCache. Its functionality and interface are basically the same as TMCache's, but it fixes the performance and deadlock problems. It uses dispatch_semaphore for thread safety and drops dispatch_barrier_async, which avoids the overhead of thread switching and the possible deadlocks.

1.3.1 Features:

  • PINCaching (protocol)

    • Properties
      • name: the name of the cache
    • Methods
      • Synchronously or asynchronously save, read, delete, and check the existence of objects by key, optionally setting a TTL and a storage cost
      • Synchronously/asynchronously delete data older than a specified date (for the disk cache, this means the creation date)
      • Synchronously/asynchronously delete expired data
      • Synchronously/asynchronously delete all data
  • PINMemoryCache

    • Properties
      • totalCost: the total cost currently in use
      • costLimit: the cost (memory) limit (a trim is triggered on every assignment)
      • ageLimit: a uniform lifetime limit (a trim is triggered on every assignment, and a GCD timer triggers it periodically)
      • ttlCache: whether the cache behaves as a TTL cache; if set, only objects still within their lifetime are returned
      • removeAllObjectsOnMemoryWarning
      • removeAllObjectsOnEnteringBackground
      • Block listeners called before/after an object is added or removed
      • Block listeners called before/after all objects are removed
      • Block listeners called after a memory warning is received and after the app enters the background
    • Methods
      • Synchronously/asynchronously trim data until the total cost is at or below a specified cost
      • Synchronously/asynchronously delete data older than a specified date, then continue trimming to the specified cost (trimToCostLimitByDate)
      • Synchronously/asynchronously enumerate all cached data
    • Internal implementation
      • An NSMutableDictionary stores the cached data, and additional NSMutableDictionary instances store createdDates, accessDates, costs, ageLimits, and so on
      • A mutex provides thread safety
      • PINOperationQueue is used for asynchronous operations
      • When setObject triggers a costLimit trim, accessDates are sorted to implement the LRU policy
  • PINDiskCache

    • Properties
      • prefix: the cache name prefix
      • cacheURL: the URL of the cache directory
      • byteCount: the size of the data stored on disk
      • byteLimit: the maximum disk space limit, 50MB by default (a trim is triggered on every assignment)
      • ageLimit: same as PINMemoryCache; 30 days by default
      • writingProtectionOption: the file write mode (NSDataWritingOptions)
      • ttlCache: same as PINMemoryCache
      • removeAllObjectsOnMemoryWarning (same as PINMemoryCache)
      • removeAllObjectsOnEnteringBackground (same as PINMemoryCache)
      • Block listeners called before/after an object is added or removed (same as PINMemoryCache)
      • Block listeners called before/after all objects are removed (same as PINMemoryCache)
      • Block listeners called after a memory warning is received and after the app enters the background (same as PINMemoryCache)
      • Custom encoding and decoding of keys (by default the special characters . : / % are escaped)
      • Custom serialization and deserialization of data (NSKeyedArchiver by default, which requires the NSCoding protocol)
    • Methods
      • lockFileAccessWhileExecutingBlockAsync, synchronouslyLockFileAccessWhileExecutingBlock: run the callback block after all pending file write operations have finished
      • fileURLForKey: gets the file URL for the specified key
      • Synchronously/asynchronously trim data to a specified cost (same as PINMemoryCache)
      • Synchronously/asynchronously delete data older than a specified date, then continue trimming to costLimit (same as PINMemoryCache)
      • Synchronously/asynchronously enumerate all cached data (same as PINMemoryCache)
    • Internal implementation
      • PINDiskCacheMetadata holds per-file information such as createdDate, lastModifiedDate, size, and ageLimit. At initialization, the metadata for all files is loaded, stored in an NSMutableDictionary, and accessed by fileKey.
      • The file is read to obtain createdDate, lastModifiedDate, size, and other information, which is written into the metadata; setxattr, removexattr, and getxattr are used to store and read the ageLimit, which is also written back into the metadata
      • trimDiskToSize: deletes files in descending order of file size, so large files are deleted first
      • trimDiskToSizeByDate: sorts by last modified time in ascending order, so files that have not been accessed for the longest time are deleted first (LRU)
      • trimToDate: deletes files whose creation date is earlier than the specified date (in reverse order of modification time)
      • A mutex provides thread safety
      • PINOperationQueue is used for asynchronous operations
      • The LRU policy is implemented by sorting accessDates
  • PINCache

    • Properties
      • diskByteCount: the byteCount of the diskCache
      • diskCache: the disk cache
      • memoryCache: the memory cache
    • Methods
      • Only the initialization methods and the protocol implementation
    • Implementation
      • Two-level cache: memory is checked first, then disk; when a value is fetched from disk, the memory cache is updated as well
      • The same PINOperationQueue is used for asynchronous operations
      • PINOperationGroup implements the completion callback once both the memory cache and disk cache operations have finished

1.3.2 Implementation

  • PINOperationQueue (asynchronous tasks run on a custom PINOperationQueue)
    • pthread_mutex with PTHREAD_MUTEX_RECURSIVE (makes adding operations thread-safe)
    • dispatch_queue:
      • DISPATCH_QUEUE_SERIAL: used when the concurrency limit is 1. A serial queue also guards changes to the semaphore (the semaphore count is adjusted when the concurrency limit changes)
      • DISPATCH_QUEUE_CONCURRENT: runs the time-consuming work inside blocks
    • dispatch_group: blocks the current thread; used to implement waitUntilAllOperationsAreFinished
    • dispatch_semaphore: controls the number of concurrent operations; used when the concurrency limit is greater than 1
  • PINOperationGroup
    • Uses dispatch_group_enter, dispatch_group_leave, and dispatch_group_notify, and calls back a block when the group finishes
  • LRU eviction
    • Each time a new object is set, the part exceeding costLimit is deleted in order of access time, oldest first
  • Thread safety
    • pthread_mutex_lock: mutex
    • PINOperationQueue: runs queued tasks across multiple threads
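The cost-based LRU eviction described above (trim on every set until the total cost fits under costLimit) can be sketched as follows. This is an illustration of the idea, not PINCache's actual code; `CostLimitedCache`, `set`, and `trimToCost` are hypothetical names, and cost is whatever unit the caller chooses.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical cost-limited LRU cache: every set() trims from the LRU end.
class CostLimitedCache {
public:
    explicit CostLimitedCache(std::size_t costLimit)
        : costLimit_(costLimit), totalCost_(0) {}

    void set(const std::string& key, const std::string& value, std::size_t cost) {
        auto it = index_.find(key);
        if (it != index_.end()) {              // replace: drop the old entry first
            totalCost_ -= it->second->cost;
            items_.erase(it->second);
            index_.erase(it);
        }
        items_.push_front(Item{key, value, cost});
        index_[key] = items_.begin();
        totalCost_ += cost;
        trimToCost(costLimit_);                // trim is triggered on every assignment
    }

    bool get(const std::string& key, std::string& out) {
        auto it = index_.find(key);
        if (it == index_.end()) return false;
        items_.splice(items_.begin(), items_, it->second);  // mark as recently used
        out = it->second->value;
        return true;
    }

    // Removes least recently used items until the total cost is within `limit`.
    void trimToCost(std::size_t limit) {
        while (totalCost_ > limit && !items_.empty()) {
            totalCost_ -= items_.back().cost;
            index_.erase(items_.back().key);
            items_.pop_back();
        }
    }

    std::size_t totalCost() const { return totalCost_; }

private:
    struct Item { std::string key; std::string value; std::size_t cost; };
    typedef std::list<Item> List;

    std::size_t costLimit_;
    std::size_t totalCost_;
    List items_;   // front = most recently used
    std::unordered_map<std::string, List::iterator> index_;
};
```

One consequence of this design is that an object whose cost alone exceeds the limit is evicted by the same `set` call that stored it.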

1.4 YYCache

YYCache is an open source cache implementation that benchmarks itself against PINCache. It implements most of PINCache's capabilities while adding targeted performance optimizations. Compared with PINMemoryCache, its memory cache removes the asynchronous access interface, optimizes synchronous access as far as possible, and uses a pthread_mutex_t mutex (in place of the OSSpinLock used in earlier versions) to ensure thread safety. The LRU eviction algorithm is implemented inside the cache with a doubly linked list and an NSDictionary. The disk cache supports a size threshold that controls whether a value is written to a file or saved in the database.

1.4.1 Features:

  • YYMemoryCache

    • Properties
      • name: the name of the cache
      • totalCount: the number of cached objects
      • totalCost: the total cost currently in use
      • countLimit: the count limit (not strict; a GCD timer periodically triggers a trim on a background thread)
      • costLimit: the memory usage limit (not strict; a GCD timer periodically triggers a trim on a background thread)
      • ageLimit: a uniform lifetime limit (not strict; a GCD timer periodically triggers a trim on a background thread)
      • autoTrimInterval: the trim interval, 5s by default
      • shouldRemoveAllObjectsOnMemoryWarning
      • shouldRemoveAllObjectsWhenEnteringBackground
      • releaseOnMainThread: whether cached key-value pairs are destroyed on the main thread. When set to YES, YYMemoryCache waits until it is back on the main thread before destroying (releasing) cached objects.
      • releaseAsynchronously: whether key-value pairs are destroyed on an asynchronous thread. The default value is YES
      • Block listeners called after a memory warning is received and after the app enters the background
    • Methods
      • Synchronously save, fetch, delete, and check the existence of objects by key, optionally setting the memory cost of each object
      • Delete all cached objects synchronously/asynchronously (determined by releaseOnMainThread and releaseAsynchronously)
      • Synchronously trim until the count is at or below a specified value
      • Synchronously trim until the cost is at or below a specified value (starting from the tail, that is, removing the data that has gone longest without access)
      • Synchronously trim data older than a specified age
    • Internal implementation
      • _YYLinkedMapNode: a linked list node that stores key, value, prev, next, cost, and time (CACurrentMediaTime, the last access time)
      • _YYLinkedMap: a linked list of _YYLinkedMapNode nodes that supports adding, removing, and reordering nodes
        • dic, totalCost, totalCount, head (MRU), tail (LRU), releaseOnMainThread, releaseAsynchronously
        • insertNodeAtHead
        • bringNodeToHead
        • removeNode
        • removeTailNode
        • removeAll
        • The most recently accessed node is placed at the head of the list, so trimming can proceed directly from the tail
      • A pthread_mutex_t mutex provides thread safety
      • A DISPATCH_QUEUE_SERIAL queue runs the trim task when adding an object pushes the cache past costLimit
  • YYDiskCache

    • Properties
      • name: the cache name
      • path: the cache path
      • inlineThreshold: the threshold that decides between SQLite and file storage. Values larger than the threshold are saved as files. The default value is 20KB
      • customArchiveBlock, customUnarchiveBlock: custom serialization and deserialization of data (NSKeyedArchiver by default, which requires the NSCoding protocol)
      • customFileNameBlock: customizes the file name based on the key
      • countLimit: same as YYMemoryCache; unlimited by default
      • costLimit: same as YYMemoryCache, but here it refers to the actual size on disk; unlimited by default
      • ageLimit: same as YYMemoryCache; unlimited by default
      • freeDiskSpaceLimit: the minimum free disk space that must remain for caching to proceed; 0 by default
      • autoTrimInterval: same as YYMemoryCache; 60s by default
      • errorLogsEnabled: whether error logs are enabled
    • Methods
      • Synchronously/asynchronously save, fetch, check the existence of, and delete data by key
      • Synchronously/asynchronously delete all data
      • Asynchronously delete all data with a progress callback block
      • Synchronously/asynchronously get totalCount and totalCost
      • Synchronous/asynchronous trimToCount, trimToCost, and trimToAge
      • Bind extendedData to a specified object
    • Internal implementation
      • dispatch_semaphore_t: the semaphore is initialized to 1 and used as a lock
      • dispatch_queue_t: DISPATCH_QUEUE_CONCURRENT; trim and CRUD operations run on asynchronous threads
        • Note: as a result, all callback blocks of asynchronous operations run on a background thread, not the main thread
      • _globalInstances: an NSMapTable that caches all initialized diskCache instances, with strong keys and weak values
      • YYKVStorage
        • Properties
          • path: the cache path
          • type: YYKVStorageTypeFile, YYKVStorageTypeSQLite, or YYKVStorageTypeMixed
          • errorLogsEnabled
        • Methods
          • Save key-value data
          • Delete key-value data by key; delete data larger than a specified size (in order of access time, oldest first, 16 items at a time); delete data older than a specified time (same approach); delete data until the total storage size is within a specified size; delete data until the total item count is within a specified count; delete all data
          • Fetch data by key
          • Check whether data exists for a key; get the item count; get the storage size in use
        • Implementation
          • SQLite is used internally to store and look up data
          • Delete all data: files are first moved to a dedicated trash directory, and the trash directory is then deleted in the background (presumably because moving files is faster than deleting them)
          • DISPATCH_QUEUE_SERIAL: deletes the trash
  • YYCache

    • Properties
      • name: the name of the cache
      • memoryCache: the memory cache
      • diskCache: the disk cache
    • Methods
      • Synchronously/asynchronously store, fetch, check the existence of, and delete data by key
      • Synchronously/asynchronously delete all data
      • Asynchronously delete all data with a progress callback block
    • Implementation
      • Two-level cache: memory first, then disk
      • Asynchronous operations run directly on the global queue
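The two-level lookup (memory first, then disk, with the memory level refilled after a disk hit) can be sketched as follows. This is only an illustration of the control flow: `TwoLevelCache` is a hypothetical name, and two in-process maps stand in for the real memory and disk caches.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical two-level cache. The "disk" level is simulated by a map here;
// a real implementation would read and write files or a database.
class TwoLevelCache {
public:
    void set(const std::string& key, const std::string& value) {
        memory_[key] = value;   // write both levels
        disk_[key] = value;
    }

    bool get(const std::string& key, std::string& out) {
        auto m = memory_.find(key);
        if (m != memory_.end()) {          // level 1: memory hit
            out = m->second;
            return true;
        }
        auto d = disk_.find(key);
        if (d == disk_.end()) return false;
        memory_[key] = d->second;          // level 2 hit: refill the memory level
        out = d->second;
        return true;
    }

    // Simulates clearing the memory level, e.g. on a memory warning.
    void dropMemory() { memory_.clear(); }

    bool inMemory(const std::string& key) const { return memory_.count(key) != 0; }

private:
    std::unordered_map<std::string, std::string> memory_;
    std::unordered_map<std::string, std::string> disk_;
};
```

After `dropMemory()`, the first `get` for a key is served from the slower level and the following ones from memory again.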

1.4.2 Implementation

  • Disk access: YYKVStorage wraps file reading and writing as well as the SQLite operations, and the actual storage operations are delegated to it
  • Memory LRU eviction: each time a new object is set, the part exceeding costLimit is removed from the tail of the linked list, least recently used first
  • Thread safety
    • pthread_mutex_lock: a mutex makes the memory cache thread-safe
    • dispatch_semaphore_t: a semaphore initialized to 1, used as a lock
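The inlineThreshold decision made by the disk cache can be expressed as a small helper. This sketch only illustrates the rule stated earlier (values larger than the threshold go to files, smaller ones go into SQLite); the enum, the function name `chooseStorage`, and the constant name are hypothetical.

```cpp
#include <cstddef>

enum class StorageTarget { SqliteInline, SeparateFile };

// 20KB, the default threshold quoted in the feature list.
const std::size_t kDefaultInlineThreshold = 20 * 1024;

// Hypothetical helper: values larger than the threshold are written to a file
// (with only their index information kept in SQLite); others are stored inline.
inline StorageTarget chooseStorage(std::size_t valueBytes,
                                   std::size_t inlineThreshold = kDefaultInlineThreshold) {
    return valueBytes > inlineThreshold ? StorageTarget::SeparateFile
                                        : StorageTarget::SqliteInline;
}
```

With the default threshold, a 1KB value would be stored inline in SQLite, while a value one byte over 20KB would be written to its own file.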

2. Comparison of memory cache schemes

2.1 Performance

The read and write performance of YYCache is excellent. NSCache and PINCache each have their own advantages and disadvantages.

  • Single-thread performance test chart from the YYCache design notes:

  • My performance test chart:

Performance test description:

The performance tests were run with a debug build based on the YYCache demo, so the numbers do not represent real-world performance.

2.2 Comparison

The SDKs are compared by API, ease of use, implementation, advantages and disadvantages, and maintenance status:

  • NSCache
    • API: synchronous save, fetch, and delete; set costLimit, countLimit, and delegate (notified only when a trim deletes an object)
    • Ease of use: medium
    • Implementation: NSLock provides thread safety. Key-value information is converted internally into linked list entries, which are accessed through an NSDictionary. When a trim is triggered, entries are removed from the cost-sorted linked list. A memory warning while the app is in the background clears part of the storage
    • Advantages and disadvantages: official and therefore fairly reliable, but hard to extend, incomplete in features, and average in performance
    • Maintenance: maintained by Apple
  • PINMemoryCache
    • API: synchronous/asynchronous save, fetch, delete, trim, and traversal of all stored data; set costLimit, ageLimit, ttlCache (expired data is not returned and is cleared), removeAllObjectsOnMemoryWarning, and removeAllObjectsOnEnteringBackground; block callbacks for adding and deleting key-value pairs; block callbacks for entering the background and for memory warnings
    • Ease of use: high
    • Implementation: a pthread_mutex_t mutex provides thread safety. An NSDictionary holds the entities, and additional NSDictionary instances hold each entity's creation time, update time, cost, ageLimit, and other information. A GCD timer triggers periodic trims
    • Advantages and disadvantages: the LRU policy is implemented according to the update time of stored data. However, splitting the internal storage across multiple NSDictionary instances degrades performance
    • Maintenance: maintained by Pinterest
  • YYMemoryCache
    • API: synchronous save, fetch, delete, and trim; set countLimit, costLimit, ageLimit, autoTrimInterval, shouldRemoveAllObjectsOnMemoryWarning, and shouldRemoveAllObjectsWhenEnteringBackground; block listeners for entering the background and for memory warnings
    • Ease of use: high
    • Implementation: a pthread_mutex_t mutex provides thread safety. The internal _YYLinkedMapNode class stores key-value pair information and forms a doubly linked list. The data is ordered by access time, most recent first, and the LRU cache is built on that ordering
    • Advantages and disadvantages: complete features, easy to use, LRU policy, high performance. However, there is no protocol abstraction, and the memory and disk caches repeat a lot of code
    • Maintenance: no longer maintained by the author

3. Comparison of disk caching schemes

3.1 Performance

YYCache wins for small data access. YYCache is faster when accessing files larger than 20KB.

  • Performance test chart from the YYCache design notes:

  • My performance test

Performance test description: the performance tests were run with a debug build based on the YYCache demo, so the numbers do not represent real-world performance.

3.2 Comparison

The SDKs are compared by API, ease of use, implementation, advantages and disadvantages, and maintenance status:

  • PINDiskCache
    • API: synchronous/asynchronous save, fetch, delete, existence check, and trim by date/size/sizeByDate; set byteLimit, ageLimit, ttlCache (expired data is not returned and is cleared), and NSDataWritingOptions (file write mode); set custom serialization blocks for data and custom encoding/decoding blocks for keys; block callbacks for adding and deleting key-value pairs; callbacks for deleting all data; get the cache URL, the space in use, and the fileURL of a single stored file; run a specified operation once the file write lock is released; iterate over all stored files
    • Ease of use: high
    • Implementation: a pthread_mutex_t mutex provides read-write thread safety, and pthread_cond_t protects file reads and writes. PINDiskCacheMetadata keeps file information in memory for fast reading, and an NSDictionary maps keys to entities. A GCD timer triggers periodic trims. dispatch_semaphore_t controls concurrency in the custom operation queue, and queued cache tasks are executed in order
    • Advantages and disadvantages: complete features, easy to use, protocol-oriented implementation, and a clear overall architecture. Trim operations implement the LRU policy according to the update time of stored data
    • Maintenance: maintained by Pinterest
  • YYDiskCache
    • API: synchronous/asynchronous save, read, delete, existence check, and trim by count/cost/age; get totalCost and totalCount; set inlineThreshold, countLimit, costLimit, ageLimit, freeDiskSpaceLimit, and autoTrimInterval; set custom serialization blocks for data and a custom block for file names
    • Ease of use: high
    • Implementation: a dispatch_semaphore_t semaphore provides thread safety. The internal YYKVStorageItem class stores key-value pair information such as key, value, filename, size, modTime, accessTime, and extendedData, and the actual file access is handled by YYKVStorage. Because SQLite reads and writes small values faster than direct file access does, a threshold selects the storage mode: data smaller than the threshold is stored directly in SQLite, while for data above the threshold only the index information is stored in SQLite and the data itself is stored in a file. As a result, small data access is several times faster than with PINDiskCache
    • Advantages and disadvantages: complete features, easy to use, LRU policy, high performance, and a more efficient mix of storage strategies. However, there is no protocol abstraction, and the memory and disk caches repeat a lot of code
    • Maintenance: no longer maintained by the author

7. Database cache

1.1 Background

Native SQLite is cumbersome to use. A single SQL operation takes a lot of code, and because the API is written in C it is unfriendly to Objective-C and other language developers. To execute one SQL statement, you need to do something like the following:

#import <Foundation/Foundation.h>
#import <sqlite3.h>

- (void)example {
    sqlite3 *conn = NULL;
    // 1. Open the database.
    NSString *path = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES).firstObject
                      stringByAppendingPathComponent:@"MyDatabase.db"];
    int result = sqlite3_open(path.UTF8String, &conn);
    if (result != SQLITE_OK) {
        sqlite3_close(conn);
        return;
    }
    const char *createTableSQL =
        "CREATE TABLE t_test_table (int_col INT, float_col REAL, string_col TEXT)";
    sqlite3_stmt *stmt = NULL;
    int len = (int)strlen(createTableSQL);
    // 2. Prepare the CREATE TABLE statement. If preparation fails, release the
    //    sqlite3_stmt object with sqlite3_finalize to prevent a memory leak.
    if (sqlite3_prepare_v2(conn, createTableSQL, len, &stmt, NULL) != SQLITE_OK) {
        if (stmt)
            sqlite3_finalize(stmt);
        sqlite3_close(conn);
        return;
    }
    // 3. Run sqlite3_step to create the table. For DDL and DML statements the only
    //    successful return value is SQLITE_DONE. For SELECT queries, SQLITE_ROW is
    //    returned while rows are available and SQLITE_DONE at the end of the result set.
    if (sqlite3_step(stmt) != SQLITE_DONE) {
        sqlite3_finalize(stmt);
        sqlite3_close(conn);
        return;
    }
    // 4. Release the statement object used to create the table.
    sqlite3_finalize(stmt);
    printf("Succeed to create test table now.\n");
    // 5. Prepare the sqlite3_stmt object used to query the table.
    const char *selectSQL = "SELECT * FROM t_test_table WHERE 1 = 0";
    sqlite3_stmt *stmt2 = NULL;
    if (sqlite3_prepare_v2(conn, selectSQL, (int)strlen(selectSQL), &stmt2, NULL) != SQLITE_OK) {
        if (stmt2)
            sqlite3_finalize(stmt2);
        sqlite3_close(conn);
        return;
    }
    // 6. Get the number of fields in the result set from the SELECT statement object.
    int fieldCount = sqlite3_column_count(stmt2);
    printf("The column count is %d.\n", fieldCount);
    // 7. Walk the meta information of each field and read its declared type.
    for (int i = 0; i < fieldCount; ++i) {
        // The table is empty and SQLite's data types are dynamic, so sqlite3_column_type
        // would only return SQLITE_NULL until there is data. sqlite3_column_decltype
        // returns the type declared in the table definition instead.
        const char *declType = sqlite3_column_decltype(stmt2, i);
        NSString *stype = declType ? [NSString stringWithUTF8String:declType].lowercaseString : @"";
        // Parse the declared type to determine the field's affinity.
        if ([stype containsString:@"int"]) {
            printf("The type of %dth column is INTEGER.\n", i);
        } else if ([stype containsString:@"char"]
                   || [stype containsString:@"text"]) {
            printf("The type of %dth column is TEXT.\n", i);
        } else if ([stype containsString:@"real"]
                   || [stype containsString:@"floa"]
                   || [stype containsString:@"doub"]) {
            printf("The type of %dth column is DOUBLE.\n", i);
        }
    }
    sqlite3_finalize(stmt2);
    sqlite3_close(conn);
}
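The last step of the native example derives a column's affinity from its declared type. As a compact restatement of that rule, here is a minimal sketch in Python's built-in sqlite3 module (used instead of Objective-C so it runs standalone). The substring order follows SQLite's documented affinity rules; the helper name column_affinity is an assumption for illustration.

```python
import sqlite3


def column_affinity(declared_type):
    """Return the affinity SQLite derives from a column's declared type."""
    t = (declared_type or "").upper()
    if "INT" in t:
        return "INTEGER"
    if "CHAR" in t or "CLOB" in t or "TEXT" in t:
        return "TEXT"
    if "BLOB" in t or t == "":
        return "BLOB"
    if "REAL" in t or "FLOA" in t or "DOUB" in t:
        return "REAL"
    return "NUMERIC"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_test_table (int_col INT, float_col REAL, string_col TEXT)")

# PRAGMA table_info reports each column's declared type even when the table is empty.
declared = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(t_test_table)")}
affinities = {name: column_affinity(t) for name, t in declared.items()}
```

Because the checks are substring matches applied in order, a declared type such as FLOATING POINT ends up with INTEGER affinity (it contains "INT"), which is one reason wrappers tend to generate column types rather than accept free-form ones.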

Since SQLite is not easy to use directly on mobile, a number of SQLite wrappers have emerged. The following popular libraries are all ultimately built on SQLite:

  • CoreData: Apple's ORM (Object Relational Mapping) framework built on SQLite, which maps objects directly to storage. Because of its poor performance, high learning cost, and many pitfalls (see "Why I don't like CoreData"), it is not covered in detail below.
  • FMDB: the most widely used Objective-C wrapper for SQLite on iOS (judging by GitHub), with support for queued operations.
  • WCDB: an open source SQLite wrapper from the WeChat technical team that supports object-to-database mapping. It is an ORM database implementation and is more efficient than FMDB.

One special case implements its ORM data store on its own database engine:

  • Realm: a mobile database that the Realm team built on a self-developed engine rather than as a wrapper around SQLite. It is also an ORM database implementation, and it is an MVCC database.

1.2 Comparison

Working with SQLite covers the basic insert, delete, update, and query operations. In real projects it also involves converting rows into model objects and upgrading the database version by adding or dropping tables and fields and migrating data. The following examples show these operations in each popular library to compare their ease of use.
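Since database version upgrades come up in every library compared here, a minimal sketch of the underlying idea may help. It uses SQLite's PRAGMA user_version through Python's built-in sqlite3 module (chosen over Objective-C so it runs standalone). The MIGRATIONS list and the upgrade function are hypothetical; each library has its own upgrade mechanism on top of this.

```python
import sqlite3

# Hypothetical migration steps: entry i upgrades schema version i to version i + 1.
MIGRATIONS = [
    "CREATE TABLE message (localID INTEGER PRIMARY KEY, content TEXT)",
    "ALTER TABLE message ADD COLUMN createTime REAL",
]


def upgrade(conn):
    """Apply every migration newer than the version stored in the database file."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    for step in range(version, len(MIGRATIONS)):
        conn.execute(MIGRATIONS[step])
        # PRAGMA values cannot be bound as parameters; step + 1 is always an int here.
        conn.execute(f"PRAGMA user_version = {step + 1}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Running upgrade on an already current database is a no-op, so it can be called on every launch. A production version would likely wrap each step in an explicit transaction so that a failed migration leaves the version number untouched.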

1.2.1 FMDB

FMDB is an Objective-C wrapper around SQLite that turns the C-language SQL operations into Objective-C-style code. Its main features are:

  • Objective-C style, eliminating a lot of repetitive, redundant C code
  • Thread-safe database operation methods that keep data consistent
  • Much lighter than CoreData, Realm, and similar frameworks
  • Transaction support
  • Full-text search support (FTS subspec)
  • Checkpoint operations in Write-Ahead Logging (WAL) mode
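For readers unfamiliar with the WAL feature in the list above, the following minimal sketch shows what a checkpoint does at the SQLite level. It uses Python's built-in sqlite3 module rather than FMDB so that it runs standalone; the file name and table are assumptions for illustration.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal_demo.sqlite")  # hypothetical file name
conn = sqlite3.connect(path)

# Switch the database file to write-ahead logging; the pragma reports the mode now in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

conn.execute("CREATE TABLE IF NOT EXISTS message (localID INTEGER PRIMARY KEY, content TEXT)")
with conn:  # commits the insert, which is appended to the -wal file
    conn.execute("INSERT INTO message (content) VALUES (?)", ("hello",))

# A checkpoint copies committed pages from the -wal file back into the main database file.
busy, log_frames, checkpointed_frames = conn.execute("PRAGMA wal_checkpoint(FULL)").fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM message").fetchone()[0]
```

In WAL mode readers do not block the writer, and checkpoints keep the log file from growing without bound, which is why a wrapper exposing the checkpoint operation is useful.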

FMDB basic operation examples:

// Create a table
NSString *sql = [NSString stringWithFormat:@"CREATE TABLE IF NOT EXISTS t_test_1 ('%@' INTEGER PRIMARY KEY AUTOINCREMENT,'%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' TEXT NOT NULL, '%@' INTEGER NOT NULL, '%@' FLOAT NOT NULL)", KEY_ID, KEY_MODEL_ID, KEY_MODEL_NAME, KEY_SERIES_ID, KEY_SERIES_NAME, KEY_TITLE, KEY_PRICE, KEY_DEALER_PRICE, KEY_SALES_STATUS, KEY_IS_SELECTED, KEY_DATE];
FMDatabaseQueue *_dbQueue = [FMDatabaseQueue databaseQueueWithPath:@"path"];
[_dbQueue inDatabase:^(FMDatabase *db) {
    BOOL result = [db executeUpdate:sql];
    if (result) {
        //
    }
}];

// Insert
NSString *insertSql = [NSString stringWithFormat:@"INSERT INTO 't_test_1'(%@,%@,%@,%@,%@,%@,%@,%@,%@,%@) VALUES(\"%@\",\"%@\",\"%@\",\"%@\",\"%@\",\"%@\",\"%@\",\"%@\",%d,%.2f)", KEY_MODEL_ID, KEY_MODEL_NAME, KEY_SERIES_ID, KEY_SERIES_NAME, KEY_TITLE, KEY_PRICE, KEY_DEALER_PRICE, KEY_SALES_STATUS, KEY_IS_SELECTED, KEY_DATE, model.model_id, model.model_name, model.Id, model.Name, model.title, model.price, model.dealer_price, model.sales_status, isSelected, time];
[_dbQueue inDatabase:^(FMDatabase *db) {
    BOOL result = [db executeUpdate:insertSql];
    if (result) {
        //
    }
}];

// Update
NSString *updateSql = @"UPDATE t_userData SET userName = ?, userAge = ? WHERE id = ?";
[_dbQueue inDatabase:^(FMDatabase *db) {
    BOOL res = [db executeUpdate:updateSql, _nameTxteField.text, _ageTxteField.text, _userId];
    if (res) {
        //
    }
}];

// Delete
NSString *str = [NSString stringWithFormat:@"DELETE FROM t_userData WHERE id = %ld", userid];
[_dbQueue inDatabase:^(FMDatabase *db) {
    BOOL res = [db executeUpdate:str];
    if (res) {
        //
    }
}];

// Query
[_dbQueue inDatabase:^(FMDatabase *db) {
    FMResultSet *resultSet = [db executeQuery:@"SELECT * FROM message"];
    NSMutableArray<Message *> *messages = [[NSMutableArray alloc] init];
    while ([resultSet next]) {
        Message *message = [[Message alloc] init];
        message.localID = [resultSet intForColumnIndex:0];
        message.content = [resultSet stringForColumnIndex:1];
        message.createTime = [NSDate dateWithTimeIntervalSince1970:[resultSet doubleForColumnIndex:2]];
        message.modifiedTime = [NSDate dateWithTimeIntervalSince1970:[resultSet doubleForColumnIndex:3]];
        [messages addObject:message];
    }
}];
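The update statement in the FMDB example binds its values through ? placeholders, while the insert and delete statements format values directly into the SQL string. A minimal sketch of why binding is the safer habit, written with Python's built-in sqlite3 module rather than FMDB so it runs standalone (the helper names insert_user and find_user are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_userData (id INTEGER PRIMARY KEY, userName TEXT, userAge INTEGER)"
)


def insert_user(conn, name, age):
    """Insert one row, letting SQLite bind the values instead of formatting them into the SQL."""
    with conn:
        cur = conn.execute(
            "INSERT INTO t_userData (userName, userAge) VALUES (?, ?)", (name, age)
        )
    return cur.lastrowid


def find_user(conn, user_id):
    """Return (userName, userAge) for the given id, or None when no row matches."""
    return conn.execute(
        "SELECT userName, userAge FROM t_userData WHERE id = ?", (user_id,)
    ).fetchone()


tricky = 'O\'Brien "Rex"'  # quotes that would break a hand-formatted statement
user_id = insert_user(conn, tricky, 30)
```

With bound parameters the value never becomes part of the SQL text, so quotes in user input neither break the statement nor open the door to SQL injection.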

1.2.2 WCDB

WCDB is an open source SQLite wrapper that the WeChat technical team extracted from the WeChat app. It has the following characteristics:

  • The ORM mapping is declared through macros. Based on that mapping, WCDB handles table creation, adding new columns, renaming fields (by binding an alias), and binding initial data.
  • With the built-in WINQ syntax, most scenarios do not require hand-written SQL statements, which makes it easy to use.
  • Safe multithreaded reads and writes are implemented internally (writes are still serialized), and database initialization is optimized to improve performance (see "WeChat iOS SQLite source code optimization practice").

Solutions are also provided for many other scenarios:

  • Error statistics
  • Performance statistics
  • Corruption repair (see "WeChat mobile database component WCDB series (2): three approaches to database repair")
  • Anti-injection
  • Encryption

In WCDB, Object Relational Mapping (ORM) refers to the process of:

  • mapping an ObjC class to a database table and its indexes;
  • mapping the class's properties to the fields of that table.

With ORM, database operations can be performed directly on objects, with no manual assembly step.
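The two mappings just listed (class to table, property to field) can be sketched in a few lines. This is an illustration of the ORM idea only, written in Python with the built-in sqlite3 and dataclasses modules so it runs standalone; it is not how WCDB's macros work, and SQL_TYPES, create_table, insert_object, and get_objects are hypothetical names.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields

SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT"}  # hypothetical type mapping


@dataclass
class Message:
    localID: int
    content: str
    createTime: float


def create_table(conn, cls, primary_key):
    """Map the class to a table and each property to a column."""
    cols = ", ".join(
        f"{f.name} {SQL_TYPES[f.type]}" + (" PRIMARY KEY" if f.name == primary_key else "")
        for f in fields(cls)
    )
    conn.execute(f"CREATE TABLE IF NOT EXISTS {cls.__name__.lower()} ({cols})")


def insert_object(conn, obj):
    """Insert an object without the caller writing any SQL."""
    names = [f.name for f in fields(obj)]
    placeholders = ", ".join("?" for _ in names)
    table = type(obj).__name__.lower()
    with conn:
        conn.execute(
            f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})",
            astuple(obj),
        )


def get_objects(conn, cls, order_by):
    """Load all rows of the class's table back into objects."""
    names = [f.name for f in fields(cls)]
    if order_by not in names:
        raise ValueError("unknown property: " + order_by)
    rows = conn.execute(
        f"SELECT {', '.join(names)} FROM {cls.__name__.lower()} ORDER BY {order_by}"
    )
    return [cls(*row) for row in rows]
```

Once the mapping exists, callers only ever see objects, which is the convenience ORM libraries trade against the flexibility of raw SQL.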

WCDB basic operation example:

//Message.h
@interface Message : NSObject

@property int localID;
@property(retain) NSString *content;
@property(retain) NSDate *createTime;
@property(retain) NSDate *modifiedTime;
@property(assign) int unused; //You can only define the properties you need

@end
//Message.mm
#import "Message.h"
@implementation Message

WCDB_IMPLEMENTATION(Message)
WCDB_SYNTHESIZE(Message, localID)
WCDB_SYNTHESIZE(Message, content)
WCDB_SYNTHESIZE(Message, createTime)
WCDB_SYNTHESIZE(Message, modifiedTime)

WCDB_PRIMARY(Message, localID)

WCDB_INDEX(Message, "_index", createTime)

@end
//Message+WCTTableCoding.h
#import "Message.h"
#import <WCDB/WCDB.h>

@interface Message (WCTTableCoding) <WCTTableCoding>

WCDB_PROPERTY(localID)
WCDB_PROPERTY(content)
WCDB_PROPERTY(createTime)
WCDB_PROPERTY(modifiedTime)

@end
WCTDatabase *database = [[WCTDatabase alloc] initWithPath:path];

// Create the table and indexes
/*
 CREATE TABLE message (localID INTEGER PRIMARY KEY,
                       content TEXT,
                       createTime BLOB,
                       modifiedTime BLOB)
 */
BOOL result = [database createTableAndIndexesOfName:@"message" withClass:Message.class];

// Insert
Message *message = [[Message alloc] init];
message.localID = 1;
message.content = @"Hello, WCDB!";
message.createTime = [NSDate date];
message.modifiedTime = [NSDate date];
/*
 INSERT INTO message(localID, content, createTime, modifiedTime)
 VALUES(1, "Hello, WCDB!", 1496396165, 1496396165);
 */
result = [database insertObject:message
                           into:@"message"];

// Delete
// DELETE FROM message WHERE localID > 0;
result = [database deleteObjectsFromTable:@"message"
                                    where:Message.localID > 0];

// Update
// UPDATE message SET content = "Hello, Wechat!";
Message *updated = [[Message alloc] init];
updated.content = @"Hello, Wechat!";
result = [database updateRowsInTable:@"message"
                        onProperties:Message.content
                          withObject:updated];

// Query
// SELECT * FROM message ORDER BY localID
NSArray<Message *> *messages = [database getObjectsOfClass:Message.class
                                                 fromTable:@"message"
                                                   orderBy:Message.localID.order()];

1.2.3 Realm

Realm is an ORM database and an MVCC database with the following features:

  • Everything is an object (ORM mapping)
  • MVCC database
  • Zero-copy architecture
  • Objects and queries update automatically
  • String and Int optimization (strings are converted to enumerations, similar to Objective-C tagged pointers)
  • Crash protection (copy-on-write preserves your changes when the system crashes unexpectedly)
  • True lazy loading (data is read from disk only when it is used)
  • Built-in encryption (implemented at the engine layer)
  • Detailed documentation, also available in Chinese
  • A vibrant community; Stack Overflow can answer almost any question
  • Cross-platform, supporting iOS and Android
  • Realm Browser for Mac makes it easy to inspect data
  • Easy database version upgrades: Realm lets you configure a database version that determines when to upgrade
  • Supports KVC/KVO
  • Supports notifications for property changes (triggered by write operations)

Limitations:

  • Class names can contain at most 57 UTF-8 characters.
  • Property names can contain at most 63 UTF-8 characters.
  • NSData and NSString properties cannot hold more than 16 MB of data.
  • String sorting and case-insensitive queries only support the Basic Latin, Latin-1 Supplement, Latin Extended-A, and Latin Extended-B character sets (UTF-8 range 0 to 591).
  • Multithreaded access requires creating a new Realm instance on each thread.
  • The setters and getters of Realm objects cannot be overridden.
  • Realm has no auto-increment properties; a unique key can be generated with [[NSUUID UUID] UUIDString] instead.
  • All data models must inherit directly from RLMObject, which rules out arbitrary inheritance in the data model (for example, from JSONModel).
  • Realm does not support general collection types. The only collection is RLMArray, so array data returned by the server has to be converted manually. The supported property types are: BOOL, bool, int, NSInteger, long, long long, float, double, NSString, NSDate, NSData, and NSNumber tagged with a specific type.

Realm basic operation examples:

// Defining a model is similar to defining a normal Objective-C class
@interface Dog : RLMObject
@property NSString *name;
@property NSData *picture;
@property NSInteger age;
@end

@implementation Dog
@end

RLM_ARRAY_TYPE(Dog)

Dog *mydog = [[Dog alloc] init];
mydog.name = @"Rex";
mydog.age = 1;
mydog.picture = nil; // This property is nullable
NSLog(@"Name of dog: %@", mydog.name);

RLMRealm *realm = [RLMRealm defaultRealm];

// Insert or update by primary key. This call is only valid inside a write
// transaction and when Dog declares a primary key, which this model does not.
// [Dog createOrUpdateInRealm:realm withValue:mydog];

// Query: find all dogs younger than 2 years old named Rex
RLMResults<Dog *> *puppies = [Dog objectsWhere:@"age < 2 AND name = 'Rex'"];
puppies.count;

// Store
[realm transactionWithBlock:^{
    [realm addObject:mydog];
}];
puppies.count; // The query result is updated in real time

// Modify data
[realm transactionWithBlock:^{
    mydog.age = 1;
}];

// Delete
[realm transactionWithBlock:^{
    [realm deleteObject:mydog];
}];

// Query and update on another thread
dispatch_async(dispatch_queue_create("background", 0), ^{
    @autoreleasepool {
        Dog *theDog = [[Dog objectsWhere:@"age == 1"] firstObject];
        RLMRealm *realm = [RLMRealm defaultRealm];
        [realm beginWriteTransaction];
        theDog.age = 3;
        [realm commitWriteTransaction];
    }
});

1.3 Database access performance test

Performance test description:

See the test data below. Because the sample is small (only one kind of data) and only some write and read operations were tested, the results cannot fully reflect the overall performance of any given SDK and are for reference only.

The test data and results are shown in the figures below:

Insert 10,000 records sequentially:

Insert 10,000 records in a single transaction:

Read 10,000 records:

Insert a total of 20,000 records from multiple (2) threads:
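The gap between sequential and transactional inserts generally comes from how often SQLite has to commit. The following minimal sketch reproduces the two write patterns with Python's built-in sqlite3 module. It is not the benchmark behind the figures in this article, and the function name insert_rows and the table layout are assumptions for illustration.

```python
import os
import sqlite3
import tempfile
import time


def insert_rows(path, rows, use_transaction):
    """Insert `rows` records and return (elapsed seconds, resulting row count)."""
    # isolation_level=None puts the connection in autocommit mode:
    # without an explicit BEGIN, every INSERT is committed on its own.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS message (localID INTEGER PRIMARY KEY, content TEXT)")
    conn.execute("DELETE FROM message")
    start = time.perf_counter()
    if use_transaction:
        conn.execute("BEGIN")
    for i in range(rows):
        conn.execute("INSERT INTO message (localID, content) VALUES (?, ?)", (i, "hello"))
    if use_transaction:
        conn.execute("COMMIT")  # one commit for the whole batch
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM message").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "bench.sqlite")
    for flag in (False, True):
        seconds, _ = insert_rows(db_path, 1000, use_transaction=flag)
        print(f"use_transaction={flag}: {seconds:.3f}s")
```

On a typical disk-backed database the transactional run tends to be much faster, because the per-commit synchronization cost is paid once instead of once per row. Exact numbers depend on the device and file system.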

1.4 Comparison of database solutions

FMDB
  • Advantages: a relatively lightweight SQLite wrapper; the API is much more convenient than the native SDK; low learning cost; supports essentially all SQLite capabilities, such as transactions and FTS.
  • Disadvantages: no ORM support, so every developer has to write the SQL statements by hand; no additional performance optimization; database operations are relatively complex; data encryption and database upgrades have to be implemented by the user.
  • Maintained: yes

WCDB
  • Advantages: cross-platform; a deep SQLite wrapper with ORM support; model base classes can inherit freely; users do not have to write SQL directly, so the cost of use is low; supports essentially all SQLite capabilities; more internal performance optimization; relatively complete documentation; extensions provide error statistics, performance statistics, corruption repair, anti-injection, encryption, and more, so users have less to do themselves.
  • Disadvantages: the internal implementation is based on C++, so model classes need the .mm suffix (or the mapping has to be moved into a category), and additional macros are required to mark the mapping between model and database.
  • Maintained: yes

Realm
  • Advantages: cross-platform; ORM support; thorough documentation; MVCC implementation; zero-copy improves performance; a very friendly API; supporting visual tools are provided.
  • Disadvantages: it is not a relational database based on SQLite, so associations between tables are impossible or difficult to establish, and scenarios in a project that need them may be hard to solve. Model classes can only inherit from RLMObject, not from arbitrary classes, which makes property binding with libraries such as JSONModel inconvenient.
  • Maintained: yes

Performance data:

8. Persistence in Projects (Summary)

1. Image caching

Image caching libraries such as SDWebImage (or Kingfisher) provide two-level caching, queued downloads, asynchronous decompression, category extensions, and more. They cover common image loading and display requirements.

2. Simple key-value access

System facilities such as NSCache and NSKeyedArchiver cover basic access requirements but are not convenient to use. Third-party libraries such as PINCache and YYCache add many capabilities that cover most use cases: they apply an LRU strategy internally to improve efficiency and implement a two-level cache to speed up loading, so they can be used directly. PINCache performs worse than YYCache in some test data, but PINCache has been updated recently on GitHub, whereas YYCache has had no commits for two years and its issues go unanswered, so any problems would have to be fixed by yourself. If maintenance cost is a concern, use PINCache instead of YYCache.
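To make the LRU strategy and the two-level (memory plus disk) lookup concrete, here is a minimal sketch in Python, chosen so it runs standalone. It is not the implementation of PINCache or YYCache: the class name TwoLevelCache, the count_limit parameter, and the pickle-based file format are assumptions for illustration.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TwoLevelCache:
    """Memory LRU in front of a disk directory (illustrative sketch only)."""

    def __init__(self, directory, count_limit=2):
        self.directory = directory
        self.count_limit = count_limit  # hypothetical limit on in-memory entries
        self.memory = OrderedDict()     # least recently used entry comes first

    def _path(self, key):
        return os.path.join(self.directory, key + ".cache")  # assumes keys are safe file names

    def _remember(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)    # mark as most recently used
        while len(self.memory) > self.count_limit:
            self.memory.popitem(last=False)  # evict the least recently used entry

    def set(self, key, value):
        """Write through: keep the value in memory and persist it to disk."""
        self._remember(key, value)
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)

    def get(self, key):
        """Check memory first, then disk; promote disk hits back into memory."""
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        try:
            with open(self._path(key), "rb") as f:
                value = pickle.load(f)
        except FileNotFoundError:
            return None
        self._remember(key, value)
        return value
```

Evicting from memory never loses data, because the disk level still holds it. The memory level only decides how fast the next read is.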

3. Database

Core Data (which I have never used) has a poor reputation because of its steep learning curve and many pitfalls, so it is not recommended here. FMDB has been proven in a large number of iOS apps. Some of its extensibility is unsatisfying, but its stability has stood the test of time, and because it is built on SQLite you can migrate directly to an implementation such as WCDB without changing the table structure or data. Both WCDB and Realm support ORM, so inserts, deletes, updates, and queries need almost no hand-written SQL. Both are cross-platform and offer many convenient extensions, such as encryption and data upgrades, that make them much easier to use than FMDB. If you do want ORM, I recommend WCDB: Realm's own engine does not support queries across associated tables, whereas WCDB is based on SQLite and supports direct SQL queries, and migrating from Realm to SQLite later takes a lot of effort. In addition, the WeChat team uses WCDB itself. With hundreds of millions of users, they run into far more performance and data corruption problems than we do and have optimized accordingly, which you benefit from by using WCDB.

4. Other

  1. Encapsulation: Whichever third-party library implements your cache, wrap it in a layer of your own. When you want to switch implementations, you can then migrate the data internally without callers noticing, or with minimal changes, rather than replacing everything.
  2. Separate storage directories per user: Each user's data, including the database, is stored in a separate folder. The benefit is that users' data cannot contaminate each other (for example, mixing users in one database that has complex table relationships makes the SQL statements very complicated and more error-prone), and separating users also makes data diagnosis easier.
  3. Singleton: It is recommended to route all data operations within a given period through one object, to keep multithreaded reads and writes safe and reduce the chance of errors.
  4. Handling user switching: Because storage directories are separated per user, switching the logged-in user means switching the data access instance. Do not destroy the previous instance immediately, because it may still have unfinished read and write tasks. Wait for them to complete, or interrupt them, before destroying it.
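The four recommendations above fit together naturally. The following minimal sketch shows one way to combine them, written in Python with the built-in sqlite3 module so it runs standalone. The class names UserStore and StoreManager, the kv table, and the lock-based waiting are assumptions for illustration, not a prescribed design.

```python
import os
import sqlite3
import tempfile
import threading


class UserStore:
    """Facade over one user's database; callers never see the backing library."""

    def __init__(self, root, user_id):
        directory = os.path.join(root, user_id)  # one folder per user
        os.makedirs(directory, exist_ok=True)
        self._lock = threading.Lock()            # serializes reads and writes
        self._conn = sqlite3.connect(
            os.path.join(directory, "data.sqlite"), check_same_thread=False
        )
        with self._lock, self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
            )

    def set(self, key, value):
        with self._lock, self._conn:  # the connection context commits on success
            self._conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        with self._lock:
            row = self._conn.execute(
                "SELECT value FROM kv WHERE key = ?", (key,)
            ).fetchone()
        return row[0] if row else None

    def close(self):
        with self._lock:  # blocks until any in-flight read or write has finished
            self._conn.close()


class StoreManager:
    """Single entry point that owns the store of the currently logged-in user."""

    def __init__(self, root):
        self.root = root
        self.current = None

    def switch_user(self, user_id):
        previous, self.current = self.current, UserStore(self.root, user_id)
        if previous is not None:
            previous.close()  # waits for unfinished work before destroying the old instance
        return self.current
```

Because callers only talk to UserStore, replacing SQLite with another backend later would change this class and a one-time data migration, not the call sites.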

# References

  • Articles
  1. The iOS Architect's Path: Local persistence solutions
  2. iOS (Data Persistence 1)
  3. iOS application architecture: local persistence solutions and dynamic deployment
  4. Common caching algorithms and caching strategies
  5. Cache eviction algorithms: the LRU algorithm
  6. iOS cache framework: PINCache interpretation
  7. PINCache for iOS cache management
  8. YYCache design roadmap
  9. SQLite WAL principle
  10. WeChat iOS SQLite source code optimization practice
  11. WeChat mobile database component WCDB series (2): three approaches to database repair
  12. Database design: In-depth understanding of Realm's multithreaded processing mechanism
  13. Realm Core database engine exploration
  14. Realm Database from Getting Started to giving up
  15. Some summary using Realm
  16. Realm, WCDB and SQLite mobile database performance comparison test
  17. Realm, WCDB, and SQLite mobile database performance tests
  • Open source libraries
  1. wcdb
  2. realm
  3. PINCache
  4. YYCache