An overview of the problems

During iteration we ran into a number of persistence-related problems: an explosion of data model logic, inconsistent use of the database API, and a strong business-layer dependence on the database framework. The first two can be mitigated well through continuous abstraction, but the strong dependence on the database framework caused serious trouble when we switched frameworks.

To solve these persistence problems, I added a DAO layer on top of the original architecture, decoupled the dependence on the database framework, and improved the performance of database queries and writes.

The project design

Persistence solution 1.0

Data Observer: a Realm-based observer of data set change notifications, which provides refresh callbacks to UI data sources when the data set changes, or triggers other data operations.

Data Handler: used to preprocess data before it is stored, such as pre-parsing JSON strings, or to trigger other data operations.

Early on we chose Realm as the database foundation. Its advantages are an easy-to-use ORM and cross-platform support, which makes it convenient to unify logic. At the start of the project, the data model carried no historical baggage.

The database CRUD API was called directly by the business layer. The advantage is that this saves work during rapid iteration and lets the business side make full use of the database framework's features. The disadvantage is that as the project grows, the scattered API usage cannot be managed in a unified way, resulting in a strong dependence on the database framework.

As single-table data volumes and I/O load continued to grow, Realm increasingly became the bottleneck, and no workaround could be found, so we decided to introduce the stable SQLite to replace Realm for I/O-intensive storage.

Realm supports transactions, data versioning, notification of changes to data sets, semi-automatic migration of table structures, and other convenient features.

Advantages:

  1. Data versioning: when the same table's data is used on multiple threads, the Realm kernel does the synchronization for you. It is similar to the logic of Git repositories, where each thread is a branch and each branch makes different commits; Realm merges all branches into master to ensure data consistency.

  2. Realm allows you to listen for changes to a table's data set. For example, you can create an Observer object for the Person table; when new data is added to the table, the observer receives the changed data set. Or you can listen to a single Person object, and when the database changes one of its fields, the observer is told which field changed, as sketched below.
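For reference, here is a minimal sketch of such a collection notification using the Realm Swift API; the Person model and the logging inside the callback are illustrative assumptions, not code from our project.

```swift
import RealmSwift

// Illustrative model; the fields are assumptions for this sketch.
class Person: Object {
    @Persisted var name: String = ""
}

final class PersonObserver {
    private var token: NotificationToken?

    func start() throws {
        let realm = try Realm()
        let persons = realm.objects(Person.self)

        // Realm first delivers the initial result set, then fine-grained change sets.
        token = persons.observe { changes in
            switch changes {
            case .initial(let results):
                print("loaded \(results.count) persons")
            case .update(_, let deletions, let insertions, let modifications):
                print("deleted \(deletions), inserted \(insertions), modified \(modifications)")
            case .error(let error):
                print("notification failed: \(error)")
            }
        }
    }

    func stop() {
        token?.invalidate()
    }
}
```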

Contrary to expectations, Realm is, after all, a relatively new database framework, and we ran into quite a few problems while using it:

  1. First, Realm has no indexes, so it cannot efficiently retrieve large amounts of data; query performance deteriorates significantly once a single table exceeds 500 MB.

  2. Realm’s data models cannot be used across threads; when switching threads, the data must be copied, which has a performance cost.

  3. When multiple threads write frequently, assertion failures occur, such as: Assertion failed: header.m_top_ref[1] == 0 (header.m_top_ref[1], get_file_path_for_assertions()).

  4. Out-of-memory mmap errors occur, such as: Error Domain=io.realm Code=9 “mmap() failed: Cannot allocate memory size: 1455652864 offset: …”.

  5. Realm’s notifications of changes to its data collections can be lost, and the timing of Realm’s multi-threaded data synchronization is beyond our control.

To sum up, I don’t think Realm is a good fit for I/O-heavy projects. First, working around Realm’s pitfalls takes a lot of time. Second, there is little documentation for Realm, and many problems are only recorded in GitHub issues.

Persistence solution 2.0

Abstract Database Protocol

The abstract database protocol needs to contain the basic CRUD API and implement basic database capabilities such as Where, Order By, Limit, and Transaction.

Our project uses two database frameworks, SQLite and Realm. An abstract database protocol removes the strong dependence on either framework: the business layer only needs to program against the abstract database, so whether the data layer changes its database framework or runs multiple database versions, the switch is seamless.
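A minimal Swift sketch of what such an abstraction might look like; the names DatabaseProtocol and Query, and the exact method signatures, are assumptions for illustration rather than our actual interface.

```swift
import Foundation

// Hypothetical query descriptor covering Where, Order By and Limit,
// expressed independently of any concrete database framework.
struct Query {
    var predicate: NSPredicate?                    // Where
    var sortDescriptors: [NSSortDescriptor] = []   // Order By
    var limit: Int?                                // Limit
}

// Hypothetical abstract database protocol. The business layer depends only
// on this interface; Realm-backed and SQLite-backed implementations can be
// swapped behind it without touching business code.
protocol DatabaseProtocol {
    associatedtype Model

    func insert(_ models: [Model]) throws
    func update(_ models: [Model]) throws
    func delete(matching query: Query) throws
    func fetch(matching query: Query) throws -> [Model]

    // Groups several operations into a single transaction.
    func transaction(_ block: () throws -> Void) throws
}
```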

Data persistence manager

The storage manager is designed to reduce the number of queries to the database and to shield the business layer from the logic of database writes (if the business layer used the database API directly, it would have to handle the database API's various exceptions alongside the business logic).

At its simplest, the data persistence manager is a managed cache pool that provides the same CRUD API as the database. Internally it writes newly cached data to the database and pulls required data from the database into the cache, among other capabilities.

In addition to basic CRUD, data preprocessing plug-ins and data change observer capabilities are implemented.

Data Handler: when the business layer writes a new data model to the persistence manager, the associated pre-processing plug-ins take effect. In a plug-in you can do associated business processing, such as synchronously triggering an update of the session name when the user name is updated.
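A hypothetical sketch of this plug-in hook; the User and SessionStore types and the willStore name are invented for illustration.

```swift
// Illustrative models; the fields are assumptions for this sketch.
struct User { let id: String; var name: String }

final class SessionStore {
    // Hypothetical collaborator that owns the cached session models.
    func renameSessions(ofUserID id: String, to name: String) {
        // update the cached sessions that display this user and mark them for persistence
    }
}

// Hypothetical pre-processing plug-in hook, invoked by the persistence
// manager before a model is cached and written to the database.
protocol UserDataHandler {
    func willStore(_ user: User)
}

// Example plug-in: a user-name update synchronously triggers the update
// of the related session names, as described above.
struct UserNameSyncHandler: UserDataHandler {
    let sessionStore: SessionStore

    func willStore(_ user: User) {
        sessionStore.renameSessions(ofUserID: user.id, to: user.name)
    }
}
```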

Data Observer: notified when the persistence manager writes cached data to the database. The observer can tell whether the current data source has been updated, added to, or deleted from. The observer also supports filtering the data source with a Where rule. For example, in the chat interface the ViewModel only cares about the message data of the current session; by filtering with a Where rule, the ViewModel receives a notification callback when the current session's data changes and immediately refreshes the message bubble UI.
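A hypothetical sketch of observer registration with a Where-style filter; the MessageObserverCenter API and the Message model are assumptions, not the project's real interface.

```swift
import Foundation

struct Message { let sessionID: String; let text: String }

enum Change {
    case inserted([Message])
    case updated([Message])
    case deleted([Message])
}

// Hypothetical observer registry: each observer supplies a Where-style filter
// so it is only notified about the rows it cares about.
final class MessageObserverCenter {
    typealias Handler = (Change) -> Void
    private var observers: [(filter: (Message) -> Bool, handler: Handler)] = []

    func observe(where filter: @escaping (Message) -> Bool, onChange: @escaping Handler) {
        observers.append((filter: filter, handler: onChange))
    }

    // Called by the persistence manager after cached data has been written to the database.
    func notifyInserted(_ messages: [Message]) {
        for observer in observers {
            let matched = messages.filter(observer.filter)
            if !matched.isEmpty { observer.handler(.inserted(matched)) }
        }
    }
}

// Usage: the chat ViewModel only cares about the current session.
let currentSessionID = "session-42"
let center = MessageObserverCenter()
center.observe(where: { $0.sessionID == currentSessionID }) { change in
    // refresh the message bubble UI for the current session
}
```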

The cache module

Storage cache: supports the data persistence manager and implements the same CRUD API.

According to different scenarios, cache modules are classified into ordered and unordered types.

Unordered type: used by data sources that have no requirements on the order of the data.

Ordered type: used by data sources that require data to be returned in a particular order. You can specify sort rules so that the data source is reordered whenever changes occur. For example, the session list is in reverse order of creation time, and the group list is in ascending order of creation time.
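A hypothetical sketch of an ordered cache with an injected sort rule; the OrderedCache type and Session model are invented for illustration.

```swift
import Foundation

struct Session { let id: String; let createdAt: Date }

// Hypothetical ordered cache: keeps its backing array sorted by an injected
// rule and re-sorts whenever the data set changes.
final class OrderedCache<Model> {
    private let areInIncreasingOrder: (Model, Model) -> Bool
    private(set) var items: [Model] = []

    init(sortedBy areInIncreasingOrder: @escaping (Model, Model) -> Bool) {
        self.areInIncreasingOrder = areInIncreasingOrder
    }

    func insert(_ newItems: [Model]) {
        items.append(contentsOf: newItems)
        items.sort(by: areInIncreasingOrder)   // reorder the data source on every change
    }
}

// Session list in reverse order of creation time; a group list would use < instead.
let sessionCache = OrderedCache<Session>(sortedBy: { $0.createdAt > $1.createdAt })
```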

Cache cleaning policies should be implemented to avoid unreasonable memory usage. For example, when iOS raises a memory warning, the cache is released.
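On iOS the cache release can be hooked to the system memory warning notification; the sketch below assumes a purge closure supplied by the cache module.

```swift
import UIKit

// Minimal sketch: release the in-memory cache when iOS raises a memory warning;
// the data can always be reloaded from the database later.
final class CacheMemoryWarningObserver {
    private var token: NSObjectProtocol?

    init(onMemoryWarning purge: @escaping () -> Void) {
        token = NotificationCenter.default.addObserver(
            forName: UIApplication.didReceiveMemoryWarningNotification,
            object: nil,
            queue: .main
        ) { _ in
            purge()   // e.g. clear the ordered/unordered cache pools
        }
    }

    deinit {
        if let token = token { NotificationCenter.default.removeObserver(token) }
    }
}
```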

The data flow

The figure above illustrates the entire process of acquiring data from the network, storing it in the database, and refreshing the UI.

  1. The data requester receives the IM message, transforms it into a data model, and stores the data to the persistence manager.
  2. The persistence manager stores the data in the persistence cache and triggers data-preprocessing plug-ins (such as recent chat updates to the session model).
  3. The persistence manager invokes the abstract database to store the IM message data.
  4. When the data store is complete, the data change notification center notifies observers of the changes (for example, playing an alert that a new message has been received).
  5. The UI data source receives the data notification, transforms the data model into the UI model, and finally refreshes the UI.

PS: When the data store reaches a high I/O state, the notification-driven refresh logic causes the UI to refresh very frequently; pay attention to throttling UI refreshes.
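One possible throttling approach is a simple coalescing timer on the main queue; this sketch is an assumption about how it might look, not the project's actual implementation.

```swift
import Foundation

// Minimal coalescing throttle, assuming refresh requests arrive on the main queue:
// many data change notifications within one interval collapse into a single UI refresh.
final class RefreshThrottler {
    private let interval: TimeInterval
    private let refresh: () -> Void
    private var scheduled = false

    init(interval: TimeInterval = 0.3, refresh: @escaping () -> Void) {
        self.interval = interval
        self.refresh = refresh
    }

    func requestRefresh() {
        guard !scheduled else { return }      // a refresh is already pending
        scheduled = true
        DispatchQueue.main.asyncAfter(deadline: .now() + interval) { [weak self] in
            guard let self = self else { return }
            self.scheduled = false
            self.refresh()                    // e.g. reload the message list once
        }
    }
}
```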