Android Master Notes - Storage optimization

Storage base

Storage partition:

Definition:

Dividing storage on a device into non-overlapping parts, each of which can be formatted separately and used for a different purpose;
Different partitions can use different file systems;

classification

/ system:
- Operating system reservation, used to store system files and frameworks;
- Host Android components provided by Google
- Mount in read-only mode, stable, secure, content will not be damaged or tampered with;
/ data:
- Storing user Data
- For example, when the system is upgraded or restored, the /system partition will be erased without affecting the /data partition. When the system is restored to factory Settings, only the /data partition data will be erased.
/ cache:
- The partition used by the system upgrade process, or Recovery
/ vendor:
- Modification of the Android operating system by the storage vendor
- Treble was launched on Android8.0;
- OTA can only update their own/Vendor partition, with a lower cost, easier and faster to update devices to the new Android system;
/ storage:
- External or internal SDCard

Secure storage

Access control

The first consideration of storage security is permission control;
Each application runs within its own application sandbox;
Prior to Android4.3, standard Linux protection mechanisms (microapplications create unique Linux uuids) were used;
Android4.3 introduces SELinux mechanism to further define the sandbox boundaries of Android applications (even if the process has root permission, it must first grant the permission in the exclusive security policy profile);

Data encryption

There are two types of device encryption in Android
- Full encryption:
  - Android4.4 introduced, 5.0 opened by default
  - /data partition data encryption and decryption, slightly affect performance, the new version of the chip will provide direct support in the hardware;
  - File system based encryption, once the device is unlocked, the encryption is gone
- File level encryption
  - Android7.0
  - Each file is assigned a key that must be derived from the user’s passcode
The device encryption method is transparent to the application. Sensitive data still needs to be encrypted and stored with RSA, AES, CHacha20, TEA, etc

Common storage methods

Storage is to convert a specific data structure into a format that can be recorded and restored (such as binary, XML, JSON, Protocol Buffer, etc.).
With flash, everything is binary

A key element

Validity:
- Is it stable and robust?
- Whether multi-threading or cross-process synchronization is supported
- Data verification and recovery (for example, using a dual-write or backup file policy)
Time cost:
- The CPU time and I/O time are included. For example, complex codec, encryption and decryption affect the CPU time.
Space overhead:
- Different encoding methods occupy different storage space for the same data. (XML > JSON > Protocol Buffer)
- Compression strategies can also be introduced to further reduce the storage space. Such as zip, LZma;
- Also need to consider the memory space usage, whether it will lead to a large number of GC, OOM, etc
Security:
- Some sensitive data, even if stored in /data/data, still needs to be encrypted
- Select different encryption methods according to the sensitivity and data volume
Development cost:
- Although some storage solutions are expensive, they have high service cost. Try to achieve seamless access to reduce the development cost
Compatibility:
- Consider forward and backward compatibility, whether old data can be migrated when upgrading, and whether new data can be degraded in the old version
- Whether different languages support conversion

Storage methods

1. SharedPreferences

Used to store small sets of key-value pairs (simple, lightweight)
Disadvantages:
- Multi-threading is safe, but not safe across processes: No cross-process locking is used, and even with context. MODE_MULTI_PROCESS set, frequent reads and writes across processes can cause total data loss (because Android prefers us to use ContentProvider for storage in cross-process scenarios);
- Slow loading: the sp file is loaded using an asynchronous thread, and the priority of the thread is not set, so the main thread will wait for the lock of the low-priority thread. (It is recommended to use an asynchronous thread to preload the SP file used in the startup process. Sp data is stored in multiple files in different categories.)
- Full write: either commit() or apply(), even if only one data change is made, the entire content is written to the file, and multiple changes are not merged into a single commit.
- Lag: Due to the apply mechanism providing asynchronous disk, crash or sudden power failure and other circumstances may lead to data loss, so when receiving system broadcast or onPause and other times, the system will force all SP objects landing to disk, data volume will block the main thread, resulting in lag and even ANR; (Apply is the memory that is written first and then stored asynchronously to an XML file; Commit is a direct synchronous write to a file.)
You can replace the system’s default implementation of SharedPreferences by overrewriting the Application’s getSharedPreferences method, such as optimizing the lag, merging multiple Apply actions, supporting cross-process actions, etc
Minor fixes to the SharedPreferences provided by the system have improved performance, but they still don’t solve the problem. Almost every major company will develop an alternative storage solution, such as wechat’s open source MMKV;

// implementation 'com. Tencent :mmkv-static:1.2.7' //2. String rootDir = mmKV.initialize (this); LjyLogUtil.d("mmkv root: " + rootDir); Private void testMMKV() {MMKV kv = mmkv.defaultmmkv (); kv.encode("bool", true); boolean bValue = kv.decodeBool("bool"); kv.encode("int", Integer.MIN_VALUE); int iValue = kv.decodeInt("int"); kv.encode("string", "Hello from mmkv"); String str = kv.decodeString("string"); } //4. If the service requires multi-process access, Mmkv.multi_process_mode MMKV MMKV = mmKV.mmKvWithID ("InterProcessKV", mmKV.multi_process_mode); mmkv.encode("bool", true); / / 5. SharedPreferences migration private void testImportSharedPreferences () {/ / SharedPreferences preferences = getSharedPreferences("myData", MODE_PRIVATE); MMKV preferences = MMKV.mmkvWithID("myData"); {SharedPreferences old_man = getSharedPreferences("myData", MODE_PRIVATE); preferences.importFromSharedPreferences(old_man); old_man.edit().clear().commit(); } // SharedPreferences.Editor Editor = Preferences.edit (); editor.putBoolean("bool", true); editor.putInt("int", Integer.MIN_VALUE); editor.putLong("long", Long.MAX_VALUE); Editor. PutFloat (" float ", to 3.14 f); editor.putString("string", "hello, imported"); HashSet<String> set = new HashSet<String>(); set.add("W"); set.add("e"); set.add("C"); set.add("h"); set.add("a"); set.add("t"); editor.putStringSet("string-set", set); // no need to call commit() //editor.com MIT (); }Copy the code

2. ContentProvider

One of the four components that provides a mechanism for sharing data between different processes and even between different applications;
In The Android system, modules such as album, calendar, audio, video, address book and so on all provide the access support of ContentProvider;
Startup performance:
- The life cycle of the ContentProvider precedes Application onCreate() by default and is created on the main thread. Our custom ContentProvider constructors, static code blocks, and onCreate functions try not to do time-consuming things that slow down startup.
Multi-process mode:
- Used in conjunction with the Multiprocess attribute in the AndroidManifest. In this way, the calling process will create a push process Provider instance directly in its own process, eliminating the need for cross-process invocation. Note that this can also lead to Provider multi-instance issues
Stability:
- Takes advantage of Android’s Binder and anonymous shared memory mechanisms.
- A file descriptor for anonymous shared memory inside the CursorWindow object is passed through Binder. In this way, the resulting data does not need to be transferred across processes, but in different processes, the anonymous shared memory file descriptor is transferred to operate on the same anonymous memory, so that different processes can access the same data.
- The mMAP-based anonymous shared memory mechanism also comes with a cost. When the amount of data transferred is very small, it may not be cost-effective. So the ContentProvider provides a call function that transfers data directly through the Binder.
- Android’s Binder transfers have a size limit, typically 1 to 2MB. The ContentProvider interface call parameters and call function calls do not use an anonymous sharing mechanism. For example, if a lot of data is inserted in bulk, an array of inserted data will appear. If the array is too large, the operation may cause a data oversized exception.
security
- While ContentProvider provides a good security mechanism for data sharing between applications, if ContentProvider is exported, you need to be aware of SQL injection issues when supporting the execution of SQL statements. In addition, if we pass in a file path and then return the contents of the file, we need to verify the validity. Otherwise, the entire application’s private data may be accessed by someone else. This error is often made when intent parameters are passed.
- Supports permission verification and data transfer between applications
Correctness: Cross-process support
Time overhead: Impact on starting and passing data across processes
Space overhead: Does not limit data content
Development cost: System support, but more complex development
Compatibility: Supports compatibility
Relatively bulky, suitable for large data transmission

Serialization of objects

1. Serializable

Java’s native serialization mechanism
Objects can be persisted through Serializable storage, and Serializable Serializable data can be passed through bundles.
How it works: With ObjectInputStream and ObjectOutputStream
Not only serializes the current object, but also recursively serializes other objects referenced by the object;
Poor performance due to high reflection and GC effects
Too much information leads to large files and I/O performance problems
The writeObject and readObject:
- Serializable serialization supports replacing the default process by first reflecting to determine whether our own serialization method, writeObject, or deserialization method readObject, exists. With these two methods, we can make some special changes to some fields, and we can implement the serialized encryption function
WriteReplace and readResolve:
- Proxy serialized objects that can be implemented to return custom serialized instances. They enable version-compatibility for object serialization, such as converting an older version of the serialized object to a newer version of the object type through the readResolve method.
Non-serialized fields:
- Class static variables and fields that are declared transient are ignored by the default serialization mechanism and are not serialized. We can also use the advanced writeReplace and readResolve methods to do our own serialization.
SerialVersionUID:
- After the class implements the Serializable interface, we need to add a Serial Version ID, which corresponds to the class Version number. We can declare this ID explicitly or we can let the compiler compute it. In general, I recommend an explicit declaration as a safer option, since implicitly declaring that if the class changes even a little, deserialization will result in an InvalidClassException due to the serialVersionUID change.
Serialization allows refactoring:
- Serialization allows for a certain number of class variants, and even after refactoring, the ObjectInputStream reads it just fine.
  - Add a new field to the class
  - Change the field from static to nonstatic
  - Change the field from TRANSIENT to non-transient

Serialization is not secure:

The serialized binary format is fully documented and fully reversible; When remote method calls are made over RMI, any private fields in the object sent over the connection almost always appear in plain text in the socket stream;
Serialization allows “hook” the serialization process and protects (or obscures) field data before serialization and after deserialization (overwrite writeObject,readObject)

  private void writeObject(java.io.ObjectOutputStream stream)
          throws java.io.IOException {
      // "Encrypt"/obscure the sensitive data
      age = age << 2;
      stream.defaultWriteObject();
  }
  
  private void readObject(java.io.ObjectInputStream stream)
          throws java.io.IOException, ClassNotFoundException {
      stream.defaultReadObject();
  
      // "Decrypt"/de-obscure the sensitive data
      age = age << 2;
  }
Copy the code

Serialized data can be signed and sealed:
- You can do cryptography and signature management using writeObject and readObject, but there are better ways.
- The simplest is put it in a javax.mail crypto. SealedObject and/or Java. Security. The SignedObject wrapper. Both are serializable, so wrapping an object in SealedObject creates a kind of “box” around the original object. You must have a symmetric key for decryption, and the key must be managed separately.
- Similarly, SignedObject can be used for data validation, and the symmetric key must also be managed separately.

Serialization allows the proxy to be placed in a stream

If serialization is the primary issue, then it’s a good idea to specify a flyweight or proxy to put in the stream. Provide the original Person with a writeReplace method that can serialize objects of different types to replace it. Similarly, if a readResolve method is found during deserialization, the method is called, providing the alternative object to the caller

// Use proxy, WriteReplace and readResolve /** * @author LiuJinYang * @date 2020/3/23 */ public class Person implements Serializable { private String name; private int age; public Person() { } public Person(String name, int age) { if (age < 0) throw new IllegalArgumentException("age can't < 0"); this.name = name; this.age = age; } public String getName() { return name; } public void setName(String name) { this.name = name; } public int getAge() { return age; } public void setAge(int age) { this.age = age; } private static class PersonProxy implements Serializable { private final String name; private final int age; public PersonProxy(Person person) { this.name = person.getName(); this.age = person.getAge(); } private Object readResolve() { Person person = new Person(name, age); return person; } } private Object writeReplace() { return new PersonProxy(this); } // This method will not execute, private void writeObject(ObjectOutputStream out) {system.out.println (" person.writeObject ()"); Private Object readObject(ObjectInputStream in) throws InvalidObjectException {private Object readObject(ObjectInputStream in) throws InvalidObjectException { System.out.println("Person.readObject()"); throw new InvalidObjectException("Proxy required"); }}Copy the code

This technique is one of the few that doesn’t require a read/write balance. For example, a version of a class that has been reconstituted into another type could provide a readResolve method to silently convert the serialized object to the new type. Similarly, it can serialize the old class to the new version using the writeReplace method.

Consider replacing the serialized instance with a serialized proxy
- Serialized proxy method can prevent pseudo-byte stream attack and internal domain theft attack;
Trust, but verify
- For serialized objects, this means validating fields to ensure that they still have the correct values after deserialization, “just in case.” To do this, you can implement the ObjectInputValidation interface and override the validateObject() method. If an error is found somewhere when the method is called, an InvalidObjectException is thrown.
Implement the Serializable interface prudently
- While the direct costs are low, the long-term costs are large
Write the readObject method protectively
- When an object is deserialized, it is important to make a protective copy of any field containing object references that the client should not have;
For instance control, the enumeration type takes precedence over readResolve
- If you rely on readResolve for instance control, all instance domains with object reference types must be declared transient (or primitive); (Otherwise, an attacker might protect references to the deserialized object until readResolve runs)

2. Externalizable

It inherits from the Serializable interface and defines writeExternal and readExternal methods to give developers more control over the serialization process. At the same time, it can implement some functions that cannot be implemented using the Serializable interface.
For example, objects that implement the Serializable interface, where static and TRANSIENT member variables are not serialized by default, The Externalizable interface allows you to manually serialize static and transient member variables.
The Externalizable interface’s serialization mechanism is based on reflection, and the performance of the Externalizable interface is poor.

3. Parcelable

Parcelable only serializes data in memory and does not store data to disk
In the trade-off between time overhead and use cost, the Parcelable mechanism chooses performance first.
- So it requires manually adding custom code to both write and read, and is much more complex to use than Serializable. But because of this, Parcelable does not need to use reflection to implement serialization and deserialization.

There are two problems

Compatibility with system versions
- There is no guarantee that all Android versions of the Parcel. CPP implementation will be exactly the same. Problems may arise if implementations differ between system versions, or if implementations are modified by vendors.
Data compatibility
- Parcelable is not designed for versioning, and if the version of our class is upgraded, the order of writing and the compatibility of field types need to be taken care of, which is also very costly to maintain.

So in general, if you want persistent storage, you will have to choose the worse performance Serializable solution.

Serializable is different from Parcelable

Serializable: Java Serializable interface. A large number of temporary variables are generated during the read and write process, and a large number of I/O operations are performed internally, resulting in low efficiency.
Parcelable: Android serialization interface, high efficiency, trouble to use, read and write in memory (AS has related plug-in one-key generation method required), objects can not be saved to disk;

4. Serial

Twitter’s open source high-performance serialization solution Serial
advantages
- There are significant advantages in serialization and deserialization time, as well as the size of the landing file;
- Because reflection is not used, it is more efficient than traditional reflection serialization schemes;
- Developers have strong control over the serialization process and can define which objects and fields need to be serialized.
- Strong debug ability, can debug serialized process;
- Strong version management ability, can be compatible with OptionalFieldException through version number;
use

Byte [] final Serial Serial = new ByteBufferSerial(); //1. final byte[] serializedData = serial.toByteArray(object, ExampleObject.SERIALIZER) //2. Deserialize an object from byte[] to object final ExampleObject Object = serial.fromByteArray(serializedData, Exampleobject.serializer) // The default serialization implementation class currently provided in the library is ByteBufferSerial, which produces byte[]. Users can also change the implementation class by themselves, rather than sticking to byte[]. //Serializers need to define a separate Serializer for each field. //Serializers need to define write and read operations for each field. Serializers need to be defined recursively //Serializers are stateless, so we can write them as an internal class of object, Public static class ExampleObject {public static final ObjectSerializer<ExampleObject> SERIALIZER = new ExampleObjectSerializer(); public final int num; public final SubObject obj; public ExampleObject(int num, @NotNull SubObject obj) { this.num = num; this.obj = obj; }... private static final class ExampleObjectSerializer extends ObjectSerializer<ExampleObject> { @Override protected void serializeObject(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull ExampleObject object) throws IOException { output .writeInt(object.num) // first field .writeObject(object.obj,  SubObject.SERIALIZER); // second field } @Override @NotNull protected ExampleObject deserializeObject(@NotNull SerializationContext context, @NotNull SerializerInput input, int versionNumber) throws IOException, ClassNotFoundException { final int num = input.readInt(); // first field final SubObject obj = input.readObject(SubObject.SERIALIZER); // second field return new ExampleObject(num, obj); }}} // This inner class is similar to Parcelable.Creator in Parcelable. public static final Parcelable.Creator<Person> CREATOR = new Creator<Person>() { @Override public Person createFromParcel(Parcel source) { Person person = new Person(); person.mName = source.readString(); person.mSex = source.readString(); person.mAge = source.readInt(); return person; } @override public Person[] newArray(int size) {return new Person[size]; }}; BuilderSerializer // You can use BuilderSerializer to serialize classes that are built in builder mode or have multiple constructors. // Inherits BuilderSerializer only, And implement the createBuilder method (just return the builder of the current class) // and the deserializeToBuilder method (in which you get the Builder object) Public static class ExampleObject {... public ExampleObject(@NotNull Builder builder) { this.num = builder.mNum; this.obj = builder.mObj; }... public static class Builder extends ModelBuilder<ExampleObject> { ... } private static final class ExampleObjectSerializer extends BuilderSerializer<ExampleObject, Builder> { @Override @NotNull protected Builder createBuilder() { return new Builder(); } @Override protected void serializeObject(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull ExampleObject object) throws IOException { output.writeInt(object.num) .writeObject(object.obj, SubObject.SERIALIZER); } @Override protected void deserializeToBuilder(@NotNull SerializationContext context, @NotNull SerializerInput input, @NotNull Builder builder, int versionNumber) throws IOException, ClassNotFoundException { builder.setNum(input.readInt()) .setObj(input.readObject(SubObject.SERIALIZER)); Final Serializer<ExampleObject> Serializer = New ExampleObjectSerializer(1); // final Serializer<ExampleObject> Serializer = New ExampleObjectSerializer(1); . @Override @NotNull protected ExampleObject deserializeObject(@NotNull SerializationContext context, @NotNull SerializerInput input, int versionNumber) throws IOException, ClassNotFoundException { final int num = input.readInt(); final SubObject obj = input.readObject(SubObject.SERIALIZER); final String name; if (versionNumber < 1) { name = DEFAULT_NAME; } else { name = input.readString(); } return new ExampleObject(num, obj, name); } //6. Serialization of simple parameters // Objects such as Integer, String, Size, Rect, etc., are themselves simple and do not require version control. Using ObjectSerializer will add 2-3 bytes of information to these objects. // ValueSerializer is the best option when version control is not required:  public static final Serializer<Boolean> BOOLEAN = new ValueSerializer<Boolean>() { @Override protected void serializeValue(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull Boolean object) throws IOException { output.writeBoolean(object); } @NotNull @Override protected Boolean deserializeValue(@NotNull SerializationContext context, @NotNull SerializerInput input) throws IOException { return input.readBoolean(); }};Copy the code

The key difference between Serial and Serializable (Externalizable) is performance. The key difference between Serial and Parcelable is the medium to which it can be serialized

Serialization of data

1. JSON

advantages

Compared to the object serialization scheme, it is faster and smaller
Compared to the binary serialization scheme, the results are readable and easy to troubleshoot
Easy to use, support cross-platform, cross-language, support for nested references

Json library

Android built-in JSON library, Google’s Gson, Alibaba’s Fastjson, Meituan’s MSON.
The libraries are optimized in two ways
- Convenience: for example, support for JSON conversion into JavaBean objects, support for annotations, support for more data types, etc.
- Performance: Reduced reflection, memory and CPU usage during serialization, especially when the data volume is large or the nesting level is deep.
Gson is the most compatible. In general, it performs as well as Fastjson, but Fastjson performs better when there is a lot of data.

2. Protocol Buffers

Performance: Using binary encoding compression, smaller size and faster codec than JSON
Compatibility: Good compatibility across languages and before and after. Automatic conversions of basic types are supported, but inheritance and reference types are not supported.
Use cost: The development cost of Protocol Buffers is high, so you need to define.proto files and use tools to generate the corresponding auxiliary classes. (Coupled and intrusive)
Google later introduced FlatBuffers with a higher compression ratio

The database

Talk about storage optimization must not open the database
For big data storage scenarios, we need to consider stability, performance, and scalability
For databases, SQLite is most commonly used on mobile, but there are also others such as startup Realm, Google’s LevelDB, and so on

ORM

Perhaps many senior development engineers do not fully understand the internal mechanism of SQLite, nor can they correctly write efficient SQL statements;
In order to improve development efficiency, most applications will introduce ORM framework;
Object Relational Mapping uses an object-oriented concept to associate tables with objects in a database. (Don’t worry about the underlying database implementation)
The most commonly used ORM frameworks are open source greenDAO and Google’s official Room; For details on its use, please refer to another article of mine: Componentized Architecture – 5. Data storage & GreenDao,Room, which will not be covered here;

Problems with the ORM framework

The ORM framework is very simple to use, but at the cost of some execution efficiency;
Let the developer’s thinking solidify, may end up even simple SQL statements can not write;

WCDB

Wechat team open source WCDB is an efficient, complete and easy to use mobile database framework, based on SQLCipher, support iOS, macOS and Android.
use

Implementation 'com.tencent. WCDB: WCDB -android:1.0.0' //2. Android {defaultConfig {NDK {// Only connect armeabi-v7a and x86 architecture abiFilters' Armeabi-v7a ', 'x86'}}} //3. //WCDB Android uses almost the same interface as the Android SDK SQLite framework. // If your APP has previously used the Android SDK database interface, // Just change android.database.* to com.0ce.wcdb.* in import, / / and android. Database. Sqlite. * to com. Tencent. WCDB. Database. *. // If SQLCipher Android Binding is used before, modify import accordingly. Sqlcipher_export () ATTACH 'old_database' AS old; //4. SELECT sqlcipher_export('main', 'old'); -- Import from 'old' to 'main' DETACH old; // The key change point is to convert the password to byte[] and pass SQLiteCipherSpec to describe the encryption mode. String passphrase = "passphrase"; // The key change point is to convert the password to byte[] and pass SQLiteCipherSpec to describe the encryption mode. SQLiteCipherSpec cipher = new SQLiteCipherSpec() // Encryption description object.setPagesize (1024) // SQLCipher default Page size is 1024 .setSQLCipherVersion(3); // 1,2,3 correspond to 1. X,2. X,3. . You can add other options SQLiteDatabase db = SQLiteDatabase openOrCreateDatabase (" path/to/database ", / / db path passphrase getBytes (), // WCDB password parameter type is byte[] cipher, // The encryption description object created above is null, // CursorFactory null // DatabaseErrorHandler // SQLiteDatabaseHook = null; //WCDB is now officially involved in Room to provide ORM and data binding capabilities and to interact with other Components of Android Jetpack. //6.1 On the basis of access Room, Gradle plus WCDB room component dependencies in the {implementation 'com. Tencent. WCDB: room: 1.0.8' / / instead of room - the runtime, At the same time also no longer need to reference WCDB - android annotationProcessor 'android. Arch. Persistence. Room: the compiler: 1.1.1' / / the compiler needs to use room} / / 6.2 In my code, when I open RoomDatabase, SQLiteCipherSpec cipherSpec = new SQLiteCipherSpec() // Specify encryption mode, Use default encryption to omit.setPagesize (4096).setkdfIteration (64000); WCDBOpenHelperFactory factory = new WCDBOpenHelperFactory().passphrase("passphrase".getbytes ()) // Specify encrypted DB key, Unencrypted DB line. CipherSpec (cipherSpec) / / specified encryption method, using the default encryption can be omitted. WriteAheadLoggingEnabled (true) / / concurrent open... and, speaking, reading and writing. AsyncCheckpointEnabled (true); // Turn on the asynchronous Checkpoint optimization, AppDatabase db = Room. DatabaseBuilder (this, appDatabase.class, "App - db") / /. AllowMainThreadQueries () / / allow the db operation on the main thread, is generally not recommended. OpenHelperFactory (factory) / / important: use WCDB open Room. The build (); Public static SQLiteDatabase openDatabase (String path, String path, String path, String path) SQLiteDatabase.CursorFactory factory, int flags, DatabaseErrorHandler errorHandler, int poolSize)Copy the code

Processes are concurrent with threads

Use SQLite often SQLiteDatabaseLockedException, in the final analysis because of concurrent causes, SQLite concurrent has two dimensions, one is a multi-process concurrent, one is multi-threaded concurrent;

Multi-process concurrency

SQLite supports multiple concurrent operations by default (controlled by file locks). The granularity of SQLite locks is not very fine, it is for the entire DB file, you can refer to it
- Introduction to the SQLite locking mechanism
- Sqlite Learning notes (5)&& Sqlite blocking mechanism
In simple terms, multiple processes can simultaneously acquire a SHARED lock to read data, but only one process can acquire an EXCLUSIVE lock to write data to the database.
In EXCLUSIVE mode, the database connection does not release the lock on the SQLite file until it is disconnected, thus avoiding unnecessary conflicts and speeding up database access.

Multi-threaded concurrency

SQLite supports multi-thread concurrent mode. You need to enable the following configuration. Of course, SQLite will enable multi-thread mode by default

PRAGMA SQLITE_THREADSAFE = 2
Copy the code

Like the multi-process locking mechanism, SQLite locks are at database file level granularity for simplicity and do not implement table or even row level locks.
If only one thread is operating with the same handle at the same time, we need to open the Connection Pool.
Similar to multiple processes, multiple threads can read from the database simultaneously, but write to the database is still mutually exclusive
SQLite provides a Busy Retry scenario, in which a Busy Handler is fired when a block occurs, allowing the thread to sleep for a period of time before trying the operation again
To further improve concurrency performance, we can also enable WAL (write-ahead Logging) mode. WAL mode writes modified data to a single WAL file and introduces WAL log file locks. With WAL mode, reads and writes can be executed completely concurrently without blocking each other.

PRAGMA schema.journal_mode = WAL
Copy the code

Note, however, that writes are still not concurrent. If there are multiple concurrent written, there are SQLiteDatabaseLockedException may occur. At this point we can ask the application to catch the exception and wait a while before trying again.

. } catch (SQLiteDatabaseLockedException e) { if (sqliteLockedExceptionTimes < (tryTimes - 1)) { try { Thread.sleep(100); {}}} catch InterruptedException (e1) sqliteLockedExceptionTimes++; } - In general, with connection pooling and WAL mode, we can greatly increase the read/write concurrency of SQLite and greatly reduce the wait time due to concurrency. It is recommended that you try this in your application.Copy the code

To optimize the

The index optimization

Proper use of indexes can significantly slow down queries in most scenarios
Many times we think we’ve created an index, but it doesn’t actually work. For example, operators such as BETWEEN, LIKE, OR are used, expressions are used, and case when is used

SELECT * FROM mytable WHERE myfield BETWEEN 10 and 20; SELECT * FROM mytable WHERE myfield >= 10 AND myfield <= 20;Copy the code

Creating an index comes with a cost. The index table needs to be kept up to date
- Create the right index: Make sure it works and is efficient
- Single-column index, multi-column index and composite index selection;
- Select index fields: Integer types are much more efficient than strings. Primary keys are indexed by default, so do not use complex fields as main keys.

Page size and cache size

A database is like a small file system, with pages and caching concepts inside;
- Page: minimum storage unit
- The indexes of different pages belonging to the same table are organized as B-trees. Each table is a B-tree
- Each page can only store the data of one table or a group of indexes, that is, it is impossible to store the data of multiple tables or indexes on the same page
SQLite caches pages that have been read to speed up the next read. The default page size is 1024 bytes, and the default cache size is 1000 pages.
- Increasing page size does not consistently improve performance, and there may be side effects beyond the inflection point
- It is recommended to select 4KB as the default page size in advance when creating a database for better performance.

Other optimization

Use “select*” with caution and select as many columns as you need
Use transactions correctly
Precompile with parameter binding and cache the compiled SQL statement
For blobs or oversized Text columns, it is possible to exceed the size of a page, resulting in oversized pages. It is recommended that these columns be separated from the table or placed after the table fields.
Periodically sort or clear useless or erasable data (if the user accesses this part of data, it can be pulled from the network again)

Damage and recovery

Wechat SQLite database repair practice
Wechat mobile terminal database component WCDB series (2) – database repair three batahawk
Android Database repair

Encryption and Security

Database security has two main aspects, one is anti injection, one is encryption
- Static security scan can be used to prevent injection
- Generally, SQLCipher is used to support encryption. SQLite encryption and decryption are in the unit of page. By default, AES algorithm is used for encryption.
- About the use of WCDB encryption and decryption, you can refer to: wechat mobile database component WCDB (4) – Android features

Full-text search

Wechat full-text search optimization road
Mobile client polyphonic word search

monitoring

Every time you write an SQL statement, you should first test it locally
EXPLAIN QUERY PLAN allows you to test the QUERY PLAN of an SQL statement, whether it is a full table scan or an index is used, and which index is used.

sqlite> EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a=1 AND b>2;
QUERY PLAN
|--SEARCH TABLE t1 USING INDEX i2 (a=? AND b>?)
Copy the code

WCDB added SQLiteTrace monitoring module
Wechat open source Matrix has an intelligent analysis of SQLite statements in the tool: Matrix SQLiteLint – SQLite use quality detection. According to the analysis of SQL syntax tree, combined with our daily database use experience, it abstracts the improper use of index, select* and other six problems.
Meituan has also opened its own internal SQL optimization tool, SQLAdvisor
- Application of SQL parsing in Meituan
- Meituan Dianping SQL optimization tool SQLAdvisor is open source

Matrix SQLiteLint

Access Matrix SQLiteLint to check whether there is any unreasonable use of SQLite.

/ / 1. Add the dependent debugImplementation "com. Tencent. Matrix, the matrix - sqlite - lint -- android SDK: ${MATRIX_VERSION}" releaseImplementation "com.tencent.matrix:matrix-sqlite-lint-android-sdk-no-op:${MATRIX_VERSION}" //2. Private void prepareSQLiteLint() {SQLiteLintPlugin = (SQLiteLintPlugin) Matrix.with().getPluginByClass(SQLiteLintPlugin.class); if (plugin == null) { return; } plugin.addConcernedDB(new SQLiteLintConfig.ConcernDb(getWritableDatabase()) .setWhiteListXml(R.xml.sqlite_lint_whitelist) .enableAllCheckers()); }Copy the code

reference

Storage Optimization (part 1) : What are the common data storage methods?
Application sandbox
Security enhanced Linux in Android
Equipment encryption
The whole encryption
File level encryption
How to optimize data storage?
Replaces the implementation of system SharedPreferences
MMKV
Java object serialization
EffectiveJava – 10 – serialization
Serial
05 | Twitter high-performance Serial serialization framework (1) basic usage and concepts
This section describes how to use the Google Protocol Buffer
FlatBuffers experience
Database SQLite is the best way to optimize your database
Wechat WCDB evolution road – open source and start
WCDB
Componentized architecture – 5. Data storage & GreenDao,Room
Introduction to the SQLite locking mechanism
Sqlite Learning notes (5)&& Sqlite blocking mechanism
Wechat iOS SQLite source code optimization practices
How SQLite indexes work
The data structure and algorithm behind MySQL index
SQlite source code analysis
Fully parse sqLite.pdf
Matrix SQLiteLint – SQLite uses quality checks
Application of SQL parsing in Meituan
Meituan Dianping SQL optimization tool SQLAdvisor is open source
The Definitive Guide to SQLite (2nd edition)

I am Jinyang, if you want to advance and learn more about dry goods, welcome to follow the wechat public account “jinyang said” to receive my latest articles

Android Master Notes – Storage optimization

Storage base

Storage partition:

Definition:

classification

Secure storage

Access control

Data encryption

Common storage methods

A key element

Storage methods

1. SharedPreferences

2. ContentProvider

Serialization of objects

1. Serializable

2. Externalizable

3. Parcelable

There are two problems

Serializable is different from Parcelable

4. Serial

Serialization of data

1. JSON

advantages

Json library

2. Protocol Buffers

The database

ORM

Problems with the ORM framework

WCDB

Processes are concurrent with threads

Multi-process concurrency

Multi-threaded concurrency

To optimize the

The index optimization

Page size and cache size

Other optimization

Damage and recovery

Encryption and Security

Full-text search

monitoring

Matrix SQLiteLint

reference

I am Jinyang, if you want to advance and learn more about dry goods, welcome to follow the wechat public account “jinyang said” to receive my latest articles

Related Posts

Android_Dagger2 — From the most accessible point of view + latest dagger. Android

Imitate zhihu personal home page to achieve nested sliding and fading effect

Look at class initialization directly at the underlying bytecode