Storage base

Storage partition:

Definition:
  • Dividing storage on a device into non-overlapping parts, each of which can be formatted separately and used for a different purpose;
  • Different partitions can use different file systems;
classification
  • / system:
    • Operating system reservation, used to store system files and frameworks;
    • Host Android components provided by Google
    • Mount in read-only mode, stable, secure, content will not be damaged or tampered with;
  • / data:
    • Storing user Data
    • For example, when the system is upgraded or restored, the /system partition will be erased without affecting the /data partition. When the system is restored to factory Settings, only the /data partition data will be erased.
  • / cache:
    • The partition used by the system upgrade process, or Recovery
  • / vendor:
    • Modification of the Android operating system by the storage vendor
    • Treble was launched on Android8.0;
    • OTA can only update their own/Vendor partition, with a lower cost, easier and faster to update devices to the new Android system;
  • / storage:
    • External or internal SDCard

Secure storage

Access control
  • The first consideration of storage security is permission control;
  • Each application runs within its own application sandbox;
  • Prior to Android4.3, standard Linux protection mechanisms (microapplications create unique Linux uuids) were used;
  • Android4.3 introduces SELinux mechanism to further define the sandbox boundaries of Android applications (even if the process has root permission, it must first grant the permission in the exclusive security policy profile);
Data encryption
  • There are two types of device encryption in Android
    • Full encryption:
      • Android4.4 introduced, 5.0 opened by default
      • /data partition data encryption and decryption, slightly affect performance, the new version of the chip will provide direct support in the hardware;
      • File system based encryption, once the device is unlocked, the encryption is gone
    • File level encryption
      • Android7.0
      • Each file is assigned a key that must be derived from the user’s passcode
  • The device encryption method is transparent to the application. Sensitive data still needs to be encrypted and stored with RSA, AES, CHacha20, TEA, etc

Common storage methods

  • Storage is to convert a specific data structure into a format that can be recorded and restored (such as binary, XML, JSON, Protocol Buffer, etc.).
  • With flash, everything is binary

A key element

  • Validity:
    • Is it stable and robust?
    • Whether multi-threading or cross-process synchronization is supported
    • Data verification and recovery (for example, using a dual-write or backup file policy)
  • Time cost:
    • The CPU time and I/O time are included. For example, complex codec, encryption and decryption affect the CPU time.
  • Space overhead:
    • Different encoding methods occupy different storage space for the same data. (XML > JSON > Protocol Buffer)
    • Compression strategies can also be introduced to further reduce the storage space. Such as zip, LZma;
    • Also need to consider the memory space usage, whether it will lead to a large number of GC, OOM, etc
  • Security:
    • Some sensitive data, even if stored in /data/data, still needs to be encrypted
    • Select different encryption methods according to the sensitivity and data volume
  • Development cost:
    • Although some storage solutions are expensive, they have high service cost. Try to achieve seamless access to reduce the development cost
  • Compatibility:
    • Consider forward and backward compatibility, whether old data can be migrated when upgrading, and whether new data can be degraded in the old version
    • Whether different languages support conversion

Storage methods

1. SharedPreferences
  • Used to store small sets of key-value pairs (simple, lightweight)
  • Disadvantages:
    • Multi-threading is safe, but not safe across processes: No cross-process locking is used, and even with context. MODE_MULTI_PROCESS set, frequent reads and writes across processes can cause total data loss (because Android prefers us to use ContentProvider for storage in cross-process scenarios);
    • Slow loading: the sp file is loaded using an asynchronous thread, and the priority of the thread is not set, so the main thread will wait for the lock of the low-priority thread. (It is recommended to use an asynchronous thread to preload the SP file used in the startup process. Sp data is stored in multiple files in different categories.)
    • Full write: either commit() or apply(), even if only one data change is made, the entire content is written to the file, and multiple changes are not merged into a single commit.
    • Lag: Due to the apply mechanism providing asynchronous disk, crash or sudden power failure and other circumstances may lead to data loss, so when receiving system broadcast or onPause and other times, the system will force all SP objects landing to disk, data volume will block the main thread, resulting in lag and even ANR; (Apply is the memory that is written first and then stored asynchronously to an XML file; Commit is a direct synchronous write to a file.)
  • You can replace the system’s default implementation of SharedPreferences by overrewriting the Application’s getSharedPreferences method, such as optimizing the lag, merging multiple Apply actions, supporting cross-process actions, etc
  • Minor fixes to the SharedPreferences provided by the system have improved performance, but they still don’t solve the problem. Almost every major company will develop an alternative storage solution, such as wechat’s open source MMKV;
// implementation 'com. Tencent :mmkv-static:1.2.7' //2. String rootDir = mmKV.initialize (this); LjyLogUtil.d("mmkv root: " + rootDir); Private void testMMKV() {MMKV kv = mmkv.defaultmmkv (); kv.encode("bool", true); boolean bValue = kv.decodeBool("bool"); kv.encode("int", Integer.MIN_VALUE); int iValue = kv.decodeInt("int"); kv.encode("string", "Hello from mmkv"); String str = kv.decodeString("string"); } //4. If the service requires multi-process access, Mmkv.multi_process_mode MMKV MMKV = mmKV.mmKvWithID ("InterProcessKV", mmKV.multi_process_mode); mmkv.encode("bool", true); / / 5. SharedPreferences migration private void testImportSharedPreferences () {/ / SharedPreferences preferences = getSharedPreferences("myData", MODE_PRIVATE); MMKV preferences = MMKV.mmkvWithID("myData"); {SharedPreferences old_man = getSharedPreferences("myData", MODE_PRIVATE); preferences.importFromSharedPreferences(old_man); old_man.edit().clear().commit(); } // SharedPreferences.Editor Editor = Preferences.edit (); editor.putBoolean("bool", true); editor.putInt("int", Integer.MIN_VALUE); editor.putLong("long", Long.MAX_VALUE); Editor. PutFloat (" float ", to 3.14 f); editor.putString("string", "hello, imported"); HashSet<String> set = new HashSet<String>(); set.add("W"); set.add("e"); set.add("C"); set.add("h"); set.add("a"); set.add("t"); editor.putStringSet("string-set", set); // no need to call commit() //editor.com MIT (); }Copy the code
2. ContentProvider
  • One of the four components that provides a mechanism for sharing data between different processes and even between different applications;
  • In The Android system, modules such as album, calendar, audio, video, address book and so on all provide the access support of ContentProvider;
  • Startup performance:
    • The life cycle of the ContentProvider precedes Application onCreate() by default and is created on the main thread. Our custom ContentProvider constructors, static code blocks, and onCreate functions try not to do time-consuming things that slow down startup.
  • Multi-process mode:
    • Used in conjunction with the Multiprocess attribute in the AndroidManifest. In this way, the calling process will create a push process Provider instance directly in its own process, eliminating the need for cross-process invocation. Note that this can also lead to Provider multi-instance issues
  • Stability:
    • Takes advantage of Android’s Binder and anonymous shared memory mechanisms.
    • A file descriptor for anonymous shared memory inside the CursorWindow object is passed through Binder. In this way, the resulting data does not need to be transferred across processes, but in different processes, the anonymous shared memory file descriptor is transferred to operate on the same anonymous memory, so that different processes can access the same data.
    • The mMAP-based anonymous shared memory mechanism also comes with a cost. When the amount of data transferred is very small, it may not be cost-effective. So the ContentProvider provides a call function that transfers data directly through the Binder.
    • Android’s Binder transfers have a size limit, typically 1 to 2MB. The ContentProvider interface call parameters and call function calls do not use an anonymous sharing mechanism. For example, if a lot of data is inserted in bulk, an array of inserted data will appear. If the array is too large, the operation may cause a data oversized exception.
  • security
    • While ContentProvider provides a good security mechanism for data sharing between applications, if ContentProvider is exported, you need to be aware of SQL injection issues when supporting the execution of SQL statements. In addition, if we pass in a file path and then return the contents of the file, we need to verify the validity. Otherwise, the entire application’s private data may be accessed by someone else. This error is often made when intent parameters are passed.
    • Supports permission verification and data transfer between applications
  • Correctness: Cross-process support
  • Time overhead: Impact on starting and passing data across processes
  • Space overhead: Does not limit data content
  • Development cost: System support, but more complex development
  • Compatibility: Supports compatibility
  • Relatively bulky, suitable for large data transmission

Serialization of objects

1. Serializable

  • Java’s native serialization mechanism
  • Objects can be persisted through Serializable storage, and Serializable Serializable data can be passed through bundles.
  • How it works: With ObjectInputStream and ObjectOutputStream
  • Not only serializes the current object, but also recursively serializes other objects referenced by the object;
  • Poor performance due to high reflection and GC effects
  • Too much information leads to large files and I/O performance problems
  • The writeObject and readObject:
    • Serializable serialization supports replacing the default process by first reflecting to determine whether our own serialization method, writeObject, or deserialization method readObject, exists. With these two methods, we can make some special changes to some fields, and we can implement the serialized encryption function
  • WriteReplace and readResolve:
    • Proxy serialized objects that can be implemented to return custom serialized instances. They enable version-compatibility for object serialization, such as converting an older version of the serialized object to a newer version of the object type through the readResolve method.
  • Non-serialized fields:
    • Class static variables and fields that are declared transient are ignored by the default serialization mechanism and are not serialized. We can also use the advanced writeReplace and readResolve methods to do our own serialization.
  • SerialVersionUID:
    • After the class implements the Serializable interface, we need to add a Serial Version ID, which corresponds to the class Version number. We can declare this ID explicitly or we can let the compiler compute it. In general, I recommend an explicit declaration as a safer option, since implicitly declaring that if the class changes even a little, deserialization will result in an InvalidClassException due to the serialVersionUID change.
  • Serialization allows refactoring:
    • Serialization allows for a certain number of class variants, and even after refactoring, the ObjectInputStream reads it just fine.
      • Add a new field to the class
      • Change the field from static to nonstatic
      • Change the field from TRANSIENT to non-transient
  • Serialization is not secure:
    • The serialized binary format is fully documented and fully reversible; When remote method calls are made over RMI, any private fields in the object sent over the connection almost always appear in plain text in the socket stream;
    • Serialization allows “hook” the serialization process and protects (or obscures) field data before serialization and after deserialization (overwrite writeObject,readObject)
      private void writeObject(java.io.ObjectOutputStream stream)
              throws java.io.IOException {
          // "Encrypt"/obscure the sensitive data
          age = age << 2;
          stream.defaultWriteObject();
      }
      
      private void readObject(java.io.ObjectInputStream stream)
              throws java.io.IOException, ClassNotFoundException {
          stream.defaultReadObject();
      
          // "Decrypt"/de-obscure the sensitive data
          age = age << 2;
      }
    Copy the code
  • Serialized data can be signed and sealed:
    • You can do cryptography and signature management using writeObject and readObject, but there are better ways.
    • The simplest is put it in a javax.mail crypto. SealedObject and/or Java. Security. The SignedObject wrapper. Both are serializable, so wrapping an object in SealedObject creates a kind of “box” around the original object. You must have a symmetric key for decryption, and the key must be managed separately.
    • Similarly, SignedObject can be used for data validation, and the symmetric key must also be managed separately.
  • Serialization allows the proxy to be placed in a stream
    • If serialization is the primary issue, then it’s a good idea to specify a flyweight or proxy to put in the stream. Provide the original Person with a writeReplace method that can serialize objects of different types to replace it. Similarly, if a readResolve method is found during deserialization, the method is called, providing the alternative object to the caller
    // Use proxy, WriteReplace and readResolve /** * @author LiuJinYang * @date 2020/3/23 */ public class Person implements Serializable { private String name; private int age; public Person() { } public Person(String name, int age) { if (age < 0) throw new IllegalArgumentException("age can't < 0"); this.name = name; this.age = age; } public String getName() { return name; } public void setName(String name) { this.name = name; } public int getAge() { return age; } public void setAge(int age) { this.age = age; } private static class PersonProxy implements Serializable { private final String name; private final int age; public PersonProxy(Person person) { this.name = person.getName(); this.age = person.getAge(); } private Object readResolve() { Person person = new Person(name, age); return person; } } private Object writeReplace() { return new PersonProxy(this); } // This method will not execute, private void writeObject(ObjectOutputStream out) {system.out.println (" person.writeObject ()"); Private Object readObject(ObjectInputStream in) throws InvalidObjectException {private Object readObject(ObjectInputStream in) throws InvalidObjectException { System.out.println("Person.readObject()"); throw new InvalidObjectException("Proxy required"); }}Copy the code
    • This technique is one of the few that doesn’t require a read/write balance. For example, a version of a class that has been reconstituted into another type could provide a readResolve method to silently convert the serialized object to the new type. Similarly, it can serialize the old class to the new version using the writeReplace method.
  • Consider replacing the serialized instance with a serialized proxy
    • Serialized proxy method can prevent pseudo-byte stream attack and internal domain theft attack;
  • Trust, but verify
    • For serialized objects, this means validating fields to ensure that they still have the correct values after deserialization, “just in case.” To do this, you can implement the ObjectInputValidation interface and override the validateObject() method. If an error is found somewhere when the method is called, an InvalidObjectException is thrown.
  • Implement the Serializable interface prudently
    • While the direct costs are low, the long-term costs are large
  • Write the readObject method protectively
    • When an object is deserialized, it is important to make a protective copy of any field containing object references that the client should not have;
  • For instance control, the enumeration type takes precedence over readResolve
    • If you rely on readResolve for instance control, all instance domains with object reference types must be declared transient (or primitive); (Otherwise, an attacker might protect references to the deserialized object until readResolve runs)

2. Externalizable

  • It inherits from the Serializable interface and defines writeExternal and readExternal methods to give developers more control over the serialization process. At the same time, it can implement some functions that cannot be implemented using the Serializable interface.
  • For example, objects that implement the Serializable interface, where static and TRANSIENT member variables are not serialized by default, The Externalizable interface allows you to manually serialize static and transient member variables.
  • The Externalizable interface’s serialization mechanism is based on reflection, and the performance of the Externalizable interface is poor.

3. Parcelable

  • Parcelable only serializes data in memory and does not store data to disk
  • In the trade-off between time overhead and use cost, the Parcelable mechanism chooses performance first.
    • So it requires manually adding custom code to both write and read, and is much more complex to use than Serializable. But because of this, Parcelable does not need to use reflection to implement serialization and deserialization.
There are two problems
  1. Compatibility with system versions
    • There is no guarantee that all Android versions of the Parcel. CPP implementation will be exactly the same. Problems may arise if implementations differ between system versions, or if implementations are modified by vendors.
  2. Data compatibility
    • Parcelable is not designed for versioning, and if the version of our class is upgraded, the order of writing and the compatibility of field types need to be taken care of, which is also very costly to maintain.
  • So in general, if you want persistent storage, you will have to choose the worse performance Serializable solution.
Serializable is different from Parcelable
  • Serializable: Java Serializable interface. A large number of temporary variables are generated during the read and write process, and a large number of I/O operations are performed internally, resulting in low efficiency.
  • Parcelable: Android serialization interface, high efficiency, trouble to use, read and write in memory (AS has related plug-in one-key generation method required), objects can not be saved to disk;

4. Serial

  • Twitter’s open source high-performance serialization solution Serial
  • advantages
    • There are significant advantages in serialization and deserialization time, as well as the size of the landing file;
    • Because reflection is not used, it is more efficient than traditional reflection serialization schemes;
    • Developers have strong control over the serialization process and can define which objects and fields need to be serialized.
    • Strong debug ability, can debug serialized process;
    • Strong version management ability, can be compatible with OptionalFieldException through version number;
  • use
Byte [] final Serial Serial = new ByteBufferSerial(); //1. final byte[] serializedData = serial.toByteArray(object, ExampleObject.SERIALIZER) //2. Deserialize an object from byte[] to object final ExampleObject Object = serial.fromByteArray(serializedData, Exampleobject.serializer) // The default serialization implementation class currently provided in the library is ByteBufferSerial, which produces byte[]. Users can also change the implementation class by themselves, rather than sticking to byte[]. //Serializers need to define a separate Serializer for each field. //Serializers need to define write and read operations for each field. Serializers need to be defined recursively //Serializers are stateless, so we can write them as an internal class of object, Public static class ExampleObject {public static final ObjectSerializer<ExampleObject> SERIALIZER = new ExampleObjectSerializer(); public final int num; public final SubObject obj; public ExampleObject(int num, @NotNull SubObject obj) { this.num = num; this.obj = obj; }... private static final class ExampleObjectSerializer extends ObjectSerializer<ExampleObject> { @Override protected void serializeObject(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull ExampleObject object) throws IOException { output .writeInt(object.num) // first field .writeObject(object.obj,  SubObject.SERIALIZER); // second field } @Override @NotNull protected ExampleObject deserializeObject(@NotNull SerializationContext context, @NotNull SerializerInput input, int versionNumber) throws IOException, ClassNotFoundException { final int num = input.readInt(); // first field final SubObject obj = input.readObject(SubObject.SERIALIZER); // second field return new ExampleObject(num, obj); }}} // This inner class is similar to Parcelable.Creator in Parcelable. public static final Parcelable.Creator<Person> CREATOR = new Creator<Person>() { @Override public Person createFromParcel(Parcel source) { Person person = new Person(); person.mName = source.readString(); person.mSex = source.readString(); person.mAge = source.readInt(); return person; } @override public Person[] newArray(int size) {return new Person[size]; }}; BuilderSerializer // You can use BuilderSerializer to serialize classes that are built in builder mode or have multiple constructors. // Inherits BuilderSerializer only, And implement the createBuilder method (just return the builder of the current class) // and the deserializeToBuilder method (in which you get the Builder object) Public static class ExampleObject {... public ExampleObject(@NotNull Builder builder) { this.num = builder.mNum; this.obj = builder.mObj; }... public static class Builder extends ModelBuilder<ExampleObject> { ... } private static final class ExampleObjectSerializer extends BuilderSerializer<ExampleObject, Builder> { @Override @NotNull protected Builder createBuilder() { return new Builder(); } @Override protected void serializeObject(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull ExampleObject object) throws IOException { output.writeInt(object.num) .writeObject(object.obj, SubObject.SERIALIZER); } @Override protected void deserializeToBuilder(@NotNull SerializationContext context, @NotNull SerializerInput input, @NotNull Builder builder, int versionNumber) throws IOException, ClassNotFoundException { builder.setNum(input.readInt()) .setObj(input.readObject(SubObject.SERIALIZER)); Final Serializer<ExampleObject> Serializer = New ExampleObjectSerializer(1); // final Serializer<ExampleObject> Serializer = New ExampleObjectSerializer(1); . @Override @NotNull protected ExampleObject deserializeObject(@NotNull SerializationContext context, @NotNull SerializerInput input, int versionNumber) throws IOException, ClassNotFoundException { final int num = input.readInt(); final SubObject obj = input.readObject(SubObject.SERIALIZER); final String name; if (versionNumber < 1) { name = DEFAULT_NAME; } else { name = input.readString(); } return new ExampleObject(num, obj, name); } //6. Serialization of simple parameters // Objects such as Integer, String, Size, Rect, etc., are themselves simple and do not require version control. Using ObjectSerializer will add 2-3 bytes of information to these objects. // ValueSerializer is the best option when version control is not required:  public static final Serializer<Boolean> BOOLEAN = new ValueSerializer<Boolean>() { @Override protected void serializeValue(@NotNull SerializationContext context, @NotNull SerializerOutput output, @NotNull Boolean object) throws IOException { output.writeBoolean(object); } @NotNull @Override protected Boolean deserializeValue(@NotNull SerializationContext context, @NotNull SerializerInput input) throws IOException { return input.readBoolean(); }};Copy the code
  • The key difference between Serial and Serializable (Externalizable) is performance. The key difference between Serial and Parcelable is the medium to which it can be serialized

Serialization of data

1. JSON

advantages
  • Compared to the object serialization scheme, it is faster and smaller
  • Compared to the binary serialization scheme, the results are readable and easy to troubleshoot
  • Easy to use, support cross-platform, cross-language, support for nested references
Json library
  • Android built-in JSON library, Google’s Gson, Alibaba’s Fastjson, Meituan’s MSON.
  • The libraries are optimized in two ways
    • Convenience: for example, support for JSON conversion into JavaBean objects, support for annotations, support for more data types, etc.
    • Performance: Reduced reflection, memory and CPU usage during serialization, especially when the data volume is large or the nesting level is deep.
  • Gson is the most compatible. In general, it performs as well as Fastjson, but Fastjson performs better when there is a lot of data.

2. Protocol Buffers

  • Performance: Using binary encoding compression, smaller size and faster codec than JSON
  • Compatibility: Good compatibility across languages and before and after. Automatic conversions of basic types are supported, but inheritance and reference types are not supported.
  • Use cost: The development cost of Protocol Buffers is high, so you need to define.proto files and use tools to generate the corresponding auxiliary classes. (Coupled and intrusive)
  • Google later introduced FlatBuffers with a higher compression ratio

The database

  • Talk about storage optimization must not open the database
  • For big data storage scenarios, we need to consider stability, performance, and scalability
  • For databases, SQLite is most commonly used on mobile, but there are also others such as startup Realm, Google’s LevelDB, and so on

ORM

  • Perhaps many senior development engineers do not fully understand the internal mechanism of SQLite, nor can they correctly write efficient SQL statements;
  • In order to improve development efficiency, most applications will introduce ORM framework;
  • Object Relational Mapping uses an object-oriented concept to associate tables with objects in a database. (Don’t worry about the underlying database implementation)
  • The most commonly used ORM frameworks are open source greenDAO and Google’s official Room; For details on its use, please refer to another article of mine: Componentized Architecture – 5. Data storage & GreenDao,Room, which will not be covered here;
Problems with the ORM framework
  1. The ORM framework is very simple to use, but at the cost of some execution efficiency;
  2. Let the developer’s thinking solidify, may end up even simple SQL statements can not write;

WCDB

  • Wechat team open source WCDB is an efficient, complete and easy to use mobile database framework, based on SQLCipher, support iOS, macOS and Android.
  • use
Implementation 'com.tencent. WCDB: WCDB -android:1.0.0' //2. Android {defaultConfig {NDK {// Only connect armeabi-v7a and x86 architecture abiFilters' Armeabi-v7a ', 'x86'}}} //3. //WCDB Android uses almost the same interface as the Android SDK SQLite framework. // If your APP has previously used the Android SDK database interface, // Just change android.database.* to com.0ce.wcdb.* in import, / / and android. Database. Sqlite. * to com. Tencent. WCDB. Database. *. // If SQLCipher Android Binding is used before, modify import accordingly. Sqlcipher_export () ATTACH 'old_database' AS old; //4. SELECT sqlcipher_export('main', 'old'); -- Import from 'old' to 'main' DETACH old; // The key change point is to convert the password to byte[] and pass SQLiteCipherSpec to describe the encryption mode. String passphrase = "passphrase"; // The key change point is to convert the password to byte[] and pass SQLiteCipherSpec to describe the encryption mode. SQLiteCipherSpec cipher = new SQLiteCipherSpec() // Encryption description object.setPagesize (1024) // SQLCipher default Page size is 1024 .setSQLCipherVersion(3); // 1,2,3 correspond to 1. X,2. X,3. . You can add other options SQLiteDatabase db = SQLiteDatabase openOrCreateDatabase (" path/to/database ", / / db path passphrase getBytes (), // WCDB password parameter type is byte[] cipher, // The encryption description object created above is null, // CursorFactory null // DatabaseErrorHandler // SQLiteDatabaseHook = null; //WCDB is now officially involved in Room to provide ORM and data binding capabilities and to interact with other Components of Android Jetpack. //6.1 On the basis of access Room, Gradle plus WCDB room component dependencies in the {implementation 'com. Tencent. WCDB: room: 1.0.8' / / instead of room - the runtime, At the same time also no longer need to reference WCDB - android annotationProcessor 'android. Arch. Persistence. Room: the compiler: 1.1.1' / / the compiler needs to use room} / / 6.2 In my code, when I open RoomDatabase, SQLiteCipherSpec cipherSpec = new SQLiteCipherSpec() // Specify encryption mode, Use default encryption to omit.setPagesize (4096).setkdfIteration (64000); WCDBOpenHelperFactory factory = new WCDBOpenHelperFactory().passphrase("passphrase".getbytes ()) // Specify encrypted DB key, Unencrypted DB line. CipherSpec (cipherSpec) / / specified encryption method, using the default encryption can be omitted. WriteAheadLoggingEnabled (true) / / concurrent open... and, speaking, reading and writing. AsyncCheckpointEnabled (true); // Turn on the asynchronous Checkpoint optimization, AppDatabase db = Room. DatabaseBuilder (this, appDatabase.class, "App - db") / /. AllowMainThreadQueries () / / allow the db operation on the main thread, is generally not recommended. OpenHelperFactory (factory) / / important: use WCDB open Room. The build (); Public static SQLiteDatabase openDatabase (String path, String path, String path, String path) SQLiteDatabase.CursorFactory factory, int flags, DatabaseErrorHandler errorHandler, int poolSize)Copy the code

Processes are concurrent with threads

  • Use SQLite often SQLiteDatabaseLockedException, in the final analysis because of concurrent causes, SQLite concurrent has two dimensions, one is a multi-process concurrent, one is multi-threaded concurrent;
Multi-process concurrency
  • SQLite supports multiple concurrent operations by default (controlled by file locks). The granularity of SQLite locks is not very fine, it is for the entire DB file, you can refer to it
    • Introduction to the SQLite locking mechanism
    • Sqlite Learning notes (5)&& Sqlite blocking mechanism
  • In simple terms, multiple processes can simultaneously acquire a SHARED lock to read data, but only one process can acquire an EXCLUSIVE lock to write data to the database.
  • In EXCLUSIVE mode, the database connection does not release the lock on the SQLite file until it is disconnected, thus avoiding unnecessary conflicts and speeding up database access.
Multi-threaded concurrency
  • SQLite supports multi-thread concurrent mode. You need to enable the following configuration. Of course, SQLite will enable multi-thread mode by default
PRAGMA SQLITE_THREADSAFE = 2
Copy the code
  • Like the multi-process locking mechanism, SQLite locks are at database file level granularity for simplicity and do not implement table or even row level locks.
  • If only one thread is operating with the same handle at the same time, we need to open the Connection Pool.
  • Similar to multiple processes, multiple threads can read from the database simultaneously, but write to the database is still mutually exclusive
  • SQLite provides a Busy Retry scenario, in which a Busy Handler is fired when a block occurs, allowing the thread to sleep for a period of time before trying the operation again
  • To further improve concurrency performance, we can also enable WAL (write-ahead Logging) mode. WAL mode writes modified data to a single WAL file and introduces WAL log file locks. With WAL mode, reads and writes can be executed completely concurrently without blocking each other.
PRAGMA schema.journal_mode = WAL
Copy the code
  • Note, however, that writes are still not concurrent. If there are multiple concurrent written, there are SQLiteDatabaseLockedException may occur. At this point we can ask the application to catch the exception and wait a while before trying again.
. } catch (SQLiteDatabaseLockedException e) { if (sqliteLockedExceptionTimes < (tryTimes - 1)) { try { Thread.sleep(100); {}}} catch InterruptedException (e1) sqliteLockedExceptionTimes++; } - In general, with connection pooling and WAL mode, we can greatly increase the read/write concurrency of SQLite and greatly reduce the wait time due to concurrency. It is recommended that you try this in your application.Copy the code

To optimize the

The index optimization
  • Proper use of indexes can significantly slow down queries in most scenarios
  • Many times we think we’ve created an index, but it doesn’t actually work. For example, operators such as BETWEEN, LIKE, OR are used, expressions are used, and case when is used
SELECT * FROM mytable WHERE myfield BETWEEN 10 and 20; SELECT * FROM mytable WHERE myfield >= 10 AND myfield <= 20;Copy the code
  • Creating an index comes with a cost. The index table needs to be kept up to date
    • Create the right index: Make sure it works and is efficient
    • Single-column index, multi-column index and composite index selection;
    • Select index fields: Integer types are much more efficient than strings. Primary keys are indexed by default, so do not use complex fields as main keys.
Page size and cache size
  • A database is like a small file system, with pages and caching concepts inside;
    • Page: minimum storage unit
    • The indexes of different pages belonging to the same table are organized as B-trees. Each table is a B-tree
    • Each page can only store the data of one table or a group of indexes, that is, it is impossible to store the data of multiple tables or indexes on the same page
  • SQLite caches pages that have been read to speed up the next read. The default page size is 1024 bytes, and the default cache size is 1000 pages.
    • Increasing page size does not consistently improve performance, and there may be side effects beyond the inflection point
    • It is recommended to select 4KB as the default page size in advance when creating a database for better performance.
Other optimization
  • Use “select*” with caution and select as many columns as you need
  • Use transactions correctly
  • Precompile with parameter binding and cache the compiled SQL statement
  • For blobs or oversized Text columns, it is possible to exceed the size of a page, resulting in oversized pages. It is recommended that these columns be separated from the table or placed after the table fields.
  • Periodically sort or clear useless or erasable data (if the user accesses this part of data, it can be pulled from the network again)

Damage and recovery

  • Wechat SQLite database repair practice
  • Wechat mobile terminal database component WCDB series (2) – database repair three batahawk
  • Android Database repair

Encryption and Security

  • Database security has two main aspects, one is anti injection, one is encryption
    • Static security scan can be used to prevent injection
    • Generally, SQLCipher is used to support encryption. SQLite encryption and decryption are in the unit of page. By default, AES algorithm is used for encryption.
    • About the use of WCDB encryption and decryption, you can refer to: wechat mobile database component WCDB (4) – Android features

Full-text search

  • Wechat full-text search optimization road
  • Mobile client polyphonic word search

monitoring

  • Every time you write an SQL statement, you should first test it locally
  • EXPLAIN QUERY PLAN allows you to test the QUERY PLAN of an SQL statement, whether it is a full table scan or an index is used, and which index is used.
sqlite> EXPLAIN QUERY PLAN SELECT * FROM t1 WHERE a=1 AND b>2;
QUERY PLAN
|--SEARCH TABLE t1 USING INDEX i2 (a=? AND b>?)
Copy the code
  • WCDB added SQLiteTrace monitoring module
  • Wechat open source Matrix has an intelligent analysis of SQLite statements in the tool: Matrix SQLiteLint – SQLite use quality detection. According to the analysis of SQL syntax tree, combined with our daily database use experience, it abstracts the improper use of index, select* and other six problems.
  • Meituan has also opened its own internal SQL optimization tool, SQLAdvisor
    • Application of SQL parsing in Meituan
    • Meituan Dianping SQL optimization tool SQLAdvisor is open source
Matrix SQLiteLint
  • Access Matrix SQLiteLint to check whether there is any unreasonable use of SQLite.
/ / 1. Add the dependent debugImplementation "com. Tencent. Matrix, the matrix - sqlite - lint -- android SDK: ${MATRIX_VERSION}" releaseImplementation "com.tencent.matrix:matrix-sqlite-lint-android-sdk-no-op:${MATRIX_VERSION}" //2. Private void prepareSQLiteLint() {SQLiteLintPlugin = (SQLiteLintPlugin) Matrix.with().getPluginByClass(SQLiteLintPlugin.class); if (plugin == null) { return; } plugin.addConcernedDB(new SQLiteLintConfig.ConcernDb(getWritableDatabase()) .setWhiteListXml(R.xml.sqlite_lint_whitelist) .enableAllCheckers()); }Copy the code

reference

  • Storage Optimization (part 1) : What are the common data storage methods?
  • Application sandbox
  • Security enhanced Linux in Android
  • Equipment encryption
  • The whole encryption
  • File level encryption
  • How to optimize data storage?
  • Replaces the implementation of system SharedPreferences
  • MMKV
  • Java object serialization
  • EffectiveJava – 10 – serialization
  • Serial
  • 05 | Twitter high-performance Serial serialization framework (1) basic usage and concepts
  • This section describes how to use the Google Protocol Buffer
  • FlatBuffers experience
  • Database SQLite is the best way to optimize your database
  • Wechat WCDB evolution road – open source and start
  • WCDB
  • Componentized architecture – 5. Data storage & GreenDao,Room
  • Introduction to the SQLite locking mechanism
  • Sqlite Learning notes (5)&& Sqlite blocking mechanism
  • Wechat iOS SQLite source code optimization practices
  • How SQLite indexes work
  • The data structure and algorithm behind MySQL index
  • SQlite source code analysis
  • Fully parse sqLite.pdf
  • Matrix SQLiteLint – SQLite uses quality checks
  • Application of SQL parsing in Meituan
  • Meituan Dianping SQL optimization tool SQLAdvisor is open source
  • The Definitive Guide to SQLite (2nd edition)

I am Jinyang, if you want to advance and learn more about dry goods, welcome to follow the wechat public account “jinyang said” to receive my latest articles