Preface

MMKV is Tencent's open-source, mmap-based, high-performance, general-purpose key-value storage component. Android's traditional key-value tool is SharedPreferences; it is lightweight and easy to use, but its performance is poor and it does not reliably support multi-process access. MMKV uses an mmap memory-mapped file, which maps a chunk of memory onto the file so the file can be read and written as if it were memory. This gives much higher performance and makes MMKV a good replacement for SharedPreferences.

The disadvantages of SharedPreferences

Android’s traditional SharedPreferences, while simple and easy to use, has several problems:

  • Every commit rewrites the entire file, so write efficiency is low.

  • All data needs to be loaded into memory each time. If a large amount of data is stored, it will take up a lot of memory.

  • It is unsafe to use across processes. There is no mechanism to prevent conflicting updates from multiple processes, so multi-process use is discouraged and the MODE_MULTI_PROCESS flag has been deprecated.

MMKV was designed to solve all of the above problems.

Implementation principles of MMKV

Mmap memory-mapped files

The core principle of MMKV is memory mapping. mmap is the memory-mapping facility that Linux exposes to applications. The mmap function is declared as follows:

void* mmap(void* __addr, size_t __size, int __prot, int __flags, int __fd, off_t __offset);

It maps the file referred to by the file descriptor fd, starting at offset within the file, into a memory region of size bytes beginning at the suggested start address addr. prot is the access protection and can be PROT_EXEC, PROT_READ, PROT_WRITE, or PROT_NONE, meaning executable, readable, writable, and inaccessible respectively. flags sets the type of the mapping: for example, MAP_SHARED, MAP_PRIVATE, and MAP_ANONYMOUS denote a shared mapping, a private copy-on-write mapping, and an anonymous mapping not backed by a file, respectively.
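
As a minimal sketch of how these parameters fit together (not MMKV code; the file path and size are arbitrary), the following maps a file for shared read/write access and modifies it through the returned pointer:

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstring>

int main() {
    const size_t size = 4096; // one page
    int fd = open("/tmp/demo.mmap", O_RDWR | O_CREAT, 0644);
    if (fd < 0 || ftruncate(fd, size) != 0) { // the file must be large enough to back the mapping
        return 1;
    }

    // map the whole file, shared and writable, starting at file offset 0
    void *ptr = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return 1;
    }

    // writing to the mapped memory modifies the file; no read()/write() calls are needed
    memcpy(ptr, "hello mmap", 11);

    munmap(ptr, size);
    close(fd);
    return 0;
}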

  • Mmap vs. traditional IO

    In a traditional IO read, DMA (Direct Memory Access) first copies the disk data into the kernel buffer, and the CPU then copies it from the kernel buffer into the application’s address space, so a single read or write involves two memory copies and two context switches. With mmap, the disk data only needs to be copied into the kernel buffer once; after the mapping between user space and kernel space is established, reading that data requires no further memory copy and no context switch. Likewise, a traditional IO write involves two copies and two context switches, whereas with mmap, once the mapping is established, data is written directly into the mapped region without copying or context switching, and the kernel takes care of writing it back to the file. Compared with traditional IO, mmap reduces data copies and context switches, improving I/O efficiency.

MMKV’s mmap logic lives in MemoryFile.cpp:

bool MemoryFile::mmap() {
    m_ptr = (char *) ::mmap(m_ptr, m_size, PROT_READ | PROT_WRITE, MAP_SHARED, m_fd, 0);
    if (m_ptr == MAP_FAILED) {
        MMKVError("fail to mmap [%s], %s", m_name.c_str(), strerror(errno));
        m_ptr = nullptr;
        return false;
    }

    return true;
}

You can see that the mapping is created as a shared mapping (MAP_SHARED) with read and write permissions.

mmap sounds perfect, but it has drawbacks. For example, mmap must map whole memory pages, which can waste memory, so it is best suited to files that are read and written frequently, where it saves a lot of I/O time. Also, although the kernel is responsible for writing the mapped pages back to disk, it does so periodically rather than in real time, so data can still be lost if the kernel crashes or the power fails; msync can be used to flush the data back to disk explicitly.
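
As a small sketch of such an explicit flush (continuing the earlier example, where ptr and size are the values passed to mmap):

#include <sys/mman.h>

// flush the mapped range back to the file and block until the write completes;
// MS_ASYNC would schedule the writeback without waiting
if (msync(ptr, size, MS_SYNC) != 0) {
    // handle the error, e.g. log errno
}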

The data structure

MMKV stores its data using protobuf. The Protobuf protocol is a Google product; compared with XML it wins on data size and on serialization/deserialization speed, with encoded data 3-10 times smaller and deserialization 20-100 times faster. For details on the Protobuf protocol, see this blog post: halfrost.com/protobuf_en…
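
At the heart of the protobuf wire format is the varint, a variable-length integer encoding that MMKV uses for the key and value length prefixes (the pbRawVarint32Size calls seen later). A minimal sketch of encoding a 32-bit varint, not taken from the MMKV sources:

#include <cstdint>
#include <cstddef>

// encode value into buf using protobuf's base-128 varint format:
// 7 payload bits per byte, with the high bit set on every byte except the last.
// returns the number of bytes written (1 to 5 for a 32-bit value).
size_t encodeVarint32(uint32_t value, uint8_t *buf) {
    size_t n = 0;
    while (value >= 0x80) {
        buf[n++] = static_cast<uint8_t>(value) | 0x80; // more bytes follow
        value >>= 7;
    }
    buf[n++] = static_cast<uint8_t>(value); // last byte, high bit clear
    return n;
}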

With protobuf, serialization and deserialization are fast, but protobuf does not support incremental updates, so how does MMKV add or modify just one key-value pair? The answer is that MMKV simply appends the new record to the end of the memory block, which means the same key may appear many times in the file. When MMKV loads the data from the file into memory, later occurrences of a key replace earlier ones, so the data is always up to date. Looking at the MMKV source, during initialization a map is used to hold the key-value pairs loaded from the file; if a key already exists, it is overwritten by the later occurrence, ensuring the value of that key is the latest. The code snippet is as follows:

MMKV_Android.cpp:

MMKV::MMKV(const string &mmapID, int size, MMKVMode mode, string *cryptKey, string *rootPath) {
    ...
    // sensitive zone
    {
        SCOPED_LOCK(m_sharedProcessLock);
        loadFromFile();
    }
}

MMKV_IO.cpp:

void MMKV::loadFromFile() {
    ...
    if (!m_file->isFileValid()) {
        MMKVError("file [%s] not valid", m_path.c_str());
    } else {
        ...
        // loading
        if (loadFromFile && m_actualSize > 0) {
            ...
            if (needFullWriteback) {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::greedyDecodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::greedyDecodeMap(*m_dic, inputBuffer);
                }
            } else {
#ifndef MMKV_DISABLE_CRYPT
                if (m_crypter) {
                    MiniPBCoder::decodeMap(*m_dicCrypt, inputBuffer, m_crypter);
                } else
#endif
                {
                    MiniPBCoder::decodeMap(*m_dic, inputBuffer);
                }
            }
            ...
        } else {
            ...
        }
        ...
    }
    m_needLoadFromFile = false;
}

MiniPBCoder.cpp:

void MiniPBCoder::decodeOneMap(MMKVMap &dic, size_t position, bool greedy) {
    auto block = [position, this](MMKVMap &dictionary) {
        ...
        while (!m_inputData->isAtEnd()) {
            KeyValueHolder kvHolder;
            const auto &key = m_inputData->readString(kvHolder);
            if (key.length() > 0) {
                m_inputData->readData(kvHolder);
                if (kvHolder.valueSize > 0) {
                    dictionary[key] = move(kvHolder);
                } else {
                    auto itr = dictionary.find(key);
                    if (itr != dictionary.end()) {
                        dictionary.erase(itr);
                    }
                }
            }
        }
    };
    if (greedy) {
        try {
            block(dic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    } else {
        try {
            MMKVMap tmpDic;
            block(tmpDic);
            dic.swap(tmpDic);
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
}

m_dic and m_dicCrypt are the key-value dictionaries. Their data structures are defined as follows:

MMKVPredef.h:

using MMKVMap = std::unordered_map<std::string, mmkv::KeyValueHolder>;
using MMKVMapCrypt = std::unordered_map<std::string, mmkv::KeyValueHolderCrypt>;

Both are unordered_map instances whose key is the key we passed in (the one written to the memory block) and whose value is a KeyValueHolder structure. Let’s take a look:

KeyValueHolder.h:

struct KeyValueHolder {
    uint16_t computedKVSize; // internal use only
    uint16_t keySize;
    uint32_t valueSize;
    uint32_t offset;

    KeyValueHolder() = default;
    KeyValueHolder(uint32_t keyLength, uint32_t valueLength, uint32_t offset);

    MMBuffer toMMBuffer(const void *basePtr) const;
};

Only the sizes of the key and value and their offset within the mapped memory are stored; the key and value themselves are not stored again. They are located through the offset and sizes when needed, which saves memory.
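
Based on the toMMBuffer implementation shown later, a record in the data area is a length-prefixed key followed by a length-prefixed value, and the holder’s fields index into it. A hypothetical helper (not part of MMKV) illustrating how the fields locate the value bytes:

#include <cstdint>

// offset         -> start of this record inside the data area (after the Fixed32Size header)
// computedKVSize -> bytes taken up by the encoded key and the length prefixes before the value bytes
// valueSize      -> length of the value bytes
const uint8_t *valuePtr(const uint8_t *basePtr, const KeyValueHolder &h) {
    return basePtr + h.offset + h.computedKVSize; // the next valueSize bytes are the value
}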

Writing data

Let’s take String data as an example and analyze how it is stored. The code snippet is as follows:

MMKV.java:

public class MMKV implements SharedPreferences, SharedPreferences.Editor {
    ...
    public boolean encode(String key, String value) {
        return encodeString(nativeHandle, key, value);
    }
    ...
    private native boolean encodeString(long handle, String key, String value);
}

native-bridge.cpp:

MMKV_JNI jboolean encodeString(JNIEnv *env, jobject, jlong handle, jstring oKey, jstring oValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        if (oValue) {
            string value = jstring2string(env, oValue);
            return (jboolean) kv->set(value, key);
        } else {
            // remove this key pair
            kv->removeValueForKey(key);
            return (jboolean) true;
        }
    }
    return (jboolean) false;
}

MMKV.cpp:

bool MMKV::set(const string &value, MMKVKey_t key) {
    if (isKeyEmpty(key)) {
        return false;
    }
    return setDataForKey(MMBuffer((void *) value.data(), value.length(), MMBufferNoCopy), key, true);
}

MMKV_IO.cpp:

bool MMKV::setDataForKey(MMBuffer &&data, MMKVKey_t key, bool isDataHolder) {
    if ((!isDataHolder && data.length() == 0) || isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    SCOPED_LOCK(m_exclusiveProcessLock);
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        ...
    } else
#endif // MMKV_DISABLE_CRYPT
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto ret = appendDataWithKey(data, itr->second, isDataHolder);
            if (!ret.first) {
                return false;
            }
            // update key&value
            itr->second = std::move(ret.second);
        } else {
            auto ret = appendDataWithKey(data, key, isDataHolder);
            if (!ret.first) {
                return false;
            }
            m_dic->emplace(key, std::move(ret.second));
        }
    }
    m_hasFullWriteback = false;
#ifdef MMKV_APPLE
    [key retain];
#endif
    return true;
}

// append function called when the key already exists
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, const KeyValueHolder &kvHolder, bool isDataHolder) {
    SCOPED_LOCK(m_exclusiveProcessLock);

    uint32_t keyLength = kvHolder.keySize;
    // size needed to encode the key
    size_t rawKeySize = keyLength + pbRawVarint32Size(keyLength);

    // ensureMemorySize() might change kvHolder.offset, so have to do it early
    {
        auto valueLength = static_cast<uint32_t>(data.length());
        if (isDataHolder) {
            valueLength += pbRawVarint32Size(valueLength);
        }
        auto size = rawKeySize + valueLength + pbRawVarint32Size(valueLength);
        bool hasEnoughSize = ensureMemorySize(size);
        if (!hasEnoughSize) {
            return make_pair(false, KeyValueHolder());
        }
    }
    auto basePtr = (uint8_t *) m_file->getMemory() + Fixed32Size;
    MMBuffer keyData(basePtr + kvHolder.offset, rawKeySize, MMBufferNoCopy);
    return doAppendDataWithKey(data, keyData, isDataHolder, keyLength);
}

// append function called for a new key
KVHolderRet_t MMKV::appendDataWithKey(const MMBuffer &data, MMKVKey_t key, bool isDataHolder) {
#ifdef MMKV_APPLE
    auto oData = [key dataUsingEncoding:NSUTF8StringEncoding];
    auto keyData = MMBuffer(oData, MMBufferNoCopy);
#else
    auto keyData = MMBuffer((void *) key.data(), key.size(), MMBufferNoCopy);
#endif
    return doAppendDataWithKey(data, keyData, isDataHolder, static_cast<uint32_t>(keyData.length()));
}

// both cases end up calling doAppendDataWithKey
KVHolderRet_t MMKV::doAppendDataWithKey(const MMBuffer &data, const MMBuffer &keyData, bool isDataHolder, uint32_t originKeyLength) {
    ...
    try {
        if (isKeyEncoded) {
            m_output->writeRawData(keyData);
        } else {
            m_output->writeData(keyData);
        }
        if (isDataHolder) {
            m_output->writeRawVarint32((int32_t) valueLength);
        }
        m_output->writeData(data); // note: write size of data
    } catch (std::exception &e) {
        MMKVError("%s", e.what());
        return make_pair(false, KeyValueHolder());
    }

    auto offset = static_cast<uint32_t>(m_actualSize);
    auto ptr = (uint8_t *) m_file->getMemory() + Fixed32Size + m_actualSize;
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        m_crypter->encrypt(ptr, ptr, size);
    }
#endif
    m_actualSize += size;
    updateCRCDigest(ptr, size);

    return make_pair(true, KeyValueHolder(originKeyLength, valueLength, offset));
}

CodedOutputData.cpp:

void CodedOutputData::writeRawData(const MMBuffer &data) {
    size_t numberOfBytes = data.length();
    if (m_position + numberOfBytes > m_size) {
        auto msg = "m_position: " + to_string(m_position) + ", numberOfBytes: " + to_string(numberOfBytes) +
                   ", m_size: " + to_string(m_size);
        throw out_of_range(msg);
    }
    memcpy(m_ptr + m_position, data.getPtr(), numberOfBytes);
    m_position += numberOfBytes;
}

Looking at the writeRawData method of CodedOutputData.cpp at the end, the data is written to the end of the memory block: m_position is the write offset, whose initial value corresponds to m_actualSize, the actual size of the data already in the file, so it points to the end of the existing data. The setDataForKey method in MMKV_IO.cpp looks up the element in the map by its key. If it is found, keyData is pointed at the existing key in the file so that the key bytes can be reused, and the key-value pair in the m_dic or m_dicCrypt dictionary is updated; the line itr->second = std::move(ret.second); cleverly uses move semantics to avoid an extra memory copy. If the key is not found, the record is appended directly to the end of the memory block and the new key-value pair is inserted into m_dic or m_dicCrypt.

Reading data

Let’s now see how String data is read. The code snippet is as follows:

MMKV.java:

public String decodeString(String key) {
    return decodeString(nativeHandle, key, null);
}

private native String decodeString(long handle, String key, String defaultValue);

native-bridge.cpp:

MMKV_JNI jstring decodeString(JNIEnv *env, jobject obj, jlong handle, jstring oKey, jstring oDefaultValue) {
    MMKV *kv = reinterpret_cast<MMKV *>(handle);
    if (kv && oKey) {
        string key = jstring2string(env, oKey);
        string value;
        bool hasValue = kv->getString(key, value);
        if (hasValue) {
            return string2jstring(env, value);
        }
    }
    return oDefaultValue;
}

MMKV_IO.cpp:

bool MMKV::getString(MMKVKey_t key, string &result) {
    if (isKeyEmpty(key)) {
        return false;
    }
    SCOPED_LOCK(m_lock);
    auto data = getDataForKey(key);
    if (data.length() > 0) {
        try {
            CodedInputData input(data.getPtr(), data.length());
            result = input.readString();
            return true;
        } catch (std::exception &exception) {
            MMKVError("%s", exception.what());
        }
    }
    return false;
}

MMBuffer MMKV::getDataForKey(MMKVKey_t key) {
    checkLoadData();
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        auto itr = m_dicCrypt->find(key);
        if (itr != m_dicCrypt->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr, m_crypter);
        }
    } else
#endif
    {
        auto itr = m_dic->find(key);
        if (itr != m_dic->end()) {
            auto basePtr = (uint8_t *) (m_file->getMemory()) + Fixed32Size;
            return itr->second.toMMBuffer(basePtr);
        }
    }
    MMBuffer nan;
    return nan;
}

KeyValueHolder.cpp:

MMBuffer KeyValueHolder::toMMBuffer(const void *basePtr) const {
    auto realPtr = (uint8_t *) basePtr + offset;
    realPtr += computedKVSize;
    return MMBuffer(realPtr, valueSize, MMBufferNoCopy);
}

CodedInputData.cpp:

string CodedInputData::readString() {
    int32_t size = readRawVarint32();
    if (size < 0) {
        throw length_error("InvalidProtocolBuffer negativeSize");
    }

    auto s_size = static_cast<size_t>(size);
    if (s_size <= m_size - m_position) {
        string result((char *) (m_ptr + m_position), s_size);
        m_position += s_size;
        return result;
    } else {
        throw out_of_range("InvalidProtocolBuffer truncatedMessage");
    }
}

The code is relatively simple. The logic is to fetch the KeyValueHolder element from the m_dic or m_dicCrypt dictionary, convert it to an MMBuffer, wrap that in a CodedInputData, and call its readString method to read the data from the memory block based on the pointer and offset.

Expansion mechanism

If the size passed in is smaller than a memory page, the capacity defaults to one page. As data keeps being appended, this capacity will eventually be exceeded, and the file has to be expanded. MMKV’s expansion logic is as follows:

MMKV_IO.cpp:

// since we use append mode, when -[setData: forKey:] many times, space may not be enough
// try a full rewrite to make space
bool MMKV::ensureMemorySize(size_t newSize) {
    if (!isFileValid()) {
        MMKVWarning("[%s] file not valid", m_mmapID.c_str());
        return false;
    }

    if (newSize >= m_output->spaceLeft() || (m_crypter ? m_dicCrypt->empty() : m_dic->empty())) {
        // try a full rewrite to make space
        auto fileSize = m_file->getFileSize();
        auto preparedData = m_crypter ? prepareEncode(*m_dicCrypt) : prepareEncode(*m_dic);
        auto sizeOfDic = preparedData.second;
        size_t lenNeeded = sizeOfDic + Fixed32Size + newSize;
        size_t dicCount = m_crypter ? m_dicCrypt->size() : m_dic->size();
        size_t avgItemSize = lenNeeded / std::max<size_t>(1, dicCount);
        size_t futureUsage = avgItemSize * std::max<size_t>(8, (dicCount + 1) / 2);
        // 1. no space for a full rewrite, double it
        // 2. or space is not large enough for future usage, double it to avoid frequently full rewrite
        if (lenNeeded >= fileSize || (lenNeeded + futureUsage) >= fileSize) {
            size_t oldSize = fileSize;
            do {
                fileSize *= 2;
            } while (lenNeeded + futureUsage >= fileSize);
            MMKVInfo("extending [%s] file size from %zu to %zu, incoming size:%zu, future usage:%zu",
                     m_mmapID.c_str(), oldSize, fileSize, newSize, futureUsage);

            // if we can't extend size, rollback to old state
            if (!m_file->truncate(fileSize)) {
                return false;
            }

            // check if we fail to make more space
            if (!isFileValid()) {
                MMKVWarning("[%s] file not valid", m_mmapID.c_str());
                return false;
            }
        }
        return doFullWriteBack(move(preparedData), nullptr);
    }
    return true;
}

The logic is: compute the space needed for a full rewrite of the dictionary plus the new element. If that exceeds fileSize, or if it plus an estimate of future usage (the average item size times half the item count, but at least 8 items’ worth) exceeds fileSize, keep doubling fileSize until everything fits, truncate the file to the new size, and then write the data back.
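
A rough worked example of the doubling, with made-up numbers:

// hypothetical numbers; the default capacity is one page (4096 bytes)
// fileSize    = 4096
// lenNeeded   = sizeOfDic + Fixed32Size + newSize = 5000
// dicCount    = 10  ->  avgItemSize = 5000 / 10 = 500
// futureUsage = 500 * max(8, (10 + 1) / 2) = 500 * 8 = 4000
// lenNeeded + futureUsage = 9000 >= 4096  ->  double: 8192, still < 9000  ->  16384
// the file is truncated to 16384 bytes and a full write-back is performed

With that in mind, take a look at the doFullWriteBack function: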

MMKV_IO.cpp:

bool MMKV::doFullWriteBack(pair<MMBuffer, size_t> preparedData, AESCrypt *newCrypter) {
    auto ptr = (uint8_t *) m_file->getMemory();
    auto totalSize = preparedData.second;
#ifdef MMKV_IOS
    auto ret = guardForBackgroundWriting(ptr + Fixed32Size, totalSize);
    if (!ret.first) {
        return false;
    }
#endif
#ifndef MMKV_DISABLE_CRYPT
    uint8_t newIV[AES_KEY_LEN];
    auto decrypter = m_crypter;
    auto encrypter = (newCrypter == InvalidCryptPtr) ? nullptr : (newCrypter ? newCrypter : m_crypter);
    if (encrypter) {
        AESCrypt::fillRandomIV(newIV);
        encrypter->resetIV(newIV, sizeof(newIV));
    }
#endif
    delete m_output;
    m_output = new CodedOutputData(ptr + Fixed32Size, m_file->getFileSize() - Fixed32Size);
#ifndef MMKV_DISABLE_CRYPT
    if (m_crypter) {
        memmoveDictionary(*m_dicCrypt, m_output, ptr, decrypter, encrypter, preparedData);
    } else {
#else
    {
        auto encrypter = m_crypter;
#endif
        memmoveDictionary(*m_dic, m_output, ptr, encrypter, totalSize);
    }

    m_actualSize = totalSize;
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        recaculateCRCDigestWithIV(newIV);
    } else
#endif
    {
        recaculateCRCDigestWithIV(nullptr);
    }
    m_hasFullWriteback = true;
    // make sure lastConfirmedMetaInfo is saved
    sync(MMKV_SYNC);
    return true;
}

// we don't need to really serialize the dictionary, just reuse what's already in the file
static void memmoveDictionary(MMKVMap &dic, CodedOutputData *output, uint8_t *ptr, AESCrypt *encrypter, size_t totalSize) {
    auto originOutputPtr = output->curWritePointer();
    // make space to hold the fake size of dictionary's serialization result
    auto writePtr = originOutputPtr + ItemSizeHolderSize;
    // reuse what's already in the file
    if (!dic.empty()) {
        // sort by offset
        vector<KeyValueHolder *> vec;
        vec.reserve(dic.size());
        for (auto &itr : dic) {
            vec.push_back(&itr.second);
        }
        sort(vec.begin(), vec.end(), [](const auto &left, const auto &right) { return left->offset < right->offset; });

        // merge nearby items to make memmove quicker
        vector<pair<uint32_t, uint32_t>> dataSections; // pair(offset, size)
        dataSections.emplace_back(vec.front()->offset, vec.front()->computedKVSize + vec.front()->valueSize);
        for (size_t index = 1, total = vec.size(); index < total; index++) {
            auto kvHolder = vec[index];
            auto &lastSection = dataSections.back();
            if (kvHolder->offset == lastSection.first + lastSection.second) {
                lastSection.second += kvHolder->computedKVSize + kvHolder->valueSize;
            } else {
                dataSections.emplace_back(kvHolder->offset, kvHolder->computedKVSize + kvHolder->valueSize);
            }
        }
        // do the move
        auto basePtr = ptr + Fixed32Size;
        for (auto &section : dataSections) {
            // memmove() should handle this well: src == dst
            memmove(writePtr, basePtr + section.first, section.second);
            writePtr += section.second;
        }
        // update offset
        if (!encrypter) {
            auto offset = ItemSizeHolderSize;
            for (auto kvHolder : vec) {
                kvHolder->offset = offset;
                offset += kvHolder->computedKVSize + kvHolder->valueSize;
            }
        }
    }
    // hold the fake size of dictionary's serialization result
    output->writeRawVarint32(ItemSizeHolder);
    auto writtenSize = static_cast<size_t>(writePtr - originOutputPtr);
#ifndef MMKV_DISABLE_CRYPT
    if (encrypter) {
        encrypter->encrypt(originOutputPtr, originOutputPtr, writtenSize);
    }
#endif
    assert(writtenSize == totalSize);
    output->seek(writtenSize - ItemSizeHolderSize);
}

You can see that a new CodedOutputData is created with its write pointer at the beginning of the data area, the dictionary elements are sorted by offset from smallest to largest and moved one by one with memmove, and finally the output pointer is positioned at the end of the rewritten data. This compaction merges away the duplicate records left by append-only writes; if the space is still insufficient, the file is expanded as shown in ensureMemorySize above.

Handling synchronization

MMKV internally defines several locks to handle synchronization problems in different scenarios:

mmkv::ThreadLock *m_lock;                       // thread lock, for synchronization between threads
mmkv::FileLock *m_fileLock;                     // file lock, for synchronization between processes
mmkv::InterProcessLock *m_sharedProcessLock;    // shared (read) lock, wrapped above the file lock, used before read operations
mmkv::InterProcessLock *m_exclusiveProcessLock; // exclusive (write) lock, wrapped above the file lock, used before write operations

Let’s start with ThreadLock:

class ThreadLock {
private:
#if MMKV_USING_PTHREAD
    pthread_mutex_t m_lock;
#else
    CRITICAL_SECTION m_lock;
#endif
...
}

ThreadLock::ThreadLock() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);

    pthread_mutex_init(&m_lock, &attr);

    pthread_mutexattr_destroy(&attr);
}

As you can see, it wraps a pthread_mutex_t configured as recursive, so the same thread can acquire it repeatedly. This lock is used for synchronization between threads: the data is locked before it is written or read. A ScopedLock wrapper class is used to operate the lock, simplifying things by locking the object in its constructor and unlocking it in its destructor. The code is as follows:

template <typename T>
class ScopedLock {
    T *m_lock;

    void lock() {
        if (m_lock) {
            m_lock->lock();
        }
    }

    void unlock() {
        if (m_lock) {
            m_lock->unlock();
        }
    }

public:
    explicit ScopedLock(T *oLock) : m_lock(oLock) {
        MMKV_ASSERT(m_lock);
        lock();
    }

    ~ScopedLock() {
        unlock();
        m_lock = nullptr;
    }

    // just forbid it for possibly misuse
    explicit ScopedLock(const ScopedLock<T> &other) = delete;
    ScopedLock &operator=(const ScopedLock<T> &other) = delete;
};
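
Throughout the source shown above, the SCOPED_LOCK(...) macro presumably just constructs one of these guards on the stack. A minimal usage sketch, with an illustrative function name that is not from MMKV:

void updateDictionary(mmkv::ThreadLock *m_lock) {
    ScopedLock<mmkv::ThreadLock> guard(m_lock); // locks in the constructor
    // ... mutate the in-memory dictionary while the lock is held ...
}   // guard is destroyed here and the lock is released, even if an exception is thrown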

At the same time, an exclusive (write) lock and a shared (read) lock are used to synchronize multi-process access during reads and writes. MMKV builds them on top of the file lock flock and maintains read and write counters internally, because flock is not recursive: it can be acquired multiple times, but a single unlock releases it entirely. The counters also handle lock upgrading and downgrading, i.e. upgrading a read lock to a write lock and downgrading a write lock back to a read lock. For an upgrade, the shared lock is released first if necessary and then the exclusive lock is acquired; for a downgrade, when the last exclusive lock is released while shared locks are still held, flock is called with LOCK_SH to convert the lock back to a shared one. The code snippet is as follows:

bool FileLock::doLock(LockType lockType, bool wait) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unLockFirstIfNeeded = false;

    if (lockType == SharedLockType) {
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 0 || m_exclusiveLockCount > 0) {
            m_sharedLockCount++;
            return true;
        }
    } else {
        // don't want exclusive-lock to break existing exclusive-locks
        if (m_exclusiveLockCount > 0) {
            m_exclusiveLockCount++;
            return true;
        }
        // prevent deadlock
        if (m_sharedLockCount > 0) {
            unLockFirstIfNeeded = true;
        }
    }

    auto ret = platformLock(lockType, wait, unLockFirstIfNeeded);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount++;
        } else {
            m_exclusiveLockCount++;
        }
    }
    return ret;
}

bool FileLock::platformLock(LockType lockType, bool wait, bool unLockFirstIfNeeded) {
#ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemLock(lockType, wait, unLockFirstIfNeeded);
    }
#endif
    auto realLockType = LockType2FlockType(lockType);
    auto cmd = wait ? realLockType : (realLockType | LOCK_NB);
    if (unLockFirstIfNeeded) {
        // try lock
        auto ret = flock(m_fd, realLockType | LOCK_NB);
        if (ret == 0) {
            return true;
        }
        // let's be gentleman: unlock my shared-lock to prevent deadlock
        ret = flock(m_fd, LOCK_UN);
        if (ret != 0) {
            MMKVError("fail to try unlock first fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        }
    }

    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        // try recover my shared-lock
        if (unLockFirstIfNeeded) {
            ret = flock(m_fd, LockType2FlockType(SharedLockType));
            if (ret != 0) {
                // let's hope this never happen
                MMKVError("fail to recover shared-lock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
            }
        }
        return false;
    } else {
        return true;
    }
}

bool FileLock::unlock(LockType lockType) {
    if (!isFileLockValid()) {
        return false;
    }
    bool unlockToSharedLock = false;

    if (lockType == SharedLockType) {
        if (m_sharedLockCount == 0) {
            return false;
        }
        // don't want shared-lock to break any existing locks
        if (m_sharedLockCount > 1 || m_exclusiveLockCount > 0) {
            m_sharedLockCount--;
            return true;
        }
    } else {
        if (m_exclusiveLockCount == 0) {
            return false;
        }
        if (m_exclusiveLockCount > 1) {
            m_exclusiveLockCount--;
            return true;
        }
        // restore shared-lock when all exclusive-locks are done
        if (m_sharedLockCount > 0) {
            unlockToSharedLock = true;
        }
    }

    auto ret = platformUnLock(unlockToSharedLock);
    if (ret) {
        if (lockType == SharedLockType) {
            m_sharedLockCount--;
        } else {
            m_exclusiveLockCount--;
        }
    }
    return ret;
}

bool FileLock::platformUnLock(bool unlockToSharedLock) {
#ifdef MMKV_ANDROID
    if (m_isAshmem) {
        return ashmemUnLock(unlockToSharedLock);
    }
#endif
    int cmd = unlockToSharedLock ? LOCK_SH : LOCK_UN;
    auto ret = flock(m_fd, cmd);
    if (ret != 0) {
        MMKVError("fail to unlock fd=%d, ret=%d, error:%s", m_fd, ret, strerror(errno));
        return false;
    } else {
        return true;
    }
}

Besides locking, MMKV also keeps track of the file size internally and, before each read or write, checks whether the actual file size still matches the size it has recorded; a mismatch indicates that another process has written to the file. It also maintains a sequence number, m_sequence, which is incremented whenever the memory is reorganized; if a process’s own sequence number differs from the one in the file, another process has performed a reorganization. Both cases require reloading the data:

MMKV_IO.cpp:

void MMKV::checkLoadData() {
    ...
    if (m_metaInfo->m_sequence != metaInfo.m_sequence) {
        MMKVInfo("[%s] oldSeq %u, newSeq %u", m_mmapID.c_str(), m_metaInfo->m_sequence, metaInfo.m_sequence);
        SCOPED_LOCK(m_sharedProcessLock);

        clearMemoryCache();
        loadFromFile();
        notifyContentChanged();
    } else if (m_metaInfo->m_crcDigest != metaInfo.m_crcDigest) {
        MMKVDebug("[%s] oldCrc %u, newCrc %u, new actualSize %u", m_mmapID.c_str(), m_metaInfo->m_crcDigest,
                  metaInfo.m_crcDigest, metaInfo.m_actualSize);
        SCOPED_LOCK(m_sharedProcessLock);

        size_t fileSize = m_file->getActualFileSize();
        if (m_file->getFileSize() != fileSize) {
            MMKVInfo("file size has changed [%s] from %zu to %zu", m_mmapID.c_str(), m_file->getFileSize(), fileSize);
            clearMemoryCache();
            loadFromFile();
        } else {
            partialLoadFromFile();
        }
        notifyContentChanged();
    }
}

Data encryption

MMKV can optionally run in encrypted mode, storing the data encrypted with AES. The key is supplied by the caller, who must keep it and pass it in again at the next initialization. MMKV is quite flexible here: the key can be changed later, and storage can be switched between plaintext and ciphertext; after such a switch the data is fully rewritten. The code snippet is as follows:

bool MMKV::reKey(const string &cryptKey) {
    ...
    if (m_crypter) {
        if (cryptKey.length() > 0) {
            string oldKey = this->cryptKey();
            if (cryptKey == oldKey) {
                return true;
            } else {
                // change encryption key
                MMKVInfo("reKey with new aes key");
                auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
                ret = fullWriteback(newCrypt);
                if (ret) {
                    delete m_crypter;
                    m_crypter = newCrypt;
                } else {
                    delete newCrypt;
                }
            }
        } else {
            // decryption to plain text
            MMKVInfo("reKey to no aes key");
            ret = fullWriteback(InvalidCryptPtr);
            if (ret) {
                delete m_crypter;
                m_crypter = nullptr;
                if (!m_dic) {
                    m_dic = new MMKVMap();
                }
            }
        }
    } else {
        if (cryptKey.length() > 0) {
            // transform plain text to encrypted text
            MMKVInfo("reKey to a aes key");
            auto newCrypt = new AESCrypt(cryptKey.data(), cryptKey.length());
            ret = fullWriteback(newCrypt);
            if (ret) {
                m_crypter = newCrypt;
                if (!m_dicCrypt) {
                    m_dicCrypt = new MMKVMapCrypt();
                }
            } else {
                delete newCrypt;
            }
        } else {
            return true;
        }
    }
    // m_dic or m_dicCrypt is not valid after reKey
    if (ret) {
        clearMemoryCache();
    }
    return ret;
}
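
A brief usage sketch based on the reKey signature above; kv stands for an already-opened MMKV instance that was initialized with an AES key:

// kv is an MMKV* opened in encrypted mode
std::string newKey = "new-aes-key";
kv->reKey(newKey);          // re-encrypts the whole file with the new key

kv->reKey(std::string());   // passing an empty key switches the file back to plaintext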

The performance comparison

MMKV significantly outperforms Android’s traditional SharedPreferences and SQLite in both single-process and multi-process read and write performance, thanks to mmap and incremental (append-only) writes. For detailed benchmark results, see the official repository: github.com/Tencent/MMK…

Conclusion

As a high-performance component for storing large amounts of key-value data, MMKV has many advantages over Android’s traditional storage options, SharedPreferences and SQLite. Its core is the mmap memory-mapped file, which gives it a large performance advantage over traditional IO and makes reading and writing a file as simple as operating on memory. Reading the source code reveals many excellent design points, such as incremental (append-only) writes, memory compaction, detecting writes from other processes via file-size and sequence-number checks, and multi-process read/write locking. The downside is potential memory waste: the mapping must cover whole memory pages, so it is overkill when only a small amount of data is stored. Overall, it is a solid replacement for SharedPreferences and the storage of choice in scenarios involving large amounts of data.