LRUCache

LRUCache implements the Least Recently Used (LRU) replacement algorithm: a fixed amount of memory is reserved in advance, and data is stored and evicted according to its access order.

Generally, two structures are required: a linked list that stores the data and records its access order, and a hash table for finding the corresponding linked list node in O(1) time.
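As a minimal sketch of this textbook structure (not LevelDB code; the class name SimpleLRU and its interface are made up for illustration, using only the standard library):

```cpp
#include <list>
#include <string>
#include <unordered_map>

// A minimal LRU cache: the std::list keeps access order (front = most
// recently used), the std::unordered_map maps a key to its list node.
class SimpleLRU {
  size_t capacity_;
  std::list<std::pair<std::string, std::string>> order_;
  std::unordered_map<std::string, decltype(order_)::iterator> index_;

 public:
  explicit SimpleLRU(size_t capacity) : capacity_(capacity) {}

  bool Get(const std::string& key, std::string* value) {
    auto it = index_.find(key);
    if (it == index_.end()) return false;
    order_.splice(order_.begin(), order_, it->second);  // move to front
    *value = it->second->second;
    return true;
  }

  void Put(const std::string& key, const std::string& value) {
    auto it = index_.find(key);
    if (it != index_.end()) {
      it->second->second = value;
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    order_.emplace_front(key, value);
    index_[key] = order_.begin();
    if (index_.size() > capacity_) {  // evict the least recently used entry
      index_.erase(order_.back().first);
      order_.pop_back();
    }
  }
};
```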

LevelDB LRUCache structure

In LevelDB, the LRUCache implementation is optimized to speed up cache access from multiple threads.

The diagram above shows the structure of LRUCache. It consists of the following parts:

- The hash table, whose list_ array maintains N buckets; incoming keys are hashed into different buckets (see the FindPointer excerpt after this list).

- The in_use_ doubly linked list, which holds the nodes currently referenced by clients; this list is unordered.

- The lru_ doubly linked list, which holds the nodes referenced only by the cache itself; this list is ordered by recency of use.
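For completeness, this is how the hash table walks a bucket; the excerpt below is from HandleTable::FindPointer in LevelDB's util/cache.cc. Each bucket is a singly linked chain through the next_hash pointers:

```cpp
// Return a pointer to the slot that points to the cache entry matching
// key/hash. If no such entry exists, return a pointer to the trailing
// slot in the corresponding bucket's list.
LRUHandle** FindPointer(const Slice& key, uint32_t hash) {
  LRUHandle** ptr = &list_[hash & (length_ - 1)];
  while (*ptr != nullptr && ((*ptr)->hash != hash || key != (*ptr)->key())) {
    ptr = &(*ptr)->next_hash;
  }
  return ptr;
}
```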

So why maintain two doubly linked lists in LevelDB? Because lru_ holds exactly the nodes not being used by any thread, eviction when the cache reaches its capacity can simply clean the lru_ list from back (oldest) to front (newest), without ever touching a node a client still holds.
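This is exactly what the eviction loop in LRUCache::Insert does (excerpt from LevelDB's util/cache.cc); lru_.next is the oldest entry, so nodes are removed oldest-first until usage fits within the capacity:

```cpp
while (usage_ > capacity_ && lru_.next != &lru_) {
  LRUHandle* old = lru_.next;  // the least recently used node
  assert(old->refs == 1);      // only the cache itself references it
  bool erased = FinishErase(table_.Remove(old->key(), old->hash));
  if (!erased) {  // to avoid unused variable when compiled NDEBUG
    assert(erased);
  }
}
```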

LevelDB's minimal structure - LRUHandle

```cpp
struct LRUHandle {
  void* value;
  void (*deleter)(const Slice&, void* value);
  LRUHandle* next_hash;
  LRUHandle* next;
  LRUHandle* prev;
  size_t charge;      // The size charged against the cache, in bytes.
                      // TODO(opt): Only allow uint32_t?
  size_t key_length;
  bool in_cache;      // Whether the entry is currently in the cache.
  uint32_t refs;      // References, including the cache's own reference, if
                      // present; counts how many clients are using this node.
  uint32_t hash;      // Hash of key(); used for fast sharding and comparisons.
                      // With multiple LRUCache shards, the top four bits of
                      // the hash determine which shard the key is placed in.
  char key_data[1];   // Beginning of key

  Slice key() const {
    // next is only equal to this if the LRU handle is the list head of an
    // empty list. List heads never have meaningful keys.
    assert(next != this);
    return Slice(key_data, key_length);
  }
};
```

Each LRUHandle stores the information for a single key-value pair.
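Note the key_data[1] trick: the handle and its key live in a single allocation, with the key bytes stored inline after the struct. The following excerpt from LRUCache::Insert in LevelDB's util/cache.cc shows how a handle is built:

```cpp
// Allocate one block sized for the struct plus the key bytes; key_data[1]
// already accounts for one byte, hence the "- 1".
LRUHandle* e =
    reinterpret_cast<LRUHandle*>(malloc(sizeof(LRUHandle) - 1 + key.size()));
e->value = value;
e->deleter = deleter;
e->charge = charge;
e->key_length = key.size();
e->hash = hash;
e->in_cache = false;
e->refs = 1;  // for the returned handle.
std::memcpy(e->key_data, key.data(), key.size());
```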

```cpp
// A single shard of sharded cache.
class LRUCache {
 public:
  LRUCache();
  ~LRUCache();

  // Separate from constructor so caller can easily make an array of LRUCache
  void SetCapacity(size_t capacity) { capacity_ = capacity; }

  // Like Cache methods, but with an extra "hash" parameter.
  Cache::Handle* Insert(const Slice& key, uint32_t hash, void* value,
                        size_t charge,
                        void (*deleter)(const Slice& key, void* value));
  Cache::Handle* Lookup(const Slice& key, uint32_t hash);
  void Release(Cache::Handle* handle);
  void Erase(const Slice& key, uint32_t hash);
  void Prune();
  size_t TotalCharge() const {
    MutexLock l(&mutex_);
    return usage_;
  }

 private:
  // Helper functions for list maintenance and reference counting.
  void LRU_Remove(LRUHandle* e);
  void LRU_Append(LRUHandle* list, LRUHandle* e);
  void Ref(LRUHandle* e);
  // Decrement e's reference count; destroy e when the count reaches 0,
  // move e to the lru_ list when it reaches 1.
  void Unref(LRUHandle* e);
  bool FinishErase(LRUHandle* e) EXCLUSIVE_LOCKS_REQUIRED(mutex_);

  // Initialized before use. Total charge available to key-value entries.
  size_t capacity_;

  // mutex_ protects the following state.
  mutable port::Mutex mutex_;
  size_t usage_ GUARDED_BY(mutex_);

  // Dummy head of LRU list.
  // lru.prev points to the newest entry, lru.next to the oldest entry.
  // Entries here have refs==1 and in_cache==true: no client is using them.
  LRUHandle lru_ GUARDED_BY(mutex_);

  // Dummy head of in-use list.
  // Entries here are still referenced by clients, and therefore also by the
  // cache: refs >= 2 and in_cache==true.
  LRUHandle in_use_ GUARDED_BY(mutex_);

  HandleTable table_ GUARDED_BY(mutex_);
};
```
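The reference-counting rules described in the comments above are what move a node between the two lists. The excerpt below is taken, lightly abridged, from LRUCache::Ref and LRUCache::Unref in LevelDB's util/cache.cc:

```cpp
void LRUCache::Ref(LRUHandle* e) {
  if (e->refs == 1 && e->in_cache) {  // If on lru_ list, move to in_use_ list.
    LRU_Remove(e);
    LRU_Append(&in_use_, e);
  }
  e->refs++;
}

void LRUCache::Unref(LRUHandle* e) {
  assert(e->refs > 0);
  e->refs--;
  if (e->refs == 0) {  // Deallocate: no client and no cache reference remain.
    assert(!e->in_cache);
    (*e->deleter)(e->key(), e->value);
    free(e);
  } else if (e->in_cache && e->refs == 1) {
    // No longer used by any client; demote to the lru_ list.
    LRU_Remove(e);
    LRU_Append(&lru_, e);
  }
}
```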

As the code shows, the LRUCache class holds a mutex member, and every operation on the cache must lock and unlock it; under heavy multi-threaded access, this lock contention adds significant overhead.

LevelDB optimizes this by maintaining multiple LRUCache shards and using a hash function to distribute keys evenly across them, so that concurrent threads rarely need to operate on the same shard at the same time, thereby reducing lock waits.
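The sharding layer is ShardedLRUCache, abridged below from LevelDB's util/cache.cc. The top kNumShardBits bits of the key's hash pick a shard, and only that shard's mutex is taken, so operations that hash to different shards proceed in parallel:

```cpp
static const int kNumShardBits = 4;
static const int kNumShards = 1 << kNumShardBits;  // 16 shards

class ShardedLRUCache : public Cache {
 private:
  LRUCache shard_[kNumShards];

  static inline uint32_t HashSlice(const Slice& s) {
    return Hash(s.data(), s.size(), 0);  // LevelDB's general-purpose hash
  }
  static uint32_t Shard(uint32_t hash) { return hash >> (32 - kNumShardBits); }

 public:
  Handle* Lookup(const Slice& key) override {
    const uint32_t hash = HashSlice(key);
    return shard_[Shard(hash)].Lookup(key, hash);  // locks only one shard
  }
  // Insert, Release, Erase, etc. dispatch to a shard the same way.
};
```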

Advantages of the LRUCache implementation in LevelDB

  1. The multi-shard LRUCache mechanism avoids contention on a single lock and improves the efficiency of multi-threaded cache operations.
  2. The cached nodes are split across two doubly linked lists (in_use_ and lru_), which speeds up access and eviction: only lru_ must stay ordered, while in_use_ needs no particular node order.