0. Foreword

I have been working so hard lately that I sometimes get anxious and can't sleep. Since I don't have large blocks of time to study new things, I am taking the time to summarize what I have already read.

This series is planned in three parts to sort out the hot topics around Redis, and this opening article covers the underlying implementation. You will learn the following from this article:

  • The author, development and evolution of Redis

  • Overview of Redis interview questions

  • Issues related to the underlying Redis implementation include:

    **The low-level implementation of the common data types, the principles and advantages of SDS, the implementation of dictionaries, the principles of skip lists and sorted sets, and the Redis threading model and service model**

Friendly tip: the content is not difficult; the only worry is that you won't read it.

If you don't understand something, bookmark it first and come back when you have time for in-depth study; you will find it is genuinely solid material. Enough bragging, let's get to it!

1. A look back at Redis

Redis is an open-source, network-enabled, memory-based, optionally persistent, high-performance key-value database written in ANSI C. The father of Redis is Salvatore Sanfilippo from Sicily, Italy, who goes by the handle Antirez on GitHub. I found some brief information about him and translated it, as shown in the picture below:

Ten years after its first release in 2009, Redis remains one of the most popular key-value in-memory databases.

Great open source projects are backed by big companies. Before May 2013, Redis development was sponsored by VMware; between May 2013 and June 2015 it was sponsored by Pivotal; and since June 2015 it has been sponsored by Redis Labs.

The author has also used some other NoSQL stores, some of which support only a single value type, so many operations must be implemented on the client side. For example, if the value is structured data and you need to modify one field, you have to read the whole value, modify it, and write it all back, which is cumbersome. Redis values support multiple types, so many operations can be done on the server side, which is very convenient for the client.

Of course, because Redis is an in-memory database, its storage capacity is limited and a distributed cluster is expensive, so many companies have built SSD-backed Redis-like systems, such as SSDB and 360's Pika. However, the author believes that going from 0 to 1 is harder than going from 1 to 2. Redis is without doubt a landmark in the NoSQL world and is worth studying and using in depth.

2. The status of Redis

Redis provides clients for Java, C/C++, C#, PHP, JavaScript, Perl, Objective-C, Python, Ruby, Erlang, Golang and other mainstream languages, so whatever language your stack uses, you will find a suitable client; its audience is very wide.

The author checked datanyze.com for the latest market share and ranking of Redis versus MySQL, as well as their deployment counts among the world's top sites (data as of December 11, 2019, the day of writing):


It can be seen that Redis ranks 9th in the overall share and has the same number of deployments as MySQL in the global Top100 sites, so Redis still has a certain status in the arena.

3. Getting down to business

At present, the stable version of Redis has reached 5.x and its features keep getting more powerful. Redis is almost standard equipment at Internet companies at home and abroad. As a developer, the probability of running into Redis-related questions in written tests, interviews and daily work is very high, so mastering the relevant knowledge points is well worth it.

Learning and organizing something complex cannot be done in one sitting, and everyone has their own way of thinking. The author believes that to fully master Redis, we need to understand it from the bottom up and from the outside in.

The actual knowledge of Redis can be divided into three levels:

  • Underlying implementation: mainly extracted from Redis source code, including but not limited to underlying data structure, service model, algorithm design, etc.

  • Infrastructure: roughly speaking, the overall external features and performance of Redis, including but not limited to the standalone and master-slave architectures, master-slave data synchronization, the sentinel mechanism, cluster implementation, distributed consistency, failover, etc.

  • Practical application: what Redis can do for you in the real world, including but not limited to standalone caching, distributed caching, distributed locks, and other applications.


Deep understanding and skilled use of Redis take time to temper and are genuinely not easy. If you want a breakthrough in a short time, you can only start from the hot topics. That feels a bit utilitarian, but it is understandable; to make a living we tend to forgive our lazy selves, otherwise what would we live on?

4. The underlying implementation of hot topics

The topics around the underlying implementation mainly involve Redis source code and design, which can be said to be the cornerstone of Redis's features; understanding the underlying implementation lets us grasp those features better. Because there is a lot of underlying code, the later infrastructure articles will still weave in source-code analysis, so this article only lists some of the hot questions.

Q1: How are the five data types commonly used by Redis implemented?

The five commonly used data types supported by Redis refer to the value types, namely: String, List, Hash, Set, and the sorted set Zset. Redis later added several more data types, namely Bitmaps, HyperLogLogs, and GEO.

Redis is written in standard C, which has only the most basic data types. Therefore, to provide the five data types it exposes, Redis developed its own set of basic data structures and uses them to implement the five types.

The underlying data structures of Redis include the simple dynamic string (SDS), linked lists, dictionaries, skip lists, integer sets, ziplists (compressed lists), and objects.

In order to balance space and time efficiency, Redis uses different underlying data structures depending on the concrete type of a value. Among them, the hash table and the ziplist are reused the most. The following figure shows the mapping between the external data types and the underlying data structures:


As the figure shows, the ziplist is an underlying implementation shared by the Zset, Hash, and List types, which makes it look very powerful. The ziplist is a specially encoded, sequentially laid-out data structure stored in one contiguous block of memory, developed to save space, and its internal structure is relatively complex.
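How does Redis keep track of which underlying structure backs a given value? Each value is wrapped in a redisObject whose type field records the external data type and whose encoding field records the structure actually in use. The sketch below is reproduced from memory of the 5.x server.h, so treat the field widths and names as indicative rather than authoritative:

```c
/* Simplified view of the object header from server.h (5.x, from memory). */
typedef struct redisObject {
    unsigned type:4;      /* external type: OBJ_STRING, OBJ_LIST, OBJ_SET, OBJ_ZSET, OBJ_HASH */
    unsigned encoding:4;  /* underlying structure, e.g. OBJ_ENCODING_ZIPLIST,
                           * OBJ_ENCODING_HT, OBJ_ENCODING_SKIPLIST, OBJ_ENCODING_INTSET */
    unsigned lru:24;      /* LRU time or LFU data, depending on maxmemory-policy */
    int refcount;
    void *ptr;            /* points at the concrete structure: sds, ziplist, dict, skip list ... */
} robj;
```

The OBJECT ENCODING command reports this encoding field for any key, which is a handy way to see the mapping in practice.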

Q2: What are the advantages of Redis SDS over strings in C?

In C, a string is represented as a character array of length N+1 with '\0' as the terminator. This implementation cannot meet Redis's requirements for safety, efficiency and rich functionality, so Redis encapsulates its own simple dynamic string (SDS) structure.

Before looking at the advantages of SDS, we need to look at how SDS is implemented. Check the latest src/sds.h on GitHub:

```c
typedef char *sds;   /* an sds value is just a char* that points at buf[] */

/* sdshdr5 is never used directly; it is kept to document the type-5 layout */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
/* len: the space currently in use; alloc: the space allocated for buf,
 * excluding the header and the null terminator; flags: the lower 3 bits hold
 * the SDS_TYPE_x of the header (the other 5 bits are unused). The header type
 * also bounds the maximum buf length: e.g. with sdshdr16, len can be at most 2^16. */
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

#define SDS_TYPE_5  0
#define SDS_TYPE_8  1
#define SDS_TYPE_16 2
#define SDS_TYPE_32 3
#define SDS_TYPE_64 4
#define SDS_TYPE_MASK 7
#define SDS_TYPE_BITS 3
```

Looking at the previous definition, the author drew a picture:

As the figure shows, an SDS is essentially made up of three parts: the header, buf, and the null terminator. The header can be regarded as the guide for the whole SDS, recording information such as the space in use and the allocated size. The following figure from the web shows a concrete sdshdr8 example:


The complete details of the SDS implementation can be seen clearly in sds.h/sds.c. This article will not expand on them all, or it would run far too long; let's get straight to the advantages of SDS:

  • O(1) length lookup: a C string has to be traversed to find its length, while an SDS reads len directly (see the sketch after this list);

  • Preventing buffer overflow: when an SDS needs to modify a string, it first checks via len and alloc whether there is enough space; if not, it expands automatically, avoiding the overwrite problems of raw C string operations;

  • Fewer memory reallocations: with C strings, every append or trim changes the size of the underlying array and forces a reallocation. SDS uses space pre-allocation and lazy space release: when growing it allocates extra space, and when shrinking it does not immediately return the memory to the OS. Both mechanisms are easy to understand (also illustrated in the sketch below);

  • Binary safety: a C string cannot contain an embedded '\0', so binary data such as images or audio cannot be stored safely; SDS is binary safe, so whatever is written is read back unchanged, with no filtering or restrictions.
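To make the O(1) length and pre-allocation points concrete, here is a minimal sketch in the spirit of sdslen() and sdsMakeRoomFor() from sds.c. It relies on the sds.h excerpt above; the sketch_ function names are illustrative rather than the real API, and the 1MB threshold mirrors SDS_MAX_PREALLOC as I recall it from the source:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: the real logic lives in sds.c (sdslen / sdsMakeRoomFor). */

/* O(1) length: the byte just before buf is `flags`; its low 3 bits select the
 * header type, so the matching `len` field can be read back directly. */
static size_t sketch_sdslen(const sds s) {
    unsigned char flags = s[-1];
    switch (flags & SDS_TYPE_MASK) {
    case SDS_TYPE_5:  return flags >> SDS_TYPE_BITS;   /* length lives in the flags byte */
    case SDS_TYPE_8:  return ((struct sdshdr8 *)(s - sizeof(struct sdshdr8)))->len;
    case SDS_TYPE_16: return ((struct sdshdr16 *)(s - sizeof(struct sdshdr16)))->len;
    /* ... the same pattern applies to sdshdr32 and sdshdr64 ... */
    }
    return 0;
}

/* Space pre-allocation: when growing to `newlen` bytes, allocate extra room so
 * that the next few appends need no further realloc (double below 1MB, add a
 * fixed 1MB above it). */
static size_t sketch_grow_target(size_t newlen) {
    const size_t max_prealloc = 1024 * 1024;   /* 1MB */
    if (newlen < max_prealloc) return newlen * 2;
    return newlen + max_prealloc;
}
```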

Following the old rule, here is a good summary from Huang Jianhong:

Q3: How is the Redis dictionary implemented? Briefly describe the process of progressive rehash.

The dictionary is a star member among Redis's commonly used data types. As mentioned earlier, the Hash type can be implemented on top of a ziplist or a hashtable; here we only discuss the principles of the hashtable-based implementation.

The dictionary is a clearly layered data structure, as shown below:

To get an idea, let's look at the latest src/dict.h source definitions:

```c
// Hash table node
typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

// Function pointers that encapsulate the operations on a dictionary
typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
// This part is the key to understanding the dictionary
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

// Dictionary structure
typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;
```

One nicety of C is that definitions must be written from the innermost layer outward, so an obvious hierarchy is visible. The author drew a picture to show the layered concepts:


  • About dictEntry

dictEntry is the hash table node that actually stores the data; its members are key, v, and the next pointer. key holds the key of the key-value pair; v holds the value, which can be a pointer, a uint64_t, an int64_t, or a double. next points to another hash table node, which resolves hash collisions by chaining key-value pairs with the same hash value into a linked list.

The connection relation of two conflicting hash nodes is shown in the figure:

  • About dictht

The hash table dictht has the members table, size, used, and sizemask. table is an array in which each element is a pointer to a dictEntry structure, and each dictEntry holds one key-value pair. size records the size of the hash table, used records the number of nodes currently stored, and sizemask equals size - 1; it is ANDed with a key's hash value to compute that key's index in the table array.

The figure above shows a table of size 4 in which k1 and k0 collide at index = 2 and are chained together. k0 was stored first, and k1 is inserted at the head of the collision list: since the list has no tail pointer, head insertion keeps the operation efficient.
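A minimal sketch of that head insertion, using the dictht and dictEntry definitions above; the function name is illustrative (the real logic lives in dict.c's add path):

```c
/* Illustrative sketch of chaining a new node at the head of its bucket. */
static void sketch_bucket_head_insert(dictht *ht, dictEntry *entry, uint64_t hash) {
    unsigned long idx = hash & ht->sizemask;  /* bucket index = hash & (size - 1) */
    entry->next = ht->table[idx];             /* new node points at the old head (k1 -> k0) */
    ht->table[idx] = entry;                   /* the bucket now starts at the new node */
    ht->used++;
}
```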

  • About dict

The dict structure is the dictionary itself, with members type, privdata, ht, and rehashidx. type is a pointer to a dictType, a set of function pointers that implement the operations on the dictionary. privdata is passed as an argument to those dictType functions. ht is an array of two dictht structures, i.e. the dictionary holds two hash tables, and rehashidx is the variable used during rehash. This gives an initial picture of the members of a dictionary.

  • Dictionary hash algorithm
```c
// Pseudo-code: compute the hash of the key with the dictionary's hash function
hash = dict->type->hashFunction(key);
// Pseudo-code: use the hash table's sizemask and the hash to compute the index in ht[0] or ht[1]
index = hash & dict->ht[x].sizemask;
// The macro as defined in the source
#define dictHashKey(d, key) (d)->type->hashFunction(key)
```

Redis uses the MurmurHash algorithm to compute hash values. It was invented by Austin Appleby in 2008; regardless of the input, it produces hash values with a good random distribution, and it is very fast to compute. MurmurHash2 and MurmurHash3 are the versions currently in use.
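Tying dictEntry, dictht and dict together, here is a simplified lookup in the spirit of dict.c's dictFind(). The structure and macro names come from the excerpt above, but the function itself is an illustrative reduction (for instance, it omits the incremental rehash step that real lookups also trigger), and it assumes keyCompare returns nonzero when two keys are equal:

```c
/* Illustrative lookup; not the actual Redis source. */
dictEntry *sketch_dict_find(dict *d, const void *key) {
    uint64_t h = dictHashKey(d, key);                 /* 1. hash the key */
    for (int table = 0; table <= 1; table++) {        /* 2. look in ht[0], then ht[1] if rehashing */
        dictht *ht = &d->ht[table];
        if (ht->size == 0) continue;
        unsigned long idx = h & ht->sizemask;         /* 3. index = hash & (size - 1) */
        dictEntry *he = ht->table[idx];
        while (he) {                                  /* 4. walk the collision chain */
            if (d->type->keyCompare(d->privdata, key, he->key))
                return he;
            he = he->next;
        }
        if (d->rehashidx == -1) break;                /* not rehashing, so ht[1] holds nothing */
    }
    return NULL;
}
```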

  • Ordinary rehash

The number of key-value pairs stored in a hash table changes dynamically. In order to keep the load factor of the hash table in a reasonable range, it is necessary to expand and shrink the hash table.

Scaling is done by rehashing the dictionary's hash tables. The basic steps of an ordinary rehash are: allocate space -> migrate the key-value pairs one by one -> swap the hash tables, as follows:

  1. Allocate space for the dictionary's ht[1] hash table. Its size depends on the operation being performed and the number of key-value pairs ht[0] currently holds: for expansion, the size of ht[1] is the first 2^n greater than or equal to ht[0].used * 2; for shrinking, the size of ht[1] is the first 2^n greater than or equal to ht[0].used (see the sizing sketch after this list);

    For example, when expanding with ht[0].used = 200, we pick the first power of 2 greater than or equal to 400, i.e. 2^9 = 512.

  2. All key-value pairs saved in ht[0] are rehashed onto ht[1] by recomputing each key's hash and index value;

  3. The rehash is repeated until all key-value pairs in ht[0] have migrated to ht[1]; then ht[0] is released, ht[1] is installed as ht[0], and a new blank hash table is created at ht[1] in preparation for the next rehash.
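A small sketch of the size calculation referenced above, in the spirit of dict.c's _dictNextPower(); the function name is illustrative, and 4 is the initial table size as I recall it from dict.h:

```c
/* Smallest power of two >= the requested size (illustrative). */
static unsigned long sketch_next_power(unsigned long size) {
    unsigned long i = 4;          /* initial hash table size */
    while (i < size) i *= 2;
    return i;
}

/* Expansion requests ht[0].used * 2:
 *   sketch_next_power(200 * 2) == 512, the 2^9 example above.
 * Shrinking requests ht[0].used itself. */
```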

  • Progressive rehash

Redis does not complete the rehash in one shot; it is done gradually, in several steps. The reason is that when the hash table holds a very large number of key-value pairs, rehashing them all onto ht[1] at once could make the server stop serving requests for a while, which is unacceptable.

Redis uses progressive Rehash to address this situation. The detailed steps of the process are as follows:

  1. Allocating space for ht[1] is no different from an ordinary rehash;

  2. rehashidx is set to 0 to indicate that rehash work has officially begun; it will then be incremented, and starting from 0 means the rehash begins with the first bucket of the array;

  3. During the rehash, every add, delete, update, or lookup on the dictionary also rehashes the key-value pairs of the ht[0] bucket at index rehashidx onto ht[1], and then increments rehashidx to point at the next bucket that needs rehashing;

  4. As dictionary operations continue, eventually all key-value pairs in ht[0] have been rehashed onto ht[1]; rehashidx is then set to -1 to indicate that the rehash is complete.

The idea of progressive Rehash is to spread the computation required for rehash key and value pairs over each add, delete, find, and update operation to the dictionary, thereby avoiding the blocking problems associated with centralized Rehash.
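As a sketch of what one of these piggy-backed steps does, here is a simplified version modeled on dict.c's dictRehash(d, 1). The structure names come from the dict.h excerpt above; the real function additionally limits how many empty buckets it will scan per call, so treat this as an illustration only:

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative single rehash step; not the actual Redis source. */
static void sketch_rehash_step(dict *d) {
    if (d->rehashidx == -1) return;                    /* no rehash in progress */
    while (d->ht[0].table[d->rehashidx] == NULL)       /* skip empty buckets */
        d->rehashidx++;
    dictEntry *he = d->ht[0].table[d->rehashidx];
    while (he) {                                       /* move the whole bucket to ht[1] */
        dictEntry *next = he->next;
        uint64_t h = dictHashKey(d, he->key);
        unsigned long idx = h & d->ht[1].sizemask;     /* new position in ht[1] */
        he->next = d->ht[1].table[idx];                /* head insertion into ht[1] */
        d->ht[1].table[idx] = he;
        d->ht[0].used--;
        d->ht[1].used++;
        he = next;
    }
    d->ht[0].table[d->rehashidx] = NULL;
    d->rehashidx++;
    if (d->ht[0].used == 0) {                          /* everything migrated: swap tables */
        free(d->ht[0].table);                          /* Redis uses its own allocator here */
        d->ht[0] = d->ht[1];
        memset(&d->ht[1], 0, sizeof(d->ht[1]));
        d->rehashidx = -1;                             /* rehash finished */
    }
}
```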

I can't help but wonder whether this piggy-backed rehash could drag on for a very long time. If some keys are never touched again, an unfinished expansion has little impact, since those buckets are simply not being used; but for a shrink, leaving the rehash unfinished for a long time may waste memory.

Q4: Do you understand skip lists? How does Redis implement Zset with a skip list?

The ZSet data type is also very useful, especially for leaderboard requirements; the author has used it to build the leaderboard of an app with 20 million daily active users, so it is well worth understanding the underlying implementation of ZSet. The author previously wrote two articles on the implementation of skip lists and ZSet, which you can refer to.
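For quick reference, the sorted-set layout looks roughly like the following; this is reproduced from memory of server.h in the 5.x source, so treat the exact fields as indicative. The key point is that a zset keeps both a dict (for O(1) member-to-score lookups) and a skip list (for score-ordered traversal, range and rank queries):

```c
/* Rough layout from memory of server.h (5.x); indicative only. */
typedef struct zskiplistNode {
    sds ele;                                /* the member */
    double score;                           /* the sort key */
    struct zskiplistNode *backward;         /* level-0 back pointer */
    struct zskiplistLevel {
        struct zskiplistNode *forward;      /* next node on this level */
        unsigned long span;                 /* nodes skipped, used to compute ranks */
    } level[];                              /* each node gets a random number of levels */
} zskiplistNode;

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;                   /* number of nodes (excluding the header) */
    int level;                              /* highest level currently in use */
} zskiplist;

typedef struct zset {
    dict *dict;                             /* member -> score */
    zskiplist *zsl;                         /* nodes ordered by score */
} zset;
```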

Q5: Why is Redis single-threaded? Describe the Redis network model and how a single thread coordinates events.

Strictly speaking, newer versions of Redis are not purely single-threaded: some auxiliary work is handled by BIO background threads. At its core, Redis uses I/O multiplexing (epoll on Linux) to implement an event-driven reactor model, and the single main thread continuously coordinates time events and file events to keep the whole system running. The author has written two related articles that can be read for a fuller answer.
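To illustrate what "coordinating time events and file events" looks like in a single thread, here is a minimal reactor sketch in the spirit of Redis's ae.c event loop (aeMain/aeProcessEvents). It is Linux/epoll specific and heavily simplified; the function names and callback parameters are illustrative, not the Redis API:

```c
#include <sys/epoll.h>

/* Minimal single-threaded reactor sketch (illustrative, not the Redis code).
 * One epoll_wait() call multiplexes all client sockets (file events), and its
 * timeout is derived from the nearest timer (time event), so a single thread
 * can serve both without blocking longer than necessary. */
void sketch_event_loop(int epfd,
                       void (*handle_file_event)(struct epoll_event *ev),
                       int  (*fire_due_timers_and_next_ms)(void)) {
    struct epoll_event events[1024];
    for (;;) {
        /* Run any timers that are due and ask how long we may sleep. */
        int timeout_ms = fire_due_timers_and_next_ms();
        /* Sleep until a socket is ready or the next timer is due. */
        int n = epoll_wait(epfd, events, 1024, timeout_ms);
        for (int i = 0; i < n; i++)
            handle_file_event(&events[i]);   /* read commands / write replies */
    }
}
```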