Why is Redis so fast? Learn more about Redis!

Source: http://www.cnblogs.com/kismetv/p/8654978.html

When working with Redis, we encounter five Redis object types (string, hash, list, collection, ordered collection), and the richness of these types is one of Redis’s big advantages over Memcached and the like.

On the basis of understanding the usage and characteristics of the five object types of Redis, a further understanding of the memory model of Redis is of great help to the use of Redis.

001

Redis memory statistics

After the client connects to the server through the redis-CLI (the client uses the redis-CLI only unless otherwise specified), you can run the info command to check the memory usage: Info Memory.

The info command can display a lot of information about the Redis server, including basic server information, CPU, memory, persistence, client connection information and so on. Memory is a parameter that displays only memory-related information.

Some important notes in the returned results are as follows:

used_memory

The amount of memory allocated by the Redis allocator, in bytes, including the virtual memory used (that is, swap); The Redis allocator is described later. Used_memory_human is just more user-friendly.

used_memory_rss

The Redis process occupies the memory (in bytes) of the operating system, which is consistent with the values seen by the top and ps commands.

In addition to the memory allocated by the allocator, usED_memory_RSS includes memory needed by the process itself to run, memory fragments, and so on, but not virtual memory.

mem_fragmentation_ratio

Memory fragmentation ratio, which is the ratio of USED_memory_rss to USED_memory.

Mem_fragmentation_ratio is generally greater than 1, and a larger value indicates a larger memory fragmentation ratio. Mem_fragmentation_ratio <1 indicates that Redis uses virtual memory. Virtual memory is much slower than memory because it is used as a disk.

002

Redis memory partition

As an in-memory database, Redis stores mainly data (key-value pairs) in memory. There are other parts that take up memory.

The memory footprint of Redis can be divided into the following parts:

data

As a database, data is the most important part; This portion of memory is counted in usED_memory.

Redis uses key-value pairs to store data, where values (objects) are of five types: string, hash, list, set, and ordered set.

Memory required by the process itself to run

The main Redis process itself must consume memory, such as code, constant pool, etc. This amount of memory is in the order of a few megabytes, which in most production environments is negligible compared to the amount of memory consumed by Redis data.

The buffer memory

Buffer memory includes client buffer, replication backlog buffer, AOF buffer, etc. The client buffer stores the input/output buffer of the client connection. The replication backlog buffer is used for partial replication; The AOF buffer is used to hold the most recent write command during AOF rewrite.

Memory fragments

Memory fragmentation is generated when Redis allocates and reclaims physical memory. For example, if changes to the data are frequent and the sizes of the data differ greatly, the space freed by Redis may not be freed in physical memory.

However, Redis cannot be used efficiently, which results in memory fragmentation, which is not counted in USED_memory.

003

Details of Redis data storage

The following concepts are involved in the details of Redis data storage.

DictEntry: Redis is a key-value database, so there is a dictEntry for each key-value pair, which stores Pointers to keys and values; Next points to the next dictEntry, independent of this key-value.

Key: As you can see in the upper right corner, the Key (” hello “) is not stored directly as a string, but in an SDS structure.

Here we introduce Jemalloc, RedisObject, SDS, object types and internal encoding respectively.

jemalloc

In the 64-bit system, jemalloc divides the memory space into three ranges: small, large and huge. When Redis stores data, it selects the most appropriate memory block for storage.

The memory units divided by Jemalloc are as follows:

For example, if you need to store a 130-byte object, Jemalloc puts it into a 160-byte memory cell.

RedisObject

As mentioned earlier, there are five types of Redis objects; Regardless of type, Redis is not stored directly, but rather through RedisObject objects.

SDS

Instead of using a C string directly (that is, an array of characters ending with a null character ‘\0’) as the default string representation, Redis uses SDS. SDS stands for Simple Dynamic String.

SDS structure

Where buf stands for byte array, which is used to store strings. Len indicates the length that buF has used; Free indicates the unused length of buF.

Here’s an example:

As you can see from the structure of SDS, the length of the BUF array is =free+len+1 (where 1 represents the null character at the end of the string).

So, the space occupied by an SDS structure is: the length of free+ the length of len+ the length of buF array =4+4+ Free +len+1=free+len+9.

004

Redis object type and internal encoding

Redis supports five object types, and each structure has at least two encodings.

The advantages of this approach are: on the one hand, the interface is separated from the implementation, so that when the internal code needs to be added or changed, the use of users is not affected; on the other hand, the internal code can be switched according to different application scenarios, improving efficiency.

The internal encodings supported by the various object types of Redis are shown below (the version in the figure is Redis3.0) :

As for the conversion of Redis internal code, it conforms to the following rules: the conversion is completed when Redis writes data, and the conversion process is irreversible. Only small memory encoding can be converted to large memory encoding.

string

Strings are the most basic type because all keys are strings, and elements of several other complex types are strings that cannot exceed 512MB in length.

The internal encoding

There are three internal encodings of string types, which can be used in the following scenarios:

Int: a long integer of 8 bytes. When the string value is an integer, the value is represented by the long integer.

Embstr: a string of <=39 bytes. Both EMbSTR and RAW use RedisObject and SDS to store data.

The difference is that the use of EMbSTR allocates memory space only once (thus RedisObject and SDS are contiguous), whereas RAW allocates memory space twice (RedisObject and SDS, respectively).

Raw: The string contains more than 39 bytes.

An example is shown below:

The length of embstr from raw is 39; This is because RedisObject is 16 bytes long and SDS is 9+ string length.

So when the string length is 39, embstr is exactly 16+9+39=64, and Jemalloc can allocate exactly 64 bytes of memory.

Code conversion

When int data is no longer an integer, or the size exceeds long’s range, it is automatically converted to RAW. In the case of EMbstr, since its implementation is read-only, any modification to an EMbstr object is converted to RAW before modification.

Therefore, whenever you modify an embstr object, the modified object must be raw, whether or not it reaches 39 bytes.

An example is shown below:

The list of

A list is used to store multiple ordered strings, each string called an element. A list can store 2^32-1 elements.

Lists in Redis support both insertion and pop-up, and can obtain elements in a specified position (or range), which can act as arrays, queues, stacks, and so on.

The internal encoding

The internal encoding of a list can be a ziplist or a linkedList.

Double-endian list: consists of a list structure and multiple listNode structures; Typical structure is shown in the figure below:

As can be seen from the figure, the double-ended linked list stores both the header pointer and the tail pointer, and each node has a pointer pointing forward and backward.

Compressed list: Compressed list is developed by Redis in order to save memory. It is a sequential data structure composed of a series of contiguous memory blocks with special encoding (instead of each node being a pointer like a double-endian linked list). The specific structure is relatively complex.

When the number of nodes is small, you can use the compressed list. Compressed lists are not only used to implement lists, but also used to implement hash and ordered lists. It’s very widely used.

Code conversion

A compressed list is used only when the following two conditions are met: The number of elements in the list is less than 512; All string objects in the list are less than 64 bytes.

If one condition is not met, a double-ended list is used; And the encoding can only be transformed from a compressed list to a double-endian list, but not in the opposite direction.

The following figure shows the characteristics of list encoding transformations:

A single string cannot exceed 64 bytes to allocate the length of each node uniformly.

The hash

Hash (as a data structure) is not only one of the five object types provided by Redis (along with strings, lists, collections, and ordered combinations), but also the data structure used by Redis as a key-value database.

The internal encoding

The internal encoding used for inner hashing can be ziplist and Hashtable. The outer hashes of Redis only use HashTable.

Hashtable: A Hashtable consists of one dict structure, two Dictht structures, one dictEntry pointer array (called bucket), and multiple dictEntry structures.

Under normal circumstances (that is, when the Hashtable is not rehash), the relationship between the parts is as follows:

The following sections are introduced from the bottom up:

DictEntry: The dictEntry structure is used to hold key-value pairs. The structure is defined as follows.

The functions of each attribute are as follows:

Key: the key in a key-value pair.
Val: a value in a key-value pair, implemented using a union(common body), which may store a pointer to a value, a 64-bit integer, or an unsigned 64-bit integer.
Next: points to the next dictEntry, which is used to resolve hash conflicts.

On 64-bit systems, a dictEntry object accounts for 24 bytes (key/val/next each accounts for 8 bytes).

Bucket: A bucket is an array, each element of which is a pointer to the dictEntry structure. The size of the bucket array in Redis is calculated as follows: the smallest 2^n larger than dictEntry.

For example, if there are 1000 dictentries, the bucket size is 1024; If there are 1500 dictentries, the bucket size is 2048.

Code conversion

The following figure shows the characteristics of the Redis inner hash encoding transformation:

A collection of

The elements in the collection are unordered, so they cannot be manipulated by indexes; There can be no repetition of elements in a collection. A collection can store up to 2^32-1 elements.

The internal encoding

The internal encoding of a collection can be a collection of integers (intset) or a hashtable (hashtable).

The structure of the set of integers is defined as follows:

Where encoding represents the type of content stored in contents, although contents (which stores elements in the collection) is of type INT8_T.

The value stored is int16_t, int32_t, or int64_t, depending on encoding, and length indicates the number of elements.

Code conversion

The following figure shows the characteristics of the collection encoding transformation:

An ordered set

Ordered sets, like sets, cannot repeat elements; But unlike a set, the elements in an ordered set are ordered. The ordered collection sets a score for each element as the basis for sorting.

The internal encoding

A skip list is an ordered data structure that allows fast access to nodes by maintaining multiple Pointers to other nodes in each node.

Redis skiplist implementation is composed of zskiplist and zskiplistNode: the former is used to store skiplist information (such as head node, tail node, length, etc.), and the latter is used to represent skiplist node, the specific structure is relatively complex.

Code conversion

Compressed lists are used only if both of the following conditions are met: the ordered set has less than 128 elements; All members of an ordered set are less than 64 bytes long.

If one condition is not met, skip lists are used; And the encoding can only be transformed from a compressed list to a skip list, the reverse direction is not possible.

The following figure shows the characteristics of ordered set encoding transformations:

Redis can be seen in major companies at home and abroad, such as Sina, Ali, Tencent, Baidu, Sohu, Youku, Meituan, Xiaomi and so on.

In the era of big data, a good programmer must have big data thinking and understand large and small concepts. Projects using Redis all have huge data volume and visits, which requires us not only to have good code awareness, but also to have better ability to expand projects in the future era of big data.

For more exciting articles, please click “Read the original article”.

Recommended reading

[strongly recommended] carefully arranged | public number article directory

The ten years story sequel of the elder brother of migrant worker: hang piao ten years, withdraw bully all now!

You have to understand the front end separation principle!

Basic principles of ZooKeeper

Illustrate the evolution of distributed architecture!

Why are programmers paid so much?

How to improve programmer (technical person) self productivity?

What are the classic database questions in an interview?

How do you pretend to be a coder making over $500,000 a year?

Interesting | 20 programmers could read it on