Redis Lastest 6.0.14 data type and encoding

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} robj;

// redisObject::type
#define OBJ_STRING	0
#define OBJ_LIST  	1
#define OBJ_SET   	2
#define OBJ_ZSET  	3
#define OBJ_HASH  	4
#define OBJ_MODULE	5
#define OBJ_STREAM	6

// redisObject::encoding
#define OBJ_ENCODING_RAW 		0
#define OBJ_ENCODING_INT 		1
#define OBJ_ENCODING_HT 		2
#define OBJ_ENCODING_ZIPMAP		3
#define OBJ_ENCODING_LINKEDLIST	4
#define OBJ_ENCODING_ZIPLIST 	5
#define OBJ_ENCODING_INTSET 	6
#define OBJ_ENCODING_SKIPLIST 	7
#define OBJ_ENCODING_EMBSTR 	8
#define OBJ_ENCODING_QUICKLIST 	9
#define OBJ_ENCODING_STREAM		10
Copy the code

Type and Encoding, the relation between the reference object. C: objectComputeSize (), OBJ_ENCODING_LINKEDLIST is old OBJ_LIST coded, is no longer used.

OBJ_STRING
- OBJ_ENCODING_INT
  - Save integer values that can be represented by long.
- OBJ_ENCODING_EMBSTR
  - Saves short strings of less than or equal to 44 bytes (OBJ_ENCODING_EMBSTR_SIZE_LIMIT).
  - Memory is allocated once and released once, and the memory is continuous.
  - When the stored data is 44 bytes, robj + SDSHDR is just 64 bytes.
- OBJ_ENCODING_RAW
  - Memory is allocated for two times and released for two times. The two memory segments are not consecutive.
  - Why use RAW instead of EMBSTR for long strings? Avoid big key problems.
OBJ_LIST
- ~~OBJ_ENCODING_LINKEDLIST~~
  - Advantages: High efficiency of push and pop at both ends.
  - Disadvantages: (1) Each node maintains two Pointers before and after, large memory overhead; (2) The addresses between nodes are not continuous and the memory fragments are large.
- ~~OBJ_ENCODING_ZIPLIST~~
  - Advantages: A continuous block of memory, less memory fragmentation.
  - Disadvantages: Data changes can cause cascading updates (prelen), so only suitable for storing small amounts of data.
  - ZipList is no longer used to encode lists in Redis 6.0.14.
- OBJ_ENCODING_QUICKLIST
  - The combination of linkedList and zipList increases memory utilization and avoids cascading updates of large amounts of data.
OBJ_SET
- OBJ_ENCODING_HT
- OBJ_ENCODING_INTSET
  - Set elements are integer values and intset encoding is used when the number of elements is less than 512 (set-max-intset-entries).
OBJ_ZSET
- OBJ_ENCODING_ZIPLIST
  - Two adjacent entries save a pair of members :score. The former member and the latter score are in ascending order as a whole.
- OBJ_ENCODING_SKIPLIST
  - Zset uses both dict and skipList. Dict stores member->score mappings.
OBJ_HASH
- OBJ_ENCODING_ZIPLIST
  - The two adjacent entries save a pair of subkeys: values. The former subkey and the latter value.
- OBJ_ENCODING_HT

Specification of the underlying code

OBJ_ENCODING_INT

o->encoding = OBJ_ENCODING_INT;
o->ptr = (void((*)long)value);
Copy the code

OBJ_ENCODING_RAW

o->encoding = OBJ_ENCODING_RAW;
o->ptr = (void*)sdsnew("hello");
Copy the code

OBJ_ENCODING_EMBSTR

Strlen (“hello”) + 1, * PTR refers to the string stored in buf[]. The EMbSTR encoding is suitable for short strings and only allocates memory once. The RAW encoding robj and SDSHDR are not next to each other and need to allocate memory twice.

Alloc represents the amount of space allocated and len represents the amount of space used.

robj *o = zmalloc(sizeof(robj) + sizeof(sdshdr8) + strlen("hello") + 1);
sdshdr8 *sh = (void*)(o+1);

o->encoding = OBJ_ENCODING_EMBSTR;
p->ptr = sh+1;
Copy the code

OBJ_ENCODING_ZIPLIST

unsigned char *ziplistNew(); Zlbytes indicates the size of memory occupied by the entire Ziplist, zltail indicates the offset of the last entry, zllen indicates the number of entries, and zltail indicates the number of entries. Zlend represents the end of ziplist.

In the initial state, zlbytes = 11, zLTAIL = 10, zllEN = 0, zlend = 255.

Each entry can hold a byte array or an integer value. Prelen records the length of the previous entry, which can be 1 byte or 5 bytes. If there are 5 bytes, the first byte is set to 256, and the last 4 bytes are used to store the length. Prelen is equivalent to the prev pointer.

Encoding records what data holds (byte array or integer? How many places?) It can be 1 byte, 2 byte, or 5 byte.

OBJ_ENCODING_QUICKLIST

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count;					// The number of all entries in all listNodes
    unsigned long len;      				// The number of all listNodes
    int fill:QL_FILL_BITS;    				// redis.conf:list-max-ziplist-size
    unsigned int compress:QL_COMP_BITS; 	// redis.conf:list-compress-depth
    unsigned int bookmark_count:QL_BM_BITS;
    quicklistBookmark bookmarks[];
} quicklist;

typedef struct quicklistNode {
    struct quicklistNode *prev;
    struct quicklistNode *next;
    unsigned char *zl;						// There is no compression to zipList, there is compression to quickListLZF
    unsigned int sz;						// The size of zipList, compression is the size of original zipList
    unsigned int count:16;					// Number of entries in zipList
    unsigned int encoding:2;				// 1 indicates no compression. 2 indicates LZF compression
    unsigned int container:2;				// 1 indicates no container. 2 indicates zipList container
    unsigned int recompress:1;				// Temporarily unzipped quickListNode will be marked as 1
    unsigned int attempted_compress:1;		// Data is too small to be compressed
    unsigned int extra:10;					// Reserved field, not yet used
} quicklistNode;

typedef struct quicklistLZF {
    unsigned int sz;						// Size of compressed
    char compressed[];						// A compressed array of bytes
} quicklistLZF;
Copy the code

OBJ_ENCODING_INTSET

typedef struct intset {
    uint32_t encoding;	// INTSET_ENC_INT16/32/64
    uint32_t length;	// contents Specifies the length of the byte array
    int8_t contents[];	// Stores int16/int32/int64, depending on encoding
} intset;
Copy the code

Intset Unordered and non-repetitive stores int16, int32, and int64 data sets. The encoding of each element is consistent. Int16 is not used for small integers, but INT64 is used for large integers.

OBJ_ENCODING_HT

typedef struct dict {
    dictType* type;    			// Type specific functions
    void* privdata;    			// Function arguments
    dictht ht[2];      			/ / a hash table
    int rehashidx;     			// Rehash progress. -1 indicates no progress
    int iterators;     			// The number of security iterators running
} dict;

typedef struct dictht {
    dictEntry** table;        	// Hash table array
    unsigned long size;       	// Hash table size
    unsigned long sizemask;   	// size-1
    unsigned long used;       	// Number of existing nodes
} dictht;
Copy the code

HashTable in Redis uses zip (header) to resolve Hash collisions. Sizemask always equals size-1, use dictHashKey(ht, key) & HT ->sizemask; Calculates the index of dictEntry on the table.

OBJ_ENCODING_SKIPLIST

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;	// Number of nodes, not counting head
    int level;				// The highest level
} zskiplist;

typedef struct zskiplistNode {
    sds ele;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;	/ / span
    } level[];
} zskiplistNode;
Copy the code

Why do I need head nodes with ele and score empty? Data changes are indexed from next at the top level, next at the same level if they are small, and head at the next level if they are large.