
Preface

Redis is middleware that is widely used in production. It supports rich data structures and has strong read and write performance: throughput can exceed 100,000 operations per second.

The String type is one of the most frequently used data structures, and it is the one analyzed and summarized in this article. The analysis is based on Redis 5.0.

1. Basic use

set key value [EX seconds] [PX milliseconds] [NX|XX]       

1. set is the command, key is the key name, and value is the value to be stored

2. EX specifies the expiration time in seconds, and PX specifies the expiration time in milliseconds

3. NX: the set succeeds only when the key does not exist

4. XX: the set succeeds only when the key already exists

Summary: the set command can set an expiration time and require that the key not exist in a single call, so one atomic command is enough to acquire a distributed lock. These options have been available since Redis 2.6.12. Before that, setting the key and setting its expiration time required two separate commands, which made atomicity hard to guarantee.

2. Use scenarios

1. Hot data caching and distributed sessions

2. Distributed locks with SETNX

3. Counters with INCR

4. Global IDs with INCR

5. Rate limiting with INCR

6. Bit operations: bitmaps, for example marking each user as online or offline with a 0/1 bit for statistics
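
To make the bitmap scenario concrete, here is a minimal sketch of how a 0/1 mark per user can be stored and counted in a byte array. It only imitates the bit layout of SETBIT and the result of BITCOUNT on a fixed-size buffer; the sketch_ function names are hypothetical, and real Redis grows the string automatically instead of rejecting large offsets.

```c
#include <stddef.h>
#include <stdint.h>

/* Set or clear the bit for user `id`.
 * Returns the previous bit value, or -1 if `id` is outside the buffer.
 * Bit 0 is the most significant bit of the first byte, as in SETBIT. */
int sketch_setbit(uint8_t *bitmap, size_t nbytes, size_t id, int online) {
    size_t byte = id >> 3;
    if (byte >= nbytes) return -1;
    uint8_t mask = (uint8_t)(1u << (7 - (id & 7)));
    int old = (bitmap[byte] & mask) != 0;
    if (online)
        bitmap[byte] |= mask;
    else
        bitmap[byte] &= (uint8_t)~mask;
    return old;
}

/* Count the set bits, i.e. how many users are currently marked online. */
size_t sketch_bitcount(const uint8_t *bitmap, size_t nbytes) {
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++)
        for (uint8_t b = bitmap[i]; b; b &= (uint8_t)(b - 1))
            count++;  /* each iteration clears the lowest set bit */
    return count;
}
```

One bit per user means that ten million users fit in roughly 1.2 MB, which is why this layout suits online statistics.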

3. Data types supported for storage

Integers, strings, and floating-point numbers

4. Different encoding types

A String value is stored with one of three encodings: int, embstr, or raw.

5. String storage principle

In Redis, every value is wrapped in a redisObject struct:

    typedef struct redisObject {
        unsigned type:4;        /* object type: string, hash, zset, etc. */
        unsigned encoding:4;    /* encoding of the value: int, embstr, raw, etc. */
        unsigned lru:LRU_BITS;  /* LRU time (relative to global lru_clock) or
                                 * LFU data (least significant 8 bits frequency
                                 * and most significant 16 bits access time). */
        int refcount;           /* reference count */
        void *ptr;              /* points to the actual data structure */
    } robj;

For strings, Redis defines its own data structure, the Simple Dynamic String (SDS), to store string values.

In the source code, SDS has several header structures (sdshdr5, sdshdr8, sdshdr16, sdshdr32, sdshdr64), each sized for strings of a different maximum length. Their fields are:

len: the length currently in use

alloc: the total size allocated for the data, excluding the header and the terminator

flags: the header type

buf[]: the actual data
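
As a rough illustration of these fields, here is a minimal sketch modeled on the 8-bit header (sdshdr8). The sketch_ names are hypothetical and exist only for this example; it handles strings of at most 255 bytes, whereas real SDS picks a header size that fits the string.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified header modeled on sdshdr8. Packed, so it occupies 3 bytes. */
struct __attribute__((__packed__)) sketch_sdshdr8 {
    uint8_t len;          /* bytes currently in use */
    uint8_t alloc;        /* bytes allocated for buf, excluding the terminator */
    unsigned char flags;  /* header type */
    char buf[];           /* the actual data, followed by '\0' */
};

/* Create a string from `initlen` bytes of `init`.
 * Returns a pointer to buf (not to the header), or NULL on failure. */
char *sketch_sdsnew(const void *init, size_t initlen) {
    if (initlen > UINT8_MAX) return NULL;  /* the sketch supports one header size */
    struct sketch_sdshdr8 *sh = malloc(sizeof(*sh) + initlen + 1);
    if (sh == NULL) return NULL;
    sh->len = (uint8_t)initlen;
    sh->alloc = (uint8_t)initlen;
    sh->flags = 1;  /* value of SDS_TYPE_8 in sds.h */
    if (initlen) memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    return sh->buf;
}

/* O(1) length: step back from buf to the header and read len. */
size_t sketch_sdslen(const char *s) {
    const struct sketch_sdshdr8 *sh =
        (const struct sketch_sdshdr8 *)(s - sizeof(struct sketch_sdshdr8));
    return sh->len;
}

void sketch_sdsfree(char *s) {
    if (s) free(s - sizeof(struct sketch_sdshdr8));
}
```

Because callers hold a pointer to buf, the result can still be passed to ordinary C string functions, while the length comes from the header rather than from scanning for '\0'.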

6. Storage differences among the three encodings

1. embstr: the redisObject and the SDS live in one contiguous block of memory, so creation needs only one memory allocation and destruction only one free, and the contiguous layout makes lookups convenient

2. raw: the redisObject and the SDS are in separate blocks of memory, so creation needs two allocations and destruction needs two frees

3. Because of the embstr layout, growing the string would require reallocating the redisObject and the SDS together. embstr-encoded data is therefore treated as immutable and read-only.

7. When does the encoding change?

1. When an int-encoded value is modified so that it is no longer an integer, it is converted to raw

2. If an integer value exceeds 2^63-1, it no longer fits in a long and is stored as embstr instead of int

3. When an embstr value exceeds 44 bytes, it is converted to raw
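
The rules above can be sketched as a small decision function. This is a simplified illustration, not the Redis implementation: the sketch_ names are hypothetical, and the integer check is approximated with strtol plus a round-trip comparison. The 44-byte limit is consistent with fitting a 16-byte redisObject, a 3-byte SDS header, the data, and a 1-byte terminator into a 64-byte allocation (64 - 16 - 3 - 1 = 44).

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_EMBSTR_LIMIT 44  /* largest length still stored as embstr */

typedef enum { SKETCH_ENC_INT, SKETCH_ENC_EMBSTR, SKETCH_ENC_RAW } sketch_encoding;

/* Returns 1 if the bytes are exactly the decimal form of a value that fits
 * in a long: no leading zeros, no '+', no spaces, no overflow. */
static int sketch_is_long(const char *s, size_t len) {
    char in[32], out[32];
    if (len == 0 || len > 20) return 0;
    memcpy(in, s, len);
    in[len] = '\0';
    long v = strtol(in, NULL, 10);
    /* Printing the parsed value must reproduce the input byte for byte. */
    int n = snprintf(out, sizeof(out), "%ld", v);
    return n == (int)len && memcmp(in, out, len) == 0;
}

/* Pick the encoding a freshly set string value would get. */
sketch_encoding sketch_choose_encoding(const char *s, size_t len) {
    if (sketch_is_long(s, len)) return SKETCH_ENC_INT;
    if (len <= SKETCH_EMBSTR_LIMIT) return SKETCH_ENC_EMBSTR;
    return SKETCH_ENC_RAW;
}
```

On a running server, OBJECT ENCODING key reports the encoding that was actually chosen, which is a convenient way to check these boundaries.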

8. Advantages of the SDS data structure

1. Binary safe: it can store images, integers, floating-point numbers, and other binary data

2. The three String encodings make full use of memory and improve memory utilization

  • int: stores a long integer in 8 bytes, up to 2^63-1
  • embstr: an SDS (Simple Dynamic String) whose memory is contiguous with the redisObject; it is read-only and is converted to raw as soon as it is modified
  • raw: an SDS that stores strings of more than 44 bytes

3. No need to worry about buffer overflow: SDS expands its capacity automatically

4. Getting the string length is O(1), because the len attribute is stored

5. Space preallocation and lazy space release reduce the number of memory allocations

6. The end of the string is determined by the len attribute rather than by '\0'

9. Why not use a C character array?

1. Memory must be allocated in advance, and writes may overflow the buffer

2. Getting the length requires traversing the array, with time complexity O(n)

3. Whenever the length of the character array changes, memory must be reallocated

4. A C character array uses '\0' to mark the end of the string, so it is not binary safe: it cannot store images, videos, or other data that may contain zero bytes.
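
The last point is easy to demonstrate. The sketch below compares a C-string view with a length-tracked view of the first 12 bytes of a PNG file (the 8-byte signature followed by a 4-byte chunk length). The sketch_ names are hypothetical and used only for this example.

```c
#include <stddef.h>
#include <string.h>

/* PNG signature (8 bytes) followed by the chunk length 0x0000000D. */
static const unsigned char sketch_png_head[12] = {
    0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 0x00, 0x00, 0x00, 0x0D
};

/* C-string view: scans until the first zero byte, so it is O(n)
 * and silently truncates binary data. */
size_t sketch_cstr_length(const unsigned char *data) {
    return strlen((const char *)data);
}

/* Length-tracked view: the length travels with the bytes,
 * so reading it is O(1) and zero bytes are ordinary data. */
typedef struct {
    const unsigned char *data;
    size_t len;
} sketch_blob;

sketch_blob sketch_blob_of(const unsigned char *data, size_t len) {
    sketch_blob b = { data, len };
    return b;
}
```

Here sketch_cstr_length(sketch_png_head) stops at the first zero byte and reports 8, while a blob built with sketch_blob_of(sketch_png_head, sizeof sketch_png_head) keeps all 12 bytes.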

10. Memory preallocation

According to the source code, the expansion policy is as follows: when the new string length is less than SDS_MAX_PREALLOC (1 MB), the allocated space is doubled, which reserves 100% redundant space. Once the new length reaches SDS_MAX_PREALLOC, each expansion reserves only an additional SDS_MAX_PREALLOC bytes, to avoid the waste that doubling a large string would cause.
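
The policy can be written down in a few lines. This is a sketch of the size calculation only, with a hypothetical function name; in the source this logic lives inside sdsMakeRoomFor, which also reallocates the buffer and may switch header types.

```c
#include <stddef.h>

#define SKETCH_SDS_MAX_PREALLOC (1024 * 1024)  /* 1 MB, the value of SDS_MAX_PREALLOC */

/* Size to allocate for the data when a string of length `len`
 * needs room for `addlen` more bytes. */
size_t sketch_sds_grow(size_t len, size_t addlen) {
    size_t newlen = len + addlen;
    if (newlen < SKETCH_SDS_MAX_PREALLOC)
        newlen *= 2;                        /* small string: double it */
    else
        newlen += SKETCH_SDS_MAX_PREALLOC;  /* large string: add a fixed 1 MB */
    return newlen;
}
```

For example, appending 5 bytes to a 5-byte string allocates 20 bytes, so the next few appends need no further allocation.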

11. Lazy space release

Lazy space release optimizes operations that shorten an SDS string: when an SDS API call shortens the string, the program does not immediately reallocate memory to reclaim the bytes that are no longer used. Instead it keeps them as free space for future use. Older versions of SDS record this amount in a free attribute; in Redis 5.0 the free space is the difference between alloc and len.

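    void sdsclear(sds s) {
        /* set the used length to 0; the allocated buffer is kept */
        sdssetlen(s, 0);
        /* set the first character to the terminator */
        s[0] = '\0';
    }

Actually releasing the free space (this version of the function uses the older sdshdr layout, which has a free field):

    sds sdsRemoveFreeSpace(sds s) {
        struct sdshdr *sh;

        sh = (void*) (s-(sizeof(struct sdshdr)));
        /* shrink the allocation to the header, the used bytes, and the terminator */
        sh = zrealloc(sh, sizeof(struct sdshdr)+sh->len+1);
        sh->free = 0;
        return sh->buf;
    }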
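
The two functions above depend on Redis internals, so here is a self-contained sketch of the same idea: clearing keeps the allocation, and a separate call gives it back. The sketch_str type and all sketch_ functions are hypothetical, and plain malloc and realloc stand in for the Redis allocator.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string used only for this sketch. */
typedef struct {
    size_t len;    /* bytes in use */
    size_t alloc;  /* bytes allocated for buf, excluding the terminator */
    char *buf;
} sketch_str;

/* Returns 0 on success, -1 on allocation failure. */
int sketch_init(sketch_str *s, const char *init) {
    size_t n = strlen(init);
    s->buf = malloc(n + 1);
    if (s->buf == NULL) return -1;
    memcpy(s->buf, init, n + 1);
    s->len = s->alloc = n;
    return 0;
}

/* Lazy release: drop the contents but keep the allocation for reuse. */
void sketch_clear(sketch_str *s) {
    s->len = 0;
    s->buf[0] = '\0';
}

/* Free space that later appends can use without a new allocation. */
size_t sketch_avail(const sketch_str *s) {
    return s->alloc - s->len;
}

/* Real release: shrink the allocation to exactly len + 1 bytes. */
int sketch_remove_free_space(sketch_str *s) {
    char *p = realloc(s->buf, s->len + 1);
    if (p == NULL) return -1;
    s->buf = p;
    s->alloc = s->len;
    return 0;
}

void sketch_free(sketch_str *s) {
    free(s->buf);
    s->buf = NULL;
    s->len = s->alloc = 0;
}
```

After sketch_clear on an 11-byte string, sketch_avail reports 11 bytes ready for reuse; only sketch_remove_free_space returns them to the allocator.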

To summarize: different header structures store strings of different lengths, and different encoding types store values of different kinds and sizes.

From the storage structures and encoding types to the memory allocation and reclamation strategies such as space preallocation and lazy space release, the authors made many design decisions for the sake of performance. This helps explain why Redis performs so well, and the same meticulous attention to performance is worth bringing to our own work.