This is the fifth day of my participation in the August More text Challenge. For details, see: August More Text Challenge

preface

Another year of the interview season, anyway, no matter you only elementary, intermediate, or advanced, when the interview, first a variety of implementation principles to a set, among which redis the implementation of the principle was asked especially much.

Although we are all crud boy, but if in the interview, you can achieve a ton of output by the underlying principles, to ensure that the interviewer can be a flood of surprise, then the salary doubled is not a dream.

Today we’re going to talk about the simplest and most frequently used string in Redis.

Dynamic string in Redis

We usually know that many languages use the traditional C string representation, but Redis creates its own data type called simple dynamic string, which is SDS.

One of the structures of SDS is also very simple

type SDS struct {
    len int,
    free int,
    char []buf
}
Copy the code

Let’s look at an example here to make it easier to understand

From this example we can see:

Free is 0, which means that this SDS has not allocated any unused space.

Len is 5, which means that this SDS holds a five-byte string.

The buf attribute refers to an array of type char that stores the five letters redis and has a rest.

The difference between SDS and C strings

As we can see from the above implementation, the buF storage is very similar to the C character implementation, so why do Redis adopt such a new implementation?

Get string length at constant level

When you want to get the length of itself, the C character doesn’t have the length information, which means that for the C character, it has to go through the loop, and when it hits a rest, it cuts off, which means the complexity is going to be n.

But in the case of an SDS, because it has length information stored in its structure, it doesn’t have to do an traversal to get its own length, it can get its own length directly, which is 1.

Buffer overflow

For C character, it is not the length of the record itself, so at the time of initialization, assuming that has been allocated enough memory space to store the characters, but if the assumption fails, can appear buffer overflow condition, when this happens, it will cause the character is not intact.

The SDS does not have this problem at all, because it checks to see if it has enough space and expands when it does, so there is no buffer overflow.

Reduced memory reallocation when modifying characters

As mentioned earlier, C characters do not record their own length, so every time you modify the underlying array of characters, you need to do a memory reallocation, which is very costly for performance.

But with SDS, there is no such problem

Space preallocation

First of all, when expanding the storage space, the SDS will not only modify the necessary space, but also allocate a part of the unused space for the SDS, which we see as free, so as to avoid the problem of reallocating space every time the character set is changed.

Lazy space release

Similarly, when a long character set is reduced, the SDS does not recycle the space directly, but instead puts it in free, which can be used in the next time. This has the advantage of reducing the problem of memory re-partitioning.

conclusion

Overall, the implementation of SDS is not complicated, and because it is partially compatible with C string functions, it is used as the underlying implementation of a string in Redis.