These notes list the basics of Redis, with reference to “Redis Development and Operation” by Fu Lei and Zhang Yijun.

Redis has three basic features:

  • In-memory database
  • High concurrency in a single process
  • I/O optimization

These three features are closely related.

A single-process database can support high concurrency because it is an in-memory database: memory operations respond far faster than a hard drive, so there is no need to switch between multiple processes while waiting on disk reads and writes. Once local speed is no longer the bottleneck, what matters is the network, that is, I/O optimization. Redis handles this with epoll-based I/O multiplexing, so it wastes little time waiting on the network. With the I/O problem solved, a big advantage of single threading is that there is no thread switching.

What about the drawbacks? The most serious one is that if a command takes too long to execute, every subsequent command is blocked behind it, which seriously hurts concurrency.
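The blocking effect can be seen with a small model. This is only a sketch in plain Python, not Redis code: `completion_times` is a made-up helper that assumes one thread runs the commands strictly in arrival order, and the durations are invented numbers in milliseconds.

```python
def completion_times(durations_ms):
    """Return the time at which each command finishes when one thread
    executes them strictly in arrival order."""
    finished = []
    clock = 0
    for duration in durations_ms:
        clock += duration      # the single thread is busy for the whole command
        finished.append(clock)
    return finished


print(completion_times([1, 1, 1, 1]))    # [1, 2, 3, 4]
print(completion_times([1, 500, 1, 1]))  # [1, 501, 502, 503]
```

One 500 ms command pushes every command queued behind it back by the same 500 ms, even though those commands are fast themselves.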

Common commands

set (ex: sets an expiration time in seconds; xx: takes effect only when the key already exists, like an update; nx: takes effect only when the key does not exist, like an insert)

get

Multi-key operations (mget, mset): reduce I/O by handling several keys in one round trip.
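The conditional forms of set can be modeled on a plain dictionary. Redis names the "only if the key does not exist" option nx and the "only if the key already exists" option xx. The `set_key` function below is a hypothetical stand-in written for this note, not a client API.

```python
def set_key(store, key, value, nx=False, xx=False):
    """Model SET with NX/XX on a dict; return True if the value was written."""
    exists = key in store
    if nx and exists:        # nx: insert only
        return False
    if xx and not exists:    # xx: update only
        return False
    store[key] = value
    return True


store = {}
print(set_key(store, "k", "v1", xx=True))  # False: nothing to update yet
print(set_key(store, "k", "v1", nx=True))  # True: inserted
print(set_key(store, "k", "v2", nx=True))  # False: key already exists
print(set_key(store, "k", "v3", xx=True))  # True: updated
print(store["k"])                          # v3
```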

Data structures

| Type | Introduction | Features | Scenarios |
|------|--------------|----------|-----------|
| String | Binary-safe | Can contain any data, such as JPG images or serialized objects | - |
| Hash (dictionary) | A collection of key-value pairs, like the map type in programming languages | Suitable for storing objects; a single property can be modified on its own, like updating one column in a database | Storing, reading, and modifying user attributes |
| List | Linked list (doubly linked) | Fast insertion and deletion; provides an API for operating on elements | Latest-news feeds; message queues |
| Set | Implemented with a hash table; elements are unique | Add, delete, and lookup are O(1); supports intersection, union, and difference | Mutual friends; using uniqueness to count all IP addresses that visit a site |
| Sorted set | Adds a weight parameter, score, to each element of a set; elements are ordered by score | Data is already sorted as it is inserted | Leaderboards; weighted message queues |

Encoding

Encoding is how these data structures are actually stored. There are only a few storage options to choose from: sequential storage (arrays), linked lists, hashes, and graphs (usually trees). Each has advantages and disadvantages. The strength of Redis is that as the data changes (the size of a single value, the number of elements), it automatically switches encoding to store the data more efficiently.

ziplist: compact, contiguous storage. Most data structures use ziplist while the amount of data is small

hashtable: used once the data grows, so that reads and writes stay O(1)
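The switch from a compact encoding to a hash table can be sketched with a toy class. This is not how Redis lays out memory: `SmallHash` is a made-up name, and the threshold of 4 entries is invented for the demo (Redis makes the decision from configured limits on entry count and value size).

```python
class SmallHash:
    """Toy hash: a compact list of pairs while small, a dict once it grows."""

    def __init__(self, max_compact_entries=4):   # hypothetical threshold
        self.max_compact_entries = max_compact_entries
        self.encoding = "ziplist"
        self._pairs = []     # compact form: linear scan, little overhead
        self._table = None   # hashtable form: O(1) lookups

    def hset(self, field, value):
        if self.encoding == "hashtable":
            self._table[field] = value
            return
        for i, (f, _) in enumerate(self._pairs):
            if f == field:                       # overwrite an existing field
                self._pairs[i] = (field, value)
                return
        self._pairs.append((field, value))
        if len(self._pairs) > self.max_compact_entries:
            self._table = dict(self._pairs)      # one-way conversion
            self._pairs = []
            self.encoding = "hashtable"

    def hget(self, field):
        if self.encoding == "hashtable":
            return self._table.get(field)
        for f, v in self._pairs:
            if f == field:
                return v
        return None


h = SmallHash()
for i in range(4):
    h.hset(f"f{i}", i)
print(h.encoding)    # ziplist
h.hset("f4", 4)
print(h.encoding)    # hashtable
print(h.hget("f2"))  # 2
```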

String

set key value

The most common form

Hash

key –> field: value

The value stored at a key is itself a set of field: value pairs

  • Commands: the common commands prefixed with h
hset
hexists

Usage scenario: storing records the way a row is stored in a standard relational database

List

  • Similar to Python lists: ordered, and elements can be accessed by index. With lpush, lpop, rpush, and rpop, a list can also be used as a queue
  • A list can hold up to 2^32-1 elements

Commands

rpush key value
lpush key value
lrange key start end
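Since the text compares Redis lists to Python lists, the commands can be mirrored with `collections.deque`. This is only an analogy: the `lrange` helper is written for this note and assumes the Redis convention that both ends of the range are inclusive and negative indexes count from the tail.

```python
from collections import deque


def lrange(items, start, end):
    """Inclusive range in the style of LRANGE; negative indexes count from the tail."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = n + end
    if end < 0 or start > end:
        return []
    return list(items)[start:end + 1]


queue = deque()
queue.append("a")        # rpush key a
queue.append("b")        # rpush key b
queue.appendleft("z")    # lpush key z
print(lrange(queue, 0, -1))  # ['z', 'a', 'b']
print(lrange(queue, 0, 1))   # ['z', 'a']
print(queue.popleft())       # lpop key  -> z
print(queue.pop())           # rpop key  -> b
```

Pushing on one side and popping from the other (rpush with lpop, or lpush with rpop) gives first-in, first-out queue behaviour.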

Set

Unordered, with unique elements

Commands

sadd key element

Usage scenarios

Tags
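A tag system maps naturally onto sets. The sketch below uses Python sets as a stand-in; the `tags` dictionary, the `sadd` helper, and the key names are hypothetical and only imitate the behaviour of the real command (which returns the number of members that were newly added).

```python
tags = {}  # key -> set of members, a stand-in for Redis set keys


def sadd(key, *members):
    """Add members to the set at key; return how many were new."""
    current = tags.setdefault(key, set())
    before = len(current)
    current.update(members)
    return len(current) - before


sadd("user:1:tags", "redis", "python", "linux")
sadd("user:2:tags", "redis", "go", "linux")
print(sadd("user:1:tags", "redis"))  # 0, duplicates are ignored

# Tags the two users share, the same idea as a set intersection in Redis
common = tags["user:1:tags"] & tags["user:2:tags"]
print(sorted(common))  # ['linux', 'redis']
```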

Sorted set

Compared with a set, each element carries an additional score, and the elements are ordered by that score

Usage scenario: ranking systems
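A ranking system boils down to "member plus score, read back in score order". The sketch below models that with a dictionary and `sorted`; the helper names borrow from the Redis commands, but this is plain Python that re-sorts on every read, whereas Redis keeps the elements ordered as they are inserted.

```python
scores = {}  # member -> score


def zadd(member, score):
    """Insert a member or update its score."""
    scores[member] = score


def zrevrange(start, end):
    """Members from highest to lowest score, inclusive range,
    non-negative indexes only in this sketch."""
    ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return ranked[start:end + 1]


zadd("alice", 50)
zadd("bob", 80)
zadd("carol", 65)
zadd("alice", 90)       # updating a score moves the member in the ranking
print(zrevrange(0, 1))  # [('alice', 90), ('bob', 80)]
```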

HyperLogLog

Redis HyperLogLog is an algorithm for cardinality estimation. Its advantage is that even when the number or volume of input elements is extremely large, the space needed to compute the cardinality stays fixed and small.

In Redis, each HyperLogLog key takes only 12 KB of memory to estimate the cardinality of close to 2^64 distinct elements. This is in stark contrast to sets, where the more elements there are, the more memory is consumed. The price is a degree of inaccuracy.
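To make the fixed-memory idea concrete, here is a minimal HyperLogLog sketch in pure Python. It is not the Redis implementation: `TinyHLL` is a made-up class, it uses 4096 registers and SHA-1 as the hash for simplicity, and it applies only the basic small-range correction from the original algorithm.

```python
import hashlib
import math


class TinyHLL:
    """Minimal HyperLogLog: memory is fixed by the number of registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p                  # 4096 registers for p=12
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")           # 64-bit hash value
        idx = h >> (64 - self.p)                        # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)           # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1    # position of the leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)    # small-range correction
        return raw


hll = TinyHLL()
for i in range(50_000):
    hll.add(f"user:{i}")
for i in range(50_000):        # adding the same elements again changes nothing
    hll.add(f"user:{i}")
print(round(hll.count()), "estimated from", len(hll.registers), "registers")
```

The estimate lands near 50,000 but not exactly on it, which is the inaccuracy mentioned above; the register array never grows no matter how many elements are added.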

Common features

Slow query log

slowlog-log-slower-than = a positive number (in microseconds)

Records every command whose execution time exceeds this threshold
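The recording rule is easy to imitate. In this sketch, `run_logged`, the 10,000 microsecond threshold, and the cap of 128 entries are all assumptions chosen for illustration; only the idea (time the execution, keep entries above the threshold) comes from the text.

```python
import time
from collections import deque

SLOWER_THAN_US = 10_000        # threshold in microseconds (assumed value)
slowlog = deque(maxlen=128)    # bounded log, newest entry first (assumed cap)


def run_logged(name, func, *args):
    """Run func and record it in slowlog if it took longer than the threshold."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_us = int((time.perf_counter() - start) * 1_000_000)
    if elapsed_us > SLOWER_THAN_US:
        slowlog.appendleft((name, elapsed_us))
    return result


run_logged("fast", lambda: 1 + 1)
run_logged("slow", time.sleep, 0.05)   # 50 ms, well above the threshold
print([name for name, _ in slowlog])   # ['slow']
```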

Pipeline

Commands are packed together and sent to the server in one batch, which reduces the cost of network latency
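A back-of-the-envelope model shows why batching helps. The function and all the numbers below are invented for illustration (a 1 ms round trip and 10 µs per command); real figures depend on the network and the commands.

```python
def total_time_us(n_commands, rtt_us, exec_us, batch_size=1):
    """Estimated wall time when commands are sent in batches of batch_size:
    one network round trip per batch plus the execution time of every command."""
    round_trips = -(-n_commands // batch_size)   # ceiling division
    return round_trips * rtt_us + n_commands * exec_us


# 1000 commands, 1 ms round trip, 10 us execution each (made-up numbers)
print(total_time_us(1000, rtt_us=1000, exec_us=10))                  # 1010000
print(total_time_us(1000, rtt_us=1000, exec_us=10, batch_size=100))  # 20000
```

Under these assumptions almost all of the time without a pipeline is spent on round trips, not on executing commands.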

Lua transaction

Lua scripts are needed because native transactions do not provide full atomicity. A single command is of course atomic (guaranteed by single-threaded execution), but if one of several commands in a transaction fails, the database does not roll back the others.

With Lua scripts, Redis ensures that script execution is atomic.

  • Application: when Redis is used as a distributed lock, unlocking takes two steps (check who holds the lock, then delete the key), so a Lua script is needed to run them as one atomic operation.
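The unlock logic can be sketched as follows. The Lua text in `UNLOCK_SCRIPT` is the commonly used compare-and-delete script, shown here only as a string for reference; the Python functions are a local model on a dictionary (hypothetical names, no Redis connection) that shows what the script has to guarantee.

```python
import uuid

# Compare-and-delete script as typically sent to the server (not executed here)
UNLOCK_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

locks = {}  # key -> owner token, a stand-in for string keys in Redis


def acquire(key):
    """Like a set with nx: succeeds only if nobody holds the lock."""
    if key in locks:
        return None
    token = uuid.uuid4().hex
    locks[key] = token
    return token


def release(key, token):
    """Check the owner and delete in one step, which is what the script ensures."""
    if locks.get(key) == token:
        del locks[key]
        return True
    return False


token = acquire("lock:job")
print(acquire("lock:job"))                # None, already held
print(release("lock:job", "not-my-token"))  # False, someone else's lock stays
print(release("lock:job", token))         # True
```

If the check and the delete were sent as two separate commands, the lock could expire and be taken by another client in between, and the delete would then remove that client's lock.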

Persistence

RDB

The dataset is compressed and written to disk, which can block Redis.

Pros and cons

  • Advantage: because of compression, the data on disk is smaller than the same data in memory.

  • Disadvantage: every persistence run blocks for a while, so it cannot be done every second.

Operations

save: persists in the main process and blocks the database while doing so; no longer recommended

bgsave: a forked child process does the persistence, so Redis is blocked only during the fork itself.

fork copies the parent's page table, and the page table of a Redis process is often large, which makes fork slow

Copy-on-write problem: if a new command writes to Redis (the parent process) while the child is still persisting, copy-on-write is triggered and the operating system has to copy the affected memory pages, which can cause a long block.
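The fork-based snapshot can be demonstrated with `os.fork` on a Unix-like system (it is not available on Windows). This is a sketch of the mechanism only: `snapshot_after_fork` is a made-up helper that sends the child's view back through a pipe instead of writing an RDB file.

```python
import os


def snapshot_after_fork(data):
    """Fork a child that serializes data as it was at fork time,
    while the parent keeps modifying its own copy."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: sees memory exactly as it was when fork() returned
        os.close(read_fd)
        os.write(write_fd, repr(sorted(data.items())).encode())
        os._exit(0)
    os.close(write_fd)
    data["a"] = 999          # parent keeps serving writes; the page is copied
    with os.fdopen(read_fd, "rb") as pipe:
        saved = pipe.read().decode()
    os.waitpid(pid, 0)
    return saved


data = {"a": 1, "b": 2}
print(snapshot_after_fork(data))  # [('a', 1), ('b', 2)]
print(data["a"])                  # 999
```

The child's snapshot still shows the old value because the parent's write went to a private copy of the page, which is the copy-on-write cost described above.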

AOF: append only file

Principle: record every write command executed since Redis started

Each command is first appended to the AOF buffer; the buffer is then written to the file system according to the configured policy.

Pros and cons

  • Advantage: persistence at one-second granularity
  • Disadvantage:

Operations

  • Policy for syncing the AOF buffer to the file system:

| Configuration | Effect |
|---------------|--------|
| always | fsync is called after every write to the buffer and blocks until the disk write completes. Very bad for high concurrency |
| everysec | write is called after every write to the buffer. Because there is no telling when the operating system will sync on its own, a dedicated thread calls fsync every second |
| no | write is called after every write to the buffer. The operating system decides when to sync |

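  • Rewrite

Triggered manually or automatically. Rewriting eliminates the redundant parts, so the AOF file becomes smaller.

What gets removed or compacted: expired data, commands that no longer have any effect, and multiple commands that can be merged into one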
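The effect of a rewrite can be sketched by compacting a command log. This is a simplification with a made-up `rewrite` function that only understands set and del: it replays the old log to find the final state, whereas Redis builds the new file from the data it already holds in memory.

```python
def rewrite(commands, expired=()):
    """Replay a log of set/del commands and emit one set per surviving key."""
    state = {}
    for command in commands:
        parts = command.split()
        op = parts[0].lower()
        if op == "set":
            state[parts[1]] = parts[2]     # later writes replace earlier ones
        elif op == "del":
            state.pop(parts[1], None)      # a deleted key needs no command at all
    return [f"set {key} {value}" for key, value in state.items()
            if key not in expired]         # expired keys are dropped


log = ["set a 1", "set a 2", "set b 1", "del b", "set c 3", "set a 5"]
print(rewrite(log))                 # ['set a 5', 'set c 3']
print(rewrite(log, expired={"c"}))  # ['set a 5']
```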

  • Blocking under everysec:

    Even though fsync runs in another thread, the main thread checks the time of the last sync and blocks if it was more than 2 seconds ago
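The three sync policies differ only in when fsync is called, which the sketch below counts. `AofWriter` is a hypothetical class for this note, and it simplifies everysec by checking the clock inline on each append instead of using a separate thread as Redis does.

```python
import os
import tempfile
import time


class AofWriter:
    """Append commands to a file and call fsync according to the policy."""

    def __init__(self, path, policy="everysec"):
        self.file = open(path, "ab")
        self.policy = policy
        self.last_sync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command):
        self.file.write(command.encode() + b"\n")
        self.file.flush()                  # hand the bytes to the OS (the write step)
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec":
            if time.monotonic() - self.last_sync >= 1.0:
                self._fsync()
        # policy "no": the operating system decides when data reaches the disk

    def _fsync(self):
        os.fsync(self.file.fileno())       # block until the disk write completes
        self.fsync_calls += 1
        self.last_sync = time.monotonic()

    def close(self):
        self.file.close()


counts = {}
with tempfile.TemporaryDirectory() as directory:
    for policy in ("always", "everysec", "no"):
        writer = AofWriter(os.path.join(directory, policy + ".aof"), policy)
        for i in range(20):
            writer.append(f"set key{i} {i}")
        counts[policy] = writer.fsync_calls
        writer.close()
print(counts["always"], counts["no"])  # 20 0
```

With always, every append pays for a disk sync; with everysec, at most about one second of writes is at risk; with no, the cost is lowest and so is the guarantee.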

Replication

Replication is a core feature of clustered deployments.

The slaveof command specifies the master/slave relationship. A master can have multiple slaves.

  • Principle

    • Full replication: an RDB file is generated and sent to the slave, which puts pressure on the network and the hardware.
    • Partial replication: only the commands the slave missed are sent
  • Heartbeat: the master and slaves act as clients of each other and periodically exchange ping/pong messages to confirm the connection

  • Asynchronous replication: the master returns to the client as soon as a command completes locally, without waiting for master/slave synchronization

  • Application: read/write separation. Because applications usually read far more than they write, reads can be spread across the slaves

    • Writes go only to the master; the slaves serve reads
    • Because replication is asynchronous, some delay is unavoidable in principle. Monitoring can be used to catch a replication offset that grows too large
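Asynchronous replication and the resulting offset gap can be modeled in a few lines. Everything here is a simplified assumption for illustration: `Node`, `write`, `sync`, and `lag` are made-up names, and the offset counts commands, whereas Redis measures it in bytes of the replication stream.

```python
class Node:
    def __init__(self):
        self.data = {}
        self.offset = 0     # how much of the command stream has been applied


master, replica = Node(), Node()
pending = []                # commands the replica has not applied yet


def write(key, value):
    """The master applies the write and returns at once; the replica catches up later."""
    master.data[key] = value
    master.offset += 1
    pending.append((key, value))


def sync(max_commands):
    """Apply up to max_commands pending commands on the replica."""
    for _ in range(min(max_commands, len(pending))):
        key, value = pending.pop(0)
        replica.data[key] = value
        replica.offset += 1


def lag():
    """Offset difference between master and replica; the value to monitor."""
    return master.offset - replica.offset


for i in range(5):
    write(f"k{i}", i)
sync(3)
print(lag())                   # 2
print(replica.data.get("k4"))  # None, a read from the replica is stale
sync(10)
print(lag())                   # 0
```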