
The function of a buffer is actually very simple: it uses a piece of memory to temporarily store command data, so that data is not lost and performance does not suffer when commands arrive faster than they can be processed. But because the buffer has limited memory, if data keeps being written into it faster than it is read out, the buffer needs more and more memory to hold the data. A buffer overflow occurs when the memory used by the buffer exceeds its upper threshold.

If an overflow occurs, data will be lost. Does that mean you don’t have to set an upper limit on the size of the buffer? Obviously not. As more and more data is accumulated, the buffer takes up more and more memory space, which will cause the Redis instance to crash once the available memory of the machine on which the Redis instance is located is exhausted.

So it is no exaggeration to say that buffers exist to avert the disaster of lost requests and lost data, but only when they are used correctly is that disaster truly averted.

As we know, Redis is a typical client-server architecture. All operation commands need to be sent to the server through the client. Therefore, one of the main applications of the buffer in Redis is to temporarily store the command data sent by the client or the data returned by the server to the client during communication between the client and the server. In addition, the buffer is used to temporarily store write commands and data received by the primary node during data synchronization between the primary and secondary nodes.

In this lesson, we will talk about buffer overflows between server and client, primary and secondary clusters, and how to deal with them.

Client input and output buffers

Let’s first look at the buffer between the server and client.

To avoid a mismatch between the speed at which clients send requests and the speed at which the server processes them, the server sets up an input buffer and an output buffer for each connected client. We call these the client input buffer and output buffer.

The input buffer temporarily stores the commands sent from the client, and the Redis main thread reads the commands from the input buffer for processing. When the Redis main thread finishes processing the data, it writes the result to the output buffer and returns it to the client via the output buffer, as shown in the following figure:

Below, we will look at how the input buffer and output buffer overflow, and at the corresponding solutions.

How do I deal with input buffer overflows?

As we have already analyzed, the input buffer is used to hold requests sent by the client, so there are two main situations that can cause an overflow:

  • Writing a bigkey, for example writing millions of elements into a Set-type key in one go;
  • The server is too slow to process requests. For example, the Redis main thread is intermittently blocked and cannot process normally sent requests in a timely manner. As a result, requests sent by the client end accumulate in the buffer.

Let’s start with how to check the memory usage of the input buffer and how to avoid overflow.

To see how each CLIENT connected to the server uses the input buffer, we can use the CLIENT LIST command:

CLIENT LIST
id=5 addr=127.0.0.1:50487 fd=9 name= age=4 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=client

The CLIENT command returns a lot of information, but we only need to focus on two types of information.

One is the client information that connects to the server. This example shows the input buffer of one client. If there are multiple clients, addr in the output will show the IP and port number of the different clients.

The other is the three parameters associated with the input buffer:

  • cmd: the latest command executed by the client. In this example, it is the CLIENT command.
  • qbuf: the amount of the input buffer already in use. The CLIENT command in this example has used 26 bytes of the buffer.
  • qbuf-free: the unused portion of the input buffer. This client can still use 32742 bytes. The sum of qbuf and qbuf-free is the total buffer size the Redis server has allocated for this connected client. This example has 26 + 32742 = 32768 bytes in total, that is, a 32KB buffer.

With the CLIENT LIST command, we can use the output to judge the memory usage of each client's input buffer. If qbuf is large and qbuf-free is small, pay attention: the input buffer is already holding a lot of command data and has little free space left. If the client then writes more commands, the input buffer overflows, Redis closes the client connection, and the business application can no longer access data over that connection.
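To make the qbuf and qbuf-free check concrete, here is a minimal Python sketch of my own (not part of Redis): it parses CLIENT LIST lines into fields and computes each client's input-buffer usage. The field names follow the CLIENT LIST output shown above; the 0.8 alert threshold is an assumption, not a Redis default.

```python
# Hypothetical helper: parse CLIENT LIST output and flag clients whose
# input buffer is nearly full. Field names (qbuf, qbuf-free, addr, cmd)
# follow the CLIENT LIST output shown above; the 0.8 alert threshold is
# an assumption, not a Redis default.

def parse_client_list(output: str):
    """Turn each CLIENT LIST line into a dict of field -> value."""
    clients = []
    for line in output.strip().splitlines():
        fields = dict(pair.split("=", 1) for pair in line.split())
        clients.append(fields)
    return clients

def input_buffer_usage(client: dict):
    """Return (used_bytes, total_bytes) of one client's input buffer."""
    used = int(client["qbuf"])
    free = int(client["qbuf-free"])
    return used, used + free

sample = ("id=5 addr=127.0.0.1:50487 fd=9 name= age=4 idle=0 flags=N "
          "db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 "
          "obl=0 oll=0 omem=0 events=r cmd=client")

for c in parse_client_list(sample):
    used, total = input_buffer_usage(c)
    if used / total > 0.8:  # nearly full: watch for an overflow
        print(f"client {c['addr']} input buffer {used}/{total} bytes")
```

In a real deployment you would feed this the live output of CLIENT LIST rather than a hard-coded sample line.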

In general, a Redis server serves more than one client. When the total memory consumed by multiple clients' buffers exceeds Redis's maxmemory setting (for example, 4GB), Redis triggers data eviction. Once data is evicted from Redis, requests for it have to go to the back-end database, which reduces the access performance of business applications. Worse still, with many such clients, Redis can consume so much memory that the machine runs out of memory entirely, Redis crashes, and business applications are seriously affected.

So, we have to find a way to avoid input buffer overflows. We can consider how to avoid this from two perspectives, one is to make the buffer larger, and the other is from the data command sending and processing speed.

Let’s see, is there a way to adjust the size of the input buffer with parameters? The answer is no.

Redis hardcodes the upper threshold of the client input buffer at 1GB. That is, the Redis server allows each client to buffer at most 1GB of commands and data. This size works for normal production environments: on the one hand, it is sufficient for most client requests; on the other hand, if it were any larger, a single client could take up so much memory that Redis crashes.

Therefore, Redis does not provide parameters that allow us to adjust the size of the client input buffer. If we want to avoid input buffer overflows, we can only start with data command sending and processing speed, which is mentioned earlier to avoid client writing bigkey, and avoid Redis main thread blocking.

Next, let’s look at output buffer overflows.

How do I deal with output buffer overflows?

The Redis output buffer temporarily stores the data that the Redis main thread is returning to the client. That data ranges from a simple, fixed-size OK response (for example, after a SET command) or an error message, to a variable-size execution result containing actual data (for example, after an HGET command).

Therefore, Redis gives each client's output buffer two parts: a fixed 16KB buffer for OK responses and error messages, and a dynamically growing buffer for temporarily storing variable-size response results.
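As a hedged illustration of that two-part layout, here is a toy Python model (the real Redis client structure differs in detail; the OutputBuffer name is mine): small replies fill a fixed 16KB region, and anything that no longer fits spills into a dynamically growing list.

```python
# Toy model of the two-part client output buffer described above:
# a fixed 16KB region for short replies (OK / error messages), plus a
# dynamically growing list for large results. Illustrative only; the
# real Redis structures differ.

FIXED_SIZE = 16 * 1024  # fixed part: 16KB

class OutputBuffer:
    def __init__(self):
        self.fixed_used = 0  # bytes used in the fixed 16KB part
        self.dynamic = []    # reply chunks that did not fit

    def append(self, reply: bytes):
        # Use the fixed part while it has room; once anything has
        # spilled over, keep appending to the dynamic part so that
        # reply order is preserved.
        if not self.dynamic and self.fixed_used + len(reply) <= FIXED_SIZE:
            self.fixed_used += len(reply)
        else:
            self.dynamic.append(reply)

    def dynamic_bytes(self) -> int:
        return sum(len(r) for r in self.dynamic)

buf = OutputBuffer()
buf.append(b"+OK\r\n")          # a small fixed-size reply: fits
buf.append(b"x" * (64 * 1024))  # a bigkey-sized result: spills over
print(buf.fixed_used, buf.dynamic_bytes())  # prints 5 65536
```

The point of the split is visible here: routine OK/error traffic never allocates, while only large results pay the cost of dynamic growth.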

When does an output buffer overflow occur? I’ve summed up three for you:

  • The server returns a large result for a bigkey;
  • The MONITOR command is executed.
  • The buffer size is not set properly.

Since bigkey occupies a large amount of memory, the output buffer will be affected if the result returned by the server contains bigkey. Let’s focus on the MONITOR command and setting the buffer size.

The MONITOR command is used to MONITOR Redis execution. After executing this command, the monitored command operations are continuously printed as follows:

MONITOR
OK
1600617456.437129 [0 127.0.0.1:50487] "COMMAND"
1600617477.289667 [0 127.0.0.1:50487] "info" "memory"

Do you see the problem? The output of MONITOR keeps accumulating in the output buffer, occupying more and more space until it overflows. So here is a tip: use the MONITOR command mainly in debugging environments, and do not run it continuously in production. Of course, occasionally using MONITOR to check command execution in an online environment is fine.

Next, let's look at setting the output buffer size. Unlike the input buffer, the output buffer's size can be set through the client-output-buffer-limit configuration item. The setting covers two aspects:

  • Set an upper threshold for the buffer size;
  • Set an upper threshold for the amount of data written continuously to the output buffer, together with the duration of that continuous writing.

Before using client-output-buffer-limit to set the buffer size, we need to distinguish the client type.

For applications that interact with Redis instances, there are two main types of clients and Redis server side interactions: regular clients that interact with read and write commands on the Redis server side, and subscription clients that subscribe to Redis channels. In addition, in the Redis master-slave cluster, there is also a class of client (slave client) on the master node for data synchronization with the slave node, which I will explain to you when I introduce buffers in the master-slave cluster.

When we set the buffer size for normal clients, we can usually do this in the Redis configuration file:

client-output-buffer-limit normal 0 0 0

Here, normal indicates that this setting applies to normal clients; the first 0 is the buffer size limit, and the second and third 0 are the continuous write size limit and the continuous write duration limit, respectively.

A normal client sends a request and waits for its result before sending the next request; this is called a blocking send. In this pattern, data does not pile up in the server's output buffer, unless the result being read is a very large bigkey.

Therefore, for normal clients we usually set the buffer size limit, the continuous write size limit, and the continuous write duration limit all to 0, that is, no limit.

As for the subscriber client, once a message arrives on a subscribed Redis channel, the server sends it to the client through the output buffer. This delivery does not follow the blocking request-response pattern, so if channel messages arrive quickly, they take up more and more output buffer space.

Therefore, for subscriber clients we do set the buffer size limit, the continuous write size limit, and the continuous write duration limit, in the Redis configuration file:

client-output-buffer-limit pubsub 8mb 2mb 60

Here, the pubsub parameter indicates that this setting applies to subscriber clients; 8mb sets the output buffer size limit to 8MB, and once the buffer's actual size exceeds 8MB, the server closes the client connection directly; 2mb and 60 mean that if more than 2MB is written to the buffer continuously for 60 seconds, the server also closes the client connection.
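To see how the hard and soft limits interact, here is a small Python simulation of the disconnection rule just described (the function and parameter names are mine; Redis implements this logic internally): a connection is closed immediately when the buffer exceeds the hard limit, or when it stays above the soft limit for the configured number of seconds in a row.

```python
# Sketch of the client-output-buffer-limit disconnection rule:
# close at once above the hard limit, or after the buffer has stayed
# above the soft limit for soft_seconds in a row. Names are mine.

def should_disconnect(samples, hard, soft, soft_seconds):
    """samples: list of (second, buffer_bytes), sampled once per
    second in time order."""
    over_since = None
    for t, size in samples:
        if size > hard:            # hard limit: close immediately
            return True
        if size > soft:            # soft limit: track how long
            if over_since is None:
                over_since = t
            if t - over_since >= soft_seconds:
                return True
        else:
            over_since = None      # dropped below: reset the clock
    return False

MB = 1024 * 1024
# pubsub 8mb 2mb 60: a buffer stuck at 3MB for 61 seconds is closed
samples = [(t, 3 * MB) for t in range(61)]
print(should_disconnect(samples, 8 * MB, 2 * MB, 60))  # prints True
```

Note the reset when the buffer drops back under the soft limit: brief spikes above 2MB are tolerated, only sustained pressure triggers the disconnect.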

Ok, let’s summarize how to deal with output buffer overflows:

  • Avoid a bigkey operation that returns a large number of data results.
  • Avoid continuous use of the MONITOR command in an online environment.
  • Use client-output-buffer-limit to set a reasonable buffer size limit, together with reasonable continuous write size and duration limits.

That’s all we need to know about the client buffer. Let’s continue to look at what needs to be noted when using buffers between primary and secondary clusters.

Buffers in primary and secondary clusters

Data replication between the primary and secondary nodes includes full replication and incremental replication. Full replication synchronizes all data, while incremental replication synchronizes only the commands the master received while the master-slave connection was broken. Either form of replication relies on a buffer to guarantee data consistency between master and slave. However, the buffers in these two replication scenarios differ in their overflow effects and size settings, so let's study them separately.

Replication buffer overflow problem

During full replication, the primary node continues to receive write commands from clients while transferring the RDB file to the secondary node. These write commands are stored in the replication buffer and sent to the secondary node after the RDB file transfer completes. The primary node maintains one replication buffer per secondary node to ensure data synchronization between them.

Therefore, if the secondary node receives and loads the RDB file slowly during full replication while the primary node receives a large number of write commands, those commands accumulate in the replication buffer and eventually overflow it.

In fact, the replication buffer on the master node is essentially an output buffer used by clients (we call them slave clients) that connect to the slave node. If the replication buffer overflows, the primary node directly closes the replication connection with the secondary node, causing a full replication failure. So how do you avoid replication buffer overflows?

On the one hand, we can control how much data the master node stores. As a general rule of thumb, we keep the data size of the primary node in the range of 2 to 4GB to make full synchronization faster and avoid the replication buffer accumulating too many commands.

On the other hand, we can use the client-output-buffer-limit configuration item to set a reasonable size for the replication buffer. The data volume of the primary node, the write load of the primary node, and the memory size of the primary node are the criteria for setting this parameter.

Let’s go through a concrete example to see how this works. Run the following command on the active node:

config set client-output-buffer-limit slave 512mb 128mb 60

The slave parameter indicates that this configuration applies to the replication buffer; 512mb sets the buffer size limit to 512MB; 128mb and 60 mean that if more than 128MB is written to the buffer continuously for 60 seconds, the connection is also closed as an overflow.

Let’s go ahead and see what this does for us. If the data of a write command is 1KB, then the replication buffer can accumulate 512K write commands (512MB/1KB = 512K). During full replication, the maximum write command rate of the primary node is 2000 commands per second (128MB/1KB/60 is about 2000).

From this we get a method: in practice, we can roughly estimate the amount of write-command data that will accumulate in the buffer from the size of each write command and the application's actual write rate. We can then compare that with the configured replication buffer size to judge whether the buffer is large enough to hold the accumulated commands.
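The arithmetic above can be written down as a quick estimate script. This is plain arithmetic based on the numbers in the text; the 1KB average command size is the assumption used in the example.

```python
# Back-of-the-envelope check of "slave 512mb 128mb 60" from the text:
# how many 1KB write commands fit under the hard limit, and what
# sustained write rate reaches the soft limit within 60 seconds?

KB, MB = 1024, 1024 * 1024

hard_limit = 512 * MB     # buffer size limit
soft_limit = 128 * MB     # continuous-write size limit
soft_seconds = 60         # continuous-write duration limit
cmd_size = 1 * KB         # assumed average write-command size

max_buffered_cmds = hard_limit // cmd_size          # 512K commands
soft_rate = soft_limit // cmd_size // soft_seconds  # commands per second

print(max_buffered_cmds, soft_rate)  # prints 524288 2184
```

The soft-rate figure of roughly 2184 commands per second matches the "about 2000" estimate in the text.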

There is another problem with replication buffers. The memory overhead of the replication buffers on the master node is the sum of the output buffers of all slave-node clients. If the cluster has many slave nodes, this overhead can be very large. Therefore, we must also control the number of slave nodes connected to the master, rather than running an oversized one-master many-slave cluster.
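A quick worst-case calculation makes this concrete (the 512MB limit comes from the configuration above; the slave count here is an example number of my own):

```python
# Worst-case master-side memory for replication buffers: roughly the
# per-slave output-buffer limit times the number of slaves. The slave
# count is an example value, not from the text.

MB = 1024 * 1024

def total_replication_buffer(per_slave_limit_bytes: int, num_slaves: int) -> int:
    return per_slave_limit_bytes * num_slaves

# With the 512MB hard limit and 10 slaves, the master may need up to
# 5GB of memory just for replication buffers.
worst_case = total_replication_buffer(512 * MB, 10)
print(worst_case // MB)  # prints 5120
```

Seeing 5GB of potential overhead from just ten slaves is exactly why the text advises keeping the slave count under control.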

Ok, let's summarize this part. To prevent the replication buffer from overflowing because too many commands accumulate, we can control the amount of data stored on the primary node and set a reasonable replication buffer size. At the same time, we need to control the number of slave nodes to avoid the replication buffers occupying too much of the primary node's memory.

Replication backlog buffer overflow problem

Next, let’s look at the buffer used for incremental replication, which is called the replication backlog buffer.

When the master node synchronizes received write commands to the slave node, it also writes them to the replication backlog buffer. Once a slave node suffers a brief network disconnection and then reconnects to the master, it reads from the replication backlog buffer the write commands the master received during the disconnection, and performs incremental synchronization, as shown below:

Does this sound familiar? Yes, we learned about the replication backlog buffer in Lesson 6, where I also gave you its English name: repl_backlog_buffer. In this lesson, we will review two key points from the perspective of buffer overflows: the impact of a replication backlog buffer overflow, and how to deal with it.

First, the replication backlog buffer is a circular buffer of limited size. When the master node fills it, newly arriving commands overwrite the old command data in the buffer. If a slave node has not yet synchronized that old command data, the master and slave have to start full replication again.

Second, to deal with overflow, we can adjust the size of the replication backlog buffer through the repl_backlog_size parameter. For details, you can review the repl_backlog_size calculation method in Lesson 6.
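As a hedged sketch of that calculation (my paraphrase of the Lesson 6 idea, with made-up example numbers): size the backlog to cover the per-second gap between how fast the master produces write commands and how fast the replication link ships them to the slave, then double it as a safety margin.

```python
# Hedged sketch of a repl_backlog_size estimate: buffer the per-second
# gap between command production and command transfer, doubled as a
# safety margin. All input numbers below are made-up examples.

def estimate_repl_backlog_size(write_rate: int, sync_rate: int,
                               op_size: int, margin: int = 2) -> int:
    """write_rate/sync_rate in operations per second, op_size in
    bytes; returns a suggested backlog size in bytes."""
    gap_per_second = max(write_rate - sync_rate, 0) * op_size
    return gap_per_second * margin

# e.g. the master writes 2000 op/s, the replication link ships
# 1000 op/s, and an average operation is 2KB
size = estimate_repl_backlog_size(2000, 1000, 2 * 1024)
print(size)  # prints 4096000
```

If the slave keeps up with the master (sync_rate >= write_rate), the estimate degenerates to zero, in which case you would still keep the default backlog size rather than shrink it.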

Summary

In this lesson, we learned about the buffer used in Redis. The buffer prevents the loss of command data when the processing speed of the receiver cannot keep up with the sending speed of the sender.

Depending on the purpose of the buffer, such as whether it is used for client communication or master/slave replication, I split the buffer into input and output buffers for the client, and replication buffers and replication backlogs on the master node in the master/slave cluster. The advantage of learning this way is that you can see exactly where the buffer is being used in Redis, so you can quickly identify the cause of the problem from the communication between the client and the server and the replication between the master and slave nodes.

Now, from the perspective of the impact of buffer overflows on Redis, I will summarize these four buffers into two categories.

  • Buffer overflows shut down network connections: regular clients, subscription clients, and slave clients use buffers that are essentially maintained between Redis clients and servers, or between master and slave nodes, for the purpose of transferring command data. When these buffers overflow, the mechanism is to close the connection between the client and server, or between the master and slave nodes. The direct impact of network connection shutdown is that service programs cannot read and write Redis, or full synchronization between the primary and secondary nodes fails and needs to be executed again.
  • Command data loss due to buffer overflow: The replication backlog buffer on the primary node is a circular buffer. Once the overflow occurs, the newly written command data overwrites the old command data, causing the loss of the old command data and causing the primary and secondary nodes to perform full replication again.

Essentially, buffer overflows occur for three reasons: command data is sent too fast or is too large; command data is processed too slowly; or the buffer space is too small. With that in mind, we can tailor our strategies.

  • For command data that is too large or sent too fast: normal clients should avoid bigkeys, and for the replication buffer, we should avoid overly large RDB files.
  • The solution to the problem of slow command data processing is to reduce blocking operations on the Redis main thread, such as using asynchronous delete operations.
  • The solution to too-small buffer space is to use the client-output-buffer-limit configuration item to set reasonable output buffer and replication buffer sizes, and to use repl_backlog_size to set a reasonable replication backlog buffer size. Of course, let's not forget that the input buffer's size is fixed by default, and we cannot modify it through configuration unless we modify the Redis source code directly.

With these solutions, I believe you can avoid the “disaster” of command data loss and Redis crash caused by buffer overflows.

One question per lesson

Finally, let me ask you a quick question.

In this class, we mentioned that Redis uses a client-server architecture, where the server maintains input and output buffers for each client. So, when an application interacts with a Redis instance, do clients used in the application need to use buffers? If so, will it affect Redis performance and memory usage?

Welcome to write down your thoughts and answers in the comments area, and we will exchange and discuss together. If you find today’s content helpful, you’re welcome to share it with your friends or colleagues, and we’ll see you next time.