Preface

Many bloggers on the web have proposed the following methods for increasing the TCP half-connection queue and full-connection queue:

  • To enlarge the TCP half-connection queue, increase the value of /proc/sys/net/ipv4/tcp_max_syn_backlog.
  • To enlarge the TCP full-connection queue, increase the backlog passed to listen().

Let me say it up front: neither of these is accurate.

“How do you know it’s not accurate?”

Simple: I ran the experiments and read the kernel's TCP stack source code, and found that enlarging either queue takes more than raising a single parameter.

In what follows, I will combine hands-on experiments with source code analysis to demystify the TCP half-connection queue and full-connection queue.

“Source code analysis? Isn’t that going to scare people off? We Java folks can’t read it.”

Rest assured, the source analysis in this article does not go deep, because I have trimmed the code down. If you can read an if statement, the left and right shift operators, and basic addition and subtraction, you can follow along.

Besides source code analysis, the article also introduces the Linux commands for troubleshooting the half-connection queue and the full-connection queue.

“Oh? That sounds interesting, so I’ll take a look!”

Good. Those of you who have not been scared off deserve some encouragement. Here is an outline of this article:

Outline of this article

Main text

What are TCP half-connection queues and full-connection queues?

During the TCP three-way handshake, the Linux kernel maintains two queues:

  • Half-connection queue, also known as the SYN queue;
  • Full-connection queue, also known as the accept queue;

When a server receives a SYN request from a client, the kernel stores the connection in the half-connection queue and responds with a SYN+ACK. The client then returns an ACK. When the server receives this third-handshake ACK, the kernel removes the connection from the half-connection queue, creates a new full connection, and adds it to the accept queue, where it waits for the process to fetch it by calling accept().

Half connection queue and full connection queue

Both the half-connection queue and the full-connection queue have a maximum length. When the limit is exceeded, the kernel either silently drops the packet or replies with an RST packet.


Hands-on: TCP full-connection queue overflow

How do I know the size of the TCP full-connection queue for my application?

On the server side, you can use the ss command to check the status of the TCP full connection queue:

Note that the Recv-Q/Send-Q values reported by ss mean different things in the LISTEN state and in non-LISTEN states. You can see the difference in the kernel code below:

In the LISTEN state, Recv-Q/Send-Q have the following meanings:

  • Recv-Q: the current size of the full-connection queue, that is, the number of TCP connections that have completed the three-way handshake and are waiting for the server to accept() them;
  • Send-Q: the maximum length of the full-connection queue. The output above shows a TCP service listening on port 8088 whose full-connection queue has a maximum length of 128.

In non-LISTEN states, Recv-Q/Send-Q have the following meanings:

  • Recv-Q: the number of bytes received but not yet read by the application process;
  • Send-Q: the number of bytes sent but not yet acknowledged.
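To make the two readings of Recv-Q/Send-Q concrete, here is a small Python sketch that interprets one row of ss output. The sample row is illustrative only (modelled on the port 8088 listener with a limit of 128 described above), and the function names are my own, not part of any tool.

```python
def parse_ss_row(row: str) -> tuple:
    """Split one row of `ss -nt`-style output into (state, recv_q, send_q)."""
    state, recv_q, send_q = row.split()[:3]
    return state, int(recv_q), int(send_q)


def describe_ss_queues(state: str, recv_q: int, send_q: int) -> str:
    """Explain what Recv-Q and Send-Q mean for the given socket state."""
    if state == "LISTEN":
        return (f"accept queue holds {recv_q} connection(s) waiting for accept(); "
                f"maximum length is {send_q}")
    return (f"{recv_q} byte(s) received but not yet read by the application; "
            f"{send_q} byte(s) sent but not yet acknowledged")


# Illustrative row, not captured output.
sample = "LISTEN 0 128 *:8088 *:*"
print(describe_ss_queues(*parse_ss_row(sample)))
```

The same numbers mean completely different things depending on the first column, which is why the state has to be checked before reading the two queue columns.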

How to simulate the TCP full connection queue overflow scenario?

The test environment

Experimental environment:

  • Both client and server run CentOS 6.5 with Linux kernel 2.6.32
  • The server IP address is 192.168.3.200, and the client IP address is 192.168.3.100
  • The server is the Nginx service with port 8088

wrk is a simple HTTP stress-testing tool. On a single multi-core CPU it uses the system's own high-performance I/O mechanisms, combining multithreading with an event-driven model, to generate a large load on the target machine.

This experiment uses wrk to stress test the server with a large number of requests, to see what happens when the server's TCP full-connection queue fills up and which indicators reveal it.

The client runs the wrk command to stress test the server with 30,000 concurrent connections:

On the server side, you can use the ss command to view the status of the current TCP full connection queue:

The ss command was executed twice. The output shows the current size of the TCP full-connection queue rising to 129, which exceeds the queue's maximum.

Once the full-connection queue exceeds its maximum, the server discards subsequent incoming TCP connections. The number of discarded connections is counted, and the count can be queried with the netstat -s command:

The 41150 above is the number of times the full-connection queue has overflowed. Note that this is a cumulative value. Run the command every few seconds; if the number keeps going up, the full-connection queue is certainly overflowing from time to time.
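Because the counter is cumulative, what matters is the difference between two samples. The Python sketch below extracts the counter from netstat -s text and computes that difference. The counter wording in the regular expression is what I would expect netstat -s to print on this kind of system, so treat it as an assumption and adjust it to your own output. The second sample value is made up for illustration.

```python
import re

# Assumed wording of the netstat -s counter line; verify against your system.
OVERFLOW_RE = re.compile(r"(\d+) times the listen queue of a socket overflowed")


def listen_overflows(netstat_output: str) -> int:
    """Return the cumulative accept-queue overflow counter, or 0 if absent."""
    match = OVERFLOW_RE.search(netstat_output)
    return int(match.group(1)) if match else 0


# Two samples taken a few seconds apart; the second number is hypothetical.
first = "    41150 times the listen queue of a socket overflowed"
second = "    41390 times the listen queue of a socket overflowed"

delta = listen_overflows(second) - listen_overflows(first)
print(f"overflows since the previous sample: {delta}")
```

A delta that stays at zero means the queue is no longer overflowing, no matter how large the absolute number is.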

The simulation shows that when the server handles a large number of concurrent requests, a TCP full-connection queue that is too small overflows easily. Once it overflows, subsequent requests are discarded, so the number of requests the server handles stops increasing.

The full connection queue overflows

Linux has a parameter, tcp_abort_on_overflow, that specifies how the server responds to the client when the TCP full-connection queue is full.

Dropping the connection is the default behavior in Linux, but we can instead choose to send the client an RST packet, telling it that the connection could not be established.

tcp_abort_on_overflow takes two values, 0 and 1:

  • 0: if the full-connection queue is full, the server discards the ACK sent by the client;
  • 1: if the full-connection queue is full, the server sends an RST packet to the client, abandoning the handshake and the connection.
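The two settings can be modelled as a tiny decision function. This is only a sketch of the behavior described in the list above, with a function name and return strings of my own choosing. It is not kernel code.

```python
def on_third_handshake_ack(accept_queue_full: bool, tcp_abort_on_overflow: int) -> str:
    """Model the server's reaction to the final ACK of the three-way handshake."""
    if not accept_queue_full:
        return "move connection to the accept queue"
    if tcp_abort_on_overflow == 1:
        return "send RST to the client"
    return "drop the ACK silently"


for setting in (0, 1):
    print(setting, "->", on_third_handshake_ack(True, setting))
```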

If a client cannot connect to the server and you want to know whether a full TCP full-connection queue is to blame, set tcp_abort_on_overflow to 1. If the client then reports many connection reset by peer errors, that confirms the server's full-connection queue is overflowing.

In general, you should leave tcp_abort_on_overflow at 0, because that copes better with bursts of traffic.

For example, suppose the full-connection queue is full and the server drops the client's ACK. The client is already in the ESTABLISHED state, so its process sends requests on that connection. As long as the server does not acknowledge a request, the client retransmits it several times. If the server process was only busy for a short while, then once the full-connection queue has room again, a retransmitted request packet can still cause the server to establish the connection, because the packet carries an ACK.

Therefore, setting tcp_abort_on_overflow to 0 improves the success rate of connection establishment. Set it to 1 only if you are sure the full-connection queue will stay overflowed for a long time, so that clients are notified as soon as possible.

How do I increase the TCP full connection queue?

When the TCP full-connection queue overflows, we need to increase its size so that it can absorb a large number of client requests.

The maximum length of the TCP full-connection queue is the smaller of somaxconn and backlog, that is, min(somaxconn, backlog), as the following Linux kernel code shows:

  • somaxconn is a Linux kernel parameter. Its default value is 128, and it can be set through /proc/sys/net/core/somaxconn;
  • backlog is the backlog argument of listen(int sockfd, int backlog). Nginx defaults it to 511, and it can be changed in the Nginx configuration file;

In the previous simulation test, my test environment was as follows:

  • Somaxconn is the default value 128;
  • The backlog for Nginx is 511 by default

Therefore, the maximum value of the TCP full-connection queue in the test environment is min(128, 511), that is, 128. You can run the ss command to check:
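The min() rule is easy to check in a few lines of Python. The second function is an assumption on my part: it mirrors the strict "greater than" comparison that, as far as I can tell, sk_acceptq_is_full() uses in 2.6.32, which would explain why ss could report 129 connections against a limit of 128 in the earlier test.

```python
def accept_queue_max(somaxconn: int, backlog: int) -> int:
    """Maximum accept-queue length as set up by listen()."""
    return min(somaxconn, backlog)


def accept_queue_is_full(current_len: int, max_len: int) -> bool:
    """Assumed to mirror the kernel check: full only when strictly above the limit."""
    return current_len > max_len


limit = accept_queue_max(somaxconn=128, backlog=511)
print(limit)                              # 128
print(accept_queue_is_full(128, limit))   # False: one more connection still fits
print(accept_queue_is_full(129, limit))   # True
```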

Now let’s enlarge the TCP full-connection queue and run the stress test again. First set somaxconn to 5000:

Then set the backlog to 5000 as well:

Finally, the Nginx service is restarted because the TCP full connection queue is reinitialized only when the listen() function is called again.

After the Nginx service is restarted, the server runs the ss command to check the size of the TCP full-connection queue:

According to the result, the maximum length of the TCP full-connection queue is now 5000.

Stress test again after enlarging the TCP full-connection queue

The client again sends requests to the server over 30,000 connections:

The server runs the ss command to view the usage of the TCP full-connection queue.

The output shows that the full-connection queue never exceeds its maximum, so it does not overflow.

This shows that after the maximum length of the TCP full-connection queue was raised from 128 to 5000, the server handled 30,000 concurrent connection requests without overflowing the queue.

If connections continue to be dropped due to TCP full connection queue overflow, the backlog and somaxconn parameters should be increased.


Hands-on: TCP half-connection queue overflow

How do I view the length of a TCP half-connection queue?

Unfortunately, the length of the TCP half-connection queue cannot be read directly from ss the way the full-connection queue can.

But we can rely on a defining trait of half-open connections: on the server, every connection in the half-connection queue is in the SYN_RECV state.

Therefore, we can use the following command to calculate the current TCP half-connection queue length:
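The counting idea is simply "count the rows whose state is SYN_RECV". Here is a Python sketch of that idea working on netstat-style text. The sample rows and the remote addresses in them are invented for illustration. On a live server you would feed it the output of a command such as netstat -nat.

```python
def count_syn_recv(netstat_output: str) -> int:
    """Count rows whose whitespace-separated fields include the SYN_RECV state."""
    return sum(
        1 for line in netstat_output.splitlines()
        if "SYN_RECV" in line.split()
    )


# Hypothetical rows in the usual netstat column layout.
sample = """\
tcp        0      0 192.168.3.200:8088   10.0.0.1:1234         SYN_RECV
tcp        0      0 192.168.3.200:8088   10.0.0.2:2345         SYN_RECV
tcp        0      0 192.168.3.200:22     192.168.3.100:50000   ESTABLISHED
"""

print(count_syn_recv(sample))
```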

How to simulate a TCP half-connection queue overflow scenario?

Simulating a TCP half-connection queue overflow is not difficult. The client simply keeps sending TCP SYN packets to the server but never replies with the ACK of the third handshake, so the server accumulates a large number of TCP connections in the SYN_RECV state.

This is the attack known as a SYN flood or SYN attack, a common form of DDoS attack.

The test environment

Experimental environment:

  • Both client and server run CentOS 6.5 with Linux kernel 2.6.32
  • The server IP address is 192.168.3.200, and the client IP address is 192.168.3.100
  • The server is the Nginx service with port 8088

Note: tcp_syncookies is not enabled in this experiment. What tcp_syncookies does is explained later.

In this experiment, the hping3 tool is used to simulate the SYN attack:

Once the SYN attack hits the server, the SSH session to the server is disconnected and cannot be re-established, so the size of the TCP half-connection queue has to be checked on the server host itself:

You can also use netstat -s to observe half-connection queue overflows:

The value in this output is cumulative: it is the total number of TCP connections discarded because the half-connection queue overflowed. Run the command every few seconds; if the value keeps increasing, the half-connection queue is overflowing.

Most people say tcp_max_syn_backlog specifies the size of the half-connection queue. Is that true?

Unfortunately, the size of the half-connection queue does not depend on tcp_max_syn_backlog alone.

In the SYN attack scenario simulated above, the default value of tcp_max_syn_backlog on the server is as follows:

However, during the test the server held at most 256 connections in the half-connection queue, not 512, so the queue's maximum length is not necessarily the tcp_max_syn_backlog value.

Next, let's go into the Linux kernel source code to see how the maximum length of the TCP half-connection queue is determined.

The Linux kernel code that handles the first TCP handshake (receiving the SYN packet) is shown below, heavily trimmed to focus on the logic for half-connection queue overflow:

From the source code, we can identify three conditions under which a SYN is dropped because of queue length:

  1. If the half-connection queue is full and tcp_syncookies is not enabled, the SYN is dropped;
  2. If the full-connection queue is full and more than one connection has not yet retransmitted its SYN+ACK, the SYN is dropped;
  3. If tcp_syncookies is not enabled and max_syn_backlog minus the current half-connection queue length is less than (max_syn_backlog >> 2), the SYN is dropped.
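The three conditions can be written down as a small Python model. This is a sketch of the trimmed logic only: the class and field names are mine, and the real kernel function contains further checks that are deliberately left out here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListenState:
    syn_queue_len: int       # current half-connection queue length
    syn_queue_full: bool     # result of the half-connection "is full" check
    accept_queue_full: bool  # result of the full-connection "is full" check
    young: int               # connections that have not retransmitted SYN+ACK yet
    tcp_syncookies: int      # 0 means disabled
    max_syn_backlog: int


def syn_drop_reason(s: ListenState) -> Optional[int]:
    """Return the number of the condition that drops an incoming SYN, or None."""
    if s.syn_queue_full and not s.tcp_syncookies:
        return 1
    if s.accept_queue_full and s.young > 1:
        return 2
    if (not s.tcp_syncookies
            and s.max_syn_backlog - s.syn_queue_len < (s.max_syn_backlog >> 2)):
        return 3
    return None


state = ListenState(syn_queue_len=256, syn_queue_full=True, accept_queue_full=False,
                    young=0, tcp_syncookies=0, max_syn_backlog=512)
print(syn_drop_reason(state))
```

Note how tcp_syncookies appears in conditions 1 and 3: with syncookies enabled, only a full accept queue (condition 2) can still cause a drop in this model.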

tcp_syncookies is one way to mitigate SYN attacks.

Next, let's look at inet_csk_reqsk_queue_is_full and sk_acceptq_is_full, the functions that decide whether the half-connection queue and the full-connection queue are full:

From the source above, we learn that:

  • The maximum length of the full-connection queue is the sk_max_ack_backlog variable, which is set in the listen() source code to min(somaxconn, backlog);
  • The maximum length of the half-connection queue comes from the max_qlen_log variable. Where is max_qlen_log set? We don't know yet, so let's keep following the code.

Let’s follow the code and see where the maximum value of the half-connection queue max_qlen_log is initialized:

From the code above, we can work out that max_qlen_log is 8. Substituting this into reqsk_queue_is_full, the function that checks whether the half-connection queue is full:

The half-connection queue is full once qlen >> 8 reaches 1. The arithmetic is easy: this first happens when qlen is 256, since 256 >> 8 = 1.

That is why the server held at most 256 SYN_RECV connections during the simulated SYN attack above.
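For readers who want to reproduce the number 8, here is a Python sketch of the calculation as I understand it from the 2.6.32 queue allocation code: clamp the listen backlog by max_syn_backlog, enforce a floor of 8, round (value + 1) up to a power of two, then take the base-2 logarithm. The exact rounding steps are my reading of the source and should be checked against the kernel version you run.

```python
def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def max_qlen_log(somaxconn: int, backlog: int, max_syn_backlog: int) -> int:
    """Assumed reconstruction of how 2.6.32 derives max_qlen_log."""
    entries = min(somaxconn, backlog)        # value listen() hands to the queue setup
    entries = min(entries, max_syn_backlog)
    entries = max(entries, 8)
    entries = roundup_pow_of_two(entries + 1)
    log = 3
    while (1 << log) < entries:
        log += 1
    return log


def syn_queue_is_full(qlen: int, log: int) -> bool:
    """Same shift test as described above: full once qlen >> log is nonzero."""
    return (qlen >> log) != 0


log = max_qlen_log(somaxconn=128, backlog=511, max_syn_backlog=512)
print(log, 1 << log)
print(syn_queue_is_full(255, log), syn_queue_is_full(256, log))
```

With the test environment's values (128, 511, 512) this yields a logarithm of 8, that is, a capacity of 256, which matches the experiment.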

As you can see, the maximum length of the half-connection queue is not determined by max_syn_backlog alone. It also depends on somaxconn and backlog.

In Linux kernel 2.6.32, the relationship can be summarized as follows, where the theoretical maximum of the half-connection queue is 2^max_qlen_log:

  • When max_syn_backlog > min(somaxconn, backlog), the theoretical maximum = min(somaxconn, backlog) * 2;
  • When max_syn_backlog < min(somaxconn, backlog), the theoretical maximum = max_syn_backlog * 2;

Is this theoretical maximum the largest number of connections the server can have in the SYN_RECV state?

Unfortunately, still no.

It is only the theoretical maximum of the half-connection queue, and does not necessarily equal the largest number of connections in the SYN_RECV state.

In the previous section, we identified three conditions under which a SYN is dropped on the first TCP handshake:

  1. If the half-connection queue is full and tcp_syncookies is not enabled, the SYN is dropped;
  2. If the full-connection queue is full and more than one connection has not yet retransmitted its SYN+ACK, the SYN is dropped;
  3. If tcp_syncookies is not enabled and max_syn_backlog minus the current half-connection queue length is less than (max_syn_backlog >> 2), the SYN is dropped.

Even if the current half-connection queue length has not exceeded the theoretical maximum, so that condition 1 does not apply, a SYN packet is still dropped whenever condition 3 holds. In that case the largest number of connections in SYN_RECV never reaches the theoretical maximum.

That may sound hard to follow, so let's run another experiment and let the results speak.

The server environment is as follows:

After the configuration is changed, Nginx is restarted on the server, because the maximum lengths of the full-connection queue and the half-connection queue are initialized in listen().

Based on the earlier source code analysis, the theoretical maximum of the half-connection queue works out to 256:

The client runs hping3 to launch a SYN attack:

The server runs the following command to check the largest number of connections in the SYN_RECV state:

You can see that the largest number of connections in SYN_RECV is not the theoretical maximum derived from max_qlen_log.

The reason is the one given above: even while the half-connection queue length is below the theoretical maximum, SYN packets are dropped as soon as condition 3 holds, so the number of connections in SYN_RECV never reaches that maximum.

Let’s work through condition 3:

According to this analysis, condition 3 is triggered once the current half-connection queue length exceeds 192, and the SYN packet of the first TCP handshake is then dropped.

In the test above, the number of connections in SYN_RECV peaked at 193, which is exactly where condition 3 takes effect. SYN packets were therefore dropped before the count reached the theoretical half-connection queue maximum of 256.
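The 193 can be reproduced with a short simulation. The Python sketch below assumes max_syn_backlog = 256 (the value implied by the 192 threshold above), a theoretical maximum of 256 (max_qlen_log = 8), syncookies off, a non-full accept queue, and no entries leaving the queue during the flood. All of these are simplifying assumptions for illustration.

```python
def max_syn_recv(max_syn_backlog: int, max_qlen_log: int) -> int:
    """Feed SYNs one at a time and return the queue length at the first drop."""
    qlen = 0
    while True:
        queue_full = (qlen >> max_qlen_log) != 0                       # condition 1
        last_quarter = max_syn_backlog - qlen < (max_syn_backlog >> 2)  # condition 3
        if queue_full or last_quarter:
            return qlen
        qlen += 1  # the SYN is accepted and joins the half-connection queue


print(max_syn_recv(max_syn_backlog=256, max_qlen_log=8))
print(max_syn_recv(max_syn_backlog=512, max_qlen_log=8))
```

With max_syn_backlog = 256 the SYN that arrives at length 192 is still accepted (256 - 192 = 64 is not less than 64), so the queue reaches 193 before condition 3 starts dropping. With max_syn_backlog = 512, as in the first experiment, condition 1 is reached first and the queue tops out at 256.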

Therefore, the largest number of connections the server can have in SYN_RECV falls into two cases:

  • If the current half-connection queue length does not exceed the theoretical maximum but does exceed max_syn_backlog - (max_syn_backlog >> 2), then the largest number of connections in SYN_RECV is max_syn_backlog - (max_syn_backlog >> 2);
  • If the current half-connection queue length exceeds the theoretical maximum, then the largest number of connections in SYN_RECV is the theoretical maximum.

The “theoretical” half-connection maximum is calculated differently in each Linux kernel version.

The calculation analyzed above applies to Linux 2.6.32. Other kernel versions may differ somewhat.

For example, in Linux 5.0.0, the “theoretical” half-connection maximum is simply the full-connection queue maximum, but three queue-overflow conditions still apply:

If the SYN half-connection queue is full, is dropping the connection the only option?

No. As we saw in the source code analysis, enabling syncookies allows a connection to be established without using the SYN half-connection queue, so with syncookies enabled connections are not dropped.

Syncookies work like this: the server computes a value from the current state and sends it in its own SYN+ACK packet. When the client returns the ACK packet, the server extracts the value and verifies it. If the value is valid, the connection is established, as shown in the following figure.
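The idea of "compute a value, send it, verify it on return" can be illustrated with a keyed hash. To be clear, the Python sketch below is not the kernel's actual cookie algorithm. The secret, the counter and the choice of HMAC-SHA256 are all assumptions made purely to show that the server can validate the ACK without storing per-connection state.

```python
import hashlib
import hmac

SECRET = b"hypothetical-server-secret"  # assumption: some key known only to the server


def make_cookie(src: str, sport: int, dst: str, dport: int, counter: int) -> int:
    """Derive a 32-bit value from the connection 4-tuple and a coarse time counter."""
    msg = f"{src}:{sport}>{dst}:{dport}@{counter}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def check_cookie(cookie: int, src: str, sport: int, dst: str, dport: int,
                 counter: int) -> bool:
    """Recompute the value when the ACK arrives and compare in constant time."""
    expected = make_cookie(src, sport, dst, dport, counter)
    return hmac.compare_digest(cookie.to_bytes(4, "big"), expected.to_bytes(4, "big"))


conn = ("192.168.3.100", 40000, "192.168.3.200", 8088, 7)
cookie = make_cookie(*conn)           # would travel in the SYN+ACK
print(check_cookie(cookie, *conn))    # the returning ACK echoes it back
```

Because everything needed for verification is recomputed from the packet itself, nothing has to sit in the half-connection queue while the server waits for the ACK.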

Enable syncookies

The tcp_syncookies parameter takes one of three values:

  • 0: the feature is disabled;
  • 1: the feature is used only when the SYN half-connection queue cannot hold any more connections;
  • 2: the feature is enabled unconditionally.

To defend against SYN attacks, set the value to 1:

How do I defend against SYN attacks?

Here are several ways to defend against SYN attacks:

  • Increase the half-connection queue;
  • Enable tcp_syncookies;
  • Reduce the number of SYN+ACK retransmissions.

Method 1: Increase the half-connection queue

From the source code and experiments above, we learned that enlarging the half-connection queue takes more than raising tcp_max_syn_backlog. somaxconn and backlog, that is, the full-connection queue, must be raised as well. Otherwise, raising tcp_max_syn_backlog alone has no effect.

tcp_max_syn_backlog and somaxconn can be raised by modifying the Linux kernel parameters:

How the backlog is raised differs from one web server to another. For Nginx, for example, it is done as follows:

Finally, after changing these parameters, the Nginx service must be restarted, because both the half-connection queue and the full-connection queue are initialized in listen().
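The kernel parameters mentioned in this method are plain files under /proc/sys, so they can be read and written from any language. Below is a minimal Python sketch with helper names of my own. The root argument exists so the helpers can be tried against a scratch directory, and writing to the real /proc/sys requires root privileges. The values in the commented usage are examples, not recommendations.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted parameter name such as net.core.somaxconn to its file."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys") -> int:
    """Read an integer-valued kernel parameter."""
    return int(sysctl_path(name, root).read_text().split()[0])


def write_sysctl(name: str, value: int, root: str = "/proc/sys") -> None:
    """Write an integer-valued kernel parameter (root only on the real /proc/sys)."""
    sysctl_path(name, root).write_text(f"{value}\n")


# Example usage on a real server (values are illustrative):
# write_sysctl("net.core.somaxconn", 5000)
# write_sysctl("net.ipv4.tcp_max_syn_backlog", 5000)
print(sysctl_path("net.ipv4.tcp_max_syn_backlog"))
```

Changes made this way do not survive a reboot, and an already listening socket keeps its old queue sizes until listen() is called again.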

Method 2: Enable tcp_syncookies

tcp_syncookies can likewise be enabled simply by modifying a Linux kernel parameter:

Method 3: Reduce SYN+ACK retransmission times

When the server comes under a SYN attack, it holds a large number of TCP connections in the SYN_RECV state, each of which retransmits its SYN+ACK. Once the number of retransmissions reaches the upper limit, the connection is dropped.

So under a SYN attack, we can reduce the number of SYN+ACK retransmissions so that connections in the SYN_RECV state are dropped sooner.
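To get a feel for how much the retransmission limit matters, here is a rough Python estimate of how long a SYN_RECV entry lingers. It assumes a 1-second initial timeout that doubles after every retransmission, which is my understanding of current kernels (older kernels such as 2.6.32 used a larger initial timeout), and it assumes the limit is the one controlled by the tcp_synack_retries parameter.

```python
def synack_lifetime(retries: int, initial_rto: float = 1.0) -> float:
    """Seconds until a SYN_RECV entry is dropped, with pure exponential backoff."""
    # One wait after the first SYN+ACK plus one wait after each retransmission.
    return sum(initial_rto * 2 ** i for i in range(retries + 1))


for retries in (5, 2, 1):
    print(retries, "retries ->", synack_lifetime(retries), "seconds")
```

Under these assumptions, lowering the limit from 5 to 2 shortens the lifetime of each half-open connection from about a minute to a few seconds, which frees queue slots much faster during a flood.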




Kobayashi is a tool man for you. Goodbye, see you next time!