This article is an entry in the “Performance Optimization in Practice” essay event

TIME_WAIT is an unavoidable topic in Nginx server optimization, and it is a state we often run into trouble with.

The common solutions are:

1. Fast recycling

2. Connection reuse

There is a common misunderstanding about how far TIME_WAIT should be optimized. Some people even feel that TIME_WAIT must be optimized away the moment they see it. Today I want to talk about exactly this topic

To explain it clearly, we still need to start from the principles

TIME_WAIT is one of the states of a TCP connection

This figure shows the complete transitions among all TCP states. It can be divided into three parts: the upper part is connection establishment, and the lower part covers active close and passive close

As you can see, TIME_WAIT appears only on the side that actively closes the connection. In fact, TIME_WAIT is TCP’s solution to problems caused by an unreliable network

What problems does it solve? Look at the following two scenarios

  • In the four-way close, A sends a FIN, B responds with an ACK, B then sends its own FIN, and A responds with an ACK to close the connection. If the ACK sent by A is lost, B assumes that A never received its FIN and sends the FIN to A again.

    If there were no TIME_WAIT state, A would no longer hold any information about the connection and would respond with an RST when it received a packet for a connection that does not exist.

    In this case, TIME_WAIT is used to ensure the normal termination of the full-duplex TCP connection.

  • We also know that IP, the layer beneath TCP, does not guarantee the order in which packets are delivered. Suppose the four-tuple (source/destination IP and port) is released as soon as the two sides finish closing, while a delayed packet from A has still not reached B. If application A immediately creates a new connection with the same four-tuple and the delayed packet arrives only after that connection is established, B will treat the packet as something A has just sent on the new connection.

    In this case, TIME_WAIT exists to ensure that packets still lingering in the network expire before the four-tuple is reused.

In the first scenario, TIME_WAIT ensures that the passive closer receives the final ACK and the connection is closed properly, so that a retransmitted FIN from the passive closer does not affect the next connection

In the second scenario, TIME_WAIT holds the four-tuple for 2 MSL so that stray segments from the old connection expire and cannot be mistaken for data on a new one
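The two scenarios can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it ignores sequence numbers and everything else a real TCP stack checks, and the names `State`, `reply_to_retransmitted_fin`, and `stale_segment_reaches_new_connection` are hypothetical.

```python
from enum import Enum


class State(Enum):
    TIME_WAIT = "TIME_WAIT"
    CLOSED = "CLOSED"


def reply_to_retransmitted_fin(state: State) -> str:
    """Scenario 1: what the active closer A sends when B retransmits its FIN."""
    if state is State.TIME_WAIT:
        # A still remembers the connection, so it can re-send the final ACK.
        return "ACK"
    # A has already forgotten the connection; the FIN matches nothing.
    return "RST"


def stale_segment_reaches_new_connection(
    seconds_since_close: float, msl: float, waits_2msl: bool
) -> bool:
    """Scenario 2: can a delayed segment hit a new connection on the same four-tuple?

    Assumes the new connection is opened `seconds_since_close` seconds after
    the old one closed, and the delayed segment arrives at that moment.
    """
    # A segment can survive in the network for at most one MSL.
    segment_still_alive = seconds_since_close < msl
    # With TIME_WAIT, the four-tuple is unusable until 2 * MSL has passed.
    tuple_reusable = (not waits_2msl) or seconds_since_close >= 2 * msl
    return segment_still_alive and tuple_reusable
```

In this model, a stale segment can only reach a new connection when the four-tuple is released without waiting, which is the situation TIME_WAIT is meant to prevent.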

Note: MSL (Maximum Segment Lifetime) is the maximum time a TCP segment can exist in the network; any segment older than this is discarded. RFC 793 takes it to be 2 minutes, but the value is not fixed and implementations choose their own

TCP introduced the TIME_WAIT state to handle the two scenarios above, so eliminating TIME_WAIT entirely is neither reasonable nor a sensible optimization goal

So how should TIME_WAIT be optimized, and what counts as a reasonable state?

Let’s look at the two common optimization methods, starting with the one that is not recommended: fast recycling

TIME_WAIT lasts 2MSL, which is 4 minutes if MSL is the 2 minutes recommended by the RFC. On a high-concurrency server the number of local ports is fixed, so if each TIME_WAIT lasts 4 minutes the ports are soon exhausted

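On Linux (CentOS included), TIME_WAIT lasts 60 seconds, which corresponds to an MSL of 30 seconds. Note that this length is fixed in the kernel; the tcp_fin_timeout parameter, which is often credited with setting it, actually controls how long an orphaned connection stays in FIN_WAIT_2. A four-tuple (local_ip, local_port, remote_ip, remote_port) is therefore frozen for 60 seconds. By default, the system can allocate about 30,000 local ports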

/proc/sys/net/ipv4/ip_local_port_range

Therefore, new connections between the same pair of endpoints can only be opened at a rate of about 500 per second (30,000 ports / 60 seconds)
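The arithmetic behind this figure is simply the number of usable local ports divided by how long each four-tuple stays frozen. A minimal sketch, with a hypothetical helper name:

```python
def max_new_connections_per_second(port_count: int, time_wait_seconds: float) -> float:
    """Upper bound on new connections per second to one remote IP and port.

    Each connection occupies one local port, and that port stays frozen in
    TIME_WAIT for `time_wait_seconds` after the connection is closed.
    """
    return port_count / time_wait_seconds


# About 30,000 ports frozen for 60 seconds each.
print(max_new_connections_per_second(30_000, 60))  # prints 500.0
```

With the RFC value of 2MSL = 240 seconds, the same calculation gives only 125 connections per second, which is why a long TIME_WAIT hurts so much under high concurrency.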

Fast recycling was proposed for this reason: when a TCP connection enters the TIME_WAIT state, it is reclaimed immediately instead of waiting 2MSL, so that its resources are quickly available for new connections

However, fast recycling can cause SYN packets to go unanswered, as follows:

TCP caches the latest timestamp seen from each peer. If a later packet carries a timestamp smaller than the cached one, it is considered invalid and discarded. Whether Linux enables this behavior depends on tcp_timestamps and tcp_tw_recycle; because tcp_timestamps is enabled by default, turning on tcp_tw_recycle effectively turns this behavior on. In NAT environments this leads to timestamp errors and dropped packets. The usual symptom is that a client sends a SYN but the server never responds with an ACK. The NAT device rewrites the source IP address of every packet to one address (or a small number of addresses) but leaves the TCP timestamp untouched, so the server sees timestamps from many different machines mixed together under the same address

Suggestion: if a Layer 3 or Layer 4 NAT device sits in front of the server, disable fast recycling to avoid SYNs being rejected because of mismatched timestamps from the real machines behind the NAT. Note that tcp_tw_recycle was removed entirely in Linux 4.12
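The failure under NAT can be reproduced with a small model of the per-peer timestamp check. This is a simplified sketch, not the kernel logic: the class name `PerHostTimestampCheck` is hypothetical, the addresses are documentation examples, and details such as the validity window of the cached value are left out.

```python
class PerHostTimestampCheck:
    """Toy model: remember the latest TCP timestamp seen per source IP."""

    def __init__(self) -> None:
        self.last_seen: dict[str, int] = {}  # source IP -> latest timestamp

    def accept_syn(self, src_ip: str, tsval: int) -> bool:
        last = self.last_seen.get(src_ip)
        if last is not None and tsval < last:
            # Timestamp went backwards for this IP: treat the SYN as stale.
            return False
        self.last_seen[src_ip] = tsval
        return True


server = PerHostTimestampCheck()
nat_ip = "203.0.113.10"  # two clients share this public address

# Client X has been up for a long time, so its timestamp is large.
print(server.accept_syn(nat_ip, 5000))  # prints True
# Client Y booted recently; its smaller timestamp looks "old" to the server.
print(server.accept_syn(nat_ip, 1000))  # prints False
```

The second client did nothing wrong; its SYN is dropped only because the server cannot tell the two machines apart behind one address.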

The other optimization approach, which is recommended, is reuse

From the analysis above, the concurrency limit arises because a four-tuple is frozen for 2MSL, which leaves no resources available for new connections, and a frozen four-tuple cannot be used. Reuse was proposed to address this, but it has a precondition: tcp_timestamps must be enabled as well, as the corresponding part of the kernel code shows

With reuse enabled, a socket in the TIME_WAIT state can be reused once more than 1 second has passed since the last packet was received on it. Under normal conditions, reuse therefore lets TIME_WAIT sockets be reused quickly without the “side effects” of recycle.
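The reuse condition described above can be modeled roughly as follows. This is a simplified Python sketch of the rule as stated in the text, not the kernel code; the function and parameter names are hypothetical.

```python
def can_reuse_time_wait(
    tw_reuse_enabled: bool,
    timestamps_enabled: bool,
    last_packet_time: float,
    now: float,
) -> bool:
    """Return True if a TIME_WAIT socket may be reused for a new connection.

    Times are in seconds. Reuse requires both settings to be on and more
    than 1 second to have passed since the last packet was received.
    """
    if not (tw_reuse_enabled and timestamps_enabled):
        return False
    return now - last_packet_time > 1
```

The 1-second gap matters because TCP timestamps then guarantee that segments of the new connection carry a larger timestamp than anything left over from the old one.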

In addition to the two TIME_WAIT solutions above, there is another parameter that you must pay attention to: tcp_max_tw_buckets

cat /proc/sys/net/ipv4/tcp_max_tw_buckets

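If the number of TIME_WAIT sockets reaches tcp_max_tw_buckets, the table overflows and any further TIME_WAIT socket is destroyed immediately. The default value is 4096 here (the actual default varies with kernel version and memory size)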
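To see how close a server is to this limit, the current number of TIME_WAIT sockets can be counted from /proc/net/tcp, where the `st` column holds the state as a hex code and `06` means TIME_WAIT. A minimal sketch for IPv4 only; the function name is hypothetical and the sample rows are shortened.

```python
TIME_WAIT_CODE = "06"  # hex state code for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text: str) -> int:
    """Count sockets in TIME_WAIT in the text of /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # fields: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT_CODE:
            count += 1
    return count


sample = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000
   1: 0100007F:C350 0100007F:1F90 06 00000000:00000000
   2: 0100007F:C351 0100007F:1F90 06 00000000:00000000
"""
print(count_time_wait(sample))  # prints 2
```

On a Linux machine the real input would come from reading `/proc/net/tcp` (and `/proc/net/tcp6` for IPv6 sockets), and the result can be compared with the value of tcp_max_tw_buckets.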

When recycle is disabled and reuse is used instead, a high-concurrency server reuses TIME_WAIT sockets but does not reclaim the earlier connections ahead of time, so the number of TIME_WAIT sockets stays at a steady, high level. If the local port range is set to 30,000 ports, tcp_max_tw_buckets becomes critical: at the default of 4096, the number of TIME_WAIT sockets quickly hits the limit and cannot grow any further

You should set tcp_max_tw_buckets to at least the size of the local port range. A large number of TIME_WAIT sockets does hold memory for a long time, but the amount consumed is tolerable on current servers
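The rule of thumb above can be checked with a few lines of Python. This sketch assumes the port range is inclusive at both ends and takes the two sysctl values as arguments, so it runs anywhere; the function names are hypothetical.

```python
def port_range_size(ip_local_port_range: str) -> int:
    """Number of ports in a range such as '32768 60999' (both ends inclusive)."""
    low, high = map(int, ip_local_port_range.split())
    return high - low + 1


def tw_buckets_sufficient(max_tw_buckets: int, ip_local_port_range: str) -> bool:
    """True if tcp_max_tw_buckets covers at least the whole local port range."""
    return max_tw_buckets >= port_range_size(ip_local_port_range)


print(port_range_size("32768 60999"))              # prints 28232
print(tw_buckets_sufficient(4096, "32768 60999"))  # prints False
```

On a real server the two inputs would be the contents of /proc/sys/net/ipv4/ip_local_port_range and /proc/sys/net/ipv4/tcp_max_tw_buckets.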

Conclusion:

Optimizing TIME_WAIT is not just a matter of enabling reuse: reuse only works as intended while tcp_max_tw_buckets has not overflowed. Increase tcp_max_tw_buckets appropriately to handle the concurrency