preface

As one of the most commonly used transport-layer protocols, TCP provides reliable data transmission over unreliable channels: as long as one layer in the stack guarantees reliability, end-to-end transmission can be reliable even if the layers beneath it are not. In practice, almost all HTTP traffic is carried over TCP, so optimizing TCP is a key part of optimizing web performance. And to optimize TCP, you first need to understand how it works.

Three-way handshake

It is well known that establishing a TCP connection takes a three-way handshake: the client sends a SYN, the server replies with a SYN-ACK, and the client answers with an ACK.

The three-way handshake adds noticeable latency to TCP, but it is essential. Without it, a stale or duplicated connection request might suddenly reach the server; the server would take it for a new connection initiated by the client and send a confirmation agreeing to establish it. The client would never respond, leaving the server waiting on a half-open connection and wasting its resources.

Since the handshake itself is unavoidable, we can only reduce how often it happens by reusing TCP connections. HTTP/1.1 introduced persistent connections: adding Connection: keep-alive to the request header tells the server not to close the connection after the response completes. Persistent connections have limits, though. The server usually configures a keep-alive timeout and a maximum request count; once the connection idles past the timeout or serves more than the maximum number of requests, the server actively closes it.
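As a small illustration of connection reuse, the sketch below uses Python's standard http.client against a throwaway local server (the server and all names are purely illustrative). Two requests sent over one persistent connection come from the same local TCP port, i.e. the same underlying socket, so only one handshake was paid:

```python
import http.client
import http.server
import socketserver
import threading

# Minimal local HTTP/1.1 server so the example is self-contained.
class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 defaults to keep-alive
    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # keep the demo quiet
        pass

srv = socketserver.TCPServer(("127.0.0.1", 0), Handler)
port = srv.server_address[1]
threading.Thread(target=srv.serve_forever, daemon=True).start()

# One HTTPConnection keeps the underlying TCP socket open between requests.
conn = http.client.HTTPConnection("127.0.0.1", port)
ports = []
for _ in range(2):
    conn.request("GET", "/")
    resp = conn.getresponse()
    resp.read()  # the body must be drained before the connection can be reused
    ports.append(conn.sock.getsockname()[1])  # local port of the TCP socket

print(ports[0] == ports[1])  # True: both requests rode the same TCP connection
conn.close()
srv.shutdown()
srv.server_close()
```

If the server answered with Connection: close instead, http.client would have to open a fresh socket (and pay a fresh handshake) for the second request.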

In addition, TFO (TCP Fast Open) was designed to optimize the handshake. It authenticates a client that has connected before using a TFO cookie (a TCP option) carried in the SYN packet at the start of the handshake. If validation succeeds, the server can accept data carried with the SYN and begin responding before the final ACK of the three-way handshake arrives.

Linux kernels 3.7 and later support TFO on both the client and the server. On mobile, TFO is supported on Android and on iOS 9+, although it is not enabled by default on iOS.

PS: installing Wireshark is recommended; it lets you observe the three-way handshake directly.

Flow control

Flow control is a mechanism that prevents the sender from overwhelming the receiver with data. Its main purpose is to keep the receiving end from being overloaded and dropping packets. To implement it, each side of a TCP connection advertises its own receive window (RWND), which states how much data its buffer can currently accept. If one side cannot keep up with the other, it advertises a smaller window. If the window size drops to 0, the sender must stop transmitting until the receiving application drains the buffer and the window reopens. This is known as the sliding window protocol.
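The mechanics above can be sketched as a toy model. This is not a real TCP implementation; the class and the 64 KB figure are illustrative, chosen to match the classic window size discussed below:

```python
BUF_SIZE = 65_535  # receiver's buffer capacity in bytes (the classic 16-bit window)

class Receiver:
    """Toy model of the receive-window (RWND) bookkeeping."""
    def __init__(self):
        self.buffered = 0  # bytes waiting for the application to read

    def rwnd(self):
        # advertised window: free space left in the buffer
        return BUF_SIZE - self.buffered

    def deliver(self, nbytes):
        # the sender may never have more unacknowledged data than RWND allows
        accepted = min(nbytes, self.rwnd())
        self.buffered += accepted
        return accepted

    def app_read(self, nbytes):
        # the application draining the buffer reopens the window
        self.buffered -= min(nbytes, self.buffered)

r = Receiver()
r.deliver(60_000)   # a fast sender fills most of the buffer
print(r.rwnd())     # 5535: the receiver now advertises a much smaller window
r.deliver(10_000)   # only 5535 of these bytes fit; the rest must wait
print(r.rwnd())     # 0: the sender must pause entirely
r.app_read(30_000)  # the application catches up...
print(r.rwnd())     # 30000: the window reopens
```

The advertised window travels in every ACK, so the sender always has a recent estimate of how much it may still put on the wire.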

You may often find yourself with a 100 Mbit/s broadband connection whose actual download speed is only a few megabits per second. One possible cause is an improperly sized receive window (RWND). The original TCP specification allotted a 16-bit field to the receive window size, capping it at 64 KB (2 to the power of 16 bytes). In fact, RWND should be sized according to the BDP (bandwidth-delay product): BDP (bits) = bandwidth (bit/s) * round-trip time (s). For example, on a 100 Mbit/s link with an RTT of 100 ms, BDP = (100 / 8) MB/s * 0.1 s = 1.25 MB. To maximize throughput in that case, RWND should be 1.25 MB, far above the 64 KB cap.

To address this limit, TCP window scaling was introduced; it raises the maximum window size from 2^16 bytes (64 KB) to as much as 2^30 bytes (1 GB). A buffer-size tuning mechanism is available on Linux; you can view the receive-buffer settings with the following command:

$ sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 87380 6291456

The three values are the minimum, default, and maximum receive-buffer sizes, from left to right.

Slow start

A flow-control mechanism prevents overload between the sender and the receiver, but it cannot prevent either end from overloading the network between them. Therefore, an estimation mechanism is needed that dynamically adapts the transmission rate to network conditions. This is why slow start exists.

Slow start adds a second window on the sender's side of a TCP connection: the congestion window, denoted CWND. When a connection is established with a host on another network, CWND is initialized to one TCP segment, and it grows by one segment for every ACK received. The sender uses the minimum of CWND and RWND as its transmission cap. Put another way, the congestion window is flow control imposed by the sender, while the receive window is flow control imposed by the receiver.

At first, CWND is 1, and the sender transmits a single segment of one MSS (maximum segment size). When its ACK arrives, CWND increases to 2 and two segments go out; when those are acknowledged, CWND becomes 4, and so on. After every RTT, CWND is twice what it was before, so CWND grows exponentially.
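The doubling, and the fact that the sender is capped by min(CWND, RWND), can be simulated in a few lines. The function and its default values are illustrative, not taken from any real stack:

```python
def cwnd_per_rtt(rtts, init_cwnd=1, rwnd_segments=64):
    """Segments the sender may have in flight during each RTT of slow start."""
    cwnd = init_cwnd
    history = []
    for _ in range(rtts):
        history.append(min(cwnd, rwnd_segments))  # sender obeys min(CWND, RWND)
        cwnd *= 2  # every segment ACKed grows CWND by 1, i.e. doubling per RTT
    return history

print(cwnd_per_rtt(8))  # [1, 2, 4, 8, 16, 32, 64, 64]: growth stops at RWND
```

Note how the receiver's window, not congestion, becomes the cap in the last rounds; in a real transfer either limit can bind first.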

As CWND grows, at some point the network becomes overloaded and packets are lost. Once loss is detected, CWND is cut back sharply (multiplicative decrease).

To reduce the number of round trips, the size of the initial congestion window matters. By default (RFC 3390), the initial CWND is at most four MSS. Google proposed raising the initial window to 10 MSS, a change later standardized in RFC 6928: according to Google's research, 90% of HTTP payloads are under 16 KB, which is roughly 10 MSS.
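A rough back-of-the-envelope calculation shows why this helps. The sketch below counts the RTTs slow start needs to push a response of a given size, assuming an MSS of 1460 bytes and doubling every round; the numbers are an illustration, not a measurement:

```python
MSS = 1460  # assumed maximum segment size in bytes

def rtts_needed(total_bytes, init_cwnd):
    """RTTs for slow start to deliver total_bytes, starting from init_cwnd segments."""
    segments = -(-total_bytes // MSS)  # ceiling division
    cwnd, sent, rtts = init_cwnd, 0, 0
    while sent < segments:
        sent += cwnd  # one window's worth of segments per RTT
        cwnd *= 2     # slow start doubles the window each RTT
        rtts += 1
    return rtts

# A typical ~14 KB response (about 10 MSS, matching the Google figure above):
print(rtts_needed(14_000, 4))   # 2 RTTs with the old default of 4 MSS
print(rtts_needed(14_000, 10))  # 1 RTT with an initial window of 10 MSS
```

For small responses, which dominate web traffic, the larger initial window removes an entire round trip, which on a 100 ms RTT is 100 ms saved per connection.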

conclusion

This article introduced how TCP works, covering the three-way handshake, flow control, and slow start, and described optimizations such as TCP Fast Open, window scaling, and enlarging the initial congestion window. The material is somewhat theoretical; applying these optimizations well takes plenty of hands-on practice.


References:

  1. The Definitive Guide to Web Performance
  2. TCP/IP Illustrated, Volume 1: The Protocols
  3. Discussion on TCP Optimization
  4. CWND growth problem in TCP slow start? (answer on Zhihu)