Recently I read “Computer Networking: A Top-Down Approach” and “TCP/IP Illustrated, Volume 1: The Protocols”. I have sorted out some things that I could not learn from blogs before, and I would like to share them here. The three-way handshake and the four-way wave are covered too, but the focus is on the new knowledge I learned from the books.


TCP segment structure

As shown in Figure 3-29 of the Top-Down book:

  • The source and destination port numbers are used for multiplexing/demultiplexing
  • The 32-bit sequence number field and the 32-bit acknowledgment number field are used for reliable data transfer
  • The 16-bit receive window field is used for flow control
  • The 4-bit header length field gives the length of the TCP header in 32-bit words. The header length is variable because of the options field. The options field is usually empty, so the TCP header is usually 20 bytes long
  • The options field is used to negotiate the maximum segment size (MSS) between the sender and the receiver, or to carry the window scale factor on high-speed networks
  • Flag bits: CWR (congestion window reduced), ECE (ECN echo), URG (urgent), ACK (acknowledgment), PSH (push), RST (reset the connection), SYN (synchronize sequence numbers, used to initialize a connection), FIN (the sender of this segment has finished sending data to its peer)
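To make the field layout concrete, here is a minimal sketch that decodes the fixed 20-byte part of a TCP header with Python's struct module. The function name parse_tcp_header and the dictionary keys are my own choices for illustration; options are not parsed.

```python
import struct

# Flag names ordered from the lowest bit (FIN) to the highest bit (CWR).
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"]

def parse_tcp_header(data: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    # The top 4 bits hold the header length, counted in 32-bit words.
    header_len = (offset_flags >> 12) * 4
    flags = [name for bit, name in enumerate(FLAG_NAMES) if offset_flags & (1 << bit)]
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": flags,
        "window": window,
    }

# A hand-built SYN segment header: header length 5 words, only the SYN bit set.
example = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(example))
```

A header with no options reports a header length of 20 bytes, which matches the "usually 20 bytes" remark above.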

TCP connection establishment and release

Three-way handshake

First handshake: The client sends a TCP connection request with SYN set to 1 and enters the SYN_SENT state. ISN(c) is the client’s initial sequence number.

Second handshake: The server starts in the LISTEN state. After receiving the connection request and agreeing to establish the connection, it sends a confirmation segment and enters the SYN_RCVD state. The SYN and ACK bits are both set to 1, and ISN(s) is the server’s initial sequence number. The ack value is the client’s initial sequence number + 1, because ack means “the sequence number I expect to receive next”.

Third handshake: The client sends a normal ACK segment with ACK set to 1. seq = ISN(c) + 1, because this is the second segment sent by the client. ack = ISN(s) + 1 is the sequence number of the next segment expected from the server. After sending it, the client enters the ESTABLISHED state. The server also enters ESTABLISHED after receiving it.

The point of the last handshake: if a connection took only two handshakes, a delayed connection request from an earlier attempt that arrives late could cause the server to open many invalid connections.

The purpose of the three-way handshake is not only to let both parties know that a connection is being established, but also to exchange Initial Sequence Numbers (ISNs) and to use segment options to carry special information.
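The seq and ack values of the three handshake segments can be written down as a small sketch. The function name and the tuple layout (sender, flags, seq, ack) are hypothetical, chosen only for illustration; sequence numbers wrap around at 2^32.

```python
MOD = 2 ** 32  # sequence numbers are 32-bit and wrap around

def three_way_handshake(isn_c: int, isn_s: int) -> list:
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", isn_c, None)                      # first handshake
    syn_ack = ("server", "SYN+ACK", isn_s, (isn_c + 1) % MOD)  # second handshake
    ack = ("client", "ACK", (isn_c + 1) % MOD, (isn_s + 1) % MOD)  # third handshake
    return [syn, syn_ack, ack]

for segment in three_way_handshake(isn_c=100, isn_s=300):
    print(segment)
```

With ISN(c) = 100 and ISN(s) = 300, the server acknowledges 101 and the client acknowledges 301.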

Four-way wave

The picture here is from an online course, because the instructor drew it very clearly

First, both the client and the server are in the ESTABLISHED state

First wave: The client sends a connection release segment with FIN=1, ACK=1, seq=u (the last sequence number it sent, plus 1) and ack=v (the last sequence number it received, plus 1). TCP stipulates that a FIN=1 segment consumes one sequence number even if it carries no data. The client enters the FIN_WAIT_1 state

Second wave: The server sends an ordinary acknowledgment segment to the client with ACK=1, seq=v, ack=u+1, and enters the CLOSE_WAIT state

The server-to-client direction also needs to be closed, so the waving continues

Third wave: The server sends a connection release segment to the client with FIN=1, ACK=1, seq=w, ack=u+1, and enters the LAST_ACK state

Fourth wave: The client sends an ordinary acknowledgment segment to the server with ACK=1, seq=u+1, ack=w+1, and enters the TIME_WAIT state

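What TIME_WAIT is for: it guards against the fourth wave (the final ACK) being lost. If that ACK is lost, the server retransmits the third wave, and a client that is still in TIME_WAIT can acknowledge it again. Without TIME_WAIT, the server would resend the third wave repeatedly, wasting resources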
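The four waves can be summarized as a list of segments using the same u, v, w notation. This is only a sketch: the function name and tuple layout are hypothetical, and the last element of each tuple notes the state the sender enters after sending.

```python
def four_way_wave(u: int, v: int, w: int) -> list:
    """Return the four teardown segments as (sender, flags, seq, ack, next_state)."""
    return [
        ("client", "FIN+ACK", u, v, "FIN_WAIT_1"),      # first wave
        ("server", "ACK", v, u + 1, "CLOSE_WAIT"),      # second wave
        ("server", "FIN+ACK", w, u + 1, "LAST_ACK"),    # third wave
        ("client", "ACK", u + 1, w + 1, "TIME_WAIT"),   # fourth wave
    ]

for segment in four_way_wave(u=100, v=300, w=350):
    print(segment)
```

Note that w can be larger than v, because the server may still send data between the second and third waves.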

About SEQ and ACK

Initial sequence number

seq is the sequence number of each segment, and it keeps increasing as data is exchanged. Is the sequence number of the first segment 0 or 1? In fact, it is neither: before sending the SYN that establishes the connection, each of the two communicating parties picks an initial sequence number. The initial sequence number changes over time, so each connection has a different one.

If the sequence number started at the same value every time, then when two successive connections use the same source and destination IP addresses and port numbers, the later connection might accept a delayed TCP segment from the earlier connection as valid (when it should be discarded as stale).

In Linux, a clock-based scheme is used, with a random offset added to the clock for each connection. The offset is computed with a cryptographic hash function over the connection identity, which is the 4-tuple (IP addresses and port numbers of both parties).
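The idea of "clock plus a per-connection hashed offset" can be sketched as follows. This is not Linux's actual code: the function name, the choice of SHA-256, the secret key parameter, and the 4-microsecond clock tick (the value suggested in the TCP RFCs) are all assumptions made for illustration.

```python
import hashlib
import time

def initial_sequence_number(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                            secret: bytes, now_us: int = None) -> int:
    """Sketch of a clock-based ISN with an offset hashed from the 4-tuple."""
    if now_us is None:
        now_us = time.time_ns() // 1000          # current time in microseconds
    clock = now_us // 4                          # assumed: one tick every 4 microseconds
    identity = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(secret + identity).digest()
    offset = int.from_bytes(digest[:4], "big")   # per-connection offset
    return (clock + offset) % 2 ** 32

print(initial_sequence_number("10.0.0.1", 50000, "10.0.0.2", 80, secret=b"demo-key"))
```

For a fixed 4-tuple the ISN advances with the clock, while different 4-tuples get different offsets that an outsider cannot predict without the secret.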

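The value of ack

The ack value is the sequence number of the next byte I expect to receive. For a SYN or FIN segment without data, that is “seq + 1 of the segment I just received”; for a data segment, it is the segment’s seq plus the length of its data.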
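As a small worked example, the ack value can be computed from the segment just received: each data byte consumes one sequence number, and SYN and FIN each consume one as well. The function name next_ack is hypothetical.

```python
def next_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Return the ack value to send after receiving a segment in order."""
    consumed = payload_len + int(syn) + int(fin)
    return (seq + consumed) % 2 ** 32

print(next_ack(100, 0, syn=True))   # a SYN with seq=100 is acknowledged with 101
print(next_ack(101, 500))           # 500 data bytes starting at 101 are acknowledged with 601
```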

SYN flood attack

See the books for details

TCP timeout and retransmission

When TCP sends data, it starts a timer. If no acknowledgment arrives before the timer expires, a timeout (timer-based) retransmission is triggered. The timer’s duration is called the retransmission timeout (RTO). The other kind of retransmission is fast retransmission, which usually happens without waiting for the timer. If the cumulative acknowledgments stop advancing (the ACKs that come back acknowledge nothing new), or if an ACK carries selective acknowledgment (SACK) information showing that out-of-order segments exist, fast retransmission infers that a packet has been lost.

  • RTT: round-trip time
  • RTO: retransmission timeout

RTO is estimated from RTT. There are several ways to do this, which I won’t go into here.

Timer-based retransmission

If TCP does not receive an ACK for a timed segment within the connection’s RTO, a timeout retransmission is triggered. When this happens, TCP responds by reducing its current sending rate. There are two ways it does so:

  1. Reduce the size of the sending window through congestion control
  2. Each time the same segment is retransmitted, increase the RTO backoff factor: RTO is temporarily multiplied by n to form the backed-off timeout, RTO = n * RTO. Generally the RTO is doubled each time
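The second point, the RTO backoff, can be sketched in a few lines. The function name and the 60-second upper bound are assumptions for illustration; real stacks choose their own limits.

```python
def backed_off_rto(base_rto: float, retransmissions: int,
                   factor: float = 2.0, max_rto: float = 60.0) -> float:
    """Return the RTO after a segment has been retransmitted the given number of times."""
    return min(base_rto * factor ** retransmissions, max_rto)

for n in range(5):
    print(n, backed_off_rto(1.0, n))
```

Starting from a 1-second RTO, successive retransmissions of the same segment wait 2, 4, 8, and 16 seconds, until the assumed cap is reached.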

Fast retransmission

In short: TCP performs fast retransmission once it receives three duplicate ACKs

Why do duplicate ACKs occur

When the receiver gets a segment whose seq is larger than the one it expects, a segment in between is missing: it may have been lost or reordered. Because TCP has no negative acknowledgment like “I didn’t receive it”, the receiver can only send “I received everything up to the previous segment” again. For example:

I ask “Have you eaten yet?” and you say “I ate”. Then I ask “What did you eat?” and you still say “I ate”. If you repeat “I ate” three times, giving the wrong answer three times, then you never received my follow-up question.

Because the usual cause of three duplicate ACKs is packet loss, and packet loss is usually due to network congestion, TCP also triggers congestion control when fast retransmission is triggered (described later).
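Here is a minimal simulation of how duplicate ACKs arise and how the sender reacts. It assumes fixed-size segments identified by their starting seq, ignores SACK and sequence number wraparound, and uses hypothetical function names.

```python
def receiver_acks(arrivals: list, seg_size: int, first_seq: int = 0) -> list:
    """Return the cumulative ack the receiver sends after each arriving segment."""
    expected = first_seq
    buffered = set()          # out-of-order segments waiting for the gap to fill
    acks = []
    for seq in arrivals:
        if seq >= expected:
            buffered.add(seq)
        while expected in buffered:
            buffered.remove(expected)
            expected += seg_size
        acks.append(expected)
    return acks

def fast_retransmit_seq(acks: list, threshold: int = 3):
    """Return the seq to retransmit once `threshold` duplicate ACKs are seen, else None."""
    last, duplicates = None, 0
    for ack in acks:
        if ack == last:
            duplicates += 1
            if duplicates == threshold:
                return ack
        else:
            last, duplicates = ack, 0
    return None

# The segment starting at seq 100 is lost; 200, 300 and 400 arrive after segment 0.
acks = receiver_acks([0, 200, 300, 400], seg_size=100)
print(acks)                        # the receiver keeps asking for 100
print(fast_retransmit_seq(acks))   # the sender retransmits the segment at 100
```

The first ack of 100 is a normal one; the next three are the duplicates that trigger fast retransmission.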

TCP flow control and window management

First of all, do not confuse flow control with congestion control. Flow control exists because the receiver’s buffer is limited: the sender must not send more than the receiver can buffer, or the buffer overflows. Congestion control exists because the capacity of the network is limited

Sliding windows at both ends

TCP maintains its window structure in bytes

I’m just going to draw two pictures to make it clear

If the window advertised by the receiver drops to zero, the sender sends window probes at regular intervals
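On the sender side, the amount of new data that may still be sent is the advertised window minus the bytes already in flight. A minimal sketch, with hypothetical names and without sequence number wraparound:

```python
def usable_window(last_acked: int, next_to_send: int, advertised_window: int) -> int:
    """Return how many more bytes the sender may send under flow control."""
    in_flight = next_to_send - last_acked      # sent but not yet acknowledged
    return max(advertised_window - in_flight, 0)

print(usable_window(last_acked=1000, next_to_send=1300, advertised_window=500))
print(usable_window(last_acked=1000, next_to_send=1000, advertised_window=0))
```

In the second call the advertised window is zero, so nothing can be sent and the sender has to fall back to window probes.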

Silly window syndrome

See the books for details

TCP congestion control

cwnd (congestion window) represents the transmission capacity of the network. The window W actually usable by the sender is the smaller of the receiver’s advertised window awnd and the congestion window cwnd: W = min(cwnd, awnd)

Let me borrow another picture here

Slow start

The purpose of slow start is to let TCP find a value for cwnd before it uses congestion avoidance to probe for more available bandwidth, and to help TCP establish an ACK clock. Typically, TCP performs slow start when a new connection is established, continues until a packet is lost, and then settles into a steady state under the congestion avoidance algorithm.

The initial window is usually one or two SMSS (sender’s maximum segment size), and it then doubles every round trip

Congestion avoidance

cwnd grows by 1 per round trip until a packet is lost; then set ssthresh = cwnd / 2 and cwnd = 1, and restart the slow start algorithm

Choosing between slow start and congestion avoidance

At any given time a TCP connection runs either slow start or congestion avoidance, never both at once. So what decides which algorithm is used? The slow start threshold, ssthresh

When cwnd < ssthresh, slow start is used; when cwnd > ssthresh, congestion avoidance is used; when they are equal, either may be used
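The switch between the two algorithms can be illustrated with a simplified, textbook-style model that tracks cwnd once per round trip, in units of SMSS. Real TCP grows cwnd per ACK rather than per round trip; the function name and the cap at ssthresh during slow start are simplifying assumptions.

```python
def cwnd_per_rtt(rtts: int, ssthresh: int, initial_cwnd: int = 1) -> list:
    """Return cwnd (in SMSS) at the start of each round trip, assuming no loss."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double every round trip
        else:
            cwnd += 1                        # congestion avoidance: add one SMSS
    return history

print(cwnd_per_rtt(rtts=8, ssthresh=8))
```

With ssthresh = 8 the window grows 1, 2, 4, 8 during slow start and then 9, 10, 11, 12 during congestion avoidance.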

ssthresh, the slow start threshold

ssthresh is not fixed but changes over time. Its main purpose is to remember the last “best” estimate of the operating window at which no packet loss occurred, that is, to record a lower bound on the estimate of TCP’s optimal window

ssthresh = max(flight size / 2, 2 * SMSS), where flight size is the amount of data sent but not yet acknowledged

The initial value of ssthresh can be set arbitrarily; slow start and congestion avoidance will eventually find a suitable value. For example, if it starts at 2 and packet loss occurs when cwnd is 10, ssthresh becomes 5.

Fast recovery

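Unlike the timeout case (ssthresh = cwnd / 2, cwnd = 1), fast recovery sets ssthresh = cwnd / 2 and cwnd = ssthresh; some implementations set cwnd to ssthresh + 3 instead

In other words, cwnd is not cut all the way down to 1, which saves some time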
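The difference between the two reactions to loss can be sketched side by side, in units of SMSS. The function names are hypothetical, the window before the loss stands in for the amount of outstanding data, and the fast recovery branch uses the "ssthresh + 3" variant (the three duplicate ACKs mean three segments have left the network).

```python
def on_timeout(cwnd: int) -> tuple:
    """Loss detected by timer expiry: return (ssthresh, cwnd) and go back to slow start."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, 1

def on_triple_duplicate_ack(cwnd: int) -> tuple:
    """Loss detected by three duplicate ACKs: return (ssthresh, cwnd) for fast recovery."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh + 3

print(on_timeout(10))
print(on_triple_duplicate_ack(10))
```

With cwnd = 10 at the time of loss, a timeout restarts from a window of 1, while fast recovery continues from a window of 8.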

What’s next

Next I’ll look at protocols in other layers, HTTP and HTTPS, then “Understanding Computer Systems”, then “Inside InnoDB Technology”, then “Redis Design and Implementation”.