A small plug for my public account, "YongHao write Cache". I just hope someone reads what I write 🙊

Let’s share some insights

Misunderstandings and False Analogies about the Three-Way Handshake (RFC Interpretation)

TCP uses a three-way handshake, not a two-way or a four-way one, and popular explanations of why are often wrong. For example, the top answer's analogy, with 3.6k upvotes, is wrong:

Three-way handshake:
"Hello, can you hear me?"
"I can hear you. Can you hear me?"
"I can hear you. So, today, blah blah..."

Likewise, this analogy, with 107 upvotes, is wrong:

The handshake, like the military salute, originated as a ritual in which "both sides confirm that the other has no weapon and no malice." (Strictly, it takes four steps for both parties to ask each other for confirmation, but two of them are merged into one step because the same person performs them.)
Jeong-eun holds out his hand and says: Look, I have no weapon in my hand. (SYN)
Trump looks and says: Right, you don't. Then he holds out his own hand and says: See, I have no weapon in my hand either. (SYN)
Jeong-eun looks and says: Well, you seem to be sincere. (ACK)

Both analogies rest on unexamined assumptions. After reading the rest of this article, you will see clearly why they are wrong.

In addition, the fourth edition of "Computer Network" by Xie Xiren says the purpose of the "three-way handshake" is "to prevent an invalid connection request segment from suddenly reaching the server and causing an error". This is only a surface reason and does not get at the essence.

The example given in Xie Xiren's "Computer Network" describes how an "invalid connection request segment" arises: "The first connection request segment sent by the client is not lost, but is held up at some network node for a long time, so that it reaches the server only after the connection has already been released. By then it is an invalid segment. However, on receiving it, the server mistakes it for a new connection request from the client and sends an acknowledgement to the client, agreeing to establish the connection. If the 'three-way handshake' were not used, a new connection would be established as soon as the server sent its acknowledgement. Since the client has not actually requested a connection, it ignores the server's acknowledgement and sends no data to the server. The server, however, believes that a new transport connection has been established and keeps waiting for data from the client, so many of its resources are wasted. The three-way handshake prevents this. In the case above, the client does not acknowledge the server's acknowledgement, and when the server receives no acknowledgement, it knows that the client has not requested a connection."

If you read RFC 793, the TCP specification, you will see why the three-way handshake is necessary: TCP relies on sequence numbers (SEQ) to retransmit and receive reliably, and when a connection is reused it must be able to tell whether an arriving SEQ is fresh or a delayed one from an old connection. The three-way handshake is therefore required so that both parties agree on each other's ISN (initial sequence number).

The following is a detailed reading of the RFC. (The RFC calls units of data segments. In China they are usually referred to as packets.)


The first thing to know is that TCP's reliability is built on SEQ (sequence numbers).

A fundamental notion in the design is that every octet of data sent over a TCP connection has a sequence number. Since every octet is sequenced, each of them can be acknowledged.

The acknowledgment mechanism employed is cumulative so that an acknowledgment of sequence number X indicates that all octets up to but not including X have been received.

A fundamental idea in TCP's design is that every octet (byte) of data sent over a TCP connection has a sequence number. Because every octet is numbered, each of them can be acknowledged.

The acknowledgment mechanism is cumulative: an acknowledgment of sequence number X means that all octets up to but not including X have been received.
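To make the cumulative rule concrete, here is a minimal sketch in Python. The receive buffer (a dict mapping each segment's starting sequence number to its payload) and the function name `cumulative_ack` are assumptions made for illustration, not part of any real TCP API, and sequence-number wraparound is ignored.

```python
def cumulative_ack(next_expected: int, received: dict) -> int:
    """Return the cumulative ACK number: the first octet not yet received in order.

    `received` is a hypothetical receive buffer mapping the starting
    sequence number of each buffered segment to its payload bytes.
    """
    ack = next_expected
    while ack in received:
        # Every octet of data consumes exactly one sequence number.
        ack += len(received[ack])
    return ack


# Segments starting at 101 and 106 arrived; the one starting at 111 is missing,
# so the later segment at 116 cannot be acknowledged yet.
buffer = {101: b"hello", 106: b"world", 116: b"again"}
ack = cumulative_ack(101, buffer)
print(ack)  # 111
```

An ACK of 111 here says "octets 101 through 110 have arrived, and 111 is the next one expected", which is exactly the "up to but not including X" wording above.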

The protocol places no restriction on a particular connection being used over and over again.

The problem that arises from this is — “how does the TCP identify duplicate segments from previous incarnations of the connection?” This problem becomes apparent if the connection is being opened and closed in quick succession, or if the connection breaks with loss of memory and is then reestablished.

TCP places no restriction on reusing a particular connection (that is, the same pair of sockets at the two ends).

This raises a question: how does TCP recognize duplicate segments left over from a previous incarnation of the connection, for example when a connection is opened and closed in quick succession, or breaks and is then reestablished? This calls for a mechanism that makes ISNs unique.

When new connections are created, an initial sequence number (ISN) generator is employed which selects a new 32 bit ISN. The generator is bound to a (possibly fictitious) 32 bit clock whose low order bit is incremented roughly every 4 microseconds. Thus, the ISN cycles approximately every 4.55 hours. Since we assume that segments will stay in the network no more than the Maximum Segment Lifetime (MSL) and that the MSL is less than 4.55 hours we can reasonably assume that ISN’s will be unique.

When a new connection is established, the initial sequence number (ISN) generator selects a new 32-bit ISN.

The generator is bound to a 32-bit clock whose low-order bit increments roughly every 4 µs, so the ISN cycles about every 4.55 hours.

(A 2^32 counter takes 2^32 × 4 µs to wrap. Dividing by the number of µs in one hour gives 2^32 × 4 / (60 × 60 × 1000 × 1000) ≈ 4.77 hours, slightly more than the 4.55 hours the RFC states.)

A segment is assumed to stay in the network no longer than the Maximum Segment Lifetime (MSL), and the MSL is shorter than 4.55 hours, so we can reasonably assume that ISNs are unique.
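The clock-driven scheme can be sketched as follows. This is only an illustration of the scheme as the quoted RFC text describes it: the names `clock_isn`, `cycle_hours`, `TICK_US` and `ISN_SPACE` are made up for this example, and real TCP stacks do not pick ISNs from a bare clock like this.

```python
import time

TICK_US = 4            # the clock's low-order bit ticks roughly every 4 microseconds
ISN_SPACE = 2 ** 32    # ISNs are 32-bit values


def clock_isn(now_us: int) -> int:
    """Map a microsecond timestamp to a 32-bit ISN (hypothetical helper)."""
    return (now_us // TICK_US) % ISN_SPACE


def cycle_hours() -> float:
    """Hours until the 32-bit clock wraps around."""
    return ISN_SPACE * TICK_US / (60 * 60 * 1_000_000)


# An ISN derived from the local monotonic clock; the value differs on every run.
print(clock_isn(time.monotonic_ns() // 1000))
print(round(cycle_hours(), 2))  # 4.77
```

As long as old segments die out within the MSL, and the MSL is shorter than one full cycle of this clock, a new connection's ISN will not collide with sequence numbers still in flight from an earlier one.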

The sender and the receiver each have their own ISN (X and Y in the example below) and must tell each other what it is, as follows:

  1. A --> B  SYN my sequence number is X
  2. A <-- B  ACK your sequence number is X
  3. A <-- B  SYN my sequence number is Y
  4. A --> B  ACK your sequence number is Y

Steps 2 and 3 can be combined into a single message, which is why this is called a three-way handshake.

Therefore, it can be concluded that a three-way handshake is necessary:

A three way handshake is necessary because sequence numbers are not tied to a global clock in the network, and TCPs may have different mechanisms for picking the ISN’s. The receiver of the first SYN has no way of knowing whether the segment was an old delayed one or not, unless it remembers the last sequence number used on the connection (which is not always possible), and so it must ask the sender to verify this SYN. The three way handshake and the advantages of a clock-driven scheme are discussed in [3].

A three-way handshake is necessary because sequence numbers are not tied to a global clock for the whole network (with such a clock, one could determine whether a packet had been delayed), and TCP implementations may use different mechanisms for picking the ISN (initial sequence number).

When the receiver gets the first SYN, it has no way of knowing whether that SYN is an old delayed one, unless it remembers the last sequence number used on the connection (which is not always possible).

In other words: when a SEQ comes in, how can the receiver tell whether it is a fresh one or a delayed one from an earlier connection?

Therefore, the receiver must ask the sender to verify this SYN.

If the SEQ in B's SYN were never acknowledged, the exchange would consist of only:

  1. A --> B  SYN my sequence number is X
  2. A <-- B  ACK your sequence number is X, SYN my sequence number is Y

Here only A's SEQ has been acknowledged by B, while B's SEQ has not been acknowledged by A. That is, only data sent from A to B is reliable, and data sent from B to A is not, so this is not a reliable connection. By the same reasoning, if A only needed to send to B and B never needed to respond, the third step could be omitted.
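The verification step can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a TCP implementation: segments are plain dicts, the function names `server_on_syn` and `client_on_syn_ack` are hypothetical, and the numbers are illustrative. B answers every SYN the same way because it cannot tell a fresh SYN from a stale one. Only A, which knows the ISN it is currently using, can confirm or reject the reply.

```python
def server_on_syn(client_isn: int, server_isn: int) -> dict:
    """B replies to any SYN with SYN+ACK; it cannot tell fresh from stale."""
    return {"flags": "SYN,ACK", "seq": server_isn, "ack": client_isn + 1}


def client_on_syn_ack(segment: dict, current_isn: int) -> dict:
    """A checks that the ACK really covers the SYN it is currently sending."""
    if segment["ack"] == current_isn + 1:
        # The reply matches A's SYN: acknowledge B's ISN to finish the handshake.
        return {"flags": "ACK", "seq": current_isn + 1, "ack": segment["seq"] + 1}
    # The ACK refers to an old SYN that A is no longer sending: reject it.
    return {"flags": "RST", "seq": segment["ack"]}


# A's current ISN is 100, but an old delayed SYN with SEQ=90 reaches B first.
stale = client_on_syn_ack(server_on_syn(client_isn=90, server_isn=300), current_isn=100)
fresh = client_on_syn_ack(server_on_syn(client_isn=100, server_isn=400), current_isn=100)
print(stale["flags"])  # RST
print(fresh["flags"])  # ACK
```

With only two messages, B would have to treat the stale SYN as a real connection. The third message is what lets A's knowledge of its own ISN decide the matter.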

Three-way handshake details

      TCP A                                                TCP B

  1.  CLOSED                                               LISTEN

  2.  SYN-SENT    --> <SEQ=100><CTL=SYN>               --> SYN-RECEIVED

  3.  ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED

  4.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED

  5.  ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

          Basic 3-Way Handshake for Connection Synchronization

                                Figure 7.

In the figure above:

  • In line 2, A sends SEQ 100 with the SYN flag set.
  • In line 3, B sends back ACK 101 and SEQ 300 with both the SYN and ACK flags set (the two steps are merged). Note that ACK 101 means B expects the next segment to start at sequence number 101.
  • In line 4, A sends back a segment with no data, carrying SEQ 101, ACK 301 and the ACK flag. At this point both initial sequence numbers (ISNs), 100 and 300, have been acknowledged.
  • In line 5, A starts sending data. Note that SEQ is still 101 and ACK is still 301, the same as in line 4, because the bare ACK in line 4 occupies no sequence number space and B has sent no new SYN or data that needs to be acknowledged.
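The sequence-number bookkeeping in the figure can be reproduced with a short sketch. It assumes a simplified model in which a SYN consumes one sequence number, each data octet consumes one, a bare ACK consumes none, and every segment is received immediately. The names `consumed` and `handshake` are invented for this example.

```python
def consumed(flags, data=b""):
    """Sequence numbers used by a segment: 1 for SYN, 1 per data octet, 0 for a bare ACK."""
    return int("SYN" in flags) + len(data)


def handshake(isn_a, isn_b, payload):
    """Trace lines 2 to 5 of the figure as (sender, SEQ, ACK, flags) tuples."""
    next_seq = {"A": isn_a, "B": isn_b}
    peer = {"A": "B", "B": "A"}
    trace = []

    def send(sender, flags, data=b""):
        seq = next_seq[sender]
        # An ACK always names the next sequence number expected from the peer.
        ack = next_seq[peer[sender]] if "ACK" in flags else None
        next_seq[sender] += consumed(flags, data)
        trace.append((sender, seq, ack, ",".join(flags)))

    send("A", ["SYN"])          # line 2
    send("B", ["SYN", "ACK"])   # line 3
    send("A", ["ACK"])          # line 4: bare ACK, consumes no sequence number
    send("A", ["ACK"], payload) # line 5: first data, same SEQ as line 4
    return trace


for row in handshake(100, 300, b"hi"):
    print(row)
# ('A', 100, None, 'SYN')
# ('B', 300, 101, 'SYN,ACK')
# ('A', 101, 301, 'ACK')
# ('A', 101, 301, 'ACK')
```

The last two rows carry identical SEQ and ACK values, matching lines 4 and 5 of the figure.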