A common interview question goes: “Why does TCP establish a connection with a three-way handshake, but close a connection with a four-way handshake? Why can’t we connect with two handshakes?” With the spring job-hunting season (“golden March, silver April”) coming up, I looked up the relevant material and put together this article. I hope it helps you.

A TCP connection

First, the basics: what is TCP? The Transmission Control Protocol (TCP) is a connection-oriented protocol that provides reliable end-to-end data transmission. Connection-oriented means that a virtual link is established before any data is sent, and the data then “flows” through that link.

TCP is reliable: it does its best to complete the data transmission. The protocol is relatively complex, as you can see from the following picture of the TCP packet header:

The header carries a lot of information. What matters for establishing and closing connections are the six flag bits in the middle: URG, ACK, PSH, RST, SYN, and FIN. A bit set to 1 means the flag is in effect. Of these six, we focus on three: ACK, SYN, and FIN. Here is what each of them means:

ACK: Acknowledges received data. What has been received is indicated by the acknowledgement number.

SYN: Used as a synchronization signal for establishing a connection.

FIN: Indicates that the sender has no more data to send, which usually means that the established connection is to be closed.
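
To make the flag bits above concrete, here is a minimal Python sketch that packs and unpacks them. The bit positions are the standard ones for the six classic flags in the TCP header; the helper names `decode_flags` and `encode_flags` are made up for this illustration.

```python
# Bit masks of the six classic TCP flags inside the flags byte of the header.
FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(flag_byte):
    """Return the names of the flags whose bit is 1, lowest bit first."""
    return [name for name, mask in FLAGS.items() if flag_byte & mask]


def encode_flags(*names):
    """Combine flag names into a single flags value (hypothetical helper)."""
    value = 0
    for name in names:
        value |= FLAGS[name]
    return value


print(decode_flags(0x12))                 # ['SYN', 'ACK']
print(hex(encode_flags("FIN", "ACK")))    # 0x11
```

A packet with both SYN and ACK set therefore carries the value 0x12 in these bits.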

OK, now that we know the basics of TCP, let’s look at why the handshake takes three steps rather than two or four. To make this easier to understand, here is a vivid and highly upvoted analogy from Zhihu. I hope it helps.

Two and four are both problematic, while three is just right. Hopefully this gives you a better idea of why it is a three-way handshake.

We already know that TCP uses a three-way handshake, but why three? Let’s take a look at the following sequence diagram of the TCP connection.

In short, it is a request, a reply, and a reply to the reply. Let’s go through each step in detail:

  • Step 1: Machine A sends a packet to machine B with SYN set to 1, indicating that it wants to establish a connection. Assume the sequence number in this packet is X.
  • Step 2: After receiving the packet from machine A, machine B recognizes from the SYN flag that this is a request to establish a connection, and sends a response packet with both the SYN and ACK flags set to 1. Assume the sequence number in this packet is Y. Its acknowledgement number must be X + 1 to indicate that A’s SYN was received. In TCP, a SYN is treated as one byte of data.
  • Step 3: After receiving the response packet, A needs to confirm it. A sends a packet with ACK set and the acknowledgement number set to Y + 1, indicating that B’s SYN was received.
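
The three steps above can be sketched as a small Python model of the sequence and acknowledgement numbers. This is only an illustration: the `Segment` type and the `three_way_handshake` function are hypothetical names, and no real packets are sent.

```python
from typing import List, NamedTuple, Optional

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class Segment(NamedTuple):
    flags: tuple
    seq: int
    ack: Optional[int]  # None when the ACK flag is not set


def three_way_handshake(x: int, y: int) -> List[Segment]:
    """Build the three handshake segments for initial sequence numbers X and Y."""
    # Step 1: A -> B, SYN with sequence number X.
    syn = Segment(("SYN",), seq=x, ack=None)
    # Step 2: B -> A, SYN+ACK. The SYN counts as one byte, so B acknowledges X + 1.
    syn_ack = Segment(("SYN", "ACK"), seq=y, ack=(syn.seq + 1) % SEQ_MODULUS)
    # Step 3: A -> B, ACK. A acknowledges B's SYN in the same way, with Y + 1.
    ack = Segment(("ACK",), seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```

With X = 100 and Y = 300, the second packet acknowledges 101 and the third acknowledges 301.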

After these three steps, the two machines are connected and ready to communicate. Why three handshakes? Mainly for two reasons: so that both sides end up with the same information, and to prevent “dirty” connections caused by request timeouts.

The first reason is information parity between the two machines, so that each side knows there is no problem with either machine:

Only after three handshakes can both machines be sure that each of them is able to send and to receive packets.

The second reason is to prevent dirty connections caused by request timeouts, as shown in the following figure:

Why do dirty connections occur? The time to live (TTL) of a network packet usually exceeds the TCP request timeout. Suppose a connection could be created with only two handshakes. A connection request that timed out might reach machine B only after the real connection has finished transferring data and been released. Machine B would take it for a new connection request from machine A, confirm it, and agree to create the connection. Since machine A is not in the SYN_SENT state, it simply discards B’s confirmation, so machine B ends up creating the connection unilaterally.

A three-way handshake solves this problem, because the connection must also be confirmed by machine A.
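
The difference can be played through in a toy Python model. It is a deliberately simplified sketch of the scenario described above, not of a real TCP stack, and the function name `connections_left_on_b` is invented for the example.

```python
def connections_left_on_b(handshake_steps: int) -> int:
    """Count the connections B holds open after a stale SYN from A arrives.

    Assumption of this sketch: A finished and released its real connection
    long ago, so A is not in SYN_SENT and will not confirm B's reply.
    """
    a_state = "CLOSED"
    b_open_connections = 0

    # The delayed SYN finally reaches B, and B answers with SYN+ACK.
    if handshake_steps == 2:
        # Two handshakes: B considers the connection open once it has replied.
        b_open_connections += 1
    elif handshake_steps == 3:
        # Three handshakes: B opens the connection only after A's final ACK,
        # and A sends that ACK only if it is actually waiting in SYN_SENT.
        a_sends_final_ack = a_state == "SYN_SENT"
        if a_sends_final_ack:
            b_open_connections += 1
    else:
        raise ValueError("this sketch only models 2 or 3 handshake steps")

    return b_open_connections


print(connections_left_on_b(2))  # 1: B is stuck with a dirty connection
print(connections_left_on_b(3))  # 0: the stale request is never confirmed
```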

TCP’s four waves

TCP both establishes and closes connections. Closing takes four waves, compared with three handshakes to connect. Take a look at the following scenario:

A: Oh, I don’t want to play anymore.

B: Oh, you don’t want to play, I see.

At this point, only A has said it does not want to play anymore, which means A will send no more data. Can B close the connection as soon as it sends its ACK? Of course not. A may well be done after sending its last data, but B may still have work to finish and can keep sending data in the meantime. This is called the half-closed state.

At this time, A can choose not to receive any more data, or it can keep receiving the remaining data and wait for B to close actively.

B: Oh, well, I don’t play either. Bye.

A: Ok, bye.
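
The exchange above can be tried out with real sockets. The following Python sketch, which assumes that connections over the loopback address 127.0.0.1 are allowed, uses `shutdown(SHUT_WR)` to let A say “I don’t want to play anymore” while B keeps sending in the half-closed state. The function name `half_close_demo` is made up for the example.

```python
import socket


def read_until_eof(sock: socket.socket) -> bytes:
    """Read until recv() returns b"", which means the peer has sent its FIN."""
    chunks = []
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def half_close_demo():
    # Port 0 lets the operating system pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    a = socket.create_connection(listener.getsockname(), timeout=5)
    b, _ = listener.accept()
    b.settimeout(5)
    try:
        a.sendall(b"last data from A")
        a.shutdown(socket.SHUT_WR)        # A: "I don't want to play anymore."

        from_a = read_until_eof(b)        # B learns that A has finished sending

        b.sendall(b"B is still sending")  # half-closed: B may still send data
        b.shutdown(socket.SHUT_WR)        # B: "I don't play either. Bye."

        from_b = read_until_eof(a)        # A kept receiving until B closed
    finally:
        a.close()
        b.close()
        listener.close()
    return from_a, from_b


if __name__ == "__main__":
    print(half_close_demo())
```

Here A chooses the second option from the text: it keeps receiving until B closes its side as well.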

This is a complete closing sequence. Four sentences are spoken during it, which is why we call it four waves. As with establishing a connection, disconnection is tracked through states. Here is the sequence diagram of disconnection:

Let’s analyze the TCP disconnection process with the sequence diagram and scenario above.

When A says “I don’t want to play anymore” (sends a FIN), A enters the FIN_WAIT_1 state. When B receives this message, B replies that it knows (sends an ACK) and enters the CLOSE_WAIT state.

A enters the FIN_WAIT_2 state after receiving B’s “I see”. If B simply runs away at this point, A would stay in this state forever. The TCP protocol itself does not handle this case, but Linux does: you can adjust tcp_fin_timeout to set a timeout, after which A is finally closed.
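
On a Linux machine the current value of this setting can be inspected through the proc filesystem. The following is a small sketch that assumes the usual location `/proc/sys/net/ipv4/tcp_fin_timeout`; the function name `read_fin_timeout` is hypothetical, and it returns `None` on systems where the file does not exist.

```python
from pathlib import Path
from typing import Optional

# Usual location of the setting on Linux; the value is in seconds.
FIN_TIMEOUT_PATH = "/proc/sys/net/ipv4/tcp_fin_timeout"


def read_fin_timeout(path: str = FIN_TIMEOUT_PATH) -> Optional[int]:
    """Return tcp_fin_timeout in seconds, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, no permission, or unexpected file content.
        return None


print(read_fin_timeout())
```

Changing the value requires administrator rights and is typically done with the `sysctl` tool rather than from application code.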

If B does not run away and sends its own “I don’t play either” (a FIN), then A, on receiving it, replies with an ACK meaning “I know B doesn’t play either” and leaves the FIN_WAIT_2 state. In principle A could run away now, but what if the last ACK never reaches B? B would resend its “I don’t play either” message, and if A had already run away, B would never receive an ACK. TCP therefore requires A to stay in the TIME_WAIT state for a sufficiently long time at the end: long enough that, if B did not receive the ACK, B’s resent message can reach A and A’s new ACK can still reach B.

Another reason for requiring A to wait in TIME_WAIT is to prevent confusion between connections. Suppose A closed immediately without B knowing. B may have sent many packets before A closed, and if a new application then took over A’s port, it would receive packets that B sent on the previous connection, which would cause confusion. Although such packets are invalid for the new connection, waiting in TIME_WAIT acts as a second layer of insurance: A waits long enough for every packet sent by B to expire before the port is released.
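
The state changes walked through above can be summarized as a small transition table in Python. This is a simplified sketch of the normal closing path only. The state LAST_ACK, which B enters after sending its own FIN, is a standard TCP state that the walkthrough above does not name, and the event labels are made up for readability.

```python
# (current state, event) -> next state, for the normal closing path only.
TRANSITIONS = {
    # A, the side that closes first
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    # A resent FIN from B is acknowledged again while A is still waiting.
    ("TIME_WAIT", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "wait expires"): "CLOSED",
    # B, the side that closes second
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


a_events = ["send FIN", "recv ACK", "recv FIN, send ACK", "wait expires"]
b_events = ["recv FIN, send ACK", "send FIN", "recv ACK"]

print(" -> ".join(walk(a_events)))
print(" -> ".join(walk(b_events)))
```

The extra TIME_WAIT entry is what lets A answer a resent FIN from B instead of having already disappeared.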

That is why TCP uses three handshakes and four waves. I hope this article helps with your study or work. If you found it useful, please give it a like and share it. Thank you.

Finally

Many experienced authors have already published articles about TCP, so please forgive any overlap. Writing original articles takes effort, and I would appreciate your support. If you find mistakes in the article, please point them out. Thank you.

You are welcome to scan the QR code and follow the WeChat official account “Technology blog of Brother Flathead”, where we can learn and make progress together.