The characteristics of TCP

TCP is a connection-oriented, reliable, byte-stream protocol. It implements sequence control and retransmission control to provide reliable transmission. It also provides flow control, congestion control, and other functions that improve network utilization.

TCP sits in the fourth layer of the OSI seven-layer model, the transport layer. IP is in the third layer, the network layer, and ARP is usually placed in the second layer, the data link layer. The unit of data at the second layer is called a frame, at the third layer a packet, and at the fourth layer a segment.

Application data is first wrapped in a TCP segment, the TCP segment is then wrapped in an IP packet, and the IP packet is wrapped in an Ethernet frame. After the frame reaches the peer, each layer parses its own protocol header and hands the remaining data to the higher-layer protocol for processing.
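
The layering described above can be sketched in a few lines of Python. This is only a toy model: the three "headers" here are heavily simplified stand-ins (just ports, addresses, and MAC addresses), and the function names are made up for illustration, not taken from any real network stack.

```python
import struct

def tcp_segment(payload: bytes, src_port: int, dst_port: int) -> bytes:
    # Toy 4-byte "TCP header": only the two ports, not the real 20-byte layout.
    return struct.pack("!HH", src_port, dst_port) + payload

def ip_packet(segment: bytes, src_ip: bytes, dst_ip: bytes) -> bytes:
    # Toy 8-byte "IP header": only the two 4-byte IPv4 addresses.
    return src_ip + dst_ip + segment

def ethernet_frame(packet: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
    # Toy 12-byte "Ethernet header": destination MAC, then source MAC.
    return dst_mac + src_mac + packet

def receive(frame: bytes) -> bytes:
    # Each layer strips its own header and passes the rest upward.
    packet = frame[12:]    # data link layer removes the Ethernet header
    segment = packet[8:]   # network layer removes the IP header
    return segment[4:]     # transport layer removes the TCP header

segment = tcp_segment(b"hello", 12345, 80)
packet = ip_packet(segment, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
frame = ethernet_frame(packet, b"\xaa" * 6, b"\xbb" * 6)
print(receive(frame))
```

The point is only the nesting order: the payload ends up innermost, and the receiver peels the layers off in the reverse order.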

TCP reliable transmission

The diagram above shows TCP establishing communication, including the famous three-way handshake and four-way wave.

In TCP, reliability is provided by sequence numbers (SEQ) and acknowledgements (ACK). When data from the sender arrives at the receiving host, the receiver returns a message confirming that the data has been received. This message is called an ACK.

Why does it take a 3-way handshake to establish a connection and 4 waves to close it?

  • The 3-way handshake mainly initializes the sequence numbers. SYN is short for Synchronize Sequence Numbers: the two parties tell each other their Initial Sequence Number (ISN), which is the x and y in the figure above. These numbers are the starting point for numbering all later data, so that the data delivered to the application layer does not end up out of order because of transmission problems on the network. (TCP uses the sequence numbers to reassemble the data.)
  • The 4 waves are really 2 if you look closely: because TCP is full-duplex, each side has to send its own FIN and receive an ACK for it. One side is usually passive and closes after the other, so it looks like four waves.
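
The sequence number exchange in the handshake can be written out as a small sketch. The function name and the tuple layout are hypothetical, chosen only for illustration; the arithmetic (each SYN consumes one sequence number, and sequence numbers wrap at 2^32) follows standard TCP behaviour.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around

def three_way_handshake(client_isn: int, server_isn: int):
    """Return the (flags, seq, ack) triples of the three handshake segments.

    ack is None when the ACK flag is not set.
    """
    syn = ("SYN", client_isn, None)
    # The server acknowledges the client's SYN (x + 1) and sends its own ISN (y).
    syn_ack = ("SYN+ACK", server_isn, (client_isn + 1) % SEQ_SPACE)
    # The client acknowledges the server's SYN (y + 1).
    ack = ("ACK", (client_isn + 1) % SEQ_SPACE, (server_isn + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]

for flags, seq, ack in three_way_handshake(client_isn=100, server_isn=300):
    print(flags, seq, ack)
```

With x = 100 and y = 300, the three segments carry (seq 100), (seq 300, ack 101), and (seq 101, ack 301).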

The figure above shows an ideal TCP connection, but a real connection may not go so smoothly, and packets may be lost. TCP relies on its retransmission mechanism to keep data transmission reliable.

Packet loss covers two cases: the data is lost on the way to the receiver, or the receiver gets the data but its acknowledgement is lost on the way back.

Acknowledgement processing, retransmission control, and duplicate control can all be implemented with sequence numbers. A sequence number is assigned, in order, to each byte (8-bit byte) of the data sent. The receiver takes the sequence number from the TCP header of the received segment, together with the length of the data it carries, and returns the sequence number it expects to receive next as the acknowledgement number. In this way, through sequence numbers and acknowledgement numbers, TCP achieves reliable transmission.
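
As a minimal sketch of how a receiver might derive acknowledgement numbers from sequence numbers and data lengths, consider the function below. The name `ack_numbers` and the buffering strategy are assumptions made for illustration; real stacks are far more involved.

```python
def ack_numbers(expected: int, segments):
    """Return the acknowledgement number sent after each arriving segment.

    expected: the next sequence number the receiver is waiting for.
    segments: (seq, length) pairs in arrival order.
    """
    acks = []
    buffered = {}  # out-of-order segments, keyed by their sequence number
    for seq, length in segments:
        if seq >= expected:
            buffered[seq] = length  # new data; older duplicates are ignored
        # Advance over every segment that is now contiguous.
        while expected in buffered:
            expected += buffered.pop(expected)
        acks.append(expected)  # "the next byte I expect"
    return acks

# 1000-byte segments; the one starting at 2001 arrives late.
print(ack_numbers(1, [(1, 1000), (1001, 1000), (3001, 1000), (2001, 1000)]))
# → [1001, 2001, 2001, 4001]
```

The acknowledgement number is always `seq + length` of the last in-order byte range, so a gap makes the receiver repeat the same number until the gap is filled.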

Use window control to increase speed

TCP transfers data in units of segments, and the largest amount of data one segment can carry is the Maximum Segment Size (MSS). The MSS is settled between the two hosts during the three-way handshake: each side writes an MSS option into the TCP header of its SYN to tell the peer the MSS that suits its own interface, and the smaller of the two values is then put into use.
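
A small sketch of the arithmetic, assuming IPv4 and TCP headers without options (20 bytes each). The helper names are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options
TCP_HEADER_BYTES = 20   # assumption: no TCP options

def mss_from_mtu(mtu: int) -> int:
    # The MSS an interface would typically announce for a given MTU.
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES

def effective_mss(local_mss: int, peer_mss: int) -> int:
    # The smaller of the two announced values is put into use.
    return min(local_mss, peer_mss)

def split_into_segments(data: bytes, mss: int):
    # Cut the byte stream into pieces of at most one MSS each.
    return [data[i:i + mss] for i in range(0, len(data), mss)]

mss = effective_mss(mss_from_mtu(1500), 1460)
print(mss, [len(s) for s in split_into_segments(b"x" * 3000, mss)])
# → 1460 [1460, 1460, 80]
```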

If TCP acknowledges every segment and waits for each acknowledgement before sending the next segment, there is a drawback: the longer the round-trip time of a packet, the lower the communication performance.

To solve this problem, TCP introduced the concept of a window. Acknowledgements are no longer handled one segment at a time, but in larger units. That is, the sending host does not have to wait for an acknowledgement after sending a segment, but continues to send.

The window size is the maximum amount of data that can be sent without waiting for an acknowledgement. This mechanism uses a large buffer so that multiple segments can be outstanding and acknowledged at the same time.
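
To see why the window matters, here is a rough upper bound on throughput: at most one window of data can be delivered per round trip. This sketch assumes no loss and ignores header overhead and link speed; the function name is made up for illustration.

```python
def max_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    # At most one full window can be in flight per round-trip time.
    return window_bytes * 8 * 1000 / rtt_ms

# Waiting for an ACK after every 1460-byte segment, with a 100 ms round trip:
print(max_throughput_bps(1460, 100))   # → 116800.0 bits per second

# Sending a 65535-byte window before waiting, same round trip:
print(max_throughput_bps(65535, 100))  # → 5242800.0 bits per second
```

With the same round-trip time, the larger window raises the bound by roughly a factor of 45.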

Window control and retransmission control

The acknowledgement is lost

In this case, the data has reached the peer and does not need to be retransmitted. Without window control, however, any data whose acknowledgement does not arrive is retransmitted. With window control, the data need not be resent even if some acknowledgements are lost.

As shown in the figure above, even if the acknowledgements for 3001 and 4001 are lost on the way back, once the client receives the acknowledgement for 5001 it continues to send the following data without resending anything, because an acknowledgement for 5001 confirms all the data before it.

Segment loss

As shown in the figure above, when a segment is lost, the sender keeps receiving acknowledgements with the number 1001. So when the window is large and a segment is lost, acknowledgements with the same number are returned again and again. If the sending host receives the same acknowledgement as a duplicate three times in a row, it resends the corresponding data. This mechanism is known as fast retransmit, because it recovers more quickly than the timeout-based retransmission mentioned earlier.
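
The duplicate-acknowledgement rule can be sketched as a small counter on the sender side. This sketch counts duplicates after the first copy of an acknowledgement, as standard TCP implementations do; the function name and the list-based interface are assumptions for illustration.

```python
DUP_ACK_THRESHOLD = 3  # duplicates of the same ACK that trigger a resend

def fast_retransmits(acks):
    """Return the sequence numbers the sender would resend early."""
    resend = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                # The receiver keeps asking for this byte: resend from here.
                resend.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return resend

# The segment starting at 1001 is lost; later segments each trigger ACK 1001.
print(fast_retransmits([1001, 1001, 1001, 1001, 5001]))  # → [1001]
```

The sender does not have to wait for the retransmission timer: the repeated number itself signals which data is missing.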

Flow control

TCP provides a mechanism for the sender to control the amount of data it sends according to the actual receiving capability of the receiver. This is called flow control. Specifically, the receiving host tells the sending host how much data it can accept, and the sending host sends no more than this limit. This limit is the window size mentioned above.

The TCP header has a dedicated field for announcing the window size. The receiving host uses this field to tell the sender how much buffer space it has available. The larger the value in this field, the more data can be in flight at once, and the higher the achievable throughput.

As shown in the figure, when the receiving end receives data starting from 3001, its buffer is full and it has to temporarily stop receiving data. After that, communication can continue only once the sender receives a window update notification from the receiver.
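
The following toy simulation illustrates the idea, under the simplifying assumption that the sender learns one advertised window per round and sends as much as that window allows. The function name and the round-based model are hypothetical.

```python
def send_with_flow_control(total_bytes: int, advertised_windows):
    """Return how many bytes the sender transmits in each round.

    advertised_windows: the window size announced by the receiver
    before each round (0 means the receive buffer is full).
    """
    sent_per_round = []
    remaining = total_bytes
    for window in advertised_windows:
        sendable = min(remaining, window)  # never exceed the announced window
        sent_per_round.append(sendable)
        remaining -= sendable
        if remaining == 0:
            break
    return sent_per_round

# The receiver's buffer fills up after the first 3000 bytes, then frees space.
print(send_with_flow_control(5000, [3000, 0, 2000]))  # → [3000, 0, 2000]
```

In the round where the window is 0, the sender transmits nothing and resumes only after a non-zero window is announced.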

Congestion control

To prevent network congestion caused by sending a large amount of data at the very start of communication, TCP limits how much it sends at first, using a value calculated by the slow start algorithm.

First, to adjust the amount of data the sender puts on the network, a concept called the “congestion window” is defined. In slow start, the congestion window is set to 1 segment (1 MSS), and it is then increased by 1 each time an acknowledgement is received. When sending, the congestion window is compared with the window advertised by the receiving host, and the sender sends no more than the smaller of the two.
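
A minimal sketch of this growth, counting in whole segments and assuming no loss, one acknowledgement per segment, and no slow start threshold (which real TCP uses to end the exponential phase). The function name is made up for illustration.

```python
def slow_start(rounds: int, receiver_window: int):
    """Return the number of segments sent in each round trip."""
    cwnd = 1  # congestion window starts at 1 segment (1 MSS)
    sent_per_round = []
    for _ in range(rounds):
        # Send no more than the smaller of the two windows.
        sending = min(cwnd, receiver_window)
        sent_per_round.append(sending)
        # Each segment sent is acknowledged, and each ACK adds 1 to cwnd.
        cwnd += sending
    return sent_per_round

print(slow_start(5, receiver_window=100))  # → [1, 2, 4, 8, 16]
print(slow_start(5, receiver_window=6))    # → [1, 2, 4, 6, 6]
```

Adding 1 per acknowledgement doubles the window every round trip, which is why "slow" start ramps up quickly until the receiver's window (or, in real TCP, a loss) caps it.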

TCP header format

Let’s look at the format of the TCP header

Note the following points:

  • TCP segments carry no IP addresses (those belong to the IP layer), but they do carry a source port and a destination port.
  • A TCP connection is identified by a 4-tuple (src_ip, src_port, dst_ip, dst_port), or strictly a 5-tuple once the protocol is included. Since only TCP is discussed here, the 4-tuple is enough.
  • The Sequence Number is the sequence number of the segment, which is used to solve the reordering problem.
  • The Acknowledgement Number is the ACK: it confirms receipt and is used to solve the packet loss problem.
  • The Window, also known as the advertised window or sliding window, is used for flow control.
  • The TCP flags indicate the segment type and are mainly used to drive the TCP state machine.
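
The fields listed above can be pulled out of the fixed 20-byte part of a TCP header with Python's `struct` module. This is a sketch that ignores TCP options; the function name and the returned dictionary keys are chosen for illustration.

```python
import struct

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(data: bytes) -> dict:
    # Fixed header: ports (2+2), seq (4), ack (4), offset/flags (2),
    # window (2), checksum (2), urgent pointer (2) = 20 bytes, big-endian.
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # The top 4 bits give the header length in 32-bit words.
        "header_length": (offset_flags >> 12) * 4,
        "flags": [name for name, bit in FLAG_BITS.items() if offset_flags & bit],
        "window": window,
    }

# A hand-built SYN segment header: port 12345 -> 80, seq 1000, window 65535.
raw = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(raw))
```

Note that no IP address appears anywhere in the parsed result; the other half of the 4-tuple has to come from the IP header.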
