Author: Sister Yuan

TCP is a transport-layer protocol. It sits between the application and the network layer: it takes data from the application and hands it down to the network layer for delivery, and it passes data arriving from the network layer up to the application.

TCP features (feature: description):

  • Connection-oriented transport-layer protocol: A TCP connection is a virtual, logical connection, not a physical one. Why connect at all? Because a connection lets both ends keep state about the segments that follow: which ones have arrived and which ones were lost.

  • End-to-end connection: 1. The transport layer gives applications logical end-to-end (port-to-port) communication. 2. End-to-end versus point-to-point: point-to-point describes the data link layer, which handles communication between two directly connected nodes and forwards each frame to the next adjacent node by MAC address. End-to-end describes the transport layer, which is only responsible for delivering upper-layer data from the sender to the receiver. It does not care how many nodes lie in between, only about the sender and the receiver.

  • Reliable delivery: How is this achieved? See below.

  • Full-duplex communication: What is full duplex? Both parties can send and receive at the same time without interfering with each other. Half duplex, by contrast, means both parties can send, but only one direction can be active at a time.

  • Byte-stream oriented: The data is transmitted as a stream of bytes.
  1. The sequence number and acknowledgment number fields in the TCP header are each 4 bytes (32 bits), so they take values in [0, 2^32 - 1] and can number 4 GiB of data before wrapping around.
  2. Sequence number: the number of the first data byte carried in the segment being sent.
  3. Acknowledgment number: the sequence number of the next byte the receiver expects. In other words, every byte up to (acknowledgment number - 1) has been received correctly.
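The relationship between the two numbers can be sketched in a few lines of Python. This is only an illustration: the function names `next_ack` and `last_byte_confirmed` are made up for this example and are not part of any TCP API.

```python
SEQ_SPACE = 2 ** 32  # sequence and acknowledgment numbers are 32-bit values


def next_ack(seq, payload_len):
    """Acknowledgment number returned for an in-order segment.

    seq is the number of the first byte in the segment, so the next byte
    the receiver expects is seq + payload_len (wrapping at 2^32).
    """
    return (seq + payload_len) % SEQ_SPACE


def last_byte_confirmed(ack):
    """Number of the last byte that an acknowledgment number confirms."""
    return (ack - 1) % SEQ_SPACE


print(next_ack(1000, 500))             # 1500
print(last_byte_confirmed(1500))       # 1499
print(next_ack(SEQ_SPACE - 100, 300))  # 200, the numbering wrapped around
```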

The three-way handshake and the four-way wave below show what the sequence number and acknowledgment number are for.

Three-way handshake and four-way wave:

Three-way handshake process

  1. The client picks a random initial sequence number seq = x, sets the SYN flag to 1 (marking the segment as a connection request), and sends the segment to the server.
  2. The server replies with a segment that has SYN = 1 and ACK = 1 (the acknowledgment number is only valid when the ACK flag is 1), ack = x + 1 (confirming that the segment with sequence number x was received), and its own random initial sequence number seq = y.
  3. The client sends a segment with ACK = 1 and ack = y + 1, confirming that it received the server's segment. Once the server receives this segment, the connection is established and the two sides can exchange data.
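The three steps can be written out as a small Python sketch. The `Segment` class and the function name are hypothetical and model only the fields mentioned above, not a real TCP header. The third segment carries seq = x + 1 because the client's SYN consumed sequence number x.

```python
from dataclasses import dataclass

SEQ_SPACE = 2 ** 32  # sequence numbers wrap modulo 2^32


@dataclass
class Segment:
    syn: bool = False       # SYN flag
    ack_flag: bool = False  # ACK flag; the ack field is only meaningful when True
    seq: int = 0            # sequence number
    ack: int = 0            # acknowledgment number


def three_way_handshake(x, y):
    """Return the three handshake segments for initial sequence numbers x and y."""
    first = Segment(syn=True, seq=x)                    # client -> server
    second = Segment(syn=True, ack_flag=True, seq=y,    # server -> client
                     ack=(first.seq + 1) % SEQ_SPACE)
    third = Segment(ack_flag=True, seq=second.ack,      # client -> server
                    ack=(second.seq + 1) % SEQ_SPACE)
    return [first, second, third]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```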

Four-way wave:

  1. The client sends a segment with the FIN flag set to 1 (a request to close the connection) and its current sequence number, seq = x + 2 in this example. Go to step 2.
  2. The server receives the segment and returns an acknowledgment (ack = x + 3). The server may still have data to send, so it enters the CLOSE_WAIT state while the client waits. Go to step 3.
  3. When the server has finished sending its data, it sends its own FIN segment to the client with its current sequence number (seq = y + 1). Go to step 4.
  4. The client receives the FIN, returns an acknowledgment (ack = y + 2), and enters TIME_WAIT. After waiting for 2MSL, the client closes.
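The states each side passes through during the four steps can be sketched as two lookup tables. The state names come from the standard TCP state machine. The event labels and the `walk` helper are made up for this example, and the sketch covers only the ordinary case where the client closes first.

```python
# Events are labeled from each side's own point of view.
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN / send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timer expires"): "CLOSED",
}
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN / send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def walk(transitions, events, state="ESTABLISHED"):
    """Apply the events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = transitions[(state, event)]
        path.append(state)
    return path


# The tables are written in the order the events happen, so replaying
# their keys walks each side from ESTABLISHED to CLOSED.
client_path = walk(CLIENT_TRANSITIONS, [event for (_, event) in CLIENT_TRANSITIONS])
server_path = walk(SERVER_TRANSITIONS, [event for (_, event) in SERVER_TRANSITIONS])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```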

So why a three-way handshake?

  1. The connection must be full duplex, so both directions have to be confirmed as working before data is sent. Three messages is the minimum that achieves this: the first shows the server that the client-to-server direction works, the second shows the client that both directions work, and the third shows the server that the server-to-client direction works.

  2. To prevent errors when a stale connection request segment suddenly reaches the server.

A concrete scenario for reason 2:

The first connection request segment sent by the client is not lost but is held up at some network node for a long time, so it reaches the server only after the connection has already been released. By then it is a stale segment. The server, however, takes it for a new connection request from the client and sends back a confirmation agreeing to establish a connection. With only two handshakes, the connection would now count as established: the server would allocate resources and keep the connection open, while the client, which never asked for it, sends no data. With three handshakes, the client does not acknowledge the server's confirmation, so the server never considers the connection established.

Why does the client stay in TIME_WAIT for 2MSL after sending the last acknowledgment of the four-way wave?

1) To release the full-duplex TCP connection reliably. 2) To let old segments expire and disappear from the network.

Reason 1:

The last ACK may be lost. If it is, the server times out and retransmits its FIN. If the client were already in the CLOSED state, it would answer with an RST (reset) segment, and the server would mistakenly conclude that something had gone seriously wrong.

Reason 2:

A connection is identified by a four-tuple (source IP, destination IP, source port, destination port). Suppose that during a first connection some segments are held up in the network and do not reach the server. Shortly after that connection is closed, a new connection is opened with the same four-tuple. The delayed segments now arrive at the server, which has no way to tell whether they belong to the previous connection or to the current one, so segments from the old connection interfere with the new one. Therefore, after sending the last acknowledgment, the client must wait until all segments of this connection have expired before the connection is truly closed.

But why 2MSL rather than MSL? The TIME_WAIT timer starts when the client sends the last ACK, and it restarts if that ACK is lost and a retransmitted FIN arrives from the server. MSL (maximum segment lifetime) is the longest a segment can survive in the network in one direction. The client's ACK may take up to one MSL to reach the server, and a FIN retransmitted by the server may take up to another MSL to come back, so the client waits 2MSL to cover stale segments in both directions.

What is reliable transmission? What are the requirements for reliable transmission?

Reliable transmission means the data arrives correctly: without errors, without loss, without duplication, and in order.

How is reliable transmission ensured? Which protocols support it?

The checksum in the header of a TCP segment is used to detect corrupted data. If the check fails, the receiver discards the segment and the sender retransmits it.
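As a rough illustration, here is the 16-bit one's complement sum that the Internet checksum is built on. This is a simplified sketch: a real TCP checksum is also computed over a pseudo-header and the TCP header fields, which are left out here, and `receiver_accepts` is a made-up helper name.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:   # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def receiver_accepts(data: bytes, checksum: int) -> bool:
    """The receiver recomputes the checksum and discards the data on a mismatch."""
    return internet_checksum(data) == checksum


payload = b"hello"
checksum = internet_checksum(payload)
print(receiver_accepts(payload, checksum))   # True
print(receiver_accepts(b"hellp", checksum))  # False, one byte was corrupted
```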

ARQ protocol. Automatic Repeat reQuest (ARQ) is an error-control protocol. It ensures reliability through positive acknowledgment and timeout retransmission, providing reliable transport over an unreliable network.

Stop-and-wait ARQ

After sending a packet, the sender must wait for the receiver's acknowledgment before sending the next one. If no acknowledgment arrives before the timeout, the packet is retransmitted.

Advantages:

  • Data is not lost: the sender keeps retransmitting until the receiver has confirmed the packet.
  • Data is not duplicated: when the receiver gets a packet with a sequence number it has already seen, it discards the packet instead of delivering it again.

Disadvantages: low transmission efficiency and low channel utilization, because only one packet is in flight at a time.
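A toy simulation of stop-and-wait makes both advantages and the cost visible. It assumes a channel whose losses are scripted in advance: `lost_data` and `lost_acks` are hypothetical parameters naming which transmission attempts (counted from 0 across the whole run) lose the data packet or its ACK.

```python
def stop_and_wait(packets, lost_data=(), lost_acks=()):
    """Send packets one at a time, retransmitting until each one is acknowledged.

    Returns (delivered payloads, duplicates discarded, total transmissions).
    """
    delivered = []
    duplicates = 0
    attempt = 0   # index of the next transmission attempt
    expected = 0  # receiver side: sequence number it is waiting for
    for seq, payload in enumerate(packets):
        while True:
            this_attempt = attempt
            attempt += 1
            if this_attempt in lost_data:
                continue             # data lost: sender times out and retransmits
            if seq == expected:
                delivered.append(payload)
                expected += 1
            else:
                duplicates += 1      # sequence number already seen: discard
            if this_attempt in lost_acks:
                continue             # ACK lost: sender times out and retransmits
            break                    # ACK received: move on to the next packet
    return delivered, duplicates, attempt


print(stop_and_wait(["a", "b", "c"], lost_data={1}, lost_acks={3}))
# (['a', 'b', 'c'], 1, 5)
```

Three packets needed five transmissions here, and the retransmission caused by the lost ACK was discarded by the receiver rather than delivered twice.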

Continuous ARQ protocol – Pipeline mode

The sender transmits a group of packets back to back and then waits for their ACKs, instead of stopping after every packet.

Advantages: Improved transmission efficiency and channel utilization.

Disadvantages: go-back-N performs badly when the line quality is poor. Once one packet is lost, every packet sent after it is treated as lost as well and must be retransmitted, which wastes bandwidth.
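The cost of go-back-N can be seen in a simplified, round-based simulation. It is a sketch under stated assumptions: the sender transmits one full window per round, ACKs are never lost, and `lost_once` is a hypothetical parameter listing the packets that are lost on their first transmission only.

```python
def go_back_n(num_packets, window, lost_once=()):
    """Return the sequence of packet numbers the sender transmits."""
    lost_once = set(lost_once)
    base = 0       # sender side: oldest unacknowledged packet
    expected = 0   # receiver side: next in-order packet it will accept
    sent_log = []
    while base < num_packets:
        for seq in range(base, min(base + window, num_packets)):
            sent_log.append(seq)
            if seq in lost_once:
                lost_once.discard(seq)  # lost this time, will get through next time
                continue
            if seq == expected:
                expected += 1           # in order: accepted
            # anything else is out of order and is discarded by the receiver
        base = expected                 # the cumulative ACK slides the window
    return sent_log


print(go_back_n(6, window=4, lost_once={1}))
# [0, 1, 2, 3, 1, 2, 3, 4, 5]
```

Packets 2 and 3 arrived intact the first time but were sent again anyway, because the receiver had thrown them away while waiting for packet 1.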

How does TCP implement reliable transmission?

TCP uses a byte-oriented sliding window protocol. It builds on the pipelining of the continuous ARQ protocol, adds buffers on both sides, and uses selective retransmission.

In the sliding window protocol, the sender keeps a buffer of data and sends a window's worth at a time, then waits for acknowledgments. When acknowledgments arrive for data in order, the trailing edge of the send window slides forward (that data no longer needs to be buffered) and so does the leading edge (new data may now be sent), and the sender transmits the data that follows. The receiver delivers data that arrives in order to the application, and the trailing and leading edges of the receive window move forward, ready for new data. If data arrives out of order, it is buffered in the receive window until the expected data arrives. If the expected data does not arrive, only that data is retransmitted.
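The buffering and selective retransmission described above can be sketched with a round-based toy model. It numbers whole packets rather than bytes and assumes the sender learns exactly which packets arrived, so it is an illustration of the idea, not of TCP's actual acknowledgment mechanism. `lost_once` is a hypothetical parameter listing packets lost on their first transmission only.

```python
def selective_retransmit(num_packets, window, lost_once=()):
    """Return (packets transmitted, packets delivered to the application)."""
    lost_once = set(lost_once)
    base = 0            # trailing edge of the window
    received = set()    # receive buffer, may hold out-of-order packets
    delivered = []
    sent_log = []
    while base < num_packets:
        for seq in range(base, min(base + window, num_packets)):
            if seq in received:
                continue                # already buffered by the receiver: skip
            sent_log.append(seq)
            if seq in lost_once:
                lost_once.discard(seq)  # lost this time, will get through next time
                continue
            received.add(seq)
        while base in received:         # deliver the in-order prefix and slide
            delivered.append(base)
            base += 1
    return sent_log, delivered


print(selective_retransmit(6, window=4, lost_once={1}))
# ([0, 1, 2, 3, 1, 4, 5], [0, 1, 2, 3, 4, 5])
```

Only packet 1 is sent twice. Packets 2 and 3 wait in the receive buffer until packet 1 arrives, and then all of them are delivered in order.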

Why flow control?

The sliding window protocol avoids go-back-N retransmission when a packet is lost. However, if the receiver's buffer is too small and the sender transmits faster than the receiver can take the data in, packets are dropped and timeout retransmissions follow.

Flow control means that the receiver advertises rwnd (its receive window size) in its acknowledgment segments, and the sender limits the size of its send window accordingly.

Flow control is end-to-end.
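The arithmetic behind flow control is small enough to show directly. In this sketch the function and parameter names are made up for illustration: the receiver advertises the free space in its buffer as rwnd, and the sender keeps its unacknowledged bytes within that value.

```python
def advertised_window(buffer_size, bytes_buffered):
    """Receiver side: rwnd is the free space left in the receive buffer."""
    return max(0, buffer_size - bytes_buffered)


def sendable_bytes(rwnd, last_byte_sent, last_byte_acked):
    """Sender side: how many more bytes may be sent without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)


rwnd = advertised_window(buffer_size=8192, bytes_buffered=6144)
print(rwnd)                                                             # 2048
print(sendable_bytes(rwnd, last_byte_sent=5000, last_byte_acked=4000))  # 1048
print(sendable_bytes(500, last_byte_sent=5000, last_byte_acked=4000))   # 0
```

When the result is 0, the sender has to pause until an acknowledgment either confirms more data or advertises a larger window.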

Why congestion control?

What is congestion?

If, over some period of time, the demand for a network resource (bandwidth, or the buffers and processors in switching nodes, for example) exceeds what that resource can supply, network performance deteriorates. This situation is called congestion.

What is congestion control?

Congestion control prevents too much data from being injected into the network, so that routers and links are not overloaded. When the network is in poor condition, packets are lost, and lost packets cause timeout retransmissions. If the rate at which packets are injected is not controlled, these retransmissions add even more load and the network eventually breaks down. Congestion control therefore has to react promptly when the network becomes congested, to keep the situation from getting worse.

Congestion control algorithm:

Slow start, congestion avoidance, fast retransmission, and fast recovery. These four algorithms are in play for as long as the connection runs: the sender adjusts dynamically and switches from one algorithm to another at certain trigger points, and together they control network congestion.

cwnd: the congestion window, which limits the size of the sender's send window.
ssthresh: the slow start threshold.

  1. cwnd < ssthresh: keep using the slow start algorithm.
  2. cwnd > ssthresh: stop using slow start and use the congestion avoidance algorithm.
  3. cwnd = ssthresh: either slow start or congestion avoidance may be used.
Congestion control algorithms in detail:

  • Slow start: cwnd starts at 1 to probe the network. Each ACK received adds 1 to cwnd, so after one transmission round (one RTT) the window has doubled. The window therefore grows exponentially until cwnd reaches the slow start threshold ssthresh. Note that the doubling is the net result of a whole round; each individual ACK only adds 1. Purpose: a host that has just joined the network should not inject a large amount of data at once. It sends a little first to probe the network and then enlarges the window step by step.

  • Congestion avoidance: after each transmission round, the sender adds 1 to cwnd, so the congestion window grows linearly and the network is less likely to become congested. If a packet is lost (a timeout) during congestion avoidance, the network is assumed to be congested. Response: ssthresh = cwnd / 2, cwnd = 1, and return to slow start. Disadvantage: the send window is cut back too sharply, which lowers the throughput of the TCP connection.

  • Fast retransmission and fast recovery: when the sender receives three duplicate ACKs, all asking for the same segment, it concludes that the segment was lost and retransmits it immediately instead of waiting for the retransmission timeout (RTO). Purpose: let the sender learn about the loss of an individual segment as early as possible. After the fast retransmission, the sender applies multiplicative decrease: ssthresh is halved, cwnd is set to the new ssthresh, and the congestion avoidance algorithm continues. Advantage: the fact that three ACKs got through suggests that the network is not badly congested, so the sender only lowers the threshold instead of cutting cwnd to 1 and goes straight to congestion avoidance. This avoids large swings in the sending rate.
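The point that slow start adds 1 per ACK yet doubles per round is easy to check with a short sketch. It assumes an idealized round in which every segment of the window is acknowledged; the function names are made up for this example.

```python
def slow_start_round(cwnd):
    """One round of slow start: cwnd segments are sent and each ACK adds 1."""
    for _ in range(cwnd):  # range() is evaluated once, so this loops over the old cwnd
        cwnd += 1
    return cwnd


def congestion_avoidance_round(cwnd):
    """One round of congestion avoidance: the whole window adds 1 in total."""
    return cwnd + 1


cwnd = 1
sizes = [cwnd]
for _ in range(4):
    cwnd = slow_start_round(cwnd)
    sizes.append(cwnd)
print(sizes)                             # [1, 2, 4, 8, 16]
print(congestion_avoidance_round(cwnd))  # 17
```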

Background: a host has just connected to the network. Assume the initial threshold is ssthresh = 16.

Congestion control process

Timeouts and three duplicate ACKs can occur during slow start as well as during congestion avoidance.
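The whole process can be replayed with a round-level simulation. This is a sketch under stated assumptions, not a description of any real TCP stack: each event stands for one transmission round, cwnd is counted in segments, slow start is capped at ssthresh, and ssthresh is never allowed to drop below 2. The event names "ack", "timeout", and "3dup" are invented for this example. Fast recovery follows the simplified rule from the text (cwnd = ssthresh).

```python
def simulate(events, ssthresh=16):
    """Return the (cwnd, ssthresh) pair after each round, starting from cwnd = 1."""
    cwnd = 1
    history = [(cwnd, ssthresh)]
    for event in events:
        if event == "timeout":
            ssthresh = max(cwnd // 2, 2)  # floor of 2 is an assumption of this sketch
            cwnd = 1                      # back to slow start
        elif event == "3dup":
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh               # fast recovery: continue with congestion avoidance
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per round
        else:
            cwnd += 1                       # congestion avoidance: +1 per round
        history.append((cwnd, ssthresh))
    return history


rounds = ["ack"] * 12 + ["timeout"] + ["ack"] * 8 + ["3dup"]
for cwnd, ssthresh in simulate(rounds):
    print(cwnd, ssthresh)
```

With ssthresh = 16, cwnd grows 1, 2, 4, 8, 16 in slow start and then linearly to 24. The timeout sets ssthresh to 12 and cwnd to 1. The later three duplicate ACKs arrive at cwnd = 16 and set both values to 8.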

Flow chart of congestion control

Purposes of the congestion control algorithms and how they differ

  • Slow start and congestion avoidance are usually used together. They reduce the number of packets the host sends into the network, so that a congested router has enough time to clear the backlog of packets in its queue.

  • Fast retransmission and fast recovery are designed to reduce the cases in which a single lost packet leads to a timeout, is mistaken for congestion, and sends the connection back to slow start. This improves network throughput.

Reference

  • Computer Networks (7th edition), Xie Xiren