Preface

I first studied TCP at university, but even after going over it many times it still felt abstract. Recently I read Kobayashi’s Illustrated Network, and it finally started to click. These are my study notes

A first look at TCP

What is TCP?

TCP is a connection-oriented, reliable transport protocol that carries a byte stream

Connection-oriented: if UDP is like sending a message by carrier pigeon, TCP is like making a telephone call. Before an application can send data over TCP, a TCP connection must be established between the source port and the destination port

Byte stream: a stream is like a water pipe. Whatever is put in at one end comes out at the other end unchanged, with no loss, no duplication, and no reordering

Reliable: TCP’s acknowledgement and retransmission mechanisms ensure that data reaches the receiving end

TCP Packet Format

  1. Source port number, destination port number: together these act as an address. Without them, the data could not be delivered to the right application
  2. Sequence number: the initial value is a random number generated when the connection is established. Each time data is sent, the sequence number advances by the number of data bytes sent. It is used to solve the problem of out-of-order packets
  3. Acknowledgement number: the sequence number of the next byte the receiver expects. An acknowledgement number of N+1 means that every byte up to and including N has been received correctly and the sender should continue from N+1. It is used to detect packet loss
  4. Header length: measured in units of 4 bytes. The fixed part of the header is 20 bytes and the options can add up to 40 bytes, so the value of this field ranges from 5 to 15
  5. ACK: this bit is 1 in every packet segment except the SYN packet sent when the connection is first being established
  6. RST: if this bit is 1, the TCP connection is in an abnormal state and must be forcibly torn down
  7. SYN: if this bit is 1, the sender wants to establish a connection and is synchronizing its sequence number
  8. FIN: if this bit is 1, the sender has finished sending data and wants to release the TCP connection
  9. Window size: used by the sliding window mechanism, described in more detail below
  10. Checksum: used to detect errors introduced while the packet, header and data alike, was in transit
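To make the field layout concrete, here is a minimal Python sketch that unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The function name `parse_tcp_header` and the hand-built sample segment are illustrative assumptions, and options are ignored.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Parse the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, _urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12      # header length in 4-byte words (5 to 15)
    flags = offset_flags & 0x01FF         # low bits hold the control flags
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": data_offset * 4,    # convert words to bytes
        "FIN": bool(flags & 0x001),
        "SYN": bool(flags & 0x002),
        "RST": bool(flags & 0x004),
        "ACK": bool(flags & 0x010),
        "window": window,
        "checksum": checksum,
    }

# Hand-built sample: ports 50000 -> 80, sequence number 1000,
# header length 5 words (20 bytes), only the SYN bit set.
syn = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x002, 65535, 0, 0)
header = parse_tcp_header(syn)
print(header["src_port"], header["dst_port"], header["SYN"], header["header_len"])
```

The header length field stores 5 here, which the sketch converts to 20 bytes, matching the "units of 4 bytes" rule in item 4.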
How do I uniquely identify a TCP connection?

By its four-tuple: source address, destination address, source port, and destination port

The source and destination IP addresses are in the IP header and identify the hosts

The source and destination ports are in the TCP header and identify the processes
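As a rough illustration of why all four values are needed, the sketch below keys a hypothetical connection table by the four-tuple. The names `FourTuple`, `connections`, and `lookup`, and the addresses used, are made up for the example and are not how any particular kernel stores connections.

```python
from typing import NamedTuple

class FourTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

# Hypothetical connection table: each four-tuple maps to that connection's state.
connections = {}

def lookup(src_ip, src_port, dst_ip, dst_port):
    """Find the connection a segment belongs to (None if there is none)."""
    return connections.get(FourTuple(src_ip, src_port, dst_ip, dst_port))

# Two connections from the same client to the same server port differ only
# in the client's source port, yet they are separate connections.
connections[FourTuple("10.0.0.2", 50000, "10.0.0.1", 80)] = "ESTABLISHED"
connections[FourTuple("10.0.0.2", 50001, "10.0.0.1", 80)] = "SYN-RCVD"

print(lookup("10.0.0.2", 50000, "10.0.0.1", 80))
print(lookup("10.0.0.2", 50001, "10.0.0.1", 80))
print(lookup("10.0.0.3", 50000, "10.0.0.1", 80))
```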

TCP three-way handshake

  1. Both the client and the server start in the CLOSED state. The server begins listening on a port and enters the LISTEN state
  2. The client sends a connection request packet to the server process in the LISTEN state. In this packet the SYN bit is 1 and the sequence number is a randomly generated initial value (call it x). The client then enters the SYN-SENT state
  3. After receiving the connection request and agreeing to the connection, the server sends an acknowledgement packet to the client. In this packet the SYN and ACK bits are both 1, the sequence number is a random initial value generated by the server (call it y), and the acknowledgement number is x + 1. The server then enters the SYN-RCVD state
  4. After receiving the acknowledgement packet from the server, the client sends its own acknowledgement packet. The ACK bit is 1, the sequence number is x + 1, and the acknowledgement number is y + 1. The client then enters the ESTABLISHED state. Note that this packet can already carry data from the client to the server
  5. After receiving this packet, the server enters the ESTABLISHED state
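The sequence and acknowledgement numbers in the steps above can be sketched as follows. This is only a bookkeeping model with fixed initial sequence numbers so the result is reproducible; the function name `three_way_handshake` is an assumption, and a real stack picks the initial values randomly.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide and wrap around

def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    syn = ("client", "SYN", client_isn, None)
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % SEQ_SPACE)
    ack = ("client", "ACK", (client_isn + 1) % SEQ_SPACE, (server_isn + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]

for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

With x = 100 and y = 300, the second segment acknowledges 101 and the third acknowledges 301, which is the "x + 1" and "y + 1" pattern from the list.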
Why three handshakes?
  1. Three handshakes prevent a stale (historical) connection from being initialized

Suppose the client’s first connection request is held up by network congestion and does not reach the server for a long time. Having received no acknowledgement, the client sends a new connection request. If the old SYN then arrives at the server before the new one, the server replies to it with an acknowledgement. From its own context the client can tell that this acknowledgement belongs to a historical connection, so it sends an RST packet to the server to abort it, which saves server resources (if the connection were not aborted, the server would waste a file descriptor on it). With only two handshakes the client would have no chance to reject the stale connection, so two handshakes are clearly not enough

  2. Three handshakes synchronize the initial sequence numbers of both parties

After a TCP connection is established, sequence numbers are what solve problems such as out-of-order and duplicated data

On the first handshake, the client sends its initial sequence number to the server. On the second handshake, the server sends its initial sequence number to the client. Without the third handshake, how would the server know that its sequence number had reached the client? What if the second handshake packet were lost?

Three handshakes are enough to establish a TCP connection reliably. Four handshakes would certainly work too, but why waste an extra round of communication?

  3. Three handshakes confirm that both the client and the server can send and receive data

The client’s first connection request shows that the client can send. When the server receives it and replies with an acknowledgement, that shows the server can both receive and send. When the client receives the acknowledgement, that shows the client can receive, but only the client knows this. Without the third handshake, the server could not tell whether the client is actually able to receive data

TCP four-way wave

  1. The client sends a connection release packet. The FIN bit is 1 and the sequence number is the sequence number of the last byte the client sent plus 1 (call it u). The client then enters the FIN-WAIT-1 state
  2. After receiving the connection release packet from the client, the server sends an acknowledgement packet. The ACK bit is 1, the sequence number is the sequence number of the last byte the server sent plus 1 (call it v), and the acknowledgement number is u + 1. The server then enters the CLOSE-WAIT state
  3. After receiving the acknowledgement packet from the server, the client enters the FIN-WAIT-2 state. At this point the client-to-server direction of the connection is closed, but the server-to-client direction is still open: the connection is half-closed
  4. Once the server has finished sending its data, it sends a connection release packet. The FIN and ACK bits are both 1, the sequence number is w (the server may have sent more data while half-closed), and the acknowledgement number is still u + 1. The server then enters the LAST-ACK state
  5. After receiving the connection release packet, the client sends an acknowledgement packet to the server. The ACK bit is 1, the sequence number is u + 1, and the acknowledgement number is w + 1. The client then enters the TIME-WAIT state
  6. After receiving the acknowledgement packet from the client, the server enters the CLOSED state. After waiting 2MSL, the client also enters the CLOSED state, which completes the four-way wave
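The state changes on each side can be written down as a small transition table. The sketch below is a simplified model of the normal close only (no simultaneous close, no lost packets); the event strings and the helper `run` are illustrative names.

```python
# (current state, event) -> next state, for the side that closes first.
CLIENT_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN-WAIT-1",
    ("FIN-WAIT-1", "recv ACK"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv FIN, send ACK"): "TIME-WAIT",
    ("TIME-WAIT", "2MSL timer expires"): "CLOSED",
}

# (current state, event) -> next state, for the side that closes second.
SERVER_TRANSITIONS = {
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE-WAIT",
    ("CLOSE-WAIT", "send FIN"): "LAST-ACK",
    ("LAST-ACK", "recv ACK"): "CLOSED",
}

def run(transitions: dict, events: list, state: str = "ESTABLISHED") -> list:
    """Apply events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = transitions[(state, event)]  # KeyError means an illegal event
        visited.append(state)
    return visited

client_path = run(CLIENT_TRANSITIONS, ["send FIN", "recv ACK",
                                       "recv FIN, send ACK", "2MSL timer expires"])
server_path = run(SERVER_TRANSITIONS, ["recv FIN, send ACK", "send FIN", "recv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```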
Why does TIME-WAIT last 2MSL?

MSL (Maximum Segment Lifetime) is the longest time a segment can survive in the network. TIME-WAIT lasts 2MSL for two reasons. Below, A is the side that closes first (the client above) and B is the other side (the server)

  1. If A’s final ACK packet is lost, B will retransmit its FIN packet. A can receive the retransmitted FIN within 2MSL, after which A retransmits the ACK and restarts the 2MSL timer, so both A and B can enter the CLOSED state normally. If A had no TIME-WAIT state, B could not enter the CLOSED state normally once the ACK packet was lost
  2. Waiting 2MSL ensures that all packet segments of the old connection have disappeared from the network. If packets of the old connection were still in flight, a new connection reusing the same four-tuple could mistake a delayed old packet for its own data

SYN attacks

What is a SYN attack?

During the TCP three-way handshake, the Linux kernel maintains two queues:

  • The half-connection queue, also called the SYN queue
  • The full-connection queue, also called the accept queue

When the server receives a SYN request from a client, the kernel stores the connection in the half-connection queue and replies with a SYN+ACK. The client then returns an ACK. When the server receives this third-handshake ACK, the kernel removes the connection from the half-connection queue, creates a fully established connection, and adds it to the accept queue, where it waits for the process to call accept and take it out

In other words, after replying with SYN+ACK the server keeps the connection in the half-connection queue while it waits for the ACK. An attacker exploits this by sending SYN packets continuously and never completing the handshake. Each request sits in the half-connection queue for a period of time, the queue fills up, and the server can no longer accept normal requests
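The effect of a full half-connection queue can be shown with a toy model. This sketch is not how the Linux kernel implements its queues; the class `ToyListener`, its backlog parameters, and the addresses are assumptions chosen only to show a legitimate SYN being dropped once the attacker has filled the queue.

```python
from collections import deque

class ToyListener:
    """Toy model of a listening socket with a SYN queue and an accept queue."""

    def __init__(self, syn_backlog: int, accept_backlog: int):
        self.syn_backlog = syn_backlog
        self.accept_backlog = accept_backlog
        self.syn_queue = {}          # half-open connections, keyed by (ip, port)
        self.accept_queue = deque()  # fully established, waiting for accept()

    def on_syn(self, peer) -> bool:
        """First handshake: returns False if the SYN is dropped."""
        if len(self.syn_queue) >= self.syn_backlog:
            return False
        self.syn_queue[peer] = "SYN_RECV"  # the server would reply SYN+ACK here
        return True

    def on_ack(self, peer) -> bool:
        """Third handshake: move the connection to the accept queue."""
        if peer not in self.syn_queue or len(self.accept_queue) >= self.accept_backlog:
            return False
        del self.syn_queue[peer]
        self.accept_queue.append(peer)
        return True

    def accept(self):
        """What the application's accept() call would take out."""
        return self.accept_queue.popleft()

server = ToyListener(syn_backlog=4, accept_backlog=4)

# The attacker sends SYNs from spoofed ports and never sends the final ACK.
attack_results = [server.on_syn(("198.51.100.7", port)) for port in range(40000, 40006)]

# A legitimate client now finds the half-connection queue full.
legit_ok = server.on_syn(("203.0.113.5", 51000))
print(attack_results, legit_ok)
```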

How do I defend against SYN attacks?
  1. Enlarge the half-connection queue
  2. Enable TCP SYN cookies (on Linux, the net.ipv4.tcp_syncookies setting)
  3. Reduce the number of SYN+ACK retransmissions. Under a SYN attack, the server holds a large number of connections in the SYN_RECV state and keeps retransmitting SYN+ACK for them. A connection in the SYN_RECV state is dropped only when the number of SYN+ACK retransmissions reaches the upper limit, so a lower limit frees the queue sooner

DDoS attacks

Ruan Yifeng’s article on the subject is very easy to follow and well worth reading

www.ruanyifeng.com/blog/2018/0…

Sliding window with acknowledgement and retransmission

TCP uses a byte-oriented sliding window protocol to control the sending, receiving, acknowledging, and retransmitting of the byte stream

  1. TCP uses two buffers and a window to control the byte stream. The sender has a buffer that holds the data the application wants to send, and it maintains a send window over that buffer. As long as the window is not 0, segments can be sent. The receiver also has a buffer. It writes correctly received bytes into this buffer, where they wait for the application to read them. The receiver maintains a receive window whose size equals the number of bytes the receive buffer can still accept
  2. The receiver uses the TCP header to tell the sender which bytes have been received correctly and how many more bytes the sender may send

Let’s take a look at the sending window first

To verify that the byte stream is transmitted correctly, the sender must track the transmission state of every byte. By state, the bytes fall into the following four categories

  1. Sent and acknowledged, for example, the 19 bytes shown in blue have been received and acknowledged by the receiving end
  2. Sent but not yet acknowledged
  3. Not yet sent, but the receiver has space and is ready to receive them
  4. Not yet sent, and the receiver has no space for them

The third category is also called the available window

The second and third categories together make up the send window. Its size is determined by the receiver: if the receiver has no space, the sender cannot send, no matter how much data it has waiting

In the figure above, when the sender receives acknowledgements for bytes 20 and 21 and the size of the send window stays the same, the window slides forward by two bytes

When the send window is completely used up, as in the figure above, the receiver is temporarily unable to accept new data
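The four categories and the available window can be expressed with three sender-side variables. The sketch below borrows the conventional names `snd_una` (oldest unacknowledged byte), `snd_nxt` (next byte to send), and `snd_wnd` (send window size); the concrete numbers are made up and are not taken from the original figure.

```python
def classify(byte_no: int, snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Return which of the four categories (1 to 4) a byte falls into."""
    if byte_no < snd_una:
        return 1  # sent and acknowledged
    if byte_no < snd_nxt:
        return 2  # sent, not yet acknowledged
    if byte_no < snd_una + snd_wnd:
        return 3  # not sent, receiver has space (available window)
    return 4      # not sent, receiver has no space

def available_window(snd_una: int, snd_nxt: int, snd_wnd: int) -> int:
    """Bytes that may still be sent before the send window is used up."""
    return snd_una + snd_wnd - snd_nxt

# Assumed state: bytes up to 19 acknowledged, bytes 20..29 in flight, window of 20.
snd_una, snd_nxt, snd_wnd = 20, 30, 20
print([classify(b, snd_una, snd_nxt, snd_wnd) for b in (19, 20, 29, 30, 39, 40)])
print(available_window(snd_una, snd_nxt, snd_wnd))

# Bytes 20 and 21 are acknowledged: the window slides forward by two bytes.
snd_una = 22
print(available_window(snd_una, snd_nxt, snd_wnd))
```

When `snd_nxt` catches up with `snd_una + snd_wnd`, the available window is 0 and the sender has to wait for further acknowledgements.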

Now look at the window on the receiving end

The receive window is simpler: its size is advertised to the sender

Is the send window necessarily equal to the receive window?

Not necessarily. The application on the receiving end does not read at a constant speed, and the packet that carries the receiver’s window to the sender is itself delayed, so the two are only approximately equal

Retransmission strategies

In the above discussion, segment loss is not considered, but in real networks, segment loss is inevitable. What does TCP do when segment loss occurs?

Selective retransmission

Selective retransmission allows the receiver to accept bytes whose sequence numbers are not contiguous. If those bytes fall inside the receive window, the receiver buffers them first and then tells the sender which sequence numbers are missing. The sender retransmits only the lost segments and does not resend what has already been received. This strategy requires the receiver to add a SACK option to the TCP header options field, which describes what it has buffered, so that the sender knows which data arrived and which did not and retransmits only the lost data
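A minimal sketch of how a sender might work out what to retransmit from SACK information follows. It assumes each SACK block is a (left, right) pair of sequence numbers with the right edge exclusive; the function name `missing_ranges` and the numbers are illustrative.

```python
def missing_ranges(cum_ack: int, sack_blocks: list) -> list:
    """Return the gaps the sender should retransmit.

    cum_ack     -- acknowledgement number (next byte the receiver expects)
    sack_blocks -- (left, right) ranges already buffered, right edge exclusive
    """
    gaps = []
    expected = cum_ack
    for left, right in sorted(sack_blocks):
        if left > expected:
            gaps.append((expected, left))  # bytes expected..left-1 are missing
        expected = max(expected, right)
    return gaps

# The receiver has everything below 200, plus bytes 300..399 and 500..599.
print(missing_ranges(200, [(300, 400), (500, 600)]))
```

Here the sender would retransmit only bytes 200 to 299 and 400 to 499, rather than everything from 200 onward.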

Timeout retransmission

First, what are RTT and RTO?

RTT (Round-Trip Time): the time from sending a packet until its acknowledgement comes back

RTO (Retransmission Timeout): how long the sender waits for an acknowledgement before retransmitting

Choosing the RTO matters. Consider the following two cases

If the RTO is too small, packets that the receiving end has in fact received correctly get retransmitted, producing duplicates. If it is too large, the sending end waits a long time after a packet is lost, which reduces efficiency

The network environment changes constantly, so the RTO cannot be a fixed value. Its calculation is fairly involved; interested readers can look up the details
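For readers who want a starting point, one widely used scheme is the one standardized in RFC 6298, which keeps a smoothed RTT and an RTT variance. The sketch below follows that scheme with its usual constants (alpha 1/8, beta 1/4, K 4, one second minimum); the class name `RtoEstimator` is an assumption, and real stacks add details such as timer backoff and rules about which samples may be used.

```python
class RtoEstimator:
    """RTO estimation in the style of RFC 6298 (times in seconds)."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variance
    K = 4

    def __init__(self, min_rto: float = 1.0, clock_granularity: float = 0.0):
        self.min_rto = min_rto
        self.granularity = clock_granularity
        self.srtt = None    # smoothed RTT
        self.rttvar = None  # RTT variance
        self.rto = 1.0      # initial RTO before any sample

    def update(self, rtt: float) -> float:
        """Feed one RTT sample and return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto

est = RtoEstimator(min_rto=0.0)  # floor disabled so the arithmetic is visible
for sample in (0.100, 0.100, 0.300):
    print(round(est.update(sample), 4))
```

A sudden jump in the sample (the 0.300 above) raises the variance term and therefore the RTO, which is the behaviour the "too small / too large" discussion asks for.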

Flow control

Why flow control?

The purpose of flow control is to limit the sender’s sending rate so that it does not exceed the rate at which the receiver can take data in, preventing packet loss caused by the receiving end being unable to keep up with the incoming byte stream

How to control flow?

The receiving end picks a suitable receive window (rwnd) value according to its receiving capacity and writes it into the TCP header to tell the sender how much it can accept. The sender’s send window must not exceed the receive window. The window value in the TCP header is measured in bytes

  1. When the receiving application reads bytes from the buffer at least as fast as they arrive, the receiver sends a non-zero window notice in each acknowledgement
  2. When data arrives faster than it is read, the receive buffer eventually fills up and the receiving end must send a zero window notice. When the sender receives a zero window notice, it stops sending until it receives a non-zero window notice

In this way TCP uses the sliding window mechanism to coordinate the sending speed of the sender with the receiving speed of the receiver, which is how flow control is achieved
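The interaction between the receive buffer and the advertised window can be simulated round by round. This is a simplified sketch in which one "round" means the sender sends, then the application reads, then the new window is advertised; the names `Receiver` and `simulate` and all the numbers are assumptions.

```python
class Receiver:
    """Receive buffer that advertises its free space as the receive window."""

    def __init__(self, buffer_size: int):
        self.buffer_size = buffer_size
        self.buffered = 0  # bytes waiting for the application to read

    @property
    def rwnd(self) -> int:
        return self.buffer_size - self.buffered

    def receive(self, nbytes: int) -> None:
        self.buffered += min(nbytes, self.rwnd)

    def app_read(self, nbytes: int) -> None:
        self.buffered -= min(nbytes, self.buffered)

def simulate(buffer_size: int, send_per_round: int, read_per_round: int, rounds: int) -> list:
    """Return (bytes sent, window advertised afterwards) for every round."""
    rx = Receiver(buffer_size)
    rwnd = rx.rwnd
    history = []
    for _ in range(rounds):
        sent = min(send_per_round, rwnd)  # never exceed the advertised window
        rx.receive(sent)
        rx.app_read(read_per_round)
        rwnd = rx.rwnd
        history.append((sent, rwnd))
    return history

# The application keeps reading: the window shrinks but never reaches zero.
print(simulate(buffer_size=100, send_per_round=60, read_per_round=20, rounds=4))
# The application stops reading: a zero window is advertised and sending stops.
print(simulate(buffer_size=100, send_per_round=60, read_per_round=0, rounds=3))
```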

Congestion control

Why congestion control when you have flow control?

Flow control is local: it concerns only the sending end and the receiving end of one connection. Congestion control is global: it concerns the total number of packets entering the network. Imagine the main road outside your home during the Monday morning rush hour. If your neighbours’ cars have already jammed the road, won’t you get stuck the moment you pull out? Worse, when TCP receives no acknowledgement it retransmits, which adds even more traffic to an already congested network

Hence congestion control

How to do congestion control?

Four algorithms are used

  1. Slow start
  2. Congestion avoidance
  3. Fast retransmission
  4. Fast recovery

In a TCP connection, the sender maintains a state variable called the congestion window (cwnd). Its size is adjusted dynamically according to how congested the network is: as long as there is no congestion the sender gradually enlarges the congestion window, and as soon as congestion appears the window is reduced immediately.

When the sender starts to send data, it does not know the load state of the network. In this case, the slow start algorithm can be used to gradually increase the congestion window from small to large

Assume that the initial value of the congestion window is 1. If a packet is sent to the receiver, the receiver confirms the packet within the specified time range, indicating that the network is not congested. The congestion window is doubled to 2, and two packets are sent to the receiver. The receiver sends an acknowledgement within the time range set by the timer, indicating that the network is not congested. The congestion window is doubled to 4, and four packets are sent to the receiver. The receiver sends confirmation within the time range set by the timer, indicating that the network is not congested. And it keeps growing exponentially

However, to keep the congestion window from growing so fast that it causes congestion itself, a parameter called the slow start threshold (ssthresh) is defined.

  1. cwnd < ssthresh: use the slow start algorithm
  2. cwnd >= ssthresh: use the congestion avoidance algorithm

In the congestion avoidance phase the congestion window no longer doubles. It grows by 1 each round, so growth becomes linear

When an acknowledgement packet does not arrive within the timeout, the network is considered congested, and ssthresh is set to half of the cwnd value at the moment of the timeout.

cwnd is then reset to 1 and the slow start algorithm begins again
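The rules above can be traced round by round. The sketch below is a textbook-style simplification: cwnd is counted in segments, one loop iteration stands for one round trip, slow start is capped at ssthresh so it does not overshoot, and ssthresh is never allowed below 2. The function name `cwnd_trace` and the choice of round 6 for the timeout are assumptions.

```python
def cwnd_trace(rounds: int, ssthresh: int, timeout_rounds=()) -> list:
    """Return the congestion window used in each round (in segments)."""
    cwnd = 1
    trace = []
    for rnd in range(1, rounds + 1):
        trace.append(cwnd)
        if rnd in timeout_rounds:
            # Timeout: halve the threshold and restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double each round, capped at the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: linear growth.
            cwnd += 1
    return trace

print(cwnd_trace(rounds=10, ssthresh=8, timeout_rounds={6}))
```

In this run the window doubles up to 8, grows by 1 to 10, and after the timeout in round 6 restarts from 1 with ssthresh set to 5.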

Falling back to slow start whenever anything is lost is too drastic. Consider this case: the sender sends segments M1 to M7 in a row, only M3 is lost in transit, and M4 to M7 all arrive correctly. The loss of M3 alone is not enough to conclude that the network is congested. Cases like this call for the fast retransmission and fast recovery algorithms

Because the receiving end has not received M3, it cannot acknowledge M4. Instead, each out-of-order segment it receives makes it repeat its acknowledgement of M2, so the sender sees three duplicate acknowledgements of M2 in a row and knows it should retransmit the missing segment as soon as possible, without waiting for the timeout

Fast retransmission is paired with fast recovery, which specifies:

When the sender receives three duplicate acknowledgements of M2 in a row, it immediately sets the congestion window to half of its current value and then runs the congestion avoidance algorithm, so the congestion window grows linearly from there
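A minimal sketch of the duplicate-acknowledgement counting follows. Acknowledgements are modelled as the number of the last in-order segment received (so 2 means "M2"), which matches the wording of these notes rather than real TCP acknowledgement numbers; the function name `process_acks` and the starting window of 16 are assumptions, and the temporary window inflation used by real implementations is left out.

```python
def process_acks(acks: list, cwnd: int, ssthresh: int):
    """Return (segments retransmitted early, final cwnd, final ssthresh)."""
    last_ack = None
    duplicates = 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == 3:
                # Fast retransmission: resend the segment after the acknowledged one.
                retransmitted.append(ack + 1)
                # Fast recovery: halve the window and continue with
                # congestion avoidance instead of restarting slow start.
                ssthresh = max(cwnd // 2, 2)
                cwnd = ssthresh
        else:
            last_ack = ack
            duplicates = 0
    return retransmitted, cwnd, ssthresh

# M1 and M2 arrive, M3 is lost, then M4, M5, M6 each trigger a repeat of "M2".
print(process_acks([1, 2, 2, 2, 2], cwnd=16, ssthresh=32))
```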

Packet capture

With all that said, what does a TCP packet actually look like?

Use tcpdump to capture the packets of a request to a certain website

Send a request to that website

After stopping tcpdump, open the capture file in Wireshark to view the three-way handshake, the HTTP request, and the four-way wave

The following figure shows the packet for the first handshake. By default, the sequence number and acknowledgement number displayed in Wireshark are relative, which is why the sequence number starts from 0

If you look at this packet, you can see that right below Window size there is a Calculated window size. What is that?

The window size field in the TCP header is only 16 bits, so the largest window it can express is 65535 bytes. That is clearly not enough for modern networks, so TCP introduced the window scale option. Its value n ranges from 0 to 14 and means that the window is multiplied by 2 to the power n. In the figure, 4096 * 64 = 262144, so the scale factor is 6 (2 to the power 6 is 64)
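The arithmetic behind the calculated window size is a left shift. A small sketch, with a hypothetical helper name `calculated_window`:

```python
def calculated_window(window_field: int, shift_count: int) -> int:
    """Scale the 16-bit window field by 2 ** shift_count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift_count <= 14:
        raise ValueError("window scale shift must be between 0 and 14")
    return window_field << shift_count

print(calculated_window(4096, 6))    # the example above: 4096 * 64 = 262144
print(calculated_window(65535, 0))   # no scaling: at most 65535 bytes
print(calculated_window(65535, 14))  # largest window with scaling: 1073725440
```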

The scale factor can be seen in the options of the first handshake packet

Among the options there are many familiar faces, such as MSS and SACK. The options have the following format

Common Wireshark filters

(tcp.flags.syn == 1) && (tcp.analysis.retransmission): as discussed above, a retransmission happens when the receiving end does not get a packet or its acknowledgement is lost. This filter selects retransmitted first-handshake packets. In my capture the retransmissions were spaced 1s, 2s, 4s, 8s, and 16s apart, after which retransmission stopped. Since I am using a Mac, I could not find how to set the number of retransmissions.

(tcp.flags.reset == 1) && (tcp.seq == 1): filters packets in which the handshake was rejected

tcp.analysis.zero_window: filters zero window packets
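The 1s, 2s, 4s, 8s, 16s spacing observed for the retransmitted SYN is an exponential backoff: each wait is double the previous one. The sketch below turns those intervals into times measured from the original SYN; the function name `syn_retransmit_times`, the 1 second starting value, and the count of 5 simply mirror the capture described above and are not settings read from any system.

```python
def syn_retransmit_times(initial_rto: float = 1.0, retries: int = 5) -> list:
    """Seconds after the first SYN at which each retransmission is sent."""
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto  # wait one full timeout, then retransmit
        times.append(elapsed)
        rto *= 2        # double the timeout for the next attempt
    return times

print(syn_retransmit_times())
```

With these assumptions the last retransmission goes out 31 seconds after the original SYN.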

References

Illustrated Network by Kobayashi Coding: very well written. I recommend reading it; you will get a lot out of it