A few days ago I posted to my WeChat Moments and noticed that the girl I had had a crush on for a long time gave the post a thumbs-up. I tossed and turned all night, unable to sleep. Did she have feelings for me too? Why else would she suddenly give me a thumbs-up? Why not take the opportunity to tell her how I felt?

So the next day I rehearsed many versions of what I would say, practicing over and over, even down to my breathing. In the evening I called her on WeChat voice. Before she could say a word I could no longer hold back, and I poured my heart out in a rush… I talked for five minutes straight, and everything came out so naturally!

But after I finished, I waited a long while without any response… Finally her voice came through: “Hello? Hello? I didn’t hear a word you were saying. I’m out shopping with my boyfriend…”

I hung up, and then reflected deeply on my failed confession. The reason was that I hadn’t learned TCP well!

If I had known TCP, I would at least have asked “Are you there?” first, to establish a reliable connection and make sure it was working before I started to express myself!

If I had understood TCP, I would have asked her to acknowledge me as I spoke, so that I knew she had heard every word before declaring success!

It was all because I hadn’t learned TCP well, so I went to the library…

Let’s take a look at the definition of TCP:

Transmission Control Protocol (TCP) is a connection-oriented, reliable, byte-stream-based transport layer communication protocol. It is designed to provide a reliable end-to-end byte stream over an unreliable internetwork.

We know every word of it, but put together it is not so easy to understand! So let’s pick out the key terms in the definition above: connection-oriented, reliable, byte-stream-based, transport layer, protocol, end-to-end! Once we understand these terms we understand how TCP works, so we will analyze them one by one!

The transport layer

Let’s start with the transport layer. To see TCP from a higher level, let’s first look at the classic seven-layer OSI network reference model:

When we exchange data over the network, the data passes through these layers, and each layer has its own concrete protocol implementations. TCP, our topic today, is one implementation of the transport layer. When we talk about the transport layer we naturally think of TCP, but TCP is only one transport layer protocol; UDP is another common one!

I know dry words are too abstract, so I’ll capture some packets to make these layers more concrete! All packets in this article were sent with Postman and captured with Wireshark! If you are not familiar with these two tools you can look them up; I won’t say more about them here. We type the domain name www.17coding.info into Postman, send the request, and Wireshark captures the packets.

The figure shows the relationship between each layer and the captured packets. Huh! Didn’t we just talk about a 7-layer reference model? Why does the packet have only 5 layers? Note the word “reference”: the 7-layer model is a theoretical model, and in real networks the application, presentation and session layers are usually merged into a single application layer!

What is a protocol?

A protocol is an agreement that both parties abide by! For example, you can read every word of this article and understand my meaning because we both follow the grammar of the same language, which is itself a kind of protocol. Likewise, when we write code we must follow the prescribed syntax so that the compiler can compile it correctly.

Computer networks also have many protocols, such as the common application layer protocols HTTP, FTP and DNS, and the common transport layer protocols TCP and UDP. Each protocol is really a specification that both sender and receiver follow. If we follow the specification we can become implementers of the protocol too, for example by writing our own web server to handle user requests. We can even create our own protocols for others to use!

TCP header format

TCP must have a specification too, so that the communicating parties can recognize each other’s packets and exchange data. Let’s first look at the format of a TCP packet:

A TCP packet consists of a header and a data body. In the figure the header has 5 fixed-length lines and 1 variable-length line. The first five lines are fixed, and each occupies 4 bytes (32 bits), so the fixed part of the header is 5*4=20 bytes!

To make this more concrete we can capture a packet: we again send a request to www.17coding.info and look at the TCP part of the packet:

Let’s analyze TCP headers line by line:

The first line:

1. Source port: sender port

2. Destination port: receiver port

We said earlier that TCP is end-to-end, and this is where it shows: every packet carries the sender’s and the receiver’s port. Each port occupies 2 bytes (16 bits).

Line 2, Line 3:

1. Sequence number: TCP is byte-stream oriented. Data is buffered and sent in blocks, and the sequence number marks the position of the packet’s first data byte within the whole byte stream.

2. Acknowledgment number: after receiving data, the receiver replies to the sender, telling it how many bytes have been received and from which byte the next packet should start. The value is generally equal to the received sequence number + the length of the data part of the received packet.

The sequence number and acknowledgment number are essential to TCP’s reliability. We will analyze them in detail through packet captures later. Each occupies 4 bytes (32 bits)!
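The rule for the acknowledgment number can be written as a one-line calculation. The sketch below is only an illustration: the function name `expected_ack` is made up for this example, and the only TCP detail it relies on is that both numbers are 32-bit fields, so they wrap around.

```python
SEQ_SPACE = 2 ** 32  # sequence and acknowledgment numbers are 32-bit fields


def expected_ack(seq: int, payload_len: int) -> int:
    """Return the acknowledgment number for a received data packet.

    seq is the number of the packet's first data byte and payload_len
    is the number of data bytes it carries.
    """
    return (seq + payload_len) % SEQ_SPACE


# A packet carrying bytes 1000..1499 is acknowledged with 1500,
# the number of the next byte the receiver expects.
print(expected_ack(1000, 500))
```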

Line 4:

1. Data offset: this is more appropriately called header length. As mentioned earlier, part of the TCP header is variable in length, so this field identifies where the data portion of the packet starts. It occupies four bits and counts in units of 4 bytes.

2. Reserved: unused, kept for future extension. This value takes up three bits.

3. Flags: there are 9 flags in total, and each flag occupies 1 bit. All 9 flag bits can be seen in the packet capture screenshot above!

3.1. NS: Nonce Sum, related to ECN (Explicit Congestion Notification).

3.2. CWR: Congestion Window Reduced. Together with the ECE flag it supports ECN, which also uses the ECN field in the IP header. A sender sets CWR to tell the peer that it has reduced its congestion window.

3.3. ECE: ECN-Echo. When set to 1, it notifies the peer that congestion was signalled on the path from the peer to this side.

3.4. URG: Urgent, used to jump the queue on the sending side. For example, if you want to stop a file download halfway, an urgent request is sent telling the other party to stop sending data; such a packet does not wait in the queue.

3.5. ACK: Acknowledgment, marks the packet as an acknowledgment.

3.6. PSH: Push, the counterpart of URG, used to jump the queue on the receiving side: the receiver should hand the data to the application right away instead of waiting for its buffer to fill.

3.7. RST: Reset, indicates that a serious error has occurred and the TCP connection may need to be re-created. For example, if we open a website, it fails to load and we refresh with F5, the packets of the previous attempt are rejected.

3.8. SYN: Synchronize, used to request a connection. It is set during the handshake!

3.9. FIN: Finish, used when communication ends and the connection is released. It is set during the waves!

4. Window: the sender and the receiver have a send window and a receive window respectively. Before communication begins, the two sides tell each other their window sizes. The sender sets its send window according to the receiver’s receive window, and the send window is also limited by the congestion window, as discussed in the congestion control section! During transmission the window is adjusted according to the receiver’s processing capacity. This value is very important for reliable transmission and flow control in TCP! It occupies 16 bits.
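To make the flag bits concrete, here is a small sketch that turns the 9-bit flags value into flag names. The helper `decode_flags` is a made-up name; the bit order (FIN as the lowest bit, NS as the highest) follows the header layout that Wireshark displays.

```python
# Flag names from the least significant bit to the most significant bit.
FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR", "NS"]


def decode_flags(value: int) -> list:
    """Return the names of the flags set in a 9-bit flags value."""
    return [name for bit, name in enumerate(FLAG_NAMES) if value & (1 << bit)]


print(decode_flags(0x002))  # first handshake packet: SYN only
print(decode_flags(0x012))  # second handshake packet: SYN and ACK
print(decode_flags(0x011))  # a wave packet: FIN and ACK
```

Wireshark shows the same value in hexadecimal next to the flag list, for example 0x012 for a SYN, ACK packet.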

Line 5:

1. Checksum: used to check whether the packet is intact or has been corrupted or modified. This value takes up 16 bits.

2. Urgent pointer: marks the urgent data in this packet, that is, the data from the start of the packet’s data part up to the position the pointer indicates is urgent. It is only meaningful when the URG flag is set. This value takes up 16 bits.
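For the curious, the checksum arithmetic itself is short. The sketch below shows only the 16-bit ones' complement sum described in RFC 1071; real TCP computes it over a pseudo-header, the TCP header and the data together, which is left out here. The function name is made up for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)
print(hex(checksum))

# The receiver repeats the sum over data plus checksum and expects zero.
print(internet_checksum(sample + checksum.to_bytes(2, "big")))
```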

Line 6:

1. Options: the options also carry some important data. Let’s pick out a few:

1.1. MSS: Maximum Segment Size, the maximum length of data (excluding the header) that each segment can carry, as agreed by the two sides.

1.2. WS: Window Scale, also called the window factor, used to enlarge the window. We have already seen the window size field, so what is the window factor for? Early networks and hardware were limited, so only 16 bits were reserved for the window size, giving a maximum of 65535. As hardware and networks developed, 65535 was no longer enough, so the WS option was added to extend it! The option carries a shift count n, and the window factor is 2 to the power n. If WS is set, the actual window size equals the window size field multiplied by the window factor.

1.3. SACK: Selective ACK, which builds on cumulative ACK. SACK is sent only when packets arrive out of order. If the receiver gets a later packet and finds that an earlier one is missing, it tells the sender which segments it has received, so the sender knows which segment was lost and needs to be retransmitted!

2. Padding: this field pads the header so that its total length is a multiple of 4 bytes. Java uses similar alignment padding in many places!
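Putting the fields together, the fixed 20 bytes of a header can be unpacked with Python's `struct` module. This is a minimal sketch and not a full parser: `parse_tcp_header` is a made-up name, the options are returned as raw bytes without decoding, and the example segment is built by hand rather than taken from the capture.

```python
import struct


def parse_tcp_header(segment: bytes) -> dict:
    """Unpack the 20-byte fixed part of a TCP header."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4   # data offset counts 4-byte words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "flags": offset_flags & 0x1FF,      # low 9 bits: NS down to FIN
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
        "options": segment[20:header_len],  # options plus padding, if any
    }


# Hypothetical SYN packet: port 50000 -> 80, Seq=100, no options.
example = struct.pack("!HHIIHHHH", 50000, 80, 100, 0, (5 << 12) | 0x002, 65535, 0, 0)
fields = parse_tcp_header(example)
print(fields["src_port"], fields["dst_port"], fields["header_len"], fields["flags"])
```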

We pick a packet and look at its detailed header data:

1. The red section shows a TCP header length of 32 bytes and an options section of 12 bytes. We said the fixed part of the TCP header is 20 bytes, and 20+12=32.

2. The yellow lines show a window size field of 259 and a window factor of 256, so the actual window size is 259*256=66304 bytes!
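The two calculations above can be checked in a few lines. The function names are made up for this sketch; the data offset value 8 and the shift count 8 are derived from the 32-byte header and the factor 256 in the capture, not read from it directly.

```python
def header_length(data_offset: int) -> int:
    """Header length in bytes; the data offset counts 4-byte words."""
    return data_offset * 4


def effective_window(window_field: int, shift_count: int) -> int:
    """Scale the 16-bit window field by 2**shift_count."""
    return window_field << shift_count


print(header_length(8))          # 20 fixed bytes + 12 bytes of options
print(effective_window(259, 8))  # window factor 2**8 = 256
```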

What does connection-oriented mean?

As my failed confession shows, I started confessing before making sure the connection was working, and because of a poor signal she had not heard any of it by the time I finished. That wouldn’t have happened if I had first made sure the connection was working! We said TCP is connection-oriented, but how does TCP achieve that?

What does each of the three handshakes tell us?

Yes, it all starts with a handshake! We all know that TCP needs three handshakes to establish a connection. What does each handshake tell us? Would two handshakes be enough? Let’s start with a scene from a phone call:

A: Hello, can you hear me?

B: I can hear you. Can you hear me?

A: I can hear you, too.

…

To make sure the call is reliable, we usually confirm the line with these three exchanges before the real conversation starts. So are all three necessary? What is each one for?

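A: Hello, can you hear me? (A wants to know whether B can hear A.)

B: I can hear you. Can you hear me? (This lets A know that B can hear A, and B wants to know whether A can hear B.)

A: I can hear you, too. (This lets B know that A can hear B.)

…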

Only after these three exchanges can each side be sure that its own voice is heard and that it can hear the other. Only then can the conversation proper begin. Here we have to bring out the classic three-way handshake diagram:

Let’s go through the three handshakes and the state after each one:

1. Host A sends SYN=1 (SYN means that A requests a connection with B) with sequence number Seq=x. After this first handshake request is sent, A’s state is SYN_SENT. After receiving the request, B’s state changes from LISTEN to SYN_RCVD!

2. After receiving the connection request, host B sends SYN=1, ACK=1 (SYN means that B requests a connection with A; ACK means that B is responding to A’s connection request) with sequence number Seq=y and Ack=x+1. When A receives this, A’s state changes to ESTABLISHED, while B’s state is still SYN_RCVD!

3. After receiving it, host A checks whether the Ack is correct. If it is, host A sends ACK=1 with Seq=x+1 and Ack=y+1, meaning that it is responding to B’s connection request. Once B receives A’s acknowledgment, B also changes to ESTABLISHED, so both A and B are now ESTABLISHED!

A few points to note here:

1. The SYN and ACK in parentheses are flag bits in the TCP header. Seq and Ack are the sequence number and the acknowledgment number respectively.

2. After receiving the sender’s Seq, the receiver replies with an Ack equal to Seq+1, meaning that it expects the sender’s next data to start at position Seq+1.

3. After receiving A’s connection request, B sets both the SYN and ACK bits in its reply, so its own connection request and its response to A travel in the same packet. This is why only three handshakes are needed to establish a connection.
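The three steps can be replayed as a tiny simulation. This is only a sketch of the bookkeeping described above, not a network program: the function name, the dictionary layout and the example initial numbers 100 and 300 are all made up for illustration.

```python
def three_way_handshake(x: int, y: int):
    """Return (packets, final states) for initial sequence numbers x and y."""
    states = {"A": "CLOSED", "B": "LISTEN"}
    packets = []

    # 1. A -> B: SYN with Seq=x
    packets.append({"from": "A", "flags": {"SYN"}, "seq": x, "ack": None})
    states["A"] = "SYN_SENT"
    states["B"] = "SYN_RCVD"      # B has received the request

    # 2. B -> A: SYN and ACK in one packet, Seq=y, Ack=x+1
    packets.append({"from": "B", "flags": {"SYN", "ACK"}, "seq": y, "ack": x + 1})
    states["A"] = "ESTABLISHED"   # A has received B's reply

    # 3. A -> B: ACK with Seq=x+1, Ack=y+1
    packets.append({"from": "A", "flags": {"ACK"}, "seq": x + 1, "ack": y + 1})
    states["B"] = "ESTABLISHED"   # B has received A's acknowledgment

    return packets, states


packets, states = three_way_handshake(x=100, y=300)
for packet in packets:
    print(packet["from"], sorted(packet["flags"]), packet["seq"], packet["ack"])
print(states)
```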

We again send a request to www.17coding.info; below are the packets of the three-way handshake:

In the Info column we can clearly see the flag bits mentioned above in each packet’s header, together with Seq, Ack and other header fields, and Win, MSS and other option data! So the three-way handshake isn’t just about establishing a connection, it’s also about negotiating some parameters!

When I select a row with the mouse and that packet acknowledges another packet (that is, it has the ACK flag set), a small check mark appears in the No. column next to the packet being acknowledged. In the figure above, for example, I have selected the third handshake packet, so the check mark is in front of the second handshake packet.

Why does it take three handshakes and four waves?

After three handshakes a reliable connection is established and data can be transferred! When the data transfer is complete the connection must be closed, because a connection is also a resource! Closing the connection takes four waves!

Why three handshakes but four waves? Couldn’t three waves do? In principle they could! Consider the following dialog:

A: I’m done. Hang up when you’re done too.

B: OK, I’m done as well. You can hang up now!

A: OK, bye. Hanging up…

In this way three exchanges are enough for the wave. In a real network, however, when I send a request the server’s response body may be large and take a long time to transmit! So when the client asks to disconnect, the server first replies with an acknowledgment, and sends its own disconnect request only after all its data has been transferred.

A: I’m done. Hang up when you’re done too.

B: OK…

B: …

B: That’s all from me. You can hang up now.

So in most cases four waves are needed! In my own packet captures, however, I have also seen a disconnect complete with three waves (this happens when the other side sends its ACK and FIN in a single packet).

Here again we have to bring out the classic four-wave diagram:

Let’s go through the four waves and the state after each one:

1. Host A sends FIN=1 (FIN means that host A requests to close the connection), shutting down data transmission from A to B. A’s state is now FIN_WAIT_1!

2. After receiving the close request, host B sends an ACK to host A. From now on host A sends no more data to host B. A is in the FIN_WAIT_2 state and B is in CLOSE_WAIT!

3. Host B sends FIN=1, shutting down data transmission from B to A. B is now in the LAST_ACK state, and A enters TIME_WAIT when it receives the FIN!

4. After receiving the close request, host A sends an ACK to host B. From now on host B sends no more data to host A. B becomes CLOSED when it receives the ACK, and A becomes CLOSED after waiting in TIME_WAIT.
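The state changes on each side can be captured in a small transition table. The sketch below covers only the ordinary close described in the four steps; cases such as both sides closing at the same time are left out, and the event names are made up for this example.

```python
# (current state, event) -> next state, for the side that sees the event.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",    # the final ACK is sent here
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the ACK is sent here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited


host_a = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])  # closes first
host_b = run(["recv FIN", "send FIN", "recv ACK"])                  # closes second
print(" -> ".join(host_a))
print(" -> ".join(host_b))
```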

As the figure shows, A’s TIME_WAIT state lasts 2MSL before A becomes CLOSED. MSL (Maximum Segment Lifetime) is the longest time any packet can exist on the network; after that time the packet is discarded. Why does TIME_WAIT last 2MSL?

1. Host A sends an ACK to host B in the 4th wave. If B does not receive it for network reasons, B cannot close the connection and will resend its FIN. So after sending the acknowledgment A still has to wait, in case B did not receive it and sends the FIN again.

2. If A does not wait 2MSL, the client port may be reused right away. If that port is used to establish a new connection to the same server, the two connections share the same four-tuple (source IP, source port, destination IP, destination port) and stray packets from the old connection can interfere with the new one.

Let’s look at the wave packets for the request to www.17coding.info:

You may not see the four wave packets immediately when you capture! That’s because persistent connections are enabled by default in HTTP/1.1 and later! After a request the connection is not closed immediately but is kept for subsequent requests, saving the cost of reconnecting each time! If you want to capture the four waves immediately after a request, set the HTTP header Connection: close. Then you can see the complete three-way handshake and four-wave process for every request.
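For reference, this is roughly what such a request looks like on the wire. The sketch only builds the request text and does not send anything; `build_get_request` is a made-up helper, and in Postman the same effect comes from adding the header in the Headers tab.

```python
def build_get_request(host: str, path: str = "/", close: bool = True) -> bytes:
    """Build a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if close:
        lines.append("Connection: close")  # ask the server to close after replying
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_get_request("www.17coding.info").decode("ascii"))
```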

How does TCP guarantee reliable transmission?

We have already talked about connection orientation; establishing a connection is the first step toward reliable data transmission. How is data transfer kept reliable after the connection is established?

Let’s go back to the phone call. In a normal conversation both parties interact and respond to each other, instead of one person talking while the other says nothing! Consider the following scene:

A: Guess what, I met a girl online last week.

A: Then I went out to meet her yesterday.

B: 666! And then what?

A: Then we @#¥%…&

B: Oh my God, I didn’t hear what you just said. Could you say it again?

This kind of confirmation and response keeps the communication between the two parties complete and reliable. TCP uses the same mechanism to achieve reliable transmission over an unreliable network. As long as I have not received an acknowledgment, I assume the transmission failed and send again.

Stop-and-wait protocol

In the stop-and-wait protocol, after sending each packet the sender must wait for the other side’s acknowledgment before sending the next one! Stop-and-wait can play out in the following ways:

1. No error: A sends packet M1 to B, and B sends an acknowledgment to A after receiving M1. When A receives B’s acknowledgment, it sends packet M2.

2. Timeout retransmission: A sends packet M1 to B. If the packet is lost on the way, A receives no acknowledgment and resends it after a timeout. The time A waits before retransmitting is slightly longer than a packet’s round-trip time (RTT).

3. Lost acknowledgment: if B’s acknowledgment of M1 is lost on its way to A, A will resend M1 to B. Since B has already processed M1, it discards the duplicate and sends the acknowledgment of M1 to A again.

4. Late acknowledgment: A sends packet M1 to B, but B’s acknowledgment is delayed. A therefore resends M1; B discards the duplicate and sends the acknowledgment of M1 again. A thus receives more than one acknowledgment for M1, and when the late one finally arrives A simply discards it.

As we can see, the stop-and-wait protocol always waits for an acknowledgment before sending the next packet. As long as I have not received your acknowledgment, I assume you did not receive my packet and I resend it! This is reliable, but it leads to low channel utilization!
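The behaviour above can be simulated without any network at all. In this sketch the losses are scripted in advance so the result is repeatable; the function name and the way losses are described (a set of packet index and attempt number pairs) are assumptions made for the example. From the sender's point of view a lost packet and a lost acknowledgment look the same: no acknowledgment arrives before the timeout.

```python
def stop_and_wait(packets, lost_attempts):
    """Simulate stop-and-wait delivery.

    lost_attempts is a set of (packet index, attempt number) pairs for
    which either the packet or its acknowledgment is lost.
    Returns the transmission log and the total number of sends.
    """
    log = []
    for index, packet in enumerate(packets):
        attempt = 1
        while (index, attempt) in lost_attempts:
            log.append(f"{packet}: attempt {attempt} timed out, resending")
            attempt += 1
        log.append(f"{packet}: attempt {attempt} acknowledged")
    return log, len(log)


# M2 needs three sends because its first two attempts go unacknowledged.
log, sends = stop_and_wait(["M1", "M2", "M3"], lost_attempts={(1, 1), (1, 2)})
for line in log:
    print(line)
print("total sends:", sends)
```

Every send costs at least one full round trip during which the channel sits idle, which is the low utilization mentioned above.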

Pipelined transmission

Pipelined transmission means sending several packets in a row without waiting for an acknowledgment after each one. Because data is flowing on the channel continuously, channel utilization is high!

How does pipelined transmission stay reliable? The sender maintains a send window. If the send window is 5, 5 packets are sent in a row and the sender then waits for acknowledgment! When an acknowledgment arrives from the receiver, the window slides and the sixth packet is sent.

Acknowledging every packet individually can be inefficient, so there is cumulative acknowledgment! That is, if the sender sends packets 1, 2, 3 and 4, the receiver only needs to acknowledge packet 4, which means packets 1 to 4 have all been received and the fifth can be sent! But if packets 1, 2, 3 and 4 are sent and the third is lost, how is that acknowledged? TCP cumulatively acknowledges only up to packet 2 and selectively acknowledges packet 4 (the SACK mentioned under TCP header options), so the sender knows that packet 4 has arrived and only packet 3 needs to be resent.

Continuing with the earlier packet capture, we can see that the receiver does not acknowledge every packet but acknowledges several packets cumulatively:

Here the server sends several packets before the client sends a single acknowledgment.
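The receiver's side of this can be sketched in a few lines. The example keeps the simplification used in the text and numbers whole packets, whereas real TCP acknowledges byte positions; the function name `acknowledge` is made up.

```python
def acknowledge(received):
    """Return (cumulative_ack, sack_blocks) for a set of received packet numbers.

    cumulative_ack is the highest number such that every packet from 1
    up to it has arrived (0 if packet 1 is missing). sack_blocks lists
    the (first, last) ranges received beyond that point.
    """
    cumulative = 0
    while cumulative + 1 in received:
        cumulative += 1

    blocks = []
    for n in sorted(p for p in received if p > cumulative):
        if blocks and n == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], n)  # extend the current range
        else:
            blocks.append((n, n))            # start a new range
    return cumulative, blocks


print(acknowledge({1, 2, 3, 4}))  # everything arrived
print(acknowledge({1, 2, 4}))     # packet 3 lost: ack up to 2, SACK for 4
```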

Flow control and congestion control

From the above we know that establishing a connection and the acknowledgment mechanism make TCP transmission reliable! But not every computer has the same processing power: what if I send too fast for the other side to keep up? How do the two sides coordinate the rate at which data is sent and received?

Byte-based sliding window

When introducing the TCP header we already mentioned the sliding window and the related control parameter Win! We also mentioned the receive window and the send window! How are they related?

Suppose A needs to send data to B. B must first tell A how large its receive window is, and A sets its own send window accordingly! A’s send window cannot be larger than B’s receive window. Before data transfer starts, the initial window settings are as shown below:

As the figure above shows, B’s receive window is set to 10 bytes, so A’s send window cannot be more than 10 bytes! When data transmission starts, A splits the data into multiple packets for sending, as shown in the following figure:

Until it receives an acknowledgment from B, A’s window does not slide, which means A can send at most 10 bytes of data. When B receives data and sends an acknowledgment to A, A’s window slides, as shown below:

Now A can send the 11th and 12th bytes as well. If B’s processing capacity drops, B can also tell A to shrink its send window! In this way the sending and receiving capacities of both sides are well coordinated, which gives TCP both reliable transmission and flow control!

Suppose the transmission continues and the packet carrying bytes 3, 4 and 5 is lost on the way, but the data after it is received. Will A’s send window move?

In this case A’s send window does not move. When B receives the later packets, it sets the Ack to A to 3 and sets a SACK (described under TCP header options) in the options, telling A which data has been received so that A knows which data needs to be retransmitted.
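The sender's bookkeeping for the 10-byte example can be sketched as a small class. This is a simplified model under stated assumptions: bytes are numbered from 1 as in the figures, retransmission and SACK handling are left out, and the class and method names are made up.

```python
class SendWindow:
    """Sender-side sliding window over a byte stream numbered from 1."""

    def __init__(self, size: int):
        self.size = size     # limit taken from the receiver's window
        self.base = 1        # oldest byte not yet acknowledged
        self.next_byte = 1   # next byte that has never been sent

    def sendable(self) -> int:
        """How many new bytes may be sent right now."""
        return max(0, self.base + self.size - self.next_byte)

    def send(self, count: int) -> None:
        if count > self.sendable():
            raise ValueError("would exceed the send window")
        self.next_byte += count

    def on_ack(self, next_expected: int, new_size=None) -> None:
        """Slide the window; next_expected is the first byte B still needs."""
        if next_expected > self.base:
            self.base = next_expected
        if new_size is not None:
            self.size = new_size  # the receiver may shrink or grow the window


window = SendWindow(size=10)
window.send(10)            # bytes 1..10 are in flight
print(window.sendable())   # nothing more may be sent yet
window.on_ack(3)           # B has bytes 1 and 2 and still needs byte 3
print(window.sendable())   # bytes 11 and 12 may now be sent
window.on_ack(3)           # a repeated Ack=3 does not move the window
print(window.sendable())
```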

Congestion control

The sliding window coordinates the sending and receiving capacity of the two parties. But real networks are very complex, and there can be tens of thousands of senders and receivers on the same network! If everyone transmits data and occupies the network with no control measures at all, the whole network becomes congested or even paralyzed.

If I want to drive from Shenzhen to Guangzhou, I take the expressway. If I were the only one driving, the road would be clear! But I don’t own the expressway; everyone can use it! So on a holiday everyone swarms onto it, and the expressway’s capacity does not grow just because it is a holiday! At such times traffic control and flow-limiting measures are often needed to ease congestion!

1. The green line represents the ideal situation. Suppose the expressway’s capacity is 100! When no more than 100 vehicles need to pass, all of them can pass! When more than 100 need to pass, 100 still pass each time, so the road carries a stable load.

2. The red line represents the situation without any traffic control. Suppose the expressway’s capacity is 100! When no more than 100 vehicles need to pass, there is already slight congestion! As the number of vehicles grows, congestion becomes severe and the road may even be paralyzed!

3. The blue line represents the situation under traffic control. Suppose the expressway’s capacity is 100! When no more than 100 vehicles need to pass, there is slight congestion! But as the number of vehicles grows, the road keeps carrying a high load and is never paralyzed!

The network is like the expressway, transmitting packets is like vehicles passing, and TCP is like the traffic police, keeping data transmission in order! So what does TCP do?

Slow start and congestion avoidance

The sender maintains a congestion window, CWND (note that the amount of data actually sent is bounded by both the congestion window and the send window mentioned above!), initially set to 1. If no packet is lost, the congestion window is raised to 2! If again no packet is lost, it is raised to 4! It keeps doubling like this until it reaches 16! After that it grows by one at a time, to 17, 18, 19 and so on, until it matches the send window. The doubling phase is called slow start, the one-at-a-time phase is called congestion avoidance, and 16 is the slow start threshold…

Doesn’t it feel a bit like pushing your luck?

I can’t get in there… … I’ll just go in there and stay… … I just..

If packet loss is detected during sending, the congestion window is reset to 1 and the new slow start threshold is set to half of the congestion window at the moment of loss. For example, if loss occurs when the congestion window is 24, the new slow start threshold becomes 12! If you understood the description above, the diagram below is not hard to follow!
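The growth rule is easy to trace in code. The sketch below follows the simplified description in the text: the window is counted in packets, it changes once per round trip, and the cap imposed by the send window is ignored. The function name and the lower bound of 2 on the threshold are assumptions made for the example.

```python
def next_cwnd(cwnd: int, ssthresh: int, loss: bool):
    """Return (cwnd, ssthresh) for the next round trip."""
    if loss:
        return 1, max(cwnd // 2, 2)               # restart slow start from 1
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # slow start: double
    return cwnd + 1, ssthresh                     # congestion avoidance: add one


cwnd, ssthresh = 1, 16
history = [cwnd]
for _ in range(12):
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, loss=False)
    history.append(cwnd)
print(history)                       # 1, 2, 4, 8, 16, then 17, 18, ...
print(next_cwnd(24, 16, loss=True))  # loss at 24: window 1, threshold 12
```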

Fast retransmit

We have talked about cumulative acknowledgment and selective acknowledgment; both are related to fast retransmit! When the receiver notices that a packet is missing, it repeats the same acknowledgment for each later packet that arrives, instead of staying silent until it can acknowledge cumulatively. When the sender receives three duplicate acknowledgments, it concludes that the packet is lost and retransmits it at once, without waiting for the retransmission timer to expire!

As the following figure shows, when the packet is lost the receiver’s Ack stays at 50 while the SACK reports the bytes from 60 to 89. From this the sender knows that bytes 50 to 59 are lost and retransmits them!
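On the sender's side, fast retransmit comes down to counting repeated acknowledgment numbers. The sketch below is a simplified illustration: it looks only at the Ack values and ignores SACK blocks, and the function name and the sample Ack sequence are made up.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the Ack numbers at which a fast retransmit would be triggered.

    acks is the sequence of acknowledgment numbers seen by the sender.
    A retransmit fires when the same number has been repeated
    `threshold` times after its first appearance.
    """
    triggers = []
    last_ack = None
    duplicates = 0
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == threshold:
                triggers.append(ack)  # resend the data starting at `ack`
        else:
            last_ack = ack
            duplicates = 0
    return triggers


# Bytes 50..59 are lost, so every later packet produces another Ack=50.
print(fast_retransmit_points([30, 40, 50, 50, 50, 50, 100]))
```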

Fast recovery

If the congestion window dropped to 1 every time a packet was lost, that would be rather silly. If only there were a fast recovery mechanism! TCP has one! When loss is detected through duplicate acknowledgments, TCP does not go back to slow start but moves straight into congestion avoidance! That is, it continues growing additively from the new slow start threshold!

Let’s go back to the definition of TCP. Does it make more sense now?
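The difference between the two reactions to loss can be put side by side. This sketch follows the simplified description in the text and leaves out details of real implementations, such as the temporary window inflation during recovery; both function names and the lower bound of 2 on the threshold are assumptions.

```python
def on_triple_duplicate_ack(cwnd: int):
    """With fast recovery: return (cwnd, ssthresh) after three duplicate acks."""
    ssthresh = max(cwnd // 2, 2)
    return ssthresh, ssthresh    # continue additively from the new threshold


def on_timeout(cwnd: int):
    """Without fast recovery: return (cwnd, ssthresh) after a loss."""
    return 1, max(cwnd // 2, 2)  # fall back to slow start


print(on_triple_duplicate_ack(24))  # window resumes at 12
print(on_timeout(24))               # window restarts at 1
```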