• Overview
  • Multiplexing and demultiplexing
  • Connectionless transport: UDP
    • Characteristics
    • Segments
    • The difference between UDP and TCP
  • Connection-oriented transport: TCP
    • Characteristics
    • MSS
    • Segments
    • Connection management
      • Three-way handshake
      • Why three handshakes instead of two?
    • Four-way wave
      • Why does the client end up waiting for 2MSL?
      • Why is it three handshakes to establish a connection and four waves to close it?
  • Reliable transmission
    • ARQ
      • Continuous ARQ
      • Stop-and-wait ARQ
    • The sliding window
  • Flow control
  • TCP Congestion Control Algorithm
    • Slow start
    • Congestion avoidance
    • Fast Retransmit and Recovery (FRR)
  • Transport-layer protocols used by application-layer protocols
  • Questions
    • Q: Why is multithreading faster for large file downloads?
    • Q: Can you guess how many HTTP requests can be sent over a TCP connection?
    • Q: Why does TCP establish a connection with a three-way handshake, but close a connection with a four-way handshake?
    • Q: Which TCP packet length is determined in the TCP three-way handshake?
    • Q: Web page request process

Overview

Function: a transport-layer protocol provides logical communication between application processes running on different hosts, while the network layer provides logical communication between hosts.

Multiplexing and demultiplexing

Demultiplexing: delivering the data in a transport-layer segment to the correct socket. Multiplexing: at the source host, gathering data chunks from different sockets, adding header information to each chunk to create segments, and passing the segments down to the network layer.
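The two directions can be pictured with a few lines of Python (a toy model; the port numbers, addresses and socket names are made up). A UDP segment is demultiplexed on the destination port alone, while a TCP segment is demultiplexed on the full source/destination four-tuple:

```python
# Toy demultiplexing tables (hypothetical values, not a real network stack).
udp_sockets = {53: "dns_socket", 123: "ntp_socket"}                 # dst port -> socket
tcp_sockets = {("10.0.0.1", 5000, "10.0.0.2", 80): "http_socket"}   # 4-tuple -> socket

def demux_udp(dst_port):
    # A UDP segment is delivered based on its destination port alone.
    return udp_sockets.get(dst_port)

def demux_tcp(src_ip, src_port, dst_ip, dst_port):
    # A TCP segment is delivered based on all four values, so two
    # connections to the same server port map to different sockets.
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))
```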

Connectionless transport: UDP

Characteristics

  • Connectionless: there is no handshake between the sending and receiving transport-layer entities, so no delay is introduced to establish a connection.

  • No connection state: TCP maintains connection state in the end systems, including send and receive buffers, congestion-control parameters, and sequence and acknowledgement numbers; UDP maintains none of this.

  • Small packet-header overhead: the UDP header is 8 bytes, versus at least 20 bytes for TCP.

  • Finer application-level control over what data is sent, and when: a UDP packet is sent as soon as it is packaged, without TCP's congestion control and other restrictions, whereas TCP's reliable delivery may introduce long delays.

  • Datagram oriented

Why is TCP byte-stream oriented while UDP is datagram oriented?

In datagram-oriented transmission, UDP sends whatever the application layer hands down, one packet at a time. The application must therefore choose messages of an appropriate size: if a packet is too long, the IP layer has to fragment it, which lowers efficiency; if it is too short, the relative overhead of the IP header becomes large. UDP neither merges nor splits the packets handed down by the application layer; it preserves their boundaries.

An application interacts with TCP one data block at a time (of varying sizes), but TCP treats the application's data as an unstructured stream of bytes. TCP has a buffer: when an application hands down a block that is too long, TCP can cut it into pieces before sending; if an application delivers only one byte at a time, TCP can wait for enough bytes to accumulate before sending a segment.
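The contrast can be sketched with a toy model (no real sockets involved): UDP keeps each application write as one datagram, while TCP joins writes into a single unstructured byte stream.

```python
def udp_send(writes):
    # UDP: one datagram per application write; boundaries are preserved.
    return list(writes)

def tcp_send(writes):
    # TCP: writes are appended to a byte stream; boundaries are lost,
    # and the stream may later be cut into segments at arbitrary points.
    return b"".join(writes)

writes = [b"hello", b"world"]
assert udp_send(writes) == [b"hello", b"world"]   # two datagrams
assert tcp_send(writes) == b"helloworld"          # one byte stream
```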

The MSS value in the options field of the SYN packets exchanged before a TCP connection is established lets the communicating parties agree on a maximum segment length. If the application layer delivers too much data, the data is segmented and sent; transmission is then controlled by the sliding-window protocol.

Segments

UDP provides end-to-end error detection, but does nothing to recover from errors: some UDP implementations discard damaged segments, while others hand them to the application with a warning.
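That error detection is the 16-bit Internet checksum carried in the UDP header. A minimal sketch of the one's-complement sum, in the style of RFC 1071, might look like this:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as used by UDP, TCP and IP."""
    if len(data) % 2:
        data += b"\x00"                            # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # add the next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF
```

Re-running the sum over the data plus its checksum yields 0, which is how the receiver verifies a segment.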

The difference between UDP and TCP

Connection-oriented transport: TCP

Characteristics

  • Connection-oriented reliable transportation
  • Full-duplex service: once a connection is established between processes A and B, application-layer data can flow from A to B and from B to A at the same time.
  • A point-to-point connection is between a single sender and a single receiver.
  • Byte-stream oriented

Wikipedia TCP

MSS

The send buffer is set up during the TCP three-way handshake. TCP takes data from the send buffer and packs it into segments; the amount of data per segment is limited by the Maximum Segment Size (MSS).

The MSS is usually set according to the MTU, so that a TCP segment (data plus TCP/IP headers) fits within the MTU. The typical MSS value is 1460 bytes.

In a word, the MSS is the maximum length of data that can be carried in one TCP segment. If the data delivered by the upper layer is too large, it is segmented; the segmentation happens at the transport layer of the sender, and the data portions of the segments are reassembled at the transport layer of the receiver.

Note: the MSS refers to the maximum length of the data part of a TCP segment, not the whole segment. Length of the whole TCP segment = TCP header length + TCP data length.

The MSS value is determined by the communicating parties through negotiation during the TCP three-way handshake. When Ethernet is used at the link layer, the MTU at the IP layer is 1500 bytes; removing the IP header (20 bytes) and the TCP header (20 bytes) leaves 1460 bytes, so by default the MSS value in the TCP options field is 1460 bytes = 1500 - 20 - 20. Under the Internet standard, the MTU at the IP layer is 576 bytes, so the corresponding MSS value is 536 bytes = 576 - 20 - 20.

  • TCP Packet segment size (MSS) and MTU

Which TCP packet length is determined in the TCP three-way handshake? The MSS option is carried only in SYN segments, i.e. the MSS field is present only when SYN=1. When a client wants to download data from a server over TCP:

  • (1) The client sends a SYN request packet whose options field contains an MSS value (MSS = MTU - IP header length - TCP header length), announcing the maximum segment size it is prepared to receive.
  • (2) After receiving a SYN packet, the server returns a SYN+ACK packet to the requestor, in which the Options field also has an MSS value.
  • (3) The two parties select the smaller of the MSS values in the SYN and SYN+ACK packets as the MSS of this TCP connection, achieving a two-way negotiation of the MSS.
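The arithmetic above can be written out as a short sketch (assuming 20-byte IP and TCP headers, i.e. no options):

```python
IP_HEADER = 20   # bytes, without options
TCP_HEADER = 20  # bytes, without options

def mss_from_mtu(mtu: int) -> int:
    # Each side derives the MSS it advertises from its own MTU.
    return mtu - IP_HEADER - TCP_HEADER

def negotiate_mss(client_mtu: int, server_mtu: int) -> int:
    # The connection uses the smaller of the two advertised values.
    return min(mss_from_mtu(client_mtu), mss_from_mtu(server_mtu))

assert mss_from_mtu(1500) == 1460       # Ethernet
assert mss_from_mtu(576) == 536         # Internet-standard minimum
assert negotiate_mss(1500, 576) == 536  # the smaller MSS wins
```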

To sum up, the question at the beginning can now be answered: the maximum segment size (MSS) of a TCP connection can be determined after the second handshake.

  • What’s the difference between Ethernet and the Internet? – Answer of sand sculpture – Zhihu

Segments

The TCP header is usually 20 bytes.

Connection management

Three-way handshake

Refer to TCP’s three-way handshake and four-way wave

  • The TCP server process first creates the transmission control block (TCB) and prepares to accept connection requests from client processes; the server is now in the LISTEN state.
  • The TCP client process creates its own TCB and sends a connection-request packet to the server with SYN=1 and a chosen initial sequence number seq=x; the client then enters the SYN-SENT state. According to TCP, a SYN segment (SYN=1) cannot carry data but consumes one sequence number.
  • After receiving the request packet, the TCP server, if it agrees to the connection, replies with an acknowledgement packet in which SYN=1, ACK=1, ack=x+1, and its own initial sequence number seq=y; the server then enters the SYN-RCVD state. This segment likewise carries no data but again consumes one sequence number.
  • After receiving this acknowledgement, the TCP client sends an acknowledgement back to the server with ACK=1, ack=y+1 and seq=x+1. The TCP connection is now established and the client enters the ESTABLISHED state. According to TCP, an ACK segment may carry data, but consumes no sequence number if it does not.
  • After receiving the client's acknowledgement, the server also enters the ESTABLISHED state, and the two sides can communicate.
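The sequence and acknowledgement numbers in the three steps can be summarized as a sketch (x and y stand for the arbitrary initial sequence numbers chosen by the client and server):

```python
def three_way_handshake(x, y):
    syn     = {"SYN": 1, "seq": x}                           # 1) client -> server
    syn_ack = {"SYN": 1, "ACK": 1, "seq": y, "ack": x + 1}   # 2) server -> client
    ack     = {"ACK": 1, "seq": x + 1, "ack": y + 1}         # 3) client -> server
    return [syn, syn_ack, ack]
```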

Why three handshakes instead of two?

In short, the main purpose is to prevent a stale connection-request packet that suddenly arrives at the server from causing an error.

Suppose connections were established with only two handshakes. A client sends a first connection request that is not lost, but is delayed for a long time at some network node. Because no confirmation arrives, the client assumes the request was lost and sends it again; the client and server then complete the connection with two handshakes, transfer data, and close the connection. When the stale request finally reaches the server, it should be invalid, but under a two-handshake scheme the server would establish a new connection on receiving it, leading to unnecessary errors and wasted resources.

With a three-way handshake, even if the stale packet arrives and the server replies with an acknowledgement, the client does not send a confirmation in return. Since the server receives no confirmation, it knows the client has not actually requested a connection.

Another way to think about it: the first handshake only confirms that the sender can send; the second confirms that the receiver can receive and the sender can send, but not yet that the sender can receive; the third confirms that both parties can send and receive properly.

Four-way wave

  • The client process sends a connection-release packet and stops sending data. In its header, FIN=1 and the sequence number is seq=u (one more than the sequence number of the last byte of previously transmitted data). The client then enters the FIN-WAIT-1 state. According to TCP, a FIN packet consumes one sequence number even if it carries no data.
  • After receiving the connection-release packet, the server sends an acknowledgement with ACK=1, ack=u+1 and its own sequence number seq=v, and enters the CLOSE-WAIT state. TCP notifies the higher-level application process that the client-to-server direction has been released. The connection is now half-closed: the client has no more data to send, but if the server sends data the client still accepts it. This state lasts for the duration of CLOSE-WAIT.
  • After receiving the server's acknowledgement, the client enters the FIN-WAIT-2 state and waits for the server's connection-release packet (it may first receive the server's final data).
  • After sending its last data, the server sends a connection-release packet with FIN=1 and ack=u+1 to the client. Since the server has been in the half-closed state, assume its sequence number is now seq=w. The server then enters the LAST-ACK state and waits for the client's confirmation.
  • After receiving the server's connection-release packet, the client sends an acknowledgement with ACK=1, ack=w+1 and seq=u+1, and enters the TIME-WAIT state. Note that the TCP connection is not yet released: only after 2*MSL (twice the maximum segment lifetime) does the client revoke its TCB and enter the CLOSED state.
  • The server enters the CLOSED state as soon as it receives the client's acknowledgement, likewise revoking its TCB and terminating the TCP connection. As you can see, the server ends the TCP connection earlier than the client.
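The four segments and their sequence/acknowledgement numbers can be sketched the same way (u, v and w as in the steps above):

```python
def four_way_wave(u, v, w):
    fin1 = {"FIN": 1, "seq": u}                          # 1) client: no more data
    ack1 = {"ACK": 1, "seq": v, "ack": u + 1}            # 2) server ACKs; half-closed
    fin2 = {"FIN": 1, "ACK": 1, "seq": w, "ack": u + 1}  # 3) server: done sending too
    ack2 = {"ACK": 1, "seq": u + 1, "ack": w + 1}        # 4) client ACKs, waits 2*MSL
    return [fin1, ack1, fin2, ack2]
```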

Why does the client end up waiting for 2MSL?

MSL (Maximum Segment Lifetime) : TCP allows different implementations to set different MSL values.

First, to ensure that the client's final ACK packet reaches the server. The ACK may be lost: from the server's point of view, it has sent its FIN+ACK requesting disconnection and received no response, so it assumes its release request was lost and retransmits it. The client, still within the 2MSL period, receives the retransmitted packet, responds with another acknowledgement, and restarts the 2MSL timer.

Second, to prevent the "invalid connection request segment" discussed under the three-way handshake from appearing in a later connection. In the 2MSL after the client sends its last acknowledgement, all segments produced during the lifetime of this connection can disappear from the network, so a new connection will not encounter request packets left over from the old one.

  • Why is the wait for TCP4 wave set to 2MSL?


Why is it three handshakes to establish a connection and four waves to close it?

When establishing a connection, a server in the LISTEN state that receives a SYN packet can put its ACK and SYN into a single reply to the client. When closing a connection, however, receiving the other side's FIN only means that the peer will send no more data; the peer can still receive, and the server may not yet have sent all of its own data. So the server need not close immediately: it can keep sending data for a while and only later send its own FIN agreeing to close the connection. The ACK and FIN are therefore usually sent separately, which adds the extra step.

Reliable transmission

ARQ

Automatic Repeat reQuest (ARQ) is one of the error-correction protocols used at the data link and transport layers of the OSI model. Through two mechanisms, acknowledgement and timeout, it achieves reliable transmission on top of an unreliable service. It includes stop-and-wait ARQ and continuous ARQ.

TCP uses the continuous ARQ protocol.

The ARQ protocol includes these mechanisms

  • Error detection
  • The receiver responds with a positive acknowledgement (ACK) or a negative acknowledgement (NAK)
    • Cumulative acknowledgement: the receiver does not have to acknowledge each received packet individually; after several packets arrive, it sends one acknowledgement for the last packet that arrived in sequence.
  • Timeout retransmission

ARQ protocol corrects errors by:

  • Discards received packets containing errors.
  • Request the sending point to resend the packet.

Refer to ARQ on Wikipedia

Continuous ARQ

Continuous ARQ overcomes the drawback of stop-and-wait ARQ, namely the long wait for each ACK. The protocol sends a group of packets back-to-back (pipelined) and then waits for the ACKs of those packets.

Pipelined transmission means that the sender can send multiple packets consecutively without having to wait for confirmation after each packet is sent. Continuous ARQ is usually used in conjunction with the sliding window protocol.

Go-back-N retransmission

  • The receiver discards all packets from the first missing one onward.
  • After receiving a NACK, the sender resends the packets specified in the NACK.
    • Disadvantage: the sender cannot learn exactly which packets the receiver has already received correctly.
    • For example, if the sender sends five packets and the third is lost, the receiver can only acknowledge the first two; since the fate of the last three is unknown, all three must be retransmitted.
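The five-packet example can be sketched as a toy go-back-N receiver (a hypothetical helper, not a real implementation): with packet 3 lost, only packets 1 and 2 are acknowledged and 3, 4 and 5 must be resent.

```python
def gbn_receive(sent, lost):
    # Accept packets only while they arrive in order and undamaged.
    acked = 0
    for pkt in sent:
        if pkt in lost or pkt != acked + 1:
            break                                 # everything from here on is discarded
        acked = pkt
    resend = [p for p in sent if p > acked]       # sender must go back and resend these
    return acked, resend

assert gbn_receive([1, 2, 3, 4, 5], lost={3}) == (2, [3, 4, 5])
```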

Selective Repeat

  • The sending point sends packets continuously but sets a timer for each packet.
  • When no ACK for a packet has been received within a certain period, the sender resends only that unacknowledged packet.

Stop-and-wait ARQ

The stop-and-wait protocol works as follows:

  • The sending point sends a packet to the receiving point and waits for the receiving point to reply with an ACK and start timing.
  • While waiting, the sending point stops sending new packets.
  • If the packet is not successfully received, the receiver does not send an ACK; after waiting a certain period of time, the sender sends the packet again.
  • Repeat the above steps until an ACK is received from the receiving point.
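These steps can be sketched as a toy loop (the lossy channel is simulated by a function that says which send attempts get through):

```python
def stop_and_wait(packets, delivered_on_attempt):
    # One packet in flight at a time: send, wait for ACK, retransmit on timeout.
    log, attempt = [], 0
    for pkt in packets:
        while True:
            attempt += 1
            log.append(("send", pkt))
            if delivered_on_attempt(attempt):   # ACK came back in time
                log.append(("ack", pkt))
                break                           # move on to the next packet
            log.append(("timeout", pkt))        # no ACK: retransmit the same packet
    return log
```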

Disadvantages

  • Long wait times result in low data transfer speeds

The sliding window

The sender maintains the send window, and the receiver maintains the receive window.

Rules:

  • (1) Any data that has been sent must be temporarily retained until its acknowledgement is received, so that it is available for timeout retransmission.
  • (2) The sender's window can slide forward by several sequence numbers only when sender A receives an acknowledgement segment from the receiver.
  • (3) When data sent by A is still unacknowledged after a period of time (controlled by a timeout timer), A uses the go-back-N protocol to return to the last acknowledged position and resend that portion of the data.

There are four concepts in the send window

  • Data sent and acknowledged (outside the send window and send buffer)
  • Data sent but not acknowledged (within the send window)
  • Data allowed to be sent but not yet sent (within the send window)
  • Data that is temporarily not allowed to be sent in the buffer outside the send window.

There are also four concepts in the receive window

  • Data received, acknowledged, and delivered to the host (outside the receive window and receive buffer)
  • Data received out of order (within the receive window)
  • Data allowed to be received but not yet received (within the receive window)
  • Data not allowed to be received (outside the receive window).

Flow control

Each side of a TCP connection has a receive buffer. When TCP receives correct, in-order bytes, it places the data in the receive buffer; the application reads from this buffer, but not necessarily as soon as data arrives. TCP provides a flow-control service to prevent the sender from sending too much data too fast and overflowing the receive buffer.

Use the sliding window protocol

For flow control, the sender maintains a receive-window variable (rwnd), which tells it how much free buffer space remains at the receiver, and keeps the amount of unacknowledged data within rwnd.

The receiver places the current rwnd value in the receive-window field of the TCP header of the segments it sends.

For example:
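A sketch of the sender's rule (hypothetical byte counts): the amount of unacknowledged data in flight, plus the next chunk, must stay within the advertised rwnd.

```python
def can_send(last_byte_sent, last_byte_acked, rwnd, next_chunk):
    # Flow-control rule: unacknowledged bytes in flight, plus the next
    # chunk, must not exceed the receiver's advertised window.
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + next_chunk <= rwnd

assert can_send(1000, 600, rwnd=500, next_chunk=100) is True   # 400 + 100 <= 500
assert can_send(1000, 600, rwnd=500, next_chunk=200) is False  # would overflow
```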

TCP Congestion Control Algorithm

Like flow control, congestion control limits the sender's sending rate, but for a different reason: flow control prevents overflow of the receiver's buffer, while congestion control reduces the sending rate because the network itself is congested.

TCP congestion control adopts four algorithms: slow start, congestion avoidance, fast retransmission and fast recovery.

See:

  • TCP congestion control
  • Discussion on TCP congestion control algorithm

Slow start

The sender maintains a state variable called the congestion window, cwnd. The size of the congestion window depends on the level of congestion in the network and changes dynamically. The sender sets its send window equal to the congestion window; taking the receiver's capacity into account, the send window may be smaller than the congestion window.

The idea of the slow-start algorithm is not to send a large amount of data at the beginning, but to probe the level of congestion in the network by gradually increasing the congestion window: the window doubles every round-trip time, i.e. it grows exponentially.

To illustrate the slow-start algorithm, the congestion window size below is measured in numbers of segments; in a real implementation it is measured in bytes.

Congestion avoidance

To prevent the network congestion that unbounded cwnd growth would cause, a slow-start threshold state variable, ssthresh, is also maintained. It is used as follows:

  • When cwnd < ssthresh, the slow-start algorithm is used.
  • When cwnd > ssthresh, the congestion-avoidance algorithm is used instead.
  • When cwnd = ssthresh, either slow start or congestion avoidance may be used.

Whether in the slow-start phase or the congestion-avoidance phase, as soon as the sender judges that the network is congested, it sets the slow-start threshold to half the size of the send window at the moment congestion occurred, resets the congestion window to 1, and runs the slow-start algorithm again.
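Combining these rules gives a toy trace of the congestion window per round-trip (measured in segments, as in the examples above; a sketch, not a real TCP implementation):

```python
def cwnd_trace(rounds, ssthresh, loss_rounds=()):
    # Exponential growth below ssthresh, linear growth above it;
    # on loss, ssthresh = cwnd // 2 and cwnd restarts at 1.
    cwnd, trace = 1, []
    for r in range(rounds):
        trace.append(cwnd)
        if r in loss_rounds:
            ssthresh = max(cwnd // 2, 1)
            cwnd = 1                      # congestion detected: slow start again
        elif cwnd < ssthresh:
            cwnd *= 2                     # slow start: doubles per RTT
        else:
            cwnd += 1                     # congestion avoidance: +1 per RTT
    return trace

assert cwnd_trace(6, ssthresh=8) == [1, 2, 4, 8, 9, 10]
```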

Fast Retransmit and Recovery (FRR)

Fast retransmission requires the receiver to send a duplicate acknowledgement as soon as it receives an out-of-order segment (so that the sender learns early that a segment did not reach the other side), rather than waiting until it has data of its own to send with a piggybacked acknowledgement. The fast-retransmit algorithm says that as soon as the sender receives three consecutive duplicate acknowledgements, it should immediately retransmit the missing segment rather than wait for the retransmission timer to expire. (Without FRR, TCP suspends transmission via the timer when a packet is lost; during this pause, no new or duplicate packets are sent.)

  • Wikipedia -TCP congestion control

Fast retransmit is an improvement that reduces the time a TCP sender waits before retransmitting a lost segment. Each time a sender transmits a segment, it starts a timeout timer; if no acknowledgement for that segment arrives within a certain period, the sender assumes the segment was lost in the network and retransmits it. The timeout value is based on TCP's estimate of the RTT.

Duplicate cumulative acknowledgements (DupAcks) are the basis for this phase, which works as follows. When the receiver receives a data segment, it sends back the segment's sequence number plus its length in bytes as the acknowledgement number, indicating that it expects the segment with the next sequence number. But if a segment arrives out of order, i.e. the segment corresponding to the previously expected acknowledgement number is missing, the receiver immediately sends an acknowledgement that repeats the previous acknowledgement number. If the sender then receives more than one acknowledgement with the same acknowledgement number while the timeout timer for the corresponding segment has not yet expired, these are duplicate acknowledgements and fast retransmission is required.

Fast retransmission is used together with the fast recovery algorithm, which has the following two points:

  • ① When the sender receives three consecutive duplicate acknowledgements, it performs the "multiplicative decrease" step, halving the ssthresh threshold, but it does not then run the slow-start algorithm.
  • ② The sender now assumes the network is probably not congested, reasoning that it would not be receiving multiple duplicate acknowledgements if it were. So instead of slow start, it sets cwnd to the new ssthresh and then runs the congestion-avoidance algorithm.
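The two points can be sketched as a toy ACK handler (the state dictionary and its keys are hypothetical): three duplicate acknowledgements trigger an immediate retransmission, halve ssthresh, and set cwnd to the new ssthresh rather than back to 1.

```python
def on_ack(state, ack_no):
    """Toy fast-retransmit/fast-recovery reaction to one incoming ACK."""
    if ack_no == state["last_ack"]:
        state["dup"] += 1
        if state["dup"] == 3:                               # third duplicate ACK
            state["ssthresh"] = max(state["cwnd"] // 2, 1)  # multiplicative decrease
            state["cwnd"] = state["ssthresh"]               # fast recovery: no slow start
            state["retransmit"] = ack_no                    # resend the missing segment now
    else:                                                   # new data acknowledged
        state["last_ack"], state["dup"] = ack_no, 0
    return state
```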

Transport-layer protocols used by application-layer protocols

  • SSH: port 22, TCP
  • MySQL: port 3306, TCP

Port numbers 0 to 1023 are restricted: they are reserved for well-known services such as HTTP (port 80) and FTP (port 21).

Q: Why is multithreading faster for large file downloads?

  • Why does multithreaded downloading speed up?
  • What’s the real reason multithreading downloads a large file faster?
  • The ultimate factor that determines the speed of downloading large files is the amount of network bandwidth occupied by the user’s downloading process in real time. Other factors are negligible compared with it.
  • Bandwidth occupied by the user's processes in real time ≤ the network bandwidth available in real time, always.
  • The traditional TCP congestion-detection mechanism has a fatal flaw: once packet loss is detected, the sending rate is immediately cut in half. If there is no further loss at the halved rate, the rate then grows from that value by a fixed increment (linear growth) until packet loss occurs again (i.e. the real-time available bandwidth is reached), whereupon it halves once more, repeating until the file is downloaded.
  • At any given moment, some threads are paying the halve-on-loss penalty, some are in doubling slow start, and some are in the linear-growth phase. A weighted average of the download rates of the many threads yields a relatively smooth download curve, and this curve sits above the single-thread download rate most of the time. This is where the multithreaded download rate gains its advantage.

Q: Can you guess how many HTTP requests can be sent over a TCP connection?

  • Guess how many HTTP requests can be sent over a TCP connection
  • Ha, guess how many HTTP requests can be sent over a TCP connection?

Q: Why does TCP establish a connection with a three-way handshake, but close a connection with a four-way handshake?

  • Why does TCP establish a connection with a three-way handshake, but close a connection with a four-way handshake?

Q: Which TCP packet length is determined in the TCP three-way handshake?

We know that one of the guarantees of TCP's transmission reliability is that TCP divides the data delivered by the application layer into the blocks it considers most suitable for sending. The largest such block is the MSS, an option field in the TCP header.

As described in the MSS section above, during the three-way handshake the MSS value is carried in the options field of the SYN packet and of the SYN+ACK packet, and the two parties use the smaller of the two values as the MSS of the connection. The maximum segment size (MSS) of a TCP connection can therefore be determined after the second handshake.

The MTU (maximum transmission unit) is determined by the specific network: the MTU of Ethernet is 1500 bytes, and the Internet-standard MTU is 576 bytes.

Q: Web page request process

  • The request process for a Web page