One: Overview

TCP is a reliable transport-layer protocol, even though the network layer beneath it uses the unreliable IP protocol. TCP provides the reliability itself: when a packet is sent, TCP starts a timer, and if the timer expires before the receiver has acknowledged the packet, the packet is treated as lost and retransmitted. Retransmission therefore depends on the acknowledgment mechanism. This article covers retransmission and acknowledgment in detail, along with fast retransmission and delayed acknowledgment.

Two: timeout retransmission simulation

Write the packetdrill script below with the ACK response commented out, so that the data sender receives no acknowledgment and retransmits the data. Then capture the packets with tcpdump, save them to a file, and open the file in Wireshark for analysis. The script simulates a classic three-way handshake followed by a data transfer.

// Three-way handshake
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 4000 <mss 1000>
+0 > S. 0:0(0) ack 1 <...>
+.1 < . 1:1(0) ack 1 win 4000
+0 accept(3, ..., ...) = 4
// Send data
+0 write(4...
// ACK response commented out so that the sender retransmits
// +.1 < . 1:1(0) ack 1001 win 1000
+0 `sleep 1000000`

tcpdump -i any -nn -vv -w /home/retry.pacp port 8080

Three: timeout retransmission analysis

The first thing to look at is the number of timeout retransmissions, which is controlled by the parameter /proc/sys/net/ipv4/tcp_retries2 and generally defaults to 15. Use the following command to view it:

[root@zsl home]# cat /proc/sys/net/ipv4/tcp_retries2
15
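
The same value can be read from a program. The following is a minimal sketch, assuming a Linux host that exposes the path quoted above; the helper name `read_int_sysctl` is hypothetical and only illustrates reading a single-integer file under /proc/sys.

```python
from pathlib import Path

# Path taken from the text above; it only exists on Linux.
TCP_RETRIES2_PATH = "/proc/sys/net/ipv4/tcp_retries2"


def read_int_sysctl(path: str = TCP_RETRIES2_PATH) -> int:
    """Read a kernel parameter stored as a single integer in a /proc/sys file."""
    return int(Path(path).read_text().strip())
```

On a machine with the default setting, `read_int_sysctl()` would be expected to return the same number that `cat` prints.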

TCP timeout retransmission uses an exponential backoff policy: the wait before each successive retransmission doubles. You can see this in the red box in the Wireshark analysis diagram. The 3-6-12-24…
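
The doubling can be sketched in a few lines. This is an illustrative model, not the kernel's implementation: the function name `backoff_schedule` is hypothetical, and the 120-second ceiling is an assumption based on Linux's TCP_RTO_MAX.

```python
from typing import List


def backoff_schedule(initial_rto: float, waits: int, rto_max: float = 120.0) -> List[float]:
    """Return the wait before each timeout, doubling every time up to rto_max.

    rto_max defaults to 120 seconds, an assumption based on Linux's TCP_RTO_MAX.
    """
    schedule = []
    rto = initial_rto
    for _ in range(waits):
        schedule.append(min(rto, rto_max))
        rto = min(rto * 2, rto_max)
    return schedule


# Reproduces the progression mentioned in the text.
print(backoff_schedule(3, 4))
```

With an initial value of 0.2 seconds and 16 waits (15 retransmissions plus the final wait), the values sum to roughly 924.6 seconds, which matches the hypothetical timeout the Linux kernel documentation gives for tcp_retries2 = 15.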

tcp_retries2
RTO

Four: fast retransmission simulation

As with timeout retransmission, write a packetdrill script, capture the packets with tcpdump, and analyze them in Wireshark.

--tolerance_usecs=100000
// Initialization
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <...>
+.1 < . 1:1(0) ack 1 win 257
+0 accept(3, ..., ...) = 4
// Write 5000 bytes to the client
+0.1 write(4,...
+.1 < . 1:1(0) ack 1001 win 257 <sack 1:1001,nop,nop>
// Three duplicate ACKs
+0 < . 1:1(0) ack 1001 win 257 <sack 1:1001 2001:3001,nop,nop>
+0 < . 1:1(0) ack 1001 win 257 <sack 1:1001 2001:4001,nop,nop>
+0 < . 1:1(0) ack 1001 win 257 <sack 1:1001 2001:5001,nop,nop>
// Acknowledge everything so that the server does not retransmit again
+.1 < . 1:1(0) ack 5001 win 257
+0 `sleep 1000000`

tcpdump -i any -nn -vv -w /home/sack.pacp port 8080

Five: Fast retransmission analysis

  • 1-3: Three-way handshake
  • 4-8: Transmission of 5000 bytes of data
  • 9-12: Acknowledgments from the data receiver
  • 13: Fast retransmission

The most important part is the receiver's acknowledgments in packets 9 to 12. The data sender sent five data packets: 1-1001, 1001-2001, 2001-3001, 3001-4001 and 4001-5001. The acknowledgments, however, show that only four of them were received. Open packets 10, 11 and 12 to view the details; packet 10 is shown in the figure below. Its ACK number is 1001, but the SACK option carries the sequence ranges of the data that has already been received.

  • 9th: Brother, I have received the data up to sequence number 1001
  • 10th: Brother, the highest in-order sequence number I have received is still 1001, but 2001-3001 has also arrived
  • 11th: Brother, the highest in-order sequence number I have received is still 1001, but 2001-4001 has also arrived
  • 12th: Brother, the highest in-order sequence number I have received is still 1001, but 2001-5001 has also arrived
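
How a sender can work out what is missing from such an acknowledgment can be sketched as follows. This is a simplified illustration, not how any particular TCP stack stores its scoreboard; the function name `missing_ranges` and the half-open range convention are assumptions made for the example.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) sequence range


def missing_ranges(ack: int, sack_blocks: List[Range]) -> List[Range]:
    """Return the holes between the cumulative ACK and the highest SACKed byte."""
    holes = []
    next_expected = ack
    for start, end in sorted(sack_blocks):
        if end <= next_expected:
            continue  # block lies entirely below the cumulative ACK
        if start > next_expected:
            holes.append((next_expected, start))
        next_expected = max(next_expected, end)
    return holes


# Packet 12 from the capture: ack 1001, sack 1:1001 2001:5001
print(missing_ranges(1001, [(1, 1001), (2001, 5001)]))
```

For packet 12 the only hole is 1001-2001, which is exactly the segment the sender retransmits in packet 13.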

The figures come from Master Zhang's Juejin booklet: Timeout retransmission, fast retransmission, and SACK. They are used with Master Zhang's permission, and the booklet is well worth buying if you are interested.

The condition for fast retransmission is this: when the data sender receives three or more duplicate ACKs, it concludes that the packet has been lost and retransmits it immediately, even though the retransmission timer has not yet expired.
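
The duplicate-ACK rule can be sketched as a small counter. This is a simplified model of the classic rule from RFC 5681 (the original ACK followed by three duplicates); the function name is hypothetical, and real stacks, especially with SACK enabled, use more elaborate loss detection.

```python
from typing import Iterable, Optional

DUP_ACK_THRESHOLD = 3  # classic duplicate-ACK threshold from RFC 5681


def fast_retransmit_seq(acks: Iterable[int]) -> Optional[int]:
    """Return the sequence number to retransmit once three duplicate ACKs arrive.

    Returns None if the threshold is never reached.
    """
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                return ack
        else:
            last_ack = ack
            dup_count = 0
    return None


# Packets 9-12 in the capture all carry ACK 1001.
print(fast_retransmit_seq([1001, 1001, 1001, 1001]))
```

Packet 9 is the original ACK and packets 10 to 12 are the three duplicates, so the sketch reports 1001 as the sequence number to resend, matching the retransmission seen in packet 13.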

Six: delayed confirmation

The natural assumption is that the receiver sends an ACK as soon as it receives a data packet. In fact, TCP uses a delayed acknowledgment policy to reduce overhead. After receiving data, it waits briefly to see whether it has data of its own to send back; if so, the ACK is piggybacked on that data. If no data is sent during the wait, the ACK is sent on its own. This is delayed acknowledgment. Delayed acknowledgment does not apply in the following scenarios:

// Source: tcp_input.c
static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible)
{
	struct tcp_sock *tp = tcp_sk(sk);

	    /* More than one full frame received... */
	if (((tp->rcv_nxt - tp->rcv_wup) > tp->ack.rcv_mss
	     /* ... and right edge of window advances far enough.
	      * (tcp_recvmsg() will send ACK otherwise). Or...
	      */
	     && __tcp_select_window(sk) >= tp->rcv_wnd) ||
	    /* We ACK each frame or... */
	    tcp_in_quickack_mode(tp) ||
	    /* We have out of order data. */
	    (ofo_possible &&
	     skb_peek(&tp->out_of_order_queue))) {
		/* Then ack it now */
		tcp_send_ack(sk);
	} else {
		/* Else, send delayed ack. */
		tcp_send_delayed_ack(sk);
	}
}

  • More than one full frame has been received and the receive window needs to be updated
  • The connection is in quickack mode (tcp_in_quickack_mode)
  • There is out-of-order data
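
The three conditions above can be restated as a small decision function. This is a simplified model of the kernel check for illustration only; the class and field names are hypothetical stand-ins for the kernel fields named in the comments.

```python
from dataclasses import dataclass


@dataclass
class ReceiverState:
    # Illustrative stand-ins for the kernel fields, not the real struct.
    unacked_bytes: int         # rcv_nxt - rcv_wup
    rcv_mss: int               # ack.rcv_mss
    new_window: int            # result of __tcp_select_window()
    rcv_wnd: int               # currently advertised window
    quickack: bool             # tcp_in_quickack_mode()
    out_of_order_queued: bool  # out-of-order queue is non-empty


def should_ack_now(s: ReceiverState, ofo_possible: bool = True) -> bool:
    """Return True to ACK immediately, False to fall back to a delayed ACK."""
    full_frame_and_window = (
        s.unacked_bytes > s.rcv_mss and s.new_window >= s.rcv_wnd
    )
    out_of_order = ofo_possible and s.out_of_order_queued
    return full_frame_and_window or s.quickack or out_of_order
```

A small request that fits in one segment, with no quickack mode and no out-of-order data, makes the function return False, which is the delayed-ACK path.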

The following packetdrill script simulates a delayed acknowledgment:

--tolerance_usecs=100000
0.000 socket(...
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.000 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,wscale 7>
0.000 > S. 0:0(0) ack 1 <...>
0.000 < . 1:1(0) ack 1 win 257
0.000 accept(3, ..., ...)
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
// POST / HTTP/1.1
+0 < P. 1:11(10) ack 1 win 257
// 1314}
+0 < P. 11:26(15) ack 1 win 257
// HTTP response {}
+0 write(4,...
P. 26:36(10) ack 101 win 257
// The server returns
+0 `sleep 1000000`

With Wireshark set to display the time elapsed since the previous captured packet, you can see that the acknowledgment is delayed by 40 ms.