The network layer



The TCP protocol carries the data of upper-layer application protocols such as HTTP and HTTPS.

Working principle

The connection

A computer can have many TCP connections open at the same time. TCP keeps them apart by port number.

[Source IP address, source port number] <-> [Destination IP address, destination port number]

These four values uniquely define a connection. Two different TCP connections cannot share all four values, although different connections may share some of them.

Examples are as follows:

Connection  Source IP address (client)  Source port  Destination IP address (server)  Destination port
A           172.16.110.1                8081         192.168.77.1                     9090
B           172.16.110.2                8077         192.168.77.1                     9040
C           172.16.110.2                8079         192.168.77.2                     9050
D           172.16.110.3                8044         192.168.77.2                     9050
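
The uniqueness rule can be checked against the four example connections with a small sketch. The `Connection` tuple and the helper names `all_unique` and `shared_fields` are illustrative choices, not part of any TCP API.

```python
from collections import namedtuple

# Illustrative record for the four values that identify a TCP connection.
Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")

# The four example connections from the table above.
connections = {
    "A": Connection("172.16.110.1", 8081, "192.168.77.1", 9090),
    "B": Connection("172.16.110.2", 8077, "192.168.77.1", 9040),
    "C": Connection("172.16.110.2", 8079, "192.168.77.2", 9050),
    "D": Connection("172.16.110.3", 8044, "192.168.77.2", 9050),
}


def all_unique(conns):
    """True when no two connections share all four values."""
    return len(set(conns.values())) == len(conns)


def shared_fields(a, b):
    """Names of the fields on which two connections agree."""
    return [f for f in Connection._fields if getattr(a, f) == getattr(b, f)]


print(all_unique(connections))
print(shared_fields(connections["B"], connections["C"]))  # same client address
print(shared_fields(connections["C"], connections["D"]))  # same server address and port
```

B and C share a source address, and C and D share a destination address and port, yet every connection remains distinct because at least one of the four values differs.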

Data blocks

When HTTP or HTTPS wants to transmit a message, it streams the content of the message, in order, through an open TCP connection.

After receiving the data stream, TCP cuts it into small data blocks called segments, encapsulates the segments in IP packets, and sends them over the Internet.



Each TCP segment is carried by an IP packet and sent from one IP address to another. Each IP packet contains:

  • An IP packet header (usually 20 bytes)

    • Contains the source address, destination address, length, and some other flags
  • A TCP segment header (usually 20 bytes)

    • Contains the TCP port numbers, TCP control flags, and numeric values used for data ordering and integrity checking
  • A TCP data block (0 or more bytes)
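
The segmentation step and the header cost can be sketched as follows. This is a simplified model, not how a real TCP stack is implemented: the 1460-byte payload limit is an assumption (a 1500-byte packet minus the two 20-byte headers), and the function names are illustrative.

```python
IP_HEADER_BYTES = 20   # usual IP header size, as listed above
TCP_HEADER_BYTES = 20  # usual TCP header size, as listed above


def segment(data: bytes, max_payload: int = 1460):
    """Cut a byte stream into data blocks of at most max_payload bytes.

    The default of 1460 is an assumed value for illustration.
    """
    return [data[i:i + max_payload] for i in range(0, len(data), max_payload)]


def bytes_on_wire(data: bytes, max_payload: int = 1460) -> int:
    """Payload plus one IP header and one TCP header per segment."""
    blocks = segment(data, max_payload)
    return len(data) + len(blocks) * (IP_HEADER_BYTES + TCP_HEADER_BYTES)


message = b"x" * 3000
print([len(block) for block in segment(message)])
print(bytes_on_wire(message))
```

A 3000-byte message needs three segments here, so 120 bytes of headers travel alongside the 3000 bytes of data.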

Interaction

The interaction process

State change

Establishing a connection (three-way handshake)

Summary of common questions

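  • Why a three-way handshake instead of two? TCP is a connection-oriented, full-duplex protocol, so each side must announce its own initial sequence number and have it acknowledged before the data transmission stage begins. The server combines its acknowledgment with its own SYN in a single packet, which gives three steps: SYN -> SYN+ACK -> ACK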

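  • What happens if the server gets no ACK after sending its SYN+ACK (the SYN_RCVD state)? The Linux parameter /proc/sys/net/ipv4/tcp_synack_retries controls how many times the SYN+ACK is retransmitted. The default value is 5. Following the binary exponential backoff algorithm, the successive waits are 1s, 2s, 4s, 8s, 16s and 32s. If no ACK has arrived after roughly one minute (63 seconds in total), the connection is dropped. Client-side SYN retransmissions in the SYN_SENT state are governed separately by /proc/sys/net/ipv4/tcp_syn_retries.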

  • How is the SYN queue (the half-open connection queue) controlled? On Linux, through /proc/sys/net/ipv4/tcp_max_syn_backlog, which can be raised to values such as 65536 or 16000. Besides the half-open queue, the size of the accept (fully established) queue in /proc/sys/net/core/somaxconn and the corresponding parameters of a reverse proxy such as Nginx may also need adjusting. This is an end-to-end tuning exercise, not a single-parameter change

  • How does a SYN flood attack work? It exploits a weakness in the TCP handshake: the attacker sends a large number of forged SYN connection requests and never responds to the SYN+ACK returned by the server. The server's half-open connection queue fills up and it can no longer accept normal connection requests. On a server with a weak TCP stack, storing and processing that many half-open connections can also exhaust resources (CPU fully loaded or memory running low) and ultimately bring the system or service down

  • How can a SYN flood attack be defended against?

    • Shorten the SYN timeout. The effect of a SYN flood depends on the number of half-open connections the server is holding, which equals SYN attack frequency x SYN timeout. Shortening the time between receiving a SYN packet and deciding that it is invalid and discarding the connection, for example to less than 20 seconds, reduces that number. However, a SYN timeout that is set too low may affect legitimate clients
    • Enable SYN cookies. Instead of storing a half-open entry for every SYN, the server encodes the connection details in the initial sequence number of its SYN+ACK (the cookie) and allocates state only when an ACK carrying a valid cookie comes back. Forged SYNs that never complete the handshake therefore do not occupy the half-open queue. The trade-off is that some TCP options carried in the SYN may be lost, which can also affect normal users. The Linux parameter /proc/sys/net/ipv4/tcp_syncookies takes the following values:

    tcp_syncookies value   Effect
    0                      SYN cookies are disabled
    1                      SYN cookies are used only when the SYN half-open queue overflows
    2                      SYN cookies are used unconditionally
    • SYN flood attacks target either an IP address or a domain name. Because a domain name must first be resolved to an IP address, the attack is in essence always IP-based. Defences built on this fact are passive, ad hoc measures that only protect for as long as the attacker's cached DNS resolution lasts
  • What happens when the accept queue (the fully established connection queue) is full? /proc/sys/net/ipv4/tcp_abort_on_overflow controls the behaviour and takes two values:

    tcp_abort_on_overflow value   Effect
    0                             If the accept queue is full, the server discards the ACK sent by the client
    1                             If the accept queue is full, the server sends an RST packet to the client, aborting the handshake and the connection
  • How do I view the length of a server process's accept queue?

    ss -lnt

  • Recv-Q: the current size of the accept queue, that is, the number of TCP connections that have completed the three-way handshake and are waiting for the server to call accept()

  • Send-Q: the maximum length of the accept queue. For a listening socket, a Send-Q of 128 means that the TCP service listening on that port has an accept queue of at most 128 connections
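
The retry timing quoted for tcp_synack_retries earlier in this list can be checked with a few lines of arithmetic. This is only a sketch of the doubling schedule, assuming a 1-second initial timeout; the function name is illustrative and the real kernel timers are more involved.

```python
def synack_retry_schedule(retries: int = 5, initial_timeout: int = 1):
    """Return the successive waits (in seconds) and their total.

    One wait follows the original SYN+ACK and one follows each of the
    `retries` retransmissions; every wait is double the previous one.
    """
    waits = [initial_timeout * 2 ** i for i in range(retries + 1)]
    return waits, sum(waits)


waits, total = synack_retry_schedule()
print(waits)  # waits between attempts, in seconds
print(total)  # seconds until the half-open connection is given up
```

With the default of 5 retries the waits are 1, 2, 4, 8, 16 and 32 seconds, 63 seconds in total, which is where the "about one minute" figure comes from.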

Closing a connection (four-way wave)

  • Why four waves? TCP is a connection-oriented, full-duplex protocol, so each direction is closed separately. Whichever end initiates the close must wait for its request to be acknowledged, and the passively closed end has to go through the same confirmation for its own direction. Data may still be in transit when the close begins, and the amount is unknown, so closing safely and gracefully needs a time window and confirmation from both sides. That is why two FIN -> ACK exchanges are required.

  • A TCP connection can be closed in two ways: with an RST packet or with a FIN packet. If the process exits abnormally, the kernel can send an RST to shut the connection down, an abrupt close that skips the four-way wave. A process initiates a FIN by calling the close or shutdown functions. With RST, data still being exchanged is easily lost

  • Only the side that initiates the close enters the FIN_WAIT states

  • What happens if no ACK is received after a FIN packet is sent? The Linux parameter /proc/sys/net/ipv4/tcp_orphan_retries controls the number of FIN retransmissions. The default value is 0, which is treated as 8 retries.

  • Under a malicious attack, FIN packets may never get sent. Two features of TCP make this possible, and one kernel parameter limits the damage:

    • TCP guarantees that packets are sent in order, and FIN packets are no exception. If there is still data in the send buffer, the FIN cannot be sent ahead of it.
    • TCP has flow control. When the receiver's receive window is 0, the sender can no longer send data. An attacker downloading a large file can therefore set the receive window to 0, which prevents the FIN from being sent and keeps the connection in the FIN_WAIT_1 state.
    • The Linux parameter /proc/sys/net/ipv4/tcp_max_orphans limits the number of orphaned connections that have issued a FIN and entered FIN_WAIT_1. Once this value is exceeded, further such connections are closed directly with an RST packet.
  • The Linux parameter /proc/sys/net/ipv4/tcp_fin_timeout controls how long a connection may stay in FIN_WAIT_2

  • The TIME_WAIT state is particularly important for two reasons:

    • It keeps out packets from old connections, so that delayed packets with the same four-tuple are not accepted by a new connection
    • It makes sure the connection is closed correctly. The final ACK must reach the passively closed side so that this side can finish closing properly
  • What problems arise if TIME_WAIT is skipped or too short?

    If TIME_WAIT is too short, the end that initiated the close moves to the CLOSED state almost immediately, and the ACK it sent to the other end may be lost before the passively closed end receives it. The passively closed end then stays in LAST_ACK and cannot reach CLOSED. The next time a new TCP connection is attempted with a SYN, the side that failed to close, still in its previous state, replies directly with an RST, so the new connection attempt is aborted.

  • Why 2MSL? It allows for at least one lost packet. For example, if the ACK is lost within the first MSL, the passive end's retransmitted FIN arrives within the second MSL, and the connection, still in TIME_WAIT, can respond to it. This is a reasonable allowance for poor network conditions.

  • Tuning TIME_WAIT: the Linux parameter /proc/sys/net/ipv4/tcp_max_tw_buckets controls the maximum number of connections in TIME_WAIT. Once that threshold is exceeded, further connections are closed immediately instead of waiting in TIME_WAIT.
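
The states named in this section can be laid out as two transition tables, one per side of the close. This is only a summary of the normal four-way wave for reference; simultaneous close and RST handling are left out, and the table and function names are illustrative.

```python
# (current state, event, next state) for the side that initiates the close.
ACTIVE_CLOSE = [
    ("ESTABLISHED", "send FIN", "FIN_WAIT_1"),
    ("FIN_WAIT_1", "receive ACK", "FIN_WAIT_2"),
    ("FIN_WAIT_2", "receive FIN, send ACK", "TIME_WAIT"),
    ("TIME_WAIT", "2MSL timer expires", "CLOSED"),
]

# The same close as seen from the passively closed side.
PASSIVE_CLOSE = [
    ("ESTABLISHED", "receive FIN, send ACK", "CLOSE_WAIT"),
    ("CLOSE_WAIT", "send FIN", "LAST_ACK"),
    ("LAST_ACK", "receive ACK", "CLOSED"),
]


def states(path):
    """List the states visited along a transition table, in order."""
    return [path[0][0]] + [next_state for _, _, next_state in path]


print(" -> ".join(states(ACTIVE_CLOSE)))
print(" -> ".join(states(PASSIVE_CLOSE)))
```

Only the active side passes through FIN_WAIT_1, FIN_WAIT_2 and TIME_WAIT, and the passive side leaves LAST_ACK only when the final ACK arrives, which is the ACK that TIME_WAIT is there to protect.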

Performance issues



The entire HTTP interaction process is as follows:

  • DNS query: the HTTP client resolves the IP address and port for the requested URI. HTTP clients generally keep frequently or recently used site addresses in a DNS cache
  • Connection: the TCP connection handshake is complex and time-consuming
  • Request: the request message is sent
  • Processing: the logic that handles the request also takes time
  • Response: the response message is sent
  • Close: the TCP connection is closed

Performance issues are concentrated in the following areas:

  • TCP connection handshake delay
  • TCP deferred acknowledgment algorithm for piggyback acknowledgment
  • TCP Slow start congestion control
  • Data aggregation: Nagle algorithm and TCP_NODELAY
  • TIME_WAIT accumulation and port exhaustion

TCP connection handshake delay



(1)(2)(3) are the three steps of the handshake that TCP uses to establish a connection

  • (1) SYN: the client sends a small TCP packet (40-60 bytes) to the server. This packet has a special SYN flag set to indicate that it is a connection request.
  • (2) SYN + ACK: after the server accepts the connection, it calculates the connection parameters and returns a TCP packet to the client. In this packet, the SYN and ACK flags are set to indicate that the request has been accepted
  • (3) ACK: the client sends an acknowledgment to the server. The ACK flag in this TCP packet is set to indicate that the connection has been established. This packet is allowed to carry HTTP data, which saves one round trip

Optimization solution: Reuse TCP connections

(3)(4) are the HTTP data transmission

  • (4) The server responds to the client request and returns data
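
Reusing a connection means paying for the handshake once and sending several requests over the same socket. The sketch below shows this with Python's standard library against a throwaway local server; the `Handler` class and the function name are illustrative, and a real deployment would normally rely on the keep-alive or connection pooling support of its HTTP client.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    """Tiny illustrative server that keeps connections open (HTTP/1.1)."""

    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_twice_on_one_connection():
    """Send two GET requests over one TCP connection.

    Returns the client-side port used for each request.
    """
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for _ in range(2):
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before reusing the socket
            ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


ports = fetch_twice_on_one_connection()
print(ports[0] == ports[1])
```

Both requests leave from the same client port, which shows that the second request did not trigger a new three-way handshake.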

TCP deferred acknowledgment algorithm for piggyback acknowledgment

  • Packet loss. Because networks are uncertain and unreliable, TCP packets can be lost in transit through router overload or network interruption.
  • Retransmission. Each TCP packet carries a sequence number and a data integrity checksum. The receiver sends an acknowledgment to the sender when it receives an intact packet. If the sender does not receive the acknowledgment within the specified time window, it considers the packet damaged or lost and resends it.
  • Piggybacked acknowledgment and the delayed acknowledgment algorithm. Because an acknowledgment is very small, TCP lets it piggyback on another outgoing TCP packet to save network resources. The delayed acknowledgment algorithm holds the acknowledgment in a buffer and waits for an outgoing packet to carry it. If no outgoing packet appears within a specific time window (typically 100 to 200 milliseconds), the acknowledgment is sent on its own. In the worst case the acknowledgment waits for the whole time window, so delays are possible.

Optimization solution: Adjust the TCP delay acknowledgement algorithm parameters based on the actual situation

TCP Slow start congestion control

To prevent sudden overload and congestion of the network, TCP limits the maximum speed of a connection at the start of a transfer and raises it gradually as time goes by. This is called TCP slow start.



TCP slow start limits the number of TCP packets that an endpoint can have in transit at any time. For example, suppose a large piece of data needs eight TCP packets to send. At first only one packet may be sent. Once its acknowledgment returns, two packets may be sent; once those are acknowledged, four may be sent, and so on. The send window grows larger and larger, which is known as opening the congestion window.

Because of this congestion control feature, a new connection transmits more slowly than an already tuned TCP connection that has exchanged a certain amount of data.
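
The doubling described above can be simulated in a few lines. This is a deliberately simplified model that assumes every packet is acknowledged and the window doubles each round trip; real stacks start from a larger initial window and stop doubling at a threshold. The function name is illustrative.

```python
def slow_start_rounds(total_packets: int, initial_window: int = 1):
    """Packets sent in each round trip while the window doubles."""
    rounds = []
    window = initial_window
    sent = 0
    while sent < total_packets:
        burst = min(window, total_packets - sent)  # never send more than is left
        rounds.append(burst)
        sent += burst
        window *= 2  # every acknowledged round doubles the window
    return rounds


print(slow_start_rounds(8))                     # new connection
print(slow_start_rounds(8, initial_window=8))   # window already opened
```

The eight-packet example takes four round trips on a new connection (1, 2, 4, then the last packet), while a connection whose window is already open could send all eight at once.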

Optimization: Reuse TCP connections

Data aggregation: Nagle algorithm and TCP_NODELAY



As we can see, data is always sent inside a TCP packet. For the upper-layer user only the data block is useful; the IP packet header and the TCP segment header exist to support the TCP communication itself. Each packet should therefore carry as much data as possible. Otherwise, sending the same total amount of data in many small blocks increases the number of TCP exchanges, and most of the network traffic is spent on low-level encapsulation overhead.



The Nagle algorithm encourages sending full-size TCP packets (the maximum packet size is about 1500 bytes). Data that does not fill a packet is buffered and sent only once all previously sent packets have been acknowledged or enough data has accumulated, so that the payload space of each TCP packet is used to the full and the number of exchanges is reduced.

The Nagle algorithm has performance problems of its own. When an HTTP message does not fill a packet, sending is delayed, and when this wait overlaps with the time window of the delayed acknowledgment algorithm described above, the delay becomes more severe.

Optimization solution: set TCP_NODELAY to disable the Nagle algorithm and improve performance. In that case, make sure the application writes large blocks of data to TCP so that the transfer stays as efficient as possible.
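
In Python, TCP_NODELAY is set per socket with `setsockopt`. A minimal sketch follows; the helper names are illustrative, and many HTTP clients and servers expose their own configuration switch for this option instead.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Read the option back; a non-zero value means TCP_NODELAY is on."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


sock = make_nodelay_socket()
print(nagle_disabled(sock))
sock.close()
```

With the option set, small writes go out immediately, so it is up to the application to batch its data into large writes.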

TIME_WAIT accumulation and port exhaustion

When a TCP endpoint closes a TCP connection, the connection enters the TIME_WAIT state. The endpoint keeps a record in memory of the IP address and port of the recently closed connection for a certain time window, usually twice the maximum segment lifetime (2MSL, generally 2 minutes), to ensure that no new connection with the same IP address and port number is created during that time. The server has a limited number of available ports, typically about 60,000. If a port cannot be reused for 120 seconds (that is, 2MSL), then in the extreme case the maximum connection rate during a connection flood is 60,000 / 120 = 500 per second.
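
The port exhaustion limit is simple arithmetic and can be wrapped in a helper to try other values. The defaults are the figures quoted above, and the function name is illustrative.

```python
def max_connection_rate(available_ports: int = 60_000,
                        time_wait_seconds: int = 120) -> float:
    """Upper bound on new connections per second when every closed
    connection holds its port for the whole TIME_WAIT period."""
    return available_ports / time_wait_seconds


print(max_connection_rate())                      # 2MSL = 120 seconds
print(max_connection_rate(time_wait_seconds=60))  # a shorter TIME_WAIT doubles the bound
```

Halving the TIME_WAIT period doubles the bound, which is why TIME_WAIT tuning comes up whenever a server opens and closes connections at a high rate.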

Optimization solution: set tcp_max_tw_buckets
