The Linux network protocol stack is based on the TCP/IP model, which consists of four layers: the application layer, the transport layer, the network layer, and the network interface layer. Each layer has its own responsibility.

When an application sends a packet, it usually goes through the socket interface: a system call copies the data from the application into the socket layer in the kernel, the network protocol stack then processes it layer by layer, and it is finally handed to the network adapter for transmission.

Received packets are also processed layer by layer, but in the opposite direction: from the bottom of the stack up to the application.

Network speed is closely tied to user experience, so how do we measure the performance of a Linux network, and how do we analyze network problems?

That is what this article covers.


What are the performance indicators?

Four indicators are generally used to measure network performance: bandwidth, latency, throughput, and packets per second (PPS). They mean the following:

  • Bandwidth is the maximum transmission rate of the link, in b/s (bits per second). The higher the bandwidth, the greater the transmission capacity.
  • Latency is the time between sending a request packet and receiving the response from the peer. Its exact meaning depends on the scenario: for example, the time needed to establish a TCP connection, or the round-trip time of a packet.
  • Throughput is the amount of data successfully transmitted per unit of time, in b/s (bits per second) or B/s (bytes per second). Throughput is limited by bandwidth: the larger the bandwidth, the higher the upper limit on throughput.
  • PPS stands for Packets Per Second. It expresses the transmission rate in packets rather than bytes and is typically used to evaluate the forwarding capability of a system.

Of course, in addition to these four basic indicators, there are some other commonly used performance indicators, such as:

  • Network availability indicates whether the network can communicate normally.
  • Number of concurrent connections is the number of TCP connections.
  • Packet loss rate is the ratio of lost packets to packets sent.
  • Retransmission rate is the proportion of network packets that are retransmitted.

You might ask, how do you measure these performance metrics? Don’t worry, keep reading.


How do I view the network configuration?

To see the configuration and status of a network port, we can use the ifconfig or ip command.

ifconfig belongs to the net-tools package and ip belongs to iproute2. As far as I know, net-tools is no longer maintained, whereas iproute2 is still actively developed, so the ip tool is the recommended choice.

Taking the network port eth0 as an example, both commands report its configuration and status.

Although the two commands format their output differently, the content is basically the same: the IP address, subnet mask, MAC address, gateway address, MTU size, port status, and statistics on packets sent and received. The items described below all bear on network performance.

First, the connection status of the network port. If RUNNING appears in the ifconfig output, or LOWER_UP appears in the ip output, the physical network is connected. If it does not appear, the network port is not connected.

Second, the MTU size. The default value is 1500 bytes, which limits the size of network packets. If the IP layer has a datagram to send that is larger than the link-layer MTU, the IP layer splits the datagram into several fragments so that each one is no larger than the MTU. Because the link-layer MTU can differ from network to network, you may need to increase or decrease the MTU value.
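To make the fragmentation rule concrete, here is a minimal Python sketch that works out how an IPv4 payload would be split for a given MTU. It assumes a 20-byte IP header with no options, and the function name fragment_sizes is made up for this illustration; it is not how the kernel implements fragmentation.

```python
def fragment_sizes(payload_len, mtu=1500, ip_header_len=20):
    """Return the payload size carried by each IPv4 fragment.

    Assumption: every fragment has a plain 20-byte IP header (no options).
    """
    # All fragments except the last must carry a multiple of 8 bytes,
    # because the IPv4 fragment offset field counts in 8-byte units.
    max_payload = (mtu - ip_header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU is too small to carry any payload")

    sizes = []
    remaining = payload_len
    while remaining > max_payload:
        sizes.append(max_payload)
        remaining -= max_payload
    sizes.append(remaining)  # the last (or only) fragment
    return sizes


# A 4000-byte payload over a 1500-byte MTU needs three fragments.
print(fragment_sizes(4000))  # [1480, 1480, 1040]
```

A payload of 1480 bytes or less fits in a single packet with this MTU, so no fragmentation takes place.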

Third, the IP address, subnet mask, MAC address, and gateway address of the network port. This information must be configured correctly for the network to work properly.

Fourth, the packet statistics. These usually include the number of bytes and packets sent and received, along with errors and drops. If the errors, dropped, overruns, carrier, or collisions counters in the TX (transmit) and RX (receive) sections are not 0, the network has had problems sending or receiving. These counters mean the following:

  • errors is the number of packets with errors, such as checksum errors and frame synchronization errors.
  • dropped is the number of discarded packets: packets that reached the Ring Buffer (which lives in kernel memory, more specifically in the NIC driver) but were then lost because the system ran short of memory.
  • overruns is the number of packets lost because traffic was received or sent faster than it could be processed: packets piled up in the Ring Buffer until it overflowed.
  • carrier is the number of packets with carrier errors, caused for example by a duplex mode mismatch or a problem with the physical cable.
  • collisions is the number of packets involved in collisions.
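The same counters are exposed by the kernel in /proc/net/dev, so a quick health check can be scripted. The sketch below parses text in that format and lists every non-zero error counter. The sample numbers are invented for illustration, the helper names are hypothetical, and the mapping of the fifo column to what ifconfig calls overruns is my understanding rather than something stated above.

```python
# Column order of /proc/net/dev (8 receive columns, then 8 transmit columns).
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed"]
ERROR_FIELDS = {"errs", "drop", "fifo", "frame", "colls", "carrier"}

# Made-up sample in /proc/net/dev format; on a real host use open("/proc/net/dev").read().
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0:  500000    4000    0   12    3     0          0         0   250000    2000    0    0    0     0       0          0
"""


def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {iface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        values = data.split()
        if len(values) != 16:
            continue  # unexpected layout, skip rather than guess
        numbers = [int(v) for v in values]
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, numbers[:8])),
            "tx": dict(zip(TX_FIELDS, numbers[8:])),
        }
    return stats


def nonzero_errors(stats):
    """Return (iface, direction, counter, value) for every non-zero error counter."""
    problems = []
    for iface, directions in stats.items():
        for direction, counters in directions.items():
            for field, value in counters.items():
                if field in ERROR_FIELDS and value:
                    problems.append((iface, direction, field, value))
    return problems


print(nonzero_errors(parse_net_dev(SAMPLE)))
```

In the sample, eth0 reports receive drops and receive FIFO errors, which is the kind of result that would prompt a closer look at the Ring Buffer.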

The ifconfig and ip commands only show the configuration of the network port and its send and receive statistics; they do not show what is happening inside the protocol stack.


How do I view socket information?

We can use netstat or ss to view information about sockets, network protocol stacks, network ports, and routing tables.

Although netstat and ss display similar information, avoid netstat in a production environment: it performs poorly, and running it frequently on a busy system makes the load worse. The ss command performs much better and is the recommended choice.

The output of the two commands is much the same. Both show the socket state (State), the receive queue (Recv-Q), the send queue (Send-Q), the local address (Local Address), the remote address (Foreign Address), and the process PID and process name (PID/Program name).

Recv-Q and Send-Q deserve special attention, because their meaning depends on the socket state.

When the socket is in the Established state:

  • Recv-Q is the number of bytes in the socket buffer that have not yet been read by the application.
  • Send-Q is the number of bytes in the socket buffer that have not yet been acknowledged by the remote host.

When the socket is in the Listen state:

  • Recv-Q is the current length of the full connection queue.
  • Send-Q is the maximum length of the full connection queue.
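For listening sockets, comparing the two columns shows how close the full connection queue is to its limit. The following sketch reads lines laid out like the listening-socket listing of ss (State, Recv-Q, Send-Q, Local Address:Port, ...) and flags queues that are nearly full. The sample lines, the 80% threshold, and the function name are all assumptions made for this illustration.

```python
# Invented sample in the column layout of a listening-socket listing.
SAMPLE_SS = """\
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       128     0.0.0.0:22          0.0.0.0:*
LISTEN  120     128     0.0.0.0:8080        0.0.0.0:*
"""


def listen_queue_report(ss_output, threshold=0.8):
    """Return (local_address, recv_q, send_q, nearly_full) for each LISTEN line.

    For a listening socket, Recv-Q is the current length of the full
    connection queue and Send-Q is its maximum length.
    """
    report = []
    for line in ss_output.splitlines():
        cols = line.split()
        if len(cols) < 4 or cols[0] != "LISTEN":
            continue  # skip the header and non-listening sockets
        recv_q, send_q = int(cols[1]), int(cols[2])
        usage = recv_q / send_q if send_q else 0.0
        report.append((cols[3], recv_q, send_q, usage >= threshold))
    return report


for entry in listen_queue_report(SAMPLE_SS):
    print(entry)
```

A listener whose Recv-Q stays close to its Send-Q is not calling accept() fast enough for the rate of incoming connections.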

During the TCP three-way handshake, when the server receives a SYN packet from the client, the kernel places the connection in the half-connection queue and replies with a SYN+ACK packet. The client then returns an ACK packet. When the server receives this ACK, the third step of the handshake, the kernel removes the connection from the half-connection queue, creates a new full connection, and adds it to the full connection queue, where it waits for the process to fetch it by calling accept().

In other words, the full connection queue holds connections that have completed the TCP three-way handshake but have not yet been taken out by the accept() system call.
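This behavior can be observed from user space. In the small loopback experiment below, the client connects and sends data before the server ever calls accept(), which works because the kernel finishes the handshake on its own and parks the connection in the full connection queue. The backlog of 8, the timeouts, and the function name are arbitrary choices for this sketch.

```python
import socket


def handshake_completes_before_accept():
    """Connect and send on loopback before the server calls accept()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(2)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(8)               # backlog: requested size of the full connection queue
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2)
    # connect() returns once the kernel has completed the three-way handshake;
    # the server process has not called accept() yet.
    client.connect(("127.0.0.1", port))
    client.sendall(b"hello")

    # Only now is the connection taken out of the full connection queue.
    conn, _peer = server.accept()
    conn.settimeout(2)
    data = conn.recv(5)

    conn.close()
    client.close()
    server.close()
    return data


print(handshake_completes_before_accept())
```

While the connection sits in the queue, a tool such as ss would count it in the Recv-Q of the listening socket.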

For protocol stack statistics, you can also use netstat or ss (for example, netstat -s and ss -s).

The ss command prints fewer statistics than netstat: it shows only a brief summary, such as the number of established (estab), closed, and orphaned sockets.

netstat gives more detailed protocol stack information. For TCP, for example, it shows the number of active connection openings, passive connection openings, failed connection attempts, segments sent out, and segments received.
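The TCP counters that netstat reports can also be read from /proc/net/snmp, which makes it easy to derive the retransmission rate mentioned earlier. The sketch below parses the Tcp lines of that file and divides retransmitted segments by segments sent. The sample values are invented, the function names are hypothetical, and dividing by OutSegs is one common way to define the rate, not the only one.

```python
# Invented sample of the two "Tcp:" lines found in /proc/net/snmp
# (a header line with field names, then a line with the values).
SAMPLE_SNMP = """\
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts InCsumErrors
Tcp: 1 200 120000 -1 150 300 5 2 12 100000 80000 400 0 30 0
"""


def parse_snmp_tcp(text):
    """Return the Tcp counters of /proc/net/snmp-style text as a dict."""
    rows = [line.split() for line in text.splitlines() if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("expected a Tcp header line and a Tcp value line")
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def retransmission_rate(tcp):
    """Retransmitted segments as a fraction of all segments sent."""
    if tcp["OutSegs"] == 0:
        return 0.0
    return tcp["RetransSegs"] / tcp["OutSegs"]


tcp = parse_snmp_tcp(SAMPLE_SNMP)
print("active openings:", tcp["ActiveOpens"])
print("passive openings:", tcp["PassiveOpens"])
print("retransmission rate: {:.2%}".format(retransmission_rate(tcp)))
```

These counters are cumulative since boot, so for a current rate you would sample the file twice and compare the differences.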


How do I check network throughput and PPS?

You can use the sar command to view the current network throughput and PPS. Adding the -n option makes sar report network statistics, for example:

  • sar -n DEV displays statistics for network ports.
  • sar -n EDEV displays statistics about network errors.
  • sar -n TCP displays TCP statistics.

In the network port statistics reported by sar, the main fields mean the following:

  • rxpck/s and txpck/s are the numbers of packets received and sent per second, that is, the PPS.
  • rxkB/s and txkB/s are the receive and send throughput, in kB/s.
  • rxcmp/s and txcmp/s are the numbers of compressed packets received and sent per second.
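These per-second figures are just differences between two readings of cumulative counters divided by the sampling interval. The sketch below shows the arithmetic on two hand-written counter snapshots. The dictionary keys and the function name are made up for the example, and it assumes 1 kB means 1024 bytes.

```python
def interface_rates(before, after, interval_s):
    """Compute per-second rates from two snapshots of cumulative counters.

    Each snapshot is a dict with rx_packets, tx_packets, rx_bytes, tx_bytes.
    Assumption: 1 kB = 1024 bytes.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    delta = {key: after[key] - before[key] for key in before}
    return {
        "rxpck/s": delta["rx_packets"] / interval_s,
        "txpck/s": delta["tx_packets"] / interval_s,
        "rxkB/s": delta["rx_bytes"] / 1024 / interval_s,
        "txkB/s": delta["tx_bytes"] / 1024 / interval_s,
    }


# Two invented snapshots taken 2 seconds apart.
t0 = {"rx_packets": 10_000, "tx_packets": 5_000, "rx_bytes": 1_000_000, "tx_bytes": 400_000}
t1 = {"rx_packets": 13_000, "tx_packets": 6_000, "rx_bytes": 3_048_000, "tx_bytes": 912_000}

print(interface_rates(t0, t1, 2))
# {'rxpck/s': 1500.0, 'txpck/s': 500.0, 'rxkB/s': 1000.0, 'txkB/s': 250.0}
```

On a real system the snapshots could come from the packet and byte counters of a network port, read one interval apart.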

For bandwidth, you can use the ethtool command. The result is usually in Gb/s or Mb/s; note that the lowercase b stands for bits, not bytes. The gigabit and 10-gigabit network cards we usually talk about are also rated in bits. As you can see, eth0 is a gigabit NIC:

$ ethtool eth0 | grep Speed
  Speed: 1000Mb/s
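Because the speed is given in bits, it has to be divided by 8 before it can be compared with a throughput measured in bytes. The following sketch extracts the number from a Speed line like the one above and converts it. The helper names are made up for the example, and it assumes decimal units (1 Mb = 1,000,000 bits).

```python
import re


def parse_speed_mbps(line):
    """Extract the link speed in Mb/s from a line such as 'Speed: 1000Mb/s'."""
    match = re.search(r"Speed:\s*(\d+)\s*Mb/s", line)
    return int(match.group(1)) if match else None


def max_bytes_per_second(speed_mbps):
    """Upper bound on throughput in bytes per second (8 bits per byte)."""
    return speed_mbps * 1_000_000 / 8


def utilization(throughput_bytes_per_s, speed_mbps):
    """Fraction of the link bandwidth used by a measured throughput."""
    return throughput_bytes_per_s / max_bytes_per_second(speed_mbps)


speed = parse_speed_mbps("  Speed: 1000Mb/s")
print(speed)                            # 1000
print(max_bytes_per_second(speed))      # 125000000.0, i.e. 125 MB/s
print(utilization(62_500_000, speed))   # 0.5
```

A gigabit NIC can therefore carry at most about 125 MB/s, before protocol overhead is taken into account.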

How about connectivity and latency?

To test the connectivity and latency between the local host and the remote host, you usually use the ping command, which is ICMP based and works at the network layer.

For example, you can ping 192.168.12.20 to test connectivity and latency from the local host to that IP address.

The output mainly shows icmp_seq (the ICMP sequence number), ttl (time to live, that is, the remaining hop count), and time (the round-trip delay), followed by a summary of the test. If no packets were lost on the network, the packet loss percentage is 0.
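The summary is simple arithmetic over the individual replies. As a rough sketch, the function below computes the packet loss percentage and the minimum, average, and maximum round-trip time from the number of requests sent and the list of reply times. The function name and the sample values are invented for this illustration.

```python
def summarize_ping(sent, rtts_ms):
    """Summarize a ping run: loss percentage and min/avg/max round-trip time.

    sent    -- number of echo requests sent
    rtts_ms -- round-trip times (ms) of the replies that actually arrived
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    received = len(rtts_ms)
    summary = {
        "sent": sent,
        "received": received,
        "loss_pct": (sent - received) / sent * 100,
        "min_ms": None,
        "avg_ms": None,
        "max_ms": None,
    }
    if received:
        summary["min_ms"] = min(rtts_ms)
        summary["avg_ms"] = sum(rtts_ms) / received
        summary["max_ms"] = max(rtts_ms)
    return summary


# Four requests sent, three replies received (made-up values).
print(summarize_ping(4, [1.0, 2.0, 3.0]))
```

With one reply missing out of four, the loss is 25%; with all replies present it is 0.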

Note, however, that failing to ping a server does not mean HTTP requests to it will fail, because some servers have firewalls that block ICMP.
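When ICMP is blocked, one alternative is to time a TCP connection to a port the server does expose, which also measures the connection setup latency mentioned earlier. The following is a minimal sketch of that idea using the Python standard library. The function name and the 2-second timeout are arbitrary, and it measures only the time taken by connect(), not a full HTTP request.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Return the time in milliseconds taken to open a TCP connection.

    Returns None if the connection is refused, times out, or fails otherwise.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000
    except OSError:
        return None
```

For instance, tcp_connect_ms("192.168.12.20", 80) would report how long the handshake with a web server on that host takes, or None if the port cannot be reached.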

