Original: Taste of Little Sister (WeChat official account: XjjDog). You are welcome to share this article; please credit the source.

It is said that the charm of Web 2.0 lies in turning static resources into interactive ones, while the charm of Web 3.0 lies in decentralized resources that everyone can participate in and benefit from. But no matter how fancy the upper-level concepts are, the lowest level of communication is still built on the technologies that Web 1.0 evolved.

Our ultimate goal, in fact, is to do actual centralization in the name of decentralization.

When traffic grows to a certain level, all sorts of strange scenarios appear in network programming. Here are more than a dozen practical cases illustrating the high-frequency problems XjjDog runs into in daily work, in the hope that they help you.

1. Avoid a flood of clients coming online at once

No matter how powerful your server is, a huge number of connections arriving at the same instant to do business can bring it down.

For example, suppose your MQTT server has hundreds of thousands of devices connected. When the server goes down and restarts, those become hundreds of thousands of concurrent reconnection requests, which almost no service can handle.

In XjjDog's past experience, server restarts have caused numerous outages of this kind.

This scenario is in fact very similar to cache breakdown: when the hot data in the cache expires, all the requests punch through to the database layer and cause problems.

The cache problem is solved by giving each key a random expiration time so that keys do not all expire at the same moment. Similarly, we can add a random delay before a client reconnects to the server. Randomness is a good thing here: it spreads the mass of reconnections over a time window, so the connection count grows gradually instead of spiking.
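
A minimal sketch of this idea in Java; the reconnect() callback and the delay values are hypothetical, so tune the jitter window to your fleet size:

import java.util.concurrent.ThreadLocalRandom;

public class JitteredReconnect {
    // Base delay plus up to 30 seconds of random jitter; placeholders for illustration.
    private static final long BASE_DELAY_MS = 1_000;
    private static final long MAX_JITTER_MS = 30_000;

    public static void reconnectWithJitter(Runnable reconnect) throws InterruptedException {
        // Each client sleeps a different amount of time, so the server sees
        // connections trickle in over the window instead of arriving all at once.
        long jitter = ThreadLocalRandom.current().nextLong(MAX_JITTER_MS);
        Thread.sleep(BASE_DELAY_MS + jitter);
        reconnect.run();
    }
}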

2. NIC multi-queue

For VMs on virtualization platforms such as OpenStack, the virtual network interface cards (NICs) are relatively weak, and services start to stall once traffic reaches a certain level. This is because a single CPU handling all the NIC interrupts becomes a bottleneck. You can run the dstat or iftop command to view the current network traffic.

For example, when a new Kafka machine comes online it copies a lot of data; if you ping the machine, the ping latency becomes very high. At the same time, the Recv-Q and Send-Q values also grow.

In this case, you need to enable the nic multi-queue mode.

You can use ethtool to view the queue information of the network adapter.

ethtool -l eth0 | grep 'Combined'

Combined: 1


You can increase the number of NIC queues with the following command.

ethtool -L eth0 combined 2


You are advised to enable the interrupt balancing service.

systemctl start irqbalance

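To check that the interrupts are really being spread across CPUs after this, you can look at the per-CPU interrupt counters (assuming the interface is named eth0):

grep eth0 /proc/interrupts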

3. Disconnect long connections periodically

If a client and a server stay connected and never close the connection, it is a long (persistent) connection. Long connections avoid the overhead of frequently creating connections. From HTTP/1 to HTTP/2 to HTTP/3, there has been a continuous effort to reduce and reuse connections, so long connections are usually the first choice.

But there are special cases where we do not want a long connection to live forever, and we need to give it a TTL. This usually happens in load-balancing scenarios, with components such as LVS and HAProxy.

Suppose there are three machines A, B, and C on the back end, and LVS spreads 90 connections evenly across them. At some point A goes down, and its 30 connections are redistributed to B and C, which then hold 45 connections each.

When A comes back, it gets no new connections, because the existing ones never close. Rebalancing LVS by hand also has a large impact. So we want long connections to carry a lifetime, so that the load rebalances gradually as connections expire and are re-established.
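
A minimal sketch of such a connection lifetime on the client side, assuming a hypothetical load-balancer address and openConnection() factory:

import java.io.IOException;
import java.net.Socket;

public class ExpiringConnection {
    // Maximum lifetime of one connection; after this we close and reconnect.
    private static final long MAX_LIFETIME_MS = 30 * 60 * 1000;

    private Socket socket;
    private long createdAt;

    public Socket get() throws IOException {
        if (socket == null || socket.isClosed()
                || System.currentTimeMillis() - createdAt > MAX_LIFETIME_MS) {
            if (socket != null) {
                socket.close();            // give the load balancer a chance to rebalance
            }
            socket = openConnection();     // hypothetical factory that dials the LB address
            createdAt = System.currentTimeMillis();
        }
        return socket;
    }

    private Socket openConnection() throws IOException {
        return new Socket("lvs.example.com", 8080);   // placeholder address
    }
}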

4. K8s port range

To avoid conflicts between K8s and other programs, the default NodePort range is 30000-32767. If you are on a K8s platform, have configured a nodePort, and cannot access it, check whether the port you chose is below this range.
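
If you genuinely need ports outside this range, the range itself can usually be widened on the kube-apiserver; a sketch only, since the exact way to pass the flag depends on how your cluster is deployed:

kube-apiserver --service-node-port-range=10000-32767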

5. TIME_WAIT

TIME_WAIT is the state held by the side that actively closes the connection. Nginx and crawler servers often have a large number of connections in the TIME_WAIT state. After closing, TCP waits 2MSL (twice the maximum segment lifetime) before releasing the connection completely. Because HTTP runs over TCP, servers that open and close connections frequently build up a huge backlog of TIME_WAIT connections.

On some systems, you can see messages like the following in dmesg.

__ratelimit: 2170 callbacks suppressed
TCP: time wait bucket table overflow
TCP: time wait bucket table overflow
TCP: time wait bucket table overflow
TCP: time wait bucket table overflow


The sysctl command can set these parameters. If you want them to take effect after a restart, add them to the /etc/sysctl.conf file.

net.ipv4.tcp_max_tw_buckets = 50000
net.ipv4.tcp_tw_reuse = 1
# Enable quick recycling of TIME_WAIT sockets. This is off by default.
net.ipv4.tcp_tw_recycle = 1
# Change the system default FIN timeout. The default is 60s.
net.ipv4.tcp_fin_timeout = 10

To test a parameter, use a command such as sysctl -w net.ipv4.tcp_tw_reuse=1. If the parameters are written to the file, run sysctl -p to make them take effect.

6. CLOSE_WAIT

CLOSE_WAIT usually appears when the peer has actively closed the connection and our side has not handled it properly. To put it bluntly, it is a bug in the program, and a rather harmful one.

Everyone knows that TCP opens with a three-way handshake and closes with a four-way wave; the four-way close exists because TCP allows one direction to be shut down independently.

When one side initiates an active close, it enters the FIN_WAIT_1 state. The passive side enters the CLOSE_WAIT state after receiving the FIN packet and replies with an ACK, at which point the active side moves to FIN_WAIT_2. This is a one-way shutdown.

If the passive side then fails to send its own FIN for some reason, it stays stuck in CLOSE_WAIT. A typical case: the program reads an EOF but never calls close.

Obviously, most of this is a programming bug that can only be fixed by code review.
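
As a minimal illustration in Java of the correct handling: when read() returns -1 the peer has sent its FIN, and our side must close as well rather than sit in CLOSE_WAIT. The class name and buffer size are only for the sketch.

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

public class EofAwareReader {
    public static void handle(Socket socket) throws IOException {
        // try-with-resources guarantees close() runs even on exceptions,
        // so the socket does not linger in CLOSE_WAIT.
        try (Socket s = socket; InputStream in = s.getInputStream()) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                // process n bytes ...
            }
            // read() returned -1: the peer sent FIN; leaving the block closes our side.
        }
    }
}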

7. How many network connections a process can open

Even with only one port open, Linux can accept a huge number of connections. The upper limit is bounded by the per-process file handle limit and the operating-system-wide file handle limit, namely ulimit and file-max.

To make the changes persistent, we prefer writing them to files. The per-process file handle limit goes in /etc/security/limits.conf, and is itself capped by fs.nr_open; the operating-system file handle limit goes in /etc/sysctl.conf. Finally, check the /proc/$PID/limits file to verify that the changes have taken effect for the process.

Example /etc/security/limits.conf configuration:

root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000
es  -  nofile  65535

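The operating-system-wide side, as a sketch for /etc/sysctl.conf (the values are examples, not recommendations; size them to your workload):

fs.file-max = 1000000
fs.nr_open = 1000000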

8. SO_KEEPALIVE

If this Socket option is turned on, the client Socket sends a probe packet to the server over an idle connection roughly every two hours.

This packet does nothing but check to see if the server is still active.

If the server does not respond, the client Socket keeps probing for roughly another 11 minutes; if there is still no response after about 12 minutes, the client Socket is shut down. If the option is turned off, the client Socket may stay open for a very long time even though the server side is already gone.
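
Enabling the option in Java is a single call; the probe intervals themselves come from kernel settings, not from the JDK API. The host and port below are placeholders:

import java.io.IOException;
import java.net.Socket;

public class KeepAliveDemo {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 80)) {
            socket.setKeepAlive(true);   // enable TCP keepalive probes on idle connections
            System.out.println("SO_KEEPALIVE = " + socket.getKeepAlive());
        }
    }
}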

9. What problem does SO_REUSEADDR solve

When doing network development, we often hit an "Address already in use" exception: the application has just been shut down, but the connections on the corresponding port are still in the TIME_WAIT state.

The TIME_WAIT state lasts for a while (2MSL). Setting SO_REUSEADDR allows the port to be bound again quickly, so the application can restart fast.
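
In Java the option has to be set before bind; a sketch using a ServerSocket on a placeholder port:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddrDemo {
    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket();       // created unbound
        server.setReuseAddress(true);                   // must be set before bind()
        server.bind(new InetSocketAddress(8080));       // succeeds even with lingering TIME_WAIT sockets
        System.out.println("listening on " + server.getLocalSocketAddress());
    }
}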

10. Use application-level heartbeats for health checks

TCP's own keepalive mechanism is weak: it runs quietly at the bottom of the stack and cannot express any application-layer semantics.

In our imagination a connection is a line, but in reality it is just two endpoints, and the packets between them may take a different path each time. An endpoint has to send a heartbeat packet and receive a reply to know whether the other side is still alive.

TCP's built-in keepalive can only tell whether the peer host is alive, not whether the service is available or healthy, and its timeout configuration often conflicts with TCP's own retransmission timeouts.

Therefore an application-layer heartbeat, with precise application semantics, is necessary.
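
A minimal sketch of such a heartbeat, assuming a hypothetical Connection abstraction whose read loop calls onPong() when a PONG arrives; the intervals are placeholders:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class HeartbeatMonitor {
    private static final long INTERVAL_MS = 10_000;   // send a PING every 10s
    private static final long DEADLINE_MS = 30_000;   // no PONG for 30s => connection is dead

    private final AtomicLong lastPongTime = new AtomicLong(System.currentTimeMillis());
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Called by the read loop whenever a PONG message is received.
    public void onPong() {
        lastPongTime.set(System.currentTimeMillis());
    }

    public void start(Connection conn) {
        scheduler.scheduleAtFixedRate(() -> {
            if (System.currentTimeMillis() - lastPongTime.get() > DEADLINE_MS) {
                conn.close();          // peer is unhealthy at the application layer
                scheduler.shutdown();
            } else {
                conn.send("PING");     // application-level probe with real semantics
            }
        }, INTERVAL_MS, INTERVAL_MS, TimeUnit.MILLISECONDS);
    }

    // Hypothetical connection abstraction; replace with your own client.
    interface Connection {
        void send(String message);
        void close();
    }
}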

11. SO_LINGER

This Socket option can affect the behavior of the close method.

By default, calling the close method returns immediately; any data still in the send buffer is handed off to the operating system, which keeps trying to deliver it in the background.

If SO_LINGER is enabled with a positive value n (the maximum is 65,535), the close method blocks for at most n seconds.

During those n seconds the system tries to send the remaining data; if data is still unsent when the time is up, it is discarded and close returns.

Setting the linger time to 0 is not the same as disabling SO_LINGER: with linger = 0, close discards unsent data immediately and typically resets the connection, while with SO_LINGER disabled the system keeps trying to deliver the remaining data in the background.
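
In Java this is controlled with setSoLinger; a small sketch with a placeholder host and timeout:

import java.io.IOException;
import java.net.Socket;

public class LingerDemo {
    public static void main(String[] args) throws IOException {
        Socket socket = new Socket("example.com", 80);
        socket.setSoLinger(true, 5);      // close() will block up to 5 seconds to flush unsent data
        // socket.setSoLinger(false, 0);  // default: close() returns immediately
        socket.close();
    }
}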

12. SO_TIMEOUT

You can use this option to set a timeout for reading data.

If a timeout (in milliseconds) is set and the read method of the input stream blocks, the system throws an InterruptedIOException (in practice a SocketTimeoutException, which is a subclass of it) after waiting timeout milliseconds.

After the exception is thrown, the input stream is not closed and you can continue reading data through the read method.
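
A small sketch in Java (placeholder host and timeout) showing that the socket stays usable after the timeout fires:

import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 80)) {
            socket.setSoTimeout(3_000);            // read() waits at most 3 seconds
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[1024];
            try {
                int n = in.read(buf);
                System.out.println("read " + n + " bytes");
            } catch (SocketTimeoutException e) {
                // The socket is still open; we can retry the read or give up.
                System.out.println("read timed out, socket still usable");
            }
        }
    }
}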

13. SO_SNDBUF and SO_RCVBUF

By default, the send buffer of the output stream is 8 KB (8192 bytes); this is the output buffer size suggested by Java.

If the default is not sufficient, the setSendBufferSize method can be used to resize the buffer. Do not set the output buffer too small, though, or data will be sent in too many small pieces and the efficiency of network transmission will drop.
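
A sketch of resizing both buffers (placeholder host and sizes); the kernel may round or cap the requested values, so it is worth reading them back:

import java.io.IOException;
import java.net.Socket;

public class BufferSizeDemo {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 80)) {
            socket.setSendBufferSize(64 * 1024);       // request a 64 KB send buffer
            socket.setReceiveBufferSize(64 * 1024);    // request a 64 KB receive buffer
            // The OS may adjust the requested sizes, so verify what was actually granted.
            System.out.println("SO_SNDBUF = " + socket.getSendBufferSize());
            System.out.println("SO_RCVBUF = " + socket.getReceiveBufferSize());
        }
    }
}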

14. SO_OOBINLINE

If this Socket option is enabled on the receiving side, a single byte of urgent data sent via the sendUrgentData method of the Socket class is delivered inline with the normal data.

This single byte does not go through the output buffer; it is sent immediately.

Even though the client does not use an OutputStream to send this byte, on the server it arrives mixed in with the normal data. The server application therefore cannot tell whether a byte was sent through OutputStream or through sendUrgentData.
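
A small sketch of both calls in Java (placeholder host; in practice setOOBInline matters on the receiving socket and sendUrgentData on the sending one):

import java.io.IOException;
import java.net.Socket;

public class UrgentDataDemo {
    public static void main(String[] args) throws IOException {
        try (Socket socket = new Socket("example.com", 80)) {
            // Receiving side: without this call, Java silently discards urgent data.
            socket.setOOBInline(true);
            // Sending side: one byte of urgent data, bypassing the output buffer.
            socket.sendUrgentData(0xFF);
        }
    }
}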

End

I was very surprised to find that some network environments are still on gigabit NICs, including some professional test environments. When a stress test is run in these environments and the traffic exceeds what the NIC can carry, application response becomes extremely slow. A computer system is a whole: CPU, memory, network, and I/O; a bottleneck in any one of these links will cause problems.

In distributed systems, the network is a very important factor. But because it is relatively low-level, most developers know little about it. With the popularity of cloud-native components, there is less and less direct exposure to this underlying infrastructure. But if something does go wrong, remember: after you have ruled out the most likely components, there is still the network.

Xjjdog is a public account that doesn't let programmers get sidetracked. It focuses on infrastructure and Linux. Ten years of architecture, tens of billions of daily requests, discussing the world of high concurrency with you, to give you a different taste. My personal WeChat is xjjdog0; you are welcome to add me as a friend for further communication.
