Original: Taste of Little Sister (WeChat official account ID: XjjDog). You are welcome to share; please keep the attribution.

I recently came across a very interesting problem. A group of HAProxy machines kept running into trouble. I logged into the servers and went through CPU, memory, network, and I/O one by one. It turned out that each machine had more than 60,000 connections in the TIME_WAIT state.

The TIME_WAIT state usually piles up on proxy machines such as HAProxy and Nginx, mainly because the proxy actively closes large numbers of connections. You can usually resolve the problem quickly by tuning the kernel's reuse and recycling parameters.
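For reference, here is a minimal sketch of the kind of parameters meant here, with purely illustrative values; note that net.ipv4.tcp_tw_recycle was removed in kernel 4.12 and is unsafe with clients behind NAT, so it is shown only as a comment.

# /etc/sysctl.conf -- illustrative values, not a recommendation
net.ipv4.tcp_tw_reuse = 1        # allow reusing TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_fin_timeout = 30    # shorten the FIN-WAIT-2 timeout
# net.ipv4.tcp_tw_recycle = 1    # removed in kernel 4.12; breaks clients behind NAT

Apply the file with sysctl -p.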

You can run the following command to get a per-state count of TCP connections.

netstat -ant|awk '/^tcp/ {++S[$NF]} END {for(a in S) print (a,S[a])}'
ESTABLISHED 70
FIN_WAIT2 30
CLOSING 33
TIME_WAIT 65520
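If the iproute2 ss tool is available, a faster equivalent of the same breakdown would look roughly like this (a sketch, not taken from the original article):

# count sockets by TCP state using ss; skip the header line
ss -ant | awk 'NR > 1 {++s[$1]} END {for (k in s) print k, s[k]}'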

There is nothing magical about this in itself, but 65,535 is a suspiciously familiar number. Something must have hit a ceiling.

Which only adds to the confusion: why does the service become unavailable once the number of TIME_WAIT connections reaches 65,535?

Is all the talk about a single machine handling a million connections just hot air? Why is it so fragile?

65535, which is 2^16 - 1, is a magic number. Small numbers aside, let's see how many connections Linux can actually support.

1. How many connections does Linux support?

The answer is an infinite number. But there are only 65535 ports.

Why are there only 65535 ports?

The reason is historical: the TCP and UDP headers each reserve 16 bits for the source port and 16 bits for the destination port. An unsigned 16-bit field can hold at most 2^16 - 1 = 65535.

Standards like this, set for historical reasons, are effectively immutable; they are simply too deeply rooted to change.

So how many connections can Linux support? Again, the answer is effectively unlimited.

Take Nginx as an example, listening on port 80. Machine A connecting to Nginx can open roughly 60,000 long-lived connections, one per local port. If machine B then connects to Nginx, it can open another 60,000 or so on top of that. That is because a connection is identified jointly by its source and destination: source address, source port, destination address, and destination port.
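As a rough illustration (a sketch reusing the same netstat output as above; port 80 is just an example), you can group established connections to a single server port by client address and see many distinct source pairs sharing it:

# ESTABLISHED connections to local port 80, grouped by client IP (illustrative)
netstat -ant | awk '$6 == "ESTABLISHED" && $4 ~ /:80$/ {split($5, a, ":"); print a[1]}' | sort | uniq -c | sort -rn | head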

The idea that Linux can only accept 65,535 connections is just plain naive.

65,535 ports may indeed be too few if the machine acts as a load generator, since every outgoing connection consumes a local port. But for a server it is more than enough.

2. How to support millions of connections?

As you can see from the above, the number of connections itself is not the limit. But Linux has another layer of protection: the number of file handles. Everything you see in the output of the lsof command is a file handle.
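For instance, a quick way to count the handles a single process currently holds (the PID 18032 is just an example, reused from later in this article):

# count the open file handles of one process (example PID)
lsof -p 18032 | wc -l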

Let’s take a look at a few commands.

ulimit shows the number of file handles a single process may hold.

ulimit -n
65535

file-max shows the total number of file handles the operating system can hand out; it applies to all processes combined.

cat /proc/sys/fs/file-max
766722

file-nr shows the number of handles currently allocated alongside the system-wide maximum. It is handy for monitoring.

cat /proc/sys/fs/file-nr
1824	 0	766722

To support millions of connections, you need to raise the handle limits at both the operating-system level and the process level. In other words, both ulimit and file-max must be larger than one million.

3. How do I set these parameters?

ulimit is the common way to set a process's handle limit, but it is not recommended, for one simple reason: the setting only applies to processes started from the same shell session. Open another shell, or restart the machine, and the change is gone. For the record, it looks like this:

ulimit -n 1000000

The right way is to modify /etc/security/limits.conf, like the following.

root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000

As you can see, the limit can be set for a particular user. You often run into this when installing applications such as Elasticsearch (ES).

es  -  nofile  65535

But even this way, you need to open a new shell for the setting to apply; it does not take effect in the current shell or for processes that are already running. Xjjdog has seen several cases where the limit was thought to have been raised, yet problems still occurred.

To check whether the change has actually taken effect for a process, look at its entry under /proc. For example, cat /proc/18032/limits shows that process's limits in detail (18032 here is the process ID).
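For example, to pull out just the open-file limit of that (example) process:

# show only the open-file limit of the process
grep "Max open files" /proc/18032/limits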

This number cannot be made arbitrarily large, though: its ceiling is determined by nr_open. To raise it further, change the fs.nr_open value in /etc/sysctl.conf.

cat /proc/sys/fs/nr_open
1048576
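A minimal sketch of raising that ceiling, with an illustrative value of two million:

# /etc/sysctl.conf -- illustrative value
fs.nr_open = 2000000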

What about file-max? Add the following line to /etc/sysctl.conf. That is more than six million!

fs.file-max = 6553560
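Then reload the file and verify (a quick sketch):

sysctl -p                        # reload /etc/sysctl.conf
cat /proc/sys/fs/file-max        # confirm the new system-wide limit
cat /proc/sys/fs/nr_open         # confirm the new per-process ceiling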

When the number of open files exceeds this threshold, the kernel reports an error like: kernel: VFS: file-max limit 65535 reached.
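If you suspect you have hit it, the kernel log is the place to look, for example:

# search the kernel ring buffer for the file-max warning
dmesg | grep -i "VFS: file-max limit"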

End

To sum up.

Even when it only listens on a single port, Linux can accept a huge number of connections. The real upper limit is set by the per-process file handle count and the operating-system file handle count, namely ulimit and file-max.

To make these parameter changes persistent, we prefer to write them to configuration files. The per-process file handle limit goes into /etc/security/limits.conf and is itself capped by fs.nr_open; the operating-system file handle limit goes into /etc/sysctl.conf. Finally, check the /proc/$ID/limits file to verify that the changes are in effect for the process.

With that, "a million connections" lives up to its name. I do wonder why Linux does not relax these settings by default. I could understand 65,535, but why 1024?

Xjjdog is a public account that doesn't let programmers get sidetracked. It focuses on infrastructure and Linux: ten years of architecture, ten billion requests a day, discussing the world of high concurrency with you and giving you a different taste. My personal WeChat ID is xjjdog0; feel free to add me as a friend for further discussion.