Author: Luobingli | Source: Erda public account (WeChat)

One evening, a customer ran into the following problem: their K8s cluster could not be scaled out, and none of the new nodes could join the cluster properly. After much struggling with no solution, the customer escalated the issue to us in the hope of getting technical support. The troubleshooting process turned out to be quite interesting. This article summarizes the troubleshooting ideas and methods we used, in the hope that they will be a useful reference when you run into similar problems.

Problem symptoms

While scaling out the nodes of the customer's K8s cluster, our operations colleagues found that the newly added nodes consistently failed to join. The preliminary findings were as follows:

  • On the new node, access to the K8s master Service VIP is blocked.
  • On the new node, direct access to the K8s master's host IP on port 6443 works normally.
  • On the new node, the container IPs of other nodes can be pinged normally.
  • On the new node, access to the CoreDNS Service VIP works normally.

The customer was running Kubernetes 1.13.10 on hosts with kernel 4.18 (CentOS 8.2).

Problem troubleshooting process

After receiving the feedback from our front-line colleague, we initially suspected a problem with IPVS. Based on past experience troubleshooting network issues, we first ran some routine on-site checks (a command-level sketch follows the list):

  • Verify that the kernel module ip_tables is loaded (normal)
  • Confirm whether the iptables FORWARD chain defaults to ACCEPT (normal)
  • Verify that the host network is normal (normal)
  • Verify that the container network is normal (normal)
  • ...
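
A rough sketch of those routine checks; the module names are the usual kube-proxy/IPVS defaults, and the addresses in angle brackets are placeholders for this environment:

# Kernel modules needed by kube-proxy in IPVS mode
lsmod | grep -E 'ip_tables|ip_vs'

# Default policy of the iptables FORWARD chain (should be ACCEPT)
iptables -nL FORWARD | head -1

# Host-to-host and container network reachability
ping -c 3 <master-host-ip>
ping -c 3 <pod-ip-on-another-node>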

After ruling out the usual suspects, the scope could be narrowed down considerably, so we continued the investigation at the IPVS level.

1. Checking with the ipvsadm command

10.96.0.1 is the K8s master Service VIP of the customer's cluster.





Inspecting the IPVS connection entries for this VIP, we found connections stuck in the abnormal SYN_RECV state. We could also see that kubelet and kube-proxy connected normally at startup, which indicates that the K8s Service network only became abnormal after startup.
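
A sketch of the kind of check used here (10.96.0.1:443 is the master Service VIP from the environment above; the options are standard ipvsadm flags):

# Virtual server and real servers behind the master Service VIP
ipvsadm -Ln -t 10.96.0.1:443

# Current IPVS connection entries for that VIP; on the broken node
# these sit in SYN_RECV instead of ESTABLISHED
ipvsadm -Lnc | grep 10.96.0.1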


2. Tcpdump capture analysis

We captured packets on both ends and reproduced the issue with the command telnet 10.96.0.1 443.

Conclusion: the SYN packet was never sent out from the new node.
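
A sketch of the capture (the new-node IP in angle brackets is a placeholder; the VIP and ports come from the cluster above):

# On the new node, watch for the SYN leaving the host while triggering
# a connection in another terminal with: telnet 10.96.0.1 443
tcpdump -nn -i any host 10.96.0.1 and port 443

# On the master (real server) side, confirm whether the SYN ever arrives
tcpdump -nn -i any port 6443 and host <new-node-ip>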

3. Preliminary summary

Through the above investigation we could narrow the scope again: the problem was basically in kube-proxy. We were using IPVS mode, which still relies on iptables rules for functions such as forwarding, SNAT, and dropping packets. Based on this, we started analyzing the prime suspect, kube-proxy.

4. Checking the kube-proxy log



In the kube-proxy log we found an error: the iptables-restore command was failing. Searching Google and the community confirmed that this is a known problem.
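
A sketch of how to pull up that error, assuming kube-proxy runs as a DaemonSet in kube-system (the pod name is a placeholder):

# Look for iptables-restore failures in the kube-proxy pod on the new node
kubectl -n kube-system logs <kube-proxy-pod> | grep -i iptables-restore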



The relevant issue and fix PRs can be found here:

  • https://github.com/kubernetes/kubernetes/issues/73360
  • https://github.com/kubernetes/kubernetes/pull/84422/files
  • https://github.com/kubernetes/kubernetes/pull/82214/files

5. Digging deeper

Reading the code (version 1.13.10, pkg/proxy/ipvs/proxier.go:1427), we found that this version has no logic to check whether the KUBE-MARK-DROP chain exists and to create it if it does not. When the chain is missing, this defect causes the iptables-restore command to fail.

The K8s master Service VIP is blocked while traffic to the actual container IPs still gets through; this behaviour is related to the following iptables rule:

iptables -t nat -A KUBE-SERVICES ! -s 9.0.0.0/8 -m comment --comment "Kubernetes service cluster ip + port for masquerade purpose" -m set --match-set KUBE-CLUSTER-IP dst,dst -j KUBE-MARK-MASQ
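
This rule marks packets destined for any Service cluster IP (tracked in the KUBE-CLUSTER-IP ipset) for masquerading. A quick sketch of inspecting both sides of the rule, using the default kube-proxy chain and ipset names:

# Rules in KUBE-SERVICES that reference the cluster-IP ipset
iptables -t nat -S KUBE-SERVICES | grep KUBE-CLUSTER-IP

# Entries of the ipset the rule matches against (Service VIPs and ports)
ipset list KUBE-CLUSTER-IP | head -20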

6. Root cause exploration

We already knew that kube-proxy 1.13.10 has a bug: it does not create the KUBE-MARK-DROP chain before executing iptables-restore to configure its rules. But why does K8s 1.13.10 fail on CentOS 8.2 with a 4.18 kernel, while it runs normally on CentOS 7.6 with a 3.10 kernel?

Looking at the kube-proxy source code, kube-proxy essentially just configures rules through the iptables commands. Since kube-proxy reports an iptables-restore failure, let's find a machine with a 4.18 kernel and look inside its kube-proxy container. Running iptables-save inside the container shows that kube-proxy indeed never creates the KUBE-MARK-DROP chain (exactly as the code suggests), while running iptables-save on the host shows that the chain does exist there.
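
A minimal sketch of that comparison (the kube-proxy pod name is a placeholder; both commands dump only the nat table):

# On the 4.18-kernel host: the chain is present
iptables-save -t nat | grep KUBE-MARK-DROP

# Inside the kube-proxy container on the same node: the chain is absent
kubectl -n kube-system exec <kube-proxy-pod> -- iptables-save -t nat | grep KUBE-MARK-DROP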

This raises two questions:

  • Why does iptables on the 4.18-kernel host have a KUBE-MARK-DROP chain?
  • Why are the iptables rules on the 4.18-kernel host inconsistent with the rules inside the kube-proxy container?

For the first question: some program other than kube-proxy must be operating on iptables. Digging further through the K8s code, the conclusion is that besides kube-proxy, kubelet also modifies iptables rules; the specific code is in pkg/kubelet/kubelet_network_linux.go.

For the second question: even though the kube-proxy container mounts the host's /run/xtables.lock, the rules seen by iptables on the host and inside the container do not match. Some searching gives the answer: CentOS 8 has abandoned iptables and uses the nftables framework as its default packet-filtering tool, so the host's iptables command talks to the nftables backend while the iptables binary inside the kube-proxy image still uses the legacy backend, and the two therefore see different rule sets.

At this point, the whole mystery was solved.

Our team has delivered a large number of customer projects, so there are a couple of related questions worth answering:

  • Question 1: Given how many customer environments we run, why is this the first time we have hit this situation?

Because it takes the combination of K8s 1.13.10 and a CentOS 8.2 operating system to trigger it; this combination is rare, so the problem seldom surfaces. Upgrading to K8s 1.16.0+ avoids the problem.

  • Question 2: Why does K8s 1.13.10 with a 5.5 kernel not have this problem?

Because the problem is tied to the CentOS 8 operating system rather than the kernel itself: the kernel we manually upgraded to 5.5 still uses the legacy iptables framework by default.

You can verify whether you are using nftables with the iptables -V command.
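
For example (exact version strings vary by distribution and iptables build):

# nftables backend (e.g. CentOS 8)
$ iptables -V
iptables v1.8.4 (nf_tables)

# A legacy build prints "(legacy)" or no suffix at all
$ iptables -V
iptables v1.8.4 (legacy)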

As an aside: what is nftables, and is it better than iptables? That is another topic worth studying further, but we will not go into it here.

Summary and insights

For the problem investigated above, we can summarize the solutions as follows:

  • Keep the kernel at 3.10 (CentOS 7.6+), or manually upgrade the kernel to 5.0+;
  • Upgrade Kubernetes; we have currently confirmed that versions 1.16.10+ do not have this issue.

The above is our experience troubleshooting this Kubernetes networking fault; we hope it helps you troubleshoot and locate the cause of similar problems efficiently.

If there is anything else you would like to know about the Erda project, add the assistant on WeChat (Erda202106) to join the discussion group!

Welcome to Open Source

As an open-source, one-stop cloud-native PaaS platform, Erda provides platform-level capabilities such as DevOps, microservice observability and governance, multi-cloud management, and fast data governance. Click the links below to take part in the open source project, discuss and exchange ideas with many other developers, and help build the open source community. Everyone is welcome to follow the project, contribute code, and give it a star!

  • Erda GitHub address: https://github.com/erda-project/erda
  • Erda Cloud website: https://www.erda.cloud/