10. High Availability Architecture (Scalable Multi-Master Architecture)

As a container orchestration system, Kubernetes achieves Pod self-healing through health checks and restart policies, distributes Pods across nodes through its scheduling algorithm, and maintains the desired number of replicas. When a Node fails, its Pods are automatically brought back up on other nodes, providing high availability at the application layer.
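
For illustration, that replica-maintenance behavior is what a Deployment with a liveness probe provides (the name and image below are hypothetical, not part of this cluster setup):

```shell
# Hypothetical 3-replica Deployment: if a node fails, the controller
# re-creates its Pods on healthy nodes to keep the replica count at 3;
# the liveness probe restarts unhealthy containers in place.
cat << 'YAML' > nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        livenessProbe:
          httpGet:
            path: /
            port: 80
YAML
# On a working cluster: kubectl apply -f nginx-deploy.yaml
```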

For a Kubernetes cluster, high availability also involves the following two levels: the high availability of the etcd database and the high availability of the Kubernetes Master components. As for etcd, we have already achieved high availability by deploying it as a three-node cluster. This section explains and implements high availability for the Master node.

The Master node acts as the cluster's control center and maintains the healthy working state of the whole cluster by constantly communicating with the kubelet and kube-proxy on each worker node. If the Master node fails, the cluster can no longer be managed with kubectl or through the API.

The Master node runs three main services: kube-apiserver, kube-controller-manager, and kube-scheduler. kube-controller-manager and kube-scheduler already achieve high availability through leader election, so Master high availability mainly concerns kube-apiserver, which serves an HTTP API. Its high availability is therefore similar to that of a web server: put a load balancer in front of it and scale it horizontally.

Multiple Master Architecture Diagram:

10.1. Install Docker

10.2. Configure the host environment

10.3. Deploy Master2 Node (192.168.219.164)

All operations on Master2 are identical to those already performed on Master1, so we only need to copy all the K8s files from Master1 and then modify the server IP and hostname before starting the services.

10.3.1 Create ETCD certificate directory

Create the etcd certificate directory on Master2:

mkdir -p /opt/etcd/ssl

10.3.2 Copy files (MASTER1 operation)

Copy all K8s files and etcd certificates from Master1 to Master2:

scp -r /opt/kubernetes root@192.168.219.164:/opt
scp -r /opt/cni/ root@192.168.219.164:/opt
scp -r /opt/etcd/ssl root@192.168.219.164:/opt/etcd
scp /usr/lib/systemd/system/kube* root@192.168.219.164:/usr/lib/systemd/system
scp /usr/bin/kubectl root@192.168.219.164:/usr/bin

10.3.3. Modify configuration file IP and hostname

Modify the apiserver, kubelet and kube-proxy configuration files to use the local IP and hostname:

vi /opt/kubernetes/cfg/kube-apiserver.conf
...
--bind-address=192.168.219.164 \
--advertise-address=192.168.219.164 \
...

vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=k8s-master2

vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: k8s-master2

10.3.4 Start the services and enable them at boot

systemctl daemon-reload
systemctl start kube-apiserver
systemctl start kube-controller-manager
systemctl start kube-scheduler
systemctl start kubelet
systemctl start kube-proxy
systemctl enable kube-apiserver
systemctl enable kube-controller-manager
systemctl enable kube-scheduler
systemctl enable kubelet
systemctl enable kube-proxy

10.3.5. Check the cluster status

kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-1               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}   
etcd-0               Healthy   {"health":"true"}

10.3.6 Approve the kubelet certificate request

kubectl get csr
NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-JYNknakEa_YpHz797oKaN-ZTk43nD51Zc9CJkBLcASU   85m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

kubectl certificate approve node-csr-JYNknakEa_YpHz797oKaN-ZTk43nD51Zc9CJkBLcASU

kubectl get node
NAME          STATUS   ROLES    AGE   VERSION
k8s-master1   Ready    <none>   34h   v1.18.18
k8s-master2   Ready    <none>   83m   v1.18.18
k8s-node1    Ready    <none>   33h   v1.18.18
k8s-node2    Ready    <none>   33h   v1.18.18
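
When several bootstrap CSRs are pending at once, their names can be extracted and approved in one pass (a convenience sketch, not part of the original steps; the CSR names in the sample are made up, and the parsing runs on sample text so it works without a cluster):

```shell
# Pick out Pending CSR names from `kubectl get csr`-style output.
csr_output="NAME           AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-aaa   85m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending
node-csr-bbb   12m   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued"

pending=$(printf '%s\n' "$csr_output" | awk 'NR>1 && $NF=="Pending" {print $1}')
echo "$pending"
# On the master: echo "$pending" | xargs -r -n1 kubectl certificate approve
```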

11. Deploy the NGINX Load Balancer

Nginx is a mainstream web server and reverse proxy; here its stream (layer-4) module is used to load balance the apiservers.

Keepalived is a widely used high-availability software package that provides active/standby failover through a virtual IP (VIP). In this topology, Keepalived decides whether to fail over based on the running state of Nginx: when the Nginx master fails, the VIP is automatically rebound to the Nginx backup, so the VIP stays reachable and Nginx remains highly available.

Kube-Apiserver High Availability Architecture Diagram:

11.1. Install the software package (master/standby)

yum install epel-release -y
yum install nginx keepalived -y

11.2. Nginx configuration file (same on master and backup)

cat > /etc/nginx/nginx.conf << "EOF"
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

include /usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

# Layer-4 load balancing for the two apiservers
stream {

    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';

    access_log  /var/log/nginx/k8s-access.log  main;

    upstream k8s-apiserver {
       server 192.168.219.161:6443;   # Master1 APISERVER IP:PORT
       server 192.168.219.164:6443;   # Master2 APISERVER IP:PORT
    }

    server {
       listen 6443;
       proxy_pass k8s-apiserver;
    }
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    server {
        listen       80 default_server;
        server_name  _;

        location / {
        }
    }
}
EOF

11.3. Keepalived configuration file (Nginx Master)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33              # network interface name; change to your actual NIC
    virtual_router_id 51         # VRRP routing ID; must be unique per instance
    priority 100                 # priority; set 90 on the backup server
    advert_int 1                 # VRRP heartbeat advertisement interval
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP
    virtual_ipaddress {
        192.168.219.188/24
    }
    track_script {
        check_nginx
    }
}
EOF
  • vrrp_script: specifies the script that checks the working state of Nginx (failover is decided by the Nginx state)
  • virtual_ipaddress: the virtual IP (VIP)

Check the nginx status script:

cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ps -ef | grep nginx | egrep -cv "grep|$$")

if [ "$count" -eq 0 ]; then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh

11.4. Keepalived Configuration File (Nginx Backup)

cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
}

vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51         # VRRP routing ID; same as on the master
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.219.188/24
    }
    track_script {
        check_nginx
    }
}
EOF

The script to check the health of Nginx in the above configuration file:

cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ps -ef | grep nginx | egrep -cv "grep|$$")

if [ "$count" -eq 0 ]; then
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh

Note: Keepalived determines failover based on the script return status code (0 for working and non-0 for abnormal).
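
To make that contract concrete, here is a small runnable sketch of the decision logic (illustrative only: the real script counts nginx processes with ps as shown above, while this version takes the count as a parameter so it can run anywhere):

```shell
# Emulates Keepalived's interpretation of the check script:
# return 0 -> Nginx healthy, keep the VIP; return 1 -> trigger failover.
nginx_alive() {
    local count=$1   # stand-in for: ps -ef | grep nginx | egrep -cv "grep|$$"
    if [ "$count" -eq 0 ]; then
        return 1     # no nginx processes: report failure, VIP will drift
    else
        return 0     # nginx running: report healthy
    fi
}

nginx_alive 3 && echo "healthy: VIP stays"
nginx_alive 0 || echo "failed: VIP drifts to the backup"
```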

11.5. Start and set startup

systemctl daemon-reload
systemctl start nginx
systemctl start keepalived
systemctl enable nginx
systemctl enable keepalived

11.6. Check the Keepalived working state

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:04:f7:2c brd ff:ff:ff:ff:ff:ff
    inet 192.168.31.80/24 brd 192.168.31.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.31.88/24 scope global secondary ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe04:f72c/64 scope link
       valid_lft forever preferred_lft forever

You can see that the 192.168.31.88 virtual IP is attached to the ens33 network card, which indicates that it is working properly.

11.7. Nginx+Keepalived High Availability Test

Stop Nginx on the master node and test whether the VIP drifts to the backup server: run pkill nginx on the Nginx master, then run ip addr on the Nginx backup to verify that the VIP has been bound there.
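
The check on the backup can be scripted; a minimal sketch, assuming the VIP and NIC from the configuration above (192.168.219.188 on ens33). has_vip takes the ip addr output as text, so the logic can be exercised without a real interface:

```shell
VIP="192.168.219.188"

# Returns 0 if the given `ip addr` output shows the VIP bound.
has_vip() {
    printf '%s\n' "$1" | grep -q "inet $VIP"
}

# On the real backup node you would run:  has_vip "$(ip addr show ens33)"
# Here we feed it sample output instead:
sample="inet 192.168.31.80/24 brd 192.168.31.255 scope global ens33
inet $VIP/24 scope global secondary ens33"

if has_vip "$sample"; then
    echo "VIP bound: failover succeeded"
fi
```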

11.8. Access load balancer tests

11.8.1 Use curl to query the K8s version through the VIP:

curl -k https://192.168.219.188:6443/version
{
  "major": "1",
  "minor": "18",
  "gitVersion": "v1.18.3",
  "gitCommit": "2e7996e3e2712684bc73f0dec0200d64eec7fe40",
  "gitTreeState": "clean",
  "buildDate": "2020-05-20T12:43:34Z",
  "goVersion": "go1.13.9",
  "compiler": "gc",
  "platform": "linux/amd64"
}

The K8s version information is returned correctly, which means the load balancer is working: curl -> VIP (nginx) -> apiserver.

11.8.2 You can also see which apiserver the request was forwarded to in the Nginx log:

tail /var/log/nginx/k8s-access.log -f
192.168.219.181 192.168.219.161:6443 - [30/May/2020:11:15:10 +0800] 200 422
192.168.219.181 192.168.219.164:6443 - [30/May/2020:11:15:26 +0800] 200 422
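
To see how requests are being distributed across the two apiservers, the stream access log can be summarized with awk (a convenience sketch; the sample lines below follow the log_format configured earlier, and on the LB node you would pipe /var/log/nginx/k8s-access.log instead):

```shell
# Count requests per upstream ($2 is $upstream_addr in the configured format).
log="192.168.219.181 192.168.219.161:6443 - [30/May/2020:11:15:10 +0800] 200 422
192.168.219.181 192.168.219.164:6443 - [30/May/2020:11:15:26 +0800] 200 422
192.168.219.181 192.168.219.164:6443 - [30/May/2020:11:16:02 +0800] 200 422"

summary=$(printf '%s\n' "$log" | awk '{count[$2]++} END {for (u in count) print u, count[u]}' | sort)
echo "$summary"
```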

That is not the end, though; the most important step remains.

12. Modify all Worker nodes to connect to LB VIP

Consider: although we added Master2 and a load balancer to scale out the single-Master architecture, all Node components still connect to Master1. If we do not switch them over to the VIP on the load balancer, the Master remains a single point of failure.

12.1. The next step is to change every Node component configuration file from 192.168.219.161 to 192.168.219.188 (the VIP).

Hostname      IP
k8s-master1 192.168.219.161
k8s-node1 192.168.219.162
k8s-node2 192.168.219.163
k8s-master2 192.168.219.164

These are the nodes shown by the kubectl get node command.

12.2. Execute in all the Worker nodes mentioned above

sed -i 's#192.168.219.161:6443#192.168.219.188:6443#' /opt/kubernetes/cfg/*
systemctl restart kubelet
systemctl restart kube-proxy
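
Before running the substitution against /opt/kubernetes/cfg, it can be sanity-checked on a scratch file (the sample kubeconfig line below is illustrative):

```shell
# Dry-run of the VIP substitution on a throwaway file.
tmp=$(mktemp)
echo 'server: https://192.168.219.161:6443' > "$tmp"

sed -i 's#192.168.219.161:6443#192.168.219.188:6443#' "$tmp"

result=$(cat "$tmp")
echo "$result"
rm -f "$tmp"
```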

12.3. Check node status

kubectl get node
NAME          STATUS   ROLES    AGE    VERSION
k8s-master1   Ready    <none>   34h    v1.18.18
k8s-master2   Ready    <none>   101m   v1.18.18
k8s-node1    Ready    <none>   33h    v1.18.18
k8s-node2    Ready    <none>   33h    v1.18.18

At this point, a complete set of Kubernetes highly available cluster deployment is complete!

PS: If you are on a public cloud, Keepalived is generally not supported; instead you can use the provider's load balancer product (an internal-network LB is fine and often free). The architecture is the same as above: point the load balancer directly at the multiple Master kube-apiservers.