Analysis of possible outage causes

Let's first take a look at how Kubernetes reroutes traffic from the old Pod instances to the new ones during a rolling update.

If the client we are testing connects to the Service directly from within the cluster, the Service name is first resolved to the Service's ClusterIP via the cluster DNS, and the request is then forwarded to one of the Pod instances behind the Service. This forwarding is implemented by kube-proxy, which updates the iptables rules on each node.

Kubernetes updates Endpoints objects based on Pod status, so that an Endpoints object only lists the addresses of Pods that are ready to handle requests.
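For reference, a minimal Service of this kind might look like the sketch below (the nginx-app name, label selector, and port are assumptions used for illustration):

# Sketch of a ClusterIP Service (name, labels, and ports are assumed).
# kube-proxy programs iptables so that traffic sent to the ClusterIP is spread
# across the ready Pod addresses listed in the matching Endpoints object.
apiVersion: v1
kind: Service
metadata:
  name: nginx-app
spec:
  type: ClusterIP
  selector:
    app: nginx-app          # assumed Pod label
  ports:
    - port: 80              # Service port
      targetPort: 80        # container port on the Pods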

However, Kubernetes Ingress connects to Pod instances in a slightly different way, which is why we see different downtime behavior during rolling updates when clients connect to the application through an Ingress.

Most Ingress controllers, such as nginx-ingress and Traefik, get Pod addresses directly by watching Endpoints objects instead of going through the iptables rules.
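As a rough sketch (the host name, Service name, and port below are assumptions), such an Ingress might look like this; the Ingress controller resolves nginx-app to its Endpoints and proxies to the Pod IPs directly:

# Sketch of an Ingress routing to the Service above (host and names assumed).
# The controller (e.g. nginx-ingress) watches the Endpoints of nginx-app and
# forwards requests to the Pod IPs directly, bypassing the Service's iptables rules.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-app
spec:
  rules:
    - host: app.example.com          # assumed host name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-app      # assumed Service name
                port:
                  number: 80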

No matter how we connect to the application, Kubernetes' goal is to minimize service interruption during rolling updates. Once the new Pod is active and ready, Kubernetes stops the existing Pod: it marks the Pod as "Terminating", removes it from the Endpoints object, and sends a SIGTERM signal to the Pod's main process. The SIGTERM signal tells the container to shut down gracefully and stop accepting new connections. Once the Pod has been removed from the Endpoints object, the load balancer in front of it routes traffic to the remaining (new) Pods.

This is also the main cause of the availability gap in our application: the Pod is shut down by the termination signal before the load balancer notices the change and updates its configuration (for nginx-ingress, this means modifying the upstream list and reloading). This reconfiguration happens asynchronously, so the correct ordering is not guaranteed, and a few requests may still be routed to the already terminated Pod.

How to achieve zero downtime?

First, the prerequisite for achieving this goal is that our container handles the termination signal correctly and shuts down gracefully on SIGTERM. The next step is to add a readiness probe that checks whether our application is ready to handle traffic.
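For example, a readiness probe on the container might look like the following sketch (the path, port, and timings are assumptions and should match your application's actual health endpoint):

        # Sketch of a readiness probe at the container level (values assumed);
        # the Pod is only added to Endpoints once this probe succeeds.
        readinessProbe:
          httpGet:
            path: /              # assumed health-check path
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          failureThreshold: 3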

The readiness probe is just the starting point for a smooth rolling update. To stop the Pod from shutting down before the load balancer is reconfigured, we need the preStop lifecycle hook, which is called before the container terminates, to block and wait until the reconfiguration has had time to happen.

The lifecycle hook is synchronous, so it must complete before the final termination signal (SIGTERM) is sent to the container and stops the application process. While the hook runs, Kubernetes removes the Pod from the Endpoints object, and therefore from our load balancer. Essentially, the lifecycle hook waits long enough to ensure that the load balancer has been reconfigured before the application stops.

We use preStop to set a grace period of 20 seconds: the Pod sleeps for 20 seconds before it is destroyed. This gives the Endpoints controller and kube-proxy time to update the Endpoints objects and forwarding rules. Although the Pod is in the Terminating state during this period, even if a request is forwarded to it before the forwarding rules are fully updated, it can still be processed normally, because the Pod is only sleeping and has not yet been destroyed.

        lifecycle:
          preStop:
            exec:
              command: ["/bin/bash", "-c", "sleep 20"]
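Putting it together, the relevant part of the Deployment's Pod template might look like the sketch below (the image name and probe values are assumptions). Note that the preStop sleep counts against terminationGracePeriodSeconds (30 seconds by default), so the grace period should be comfortably longer than the sleep to leave the application time to shut down after receiving SIGTERM:

# Sketch of the Pod template portion of the Deployment (names/values assumed).
    spec:
      terminationGracePeriodSeconds: 30      # default; must exceed the 20s preStop sleep
      containers:
        - name: nginx-app
          image: nginx:latest                # assumed image
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /                        # assumed health-check path
              port: 80
          lifecycle:
            preStop:
              exec:
                command: ["/bin/bash", "-c", "sleep 20"]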

Testing the preStop hook

Add the above rule to the Deployment; the preStop hook makes the Pod wait 20 seconds before it is destroyed, so we can observe the effect.

While the Pod is in the Terminating state, its IP address can still be reached with curl; the Pod is not destroyed until the 20 seconds have elapsed.

# restart the deploy to see the effect
[root@dm01 ~]# kubectl rollout restart deploy nginx-app
deployment.apps/nginx-app restarted
[root@dm01 ~]# kubectl get po -o wide
NAME                         READY   STATUS        RESTARTS   AGE     IP             NODE   NOMINATED NODE   READINESS GATES
hpa-demo-6c655c5458-6hdgd    1/1     Running       10         53d     10.244.1.62    dm02   <none>           <none>
nginx-6c79994d64-zf746       1/1     Running       0          59m     10.244.0.208   dm01   <none>           <none>
nginx-app-69d45d4d89-8qhk5   1/1     Terminating   0          26m     10.244.0.215   dm01   <none>           <none>
nginx-app-d8585cb66-mwv9h    1/1     Running       0          7s      10.244.0.216   dm01   <none>           <none>
nginx-test-5f4tf             1/1     Running       6          6d17h   10.244.0.200   dm01   <none>           <none>
nginx-test-jkqcj             1/1     Running       3          6d17h   10.244.2.85    dm03   <none>           <none>
nginx-test-n98lc             1/1     Running       4          6d17h   10.244.1.56    dm02   <none>           <none>
# the pod is still accessible in the Terminating state
[root@dm01 ~]# curl 10.244.0.215
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>