In my experience, most people deploy applications to Kubernetes (with Helm or hand-written YAML) and assume they will run stably from then on. In reality there are a number of pitfalls, and I'd like to list them here to help you understand what you need to be aware of before launching your application on Kubernetes.

Introduction to Kubernetes scheduling

The scheduler uses Kubernetes' watch mechanism to discover Pods that have been newly created in the cluster but not yet assigned to a node, and schedules each unscheduled Pod it finds onto an appropriate node. kube-scheduler is the cluster's default scheduler: for every newly created or otherwise unscheduled Pod, it selects an optimal node to run the Pod on. However, each container in a Pod has its own resource requirements, and so does the Pod as a whole, so the scheduler must first filter the nodes in the cluster against these resource requirements before a Pod can be placed on one.

In a cluster, all nodes that satisfy a Pod's scheduling requirements are called schedulable nodes. If no node satisfies the Pod's resource requests, the Pod remains unscheduled until the scheduler can find a suitable node.

Factors to consider when making scheduling decisions include: individual and overall resource requests, hardware/software/policy constraints, affinity and anti-affinity requirements, data locality, interference between workloads, and so on. For more information about scheduling, refer to the official documentation.
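As a quick illustration of how such constraints are expressed, a Pod can declare them directly in its spec. The fragment below is a minimal sketch (the disktype: ssd label is purely hypothetical) that uses nodeSelector to restrict the Pod to nodes carrying that label:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeSelector:
    disktype: ssd        # hypothetical node label; only nodes labelled disktype=ssd qualify
  containers:
  - name: nginx-demo
    image: nginx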

Pod Requests and Limits

Consider a simple example (only part of the YAML is shown):

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx-demo
    image: nginx
    resources:
      limits:
        memory: "1000Mi"
        cpu: 100m
      requests:
        memory: "100Mi"
        cpu: 100m

By default, if we do not include a resources field in the deployment file when we create a service, the Kubernetes cluster uses its default policy and imposes no resource restrictions on the Pod. This means the Pod can freely consume the memory and CPU of the node it runs on. But that creates a problem: resource contention. For example, suppose a node has 8 GB of memory and two Pods running on it. At first, both Pods need only 2 GB of memory to run, which is fine. But if one of them grows to 7 GB because of a memory leak or a sudden spike in the process, the node's 8 GB is clearly no longer enough, and services become slow or unavailable. So, in general, when we deploy a service we need to limit the Pod's resources to avoid this kind of problem.

As shown in the sample file, you need to add a resources section:

requests: the resources the service needs in order to run. In this example, 100Mi of memory and 100m of CPU.
limits: the maximum resources the service is allowed to use. In this example, 1000Mi of memory and 100m of CPU.

In other words, requests tell the scheduler how much to reserve for the Pod, while limits cap what its containers are allowed to consume at runtime.
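If you do not want to rely on every deployment file remembering to set these values, a LimitRange can apply default requests and limits to all containers in a namespace. The following is a minimal sketch with illustrative numbers:

apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container omits requests
      memory: "100Mi"
      cpu: 100m
    default:             # applied when a container omits limits
      memory: "1000Mi"
      cpu: 500m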

Liveness and Readiness Probes

This is another hot topic frequently discussed in the Kubernetes community. It is important to understand liveness and readiness probes because they provide a mechanism for running fault-tolerant software and minimizing downtime. However, if configured incorrectly, they can have a serious performance impact on your application. Here is an overview of the two probes and how to reason about them:

Liveness probe: detects whether the container is still running. If the liveness probe fails, the kubelet kills the container, and the container is then subject to its restart policy. If the container does not provide a liveness probe, the default state is Success.

Because liveness probes run frequently, keep them as simple as possible. For example, if you set one to run once per second, that is one additional request every second, so you need to account for the extra resources that request consumes. Typically, we expose a health-check endpoint for liveness; a 200 response code indicates that the process has started and can handle requests.

Readiness probe: detects whether the container is ready to handle requests. If the readiness probe fails, the endpoints controller removes the Pod's IP address from the Endpoints of all Services that match the Pod.

A readiness probe should check more than a liveness probe, because it indicates that the whole application is running and ready to receive requests. For some applications, requests cannot be accepted until, say, a record has been returned from the database. With well-thought-out readiness probes we can achieve higher availability and zero-downtime deployments.

Liveness and readiness probes support the same three detection methods:

  1. Exec command: the kubelet runs a command inside the container. If the command exits with a return value of zero, Kubernetes considers the probe successful; a non-zero return value means the probe failed.
  2. HTTP request: send an HTTP GET request to the container; any return code greater than or equal to 200 and less than 400 indicates success, and any other return code indicates failure.
  3. TCP socket: attempt a TCP connection to the specified port; if the connection is established, the probe succeeds, otherwise it fails.
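The first two styles might look roughly like the fragment below (a minimal sketch; the /healthz path and the command are illustrative, not from the example in this article):

    livenessProbe:
      exec:
        command: ["cat", "/tmp/healthy"]   # succeeds (exit 0) only while the file exists
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /healthz                     # hypothetical health-check endpoint
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10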

As an example, take the common TCP probe:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx-demo
    image: nginx
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
The livenessProbe section defines how the liveness probe is performed:

  1. The probe method is to connect to port 80 of nginx via a tcpSocket. If the connection succeeds, Kubernetes considers the liveness probe successful; otherwise it fails.
  2. initialDelaySeconds: 10 specifies that liveness probing starts 10 seconds after the container starts. Set this based on how long the application needs to start; for example, if the application takes 30 seconds to start properly, initialDelaySeconds should be greater than 30.
  3. periodSeconds: 10 means a liveness probe is performed every 10 seconds. Kubernetes kills and restarts the container if three consecutive liveness probes fail.

The readinessProbe is configured the same way, but readiness behaves differently:

  1. When the Pod is created, its READY state is unavailable.
  2. After 20 seconds (initialDelaySeconds + periodSeconds), readiness is probed for the first time; if it returns successfully, READY is set to available.
  3. If three consecutive readiness probes fail, READY is set back to unavailable.

Set the default network policy for Pod

Kubernetes uses a “flat” network topology in which, by default, all Pods can communicate with each other directly. In some cases this is undesirable and often unnecessary, and it carries potential security risks: for example, if a vulnerable application is exploited, the attacker gains full access to send traffic to every Pod on the network. As in many areas of security, a least-access policy applies here; ideally, network policies are created that explicitly specify which container-to-container connections are allowed.

For example, here is a simple policy that rejects all incoming traffic for a particular namespace

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-ingress-flow
spec:
  podSelector: {}
  policyTypes:
    - Ingress

With this policy in place, all inbound traffic to Pods in the namespace is dropped unless another policy explicitly allows it.
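On top of such a default-deny policy, you then add policies that explicitly allow the traffic you expect. As a minimal sketch (the app: nginx and app: frontend labels are made up for illustration), the following allows ingress to the nginx Pods only from frontend Pods, and only on port 80:

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-nginx
spec:
  podSelector:
    matchLabels:
      app: nginx              # hypothetical label on the Pods being protected
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # hypothetical label on the allowed clients
      ports:
        - protocol: TCP
          port: 80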

Custom behavior via Hooks and init containers

One of our main goals with Kubernetes is to provide deployments with as little downtime as possible out of the box. This is difficult because applications shut themselves down and clean up the resources they hold in many different ways. One application we had particular difficulty with was Nginx. We noticed that when we started a rolling deployment of these Pods, active connections were dropped before they could terminate successfully. After extensive research, it turned out that Kubernetes was not waiting for Nginx to drain its connections before terminating the Pod. Using a preStop hook, we were able to inject this functionality and achieve zero downtime.

In other words, during a normal rolling upgrade of Nginx, Kubernetes does not wait for Nginx to finish its connections before stopping the Pod. The stopped Nginx therefore does not close all of its connections cleanly, which is not acceptable. So we need to use a preStop hook to solve this kind of problem.

A lifecycle section can be added to the deployment file:

lifecycle:
  preStop:
    exec:
      command: ["/usr/local/bin/nginx-killer.sh"]

nginx-killer.sh

#!/bin/bash
sleep 3
# Read the nginx master PID, then ask nginx to shut down gracefully.
PID=$(cat /run/nginx.pid)
nginx -s quit
# Wait until the master process has actually exited.
while [ -d /proc/$PID ]; do
    echo "Waiting while shutting down nginx..."
    sleep 10
done

With this in place, before shutting down the Pod, Kubernetes executes the nginx-killer.sh script to shut down nginx in the way we defined.

An init container is a container used for initialization. There can be one or more of them; if there are several, they are executed in the order in which they are defined. The main container is not started until all init containers have finished running.

      initContainers:
      - name: init
        image: busybox
        command: ["chmod", "-R", "777", "/var/www/html"]
        imagePullPolicy: Always
        volumeMounts:
        - name: volume
          mountPath: /var/www/html
      containers:
      - name: nginx-demo
        image: nginx
        ports:
        - containerPort: 80
          name: port
        volumeMounts:
        - name: volume
          mountPath: /var/www/html
      volumes:
      - name: volume
        emptyDir: {}          # simple backing volume shared by both containers

In this example, before nginx starts, the init container changes the permissions of /var/www/html (which is mounted by both containers) to 777. This is just a simple example; init containers can do much more, such as preparing initial configuration…
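Another common pattern, sketched below under the assumption that a Service named mysql exists in the same namespace, is an init container that blocks the main container from starting until a dependency can be resolved:

      initContainers:
      - name: wait-for-mysql
        image: busybox
        # Keep retrying DNS resolution of the (hypothetical) mysql Service;
        # the main containers only start once this loop exits.
        command: ['sh', '-c', 'until nslookup mysql; do echo waiting for mysql; sleep 2; done']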

Kernel Tuning

Finally, I have saved the more advanced technique for last, haha. Kubernetes is a very flexible platform designed to let you run your services the way you see fit. Occasionally we have a high-performance service with strict resource requirements, such as the common Redis, which prints the following message after startup:

WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

This requires us to modify system kernel parameters. Fortunately, Kubernetes allows us to run a privileged container that can modify kernel parameters that apply only to the specific running Pod. Here is the example we use to modify the /proc/sys/net/core/somaxconn parameter.

initContainers:
- name: sysctl
  image: alpine:3.10
  securityContext:
    privileged: true
  command: ['sh', '-c', "echo 511 > /proc/sys/net/core/somaxconn"]
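For namespaced sysctls like net.core.somaxconn there is also a gentler alternative to a privileged init container: setting the value through the Pod's securityContext. Note that somaxconn is not in Kubernetes' default "safe" set, so the kubelet must be started with it listed in --allowed-unsafe-sysctls; a minimal sketch:

apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  securityContext:
    sysctls:
    - name: net.core.somaxconn   # requires the kubelet to allow this sysctl
      value: "511"
  containers:
  - name: redis
    image: redis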

Conclusion

While Kubernetes provides an out-of-the-box solution, there are some key steps you need to take to ensure stable application performance. Before the application goes live, be sure to test it several times, observe key metrics, and adjust it in real time. Before we deploy our service to the Kubernetes cluster, we can ask ourselves a few questions:

  • How many resources does our program need, such as memory, CPU, etc.?
  • What is the average and peak traffic of the service?
  • How quickly do we want the service to scale out, and how long does it take for a new Pod to start accepting traffic?
  • Does our Pod shut down gracefully? Can it do so without affecting the online service?
  • How can we ensure that a problem in our service does not affect other services or cause a large-scale outage?
  • Do our Pods have more permissions than they need? Are they secure?

Finally finished, whoo-hoo! That was really, really hard!