An Overview of the Kubernetes Knowledge Map

Kubernetes (K8S), currently the best-known container orchestration tool, is the "operating system" of the Cloud Native era. Being familiar with and able to use Kubernetes is an essential skill for development, operations, product, and other roles. This article outlines a K8S knowledge map of roughly 680 knowledge points covering development history, installation and operation, resources, storage, networking, security, management, future prospects, and other aspects, aiming to help you understand Kubernetes better and lay a solid foundation for business, operations, and innovation.

The full version link: https://www.processon.com/view/link/60dfeb3e1e085359888fd3e3

Abbreviations

PV: Persistent Volume


PVC: Persistent Volume Claim


SA: Service Account


HA: High Availability


HPA: Horizontal Pod Autoscaler


VPA: Vertical Pod Autoscaler


PSP: Pod Security Policy


PDB: Pod Disruption Budget


CRD: Custom Resource Definition


CSI: Container Storage Interface


CNI: Container Network Interface


CRI: Container Runtime Interface


OCI: Open Container Initiative


CNCF: Cloud Native Computing Foundation

1. Development History

As Docker gained a foothold in container technology, it repeatedly challenged the interests of other players, including Google, RedHat, CoreOS, and Microsoft. In Docker's early days, Google made a big move: it open-sourced its internal, production-proven container tool LMCTFY (Let Me Contain That For You). But against Docker's strong rise it gained little traction, and Docker held absolute authority and the dominant voice in the container industry. Google offered Docker a hefty price, and Docker's technical chief and co-founder Solomon Hykes, presumably something of an idealist, brushed the offer aside.

Google was so frustrated that it teamed up with other players in the open-source infrastructure space, including RedHat and CoreOS, to lead a new foundation: the CNCF (Cloud Native Computing Foundation).

Google's core, lowest-level cluster management technology is Borg; the open source project derived from it, Kubernetes, was entrusted to the CNCF.

In October 2017, Docker unexpectedly announced that it would build Kubernetes into its flagship Docker Enterprise Edition, marking the end of the nearly two-year "Orchestration War".

Twitter stopped using Apache Mesos after May 2019, and Aliyun stopped supporting Docker Swarm after July 2019.

Kubernetes promotes the "democratization" of architecture throughout the community: at every level, from the API to the container runtime, the Kubernetes project exposes extensible plug-in mechanisms to developers and encourages users to participate in every phase of the project through code. In this atmosphere of encouraging secondary innovation, the Kubernetes community has grown at an unprecedented pace since 2016. More importantly, unlike the earlier PaaS approaches that were limited to "package and distribute", this container community is a "hundred schools of thought contending" centered entirely on the Kubernetes project.

2. Architecture

Kubernetes follows a fairly traditional client/server architecture. Clients communicate with a Kubernetes cluster either through a RESTful interface or directly using kubectl; in practice there is little difference between the two, since the latter simply wraps the RESTful API that Kubernetes provides. Each Kubernetes cluster is composed of a group of Master nodes and a set of Worker nodes, where the Master nodes are mainly responsible for storing cluster state and for allocating and scheduling resources to Kubernetes objects.
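As a minimal sketch of that client/server interaction (assuming a reachable cluster, a kubeconfig at the default ~/.kube/config location, and client-go available in go.mod), the following Go program lists the Pods in the kube-system namespace; under the hood every call is just a REST request to kube-apiserver:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load credentials from the local kubeconfig (~/.kube/config).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// This call is a plain HTTPS request to kube-apiserver,
	// roughly: GET /api/v1/namespaces/kube-system/pods
	pods, err := clientset.CoreV1().Pods("kube-system").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		fmt.Printf("%s\t%s\n", p.Name, p.Status.Phase)
	}
}
```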

Master (Control Plane)

The Master node manages the state of the cluster: it mainly receives client requests, schedules containers for execution, and runs control loops that drive the cluster's actual state toward the desired state. The Master is composed of four components:

kube-apiserver: Responsible for handling requests from clients (client-go, kubectl). Its main role is to expose a RESTful interface, covering both read requests that inspect cluster state and write requests that change it, and it is the only component that communicates with the ETCD cluster.

ETCD: A key-value store with both consistency and high availability, used as the backing database for all cluster data in Kubernetes.

kube-scheduler: The control-plane component that watches for newly created Pods that have no Node assigned and selects a Node for them to run on. Factors considered in scheduling decisions include the resource requirements of individual Pods and Pod sets, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, interference between workloads, and deadlines.

kube-controller-manager: The component that runs controllers on the Master node. Logically, each controller is a separate process, but to reduce complexity they are all compiled into the same executable and run in a single process. List-Watch events trigger the reconciliation loop of the corresponding controller; the controllers include the Node Controller, Replication Controller, Endpoint Controller, ServiceAccount/Token Controller, and others.
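The List-Watch pattern can be illustrated with a client-go shared informer. This is only a sketch of the event-driven mechanism controllers rely on, not the controller-manager's actual code; the kubeconfig path and the 30-second resync period are assumptions:

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(cfg)

	// A shared informer LISTs Pods once, then WATCHes for changes and
	// delivers Add/Update/Delete events; controllers reconcile on these events.
	factory := informers.NewSharedInformerFactory(clientset, 30*time.Second)
	podInformer := factory.Core().V1().Pods().Informer()
	podInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*corev1.Pod)
			fmt.Println("ADD", pod.Namespace+"/"+pod.Name)
		},
		DeleteFunc: func(obj interface{}) {
			fmt.Println("DELETE event received")
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	cache.WaitForCacheSync(stop, podInformer.HasSynced)
	select {} // block forever; a real controller would run reconcile workers here
}
```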

Node (Worker)

kubelet: The agent that performs operations on the worker Node. It is responsible for concrete container life-cycle management, managing containers according to the Pod specifications obtained from the API Server (which are backed by ETCD), and reporting Pod running status.

kube-proxy: A simple network access proxy that also acts as a load balancer. It is responsible for forwarding requests addressed to a Service to the Pods on the worker Node that carry the matching Labels. In essence, kube-proxy implements this Service-to-Pod mapping by manipulating firewall rules (iptables or IPVS).

Kubernetes supports multiple container runtimes: Docker, containerd, CRI-O, rktlet, and any other implementation of the Kubernetes CRI (Container Runtime Interface).

3. Installation and Operation

K8s can be installed by manually downloading the binary packages (https://github.com/kubernetes/kubernetes/releases), or the cluster environment can be set up with third-party tools; the latter is recommended.

Commonly used third-party tools at present include Minikube, kubeadm, Kind, and K3s.

Minikube is suited to building a lightweight, single-node local cluster and is a good choice for beginners; kubeadm is suited to building a complete multi-node Master/Node cluster; Kind's distinguishing feature is that it deploys K8s inside Docker containers; and K3s is suited to building clusters on lightweight devices such as IoT hardware.

4. Resources

In K8S, resource objects can be divided into two categories: Workloads and Controllers.

Workloads include Pod, Deployment, StatefulSet, Service, ConfigMap, DaemonSet, Job/CronJob, HPA, Namespace, PV/PVC, Node, and so on; they mainly classify the various kinds of resources according to requirements and characteristics.
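As an illustration of a workload object, the sketch below builds a typed Deployment with the Kubernetes Go API types and prints it as a manifest (names such as demo-web and nginx:1.21 are hypothetical; sigs.k8s.io/yaml is assumed to be available in go.mod). The Deployment asks its controller to keep three replicas of the Pod template running:

```go
package main

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	replicas := int32(3)
	labels := map[string]string{"app": "demo-web"} // hypothetical labels

	deploy := &appsv1.Deployment{
		TypeMeta:   metav1.TypeMeta{APIVersion: "apps/v1", Kind: "Deployment"},
		ObjectMeta: metav1.ObjectMeta{Name: "demo-web", Namespace: "default"},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "web",
						Image: "nginx:1.21",
					}},
				},
			},
		},
	}

	// Print the manifest; `kubectl apply -f` of this output (or a typed
	// Create() call through client-go) would POST it to kube-apiserver.
	out, err := yaml.Marshal(deploy)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```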

Controllers mainly include the Node Controller, Replication Controller, Namespace Controller, ServiceAccount Controller, Service Controller, Endpoint Controller, and so on; they perform automatic resource control, reconciling the actual state of each kind of resource toward its desired state.

All resources are created, read, updated, and deleted through REST API calls (GET/POST/PATCH/PUT/DELETE) to kube-apiserver, and every call must pass the interface's authentication, authorization, and admission control. The kubectl command is the officially provided client command-line tool, which encapsulates these REST API calls to kube-apiserver. All resources are persisted through kube-apiserver into the ETCD backend store. Therefore, in production practice, both kube-apiserver and ETCD need to be deployed for high availability to prevent a single point of failure.

5. Storage

Data generated by containers in a Pod often needs persistent storage, especially for StatefulSet services. It can be kept on local or network storage via PV/PVC, so that the containerized application can still use the data after the Pod is rebuilt. If persistent storage is not needed, an Ephemeral Volume (EmptyDir) can be used: a temporary volume that is purged along with the Pod's life cycle.

A PV (PersistentVolume) is an abstraction of the underlying shared storage that defines the shared storage as a "resource". A PV is either created manually by the administrator or dynamically provisioned through a plug-in mechanism that connects to a concrete CSI (Container Storage Interface) implementation, such as GlusterFS, iSCSI, GCE, or AWS public cloud storage.

A PVC (PersistentVolumeClaim) is a "request" for storage resources: just as a Pod "consumes" Node resources, a PVC "consumes" a PV. The PVC lives in the same namespace as the Pods that use it, and binding (Bound) is completed by matching a specific SC (StorageClass) or Label Selector. A reclaim policy can be set so that when the PVC is deleted the bound PV resource is automatically deleted as well, reclaiming the storage in time.
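A minimal PVC sketch follows; the claim name, namespace, size, and the StorageClass named standard are all hypothetical and depend on the cluster:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	storageClass := "standard" // assumed StorageClass name

	pvc := &corev1.PersistentVolumeClaim{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "PersistentVolumeClaim"},
		ObjectMeta: metav1.ObjectMeta{Name: "demo-data", Namespace: "default"},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes:      []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			StorageClassName: &storageClass,
			// NOTE: on recent k8s.io/api versions this field uses the
			// VolumeResourceRequirements type instead of ResourceRequirements.
			Resources: corev1.ResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse("10Gi"),
				},
			},
		},
	}

	out, _ := yaml.Marshal(pvc)
	fmt.Println(string(out))
	// A Pod would then reference the claim via
	// spec.volumes[].persistentVolumeClaim.claimName.
}
```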

6. Network

A basic principle of the K8S network model is that every Pod has an independent IP address (the IP-per-Pod model), and it is assumed that all Pods sit in a flat network space in which they can reach each other directly. So whether or not containers run on the same Node, they are required to be directly accessible via each other's IPs.

In fact, IP addresses in K8S are allocated per Pod, and the containers inside a Pod share one network protocol stack. Docker's native access model of port mapping, by contrast, introduces the complexity of port management, and the IP address and port that a visitor sees differ from what the service provider actually binds to. Because NAT changes the source/destination address, it is hard for the service itself to know the real IP and port it exposes to the outside.

Therefore, K8S has the following requirements for the cluster network:

  • All containers can communicate with each other without using NAT.
  • All nodes can communicate with all containers without NAT.
  • The IP a container sees for itself is the same IP that visitors see for it.

K8S abstracts a group of containerized applications with the same function as a Service and gives them a unified access address. Service communication inside and outside the cluster can be realized through ClusterIP, NodePort, and LoadBalancer, and network policies for Ingress (inbound) and Egress (outbound) traffic can be used to control the request flow into and out of containers.
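As a sketch of the Service abstraction, the program below defines a NodePort Service that selects Pods labelled app=demo-web; all names and port numbers are hypothetical:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"sigs.k8s.io/yaml"
)

func main() {
	svc := &corev1.Service{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Service"},
		ObjectMeta: metav1.ObjectMeta{Name: "demo-web", Namespace: "default"},
		Spec: corev1.ServiceSpec{
			// ClusterIP (the default) exposes the Service inside the cluster only;
			// NodePort additionally opens a port on every Node;
			// LoadBalancer asks the cloud provider for an external load balancer.
			Type:     corev1.ServiceTypeNodePort,
			Selector: map[string]string{"app": "demo-web"}, // matches Pods by Label
			Ports: []corev1.ServicePort{{
				Port:       80,                   // Service (ClusterIP) port
				TargetPort: intstr.FromInt(8080), // container port inside the Pods
				NodePort:   30080,                // port opened on each Node (30000-32767)
			}},
		},
	}

	out, _ := yaml.Marshal(svc)
	fmt.Println(string(out))
}
```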

7. Scheduling

kube-scheduler performs an important "bridging" function in the K8S cluster. Bridging the front means it receives Pods created via the Controller Manager and selects a suitable Node for each of them; bridging the back means that, once a Node is selected, the kubelet process on that Node takes over and is responsible for the "second half" of the Pod's life cycle.

The default scheduling process provided by K8S is divided into the following two steps:

  • Pre-selection (Predicates): various pre-selection policies (XXX Predicates) filter out the candidate Nodes that meet the Pod's requirements;
  • Optimal Node selection (Priorities): on the basis of the first step, XXX Priority functions score each candidate Node, and the Node with the highest score is chosen as the target Node.

In addition, Pods can be given affinity and anti-affinity rules to indicate, in either a preferred or a mandatory way, that they should be deployed onto a related set of Nodes. Taints and tolerations work the other way around, allowing a Node to repel certain Pods. Taints work together with tolerations to ensure that Pods are not scheduled onto unsuitable Nodes: a single Node can carry multiple taints, and a Node will not accept Pods that cannot tolerate its taints. Tolerations are a property of the Pod that allow (but do not force) the Pod to be scheduled onto Nodes with matching taints.

Tip: a Taint is an attribute of a Node, while Affinity (and Tolerations) are attributes of a Pod.
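The following sketch combines both mechanisms in one Pod spec: a toleration for a hypothetical dedicated=gpu:NoSchedule taint and a required node-affinity rule for Nodes labelled accelerator=nvidia (all keys, values, and names are made up for illustration):

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	pod := &corev1.Pod{
		TypeMeta:   metav1.TypeMeta{APIVersion: "v1", Kind: "Pod"},
		ObjectMeta: metav1.ObjectMeta{Name: "gpu-task", Namespace: "default"},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:    "task",
				Image:   "busybox:1.35",
				Command: []string{"sleep", "3600"},
			}},
			// Tolerate a taint an admin might set with:
			//   kubectl taint nodes <node> dedicated=gpu:NoSchedule
			Tolerations: []corev1.Toleration{{
				Key:      "dedicated",
				Operator: corev1.TolerationOpEqual,
				Value:    "gpu",
				Effect:   corev1.TaintEffectNoSchedule,
			}},
			// Required node affinity: only schedule onto Nodes labelled accelerator=nvidia.
			Affinity: &corev1.Affinity{
				NodeAffinity: &corev1.NodeAffinity{
					RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
						NodeSelectorTerms: []corev1.NodeSelectorTerm{{
							MatchExpressions: []corev1.NodeSelectorRequirement{{
								Key:      "accelerator",
								Operator: corev1.NodeSelectorOpIn,
								Values:   []string{"nvidia"},
							}},
						}},
					},
				},
			},
		},
	}

	out, _ := yaml.Marshal(pod)
	fmt.Println(string(out))
}
```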

8. Security

K8S implements cluster security control through a series of mechanisms, including Authentication, Authorization, Admission Control, Secrets, Service Accounts, and so on. All access to and modification of resources in K8S goes through the REST API of the K8S API Server, and there are three ways of performing Authentication:

  • HTTPS client certificates (CA + SSL)
  • HTTP Bearer Token
  • HTTP Basic authentication

Authorization modes include ABAC (Attribute-Based Access Control), RBAC (Role-Based Access Control), Webhook, and others, with RBAC being the most widely used in production practice. Role-based control mainly involves Role, ClusterRole, RoleBinding, and ClusterRoleBinding: Role and RoleBinding control resources within a given namespace, while ClusterRole and ClusterRoleBinding control resources at the cluster level or in specified namespaces.
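A minimal RBAC sketch, assuming a namespace demo and a ServiceAccount app-sa (both hypothetical): a Role that grants read-only access to Pods, and a RoleBinding that grants it to the ServiceAccount:

```go
package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// A namespaced Role that only allows read access to Pods.
	role := &rbacv1.Role{
		TypeMeta:   metav1.TypeMeta{APIVersion: "rbac.authorization.k8s.io/v1", Kind: "Role"},
		ObjectMeta: metav1.ObjectMeta{Name: "pod-reader", Namespace: "demo"},
		Rules: []rbacv1.PolicyRule{{
			APIGroups: []string{""}, // "" means the core API group
			Resources: []string{"pods"},
			Verbs:     []string{"get", "list", "watch"},
		}},
	}

	// Bind the Role to the (hypothetical) ServiceAccount app-sa in the same namespace.
	binding := &rbacv1.RoleBinding{
		TypeMeta:   metav1.TypeMeta{APIVersion: "rbac.authorization.k8s.io/v1", Kind: "RoleBinding"},
		ObjectMeta: metav1.ObjectMeta{Name: "pod-reader-binding", Namespace: "demo"},
		Subjects: []rbacv1.Subject{{
			Kind:      "ServiceAccount",
			Name:      "app-sa",
			Namespace: "demo",
		}},
		RoleRef: rbacv1.RoleRef{
			APIGroup: "rbac.authorization.k8s.io",
			Kind:     "Role",
			Name:     "pod-reader",
		},
	}

	for _, obj := range []interface{}{role, binding} {
		out, _ := yaml.Marshal(obj)
		fmt.Println(string(out) + "---")
	}
}
```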

After the first two levels, authentication and authorization, there is also an admission control chain governed by Admission Control, made up of admission plug-ins for resources, Pods, and so on. Only a request that passes every enabled plug-in in the chain gets a successful response from the API Server.

Kubernetes components and Pods communicate securely with the API Server through an SA (Service Account); by default, the SA named default in the namespace is used. Concretely, the SA's token Secret is mounted into a directory inside the container, and the token is carried with each request to the API Server for verification.

In addition, K8S provides the PSP (PodSecurityPolicy) mechanism for more fine-grained security-policy control over Pods and containers. It mainly covers control of hosts, users, groups, privileges, and Linux capabilities at different levels. The securityContext of a Pod/Container must match the corresponding policy for the Pod to be created successfully.

9. Extensions

As K8S has developed, the built-in resource objects such as Pod, RC, Service, Deployment, and ConfigMap can no longer satisfy users' increasingly diverse business and operational needs, so the API needs to be extensible. K8S currently provides two mechanisms for extending the API:

  • CRD extension: reuses the existing K8S API Server; the user needs to provide a CRD definition plus a corresponding CRD controller (see the sketch after this list);
  • API Aggregation Layer: requires the user to write an additional API server, which allows more fine-grained control over the resources;
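As a sketch of the CRD route (the group stable.example.com and kind CronTab are the usual documentation-style placeholders, not anything this article defines), the program below builds a CustomResourceDefinition with the apiextensions Go types and prints it as a manifest:

```go
package main

import (
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	crd := &apiextensionsv1.CustomResourceDefinition{
		TypeMeta:   metav1.TypeMeta{APIVersion: "apiextensions.k8s.io/v1", Kind: "CustomResourceDefinition"},
		ObjectMeta: metav1.ObjectMeta{Name: "crontabs.stable.example.com"},
		Spec: apiextensionsv1.CustomResourceDefinitionSpec{
			Group: "stable.example.com",
			Scope: apiextensionsv1.NamespaceScoped,
			Names: apiextensionsv1.CustomResourceDefinitionNames{
				Plural:   "crontabs",
				Singular: "crontab",
				Kind:     "CronTab",
			},
			Versions: []apiextensionsv1.CustomResourceDefinitionVersion{{
				Name:    "v1",
				Served:  true,
				Storage: true,
				Schema: &apiextensionsv1.CustomResourceValidation{
					OpenAPIV3Schema: &apiextensionsv1.JSONSchemaProps{
						Type: "object",
						Properties: map[string]apiextensionsv1.JSONSchemaProps{
							"spec": {
								Type: "object",
								Properties: map[string]apiextensionsv1.JSONSchemaProps{
									"cronSpec": {Type: "string"},
									"replicas": {Type: "integer"},
								},
							},
						},
					},
				},
			}},
		},
	}

	// Applying this manifest registers /apis/stable.example.com/v1/.../crontabs
	// on the existing kube-apiserver; a custom controller then reconciles CronTab objects.
	out, _ := yaml.Marshal(crd)
	fmt.Println(string(out))
}
```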

In fact, API aggregation mainly works by having kube-apiserver proxy requests for resources on different paths and forward them to the user-defined API service handlers.

In addition, K8S adopts an out-of-tree plug-in mechanism that allows infrastructure components to be extended without touching the trunk code base, including CSI, CNI, CRI, Device Plugins, and so on. The underlying interfaces are abstracted as plug-ins, so users only need to implement the specific interface to connect their own infrastructure capabilities to K8S.

10. Management

K8S provides cluster management mechanisms along several dimensions, including Cordon/Uncordon, Drain, PDB eviction protection, Requests/Limits, Metrics Server metric collection, Log, Audit, and more. Cluster administrators can also use the mechanisms described earlier, such as affinity, taints/tolerations, labels, and annotations, to meet the cluster's diverse management needs.
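As one example of these mechanisms, the sketch below defines a PDB that keeps at least two replicas of a hypothetical app=demo-web workload available during voluntary disruptions such as a node drain (`kubectl drain <node>`):

```go
package main

import (
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"sigs.k8s.io/yaml"
)

func main() {
	minAvailable := intstr.FromInt(2)

	pdb := &policyv1.PodDisruptionBudget{
		TypeMeta:   metav1.TypeMeta{APIVersion: "policy/v1", Kind: "PodDisruptionBudget"},
		ObjectMeta: metav1.ObjectMeta{Name: "demo-web-pdb", Namespace: "default"},
		Spec: policyv1.PodDisruptionBudgetSpec{
			// At least 2 Pods matching the selector must stay up during
			// voluntary disruptions such as node drains.
			MinAvailable: &minAvailable,
			Selector:     &metav1.LabelSelector{MatchLabels: map[string]string{"app": "demo-web"}},
		},
	}

	out, _ := yaml.Marshal(pdb)
	fmt.Println(string(out))
}
```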

In addition, cluster operation needs to take high-availability (HA) deployment into account: ETCD, the Master components, business containers, and other layers can all be deployed with multiple replicas to ensure stable operation of the cluster.

11. Tools

K8S provides the client tool kubectl, which integrates almost all of the APIs that the API Server can handle. For concrete usage, refer to the common commands listed in the knowledge map (see the full-version link above); for the complete command set, refer to the official documentation.

For complex business scenarios, the official recommendation is to use the Kustomize toolkit to manage YAML files flexibly. In newer K8S versions, kubectl already ships with Kustomize built in, so no separate installation is needed.

In addition, K8S officially recommends the package manager Helm, which packages applications with their different versions, environments, and dependencies as Charts, enabling fast CI/CD deployment, rollback, and canary (grayscale) releases.

12. Future Prospects

Because the K8S project evolves quickly, the APIs, resources, fields, and tools covered in this article may already be deprecated; please refer to the latest official documentation.

Kubernetes has already penetrated the Internet, AI, blockchain, finance, and other industries. It can be expected that more and more industries will adopt Kubernetes in the future, and that each industry's practice will become ever more in-depth.

At present, the K8S community is advancing support for Windows containers, GPUs, VPA vertical scaling, Cluster Federation multi-cluster management, edge computing, machine learning, the Internet of Things, and more, to drive the adoption of additional cloud-native products and to maximize the capability and value of the cloud.

PS: For more articles, follow the public account "Straw Life".

References

  1. Kubernetes official documentation
  2. The Definitive Guide to Kubernetes: From Docker to Kubernetes Practice (4th Edition)
  3. https://www.padok.fr/en/blog/minikube-kubeadm-kind-k3s
  4. https://segmentfault.com/a/1190000039790009
  5. https://www.cnblogs.com/lioa/p/12624601.html
  6. https://blog.csdn.net/dkfajsldfsdfsd/article/details/80990736