
Before starting to use Kubernetes, we should first understand its related concepts and terms, which will be of great help in subsequent learning and use. (Kubernetes has quite a few concepts, so it is worth taking the time to understand each of them, their roles, and how they relate to one another.)

Most concepts in Kubernetes, such as Node, Pod, Replication Controller, and Service, can all be considered resource objects. Almost all resource objects can be added, deleted, modified, and queried through the kubectl tool (or the API) provided by Kubernetes, and they are persisted in etcd.

From this perspective, Kubernetes is a highly automated resource control system that implements advanced features such as automatic control and error correction by tracking the difference between the "desired state" of a resource stored in etcd and the "actual state" of that resource in the current environment.

This article introduces the important Kubernetes resource objects, that is, the basic concepts and terminology of Kubernetes.

1, Master

The Master is the master node of a Kubernetes cluster. Every Kubernetes cluster needs a Master to manage and control the whole cluster: essentially all control commands are sent to it, and it is responsible for carrying out their execution. All of the commands shown later in this article are executed on the Master.

The Master provides a unified view of the cluster and runs a number of components, such as the Kubernetes API Server. The API Server exposes REST endpoints that you can use to interact with the cluster, so Pods, replicas, and Services can be maintained from the command line or from a graphical interface.

The Master runs the following components (a quick way to list them in a running cluster is shown right after this list):

  • etcd: a distributed key-value store that holds the cluster state and resource object data.

  • API Server (kube-apiserver): the HTTP REST interface provided by Kubernetes. It is the only entry point for adding, deleting, modifying, and querying all resources, and is also the entry-point process for cluster control.

  • Controller Manager (kube-controller-manager): the automation control center for all resource objects in Kubernetes.

  • Scheduler (kube-scheduler): the process responsible for resource scheduling (Pod scheduling), comparable to the dispatch room of a bus company.
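Assuming a cluster bootstrapped with kubeadm or Minikube, these control-plane components typically run as Pods in the kube-system namespace, so a quick way to see them is:

```bash
# List the control-plane component Pods (etcd, kube-apiserver,
# kube-controller-manager, kube-scheduler) in a kubeadm/Minikube cluster.
kubectl get pods -n kube-system
```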

2, Node

Apart from the Master, the other machines in a Kubernetes cluster are called Nodes, that is, worker nodes. Like the Master, a Node can be a physical host or a virtual machine.

Nodes are workload nodes in the Kubernetes cluster. Each Node is allocated some workload by the Master. When a Node goes down, the Master automatically transfers the workload to other nodes.

Each Node runs the following key components:

  • Kubelet: responsible for creating, starting, and stopping the containers of Pods, and works closely with the Master to implement basic cluster management functions.
  • kube-proxy: an important component that implements the communication and load-balancing mechanism of Kubernetes Services.
  • Container Runtime: downloads images and runs containers; for example, the Docker engine is responsible for creating and managing the containers on the Node.

Nodes can be added to a Kubernetes cluster dynamically at runtime; by default, the kubelet registers itself with the Master. Once a Node is brought into the scope of cluster management, the kubelet process periodically reports its own information to the Master, such as the operating system, the Docker version, the machine's CPU and memory, and the Pods currently running on it, so that the Master knows the resource usage of every Node and can implement an efficient, balanced resource scheduling policy. If a Node does not report any information within the specified period, the Master judges the Node to be disconnected, marks it as Not Ready, and then triggers the automatic "workload transfer" process.

Run the kubectl get nodes command to check how many nodes are in the cluster:

[xcbeyond@localhost ~]$ kubectl get nodes
NAME       STATUS   ROLES    AGE   VERSION
minikube   Ready    master   17d   v1.19.0

Run kubectl describe node <node_name> to view the detailed information of a specific Node:

[xcbeyond@localhost ~]$ kubectl describe node minikube
Name:     minikube
Roles:    master
Labels:   beta.kubernetes.io/arch=amd64
          beta.kubernetes.io/os=linux
          kubernetes.io/arch=amd64
          kubernetes.io/hostname=minikube
          kubernetes.io/os=linux
...

3, Pod

Pods are the atomic objects in Kubernetes, the basic building blocks.

Pod represents the set of running containers on the cluster. Pods are typically created to run a single master container. Pod can also run optional Sidecar containers to implement complementary features such as logging. (for example, the istio-proxy and istio-init containers that exist with the application in the Service Mesh)

Deployment is usually used to manage Pod.

A Pod can contain multiple containers (the additional containers play a supplementary role), and the Pod also manages the data volumes, secrets, and configuration used by those containers.

As you can see from the diagram below, each Pod has a special Pause container called the “root container.” The Pause container image is part of the Kubernetes platform. In addition to the Pause container, each Pod contains one or more closely related user business containers.

Why did Kubernetes come up with a whole new Pod concept with such a special structure?

  • When a group of containers is treated as a unit, it is hard to judge the state of "the whole" and control it effectively. For example, if one container dies, should the whole unit be considered dead? Introducing the Pause container as the root container of the Pod, with its state representing the state of the entire container group, is a simple and elegant solution to this problem.

  • Multiple business containers in the Pod share Pause container IP and Volume, which simplifies communication between closely related business containers and solves the file sharing problem between them.

Kubernetes assigns each Pod a unique IP address, called a Pod IP, which is shared by multiple containers within a Pod.
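As a minimal sketch of this structure (all names and images below are illustrative), the following Pod runs a main nginx container plus a busybox sidecar; because the containers share the Pod IP, the sidecar can reach the main container over localhost:

```bash
# A two-container Pod: the sidecar polls the main container via localhost,
# illustrating that containers in a Pod share one network identity.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar
  labels:
    app: web
spec:
  containers:
  - name: web                  # main business container
    image: nginx:1.19
    ports:
    - containerPort: 80
  - name: probe-sidecar        # supplementary sidecar container
    image: busybox:1.32
    command: ["sh", "-c", "while true; do wget -q -O- http://localhost:80 >/dev/null && echo ok; sleep 10; done"]
EOF
```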

Kubernetes requires the underlying network to support direct TCP/IP communication between any two Pods in the cluster, which is usually implemented with virtual layer-2 network technologies such as Flannel or Open vSwitch. So keep one thing in mind: in Kubernetes, a container in one Pod can communicate directly with the containers of a Pod on another host.

There are two types of Pod:

  • Ordinary Pod

  • Static Pod

The latter is special in that it is not stored in etcd but in a file on a specific Node, and it is started only on that Node. An ordinary Pod, once created, is stored in etcd and then scheduled by the Kubernetes Master to a specific Node and bound to it; the Pod is subsequently instantiated into a set of related Docker containers by the kubelet process on that Node and started.

By default, when a container in a Pod stops, Kubernetes automatically detects the problem and restarts the Pod (restarting all containers in the Pod), and reschedules all pods from that Node to another Node if the Node in which the Pod is running goes down. The diagram of Pod, container, and Node is shown in the following figure.

The life cycle of a Pod is uncertain and may be very short, but Pods are highly "regenerative": after one dies, it can be recreated and restarted automatically (the restart mechanism). During its life cycle, a Pod is in one of the following five phases (a command for checking the current phase is shown after the list):

  • Pending: the Pod has been correctly defined and submitted to the Master, but the containers it contains have not all been created yet. Typically, it takes time for the Master to schedule the Pod, for the Node to download the container images, and for the containers to start.

  • Running: the Pod has been assigned to a Node, all containers have been created, and at least one container is running or is in the process of starting or restarting.

  • Succeeded: all containers in the Pod have terminated successfully and will not be restarted. This is one of the final states of a Pod.

  • Failed: all containers in the Pod have terminated, and at least one container ended in failure (with a non-zero exit code). This is also a final state of a Pod.

  • Unknown: the state of the Pod cannot be obtained, usually because the Master cannot communicate with the Node where the Pod resides.
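The current phase is recorded in the Pod's status field, so it can be checked directly, for example:

```bash
# Show the phase of every Pod in the current namespace.
kubectl get pods -o custom-columns=NAME:.metadata.name,PHASE:.status.phase

# Query a single Pod (replace web-with-sidecar with a real Pod name).
kubectl get pod web-with-sidecar -o jsonpath='{.status.phase}'
```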

4, Label

Label is another core concept in Kubernetes. A Label is a key-value pair with key=value, where the key and value are specified by the user. Labels can be attached to various resource objects, such as Node, Pod, Service, RC, etc. A resource object can define any number of labels, and the same Label can also be added to any number of resource objects. Labels are usually determined when the resource object is defined, but can also be added or removed dynamically after the object is created.

Generally speaking, we will define multiple labels for the specified resource object to achieve multi-dimensional resource group management, so that resource allocation, scheduling, configuration, deployment and other management work can be flexibly and conveniently carried out. For example, deploy different versions of applications to different environments, or monitor and analyze applications (logging, monitoring, alerting, etc.). Some examples of common labels are as follows:

  • Version labels: release: stable, release: canary
  • Environment labels: environment: dev, environment: qa, environment: production
  • Architecture labels: tier: frontend, tier: backend, tier: middleware

Once Labels have been defined on resource objects, Kubernetes can query and filter resource objects by their Labels through the Label Selector; in this way, Kubernetes implements an SQL-like object query mechanism.

In general, we specify Labels through the spec.selector field in a definition file, so that Kubernetes can find all the objects carrying the Labels you specified and manage them.

Kubernetes currently supports two types of Label Selector (examples of both are shown after this list):

  • Equality-based Selector: matches labels with equality (and inequality) expressions.
  • Set-based Selector: matches labels with set-operation expressions.
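A minimal sketch of both selector styles on the command line, reusing the hypothetical labels and Pod name from the earlier examples; in object definitions such as Deployments, the same two forms appear under spec.selector as matchLabels and matchExpressions:

```bash
# Attach labels to an existing Pod (names and values are illustrative).
kubectl label pod web-with-sidecar environment=dev tier=frontend

# Equality-based selector: labels must equal the given values.
kubectl get pods -l environment=dev,tier=frontend

# Set-based selector: label values are matched against sets.
kubectl get pods -l 'environment in (dev,qa),tier notin (backend)'
```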

By using Labels, one or more groups of labels can be attached to an object. Label and Label Selector together form the most core application model in the Kubernetes system: they allow objects to be grouped and managed in a fine-grained way and help achieve high availability of the cluster.

5, Replication Controller

Replication Controller, or RC for short, is one of the core Kubernetes concepts that defines an expected scenario in which the number of copies of a Pod is declared to conform to an expected value at any given time.

The definition of RC consists of the following parts:

  • The number of copies that Pod expects.
  • Label Selector used to filter the target Pod.
  • A Pod template used to create a new Pod when the number of copies of a Pod is smaller than expected.

As an example, Kubernetes uses RC to automatically control the number of Pod copies in a cluster with three nodes.

If we define an RC for a "redis-slave" Pod that keeps two replicas, Pods will be created on two of the Nodes, as shown in the figure below; a sketch of what such an RC definition might look like follows:
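This is a minimal sketch under the assumptions of the example above (the image redis:6.0 is illustrative; any Redis image configured as a replica would do):

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: ReplicationController
metadata:
  name: redis-slave
spec:
  replicas: 2              # desired number of Pod replicas
  selector:                # Label Selector used to find the target Pods
    app: redis-slave
  template:                # Pod template used to create new replicas
    metadata:
      labels:
        app: redis-slave   # must match the selector above
    spec:
      containers:
      - name: redis-slave
        image: redis:6.0
        ports:
        - containerPort: 6379
EOF
```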

If a Pod terminates on Node 2, Kubernetes will automatically create and start a new Pod based on the replicas number 2 defined by RC to ensure that there are always two Redis slaves running in the cluster. Kubernetes may select Node 3 or Node 1 to create a new Pod, as shown in the figure below.

In addition, at runtime a Pod can be scaled dynamically by modifying the number of replicas in the RC. This can be accomplished with a single command, kubectl scale rc redis-slave --replicas=3. The result is shown in the figure below:

Note: deleting an RC does not affect the Pods that were created through it. To delete all of those Pods, set replicas to 0 and then update the RC. In addition, kubectl provides the stop and delete commands to remove an RC and all the Pods it controls in one step.

Finally, summarize the characteristics and functions of RC:

  • In most cases, we define an RC to automatically control the creation process and the number of replicas of Pods.
  • An RC contains a complete Pod definition template.
  • An RC implements automatic control of Pod replicas through the Label Selector mechanism.
  • By changing the number of Pod replicas defined in an RC, Pods can be scaled out or in.
  • By changing the image version in an RC's Pod template, rolling upgrades of Pods can be achieved.

6, Deployment

Deployment is a concept introduced in Kubernetes 1.2 to better solve the problem of Pod orchestration. Internally, a Deployment uses a ReplicaSet to achieve this. In terms of its role, its YAML definition, and its concrete command-line operations, a Deployment can be seen as an upgraded version of RC.

One of the biggest upgrades of Deployment over RC is that we can keep track of the current progress of a Pod deployment at any time.
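As a sketch (names and image are illustrative), the following Deployment creates three nginx replicas, and the commands after it show how the rollout progress can be tracked:

```bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19
        ports:
        - containerPort: 80
EOF

# Wait for and report the progress of the rollout.
kubectl rollout status deployment/nginx-deployment

# The READY / UP-TO-DATE / AVAILABLE columns show the current progress.
kubectl get deployment nginx-deployment
```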

Typical application scenarios:

  • Create a Deployment object to generate the corresponding ReplicaSet and complete the creation of the Pod replicas.
  • Check the status of the Deployment to see whether the deployment action has completed (whether the number of Pod replicas has reached the expected value).
  • Update the Deployment to create new Pods (for example, to roll out a new image).
  • Roll back to an earlier Deployment revision if the current Deployment is unstable.
  • Pause the Deployment so that multiple PodTemplateSpec configuration items can be modified at once, then resume it to start a new rollout.
  • Scale the Deployment to cope with high load.
  • Check the status of the Deployment as an indicator of whether a release has succeeded.

7, StatefulSet

In Kubernetes, the Pod management objects RC, Deployment, DaemonSet, and Job are all designed for stateless services. In reality, however, many services are stateful, especially complex middleware clusters, such as MySQL clusters, MongoDB clusters, Kafka clusters, ZooKeeper clusters, and so on. These application clusters have the following points in common:

  • Each node has a fixed ID through which the members of the cluster can discover and communicate with each other.
  • The size of a cluster is fixed and cannot be changed at will.
  • Each node in a cluster is stateful and typically persists data into permanent storage.
  • If a node's disk is damaged, that node in the cluster cannot run properly and the functionality of the cluster is impaired.

If we use an RC or a Deployment to control the number of Pod replicas in order to implement the stateful clusters described above, we find that the first requirement cannot be met, because Pod names are generated randomly and Pod IP addresses are only determined at runtime and may change, so we cannot assign each Pod a unique and unchanging identity in advance. In addition, to allow a failed node to be recovered on another node, the Pods of such a cluster would need to be attached to some kind of shared storage. To solve this problem, Kubernetes introduced a new resource object called PetSet in v1.4 and renamed it StatefulSet in v1.5. A StatefulSet can essentially be seen as a special variant of Deployment/RC with the following features:

  • Every Pod in a StatefulSet has a stable, unique network identity that can be used to discover other members of the cluster. If the StatefulSet is named kafka, the first Pod is called kafka-0, the second kafka-1, and so on.
  • The start and stop order of the Pod replicas controlled by a StatefulSet is controlled: the n-th Pod is operated on only when the previous n-1 Pods are already running and ready.
  • The Pods in a StatefulSet use stable persistent storage volumes, implemented through PV/PVC. When a Pod is deleted, the storage volumes associated with the StatefulSet are not deleted by default (to ensure data safety).

In addition to being bound to PV volumes to store the Pod's state data, a StatefulSet is used together with a Headless Service, which is declared in the definition of each StatefulSet. The key difference between a Headless Service and a normal Service is that it has no Cluster IP; resolving the DNS domain name of a Headless Service returns the Endpoint list of all the Pods behind that Service. On top of the Headless Service, the StatefulSet creates a DNS domain name for every Pod instance it controls, in the format:

$(podname).$(headless service name)

For example, consider a three-node Kafka StatefulSet cluster whose Headless Service is named kafka and whose StatefulSet is also named kafka. The DNS names of the three Pods in the StatefulSet are then kafka-0.kafka, kafka-1.kafka, and kafka-2.kafka.
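A sketch of this Kafka example, pairing a Headless Service (clusterIP: None) with a StatefulSet of three replicas; the image name is illustrative, and a real Kafka deployment would need more configuration:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kafka
spec:
  clusterIP: None          # Headless Service: no Cluster IP is allocated
  selector:
    app: kafka
  ports:
  - port: 9092
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kafka
spec:
  serviceName: kafka       # must reference the Headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: kafka
  template:
    metadata:
      labels:
        app: kafka
    spec:
      containers:
      - name: kafka
        image: my-registry/kafka:2.6   # illustrative image name
        ports:
        - containerPort: 9092
        volumeMounts:
        - name: data
          mountPath: /var/lib/kafka
  volumeClaimTemplates:    # a dedicated PVC (and PV) per Pod
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF
# Pods are created in order as kafka-0, kafka-1, kafka-2 and are reachable
# inside the namespace at kafka-0.kafka, kafka-1.kafka, kafka-2.kafka.
```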

8, Service

Service is also one of the core resource objects in Kubernetes. Each Service in Kubernetes is in fact one of the "microservices" in the microservice architecture we often talk about; the Pod, RC, and other resource objects described earlier are really preparation for the Kubernetes Service. The following figure shows the logical relationship between Pod, RC, and Service.

As the figure shows, a Kubernetes Service defines an entry address for a service, which front-end applications (Pods) use to access the cluster of instances made up of the Pod replicas behind it. The Service is "seamlessly connected" to its back-end Pod replicas through a Label Selector, and the role of the RC is in fact to ensure that the Service's serving capacity and quality always stay at the expected level.
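A minimal sketch of such a Service, using the hypothetical nginx Deployment from the previous section as its backend:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:            # "seamless connection" to the backend Pods by Label
    app: nginx
  ports:
  - port: 80           # the Service's entry port
    targetPort: 80     # the port on the backend Pods
EOF

# The Service is given a stable Cluster IP that fronts the Pod replicas.
kubectl get service nginx-service
```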

9, Job

A batch task starts multiple processes in parallel or serially to process a batch of work items; when they have all been processed, the whole batch task is finished. Starting with Kubernetes 1.2, applications of this batch type can be defined and started with a new resource object, the Kubernetes Job. Similar to RC, Deployment, and ReplicaSet, a Job is also used to control a group of Pod containers.

A Job is responsible for batch processing of short, one-off tasks, that is, tasks that are executed only once, and it guarantees that one or more Pods of the batch task end successfully.
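A minimal sketch of a Job that runs a one-off task to completion (image and command are illustrative):

```bash
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-job
spec:
  completions: 1           # number of Pods that must finish successfully
  backoffLimit: 4          # retries allowed before the Job is marked Failed
  template:
    spec:
      restartPolicy: Never # Job Pods are not restarted in place
      containers:
      - name: hello
        image: busybox:1.32
        command: ["sh", "-c", "echo processing one batch item; sleep 5"]
EOF

# COMPLETIONS shows how many Pods of the batch task have ended successfully.
kubectl get job hello-job
```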

10, Volume

A Volume is a shared directory in a Pod that can be accessed by multiple containers. The concept, purpose, and usage of a Kubernetes Volume are similar to those of a Docker Volume, but the two are not equivalent. First, a Volume in Kubernetes is defined on the Pod and is then mounted to specific file directories by the containers in that Pod. Second, the data in a Kubernetes Volume is not lost when a container in the Pod terminates or restarts. Finally, Kubernetes supports many types of Volume, such as GlusterFS, Ceph, and other advanced distributed file systems.
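A sketch of a Pod-level Volume, here an emptyDir (the simplest type), mounted into two containers so that they can share files; all names are illustrative:

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
spec:
  volumes:
  - name: shared-data            # the Volume is defined at the Pod level
    emptyDir: {}
  containers:
  - name: writer
    image: busybox:1.32
    command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data           # mounted by the first container
  - name: reader
    image: busybox:1.32
    command: ["sh", "-c", "tail -F /data/out.log"]
    volumeMounts:
    - name: shared-data
      mountPath: /data           # and by the second container
EOF
```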

11, Namespace

Namespace is another very important concept in the Kubernetes system. Namespaces are used in many cases to implement resource isolation for multiple tenants. A Namespace "assigns" the resource objects inside the cluster to different namespaces, forming logically grouped projects, teams, or user groups, so that different groups can share the resources of the whole cluster while being managed separately.

By default, Kubernetes cluster creates a Namespace named default.

[xcbeyond@bogon ~]$ kubectl get namespaces
NAME                   STATUS   AGE
default                Active   23d
istio-system           Active   22d
kube-node-lease        Active   23d
kube-public            Active   23d
kube-system            Active   23d
kubernetes-dashboard   Active   23d

If the Namespace is not specified, the Pod, RC, and Service created by the user will be added to the default Namespace.
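Creating a Namespace and putting resources into it might look like this (the name development is illustrative):

```bash
# Create a new Namespace.
kubectl create namespace development

# Address it with the -n flag ...
kubectl get pods -n development

# ... or declare it directly in the object's metadata.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: dev-busybox
  namespace: development
spec:
  containers:
  - name: busybox
    image: busybox:1.32
    command: ["sleep", "3600"]
EOF
```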

12, Annotation

An Annotation, like a Label, is defined as a key/value pair. The difference is that a Label has strict naming rules; it defines metadata of a Kubernetes object and is used by Label Selectors. An Annotation, by contrast, is "additional" information defined arbitrarily by the user, so that external tools can look it up. In many cases, modules of Kubernetes itself also use Annotations to mark special information on resource objects.

In general, the information recorded with Annotations includes the following (a small example follows the list):

  • Build information, release information, Docker image information, such as timestamp, release ID number, PR number, image hash value, Docker Registry address, etc.
  • Address information about the log library, monitoring library, and analysis library.
  • Debugging tool information, such as the tool and version number.
  • Contact information such as team, such as phone number, name of person in charge, website, etc.
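A small sketch of recording such information as Annotations on the hypothetical nginx Deployment from earlier; the annotation keys and values are made up for illustration:

```bash
# Attach free-form build/release/contact information to the object.
kubectl annotate deployment nginx-deployment \
  example.com/build-commit=3f4c2a1 \
  example.com/release-id=2021-01-15-001 \
  example.com/owner=xcbeyond

# Annotations live under metadata.annotations and can be read back:
kubectl get deployment nginx-deployment -o jsonpath='{.metadata.annotations}'
```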

13, ConfigMap

To accurately and deeply understand the function and value of Kubernetes ConfigMap, we can start from Docker. As we all know, Docker solves the problem of differences in application deployment by "packaging" the program, its dependent libraries, data, and configuration files into an immutable image file, but this brings another thorny problem: how do we modify the parameters in the configuration files at runtime? To solve this problem, Docker offers the following two approaches:

  • Parameters are passed through environment variables.
  • The Docker Volume is used to map configuration files outside the container into the container.

In most cases, we prefer the latter approach, because most applications usually have more than one parameter, and mapping configuration files handles them in a concise way. However, this approach has an obvious drawback: the configuration files must first be created on the host and can only be mapped into the container when it starts.

This is even worse in a distributed system, where the creation of the same profile on multiple hosts and the consistency of these profiles can be difficult to achieve. Kubernetes introduced ConfigMap to solve this problem.

In a ConfigMap, all configuration items are treated as key-value strings. For example, the configuration items host=192.168.1.1, user=root, and password=123456 could be the parameters for connecting to an FTP server. Each configuration item is an entry in a Map; the whole Map is persisted in etcd, and APIs are provided so that Kubernetes components or applications can conveniently perform CRUD operations on it. The Map that holds the configuration parameters is the Kubernetes ConfigMap resource object.

The ConfigMap mechanism works as follows: the ConfigMap stored in etcd is turned into the target Pod's configuration file through a Volume mapping, and no matter which server the target Pod is scheduled onto, this mapping is always completed automatically. If the key-value data in the ConfigMap is modified, the "configuration file" mapped into the Pod is automatically updated as well. ConfigMap thus becomes the simplest, application-transparent configuration center in a distributed system.
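A sketch of the FTP-server example above: the three key-value pairs are stored in a ConfigMap and then consumed by a Pod both as an environment variable and as files mounted through a Volume. (All names are illustrative; in a real system, credentials such as the password would normally go into a Secret rather than a ConfigMap.)

```bash
# Create the ConfigMap from literal key-value pairs.
kubectl create configmap ftp-config \
  --from-literal=host=192.168.1.1 \
  --from-literal=user=root \
  --from-literal=password=123456

# Consume it in a Pod: one key as an env variable, all keys as files.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ftp-client
spec:
  containers:
  - name: app
    image: busybox:1.32
    command: ["sh", "-c", "env | grep FTP_HOST; ls /etc/ftp; sleep 3600"]
    env:
    - name: FTP_HOST
      valueFrom:
        configMapKeyRef:
          name: ftp-config
          key: host
    volumeMounts:
    - name: config
      mountPath: /etc/ftp          # each ConfigMap key becomes a file here
  volumes:
  - name: config
    configMap:
      name: ftp-config
EOF
```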

14, Summary

These concepts and terms are also the core components of Kubernetes; together they form the Kubernetes framework and computing model. By combining them flexibly, users can quickly and easily configure, create, and manage container clusters. Besides the concepts introduced in this article, Kubernetes has many other auxiliary resource objects, such as LimitRange and ResourceQuota; for more terms, refer to the glossary: kubernetes.io/docs/ref…

xcbeyond.cn/blog/kubern…