Reprinted; please credit the source:
Grape City official website. Grape City provides developers with professional development tools, solutions, and services.

In the last section, we saw how the growth of its community ecosystem drove the healthy development and spread of Kubernetes. Compared with the relatively closed Docker community, the open CNCF community achieved far greater success. However, community vitality alone is not enough to explain why Docker lost so quickly. The fundamental reason is that Kubernetes understood container orchestration better than Docker did. That advantage amounted to a crushing, dimensionality-reducing blow against which Docker had no defense.

Here’s how Kubernetes gained the upper hand in this container war.

Container orchestration

So-called container orchestration is really about managing the relationships between containers. In a large distributed system, containers are never isolated individuals; they interact one-to-many and group-to-group in interwoven ways.

Docker container orchestration

Docker built a PaaS ecosystem with the Docker container at its core: Docker Compose for simple orchestration of container relationships, and Docker Swarm as the online operations and maintenance platform. Users can wire up the relationships between the containers in a cluster with Docker Compose, and manage and maintain the cluster itself with Docker Swarm. Taken together, this is essentially the PaaS functionality of Cloud Foundry, refocused on seamless integration with Docker containers.

What Docker Compose does is create links between multiple interacting containers: you compose them all in one docker-compose.yaml file and publish them together (this is how the ELK stack I will discuss later is typically deployed). This has its advantages, and it is convenient for simple interactions among a few containers. For large clusters, however, it is not enough, and a development model in which every new requirement means building a new feature from scratch makes the code very hard to maintain later.
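As a sketch of what this looks like in practice (the service names, images, and values below are illustrative, not taken from the original article), a minimal docker-compose.yaml linking a web container to a database might read:

```yaml
# Illustrative docker-compose.yaml: two services that Compose brings up together.
# Image names and environment values are assumptions for the example.
version: "3"
services:
  web:
    image: nginx:alpine
    ports:
      - "8080:80"
    depends_on:
      - db            # Compose starts db before web
  db:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: example
```

Running `docker-compose up` starts the whole group as one unit, which works well for a handful of cooperating containers but, as noted above, does not scale to large clusters.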

 

 

If Kubernetes wanted to compete with Docker, it could not limit itself to managing Docker containers, something Docker itself already supported. That approach would not even attract Docker's existing users, let alone win a head-to-head fight. So from the very beginning, Kubernetes was designed not to depend on Docker as its core. In Kubernetes, Docker is only one optional implementation of the container runtime: users can swap in whichever runtime they prefer, and Kubernetes provides a uniform interface to all of them. In addition, Kubernetes correctly identified a fatal weakness of Docker containers and innovated around it.

 

Next, let's look at exactly what it was that hit Docker with this dimensionality-reducing blow.

Kubernetes container orchestration

Unlike Docker, which can only deal with container relationships from the perspective of a single container, Kubernetes starts from software-engineering design principles, divides relationships into distinct classes, and defines the concepts of the tight relationship (Pod) and the interactive relationship (Service). Each kind of relationship is then orchestrated in its own specific way.

 

At first this might be confusing, so here is an example that is not rigorous but is easy to understand. If we compare relationships between containers to relationships between living beings, Docker can only handle interpersonal relationships from the perspective of a single person. Kubernetes takes a God's-eye view: it can handle not only the relationships between people, but also the relationships between dogs, and most importantly, the relationships between people and dogs.

The mechanism that realizes the tight relationship described above is the Pod, an innovation of Kubernetes.

 

The Pod is a concept introduced by Kubernetes. Its prototype is the Alloc in Borg, and it is the minimum unit in which Kubernetes runs applications. A Pod is composed of one or more closely cooperating containers, which brings the idea of a process group to containers. In the first section, we learned that a container is essentially a process: the container's main process is a super-process, and all other processes must be its children. A container therefore has no native concept of a process group, even though in everyday software, process groups are used all the time.

Using Pods for tight relationships

To show how a Pod handles tight relationships, here is an example of a process group:

 

rsyslogd, the Linux program that handles operating-system logs, consists of three parts: the imklog module, the imuxsock module, and rsyslogd's own main process. All three must run on the same machine; otherwise the socket-based communication and file exchange between them will break.

 

If the same situation arises in Docker, however, three different containers have to be used to describe the three parts, and users must simulate and manage the communication among them themselves. That complexity can easily exceed the cost of operating the containers in the first place. Docker Swarm has its own problem here as well. Building on the example above: suppose each of the three modules needs 1 GB of memory to run, and the Swarm cluster has two nodes, with Node-1 having 2.5 GB free and Node-2 having 3 GB free. Under Swarm's affinity=main constraint, all three must be scheduled onto the same machine. Yet Swarm may well place the first two on Node-1, after which the third fails to schedule, because the remaining 0.5 GB cannot satisfy it. This classic case of gang scheduling being handled badly occurs frequently in Docker Swarm.

 

To meet this need, Kubernetes introduced the Pod to handle tight relationships. Containers in a Pod share the same Cgroups and Namespaces, so there are no boundaries or isolation between them: they can share the same network IP, use the same Volumes to process data, and so on. The idea is to create links between containers by sharing their resources. But to avoid the question of whether A shares B's environment or B shares A's, and which of the two must start first, a Pod is actually composed of an Infra container plus the user containers A and B, with the Infra container started first:

The Infra container is a very simple container whose main process stays permanently in a "paused" state. It consumes almost no resources, and its image is only about 100 KB when unpacked.

A Pod example in Kubernetes

Next, let's see what a Pod looks like through an example.

 

Let's run a Pod in any Kubernetes cluster using a YAML file and a shell command. Don't worry about what the YAML means for now; I'll explain it later. All resources in Kubernetes can be described in a YAML or JSON file. For now, all we need to know is that it describes a Pod running BusyBox and Nginx:
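The original article's YAML was not preserved in this copy, so the file below is a hedged reconstruction consistent with the surrounding description: a Pod named hello-pod running BusyBox and Nginx, with process-namespace sharing enabled so the containers can see each other's processes.

```yaml
# hello-pod.yaml — reconstructed example; the container names and images are
# assumptions consistent with the text, not the article's original file.
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
spec:
  shareProcessNamespace: true   # all containers in this Pod share one PID namespace
  containers:
  - name: nginx
    image: nginx
  - name: shell
    image: busybox
    stdin: true
    tty: true
```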

After creating the hello-pod.yaml file, run the following commands:
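The exact commands did not survive in this copy. Assuming a manifest named hello-pod.yaml containing a BusyBox container called shell, the usual sequence would be:

```shell
# Create the Pod from the manifest (the file and container names are assumptions).
kubectl create -f hello-pod.yaml

# Attach to the BusyBox container and list the processes it can see.
kubectl exec -it hello-pod -c shell -- ps ax
```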

The resulting process list shows that the Infra container's main process runs as PID 1, the super-process of the whole Pod, which reveals how the Pod is put together.

The key concept to grasp here is that the Pod is the minimum scheduling unit of Kubernetes, and that a Pod should be thought of as a whole rather than as a collection of containers. Now let's take another look at the YAML format used to describe this Pod.

 

The syntax of YAML:

YAML is a language for writing configuration files, and for that purpose it is far simpler and more expressive than JSON. As a result, many newer projects, such as Kubernetes and Docker Compose, use YAML as their configuration-description language. The name YAML is an acronym: YAML Ain't Markup Language. Sharp-eyed readers will notice that it is a recursive acronym, a touch of programmer humor. Its grammar has the following characteristics:

 

– Case sensitive

– Indentation expresses hierarchy, similar to Python

– Tabs are not allowed for indentation; only spaces are

– The exact number of indent spaces does not matter, as long as elements of the same level are aligned

– Array items are denoted by a leading dash

– null is denoted by a tilde (~)
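A small illustrative snippet (not from the original article) that exercises each of these rules:

```yaml
# Illustrative YAML demonstrating the syntax rules above.
name: demo            # case sensitive: "Name" would be a different key
settings:             # indentation (spaces only) expresses hierarchy
  retries: 3
  timeout: ~          # null is written as a tilde
servers:              # array items use a leading dash
  - host-a
  - host-b
```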

 

With these concepts in mind, let's rewrite the YAML as JSON and compare the two:
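The JSON version did not survive in this copy. As an illustration, a Pod like the hello-pod described earlier could be written in JSON as follows (the field values are assumptions, mirroring the reconstructed YAML rather than the article's original file):

```json
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": { "name": "hello-pod" },
  "spec": {
    "shareProcessNamespace": true,
    "containers": [
      { "name": "nginx", "image": "nginx" },
      { "name": "shell", "image": "busybox", "stdin": true, "tty": true }
    ]
  }
}
```

Every key and string must be quoted, and the braces and commas add noise; this is the clunkiness the comparison below refers to.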

The two are equivalent in Kubernetes, and the JSON form works fine, but Kubernetes still prefers YAML. As the comparison shows, JSON, which served well in the past, now looks a bit clunky, requiring a great deal of quoting and punctuation.

 

Having covered the syntax, let's look at what each node in the YAML above means in Kubernetes. In Kubernetes, much as in Java, everything is an object: all internal resources, including server nodes (Node), services (Service), and running groups (Pod), are stored as objects. Every object is composed of the following fixed parts:

– apiVersion: specifies the API version. This field is not freely customizable and must conform to the values defined by Kubernetes

– kind: the type of the current object, such as Pod, Service, Ingress, or Node. Note that the initial letter is uppercase

– metadata: meta information describing the current object, such as its name and labels

– spec: the specification describing the desired implementation of the current object
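The four fixed parts can be seen in a minimal, illustrative skeleton (the names and values here are placeholders, not from the original article):

```yaml
# Minimal object skeleton showing the four fixed parts.
apiVersion: v1        # API version, constrained by Kubernetes
kind: Pod             # object type; initial letter uppercase
metadata:             # meta information about this object
  name: example
  labels:
    app: demo
spec:                 # the desired implementation of this object
  containers:
  - name: main
    image: busybox
```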

All Kubernetes objects basically follow this format, so the YAML file for our first Pod says: "Use the stable v1 API, create an object of type Pod named hello-pod, and implement it by enabling shareProcessNamespace and running two containers."

 

With YAML covered, let's return to the main topic. Solving the single-process-container problem is only one reason the Pod was created: Google has also distilled its experience into a set of container design patterns, and the Pod is the vehicle through which Kubernetes applies them.

 

The most common example is:

 

Unlike .NET Core projects, which run directly on the host after compilation, a Java project must have its generated WAR package copied into the runtime directory of a service host program such as Tomcat before it can work. In practice, the larger the company, the more specialized the division of labor, and the teams responsible for the Java application and for the service host are very likely not the same team.

 

To let the two teams develop independently yet cooperate closely, we can use a Pod to solve this problem.

The following YAML file defines a Pod that meets these requirements:
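The YAML itself is missing from this copy, so the file below is a hedged reconstruction based on the description that follows. The paths (/app, /root/apache-tomcat/webapps), the volume name sample-volume, and the file sample.war come from the text; the image names and startup script path are placeholders.

```yaml
# Reconstructed Java + Tomcat Pod; image names and the start script are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: javaweb
spec:
  initContainers:
  - name: war
    image: sample-war:latest                  # placeholder image holding sample.war
    command: ["cp", "/sample.war", "/app"]    # copy the WAR into the shared volume
    volumeMounts:
    - name: sample-volume
      mountPath: /app
  containers:
  - name: tomcat
    image: tomcat:7.0                          # placeholder Tomcat image
    command: ["sh", "-c", "/root/apache-tomcat/bin/start.sh"]
    volumeMounts:
    - name: sample-volume
      mountPath: /root/apache-tomcat/webapps
    ports:
    - containerPort: 8080
  volumes:
  - name: sample-volume
    emptyDir:
      medium: Memory                           # in-memory data volume
```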

In this YAML file, we define two containers, one for the Java program and one for Tomcat, and mount the same volume in both: sample-volume is mounted at /app in the Java container and at /root/apache-tomcat/webapps in the Tomcat container, and is declared as an in-memory data volume. The Java container is declared as an initContainer, which means it starts before the Tomcat container and runs a cp command as soon as it starts.

 

The Pod above describes the following scenario: when the Pod starts, the Java container runs first and copies its WAR package, sample.war, into the /app directory; the Tomcat container then starts, runs its startup script, and serves the WAR package it finds under its own /root/apache-tomcat/webapps path.

 

As the configuration shows, we changed neither the Java application nor the Tomcat program, yet the two cooperate perfectly while remaining decoupled. This is an example of the sidecar pattern from the container design patterns; there are many other patterns worth exploring on your own if you are interested.

 

 

Conclusion

The above covers the basics of the Pod, the concept Kubernetes abstracted in order to handle tight relationships. Note that the Pod provides an orchestration idea, not a specific technical solution. In the Kubernetes setups we normally use, the Pod happens to be implemented with Docker as the carrier. If the underlying "container" is a virtual machine, as with Virtlet, then the Pod needs no Infra container at all when it is created, because a virtual machine naturally supports multi-process cooperation.

 

Now that we've covered the basics of the Pod, the next section will show how Kubernetes stood out in the container orchestration wars that followed.