Preface

At present, Kubernetes has become a de facto enterprise-grade container orchestration standard, and many cloud platforms now provide container services compatible with the Kubernetes API. For multi-user support, most platforms simply provide dedicated virtual machine clusters, leaving users to spend considerable energy on cluster sizing, resource utilization, cost, and other issues. This talk introduces Huawei Cloud's exploration and practice in building an enterprise Serverless Container platform based on K8S, including container security isolation, multi-tenancy management, and the implementation of the Serverless concept on the Kubernetes platform.

The history of Kubernetes at Huawei Cloud

First, let's look at Huawei Cloud's history with Kubernetes. In 2014, Huawei Cloud began studying and using Kubernetes, with an early focus on applying it in private cloud environments. In 2016, Huawei's public cloud released the Cloud Container Engine (CCE), which, like most public cloud Kubernetes services on the market (such as GKE and AKS), provides users with a complete hosted K8S cluster. Earlier this year, Huawei Cloud released its Kubernetes-based Serverless Container offering, which differs from the industry's traditional container instance services.

The three major benefits of containers are all built for applications

As we all know, container technology has three major benefits.

  • First, it provides resource isolation: users can easily improve resource utilization by co-locating applications.
  • Second, it offers second-level elasticity. Because containers do not need to carry a heavyweight virtualization layer, they can scale out and in very quickly;
  • Third, container image technology solves the consistency problem between an application and its dependent environment, simplifying the business delivery process.

    But how much end-to-end convenience does container technology actually bring in the real world? To answer that, we have to look at how Kubernetes is commonly used.

Common forms of Kubernetes usage

Deploying Kubernetes in a private cloud

One common way people use Kubernetes is to set up clusters in their own data centers.

The advantages of this approach are:

  • First, you get the fun and sense of accomplishment of the DIY process (although it can also turn into misery as problems pile up over time).
  • Second, under the fully privatized model, data requests are handled locally, so there are no privacy concerns.
  • Third, resource planning, cluster installation, deployment and upgrade are all controlled end-to-end by users.

But the disadvantages are also obvious. First, many self-builders evaluate only Kubernetes itself and do little research on the surrounding supporting systems, so during implementation they face difficult selections for networking, storage, and other supporting components. Second, users bear 100% of the operation and maintenance cost, and resource investment is often one-off (or phased), so the up-front cost threshold is very high. In addition, self-built environments tend to have few Kubernetes clusters, and a single cluster is rarely very large, so once the business deployment scale grows, elastic scaling is constrained by the underlying resource pool, and hardware expansion is unimaginably slow by comparison. Finally, developers tend to over-reserve resources, so utilization stays low; in other words, the self-builder pays for all of the resources regardless of how much is actually used.

Semi-managed dedicated Kubernetes clusters on the public cloud

The second common form of using Kubernetes is a (semi-managed) dedicated cluster on a public cloud.

You can think of it as the user buying a set of virtual machines on which the cloud platform automatically deploys a Kubernetes cluster; "semi-managed" means that on some platforms the control plane is provided along with the cluster and hosted by the platform.

The advantages of this form are:

  • Users own their own cluster and don’t have to worry about the interference problems that might arise from sharing a set of Kubernetes with other users.
  • Cloud platforms usually do extensive testing and tuning before offering a Kubernetes service, so the cluster configuration they provide represents best practice on their own platform. In this mode, users run Kubernetes on the cloud with a much better experience than deploying and operating it themselves.
  • When the Kubernetes community releases a new version, the cloud platform runs at least one extra round of testing and bug fixing before going live and recommending the upgrade, saving users the effort of judging upgrade timing themselves. Users who consume the open source version directly will hit many pitfalls if they chase new releases too quickly, yet if they postpone upgrading they must keep tracking community bug reports and fixes to decide which version to move to, which is time-consuming and laborious.
  • When users run into problems with Kubernetes, they can get professional technical support from the cloud platform. Using a (semi-managed) Kubernetes service on the public cloud is therefore a good way to offload costs, with operation and maintenance shared by the cloud platform.

There are, of course, some obvious drawbacks.

The first is price: when a user buys a group of virtual machines, the cost is the unit price of the virtual machine flavor multiplied by the number of nodes N. Second, because each user owns a cluster of modest size, overall resource utilization is still relatively low, and since in most cases users cannot fully customize the configuration of the control-plane components, tuning attempts achieve little. In addition, if a cluster is short of free resources, the cluster itself must be expanded before applications can scale out, so end-to-end scaling speed is bounded by virtual machine creation time.

Container instance service

The third form, which strictly speaking is the only one where users consume containers as containers, is the container instance service of the public cloud.

Its advantages are obvious:

  • Users are not aware of the underlying cluster, and no O&M is required.
  • Resource pricing is fine-grained: you pay for exactly what you use;
  • True second-level scaling, with per-second billing.

Its disadvantages are:

Container instance services on many platforms mainly expose proprietary APIs that are not compatible with the Kubernetes API, which easily leads to vendor lock-in.

Some platforms do offer Kubernetes compatibility through the virtual-kubelet project, which virtualizes the entire container instance service as a single super node. As a result, the series of high-availability features Kubernetes designed around multiple nodes does not take effect. This virtual-kubelet-based compatibility solution is also incomplete on the data plane: the project members have wavered over where kube-proxy should sit in the deployment hierarchy, and container storage compatibility is still a blank.

Why not try using Kubernetes’ multi-tenancy scheme to build the Serverless Container service?

In fact, building a container instance service on Kubernetes multi-tenancy has many advantages, the biggest being support for the native K8S API and command line. Applications users have developed around Kubernetes deploy and run directly on the K8S-based Serverless Container. Because containers can be billed by the second, users enjoy the lower price threshold of container instance services. In addition, the cloud platform operates and maintains a large shared resource pool: users pay only for their business containers' resources, need not care about the underlying cluster's resource utilization, and carry no cluster operation and maintenance cost.

The main challenge of this form is that native K8S supports only soft multi-tenancy; isolation and related capabilities are still lacking.

Let's review the typical multi-tenancy scenarios in K8S.

  • The first is a SaaS platform, or other services built on top of K8S that do not directly expose the K8S API. Because there is a layer of the platform's own API encapsulation, the platform can do a lot of extra work, such as implementing its own tenant definition, so tenant isolation requirements on the K8S control plane are low. The applications come from end users and are untrusted, so strong data-plane resource isolation and access control are actually required when containers run.
  • The second is a small company's internal platform. Both users and applications come from within the company, so there is a high degree of mutual trust; neither the control plane nor the data plane needs much additional isolation enhancement, and native K8S does the job.
  • The third is a platform for a large enterprise. In this scenario, the users of K8S are departments within the enterprise, and applications can only be launched after internal verification, so application behavior is trusted and the data plane needs little extra isolation. What matters more is protection on the control plane, to avoid management interference between departments and services; for example, per-tenant rate limiting of API calls is required.
  • The fourth scenario is a multi-tenant K8S platform on the public cloud. It has the highest requirements on both the control plane and the data plane: the source of applications is uncontrollable and may well contain malicious code, and the K8S API is exposed directly to end users, so control-plane isolation capabilities such as API rate limiting and access control are indispensable.

    To summarize, there are three major challenges K8S needs to solve in order to provide a Serverless Container service in public cloud scenarios.

  • The first is introducing a tenant concept and implementing access control. K8S still has no native tenant concept, and the Namespace boundary alone is not suited to multi-tenancy scenarios (see the RBAC sketch after this list).
  • The second is the isolation of nodes (computing resources) and the security of the container Runtime.
  • The third is network isolation: K8S's default all-open network model causes many problems in this scenario.
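
For context, the closest native K8S comes to tenant access control today is Namespace-scoped RBAC. Below is a minimal sketch of confining one tenant's user to a single Namespace; the names `tenant-a` and `alice` are illustrative, not part of Huawei Cloud's design.

```yaml
# Namespace acting as the tenant boundary (illustrative name)
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
---
# Role granting access to common workload resources inside this Namespace only
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-a-admin
  namespace: tenant-a
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "services", "configmaps", "deployments"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
---
# Bind the tenant's user to the Role; the user can touch nothing outside tenant-a
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-admin-binding
  namespace: tenant-a
subjects:
- kind: User
  name: alice
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-a-admin
  apiGroup: rbac.authorization.k8s.io
```

Note that this is access control, not tenancy: it says nothing about per-tenant API rate limiting, runtime isolation, or network isolation, which is exactly why the three challenges above remain.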


Exploration and practice of Huawei Cloud

The following figure shows the overall architecture of Huawei Cloud's container instance service, which is built on Kubernetes and provides the K8S API directly to end users. As mentioned earlier, its biggest advantage is that users can deploy and run applications directly around K8S definitions.

It is worth mentioning here that we use an all-physical-machine scheme, which greatly improves end-to-end resource utilization.

On top of K8S, we achieve hyper-scale for the whole service through a layer of encapsulation and the introduction of Federation. At the same time, because K8S's native multi-tenancy capability is very limited, we chose to implement additional work such as tenant-based authentication and multi-tenant rate limiting in this encapsulation layer. For application definitions and other interfaces, the native K8S API is passed through transparently, with only checks such as request validity added along the call path. For the container network and container storage shown on the right of the figure, existing open source solutions could not meet the requirements, so Huawei Cloud adopted a self-developed strategy.

Tenant concept and network isolation

As mentioned earlier, K8S has no concept of a tenant, only a layer of isolation bounded by Namespace. At the Namespace level, in addition to API object visibility isolation, K8S provides fine-grained quota management through ResourceQuota (total resource limits) and LimitRange (constraints on the resources each Pod and Container can use). On Huawei Cloud, we designed a three-layer tenant model of tenant (user), project, and Namespace, making it convenient for users to manage multiple projects across the development, testing, and production stages.
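
As an illustration of those per-Namespace controls, here is a minimal ResourceQuota plus LimitRange pair; the Namespace name and all resource figures are arbitrary examples, not Huawei Cloud defaults.

```yaml
# Cap the total resources one Namespace may consume
apiVersion: v1
kind: ResourceQuota
metadata:
  name: project-quota
  namespace: dev
spec:
  hard:
    pods: "50"
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
---
# Default and maximum per-container resources within the same Namespace
apiVersion: v1
kind: LimitRange
metadata:
  name: project-limits
  namespace: dev
spec:
  limits:
  - type: Container
    defaultRequest:   # applied when a container specifies no requests
      cpu: 250m
      memory: 256Mi
    default:          # applied when a container specifies no limits
      cpu: 500m
      memory: 512Mi
    max:
      cpu: "4"
      memory: 8Gi
```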

For network isolation, we adopt a multi-network model: multiple VPCs can be defined within a project, and the relationship between VPC and Namespace is one-to-many. Applications in the development and testing phases can be deployed in different Namespaces within the same VPC, which eases debugging and fault locating; production applications go into Namespaces in a separate VPC, isolated from interference by other activities.
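
The VPC-per-project multi-network model is self-developed, but for comparison, the vanilla K8S mechanism for breaking the default all-open network is NetworkPolicy. A minimal default-deny policy that cuts a Namespace's Pods off from all other ingress traffic looks like this (the Namespace name is illustrative):

```yaml
# Deny all ingress to every Pod in the "production" Namespace;
# allowed traffic must then be whitelisted by additional policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}   # an empty selector matches all Pods in the Namespace
  policyTypes:
  - Ingress
```

Note that NetworkPolicy only takes effect when the network plugin enforces it, which is one reason a VPC-level model can be preferable on a public cloud.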

Runtime security and isolation

For the Runtime, we use secure containers (runV in the early days, Kata Containers today), because nodes are shared by different tenants and ordinary Docker containers cannot meet the isolation requirements between Pods. The main idea of a secure container is to wrap a lightweight virtual machine around the Pod, which both guarantees isolation between Pods and stays compatible with the K8S-native design of containers in a Pod sharing network and storage. Because this lightweight VM layer only needs to run containers, pruning and other optimizations can bring its startup time to the same order of magnitude as an ordinary container's.
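
This work predates it, but upstream Kubernetes later standardized runtime selection through the RuntimeClass API. For reference, registering Kata Containers that way is a one-object affair; the `handler` value must match the runtime name configured in the node's CRI implementation (e.g. containerd), so treat this as a sketch rather than a drop-in manifest.

```yaml
# Register Kata Containers as a selectable runtime; the handler name must
# match the runtime handler configured in the node's CRI implementation
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-containers
handler: kata
```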

Interface level

At the interface level, we adopt a branching scheme inside Docker: in the container engine service, the original logic is kept and ordinary containers are created directly; in our container instance service, the same Docker API calls create secure containers instead.
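
For comparison, with the RuntimeClass registered above, a Pod in current upstream Kubernetes opts into the secure runtime declaratively instead of relying on branching inside Docker; the Pod name and image here are illustrative.

```yaml
# Pod requesting the Kata-backed runtime registered earlier
apiVersion: v1
kind: Pod
metadata:
  name: isolated-app
spec:
  runtimeClassName: kata-containers
  containers:
  - name: app
    image: nginx:1.25
```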

Finally, let's review the key points of this talk.

  • First, we built the core of Huawei Cloud's container instance service on Kubernetes, and encapsulated multi-tenant definition and access isolation on top of it. The biggest benefit for users is that they can use the native K8S API and command line without being aware of the K8S cluster or the underlying resources: there is no cluster to create before use and no cluster to worry about during use, since the platform itself guarantees service availability.
  • Second, for computing resource isolation, we keep the native Docker API and back it with Kata Containers, which maximizes compatibility with both projects' ecosystems; end users only need to know that the security isolation is robust. For network isolation, the multi-network model lets users define multiple VPCs and create Namespaces and applications in different VPCs to isolate them from each other.
  • In addition, for high-performance computing scenarios, we completed allocation and scheduling optimizations for GPU and FPGA acceleration chips, and further improved end-to-end computing performance with high-performance networking and local storage acceleration.

Conclusion

The above is Huawei Cloud's practical experience in implementing a Serverless Container product on Kubernetes. As the product matures, we also plan to contribute some general enhancements back to the community, advancing Kubernetes' capabilities and ecosystem for scenarios such as Serverless containers and multi-tenancy isolation.