Summary: At the 2021 Apsara Conference, Alibaba Cloud engineers demonstrated starting 3000 ECIs within 6 seconds, with all of them entering the Running state. This article explains how Alibaba Cloud ECI achieves such rapid scale-out.

Introduction

According to the latest CNCF report, more than 90% of users run containers in production and more than 80% manage them through Kubernetes. Is K8s the perfect solution to application deployment in production? There is a saying in IT circles that there is no silver bullet, and K8s is no exception. K8s solves application orchestration and scheduling, but it does not solve limited resource capacity, container security isolation, or high operation and maintenance (O&M) costs.

Problems and dilemmas of traditional K8s

  • Low resource efficiency

This resource usage chart comes from an Alibaba Cloud customer. The service uses about 7000-8000 CPU cores at peak traffic and only a few hundred cores at the trough. If resources are planned or purchased for peak traffic, a large share of them sits idle most of the time. If ECS resources are instead provisioned to match current usage, capacity cannot be expanded quickly enough when unexpected traffic arrives, which affects service stability.
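To make the trade-off concrete, the following is a minimal sketch of the arithmetic. Only the rough peak (7000-8000 cores) and trough (a few hundred cores) come from the customer example; the hourly profile and the `utilization` and `unmet_core_hours` helpers are hypothetical and serve only as illustration.

```python
# Hypothetical hourly CPU demand (in cores) for one day. Only the rough
# peak and trough levels come from the article; the shape is made up.
hourly_demand = [
    300, 300, 300, 400, 500, 800, 1500, 3000, 5000, 7000, 7500, 8000,
    7500, 7000, 6000, 5000, 4000, 3000, 2000, 1500, 1000, 600, 400, 300,
]


def utilization(demand, provisioned):
    """Fraction of provisioned core-hours that is actually used."""
    used = sum(min(d, provisioned) for d in demand)
    return used / (provisioned * len(demand))


def unmet_core_hours(demand, provisioned):
    """Core-hours of demand that a fixed pool of this size cannot serve."""
    return sum(max(d - provisioned, 0) for d in demand)


peak = max(hourly_demand)

# Provisioning for the peak serves all demand but leaves most capacity idle.
print(f"peak-sized pool: utilization {utilization(hourly_demand, peak):.0%}, "
      f"unmet core-hours {unmet_core_hours(hourly_demand, peak)}")

# Provisioning for half the peak raises utilization but drops traffic.
half = peak // 2
print(f"half-sized pool: utilization {utilization(hourly_demand, half):.0%}, "
      f"unmet core-hours {unmet_core_hours(hourly_demand, half)}")
```

Under these assumed numbers, a fixed pool either wastes capacity or fails to serve peak traffic, which is the gap that on-demand elastic capacity is meant to close.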

  • Weak resource isolation

Containers rely on kernel namespaces to isolate resources, but the kernel offers only a limited set of them (six in the classic set, such as UTS and IPC). One customer needed to change the system time inside a business container in a test environment, and as a result the time changed for every container on that machine. The same limitation applies to scenarios such as customizing kernel parameters and fair sharing of I/O.

Container security has also drawn criticism. For example, a privileged container can directly see all disk data on the host machine.

  • High operation and maintenance costs

Cloud native brings a lot of convenience to IT, but it also makes IT operation and maintenance more and more complicated. A K8s container cluster needs at least a highly available master, network plug-ins, an image registry, a log service, and monitoring components. Even after these components are installed, you still have to deal with O&M work and alarms every day, and O&M turns into constant firefighting.

Alibaba Cloud Elastic Container Instance (ECI) came into being

Is there a secure container solution that is O&M-free and available on demand? Alibaba Cloud Elastic Container Instance was created to meet exactly this need.

Alibaba Cloud Elastic Container Instance (ECI) is a container runtime service from Alibaba Cloud that combines container and serverless technologies. With ECI, you can run Pods and containers directly on Alibaba Cloud without purchasing and managing ECS cloud servers, which removes the work of operating and managing the underlying servers. Simply put, an ECI is a Pod that can be orchestrated and scheduled by K8s.

ECI is especially suitable for bursty business traffic and short-lived tasks. So how does ECI differ from a customer buying ECS instances and running Docker on them? The biggest difference is that with ECI, the entire container runtime is operated and maintained by Alibaba Cloud.

ECI has the following advantages:

  • Managed infrastructure: the underlying resources are hosted by Alibaba Cloud, and users no longer need to manage the underlying VMs (virtual machines).
  • Sufficient inventory: ECI reuses the elastic computing resource pool of the whole of Alibaba Cloud.
  • Low cost: billing is per second, starting from Pod creation.
  • Fast startup: the underlying secure sandbox starts in seconds.
  • Strong compatibility: fully compatible with K8s.

ECI integrates with K8s through the community's Virtual Kubelet project. When a Pod is created and scheduled to the Virtual Kubelet node in the cluster, the Virtual Kubelet calls the ECI API and starts an ECI.
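As an illustration of how a Pod might be steered to the virtual node, here is a minimal sketch that builds a Pod manifest in Python and prints it as JSON. The `type: virtual-kubelet` node label and the `virtual-kubelet.io/provider` taint key follow common Virtual Kubelet conventions, but they are assumptions here and should be checked against the labels and taints on the actual cluster. The `build_eci_pod` helper, the Pod name, and the image are hypothetical.

```python
import json


def build_eci_pod(name, image, cpu="1", memory="2Gi"):
    """Return a Pod manifest (as a dict) aimed at a Virtual Kubelet node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Assumed label carried by the virtual node.
            "nodeSelector": {"type": "virtual-kubelet"},
            # Assumed taint key on the virtual node; without a matching
            # toleration the scheduler would not place the Pod there.
            "tolerations": [
                {"key": "virtual-kubelet.io/provider", "operator": "Exists"}
            ],
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {"requests": {"cpu": cpu, "memory": memory}},
                }
            ],
        },
    }


# Hypothetical Pod name and image, for illustration only.
manifest = build_eci_pod("eci-demo", "nginx:latest")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted with `kubectl apply -f`, since kubectl accepts JSON manifests as well as YAML.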

ECI can be integrated with business systems in the following ways:

  • (Recommended) Deploy services through Alibaba Cloud Container Service Serverless Kubernetes (ASK), which provides Kubernetes cluster capabilities without O&M. All underlying Pod resources are carried by ECI.
  • (Recommended) Deploy services through Alibaba Cloud Container Service for Kubernetes (ACK), where ECI provides additional large-scale elasticity for ACK clusters.
  • Use virtual nodes to connect to Kubernetes clusters that users have built themselves on ECS, to provide convenient and flexible computing resources.
  • Use virtual nodes to connect to Kubernetes clusters built in on-premises data centers (IDC), to provide virtually unlimited elastic computing capacity on the cloud.
  • Use OpenAPI to connect to business systems directly, so that ECI service containers can be created or released at any time at low cost.

ECI fast startup: 3000 container instances in 6 seconds

At the 2021 Apsara Conference, Alibaba Cloud's serverless container service, Elastic Container Instance, released a new fast-startup feature. ECI addresses the application deployment problems described above and adds fast instance startup as a product feature. In the live demonstration, 3000 ECIs were started within 6 seconds and all of them entered the Running state.

How did Alibaba Cloud start 3000 container instances in 6 seconds?

On the one hand, machine learning is applied to a large volume of historical, per-user creation data to find patterns in how users create Pods. Predictive pre-scheduling, resource reuse, and similar techniques then cut the time ECI spends on scheduling and creation. At the same time, ECI uses Alibaba Cloud's Kangaroo sandbox container as its engine, supplemented by an overlay network and storage scheme, which compresses the cold start time of a single ECI instance to less than 3 seconds. A dedicated article will cover the Kangaroo engine in detail, so stay tuned.

On the other hand, in the image-pull dimension, image caching turns the container image into a snapshot, so a Pod no longer has to pull the image every time it starts. For example, some images used by the AI team at Alibaba Cloud's DAMO Academy reach hundreds of GB. Pulling such an image the traditional way takes on the order of ten minutes, whereas with the ECI image caching scheme the Pod can start within seconds.

Looking to the future

Alibaba Cloud Elastic Container Instance provides a fully managed, O&M-free service that covers the runtime, the guest OS, and the underlying compute, network, and storage resources. At the 2021 Apsara Conference it also released fast instance startup to help customers quickly scale their business systems out and in.

As the service boundary of cloud vendors keeps moving up the stack, ECI aims to offer better elasticity, performance, and cost than customer-built container resource pools, through large-scale, intensive resource scheduling and end-to-end runtime design. This is the direction Alibaba Cloud Elastic Container Instance will continue to explore over the next one to two years.

This article is original content from Alibaba Cloud and may not be reproduced without permission.