Author | Li Peng (Yuan Yi)    Source | Serverless official WeChat account

I. Why Knative is needed

At present, K8s has become the mainstream operating system of the cloud-native world. K8s exposes infrastructure capabilities such as Service, Ingress, Pod, and Deployment through data abstractions, and these capabilities are surfaced to users through the native K8s API. K8s also defines standard interfaces for infrastructure plug-ins, such as CNI, CRI, and CRD, so that cloud resources can enter the K8s system in a standardized way.

K8s acts as the bridge between the infrastructure below it and the applications above it. Cloud-native users adopt K8s to deliver and manage applications, including gray releases and scaling out and in. However, implementing these capabilities by manipulating the K8s API directly is fairly complicated for users, and saving resource costs and gaining elasticity are also increasingly important to them.

So how can we use K8s technology simply, consume resources on demand, and ultimately reduce costs while increasing efficiency? The answer is Knative.

II. Introduction to Knative

1. What is Knative

  • Definition

Knative is a Kubernetes-based Serverless orchestration engine. An important goal of Knative is to establish a cloud-native, cross-platform orchestration standard, which it pursues by integrating container builds, workloads, and event-driven capabilities.

The Knative community currently has a strong roster of contributors, including Google, Pivotal, IBM, and Red Hat; PaaS platforms such as Cloud Foundry and OpenShift are also actively participating in building Knative.

  • Core modules

Knative consists of two core modules: Eventing, an event-driven framework, and Serving, which manages workloads. This article focuses on Serving.

2. Traffic-based gray release

Take a simple scenario:

  • Traffic-based gray release in K8s

To implement a traffic-based gray release in K8s, you need to create the corresponding Service and Deployment, handle elasticity with an HPA, and then create the same set of resources again for the new version when the release is rolled out.

For example, suppose the current version is v1. To run a gray release we create a new version, v2, together with its own Service, Deployment, and HPA. Once v2 is up, we set the desired traffic ratio on the Ingress, finally achieving a traffic-based gray release.
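As a rough sketch of what this involves, the two Ingress objects below split traffic between the versions, assuming the NGINX ingress controller and its canary annotations (the host, service names, and the 10% weight are all illustrative; each version's Deployment and HPA are omitted):

```yaml
# Stable Ingress: routes to the v1 Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-v1
spec:
  rules:
  - host: demo.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-v1
            port:
              number: 80
---
# Canary Ingress: sends 10% of the traffic for the same host to the v2 Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-v2
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  rules:
  - host: demo.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-v2
            port:
              number: 80
```

On top of this, every version still needs its own Service, Deployment, and HPA, which is exactly the operational overhead described above.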

  • Traffic-based gray release in Knative

As shown in the figure above, to achieve a traffic-based gray release in Knative, you only need to create a Knative Service and then release against different versions, represented here by Revision1 and Revision2; auto-elasticity is already built into each version. Comparing the two simple illustrations above, the number of resources that must be operated on directly is significantly smaller in Knative.
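For reference, a minimal Knative Service is sketched below (the name and image are illustrative). Knative derives the route, the revision, and the autoscaling configuration from this single resource:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: demo
spec:
  template:
    spec:
      containers:
      # Illustrative image; every update to this template stamps out a new Revision.
      - image: registry.example.com/demo:v1
```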

3. Knative Serving

  • Service

Service is the Serverless orchestration abstraction, and the application lifecycle is managed through the Service. A Service manages two main parts: Route and Configuration.

  • Route

Route represents the routing policy: requests are routed to Revisions, and different percentages of traffic can be forwarded to different Revisions.

  • Configuration

Configuration represents the resource definition, i.e. the currently desired state. Each update to the Service updates the Configuration.

  • Revision

Each update to the Configuration generates a snapshot called a Revision. Revisions enable multi-version management and gray releases.

We can understand it as follows: Knative Service ≈ Ingress + Service + Deployment + elasticity (HPA).
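Putting these pieces together, a gray release between two Revisions is expressed directly in the Service's traffic section. A minimal sketch (revision names and percentages are illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: demo
spec:
  template:
    spec:
      containers:
      # Updating the image creates a new Revision (here, demo-00002).
      - image: registry.example.com/demo:v2
  traffic:
  # Route 90% of requests to the old Revision and 10% to the new one.
  - revisionName: demo-00001
    percent: 90
  - revisionName: demo-00002
    percent: 10
```

Shifting the percentages step by step completes the gray release without touching any Ingress, Service, or Deployment objects directly.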

4. Rich elasticity policies

Of course, a Serverless framework depends on elasticity, and Knative offers the following rich set of elasticity policies (a sketch of the KPA annotations follows the list):

  • Automatic scaling based on traffic requests: KPA;
  • Automatic scaling based on CPU and memory: HPA;
  • Scheduled scaling combined with HPA;
  • Event gateway (precise elasticity based on traffic requests).
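The KPA policy is configured through annotations on the Service's revision template. Below is a minimal sketch (the concurrency target of 10 and the scale bound of 20 are illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: demo
spec:
  template:
    metadata:
      annotations:
        # Use the Knative Pod Autoscaler, scaling on request concurrency.
        autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
        autoscaling.knative.dev/metric: concurrency
        # Target roughly 10 in-flight requests per pod.
        autoscaling.knative.dev/target: "10"
        # Cap the number of replicas.
        autoscaling.knative.dev/maxScale: "20"
    spec:
      containers:
      - image: registry.example.com/demo:v1
```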

III. Integration of Knative and ASK

1. ASK: Serverless Kubernetes

If ECI resources had to be prepared in advance, capacity planning would be required, which defeats the purpose of Serverless. To free users from this constraint, so that no resource planning is needed ahead of time, Alibaba Cloud proposed Serverless Kubernetes: ASK. Users can deploy container applications without purchasing nodes and without maintaining or planning node capacity. ASK provides K8s compatibility while dramatically lowering the barrier to entry, allowing users to focus on the application rather than the underlying infrastructure.

ASK provides the following capabilities:

  • O&M-free

Out of the box: no node management, no node O&M, no node security maintenance, no NotReady nodes; K8s cluster management is greatly simplified.

  • Extreme elasticity

No capacity planning, scaling in seconds: 500 Pods in 30s.

  • Low cost

Pods are created on demand, with support for Spot instances and reserved instance coupons.

  • K8s-compatible

Supports Deployment/StatefulSet/Job/Service/Ingress/CRD, etc.

  • Storage mounting

Cloud disks, NAS, and OSS can be mounted.

  • Knative on ASK

Traffic-based auto-elasticity, out of the box, with scale-down to a minimum-spec instance.

  • Elastic Workload

Supports mixed scheduling of pay-as-you-go and Spot ECI instances.

  • Integration with cloud products such as ARMS/SLS

2. The complexity of Knative O&M

There are three main pain points in operating Knative: the gateway, the Knative control-plane components, and cold starts.

As shown in the figure above, the Knative control plane involves several components: the Activator, responsible for scaling from 0 to 1; the Autoscaler, responsible for scaling decisions; the Controller, Knative's own control component; and the gateway. If the O&M of these components is left at the user level, it undoubtedly adds burden, and the components themselves also consume resources and incur cost.

In addition, the 0-to-1 cold start problem must also be considered. When a request arrives for an application that has been scaled to zero, it takes some time for the first instance to start. If the request cannot be answered within that window, it times out; this is the cold start problem.

ASK can solve all of the problems above. So what exactly does ASK do?

3. Integrating the gateway with SLB

Previously, the gateway capability was provided by Istio, which meant operating and managing the Istio-related components ourselves and undoubtedly increased management and control costs. In fact, in most scenarios we only care about the gateway capability; other Istio features, such as the service mesh itself, are not needed.

In ASK, we replace the gateway layer with SLB:

  • Lower cost: more than a dozen components are removed, greatly reducing O&M and IaaS costs;
  • More stable: SLB is a cloud product whose service is more stable, more reliable, and easier to use.

4. Managed control components

For the Knative control components, ASK provides hosting:

  • Out of the box: users get the Serverless framework directly, without installing anything themselves;
  • O&M-free and low cost: the Knative components are integrated with the managed K8s cluster, so users carry no O&M burden and pay no extra resource costs;
  • High manageability: all components are deployed on the managed control side, making upgrades and iteration easier.

5. Graceful reserved instances

On the ASK platform, we provide gracefully reserved instances to eliminate cold starts. By keeping an instance around, the 0-to-1 cold start time disappears: when scaling down to 0, we do not actually remove the last instance, but instead swap it for a low-spec reserved instance to reduce cost (see the sketch after the list below).

  • No cold start: the reserved instance eliminates the roughly 30-second 0-to-1 cold start time;
  • Cost control: burstable-performance instances cost 40% less than standard-spec instances, and costs can be reduced further when combined with Spot instances.
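From the user's perspective this is still expressed with the standard Knative scale bounds; on ASK, the "zero" state is backed by the reserved instance described above. A minimal sketch (annotation values are illustrative):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: demo
spec:
  template:
    metadata:
      annotations:
        # Allow scale-to-zero; on ASK this state is served by a
        # low-spec reserved instance instead of a true cold stop.
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
    spec:
      containers:
      - image: registry.example.com/demo:v1
```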

IV. Hands-on demonstration

Finally, a hands-on demonstration, using a café as the example. The main steps of the demonstration are as follows (a sketch of the resulting service status follows the list):

  • Install Knative in an ASK cluster;
  • Deploy the coffee service;
  • Access the coffee service;
  • Scale down to the reserved instance.
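After the coffee service is deployed, its status reports the access URL and the live traffic assignment; accessing the service amounts to sending requests to that URL, which resolves to the SLB gateway address described in section III. A hedged sketch of what `kubectl get ksvc coffee -o yaml` might report (all values are illustrative):

```yaml
# Status stanza of the deployed Knative Service (illustrative values).
status:
  url: http://coffee.default.example.com   # resolves to the SLB gateway address
  latestReadyRevisionName: coffee-00001
  traffic:
  - revisionName: coffee-00001
    percent: 100
```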

Watch the demonstration at: developer.aliyun.com/live/246126

About the author: Li Peng (alias: Yuan Yi) is a senior development engineer on the Alibaba Cloud container platform. He joined Alibaba in 2016, was deeply involved in Alibaba's company-wide containerization, and supported the Double Eleven containerization effort for many years. He focuses on cloud-native areas such as containers, Kubernetes, Service Mesh, and Serverless, and is committed to building a new generation of Serverless platforms. He is currently responsible for Knative-related work in Alibaba Cloud Container Service.