Author | Juven Xu, Senior Technical Expert at Alibaba Cloud, currently responsible for building Alibaba's Serverless R&D and operations platform; author of the book Maven in Action and a former maintainer of the Maven Central repository.

In this article, the author, who leads Alibaba Group's Serverless R&D and operations platform, analyzes from the perspective of application architecture why Serverless fascinates so many people and what its core concepts are, and summarizes some of the problems Serverless will inevitably face.

Preface

In my article The Sound and Fury of Serverless, I used a metaphor for the state of Serverless in the industry today, which goes something like this:

Serverless is like teenage sex: Everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.

Although half a year has passed since I wrote that article, the situation has not changed much in my view. Many front-line developers and managers have a very one-sided understanding of Serverless technology, and some of it is simply wrong. Without an understanding of how application architecture has evolved and what cloud infrastructure can offer, and without sound judgment of the risks, blindly embracing new technology may not only fail to deliver business value and waste effort, but also introduce unnecessary technical risk.

This article tries to analyze, from the perspective of application architecture, why Serverless fascinates so many people and what its core concepts are, and summarizes, from my own hands-on experience, some problems that Serverless will inevitably face.

Application architecture evolution

To better understand Serverless, let’s review how application architectures have evolved. More than a decade ago, the mainstream application architecture was the monolith, deployed as a single server plus a database. Under this architecture, operations staff carefully maintained that server to ensure the availability of the service. As the business grew, this simplest of architectures soon faced two problems. First, there was only one server: if it failed, for example through hardware damage, the entire service became unavailable. Second, as business volume grew, one server’s resources would soon be unable to carry all the traffic. The most direct way to solve both problems is to add a load balancer at the traffic entrance, so that the monolith is deployed on multiple servers at once and can scale horizontally.

As the business grew further, more developers joined the team to build features on the monolith. Because the code inside a monolith has no clear physical boundaries, developers soon ran into conflicts of all kinds, requiring manual coordination and a large number of merge operations, and R&D efficiency dropped sharply. At this point, teams began to split the monolith into microservices that could be developed, tested, and deployed independently, with services communicating through APIs over protocols such as HTTP, gRPC, or Dubbo. Decomposing services along the Bounded Contexts of domain-driven design can greatly improve the R&D efficiency of medium and large teams. If you want to learn more about Bounded Contexts, I recommend reading books on domain-driven design.

As applications evolve from monolithic to microservice architectures and distribution becomes the physical default, application architects face new challenges brought by distribution. In this process, we start to adopt distributed services and frameworks: the Redis cache, the ACM configuration service, ZooKeeper for state coordination, Kafka for messaging, communication frameworks such as gRPC or Dubbo, distributed tracing systems, and more than can be listed here. Beyond the challenges of the distributed environment itself, the microservice architecture also creates new work for operations. Where a developer used to operate a single application, they may now be responsible for ten or more, which means the workload of security patching, capacity evaluation, fault diagnosis, and similar tasks grows exponentially. At this point, standards for application distribution, application lifecycle, and observability, together with automatic elasticity, become critically important.

Now let’s talk about the term “cloud native.” A simple way to judge whether an architecture is cloud native is to ask whether it has grown up on the cloud. “Growing up on the cloud” does not simply mean using cloud IaaS-layer services, such as basic compute and storage like ECS and OSS; it means using the cloud’s distributed services, such as Redis and Kafka, which directly shape the business architecture. As mentioned earlier, distributed services are a necessity under a microservice architecture. In the past, teams developed such services themselves or operated open-source versions of them; in the cloud-native era, businesses use these services directly from the cloud.

Two other technologies have to be mentioned: Docker and Kubernetes. The former standardized application distribution: whether it is a Spring Boot application or a Node.js application, it ships as a container image. Building on that, the latter defines a standard application lifecycle: from startup, to going online, to health checks, to going offline, an application follows one uniform standard. With standards for application distribution and lifecycle, the cloud can provide standardized application hosting services, covering application versioning, release, post-release observation, self-healing, and so on. For a stateless application, for example, the failure of an underlying physical node does not affect R&D at all: based on the standardized application lifecycle, the hosting service can automatically move the workload, taking the containers on the failed physical node out of service and starting an equal number of containers on new physical nodes. Here we see cloud native yielding further value dividends.

On this basis, because the application hosting service can sense runtime data, such as request concurrency, CPU load, and memory usage, the business can configure scaling rules on these metrics and let the platform execute them, increasing or decreasing the number of containers to match actual traffic. This is automatic elasticity (auto scaling). It spares users from holding idle resources during off-peak periods, saving cost and improving operational efficiency.
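To make the scaling rule concrete, here is a minimal sketch of the proportional formula that metric-driven autoscalers are typically built around (the Kubernetes Horizontal Pod Autoscaler documents essentially this calculation). Real platforms layer tolerance bands, stabilization windows, and rate limits on top, so treat this as an illustration, not an implementation:

```java
// A minimal sketch of proportional auto-scaling logic, modeled on the
// formula documented for the Kubernetes Horizontal Pod Autoscaler:
//   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
// Illustrative only; real platforms add stabilization windows, tolerance
// bands around the target, and rate limits on scaling actions.
public final class AutoScaler {
    private final int minReplicas;
    private final int maxReplicas;

    public AutoScaler(int minReplicas, int maxReplicas) {
        this.minReplicas = minReplicas;
        this.maxReplicas = maxReplicas;
    }

    /** e.g. currentMetric = observed average CPU load, targetMetric = desired CPU load. */
    public int desiredReplicas(int currentReplicas, double currentMetric, double targetMetric) {
        int desired = (int) Math.ceil(currentReplicas * (currentMetric / targetMetric));
        // Clamp to the bounds the business configured.
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }
}
```

For example, with 10 containers averaging 80% CPU against a 50% target, the rule asks for ceil(10 × 0.8 / 0.5) = 16 containers; when traffic subsides, the same formula shrinks the fleet again.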

Throughout this architectural evolution, developers and operators gradually shift their attention away from machines, hoping that the machines are increasingly managed by platforms rather than by people. That, in a nutshell, is a simple way to understand Serverless.

The core concepts of Serverless

As we all know, although it is called Serverless, the server cannot really disappear; the “less” in Serverless more precisely means that developers do not need to care about it. This is similar to modern programming languages such as Java and Python: developers no longer allocate and free memory by hand, yet the memory is still there, just managed by the garbage collector. Calling a platform that manages servers for you “Serverless” is like calling Java and Python “Memoryless” languages.

Looking at today’s cloud era, Serverless should not be narrowly understood as merely not caring about servers. Beyond the basic compute, network, and storage resources a server provides, cloud resources also include many kinds of higher-level resources, such as databases, caches, and message queues.

In February 2019, UC Berkeley published a paper entitled Cloud Programming Simplified: A Berkeley View on Serverless Computing, which offers a sharp analogy:

In the context of the cloud, Serverful computing is like programming in low-level assembly language, while Serverless computing is like programming in a high-level language such as Python. For a simple expression like c = a + b, written in assembly you must first pick registers, load the values into them, perform the addition, and then store the result. This is exactly how Serverful cloud computing works today: developers first allocate or locate available resources, then load code and data, perform the computation, store the results, and finally manage the release of the resources.

Serverful computing, as the paper calls it, is the mainstream way we use the cloud today, but it should not be the way we use the cloud in the future. I think the vision of Serverless should be “Write locally, compile to the cloud”: code cares only about business logic, while tools and the cloud manage resources. Now that we have a general but abstract picture of Serverless, let me elaborate on the main characteristics of a Serverless platform.

First: don’t worry about servers

Managing one or two servers may not be much of a hassle; managing thousands or even tens of thousands certainly is. Any server can fail, and a Serverless platform must be able to detect failures automatically and replace the problematic instances. Operating-system security patches must be applied automatically without interrupting the service; logging and monitoring must be wired in by default; system security policies must be configured automatically to keep risk out; and when resources run low, the platform must automatically allocate more and install the relevant code and configuration, and so on.
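To make “automatically identify failures and replace problematic instances” concrete, here is a toy sketch of the detect-and-replace loop a platform runs on the user’s behalf; the Instance and Platform types are hypothetical stand-ins, not any real platform’s API:

```java
import java.util.List;

// A toy self-healing loop: probe each instance and replace the unhealthy
// ones. Instance and Platform are hypothetical stand-ins for whatever a
// real Serverless platform uses internally.
interface Instance {
    boolean isHealthy();   // e.g. backed by an HTTP health-check probe
}

interface Platform {
    List<Instance> listInstances(String app);
    void replace(Instance broken);  // start a fresh instance, then retire the broken one
}

public final class SelfHealer {
    private final Platform platform;

    public SelfHealer(Platform platform) {
        this.platform = platform;
    }

    /** One reconciliation pass; a real platform runs this continuously. */
    public void reconcile(String app) {
        for (Instance instance : platform.listInstances(app)) {
            if (!instance.isHealthy()) {
                platform.replace(instance);
            }
        }
    }
}
```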

Second: automatic elasticity

Today’s Internet applications are designed to be scalable. When the business shows significant peaks and troughs, or has temporary capacity needs (such as a marketing campaign), the Serverless platform must scale out and in promptly and reliably. To achieve this, the platform needs very strong resource-scheduling capabilities and a keen, real-time awareness of application metrics such as load and concurrency.

Third: charge by actual resource usage

Serverful cloud resources are billed by occupancy, not by usage. If a user buys three ECS instances, they pay for all three regardless of how much CPU and memory they actually use. In Serverless mode, users are charged for the resources actually consumed. For example, if serving a request actually uses a 1-core, 2 GB resource for 100 ms, the user pays only the unit price of that resource multiplied by the time (100 ms). Similarly, a user of a Serverless database pays only for the resources actually consumed by queries, plus the resources used to store the data.
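As a back-of-the-envelope illustration of pay-per-use, the cost model is just unit price × resources × duration. The price below is invented for the example and reflects no vendor’s actual price list:

```java
// A toy pay-per-use calculation. The unit price is invented for
// illustration and does not reflect any vendor's real pricing.
public final class UsageBilling {
    // Hypothetical price: cost per (1 core + 2 GB) resource unit per second.
    static final double PRICE_PER_RESOURCE_SECOND = 0.0001;

    /** Cost of one request that held the resource for the given duration. */
    static double requestCost(double durationMillis) {
        return PRICE_PER_RESOURCE_SECOND * (durationMillis / 1000.0);
    }

    public static void main(String[] args) {
        // A 100 ms request costs unit price * 0.1 s; scale by request count
        // to estimate a bill, with no charge for idle time in between.
        System.out.printf("one request:  %.8f%n", requestCost(100));
        System.out.printf("1M requests:  %.4f%n", requestCost(100) * 1_000_000);
    }
}
```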

Fourth: less code, faster delivery

Code built on a Serverless architecture typically makes heavy use of backend services, moving data, state management, and so on out of the code; the more radical FaaS architectures hand the code’s runtime over to the platform as well. This means that, for the same application, there is much less code in Serverless mode than in Serverful mode, so both distribution and startup are faster. Serverless platforms also usually provide mature code building and release, version switching, and similar features that further speed up delivery.
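To show how little code a function can leave behind, here is a generic FaaS-style handler sketch. The FunctionHandler and Context interfaces are hypothetical, since each vendor defines its own signatures; the point is that everything outside the handle method belongs to the platform:

```java
// A generic FaaS handler sketch. FunctionHandler and Context are
// hypothetical interfaces standing in for a vendor's real ones: the
// platform owns the runtime, scaling, and wiring; the user writes only
// the business logic below.
interface Context {
    String requestId();
}

interface FunctionHandler<I, O> {
    O handle(I input, Context context);
}

public final class GreetFunction implements FunctionHandler<String, String> {
    @Override
    public String handle(String name, Context context) {
        // All infrastructure concerns live outside this method.
        return "Hello, " + name + " (request " + context.requestId() + ")";
    }
}
```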

The challenges of implementing Serverless

So much for the benefits of Serverless; implementing it at scale in mainstream scenarios is not easy. There are many challenges, which I analyze in detail below.

Challenge 1: Making the business lightweight is difficult

Achieving full automatic elasticity and paying only for resources actually used means the platform must be able to scale business instances out in seconds or even milliseconds. This is a challenge for the infrastructure and places high demands on the business, especially on large applications. If an application takes ten minutes to distribute and start, automatic elasticity cannot keep up with changes in business traffic. There are several ways to attack this problem. Microservices can break huge applications into smaller ones. FaaS goes further with a new application architecture that decomposes applications into finer-grained functions to make them lighter; the downside, of course, is that it requires a major overhaul of the business. For Java, the module system introduced in Java 9, as well as GraalVM’s Native Image technology, can help slim down applications and shorten startup time.
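As a tiny illustration of the Java 9 module route (the module and package names here are invented), declaring explicit module boundaries lets a tool such as jlink assemble a trimmed runtime image containing only what the application actually uses:

```java
// module-info.java — a minimal module declaration. With explicit module
// boundaries, jlink can build a custom runtime image that bundles only
// the listed modules, shrinking the deliverable and improving startup.
// Module and package names are invented for illustration.
module com.example.orders {
    requires java.net.http;          // only the JDK modules actually used
    exports com.example.orders.api;  // the module's public surface
}
```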

Challenge 2: Insufficient infrastructure responsiveness

Once instances of Serverless applications or functions can scale in seconds, or even milliseconds, the associated infrastructure quickly comes under tremendous strain. The most affected infrastructure is service discovery and log monitoring: where instances in a cluster used to change a few times per hour, they may now change several times per second. Moreover, if the responsiveness of these systems cannot keep up with instance churn, the whole experience suffers: for example, if container instances can be scaled out in 2 seconds but service discovery takes 10 seconds to finish synchronizing, the business gains nothing from the fast scale-out.

Challenge 3: The business process lifecycle is inconsistent with the container

The Serverless platform relies on a standardized application lifecycle to achieve fully automated container migration, application self-healing, and similar capabilities. In systems based on standard containers and Kubernetes, the lifecycle the platform can control is the container’s lifecycle. The business therefore needs to keep the lifecycle of its processes consistent with that of the container, including startup, shutdown, and compliance with readiness and liveness probes. In practice, although many services are containerized, the container often holds not only the main service process but also many auxiliary processes, which can leave the service process’s lifecycle out of sync with the container’s.
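As a sketch of what keeping the process lifecycle consistent with the container can look like in code, the process below exposes liveness and readiness endpoints and drains gracefully on the stop signal the container runtime sends. It uses only the JDK’s built-in com.sun.net.httpserver for brevity; the port and paths are arbitrary choices, not a platform requirement:

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicBoolean;

// A sketch of a container-friendly process lifecycle: expose liveness and
// readiness probes and drain gracefully on SIGTERM. Port and paths are
// arbitrary; a real service would integrate this with its own framework.
public final class LifecycleAwareApp {
    private static final AtomicBoolean ready = new AtomicBoolean(false);

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // Liveness: the process is up. Readiness: it may receive traffic.
        server.createContext("/healthz", ex -> respond(ex, 200, "ok"));
        server.createContext("/readyz", ex ->
                respond(ex, ready.get() ? 200 : 503, ready.get() ? "ready" : "starting"));

        // On SIGTERM (what the container runtime sends at shutdown):
        // stop advertising readiness, then drain in-flight requests.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            ready.set(false);
            server.stop(10);  // wait up to 10 s for in-flight exchanges
        }));

        server.start();
        ready.set(true);  // flip only after initialization completes
    }

    private static void respond(com.sun.net.httpserver.HttpExchange ex,
                                int code, String body) throws java.io.IOException {
        byte[] bytes = body.getBytes();
        ex.sendResponseHeaders(code, bytes.length);
        ex.getResponseBody().write(bytes);
        ex.close();
    }
}
```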

Challenge 4: Observability needs to improve

In Serverful mode, when something goes wrong in production, the server does not disappear, and users naturally want to log in, run Linux commands, search logs, inspect processes, and even dump memory to analyze the problem. In Serverless mode, we say users do not need to care about servers, which means servers are invisible by default. What happens when the system fails and the platform cannot heal itself? Users still need rich troubleshooting and diagnostic tools that can observe traffic, system metrics, dependent services, and every other relevant dimension, so that problems can be diagnosed quickly and precisely. As long as overall observability around the Serverless pattern falls short, users will not feel at ease.

Challenge 5: The R&D and operations mindset needs to change

Almost every developer’s first deployment targets a single server, or a single IP, and that habit is powerful. Today we still see many applications that hold state and whose instances cannot be replaced automatically; we also see plenty of change and deployment behavior tied to IPs, such as picking a specific machine for a beta release; and many release systems assume instances are not replaced during a rolling update, with the surrounding operations systems built on that assumption. As Serverless gradually lands, R&D needs to shift its thinking, get used to the idea that “the IP may change at any time,” and operate its systems from the perspective of service versions and traffic instead.

Summary

Let’s go back to the great metaphor in Cloud Programming Simplified: A Berkeley View on Serverless Computing: today we use the cloud as if we were writing code in assembly language. I believe this will change over time; ideally, 100 percent of what users deliver to the platform for deployment should be code describing the business. While we are far from that today, many technologies, such as Service Mesh, Dapr (dapr.io), and Cloudstate (cloudstate.io), are taking logic that is business-neutral but essential to distributed architectures out of the business runtime and moving it onto the platform. This trend has become clear and strong over the past year, and Bilgin Ibryam summarizes it well in Multi-Runtime Microservices Architecture, which I recommend reading.

This article has shown that the evolution toward Serverless places new requirements on application architecture, continuous delivery, service governance, and operations monitoring. Beyond that, Serverless also demands faster responsiveness from lower-level technical facilities such as compute, storage, and networking. It is truly a radical technology evolution spanning application, platform, and infrastructure, and I am very excited to be part of it.

To let more developers enjoy the dividends of Serverless, we have gathered more than ten Serverless technical experts from Alibaba to create an open course on Serverless tailored for developers: learn it, use it right away, and embrace the new paradigm of cloud computing with ease.

Click here for the free course: developer.aliyun.com/learning/ro…

“Alibaba Cloud Native focuses on microservices, Serverless, containers, Service Mesh, and other technical fields, follows popular cloud-native technology trends and large-scale cloud-native implementation practices, and aims to be the public account that best understands cloud-native developers.”