Since AWS launched the Lambda service in 2014, the term "Serverless" has grown increasingly popular and has come to denote a new style of software design: the Serverless architecture. As an architecture born on the public cloud, what are its advantages and disadvantages? Can it be applied to traditional enterprise applications? Is it suitable for private cloud scenarios? Will it be the force that reshapes cloud computing, as many articles claim? As a veteran of the cloud computing industry, the author would like to share some views in this article.

What is Serverless?

Serverless is not a mystery and can be illustrated with a simple example. Suppose we design an AI application that identifies the race of the people in a picture, deploy it on the public cloud as a SaaS service, and offer it to customers. A typical back-end architecture looks like this:

In this architecture, a Tomcat web server runs on a cloud host we purchased, hosting an AI application written in Java. Users upload images through an API. Because the cloud host's local storage is limited, the AI application implements a storage gateway that forwards images to the public cloud's object storage, so that many customers can upload images at the same time. After an image is imported, the AI application reads it from object storage, performs recognition, and saves the result to a database in the public cloud (for example, RDS). Users can query results through the API. After the service had been online for a while, it became popular, and more and more companies started using it. Statistics showed that most companies upload pictures in bursts between 9 a.m. and 11 a.m. and between 2 p.m. and 5 p.m. To absorb the traffic spikes during these periods, we configured the public cloud's auto-scaling policy to dynamically create more cloud hosts when traffic increases. The architecture of the AI application evolved into:

In this architecture, we need to do the following:

  1. Manage cloud hosts. We care about system-level details such as the number of CPUs, memory size, and IP addresses, as well as the cloud host's operating system, and we must formulate a strategy for deploying the AI application. Security patches for the operating system and Tomcat cannot be ignored, or competitors may hire hackers to attack our systems.
  2. Configure the public cloud's auto-scaling policy to deal with unexpected traffic at peak times.
  3. Use the public cloud's object storage and databases.
  4. Write the AI application. Beyond developing the application itself, we must also run the supporting business (e.g., manage the cloud host lifecycle and the operating system). This is the reality of the current architecture: running 80% supporting business for 20% core business.
The following rewrites the AI application with a Serverless architecture:

  • Multiple processes running the AI application's code are started to process user-uploaded images concurrently.

In the Serverless version of the AI application, there are only two things we need to do:

  1. Use object storage and databases in the public cloud.

  2. Write the AI application with the public cloud's Serverless framework.

Compared with the previous architecture, we no longer operate cloud hosts, operating systems, or Tomcat, and there is no need to configure auto-scaling groups. The public cloud's Serverless framework starts a process to run the AI application after each image upload, which achieves horizontal scaling automatically. We finally care only about the core business, writing the AI application in a language supported by the Serverless framework (for example, AWS Lambda supports languages such as Java and Python) and outsourcing all non-core business to the public cloud operator.

Our Serverless AI application uses two technologies: the object storage and database services provided by the public cloud, known as Backend as a Service (BaaS), and the Lambda framework, known as Functions as a Service (FaaS).

BaaS and FaaS are the two defining characteristics of a Serverless application; an application that exhibits both can be called Serverless.

BaaS, not PaaS

The AI application uses object storage and databases, and may use message queues in the future. Intuitively this feels like PaaS, so why coin a new term, BaaS? The tech world already has too many confusing terms.

BaaS is not PaaS. The difference is that PaaS participates in application lifecycle management, while BaaS only provides third-party services that the application depends on. A typical PaaS platform must offer developers a way to deploy and configure applications, such as automatically deploying them into a Tomcat container and managing their lifecycle. BaaS includes none of this; it only exposes the back-end services that applications depend on, such as databases and object storage, in the form of APIs. BaaS can be provided by public cloud providers or by third-party vendors. For example, Parse, acquired by Facebook, was a famous MBaaS (Mobile Backend as a Service) provider. Functionally, BaaS can be viewed as a subset of PaaS: the part that provides third-party dependent components.

FaaS is the heart of Serverless

The AI application starts out as a typical Java application. It might use a technology like Spring, because we need a framework to ensure that the application's components are loaded correctly and an MVC layer to ensure the REST API is handled correctly by controllers. The AI application is deployed in a Tomcat container and runs on cloud hosts 24/7 to provide uninterrupted service. From 12 a.m. to 8 a.m. it is hardly used at all, but we have to leave it running lest the occasional late-night user get a 503 error and conclude the service is unstable. We pay for the cloud hosts we buy even though their CPU usage is close to zero half the time; no public cloud bills by CPU usage, so we pay for time when no work is done. We must also pay close attention to the auto-scaling configuration. Tuning the scaling strategy accurately is a technical task that requires long accumulated experience, so in the early stage we had to deploy extra idle cloud hosts to ensure the service would not be congested by a mis-configured auto-scaling policy.

After rewriting the AI application with the Serverless architecture, all of this pain disappears. With Spring and Tomcat removed, the Lambda Java SDK only requires us to implement a function handler for the image-upload event, which is as simple as writing a callback. The handler invokes the image-recognition logic and calls the database service's REST API to store the result. We no longer build MVC or configure Tomcat XML files, and the storage gateway is gone entirely, because users upload images directly to the object store.
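As a rough sketch of what such a handler looks like (shown in Python for brevity rather than the Java SDK described above; the event shape and helper names are illustrative, not any provider's real API):

```python
# Sketch of a FaaS handler triggered by an "image uploaded" event.
# The event fields and recognize() are hypothetical stand-ins for the
# real object-storage event payload and the AI recognition logic.

def recognize(image_bytes):
    """Stand-in for the image-recognition logic."""
    return {"label": "person", "size": len(image_bytes)}

def handler(event, context=None):
    """Invoked once per uploaded image; the process exits afterwards."""
    image_bytes = event["image"]   # in reality: fetched from object storage
    result = recognize(image_bytes)
    # in reality: persist the result via the database service's API
    return result
```

There is no framework bootstrap, no servlet container, and no long-running server loop: the platform delivers the event, the handler returns, and the process is torn down.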

The AI app no longer runs 24/7; without a user uploading an image, it is just a piece of compiled code. When a user uploads an image, FaaS starts a new process to execute the AI application's code, and the process is destroyed once execution completes. We pay only for the few seconds the code runs, saving a lot of money.

Finally we don’t need to worry about auto-scaling, as FaaS will automatically scale when needed.

This is the core of FaaS, and its characteristics can be summarized from the example above:

  1. FaaS runs back-end code rather than an entire back-end program. The AI application, for example, contains only the logic to handle the image-upload-completed event; it is not a complete back-end application but a piece of back-end code.

  2. Code is triggered by events. Because there is no longer a long-running process waiting for or polling user requests, the code can only be triggered by specific events defined by the FaaS framework, such as a file being uploaded to object storage, a new message arriving on a message queue, or a new request arriving at the API Gateway.

  3. The code's life cycle is short. In our AI application, for example, no resident process exists outside the span from when the function handler is invoked upon receiving an event to when the call returns. In addition, the public cloud provider limits execution time; once the limit is exceeded, the process is forcibly destroyed. AWS Lambda, for example, has a maximum execution time of 5 minutes.

  4. The code must be completely stateless, with no in-memory state shared between calls. Our AI application originally used a global variable to count the number of images processed, incrementing it by one for each image. With FaaS we can no longer share data between calls via global variables or in-memory data structures (such as a HashMap), because each invocation runs in a separate process and cannot access another's memory address space. So we reworked the code to keep the global counter in the public cloud's Redis service, which added extra complexity to the code.

  5. Horizontal scaling is no longer a concern: FaaS runs a new copy of the code for each event and request.

  6. Application deployment shifts from uploading and configuring an entire application to uploading a code package (such as a Jar or Zip file).
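The stateless-counter rework mentioned in point 4 can be sketched as follows. This is illustrative Python: the `FakeRedis` class is an in-memory stand-in used here only so the sketch runs without a server; in production the client would be a real Redis connection, whose `INCR` command is atomic on the server, so concurrent handler processes cannot race on the counter.

```python
# Sketch: moving a cross-invocation counter out of process memory.

def count_processed_image(client, key="images:processed"):
    """Atomically increment the shared counter and return the new total.

    Called from inside the handler after each image is recognized; the
    state lives in Redis, not in the short-lived handler process.
    """
    return client.incr(key)

class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (illustration only)."""
    def __init__(self):
        self.store = {}
    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]
```

With a real client (e.g. one from the `redis` Python package) the function body is unchanged, which is the point: the handler stays stateless while the counter survives across invocations.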

What does Serverless bring us

Compared with the traditional architecture, the AI application rewritten for Serverless has significant advantages. We no longer operate and maintain any cloud hosts or operating systems, or even web containers like Tomcat. We just focus on the code itself, and all configuration and application lifecycle management is handled by the FaaS framework. The emergence of the public cloud freed us from physical hardware management; the Serverless architecture frees us further from operating system management and lets us really focus on the core business for the first time.

The business has also become more agile. We only need to write code relevant to the core business, such as the image-recognition part of the AI application. There is no need to write any code to load, deploy, or configure the application, such as systemd units to start it at boot.

Horizontal scaling is also not a problem. As mentioned repeatedly, the FaaS framework starts a new process to execute code for each event and each API request. This resembles a traditional application's thread pool, where each request runs in a separate thread; the difference is that threads share the same memory address space, whereas FaaS processes share no memory. Just as thread pools cap the number of threads, FaaS frameworks often limit the number of processes. For example, the default maximum number of concurrent executions for AWS Lambda in a Region is 600, which means our AI application can run in at most 600 processes at the same time.

Finally, and most importantly, the Serverless architecture saves us a lot of money. We pay only for the time the AI application runs, not for the time it waits for requests. The granularity of horizontal scaling is refined from the cloud host down to the process, which saves further costs and eliminates the need to buy idle cloud hosts to offset inaccurate auto-scaling configuration. The increased agility of the business also reduces operating costs: we no longer need operations personnel proficient in operating system configuration and management, which saves labor costs and shortens the time from development to launch.
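The savings can be made concrete with a back-of-envelope calculation. Every price and volume below is an assumption for illustration, not any provider's real rate card:

```python
# Back-of-envelope monthly cost comparison (all numbers are assumptions).

HOURS_PER_MONTH = 730

def always_on_cost(hourly_rate):
    """Monthly cost of a cloud host billed whether or not it does work."""
    return hourly_rate * HOURS_PER_MONTH

def faas_cost(invocations, seconds_per_call, rate_per_second):
    """Monthly cost when billed only for actual execution time."""
    return invocations * seconds_per_call * rate_per_second

host = always_on_cost(0.05)             # assumed $0.05/hour cloud host
faas = faas_cost(100_000, 3, 0.00002)   # assumed 100k uploads/month, 3 s each
# host is roughly $36.50/month; faas roughly $6.00 for the same workload
```

The exact numbers do not matter; the structural point does: the always-on host bills for 730 hours regardless of load, while the FaaS bill is proportional to the seconds of actual recognition work.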

Serverless is not a silver bullet; it is the future of small back-end programs

The Serverless architecture's advantages in some scenarios are so obvious that some proponents are already hyping it as a disruptive new architecture for cloud computing. As has always been the case in tech circles, some people are forever looking for a cure-all, the silver bullet that solves every problem. But all design is about trade-offs: Serverless is no silver bullet either, and its unique advantages come with inevitable limitations.

Starting a brand-new process to run code for each event or request is the heart of FaaS, and process startup latency is the first issue facing Serverless. Depending on the language the application is written in, startup delay can range from around 10 milliseconds (a simple Python application) to as long as a minute (a complex Java application). Such delays are unacceptable for real-time programs. Moreover, Serverless applications usually run in the public cloud's multi-tenant environment, where startup delay is also affected by system load, so it is difficult to guarantee that an application starts within a specified time. Public cloud providers currently offer no corresponding SLA for Serverless; AWS Lambda had no such SLA clause at the time of writing.

Serverless is also unsuitable for high-concurrency applications, where starting one process per request is too expensive. For example, Alipay processed 85,900 transactions per second at the peak of Singles' Day. Under a Serverless architecture, 85,900 processes would be created and destroyed every second, an unaffordable cost.

Serverless applications cannot reside in memory, and their running time is limited. If your application cannot finish its work within a few minutes, Serverless is not an option: AWS Lambda, for example, caps a process at 5 minutes of execution, after which it is forcibly terminated. This creates programming challenges; our AI application had to be optimized to recognize complex images in under five minutes. Nor can we write applications that perform long I/O operations, such as complex encoding of a terabyte of data in an object store.

The inability to share state between Serverless calls makes writing complex programs much harder. Statelessness is a goal pursued by Internet applications, such as those following the twelve-factor methodology, but Serverless takes it further: there is no way to share in-memory state between calls, for example via a HashMap. The global counter that tallied processed images in our AI application was just a global variable in the traditional architecture; in the Serverless architecture it became a record in an in-memory database (Redis), and the cost of updates, atomicity guarantees, and other concerns made the code many times more complex. Such complete statelessness is a huge challenge even for large, cloud-native Internet applications, and for enterprise applications with hundreds of thousands or millions of lines of stateful code, a stateless Serverless transformation is almost impossible.

Skilled microservices architects know how to break a business into individual services, and classic books (Building Microservices: Designing Fine-Grained Systems, for example) guide that process. But even they find the Serverless architecture a headache: splitting a business into hundreds or thousands of functions, each running in a separate process with limited running time, is an enormous challenge. Whether such fine-grained splitting is even needed is the first question to answer, and some problems, such as distributed database transactions, can become unsolvable or prohibitively costly.

The above are inherent limitations of the Serverless architecture. They stem from its defining characteristics and are unlikely to be resolved by the passage of time or the improvement of technology. In addition, as a new technology, Serverless faces many practical shortcomings, such as difficult integration testing, vendor lock-in, weak debugging and monitoring, and version control, all of which can obstruct adoption.

Because of these limitations, the Serverless architecture will not be the architecture of choice for complex applications. Instead, it should be the future of small back-end programs.

Cloud applications involve a large number of small-program scenarios: recognizing a picture, encoding or decoding an audio or video file, returning a small piece of data to an IoT device, notifying customer-service staff by email when a customer submits a ticket, and so on. These event-driven small programs are relatively complex to implement in a traditional architecture, where you often run 80% supporting business for 20% core business. Serverless solves these problems neatly and can serve as a complementary architecture for complex applications: we can split out the stateless, event-triggered parts of the business into Serverless applications, making the whole architecture simpler and more efficient.

Serverless is also evolving. AWS recently introduced Step Functions, which attempts to address sharing state between calls; its effect remains to be seen.

Serverless is not a traditional PaaS

The boundary between Serverless and PaaS is fuzzy. Many people consider Serverless a kind of PaaS, and the author also tends to see it as a special form of PaaS.

Serverless is composed of BaaS and FaaS: BaaS provides the services the business depends on, while FaaS handles deployment and lifecycle management. In this sense, Serverless plays the same role as PaaS. The difference is that traditional PaaS manages the lifecycle at application granularity, whereas Serverless manages it at function granularity; applications in traditional PaaS are resident processes, whereas Serverless applications are destroyed after running. In addition, with traditional PaaS users still need to care about horizontal scaling, such as configuring auto-scaling groups, but Serverless has no such concern: horizontal scaling is a natural property of the architecture.

Serverless and microservices

Serverless is not directly related to microservices, but they share similarities: the need to split the business, an emphasis on statelessness, and agility. In many ways Serverless is more fine-grained and more demanding than microservices. Microservices split the business along service boundaries, while Serverless splits it along function boundaries; microservices may share in-memory state across invocations, while Serverless requires invocations to be completely stateless. In addition, Serverless relies on BaaS for third-party dependencies, while microservices are free to choose their dependency sources, such as a locally built traditional middleware stack (a local MySQL and message bus, for example).

Serverless and containers

Serverless and containers are apples and oranges: they are not at the same level. Serverless is a software design architecture, while the container is a host for software. Although no public details are available, we can assume that Serverless frameworks such as AWS Lambda use some container technology, as it would otherwise be hard to achieve language independence and millisecond startup. Some open-source projects implement FaaS with Docker, but I doubt a public Serverless framework like AWS Lambda uses Docker directly; it is more likely a lighter, smaller container technology. We might call it a nano-container.

Does Serverless make sense for private clouds?

It is too early for private clouds to move their business to a Serverless architecture. First, Serverless is a new architecture that evolved on the public cloud and suits small programs running there, whereas private clouds carry older, heavier traditional business that is hard to transform for Serverless. Second, Serverless relies on BaaS, and the cost of building, operating, and maintaining BaaS in a private cloud is not low, while using public BaaS is constrained by network bandwidth and latency, which can easily make the system unstable.

As enterprise applications move further to the cloud and open-source Serverless frameworks mature, Serverless could also be used for CI/CD in private cloud DevOps scenarios; much of the work undertaken by Jenkins could be replaced by Serverless. The FaaS framework would correspond to Jenkins itself, the uploaded code would correspond to the Bash scripts in a Jenkins job, and instead of the Jenkins API triggering a job, an event would trigger the code in FaaS.
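A minimal sketch of that idea, with hypothetical event fields and `echo` placeholders standing in for the job's real Bash steps:

```python
# Sketch: a Jenkins-style CI job rewritten as an event-triggered function.
# The event shape is hypothetical; the point is that the job's script
# becomes a short-lived handler fired by a code-push event.

import subprocess

def on_push(event, context=None):
    """Run the steps a Jenkins job's script would have run, then exit."""
    repo = event["repository"]
    steps = [
        ["echo", f"clone {repo}"],   # stand-in for: git clone ...
        ["echo", "run tests"],       # stand-in for: make test
    ]
    return [subprocess.run(s, capture_output=True, text=True).stdout.strip()
            for s in steps]
```

The same FaaS constraints discussed earlier apply: each build step run this way must be stateless and finish within the framework's execution time limit.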

Conclusion

Serverless, as a new architecture, is a natural result of the evolution of cloud computing, whose trend is toward finer-grained billing units, greater focus on core business, and outsourcing supporting work to infrastructure providers. The Serverless architecture makes it easy to write event-triggered small back-end programs, but it also has inherent limitations and is not suitable for complex application architectures. As things stand, a hybrid architecture that partially adopts Serverless is a good choice for public cloud applications, while Serverless adoption in private clouds is premature. Cloud computing technology is advancing rapidly, and the possibilities are endless.