Abstract: The goal and opportunity of Serverless computing is to benefit cloud programmers as much as they benefit from using high-level languages.

This article is shared from Huawei cloud community “Simplify cloud programming, Berkeley’s View on Serverless (translation)”, author: second-hand Lion.

The translator said:

One of the best ways to understand a technology is to read the related papers, such as the Spark paper and the Kafka paper, which greatly improved my own understanding. Since a single sentence can carry a huge amount of information, it is worth translating a paper thoroughly and reading it carefully. A few years ago I read the English version of the Spark paper, but I only skimmed it, never applied it, and have by now almost forgotten it. After all, my English is not that good, and always reading long articles in English is not an efficient way to learn; I might as well spend a few days translating a paper to deepen my understanding. I have also attached my own notes to help readers understand.

I translated this article in 2019, and I share it today in the hope that it will promote understanding of Serverless.

The original: www2.eecs.berkeley.edu/Pubs/TechRp…

Copyright © 2019, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.

1. Introduction

In 2009, to help explain the benefits of cloud computing, the paper "Berkeley's View on Cloud Computing" outlined six potential advantages:

1. The appearance of seemingly unlimited computing resources on demand.

2. The elimination of up-front commitments by cloud users.

3. The ability to pay for computing resources on a short-term basis, as needed.

4. Economies of scale that significantly reduce costs, thanks to many very large data centers.

5. Simplified operation and maintenance (O&M) and increased utilization through resource virtualization.

6. Higher hardware utilization by multiplexing workloads from different organizations.

These advantages have been widely realized over the past decade, but cloud users continue to bear a heavy burden of complex operations, and many workloads still do not benefit from efficient multiplexing. These shortcomings correspond mainly to the failure to realize the last two potential advantages.

Cloud computing frees users from managing their physical infrastructure, but leaves them with virtual resources to manage.

Multiplexing works well for batch workloads [T1], such as MapReduce or high-performance computing, where the instances' allocated resources can be kept fully utilized. It works poorly for stateful services [T2], such as enterprise software like databases ported to run in the cloud.

In 2009, there were two competing approaches to cloud computing virtualization, as described in the paper:

Amazon EC2 is one of these products. An EC2 instance looks much like physical hardware, and users can control nearly the entire software stack, from the kernel upwards. At the other end of the spectrum are platforms for domain-specific applications, such as GAE, which enforce an application structure with a clean separation between a stateless computation tier and a stateful storage tier. App Engine's impressive automatic scaling and high-availability mechanisms… depend on these constraints.

The market eventually embraced Amazon's low-level virtual machine approach to the cloud, and Google, Microsoft, and other cloud vendors offered similar interfaces. We believe the main reason for the success of low-level virtual machines was that early cloud users wanted to recreate in the cloud the same computing environment they had on premises, to simplify porting their workloads. That practical need, sensibly enough, took priority over writing new programs just for the cloud, especially when it was unclear how successful the cloud would be.

The downside of this choice is that developers have to manage the virtual machines themselves, essentially either becoming system administrators or working with them to set up the environment. Table 1 lists the issues that must be managed to operate an environment in the cloud. This long list of low-level virtual machine management responsibilities has inspired customers with simpler applications to seek an easier path to the cloud for new applications.

1. Provide redundancy for availability, so that the failure of a single machine does not interrupt the service.

2. Preserve geographically distributed redundant copies of the service in case of disaster.

3. Route requests efficiently across resources through load balancing.

4. Automatically scale the system in response to load changes.

5. Monitor services to make sure they are always running healthily.

6. Log events for debugging and performance tuning.

7. Upgrade the system, including applying security patches.

8. Migrate to new instances as they become available.

Table 1: Eight issues to be addressed in setting up an environment for cloud users. Some require many steps; for example, automatic scaling requires determining the need to scale, selecting the server type and size, requesting the servers, waiting for them to come online, configuring the application on them, confirming that no errors occurred, instrumenting them with monitoring tools, and then sending requests to test them.

For example, suppose an application wants to send images from a phone to the cloud, which should create thumbnails of the images and put them on the web. The amount of JavaScript code required to do this is negligible compared to the amount of server provisioning needed to set up an environment to run it.

Recognizing these needs, Amazon launched a new option called AWS Lambda. Lambda offers cloud functions, and it has drawn widespread attention to Serverless computing. Although the name is an oxymoron (you are still computing on a server), it stuck because it suggests that cloud users simply write code and leave all server provisioning and administration to the cloud provider. Cloud functions, packaged as FaaS offerings, represent the core of Serverless computing. Cloud platforms also provide specialized Serverless frameworks that cater to specific application requirements as BaaS offerings. Simply put, Serverless computing = FaaS + BaaS.
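To make this concrete, here is a minimal sketch of the thumbnail scenario above, written as an S3-triggered AWS Lambda function in Python. The bucket naming and the Pillow dependency are illustrative assumptions, not something prescribed by the paper:

```python
# Hypothetical sketch: an S3-triggered cloud function that builds a
# thumbnail for each uploaded image. Bucket names and the Pillow
# dependency are assumptions made for illustration.
import io

import boto3
from PIL import Image

s3 = boto3.client("s3")

def handler(event, context):
    # The platform invokes this on each S3 "object created" event;
    # provisioning, scaling, and retries are the provider's job.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        image = Image.open(io.BytesIO(body))
        image.thumbnail((128, 128))
        out = io.BytesIO()
        image.save(out, format="JPEG")
        out.seek(0)
        # Write the result to a separate bucket (name is hypothetical).
        s3.put_object(Bucket=bucket + "-thumbnails", Key=key, Body=out)
```

Everything outside these twenty-odd lines, including the redundancy, scaling, and monitoring chores of Table 1, is the provider's responsibility.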

In our definition, for a service to be considered Serverless, it must scale automatically without explicit provisioning and be billed according to usage. In the rest of this paper, we focus on the emergence, evolution, and future of cloud functions. Cloud functions are the general-purpose element in today's Serverless computing, and they are leading the way toward a simplified, general-purpose programming model for the cloud.

Next, we motivate and define Serverless computing and then, as the Berkeley view paper did for cloud computing, list the challenges and research opportunities that must be addressed for Serverless computing to fulfill its promise. While we are not sure which solutions will prevail, we believe all the issues will eventually be resolved, making Serverless the face of cloud computing.

2. The birth of Serverless computing

On any Serverless platform, users just write cloud functions in a high-level language, pick the event that should trigger the function to run, such as loading an image into cloud storage or adding an image thumbnail to a database table, and let the Serverless platform handle everything else: instance selection, scaling, deployment, fault tolerance, monitoring, logging, security patching, and so on. Table 2 summarizes the differences between Serverless and the traditional approach, which we call serverful cloud computing in this paper. Note that these two approaches represent the endpoints of a spectrum of function-based versus server-centric computing platforms, with container orchestration frameworks such as Kubernetes occupying the middle ground.

Table 2: Characteristics of Serverless cloud functions and serverful virtual machines, grouped into programmer and system-administrator categories, with specifications and prices for AWS Lambda and on-demand EC2.

Figure 1 shows how Serverless simplifies application deployment and makes cloud resources easier to use. In the cloud context, serverful computing is like programming in assembly language, and Serverless computing is like programming in a high-level language such as Python. An assembly programmer computing a simple expression such as c = a + b must select one or more registers, load the values into the registers, perform the arithmetic, and then store the result. This mirrors the steps of serverful computing: provision resources, identify available resources, load code and data into those resources, start the computation, return or store the results, and finally release the resources. The goal and opportunity of Serverless computing is to give cloud programmers benefits like those of moving to high-level languages. Other high-level language features also map naturally onto Serverless computing: just as automated memory management frees developers from managing memory resources, Serverless computing frees developers from managing server resources.

To be precise, there are three key differences between Serverful and Serverless computing:

1. Decoupled computation and storage. Storage and computation scale separately and are provisioned and priced independently. In general, storage is provided by a separate cloud service, while computation is stateless.

2. Executing code without managing resource allocation. Instead of requesting resources, the user supplies a piece of code, and the cloud automatically provisions resources and executes that code.

3. Paying in proportion to the resources used instead of the resources allocated. Billing is based on execution, for example execution time, rather than on the underlying cloud platform, such as the size of the allocated VMs.

With these differences in mind, let us explain why Serverless differs from a number of similar-looking solutions, past and present.

Figure 1: Architecture of Serverless. The Serverless layer sits between the application layer and the underlying cloud platform, simplifying cloud programming. Cloud functions (i.e., FaaS) provide general-purpose computing and are complemented by an ecosystem of specialized BaaS, such as object storage, databases, and messaging. Specifically, a Serverless application on AWS might use Lambda together with S3 (object storage) and DynamoDB (key-value database), while a Serverless application on Google might use Cloud Functions together with Cloud Firestore (mobile-app backend database) and Cloud Pub/Sub (messaging). Serverless also includes big data services such as AWS Athena and Google BigQuery (big data query) and Google Cloud Dataflow and AWS Glue (big data transformation). The base cloud platform includes VMs, VPCs, block storage, IAM, billing, and monitoring.

2.1 Putting Serverless computing in context

What technological breakthroughs were needed to make Serverless computing possible? Some argue that Serverless computing is merely a rebranding of earlier offerings, perhaps a generic PaaS platform such as Heroku, Firebase, or Parse. Others point out that the shared web hosting environments of the 1990s offered much of what Serverless computing offers today. For example, they had a stateless programming model for multi-tenancy, elastic response to demand, and a standardized function invocation API, the Common Gateway Interface (CGI), which even allowed direct deployment of source code written in high-level languages such as PHP and Perl. Google's original App Engine, largely rejected by the market just a few years before Serverless became popular, likewise let developers deploy code while leaving most operational aspects to the cloud provider. We nevertheless believe Serverless computing is a significant innovation over PaaS and these other earlier models.

Today's Serverless cloud functions differ from their predecessors in several important respects: better autoscaling, stronger isolation, platform flexibility, and service ecosystem support. Of these, the autoscaling offered by AWS Lambda marked the sharpest departure from earlier solutions. Unlike serverful autoscaling techniques, it tracks the workload far more accurately, scaling up rapidly when needed and scaling all the way down to zero resources, and zero cost, when not. It also charges in a much more fine-grained way, with a minimum billing increment of 100 ms at a time when other autoscaling services charged by the hour. A key difference is that the user pays only for the time their code is actually executing, not for the resources reserved to run the program. This distinction ensures that the cloud provider shares in the benefits and risks of autoscaling [T3], and thus has an incentive [T4] to allocate resources efficiently.
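To make the billing granularity concrete, here is a back-of-the-envelope comparison, assuming Lambda-style GB-second pricing with 100 ms rounding; the rates are approximations of published list prices and may differ by region or date:

```python
# Illustrative pay-per-use vs. reserved billing. Rates are assumptions.
LAMBDA_PER_GB_SECOND = 0.0000166667  # approximate $/GB-second
VM_PER_HOUR = 0.0052                 # e.g. a small always-on instance

def lambda_cost(invocations, ms_per_call, gb_memory):
    # Duration is rounded up to the next 100 ms billing increment.
    billed_ms = ((ms_per_call + 99) // 100) * 100
    return invocations * (billed_ms / 1000) * gb_memory * LAMBDA_PER_GB_SECOND

# 100,000 calls of 120 ms each at 128 MB over a month:
print(round(lambda_cost(100_000, 120, 0.128), 2))  # ~ $0.04
# An always-on VM for the same month, even if idle most of the time:
print(round(VM_PER_HOUR * 24 * 30, 2))             # ~ $3.74
```

The point is not the exact numbers but the shape of the model: charges accrue only while code executes.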

Serverless computing relies on strong performance and security isolation to make multi-tenant hardware sharing possible. VM-like isolation is the current standard for multi-tenant hardware sharing for cloud functions, but because VM provisioning can take seconds, Serverless vendors use sophisticated techniques to speed up the creation of function execution environments. One approach, used by AWS Lambda, maintains a "warm pool" of VM instances waiting to be assigned to a tenant and an "active pool" of instances already running functions and serving subsequent calls. Resource lifecycle management and multi-tenant bin packing are needed to achieve high utilization, and they are key techniques behind Serverless computing's success. We note several recent proposals aimed at reducing the overhead of multi-tenant isolation through containers, unikernels, library OSes, or language VMs. For example, Google has announced that App Engine, Cloud Functions, and Cloud ML Engine use gVisor; Amazon's Firecracker VMs serve Lambda and Fargate; and the CloudFlare Workers Serverless platform uses browser sandboxing technology to isolate JavaScript cloud functions of different tenants.

Several other differences also helped Serverless succeed. By letting users bring their own libraries, Serverless computing supports a far wider range of applications than PaaS services, which are tied to particular use cases. Serverless computing also runs in modern data centers at a far larger scale than the old shared web hosting environments did.

As described in Section 1, cloud functions (i.e., FaaS) popularized the Serverless paradigm. It is important to recognize, though, that part of their success is owed to BaaS offerings, services that have existed since the beginning of the public cloud, such as AWS S3. In our view, such services are domain-specific, highly optimized implementations of Serverless computing, while cloud functions are its general-purpose form. Table 3 summarizes these observations by comparing the programming interfaces and cost models of several services.

Table 3: Serverless computing services with their corresponding programming interfaces and cost models. Note that for BigQuery, Athena, and cloud functions, users pay separately for the storage they consume (e.g., Google Cloud Storage, AWS S3, or Azure Blob Storage).

Kubernetes is a container orchestration technology for deploying microservices. Unlike Serverless computing, Kubernetes simplifies serverful computing. Thanks to years of use and development inside Google, it has been adopted rapidly and widely. Kubernetes offers short-lived computing environments, like Serverless computing, but with far fewer constraints, for example on hardware resources, execution time, and network communication, and it can host applications originally written for the public cloud with minimal adaptation. Serverless computing, by contrast, introduces a paradigm shift that offloads operational responsibilities entirely to the provider and makes fine-grained [T5] multi-tenant multiplexing possible. Managed Kubernetes services such as GKE and EKS provide a middle ground: they take the job of managing Kubernetes off developers' hands while giving them the flexibility to configure arbitrary containers. A key difference between Kubernetes and Serverless computing is the billing model: the former charges for reserved resources, the latter for the duration of function execution.

Kubernetes is also a perfect match for hybrid applications that run partly on local hardware and partly in the cloud. Our view is that such hybrid applications make sense during the transition to the cloud. In the long term, however, we believe that the economics of cloud scale, faster network bandwidth, ever-expanding cloud services, and the simplified cloud management of Serverless computing will diminish the importance of such hybrid applications.

Edge computing is cloud computing's companion in the post-PC era. While this paper focuses on how Serverless computing is transforming programming in the data center, an interesting underlying trend is that it will also affect edge computing. Many CDN providers now let users execute functions in the facility closest to them, wherever they are, and AWS IoT Greengrass even embeds Serverless execution in edge devices.

Now that we have defined Serverless computing and put it in context, let us see why it is attractive to cloud providers, users, and researchers.

2.2 The appeal of Serverless computing

For providers, Serverless drives business growth, since making the cloud easier to program attracts new customers and helps existing customers use more cloud resources. For example, a recent survey found that 24% of Serverless users were new to cloud computing, and 30% of existing serverful customers also used Serverless computing. Additionally, the short run times, small memory footprints, and statelessness of cloud functions improve statistical multiplexing by making it easier for providers to find unused resources on which to run them. Providers can also put less popular computers to use, since the instance type is entirely up to the provider, for example older servers that may be less attractive to serverful customers. All of this increases the revenue extracted from existing resources.

Table 4: Popularity of Serverless computing usage scenarios, 2018 survey

Customers benefit from increased programming productivity and, in many use cases, from cost savings resulting from the higher utilization of the underlying servers. Even though Serverless makes customers more efficient, the Jevons paradox [T6] suggests that greater efficiency will only increase the number of users and the demand per user, ultimately increasing total usage rather than shrinking it.

Serverless also raises the cloud deployment interface from x86 machine code (99% of cloud computers use the x86 instruction set) to high-level programming languages, which enables innovation in computer architecture. If ARM or RISC-V offers better cost-performance, Serverless computing makes it easy to switch instruction sets. Cloud providers can also pursue language-oriented optimizations and domain-specific architectures [T7] that accelerate programs written in particular languages, such as Python.

Users like Serverless because novices can deploy functions without any knowledge of cloud infrastructure, and because experts save deployment time and can focus on problems specific to their application. Since functions are billed only while executing, and at fine granularity, Serverless users may save money: they pay only for what they use rather than for what they reserve. Table 4 shows the most popular Serverless use cases today.

Researchers are attracted to Serverless computing, and to cloud functions in particular, because it is a new general-purpose computing abstraction that promises to be the future of cloud computing, and because there are many opportunities to improve its performance and overcome its limitations.

3. Limitations of today’s Serverless computing platform

Cloud functions have been successfully applied to a variety of workloads, including API serving, event-stream processing, and limited ETL (Table 3).

To see what prevents cloud functions from serving more general workloads, we tried to build Serverless versions of applications that interested us and studied examples published by others. These are not intended to represent the rest of information technology beyond today's Serverless computing ecosystem; they are simply examples chosen to expose common weaknesses that could hinder Serverless versions of many other interesting applications.

In this chapter, we give an overview of five research projects and discuss the obstacles that prevent the Serverless platform from achieving state-of-the-art performance, i.e., matching the performance of the same workloads in serverful deployments. We are particularly interested in approaches that use general-purpose cloud functions rather than relying on domain-specific Serverless offerings (BaaS). In our final example, however, Serverless SQLite, we identify a use case that maps poorly onto FaaS, and we conclude that databases and other heavily state-dependent applications remain better suited to BaaS. An appendix at the end of the paper gives more detail on each application.

Interestingly, even this eclectic [T8] mix of applications exposes similar weaknesses, which we list after describing the applications. Table 5 summarizes the five applications.

ExCamera: real-time video encoding. ExCamera aims to offer a real-time encoding service to users uploading video, for example to YouTube; today's encoding solutions can take tens of minutes or even hours, depending on the size of the video. To encode in real time, ExCamera parallelizes the slow parts of encoding and executes the fast parts serially. ExCamera exposes the internal codec state, which allows encoding and decoding tasks to be executed with pure functional semantics: each task takes the internal state along with video frames as input, and emits the modified internal state as output.
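The description above suggests tasks shaped as pure functions over explicit codec state. The following minimal Python sketch models that shape; the names (`CodecState`, `encode_chunk`) are hypothetical, not ExCamera's actual API:

```python
# Hypothetical model of an ExCamera-style task: (state, frames) in,
# (new state, encoded chunk) out, with no hidden mutable state, so
# thousands of such tasks can run in parallel cloud functions.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class CodecState:
    reference_frame: bytes  # encoder/decoder state carried across chunks

def encode_chunk(state: CodecState, frames: List[bytes]) -> Tuple[CodecState, bytes]:
    # A real implementation would run a video codec here; this stand-in
    # only illustrates that the output depends solely on the inputs.
    encoded = b"".join(frames)
    return CodecState(reference_frame=frames[-1]), encoded
```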

MapReduce: analytics frameworks such as MapReduce, Hadoop, and Spark have traditionally been deployed on managed clusters. Although some analytics workloads have already started migrating to Serverless, those workloads consist mostly of map-only jobs, and the natural next step is to support full MapReduce jobs. The driving force behind this effort is to leverage the flexibility of Serverless computing to efficiently support jobs whose resource requirements vary widely over the course of their execution.

Numpywren: linear algebra. Large-scale linear algebra computations are traditionally deployed on supercomputers or on high-performance computing clusters connected by high-speed, low-latency networks. Given this history, Serverless seems a poor fit [T9], but there are two reasons it might nevertheless make sense for linear algebra. First, managing a cluster is a big hurdle for scientists without a CS background. Second, the degree of parallelism can vary dramatically during a computation, so provisioning a cluster with a fixed number of nodes either slows the job down or leaves the cluster badly underutilized.

Cirrus: machine learning training. Machine learning researchers use clusters of VMs for the various tasks of an ML workflow, such as preprocessing, model training, and hyperparameter tuning. One challenge with this approach is that different stages of the pipeline require significantly different resources, and, as with linear algebra, a fixed-size cluster leads either to severe underutilization or to slowdown. Serverless computing can overcome this challenge by allocating the right resources to each phase, and it also frees developers from maintaining a cluster.

Serverless SQLite: databases. Autoscaling database services already exist, but to better understand the limits of Serverless it is important to understand what makes database workloads so challenging for it. In this context, we consider whether a third party could implement a Serverless database directly on cloud functions. One approach would be to run a general-purpose transactional database, such as PostgreSQL, Oracle, or MySQL, inside cloud functions. That, however, runs into several challenges. First, Serverless computing offers no built-in persistent storage, so it would have to rely on remote storage, which introduces significant latency. Second, such databases assume connection-oriented protocols, i.e., the database runs as a server accepting connections from clients. This conflicts with cloud functions running behind network address translation, which therefore cannot accept inbound connections. Finally, many high-performance databases rely on shared memory, while cloud functions are isolated and cannot share memory; and shared-nothing databases expect their nodes to remain online and directly addressable. All of these issues pose severe challenges to running traditional database software on Serverless computing, or to implementing an equivalent, so we expect databases to remain in the BaaS category.

One key reason all these applications want to use Serverless is fine-grained autoscaling, so that resource utilization closely matches each application's varying demand. Table 5 summarizes the characteristics, challenges, and workarounds of these five applications; we use them to identify four limits of today's Serverless computing.

Table 5: Summary of the requirements of the new Serverless application domains.

3.1 Inadequate storage for fine-grained operations

The stateless nature of Serverless platforms makes it hard to support applications that need fine-grained state sharing. This is mainly due to the limitations of the storage services offered by today's cloud providers. Table 6 summarizes the features of existing cloud storage services.

Table 6: Ideal Serverless storage and cloud provider storage service features. Green is good, orange is medium and red is bad.

Persistence and availability guarantees describe how tolerant a system is to failures:

Ephemeral means that data is held in memory and is lost if the application crashes. The ideal Serverless storage would provide cost-effectiveness comparable to block storage while transparently letting cloud functions provision and access it.

Object storage services such as AWS S3 and Azure Blob Storage are highly scalable and offer inexpensive long-term storage, but they have high access costs and high access latency: according to recent measurements, all such services take at least 10 milliseconds to read or write small objects. S3 offers high throughput, but at a steep price; sustaining 100,000 IOPS costs about $30 per minute, three to four orders of magnitude more than running an ElastiCache instance. ElastiCache provides high performance, with sub-millisecond read/write latency, and a single Redis server thread can sustain more than 100,000 IOPS.
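The $30-per-minute figure follows directly from per-request pricing. A quick sanity check, assuming roughly $0.005 per 1,000 S3 PUT requests (an illustrative rate; actual prices vary by region and over time):

```python
# Back-of-the-envelope S3 IOPS cost, under the assumed PUT price.
price_per_put = 0.005 / 1000          # assumed $/request
iops = 100_000
cost_per_minute = iops * 60 * price_per_put
print(cost_per_minute)                # 30.0 dollars per minute
```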

Key-value databases such as DynamoDB and Google Cloud Datastore provide high IOPS, but they are expensive and can be slow to scale up. Finally, cloud vendors offer in-memory storage instances based on systems such as Memcached or Redis, but these are neither fault-tolerant nor autoscaled the way Serverless platforms are.

As Table 5 shows, applications built on Serverless need storage services that are provisioned transparently and that match the scale of the computation. Different applications will motivate different persistence and availability guarantees and different latency or other performance targets. We believe this calls for the development of both ephemeral and durable Serverless storage, which we discuss further in Section 4.

3.2 Lack of fine-grained coordination

To extend support to stateful applications, Serverless frameworks need a way for tasks to coordinate. For example, if task A needs the output of task B, there must be some way for A to learn when that input is ready, even when A and B run on different nodes. Many protocols [T10] designed to ensure data consistency require similar coordination.
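Without a notification primitive, a consumer function typically falls back on polling shared storage, roughly like this sketch (the bucket and key names are illustrative assumptions):

```python
# Hypothetical polling-based coordination: task A waits for task B's
# output to appear in object storage. Names are illustrative.
import time

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def wait_for_output(bucket, key, poll_interval=0.1, timeout=30.0):
    # Every probe is a billed request and adds latency; a push-style
    # notification service would deliver this event instead.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        except ClientError:
            time.sleep(poll_interval)
    raise TimeoutError(f"{key} never appeared in {bucket}")
```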

None of the current cloud storage services offers notification capabilities. Cloud vendors do offer standalone notification services, such as SNS and SQS, but these add significant latency, sometimes hundreds of milliseconds, and they can be costly when used for fine-grained coordination. There have been relevant research systems, such as Pocket, that lack many of these drawbacks, but cloud vendors have not adopted them.

As it stands, applications are left to choose between managing their own VM-based system that provides notifications, as in ElastiCache and SAND, or implementing their own notification mechanism, as in ExCamera, which lets cloud functions communicate with one another through a long-running VM-based rendezvous server. This limitation also suggests that new variants of Serverless computing may be worth exploring, for example naming function instances and allowing direct addressing of their internal state (Actor as a Service).

3.3 Poor performance for standard communication patterns

Broadcast, aggregation, and shuffle are among the most common communication primitives in distributed systems, used in applications such as machine learning and big data analytics. Figure 2 shows the communication patterns of these primitives for both VM-based and function-based solutions.

Figure 2: Three common communication patterns in distributed systems. (a) shows VM instances, each running two functions or tasks; (b) shows the same patterns with cloud function instances. Note that the VM solution involves far less remote communication, because VM instances provide ample opportunity to share, aggregate, or combine data locally across tasks before sending it or after receiving it.

In the VM solution, all tasks running in one instance can share one copy of the data being broadcast, or aggregate locally before sending results to other instances, so the communication complexity of broadcast and aggregation is O(N), where N is the number of VM instances. With cloud functions, however, it is O(N*K), where K is the number of function instances per VM. The shuffle operation is even more dramatic: in the VM solution, local tasks can combine their data so that only one message passes between any two VM instances, so with equal numbers of senders and receivers there are N^2 messages; the cloud-function version instead needs (N*K)^2 messages. Since functions have far fewer cores than VMs, K typically ranges from 10 to 100. And because the application has no control over function placement, a Serverless application may send two to four orders of magnitude more data than an equivalent VM-based solution.
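A quick calculation makes the gap concrete, using assumed values of N and K:

```python
# Shuffle message counts, assuming N = 100 VMs and K = 50 functions
# per VM (values chosen only for illustration).
N, K = 100, 50
vm_messages = N ** 2               # one combined message per VM pair
fn_messages = (N * K) ** 2         # one message per function pair
print(vm_messages)                 # 10000
print(fn_messages)                 # 25000000
print(fn_messages // vm_messages)  # 2500, i.e. a factor of K**2
```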

3.4 Predictable performance

Although cloud functions have much lower startup latency than traditional VM instances, the delay incurred when starting a new instance can still be high for some applications. Three factors contribute to this cold-start latency:

1. The time to start the cloud function itself.

2. The time to initialize the function's software environment, for example loading Python libraries.

3. Application-specific initialization in user code.

The latter two can dwarf the first: starting a function might take less than a second, while loading all of an application's libraries can take more than ten seconds. [T11]
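A common mitigation is to keep heavyweight imports off the module's top level, so that every cold start does not pay for libraries that only some requests need. A minimal sketch (the `numpy` dependency is just an illustrative stand-in for a large library):

```python
# Module-level code runs on every cold start of a function instance.
import json  # cheap, fine to import eagerly

def handler(event, context):
    if event.get("needs_ml"):
        # Deferred import: the heavy dependency is loaded only on the
        # code path that actually needs it, not on every cold start.
        import numpy as np
        return {"mean": float(np.mean(event["values"]))}
    return {"echo": json.dumps(event)}
```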

Another obstacle to predictable performance is the variability of the hardware resources, a consequence of the cloud provider's flexibility in choosing the underlying servers. In our experiments, we were sometimes given CPUs from different hardware generations. This uncertainty exposes a fundamental tradeoff between the provider's desire to maximize resource utilization and predictability.

4. What will Serverless computing become

Now that we have explained today's Serverless computing and its limitations, let us look to the future to understand how its advantages can be brought to more applications. Researchers have already begun to address these issues and to explore how to improve both Serverless platforms and the workloads that run on them. Additional work by some of our Berkeley colleagues emphasizes the data-centric, distributed-systems, machine-learning, and programming-model opportunities and challenges in Serverless computing. Here we take a broad view of increasing the kinds of applications and hardware that work well with Serverless computing, identifying research challenges in five areas: abstractions, systems, networking, security, and architecture.

4.1 Abstraction challenges

Resource requirements: Today's Serverless offerings let developers specify a function's memory size and execution time limit, but no other resource requirements. This abstraction gets in the way of those who want more control over specific resources, such as CPUs or GPUs. One approach would be to let developers specify these resource requirements explicitly. However, that would make it harder for cloud providers to achieve high utilization through statistical multiplexing, as it adds more constraints to function scheduling. It would also run counter to the spirit of Serverless by increasing the management overhead for cloud application developers.

A better alternative is to raise the level of abstraction and let the cloud provider infer resource requirements instead of having developers specify them. To do so, the provider could use a range of approaches, from static code analysis, to profiling previous runs, to dynamic (re)compilation that retargets code to other machine architectures. Automatically provisioning the right amount of memory is particularly attractive but especially challenging when the solution must interact with automatic garbage collection. Some research suggests that these language runtimes could be integrated with Serverless platforms.

Data dependencies: Today's cloud function platforms know nothing about the data dependencies between functions, let alone the amount of data those functions may exchange. This ignorance can lead to suboptimal placement, which in turn produces inefficient communication patterns, as in the MapReduce and numpywren examples above.

One way to address this challenge is for the cloud provider to expose an API that lets applications specify their computation graphs, enabling better placement decisions that minimize communication and improve performance. We note that many general-purpose distributed frameworks (e.g., MapReduce, Spark, Beam/Cloud Dataflow), parallel databases (e.g., BigQuery, Cosmos DB), and orchestration frameworks (e.g., Airflow) already produce such graphs internally. In principle, these systems could be modified to run on cloud functions and expose their graphs to the cloud provider [T12]. Note that AWS Step Functions is a step in this direction, providing a state machine language and API.
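For a flavor of what exposing such a graph looks like, here is a minimal two-state machine in the style of AWS Step Functions' Amazon States Language, expressed as a Python dict. The function ARNs are placeholders, and retry and error handling are omitted:

```python
# A minimal Step Functions-style state machine: declaring that
# ReduceStep depends on MapStep gives the platform an explicit graph.
state_machine = {
    "StartAt": "MapStep",
    "States": {
        "MapStep": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:map-step",
            "Next": "ReduceStep",
        },
        "ReduceStep": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:reduce-step",
            "End": True,
        },
    },
}
```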

4.2 System Challenges

High-performance, affordable, transparently provisioned storage: As Tables 3 and 5 suggest, we see two distinct unmet storage needs: Serverless ephemeral storage and Serverless durable storage.

Ephemeral storage. The first four applications in Section 3 were limited by the speed and latency of the storage used to transfer state between cloud functions. Though their capacity demands differ, they all need such storage to hold application state over the application's lifetime; once the application finishes, the state can be discarded. Such ephemeral storage might also serve as a cache in other applications. [T13]

One way to provide ephemeral storage is to build a distributed in-memory service with an optimized network stack that guarantees microsecond-level latency. Such a system would let an application's functions efficiently store and exchange state over the application's lifetime. The memory service would automatically scale storage capacity and IOPS with application demand. A unique requirement of such a service is that it must not only allocate memory transparently but also free it transparently: storage should be reclaimed automatically when the application terminates or fails, much as an operating system automatically frees a process's resources when it exits (or crashes). Furthermore, such storage must provide access protection and performance isolation between applications.

RAMCloud and FaRM show that it is possible to build in-memory storage services with microsecond-scale latency that support hundreds of thousands of IOPS per instance. They achieve this by optimizing the entire software stack and exploiting RDMA to minimize latency. However, they require applications to provision storage explicitly, and they do not provide strong isolation between tenants. Another recent system, Pocket, aims to provide an abstraction for ephemeral storage, but it too lacks autoscaling and requires applications to pre-allocate storage.

By leveraging statistical multiplexing, such ephemeral storage would use memory more efficiently than Serverless computing does today. Today, if an application needs less memory than its allocated VM instance provides, the rest is wasted, whereas with a shared in-memory service, any memory one Serverless application leaves unused can be allocated to another. In fact, even a single application benefits from multiplexing: without a shared service, memory unused on one of its VMs cannot be used by its programs running on other VMs, even though they all belong to the same application; with a shared memory service it can. Of course, even with Serverless computing, a cloud function that does not use all of its local memory causes internal memory fragmentation; storing the cloud functions' application state in a shared in-memory service can sometimes mitigate this.

Durable storage. Like the others, our Serverless database application was limited by the latency and IOPS of the storage system, but it also needs long-term data storage and the mutable-state semantics of a file system. While database functionality, including OLTP, may increasingly be offered as BaaS [T14], we see this application as representative of several that need longer retention and stronger durability than Serverless ephemeral storage provides. To achieve high-performance Serverless durable storage, one approach is to pair SSD-based distributed storage with a distributed in-memory cache. A recent system that realizes many of these goals is the Anna key-value store, which achieves both cost-effectiveness and high performance by combining multiple existing cloud storage offerings. A key challenge for this design is achieving low tail latency [T16] under heavy-tailed access distributions [T15], given that the in-memory cache capacity can be far smaller than the SSD capacity [T17]. New storage technologies promising microsecond access times [T18] look like a promising way to meet this challenge.

Like Serverless ephemeral storage, this service must be provisioned transparently and must ensure isolation between applications and tenants, for both security and predictable performance. But whereas Serverless ephemeral storage reclaims resources when the application terminates, durable storage must release resources only on explicit command (for example, a remove or delete), just like a traditional storage system. And it must, of course, guarantee durability, so that written data survives failures.

Coordination/signaling services: Sharing state between functions often follows a producer-consumer pattern, which requires consumers to learn as soon as the data becomes available. Similarly, one function may want to signal another when a condition changes, or several functions may want to coordinate, for example to implement a data-consistency mechanism. Such signaling systems would benefit from microsecond latency, reliable delivery, and broadcast or group communication. We also note that, since cloud function instances are not individually addressable, they cannot be used to implement textbook distributed-systems algorithms such as consensus or leader election.

Minimize startup time: Startup time has three parts:

1. Scheduling and starting the resources used to run the function.

2. Downloading the application's software environment (OS, libraries) needed to run the function code.

3. Application-specific startup, such as loading and initializing data structures and libraries.

Resource scheduling and initialization can incur significant delay and overhead, since they involve creating isolated execution environments and configuring the customer's VPC and IAM policies. Cloud providers have recently focused on developing new lightweight isolation mechanisms to reduce startup time.

One way to reduce (2) is to use unikernels. Unikernels eliminate the overhead of a traditional operating system in two ways. First, unlike a traditional OS that dynamically probes the hardware, applies user configuration, and allocates data structures, a unikernel squeezes out these costs by pre-configuring itself and statically allocating its data structures for the hardware it will run on. Second, a unikernel includes only the drivers and system libraries its application actually needs, making it far leaner than a traditional OS. Note, however, that because unikernels are customized to a specific application, they cannot benefit from the efficiencies of running many instances of a standard kernel, such as different cloud functions sharing kernel pages within the same VM, or pre-caching the kernel to reduce startup time. Another way to reduce (2) is to load libraries dynamically and incrementally as the application is invoked, as Azure Functions does using shared file systems.

Application-specific initialization (3) is the programmer's responsibility, but cloud providers could include a readiness signal in their API so that cloud functions are not sent work before their instance has finished starting. More broadly, cloud providers can look for ways to perform startup work ahead of time [T19]. This is especially powerful for steps that do not depend on the customer, such as booting a VM with a popular operating system and set of libraries; a warm pool [T20] of these could, for example, be shared among tenants.

4.3 Network Challenges

As Figure 2 in Chapter 3 shows, cloud functions can impose a significant communication burden on the very popular communication primitives broadcast, aggregation, and shuffle. In particular, if we pack K functions onto the same VM instance, the cloud-function version sends K times more messages than one using VM instances, and K^2 times more in the case of shuffle.

There are several ways to tackle this challenge:

  • Provide functions with many more cores, similar to VM instances, so that multiple tasks can combine and share data before sending it over the network or after receiving it.

  • Allow developers to explicitly co-locate functions on the same VM instances, or offer distributed communication primitives that applications can use out of the box so that the cloud provider can itself co-locate the functions of one application on the same VM instances.

  • Let applications provide their computation graph so that the cloud provider can place functions to minimize communication (see the abstraction challenges above).

Note that the first two proposals would reduce the cloud provider's flexibility in placing functions, and thus data center utilization; arguably, they also violate the spirit of Serverless by forcing developers to think about system management.

4.4 Security challenges

Serverless computing reshuffles security responsibilities, shifting many of them from the cloud user to the cloud provider without fundamentally changing them. However, Serverless computing must also grapple with the risks inherent in application disaggregation and multi-tenant resource sharing.

Random scheduling and physical isolation: Physical co-residency is central to hardware-level side-channel and Rowhammer [T21] attacks inside the cloud. As the first step of such an attack, the attacking tenant must confirm that it is on the same physical host as the victim, rather than attacking a random stranger. The ephemerality of cloud functions may limit an attacker's ability to identify victims running at the same time, and a randomized, adversary-aware scheduling algorithm could further reduce the risk of attacker and victim being co-located, making such attacks harder. However, deliberately preventing physical co-residency may conflict with placements that optimize startup time, resource utilization, or communication.

Fine-grained security contexts: Cloud functions need fine-grained security configuration, including access to secret keys, storage objects, and even local temporary resources. The security policies of existing serverful applications will need translating, along with expressive security APIs that cloud functions can use dynamically. For example, functions might delegate security privileges to other functions or cloud services. A security context that uses cryptography to protect capability-based access control is a natural fit for this distributed security model. Recent work proposes using information flow control for cross-function access control in multi-party settings; other work provides distributed management of security primitives, such as non-equivocation and revocation, which may grow in importance if functions dynamically create keys and certificates.

At the system level, users need fine-grained security isolation for each function, at least as an option. The challenge in providing function-level sandboxing is to keep startup time short without caching execution environments that share state across repeated function invocations. One possibility is to snapshot instances locally so that each function can start from a clean state. Alternatively, lightweight virtualization technologies are starting to be adopted by Serverless providers: library OSes, including gVisor, implement the system API in a user-space "shim layer", while unikernels and microVMs, including AWS Firecracker, slim down the guest kernel to help minimize the host attack surface. These isolation technologies cut startup times to tens of milliseconds, as opposed to VM startup times measured in seconds. Whether these solutions match traditional VMs in security remains to be proven. We expect the search for strong isolation mechanisms with low startup overhead to remain an active area of research and development. On the positive side, provider management and short-lived instances mean bugs can be patched faster.

One way for users to avoid co-residency attacks altogether is to demand physical isolation; recent hardware attacks have made renting dedicated cores or even entire machines more attractive to users. Providers could offer customers a premium option to launch functions on physical hosts used exclusively by them [T22].

Oblivious Serverless computing: Functions can leak information through their access patterns and the timing of their communication. With serverful applications, data is typically retrieved in bulk and cached locally. Cloud functions, by contrast, are ephemeral and spread widely across the cloud, so network traffic patterns can leak more sensitive information to a network attacker within the cloud (for example, a rogue employee), even when payloads are encrypted end to end. The tendency to decompose Serverless applications into many small functions exacerbates this exposure. While the primary security concern is external attackers, network patterns could be shielded from employee attacks by adopting oblivious algorithms [T23]; unfortunately, these tend to be very costly.

4.5 Computing Architecture Challenges

Heterogeneous hardware, prices, and ease of management: The x86 microprocessors that dominate cloud computing are barely improving in performance: in 2017, single-program performance improved only 3%. If this trend continues, performance will not double for 20 years. Similarly, DRAM capacity per chip is approaching its limit: 16-Gbit DRAM chips are on sale now, but building a 32-Gbit DRAM chip looks infeasible. One small consolation of this slow change is that vendors can quietly fold aging, fully depreciated servers into the Serverless market without disturbing users [T24].

The performance problems of general-purpose microprocessors do not reduce the demand for faster computing. There are two ways forward. For functions written in high-level scripting languages such as JavaScript or Python, hardware-software co-design could produce language-specific custom processors that run one to three orders of magnitude faster. Domain-specific architectures [T25] are the other path. DSAs are tailored to a specific problem domain and offer large gains in performance and efficiency there, but perform poorly outside it. GPUs have long been used to accelerate graphics, and we are beginning to see DSAs for machine learning, such as the TPU, whose performance can exceed a CPU's by a factor of 30. These examples are the first of many: general-purpose processors augmented with DSAs for different domains will become the norm.

As mentioned in Section 4.1, we see two ways for Serverless computing to support heterogeneous hardware:

1. Serverless could embrace multiple instance types, with billing that differs according to the hardware used.

2. The cloud provider could select accelerators and DSAs automatically based on the language. The automation could be implicit, driven by the software libraries or language used in the function, such as GPU hardware for CUDA code and TPU hardware for TensorFlow code. Alternatively, the provider could monitor the performance of functions and migrate them to more suitable hardware the next time they run.

Serverless computing already faces heterogeneity in the SIMD portion of the x86 instruction set. AMD and Intel have evolved this part of x86 rapidly, increasing the number of operations performed per clock cycle and adding new instructions. For programs that use SIMD instructions, running 512-bit-wide SIMD on a recent Intel Skylake microprocessor is much faster than running 128-bit-wide SIMD on an older Intel Broadwell microprocessor. Today, AWS Lambda offers all microprocessors at the same price, yet users have no way to request the faster SIMD hardware. In our view, the compiler [T26] should be the one to suggest which hardware makes the best match.

As accelerators become ever more popular, Serverless providers will not be able to ignore the dilemma of heterogeneity for long, especially since there are reasonable ways to accommodate it.

5. Fallacies and pitfalls

Fallacy: Since a Lambda cloud function instance with the same memory capacity as a t3.nano instance costs 7.5 times as much per minute, Serverless computing is more expensive.

The beauty of Serverless computing is that all the system administration capabilities are included in the price, among them redundancy for availability, monitoring, logging, and scaling. Cloud providers report that customers see cost savings of 4x to 10x when moving applications to Serverless. Lambda also does much more than a single t3.nano instance: besides being a single point of failure, the t3.nano's credit system limits it to at most 6 minutes of CPU use per hour (5% of two vCPUs), so it may deny service at peak load, which Serverless computing would handle with ease. And since Serverless allocates resources, scaling included, at finer granularity, the computing resources it uses are likely to be utilized more efficiently. Because there is no charge when no events trigger function invocations, Serverless can end up much cheaper.
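The tradeoff can be reduced to a break-even duty cycle: if Serverless costs PREMIUM times more per active minute than an equivalent always-on VM, it wins whenever the workload is busy less than 1/PREMIUM of the time. A sketch, ignoring per-request charges and the other bundled benefits listed above:

```python
# Break-even sketch for the 7.5x-per-minute premium from the fallacy.
PREMIUM = 7.5

def cost_ratio(busy_fraction):
    # Serverless cost relative to a VM that must run 24x7.
    return PREMIUM * busy_fraction

for busy in (0.01, 0.10, 1 / PREMIUM, 0.50):
    print(f"busy {busy:.0%}: serverless/VM cost ratio = {cost_ratio(busy):.2f}")
# busy 1%  -> 0.07 (93% cheaper)
# busy 10% -> 0.75
# busy 13% -> 1.00 (break-even)
# busy 50% -> 3.75 (the VM is cheaper)
```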

Pitfall: Serverless computing can have unpredictable costs.

For some users, a drawback of paying per use is the inability to predict costs, which conflicts with the way many organizations manage their budgets. When a budget is approved, typically once a year, an organization reasonably wants to know what its Serverless services will cost over the coming year. Cloud providers could mitigate this with bucket-based pricing, much as phone companies offer flat-rate plans for a given amount of usage. We also believe that as organizations use Serverless more, they will be able to predict their Serverless computing costs from history, just as they do today for other utility-style services such as electricity.

Fallacy: Since Serverless computing programs are written in high-level languages such as Python, they can easily be ported between Serverless providers.

Not only do function invocation primitives and packaging differ between platforms, but Serverless applications also depend on an ecosystem of proprietary BaaS offerings that lacks standardization: object storage, key-value databases, authentication, logging, and monitoring are prominent examples. To achieve portability, users would have to adopt some standard API, like the one POSIX tried to provide for operating systems. Google's Knative project is a step in this direction, aiming to give application developers a common set of primitives across deployment environments.

Pitfall: Vendor lock-in may be stronger with Serverless computing than with serverful computing.

This pitfall follows from the previous fallacy: if migration is hard, vendor lock-in is likely. Some frameworks promise to mitigate this lock-in through cross-cloud support.

Fallacy: Cloud functions cannot handle low-latency applications that need predictable performance.

Serverful instances handle such low-latency applications well because they are always on and can respond quickly to incoming requests. We note that if the startup latency of cloud functions is not good enough for a given application, a similar strategy applies: warm the cloud functions by invoking them periodically, ensuring that enough instances are running to serve requests when they arrive.
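A common realization of this strategy is a scheduled "keep-warm" ping. A minimal sketch, assuming a cron-style rule invokes the function every few minutes with a marker payload (the `keep_warm` field is an illustrative convention, not a platform feature):

```python
# Hypothetical keep-warm pattern: scheduled pings keep instances
# resident so real requests rarely hit a cold start.
def handler(event, context):
    if event.get("keep_warm"):
        # Warm-up ping: return immediately without doing real work.
        return {"status": "warm"}
    # ... normal low-latency request handling would go here ...
    return {"status": "handled"}
```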

Pitfall: Few so-called "elastic" services can match the true elasticity of Serverless computing.

The word "elastic" is popular these days, but it is being applied to services that are not as elastic as Serverless computing services.

We are interested in services that can change their capacity rapidly, with minimal user intervention, and that can scale down to zero when unused. For example, despite its name, AWS ElastiCache only lets you instantiate an integral number of Redis instances. Other "elastic" services require explicit capacity provisioning, take minutes to respond to changes in demand, or scale over only a limited range. When users build applications that combine highly elastic cloud functions with databases, search indexes, or serverful application tiers of limited elasticity, they lose many of the benefits of Serverless computing.

Without a quantitative, widely accepted technical definition or metric to aid in comparing and composing systems, "elasticity" will remain an ambiguous description.

6. Summary and predictions

By providing a simplified programming environment, Serverless computing makes the cloud far easier to use, attracting more people who can and will use it. Serverless computing comprises FaaS and BaaS offerings and marks an important maturation of cloud programming. It frees application developers from the manual resource management and optimization that today's serverful computing imposes, much as the move from assembly language to high-level languages did more than forty years ago.

We predict a surge in Serverless usage. We also expect the on-premises portions of hybrid cloud applications to decline over time, though some deployments may persist because of regulatory constraints and data governance rules. [T27]

While Serverless has already succeeded, we have identified challenges which, if overcome, will make it popular across a far wider range of applications. The first step is Serverless ephemeral storage, which must provide low latency and high IOPS at reasonable cost, but need not provide economical long-term storage. A second class of applications needs durable storage that retains data long-term, on demand; new non-volatile memory technologies may help build such storage systems. Other applications would benefit from a low-latency signaling service and support for popular communication primitives.

Two future challenges for Serverless computing are security and the cost-performance gains that may come from special-purpose processors. In both cases, the characteristics of Serverless computing may help address the challenge. Physical co-residency is a prerequisite for side-channel attacks, but it is difficult to establish in Serverless computing, where cloud functions can easily be placed at random. And programming cloud functions in high-level languages such as JavaScript, Python, or TensorFlow raises the level of programming abstraction, making it easier to innovate with, and deliver, cost-effective hardware.

In 2009, the "Berkeley View on Cloud Computing" paper predicted that the challenges facing cloud computing would be addressed and that it would flourish, and so it has: cloud revenue is growing at 50% per year and has proven highly profitable for providers.

We make the following predictions for the next 10 years of Serverless computing:

  • We expect new BaaS storage services to expand the range of applications that run well on Serverless computing. Such storage will match the performance of local block storage and come in ephemeral and durable variants. We will also see far more heterogeneous hardware for Serverless computing than the conventional x86 microprocessors that power it today.

  • We expect Serverless computing to become easier to program securely than serverful computing, thanks to its high-level programming abstractions and the fine-grained isolation of cloud functions.

  • We see no fundamental reason why the cost of Serverless computing should be higher than that of serverful computing, so we predict that billing models will evolve until almost any application, of any size, costs no more, and perhaps much less, on Serverless.

  • The future of serverful computing will be to facilitate BaaS. Applications that prove difficult to write on top of Serverless computing, such as OLTP databases and communication primitives such as queues, are likely to be offered as part of this class of services.

  • Serverful computing will not disappear, but its relative importance in the cloud will decline as Serverless computing overcomes its current limitations.

  • Serverless computing will become the default computing paradigm of the cloud era, largely replacing serverful computing and thereby bringing the client-server era to a close.

[T1] As long as the resources run at full utilization, it is fine to multiplex them within the same pool.

[T2] Original note: exclusive resource pools usually sit mostly idle so that they are ready when needed, which makes resource multiplexing difficult.

[T3] Resources are used efficiently, and the risks of resource management shift from the user to the cloud provider.

[T4] New forms of charging

[T5] In contrast to coarse-grained resource allocation at the virtual machine level, Serverless makes sharing virtual machines possible.

[T6] This is why traditional server vendors like Oracle rushed into cloud computing in its early days: the high efficiency of VMs let servers sell well.

[T7] Translator: domain-specific, for example the security domain.

[T8] Not stateless, but also not heavily state-dependent applications such as databases.

[T9] The network latency is too high to suit high-performance computing.

[T10] For example, Raft.

[T11] So Java needs a solution, such as Graal or Quarkus; otherwise Serverless will have a hard time in the domestic market.

[T12] In this case, who sets the standard, or do cloud vendors take the initiative to adapt?

[T13]

[T14] One recent example is Amazon Aurora Serverless (aws.amazon.com/rds/aurora/…). Services such as Google's BigQuery and AWS Athena are essentially serverless query engines rather than full-fledged databases.

[T15] The latency and speed of each cloud storage service are completely different, presenting a heavy-tailed distribution.

[T16] For example, P99 may be 1 ms while P99.9 reaches 2 s; it is this tail of the latency distribution that needs to be reduced.

[T17] Possibly an 80/20 split, as described by the heavy-tailed distribution.

[T18] An Chen. A review of emerging non-volatile memory (NVM) technologies and applications. Solid-State Electronics, 125:25–38, 2016.

[T19] Edward Oakes, Leon Yang, Dennis Zhou, Kevin Houck, Tyler Harter, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. SOCK: Rapid task provisioning with serverless-optimized containers. In 2018 USENIX Annual Technical Conference (USENIX ATC 18), pages 57–70, 2018.

[T20] Timothy A. Wagner. Acquisition and maintenance of compute capacity, September 4, 2018. US Patent 10067801B1.

[T21] Yoongu Kim, Ross Daly, Jeremie Kim, Chris Fallin, Ji Hye Lee, Donghyuk Lee, Chris Wilkerson, Konrad Lai, and Onur Mutlu. Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors. In ACM SIGARCH Computer Architecture News, volume 42, pages 361–372. IEEE Press, 2014.

[T22] A very good billing model and security policy, though against the spirit of Serverless.

[T23] An algorithm whose behavior, by design, is independent of some property that influences a typical algorithm for the same problem…

Elaine Shi, T-H. Hubert Chan, Emil Stefanov, and Mingfei Li. Oblivious RAM with O((log N)^3) worst-case cost. In International Conference on the Theory and Application of Cryptology and Information Security, pages 197–214. Springer, 2011.

[T24] Translator: if users do not perceive it, is there really no interference?

[T26] Translator: a compiler for the Serverless platform.

[T27] This also hinders some businesses' transition to Service Mesh.
