
Keywords: serverless, cloud functions

Chen Jie is a technical expert in the Tencent Cloud Architecture Platform Department with 10 years of cloud computing experience. He is responsible for the R&D of elastic computing and cloud function technology, and is committed to providing a leading infrastructure platform that improves resource utilization and makes development and operations more efficient for programmers.

After virtual machines and container technology, serverless has become the industry's new hot topic. With serverless cloud functions, users do not need to worry much about server deployment and operations. They only need to develop the core business logic to get a service that runs with distributed disaster recovery, scales automatically with load, and is billed by the actual number and duration of calls. This talk mainly shares the technical challenges and architectural principles behind Tencent Cloud's serverless cloud functions.

The talk covers serverless cloud functions from four aspects: 1. Value and application scenarios of cloud functions 2. Architecture principles of cloud functions 3. Key technical points of cloud functions 4. Development trends in the cloud function industry

Serverless Cloud Function is a serverless execution environment provided by Tencent Cloud that lets users run code without purchasing or managing servers. Users only need to write the core code in a language supported by the platform and set the conditions under which the code runs. The code then runs elastically and securely on Tencent Cloud infrastructure, and the platform fully manages the underlying computing resources, including server CPU, memory and network, as well as code deployment, elastic scaling, load balancing and other services.

Using serverless cloud functions takes operations and maintenance work off the user's hands, so that enterprises and developers can focus on developing their core business, launch and iterate quickly, and keep pace with business growth.

I. Value and usage scenarios of cloud functions

As the cloud computing market has matured, users have become increasingly willing to adopt cloud services. With the help of basic cloud components, the time needed to launch a service has been shortened from months to days. However, compared with the traditional model, users still have to re-implement non-functional requirements themselves on top of those cloud components.

Cloud functions try to leave only the business algorithms and processes for users to implement. The platform connects the various cloud services and provides general capabilities such as load balancing, automatic scaling, fault recovery and security supervision, so that users can assemble custom services like building blocks and shorten launch time from days to minutes.

Cloud functions are better suited than cloud virtual machines to business scenarios built on a microservices architecture. Take a multi-size image compression service as an example: when a user uploads a picture to COS, the service automatically compresses the original into sizes suitable for phones, tablets, computers and other devices. To implement this with cloud functions, the user only needs to create a function, set its trigger condition to “picture upload”, and write the code, either in the online editor or in an IDE followed by an upload. The service is then complete. Whenever a picture is uploaded, the function is called automatically and produces the compressed versions. The cloud function platform automatically scales function instances out and in according to the number of concurrent uploads, and charges only for what the calls actually consume.

As this example shows, the main values that cloud functions bring to users are:

  • Faster time to launch: users only need to implement the service's algorithm and process, and launch time is shortened to minutes;

  • Lower operational burden: users no longer have to handle service scaling, fault recovery, or other operations and maintenance work;

  • No idle-resource cost: users do not pay for idle resources, only for what their calls actually consume.

II. Architecture principles of cloud functions

The overall architecture principle of cloud function platform is shown in the figure.

The cloud function platform offers users an SDK and a web UI, communicates with other cloud components through event registration and callback mechanisms, and exposes standard API interfaces. Call distribution uses the function's region, user, name, version number, authentication and other information to request function instances, and spreads calls evenly across the available instances. Function management is responsible for creating, modifying and deleting functions, and provides code management, version management and related features. Function scheduling selects a suitable location to create or destroy function instances according to the function's resource requirements. Function instances host the user-defined functions and are responsible for executing and supervising them.

From the positioning and architecture of cloud functions, the key technical requirements for judging a cloud function platform can be summarized as follows:

  • Support rapid business launch while also allowing sustainable development;

  • Support on-demand access while also releasing idle resources;

  • Keep services uninterrupted while also expanding the operating range;

  • Let business code run freely while also preventing interference and intrusion.

Each of these is discussed below.

III. Supporting rapid business launch and sustainable development

Supporting minute-level service launch requires reducing the user's development workload as much as possible. Cloud function users only need to provide a simple function configuration and code to go live. Taking image compression as an example, an image compression service can be created by writing Python code as follows:

The first line imports the dependency library, lines 4 to 9 parse the input parameters, line 11 calls the library to compress the image, and lines 12 to 15 check the result and return it. Users can edit and submit the code online, or use their favorite IDE and edit the code just as they would a local program. After debugging, they can package the code as a ZIP file and submit it through the SDK. The service goes live as soon as the submission succeeds.
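
As a rough illustration of how small such a function can be, here is a minimal sketch. It is not the listing described above, so the line numbers in that description do not apply to it. The event layout (`Records`, `bucket`, `key`), the `TARGET_WIDTHS` table and the `make_handler` and `fake_compress` helpers are all assumptions made for this example, and `fake_compress` only derives output names where a real function would call an image library.

```python
# Hypothetical target widths for phone, tablet and desktop renditions.
TARGET_WIDTHS = {"phone": 480, "tablet": 1024, "desktop": 1920}


def parse_upload_event(event):
    """Extract bucket and object key from an upload event (assumed layout)."""
    record = event["Records"][0]
    return record["bucket"], record["key"]


def make_handler(compress):
    """Build a handler around a compress(bucket, key, width) -> output key callable."""
    def main_handler(event, context):
        bucket, key = parse_upload_event(event)
        outputs = {}
        for name, width in TARGET_WIDTHS.items():
            outputs[name] = compress(bucket, key, width)
        return {"source": key, "outputs": outputs}
    return main_handler


def fake_compress(bucket, key, width):
    # Stand-in for real image processing: it only derives the output object name.
    stem, dot, ext = key.rpartition(".")
    return f"{stem}_{width}w.{ext}" if dot else f"{key}_{width}w"


handler = make_handler(fake_compress)
result = handler({"Records": [{"bucket": "photos", "key": "cat.jpg"}]}, None)
print(result["outputs"])
```

Passing the compression step in as a callable keeps the event parsing and the result handling testable locally, without an image library or a real storage bucket.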

Sustainable service development requires smooth function upgrades and version changes. When a user updates a function's code or configuration, new call requests are sent to new function instances. Once the calls already in progress have finished, the old function instances are destroyed automatically, so the service is updated smoothly without its customers noticing. The platform also supports multi-version management of user functions: a function alias is mapped to a user-specified version, which allows smooth switching between versions, again without customers noticing.
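
The alias mechanism can be pictured with a small sketch. It assumes a simple in-memory bookkeeping class, `FunctionVersions`, which is hypothetical and not the platform's implementation: an alias points to a version, new calls follow the alias, and a version becomes removable only when no alias points to it and its running calls have finished.

```python
class FunctionVersions:
    """Tracks published versions, aliases, and in-flight calls per version."""

    def __init__(self):
        self.aliases = {}    # alias name -> version
        self.in_flight = {}  # version -> number of running calls

    def publish(self, version):
        self.in_flight.setdefault(version, 0)

    def point_alias(self, alias, version):
        """Switch an alias; calls already running on the old version keep running."""
        if version not in self.in_flight:
            raise KeyError(f"unknown version: {version}")
        self.aliases[alias] = version

    def begin_call(self, alias):
        version = self.aliases[alias]
        self.in_flight[version] += 1
        return version

    def end_call(self, version):
        self.in_flight[version] -= 1

    def removable(self):
        """Versions that no alias points to and that have no running calls."""
        live = set(self.aliases.values())
        return sorted(v for v, n in self.in_flight.items() if v not in live and n == 0)


versions = FunctionVersions()
versions.publish("v1")
versions.publish("v2")
versions.point_alias("prod", "v1")
old_call = versions.begin_call("prod")   # starts on v1
versions.point_alias("prod", "v2")       # upgrade: new calls go to v2
new_call = versions.begin_call("prod")
before = versions.removable()            # v1 is still finishing a call
versions.end_call(old_call)
after = versions.removable()             # v1 has drained and can be destroyed
print(new_call, before, after)
```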

While a function runs, the logs it prints, along with its standard output and standard error, are uploaded to the Tencent Cloud log service platform, so users can monitor the function's running status in real time.

IV. Supporting on-demand business access while releasing idle resources

For cloud functions to be truly on demand, resources must be allocated lazily, at the time of the user's first call. The function call process is shown in the figure below:

During call distribution, the cloud function platform checks whether a function instance exists. If not, an instance is started on the spot, and the function call is executed once the instance is up. To make the first call fast enough, each step of the call process needs to be optimized:

  • Call distribution stage: reduce the number of distribution hops. For example, for synchronous HTTP calls initiated by users, the usual step of writing the request to the persistent queue can be skipped.

  • Image and code download stage: pre-deploy as much as possible to cut download time. For example, pre-loading is started in parallel for newly submitted functions, so nothing needs to be downloaded in real time when the first call arrives.

  • Container startup stage: simplify the container startup scripts to keep the startup process as light as possible. For latency-sensitive services, an instance reservation mechanism is provided: users can reserve a small number of instances to reduce the extra latency of the first call.

  • Function execution stage: minimize the number of memory copies caused by passing function parameters, return data, and logs.

  • Call return stage: minimize the number of layers the result passes through on the way back.
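
The check-then-start step that these optimizations target can be sketched as follows. The `Dispatcher` class and the `start_instance` callable are hypothetical names for this illustration. The sketch only shows that an instance is created lazily on the first call and reused afterwards, not how the real platform starts containers.

```python
class Dispatcher:
    """Starts a function instance on the first call and reuses it afterwards."""

    def __init__(self, start_instance):
        self._start_instance = start_instance  # callable: function name -> instance
        self._instances = {}
        self.cold_starts = 0

    def invoke(self, function_name, payload):
        instance = self._instances.get(function_name)
        if instance is None:
            # No instance yet: start one now (the slow path of the first call).
            instance = self._start_instance(function_name)
            self._instances[function_name] = instance
            self.cold_starts += 1
        return instance(payload)


def start_instance(function_name):
    # Stand-in for downloading the code and starting a container.
    return lambda payload: f"{function_name}({payload})"


dispatcher = Dispatcher(start_instance)
first = dispatcher.invoke("resize", "a.jpg")   # triggers the lazy start
second = dispatcher.invoke("resize", "b.jpg")  # reuses the running instance
print(first, second, dispatcher.cold_starts)
```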

Through this layer-by-layer optimization, the time the platform adds to the first call can be kept within 2 s, and the time it adds to subsequent calls within 5 ms. As the number of customer requests rises and falls, function instances are scaled out and in automatically. The general algorithm is as follows:

    if (number of current requests / number of current instances) > capacity expansion threshold:
        add instances
    else if (number of current requests / number of current instances) < capacity reduction threshold:
        remove instances

When scaling in reaches the last function instance, its release should be delayed for a period of time, so that the instance is not repeatedly started and stopped within a short period, which would add latency to calls.
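
A minimal sketch of this scaling rule, including the delayed release of the last instance, might look like the following. The `Autoscaler` class, its parameter names, and the choice to add or remove one instance per decision are assumptions for illustration. The article does not state step sizes or threshold values.

```python
class Autoscaler:
    """Decides the instance count for one function from its current load."""

    def __init__(self, expand_threshold, reduce_threshold, release_delay):
        self.expand_threshold = expand_threshold  # requests per instance
        self.reduce_threshold = reduce_threshold  # requests per instance
        self.release_delay = release_delay        # seconds to keep the last idle instance
        self._idle_since = None

    def desired_instances(self, requests, instances, now):
        if requests > 0:
            self._idle_since = None               # any traffic cancels a pending release
        if instances == 0:
            return 1 if requests > 0 else 0       # first call starts an instance on demand
        load = requests / instances
        if load > self.expand_threshold:
            return instances + 1
        if load < self.reduce_threshold:
            if instances > 1:
                return instances - 1
            if requests == 0:
                return self._release_last_after_delay(now)
        return instances

    def _release_last_after_delay(self, now):
        if self._idle_since is None:
            self._idle_since = now                # start the release countdown
        if now - self._idle_since >= self.release_delay:
            self._idle_since = None
            return 0
        return 1


scaler = Autoscaler(expand_threshold=10, reduce_threshold=2, release_delay=60)
steps = [
    scaler.desired_instances(requests=50, instances=2, now=0),   # overloaded
    scaler.desired_instances(requests=3, instances=5, now=10),   # underused
    scaler.desired_instances(requests=0, instances=1, now=20),   # idle, delay starts
    scaler.desired_instances(requests=0, instances=1, now=50),   # still within the delay
    scaler.desired_instances(requests=0, instances=1, now=80),   # delay elapsed
]
print(steps)  # [3, 4, 1, 1, 0]
```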

V. Keeping services uninterrupted while expanding the operating range

To ensure that cloud functions are never interrupted, two disaster recovery (DR) objectives must be met:

  • Services are not interrupted when hardware fails

  • Services are not interrupted during platform upgrades

To achieve these DR objectives, the architecture as a whole must be designed for them, with corresponding support at each layer:

  • Access layer: built on Tencent Cloud CLB to provide horizontal scaling, load balancing, and layer-7 routing;

  • Logic layer: modules are stateless and hold no state data internally, so any instance can be replaced at will;

  • Data layer: critical data is kept in consistent storage;

  • Node layer: node faults are detected quickly, and failed nodes are replaced and recovered.

For example, suppose the hardware under an Invoker module instance inside the platform fails, as shown in the figure below. Because the Invoker module is stateless, the CLB at the access layer can remove the failed instance automatically. After the removal, new requests are sent to the remaining Invoker instances, and asynchronous events that had already been received are retried and completed by other Invokers. A synchronous HTTP call returns an error directly to the user, who retries the request. Once the failed Invoker instance recovers, it is automatically added back to the CLB and continues to share the load.
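
The failover behavior described above can be sketched roughly as follows. `InvokerPool` stands in for the CLB's view of the Invoker instances and `deliver` for the dispatch path. Both names, the round-robin policy and the retry limit are assumptions made for this example.

```python
class InvokerPool:
    """Round-robin over healthy, stateless Invoker instances."""

    def __init__(self, invokers):
        self._invokers = list(invokers)
        self._healthy = set(self._invokers)
        self._next = 0

    def mark_down(self, name):
        self._healthy.discard(name)

    def mark_up(self, name):
        if name in self._invokers:
            self._healthy.add(name)

    def pick(self):
        for _ in range(len(self._invokers)):
            candidate = self._invokers[self._next % len(self._invokers)]
            self._next += 1
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy invoker available")


def deliver(pool, send, event, synchronous, max_attempts=3):
    """Retry asynchronous events on another Invoker; surface errors for synchronous calls."""
    attempts = 1 if synchronous else max_attempts
    last_error = None
    for _ in range(attempts):
        invoker = pool.pick()
        try:
            return send(invoker, event)
        except ConnectionError as exc:
            pool.mark_down(invoker)  # remove the failed instance from rotation
            last_error = exc
    raise last_error


def send(invoker, event):
    # Simulated transport: inv-1 has a hardware fault.
    if invoker == "inv-1":
        raise ConnectionError("inv-1 is unreachable")
    return f"{invoker} handled {event}"


pool = InvokerPool(["inv-1", "inv-2"])
async_result = deliver(pool, send, "upload-event", synchronous=False)
print(async_result)
```

A synchronous call that lands on the failed instance raises the error to the caller instead of retrying, which matches the behavior in which the user retries the request.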

When the platform needs to upgrade an API interface, it follows an add-only, never-modify strategy: a new version of the interface is provided alongside the existing one, so existing user services stay compatible. As users move from the old interface to the new one, the CLB uses layer-7 routing to send their requests to the new-version Invoker instances. The old-version instances gradually scale in as their load decreases, and the new-version instances gradually scale out as their load increases. In this way, the version upgrade is smooth and transparent to users.

To extend the reach of cloud functions, they need to connect with various cloud components, which in turn need to provide event registration and callback mechanisms. Each cloud component exposes the events that can be registered and the corresponding callback interfaces, and the cloud function platform ensures that user permissions are passed through when it communicates with cloud components. Cloud functions are currently connected with the COS storage component of Tencent Cloud and will soon be connected with CMQ, cloud monitoring and other Tencent Cloud products. The operating range will also be extended to CDN nodes and IoT device gateways to enable edge computing.
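
The registration and callback relationship can be illustrated with a small sketch. `CloudComponent` and `FunctionPlatform` are hypothetical classes, not Tencent Cloud APIs, and the permission pass-through is reduced here to carrying the user identity in the event and invoking only that user's function.

```python
from collections import defaultdict


class CloudComponent:
    """A cloud product that lets callers register callbacks for its events."""

    def __init__(self, name):
        self.name = name
        self._callbacks = defaultdict(list)

    def register(self, event_type, callback):
        self._callbacks[event_type].append(callback)

    def emit(self, event_type, user, detail):
        event = {"source": self.name, "type": event_type, "user": user, "detail": detail}
        return [callback(event) for callback in self._callbacks[event_type]]


class FunctionPlatform:
    """Holds user functions and binds them to component events."""

    def __init__(self):
        self._functions = {}  # (user, function name) -> callable

    def deploy(self, user, name, func):
        self._functions[(user, name)] = func

    def bind(self, component, event_type, user, name):
        def on_event(event):
            # Only events raised on behalf of the owning user trigger the function.
            if event["user"] != user:
                return None
            return self._functions[(user, name)](event)
        component.register(event_type, on_event)


storage = CloudComponent("storage")
platform = FunctionPlatform()
platform.deploy("alice", "thumbnail", lambda event: f"thumbnail for {event['detail']}")
platform.bind(storage, "object-created", "alice", "thumbnail")

own_upload = storage.emit("object-created", "alice", "a.jpg")
other_upload = storage.emit("object-created", "bob", "b.jpg")
print(own_upload, other_upload)
```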

VI. Supporting free operation of business while avoiding interference and intrusion

Code that passes the user's local tests should run seamlessly on the cloud function platform. This requires sufficient compatibility: the function runtime environment should have software packages, security settings and other configuration similar to the user's development and test environment. At the same time, interference between functions must be avoided and malicious intrusion prevented.

To avoid interference between user functions, the platform uses Docker containers to encapsulate function instances and isolates users through Docker namespace isolation, permission restrictions and other mechanisms, supplemented by real-time conflict monitoring, scheduling and other measures to deal with interference promptly.

To keep user-executed code from affecting the cloud function platform as a whole, the management platform is isolated from user functions, as shown in the figure below. User functions cannot see the management platform's network addresses, run logs or other information, and therefore cannot affect its operation.

To prevent network probing and intrusion by malicious code, user function instances are confined to a restricted public VPC, as shown in the following figure. They must go through a gateway to communicate with external services, other function instances, and cloud components. At the same time, so that user function instances can work with a user's own CVM virtual machines, the cloud function platform connects to the user's private VPC through an elastic network interface.

VII. Development trends in the cloud function industry

In recent years, serverless, microservices and related concepts have gradually gained popularity, and cloud functions have begun to be understood and accepted by users. To meet users' demands for faster launch, lower cost and better architecture, Tencent Cloud launched its cloud function product. Users may start by applying cloud functions to practical problems, such as building a simple service probing tool, running a scheduled task, or processing pictures, videos and files stored in COS. As more cloud components can be linked with cloud functions, and as language support, debugging tools and workflow engines gradually improve, cloud functions will become the glue of the whole cloud platform, fusing the various cloud components together so that the cloud serves as the user's common backend. They will then be able to support more complex stateful service scenarios and become a general-purpose option that users consider by default.

You are welcome to try Tencent Cloud's serverless cloud function product. Cloud functions take care of general backend problems such as secure access, fault recovery, automatic scaling, cost optimization and version management, so users can relax and focus on business innovation. Through cloud functions, we hope to further open up the capabilities Tencent has built over many years of running massive-scale services, share them with our users, and grow together with everyone.


Q: How is the code deployed in Docker?

A: The code is downloaded directly to the host machine, and the code directory is then mounted into the Docker container.

Q: Are cloud functions generic, or can they only run on cloud platforms?

A: There are currently many open source cloud function platforms on GitHub, such as OpenLambda and iron.io. Still, using a cloud service directly is recommended, because it connects with multiple cloud products, and it is difficult to build a complete service with cloud functions alone.

Q: Are queues used for event passing?

A: Asynchronous events are persisted in a CMQ message queue. Synchronous events do not use a queue.

Q: Are cloud functions limited to certain development languages? If so, what is the current state of support for Go?

A: Python 2.7/3.6, Node.js 4.3/6.10 and Java 8 are currently supported. Other languages, such as PHP and Go, will be supported if there is broad user demand.

Q: Can system functions be called? Any suggestions on the granularity of custom functions?

A: Most system calls are available, except for some dangerous operations such as shutdown, restart, and listening on network ports. For function granularity, refer to the design principles of microservices: functions can be as fine-grained as possible.

Q: Has it been put into practice?

A: There are already many user cases, and we will share some of them later. You might as well try it yourself: it is currently free, and we will always offer a free tier. If you have any needs, contact us directly.

Q: Do cloud functions support Kotlin?

A: I haven’t received any feedback from users about the need for this language, but I’m optimistic and will keep an eye on it

Q: How are requests scheduled to function instances, and how is the scheduling algorithm implemented?

A: It is essentially a general load balancing and scaling algorithm. The more complicated part is predicting the need to scale out in advance, which will be shared in detail later.

Q: Can you describe the implementation of scheduling requests to function instances?

A: There is an Invoker module that maintains a request queue for each function. Currently no priorities are set, and the request queue is served on a first-come, first-served basis. Each function instance runs a loop that accepts requests and, when one is received, calls the user function with the request's argument.
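
A minimal sketch of this arrangement, using Python's standard `queue` and `threading` modules: the `Invoker` class, the `instance_loop` function and the `None` shutdown sentinel are illustrative assumptions rather than the platform's actual code, and error handling inside the loop is omitted.

```python
import queue
import threading


class Invoker:
    """Keeps one first-come, first-served request queue per function."""

    def __init__(self):
        self._queues = {}  # function name -> FIFO queue of (argument, reply queue)

    def queue_for(self, name):
        return self._queues.setdefault(name, queue.Queue())

    def submit(self, name, argument):
        reply = queue.Queue(maxsize=1)
        self.queue_for(name).put((argument, reply))
        return reply


def instance_loop(requests, user_function):
    """Loop run by a function instance: take a request, call the user function."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: the instance is being released
            break
        argument, reply = item
        reply.put(user_function(argument))


invoker = Invoker()
worker = threading.Thread(
    target=instance_loop,
    args=(invoker.queue_for("double"), lambda x: x * 2),
)
worker.start()

replies = [invoker.submit("double", n) for n in (1, 2, 3)]
results = [reply.get(timeout=5) for reply in replies]

invoker.queue_for("double").put(None)  # stop the instance loop
worker.join()
print(results)  # [2, 4, 6]
```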

Q: Can the code be moved to other clouds?

A: The code usually calls other cloud products, so it generally depends on the cloud platform. You can look at the open source Serverless Framework, which adds a layer of encapsulation on top of public cloud functions to remove that dependency and allow smooth migration across cloud platforms.

Q: What are the limitations of the code for cloud functions? For example, what functions cannot be called and what libraries cannot be imported?

A: The code is essentially unrestricted, but the platform prohibits malicious behavior such as shutdown, restart and port scanning. Port listening is also disabled, because resident processes go against the principle of running cloud functions on demand. If the pre-installed libraries do not meet your needs, you can package the libraries you depend on into the ZIP file and upload them.

Q: What is the underlying container orchestration based on? K8s?

A: It is based on Tencent Cloud's container platform, whose underlying layer is K8s.



Published by the Tencent Cloud community with the author's authorization. When reposting, please cite the source. Original link: https://cloud.tencent.com/community/article/455966