Understanding retry policy configuration in Tencent Cloud Serverless

A function invocation can fail for many reasons. Both the error type and the invocation mode (synchronous or asynchronous) affect the retry strategy. In production, many developers find this policy confusing. This article explains the retry policy of the Serverless asynchronous queue in full and offers configuration suggestions for various usage scenarios.

Error types

Function invocation errors fall into the following categories:

Invocation errors

Invocation errors occur before the function is actually executed, in any of the following cases:

Invocation request error. For example, the incoming event payload is too large, an input parameter does not meet the requirements, or the function does not exist.
Caller error. This mainly occurs when the caller lacks sufficient permissions.
Over-limit error. The number of concurrent invocations exceeds the maximum concurrency quota.

Execution errors

Execution errors occur while the function is actually running, in either of the following cases:

User code error. This type of error occurs during user code execution, for example when the function code throws an exception or the returned result is incorrectly formatted.
Runtime error. During function execution, the Runtime is responsible for launching and executing user code. Runtime errors are errors detected and reported by the Runtime, such as function timeouts or code syntax errors.

System errors

Errors raised by the function platform itself, such as internal errors.

Retry strategy

Both the error type and the invocation mode (synchronous or asynchronous) affect the retry strategy.

Synchronous invocation

Synchronous invocations include synchronous invocations through cloud API triggers, API Gateway triggers, and CKafka triggers. In a synchronous invocation, error information is returned directly to the caller, so the platform does not retry automatically when an error occurs. Whether to retry, and how many times, is up to the caller.
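
Because the platform does not retry synchronous invocations, any retry has to live in the caller. The following is a minimal sketch of such a caller-side retry loop in Python. The `invoke` callable, the `InvokeError` exception, and the backoff parameters are hypothetical placeholders for illustration; they are not part of any Tencent Cloud SDK.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class InvokeError(Exception):
    """Hypothetical error raised by the caller's own invoke wrapper."""

    def __init__(self, message: str, retryable: bool) -> None:
        super().__init__(message)
        self.retryable = retryable


def call_with_retry(
    invoke: Callable[[], T],
    max_retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `invoke`, retrying retryable failures with exponential backoff."""
    attempt = 0
    while True:
        try:
            return invoke()
        except InvokeError as err:
            # Give up on non-retryable errors or once retries are exhausted.
            if not err.retryable or attempt >= max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1


# Usage: a stand-in for a synchronous invocation that fails twice, then succeeds.
attempts: List[int] = []


def flaky_invoke() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise InvokeError("temporary failure", retryable=True)
    return "ok"


# The no-op sleep keeps the demo instantaneous.
result = call_with_retry(flaky_invoke, sleep=lambda seconds: None)
```

Marking errors as retryable or not is the caller's decision in this sketch. Request errors such as a missing function would normally be treated as non-retryable, since repeating the same request cannot succeed.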

Asynchronous invocation

Asynchronous invocations include asynchronous invocations through cloud API triggers, COS triggers, timer triggers, CMQ Topic triggers, and so on. For the invocation type of a specific trigger, see the documentation for that trigger. Under the new retry policy, developers can change the default values of the retry times and maximum waiting time in the function configuration to suit their service requirements. These settings apply only to asynchronous invocations.

Retry times: the number of times the cloud function is retried when the function returns an error. This parameter applies only to the retry policy for execution errors. The default value is 2.
Maximum waiting time: the maximum duration for which the cloud function retains an event in the asynchronous event queue. This parameter applies to all asynchronous invocation retry configurations. The default value is 6 hours, and the maximum configurable value is 10w.

Retry strategies for asynchronous invocations, by error type:

Execution errors (both user code errors and Runtime errors): the function platform retries twice by default, or as many times as you have configured, at a fixed interval of one minute. New trigger events are still processed normally while the automatic retries take place. If you have configured a dead letter queue, events that still fail after all retries (three failures in total with the default setting) are sent to the dead letter queue; otherwise the function platform discards them.
System errors: the function platform keeps retrying for the maximum waiting time you have set (6 hours by default). The retry interval grows exponentially up to 5 minutes. If you have configured a dead letter queue, events that still fail after the maximum waiting time are sent to the dead letter queue for you to process further; otherwise the function platform discards them.
Over-limit errors: the function platform keeps retrying for the maximum waiting time you have configured (6 hours by default). The retry interval is 1 minute. If you have configured a dead letter queue, events that still fail after the maximum waiting time are sent to the dead letter queue for you to process further; otherwise the function platform discards them.
Invocation request errors and caller errors: the platform does not retry invocation errors other than over-limit errors, because such requests would fail again even if retried.
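
To make the intervals above concrete, here is a small Python sketch that computes when retries would happen (in seconds after the first failure) under each of the three schedules: a fixed interval limited by the retry times, a fixed interval limited by the maximum waiting time, and exponential backoff limited by the maximum waiting time. It is a simplified model built from the description in this article, not the platform's implementation. The function names are hypothetical, execution time is ignored, and the 60-second first interval and the doubling factor for backoff are assumptions, since the article only says that the interval grows exponentially up to 5 minutes.

```python
from typing import List

MINUTE = 60
DEFAULT_RETRY_TIMES = 2              # default stated in the article
DEFAULT_MAX_WAIT = 6 * 60 * MINUTE   # 6 hours, default stated in the article


def count_limited_schedule(retry_times: int = DEFAULT_RETRY_TIMES,
                           interval: int = MINUTE) -> List[int]:
    """Fixed-interval retries, limited by the configured retry times."""
    return [interval * (i + 1) for i in range(retry_times)]


def fixed_interval_schedule(max_wait: int = DEFAULT_MAX_WAIT,
                            interval: int = MINUTE) -> List[int]:
    """Fixed-interval retries that continue until max_wait has elapsed."""
    return list(range(interval, max_wait + 1, interval))


def backoff_schedule(max_wait: int = DEFAULT_MAX_WAIT,
                     first_interval: int = MINUTE,  # assumption, see lead-in
                     cap: int = 5 * MINUTE) -> List[int]:
    """Exponential backoff capped at `cap`, continuing until max_wait."""
    times: List[int] = []
    elapsed = 0
    interval = first_interval
    while elapsed + interval <= max_wait:
        elapsed += interval
        times.append(elapsed)
        # Double the interval (assumed factor) but never exceed the cap.
        interval = min(interval * 2, cap)
    return times
```

With the defaults, `count_limited_schedule()` returns `[60, 120]`: two retries one minute apart, so an event fails three times in total before it goes to the dead letter queue or is discarded. Passing `retry_times=0` returns an empty schedule, meaning the event is never retried.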

Retry configuration suggestions

General usage scenarios:

In common scenarios, the default settings are recommended; they meet retry requirements for most error cases.

Scenarios sensitive to code re-entry:

For scenarios that are sensitive to code re-entry, where running the same event twice is unsafe, set the retry times to 0 so that code errors are not retried.

Time-sensitive event handling:

In scenarios where events must be processed promptly and retries only make sense within a certain period, you can shorten the maximum waiting time so that expired events are dropped in time and retries remain meaningful.

Concurrency over-limit scenarios:

ResourceLimitReached indicates that the number of concurrent executions of the cloud function (SCF) has exceeded the concurrency quota. Concurrency over-limit errors are handled differently for synchronous and asynchronous invocations. For asynchronous invocations, SCF retries automatically when the concurrency limit is exceeded, and no data is discarded within the maximum waiting time, so in most cases users of asynchronous invocation do not need to take any action. If an asynchronous workload is time-sensitive, you can reduce or avoid the impact of over-limit errors on the business system by configuring reserved concurrency. For more important data, you can also configure a dead letter queue. For synchronous invocations, the error is returned directly to the caller.
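
The handling of a concurrency over-limit error described above can be summarized as a small decision function. This is an illustrative Python sketch only: the `Outcome` enum, the `handle_over_limit` function, and its parameters are hypothetical names, and the logic is a simplified reading of this article rather than the platform's actual code.

```python
from enum import Enum

DEFAULT_MAX_WAIT = 6 * 60 * 60  # 6 hours in seconds, default stated in the article


class Outcome(Enum):
    RETURN_ERROR_TO_CALLER = "return error to caller"
    RETRY = "retry automatically"
    DEAD_LETTER_QUEUE = "send to dead letter queue"
    DISCARD = "discard event"


def handle_over_limit(is_async: bool,
                      event_age: int = 0,
                      max_wait: int = DEFAULT_MAX_WAIT,
                      has_dead_letter_queue: bool = False) -> Outcome:
    """Decide what happens to an event that hit the concurrency limit.

    event_age is the number of seconds the event has spent in the
    asynchronous queue; it is ignored for synchronous invocations.
    """
    if not is_async:
        # Synchronous invocation: the error goes straight back to the caller.
        return Outcome.RETURN_ERROR_TO_CALLER
    if event_age < max_wait:
        # Still within the maximum waiting time: keep retrying.
        return Outcome.RETRY
    # The maximum waiting time has elapsed.
    if has_dead_letter_queue:
        return Outcome.DEAD_LETTER_QUEUE
    return Outcome.DISCARD
```

In this model, configuring a dead letter queue is what keeps an important event from being lost once the maximum waiting time runs out.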

Conclusion

Retry configuration is now fully available. You can configure retry policies based on your service requirements, although the platform's default retry policy meets most developers' needs. For more on the development of asynchronous queue retry configuration, follow the WeChat account or the official website.
