To cope with high concurrency and massive data access, most systems evolve from a single-node architecture to a cluster architecture and inevitably adopt an RPC framework. However, because of uncontrollable external factors such as network jitter, remote calls occasionally fail, and in some scenarios a retry is required. The following describes some common retry schemes.

Immediate retry with a fixed number of attempts

See the relevant code in Dubbo's com.alibaba.dubbo.rpc.cluster.support.FailoverClusterInvoker#doInvoke method:

public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
    ...
    int len = getUrl().getMethodParameter(invocation.getMethodName(), Constants.RETRIES_KEY, Constants.DEFAULT_RETRIES) + 1;
    ...
    // retry loop.
    RpcException le = null; // last exception.
    List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyinvokers.size()); // invoked invokers.
    Set<String> providers = new HashSet<String>(len);
    for (int i = 0; i < len; i++) {
        ...
        try {
            Result result = invoker.invoke(invocation);
            ...
            return result;
        } catch (RpcException e) {
            if (e.isBiz()) { // biz exception.
                throw e;
            }
            le = e;
        } catch (Throwable e) {
            le = new RpcException(e.getMessage(), e);
        } finally {
            providers.add(invoker.getUrl().getAddress());
        }
    }
    throw new RpcException(le != null ? le.getCode() : 0);
}

Tips

Common configurations of Dubbo are as follows:

<dubbo:reference id="xxxx" interface="xx" retries="3" timeout="1000"/>

retries="3" allows up to three retries, so a call is attempted at most four times before an exception is thrown. timeout="1000" is in milliseconds and sets the timeout of the Dubbo call: timeout detection starts when the client invokes the service, and if no response arrives within 1 second the client reports a timeout exception. A call that times out is retried. If the call fails with another exception, it is retried immediately instead of waiting for the timeout (business exceptions are rethrown without a retry, as the e.isBiz() check above shows). So take note:

  • In the worst case, the maximum response time of the interface is (retries + 1) * timeout.
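As a rough illustration of the formula above, the following sketch computes the bound for the configuration shown earlier. WorstCaseLatency and worstCaseMillis are hypothetical names introduced here for illustration; they are not part of Dubbo.

```java
// Hypothetical helper (not a Dubbo API): upper bound on client-side latency
// when every attempt runs into the timeout.
final class WorstCaseLatency {

    /** retries is the value of the retries attribute, timeoutMillis the value of timeout. */
    static long worstCaseMillis(int retries, long timeoutMillis) {
        // One initial call plus `retries` retries, each of which may take the full timeout.
        return (retries + 1L) * timeoutMillis;
    }

    public static void main(String[] args) {
        // retries="3", timeout="1000"  ->  (3 + 1) * 1000 ms
        System.out.println(worstCaseMillis(3, 1000) + " ms"); // prints "4000 ms"
    }
}
```

A caller whose own deadline is shorter than this bound may want a smaller retries or timeout value.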

Retry with fixed delay

Add a fixed delay to the Dubbo retry loop, for example by calling Thread.sleep(delay) to simulate it.

for (int i = 0; i < len; i++) {
    ...
    try {
        Result result = invoker.invoke(invocation);
        ...
        return result;
    } catch (RpcException e) {
        Thread.sleep(delay);
    }
}

This approach can cause intermittent load spikes on the interface being called, because clients that fail at the same moment also retry at the same moment.
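The loop above is only an excerpt. A minimal self-contained sketch of the same idea, outside Dubbo, might look like the following. FixedDelayRetry, its parameters, and the choice to retry on any RuntimeException are assumptions made for illustration.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Dubbo: retry with a fixed pause between attempts.
final class FixedDelayRetry {

    /**
     * Calls task up to maxAttempts times, sleeping delayMillis after each failed
     * attempt except the last. Rethrows the last failure if every attempt fails.
     */
    static <T> T call(Supplier<T> task, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure; it is rethrown if no attempt succeeds
            }
            if (attempt < maxAttempts) {
                sleep(delayMillis); // fixed pause before the next attempt
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set and stop retrying
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

For example, FixedDelayRetry.call(() -> client.query(), 3, 200) would try a hypothetical client.query() up to three times with 200 ms between attempts.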

Retry with random delay

The gRPC source code provides an example. Its parameters have the following meanings:

  • INITIAL_BACKOFF (how long to wait after the first failure before retrying)
  • MULTIPLIER (factor with which to multiply backoff after a failed retry)
  • JITTER (by how much to randomize backoffs).
  • MAX_BACKOFF (upper bound on backoff)
  • MIN_CONNECT_TIMEOUT (minimum time we’re willing to give a connection to complete)

ConnectWithBackoff()
  current_backoff = INITIAL_BACKOFF
  current_deadline = now() + INITIAL_BACKOFF
  while (TryConnect(Max(current_deadline, now() + MIN_CONNECT_TIMEOUT)) != SUCCESS)
    SleepUntil(current_deadline)
    current_backoff = Min(current_backoff * MULTIPLIER, MAX_BACKOFF)
    current_deadline = now() + current_backoff +
      UniformRandom(-JITTER * current_backoff, JITTER * current_backoff)
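The delay computation in the pseudocode above can be sketched in Java roughly as follows. This is a simplified illustration, not gRPC code: it covers only the wait-time calculation (no connection handling and no MIN_CONNECT_TIMEOUT), it applies jitter to every wait including the first, and the constant values are assumptions chosen for the example.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of exponential backoff with jitter; not taken from gRPC.
final class JitteredBackoff {
    // Illustrative values only; treat them as assumptions.
    static final long INITIAL_BACKOFF_MS = 1_000;
    static final double MULTIPLIER = 1.6;
    static final double JITTER = 0.2;
    static final long MAX_BACKOFF_MS = 120_000;

    /** Next backoff before jitter: grow by MULTIPLIER, capped at MAX_BACKOFF_MS. */
    static long nextBackoff(long currentBackoffMs) {
        return Math.min((long) (currentBackoffMs * MULTIPLIER), MAX_BACKOFF_MS);
    }

    /** Backoff plus a uniform random offset in [-JITTER * backoff, +JITTER * backoff). */
    static long withJitter(long backoffMs) {
        double offset = ThreadLocalRandom.current().nextDouble(-JITTER, JITTER) * backoffMs;
        return backoffMs + (long) offset;
    }

    public static void main(String[] args) {
        long backoff = INITIAL_BACKOFF_MS;
        for (int attempt = 1; attempt <= 5; attempt++) {
            // The printed values differ from run to run because of the random jitter.
            System.out.printf("attempt %d: wait about %d ms%n", attempt, withJitter(backoff));
            backoff = nextBackoff(backoff);
        }
    }
}
```

Because each client draws its own random offset, clients that failed together no longer retry at exactly the same instant, which is the main weakness of the fixed-delay scheme.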

Spring Retry library

The @EnableRetry annotation enables the retry framework and must be placed on a class.

proxyTargetClass: a boolean that specifies the proxy mode. If the value is true, CGLIB proxies are used; if it is false, JDK dynamic proxies are used. The default is false, that is, JDK dynamic proxies.

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@EnableAspectJAutoProxy(
    proxyTargetClass = false
)
@Import({RetryConfiguration.class})
@Documented
public @interface EnableRetry {
    boolean proxyTargetClass() default false;
}

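The @Retryable annotation is placed on the method to be retried

For example, @Retryable(value = Exception.class, maxAttempts = 3, backoff = @Backoff(delay = 3000, multiplier = 2)) retries when an Exception is thrown, makes at most 3 attempts, waits 3000 ms before the first retry, and doubles the delay for each subsequent retry.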

Retry strategy

@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface Retryable {
    String interceptor() default "";

    Class<? extends Throwable>[] value() default {};

    Class<? extends Throwable>[] include() default {};

    Class<? extends Throwable>[] exclude() default {};

    String label() default "";

    boolean stateful() default false;

    int maxAttempts() default 3;

    String maxAttemptsExpression() default "";

    Backoff backoff() default @Backoff;

    String exceptionExpression() default "";
}

  • value: only the specified exception types trigger a retry.
  • include: same function as value.
  • exclude: exception types that are not retried.
  • maxAttempts: the maximum number of attempts, including the first call (3 by default).
  • backoff: the backoff (wait) policy between attempts.

Backoff strategy

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Import({RetryConfiguration.class})
@Documented
public @interface Backoff {
    long value() default 1000L;

    long delay() default 0L;

    long maxDelay() default 0L;

    double multiplier() default 0.0D;

    String delayExpression() default "";

    String maxDelayExpression() default "";

    String multiplierExpression() default "";

    boolean random() default false;
}

  • backoff: the retry wait policy.
  • value: the delay before a retry, in milliseconds. The default is 1000L, that is, 1 second.
  • multiplier: the factor by which the delay grows. The default is 0, which means each retry follows a fixed pause of 1 second.

@Recover: the annotated method is called back once the specified number of attempts has been exhausted

  • The type of the exception thrown must match the parameter type of the @Recover method.
  • The return type of the retried method must be the same as that of the @Recover method.

@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@Import({RetryConfiguration.class})
@Documented
public @interface Recover {
}
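Spring Retry applies these annotations through proxies. The plain-Java sketch below does not use Spring at all; it only mimics the contract described in the two bullets above: the fallback receives the exception, and it returns the same type as the retried operation. RetryWithRecover and its parameters are hypothetical names introduced for illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical emulation of the retry-then-recover flow; this is not Spring Retry code.
final class RetryWithRecover {

    /**
     * Runs action up to maxAttempts times. Once the attempts are exhausted, the last
     * exception is handed to recover, which must return the same type T as action.
     */
    static <T> T execute(Supplier<T> action, int maxAttempts,
                         Function<RuntimeException, T> recover) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        return recover.apply(last); // plays the role of the recovery callback
    }

    public static void main(String[] args) {
        String result = RetryWithRecover.<String>execute(
                () -> { throw new IllegalStateException("remote call failed"); },
                3,
                e -> "fallback after: " + e.getMessage());
        System.out.println(result); // prints "fallback after: remote call failed"
    }
}
```

The generic parameter T is what enforces the "same return type" rule at compile time in this sketch; Spring Retry instead matches the recovery method by its signature at run time.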

For a detailed analysis of the Spring Retry source code, see: albenw.github.io/posts/69a96…