Chapter 4 RocketMQ applications

Message sending retry mechanism

1 shows

The mechanism used by Producer to resend messages that fail to be sent is also known as message sending retry mechanism.

For message recasting, note the following:

  • When a producer sends a message in synchronous or asynchronous mode, the producer tries again after the message fails to be sent. In OneWay mode, however, there is no retry mechanism
  • Only ordinary messages have a send retry mechanism, not sequential messages
  • The message resending mechanism can ensure that the message is sent successfully and not lost, but it may cause message duplication. Message duplication is an unavoidable problem in RocketMQ
  • In general, message repetition does not occur. However, when a large number of messages and network jitter occur, message repetition becomes a high probability event
  • Duplicating messages can also result from producer active resending and consumer load changes that Rebalance the message
  • Message duplication is unavoidable, but repeated consumption of messages should be avoided.
  • The solution to avoiding double consumption of messages is to add a unique identifier (such as a message key) to the message so that consumers can make consumption judgments about the message to avoid double consumption
  • There are three policies for message sending retry: synchronous sending failure policy, asynchronous sending failure policy, and message flushing failure policy

2 Failed to synchronize the sending policy

For ordinary messages, the round-robin policy is adopted by default to select the queue to be sent to. If the sending fails, retry twice by default. However, the Broker that failed last time is not selected during retries, but another Broker is selected. Of course, if there is only one Broker, it can only be sent to that Broker, but it will try to send to other queues on that Broker.

// Create a producer with the name of the Producer Group
DefaultMQProducer producer = new DefaultMQProducer("pg");
// Specify the nameServer address
producer.setNamesrvAddr("rocketmqOS:9876");
// Set the retry times when sending synchronization fails. The default value is 2
producer.setRetryTimesWhenSendFailed( 3 );
// Set the sending timeout period to 5s, 3s by default
producer.setSendMsgTimeout( 5000 );
Copy the code

Brokers also have failure isolation, enabling producers to select as many brokers as possible that have never failed to send. It can ensure that no other messages are sent to the problem Broker, to improve message sending efficiency and reduce message sending time.

Consider: How do we implement failure isolation ourselves?

1) Scheme 1: A Map set of a JUC is maintained in Producer, whose key is the timestamp of failure and whose value is the Broker instance. The Producer also maintains a Set that holds all Broker instances that did not send exceptions. Select the target Broker from the Set collection. Define a scheduled task to periodically purge brokers from the Map collection that have not experienced sending exceptions for a long time and add them to the Set collection.

Option 2: Add an identity, such as an AtomicBoolean property, to the Broker instance in Producer. Set it to true whenever a send exception has occurred on the Broker. Selecting the target Broker selects the Broker with this property value of false. Define a scheduled task that periodically sets this property of the Broker to false.

3) Scenario 3: Add an identity to the Broker instance in Producer, such as an AtomicLong property. Incrementing the value by one whenever a send exception occurs on the Broker. Selecting the target Broker selects the Broker with the smallest value for this property. If the values are the same, select polling mode.

If the number of retries exceeds, an exception is thrown and the Producer ensures that the message is not lost. Of course, when a Producer has a RemotingException, MQClientException, and MQBrokerException, the Producer automatically recasts the message.

3 Asynchronous sending failure policy

If asynchronous sending fails and retry, the asynchronous retries are performed on the same broker instead of other brokers. Therefore, this policy cannot ensure message loss.

DefaultMQProducer producer = new DefaultMQProducer("pg"); producer.setNamesrvAddr("rocketmqOS:9876");/ / specify asynchronous send failed do not retry sending producer. SetRetryTimesWhenSendAsyncFailed (0);
Copy the code

4 Failed to flush messages

By default, a Slave does not attempt to send a message to another Broker when the message flush timeout (Master or Slave) or the Slave is unavailable (when the Slave returns a status other than SEND_OK to the Master during data synchronization). However, for important messages can be set in the configuration files of the Broker retryAnotherBrokerWhenNotStoreOK attribute to true to open it.

Message consumption retry mechanism

1 Consumption retry of sequential messages

For sequential messages, when a Consumer fails to consume a message, it automatically retries the message continuously until the consumption succeeds in order to keep the message sequential. The default consumption retry interval is 1000 ms. Application message consumption is blocked during retry.

DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("cg");/ / order message consumption failed retry interval, unit of milliseconds, the default value is 1000, its scope is [10300] consumer. SetSuspendCurrentQueueTimeMillis (100);
Copy the code

Since the retry of sequential messages is endless and continuous until the consumption succeeds, it is important to ensure that the application can monitor and handle the consumption failure in a timely manner to prevent the consumption from being permanently blocked.

Note that sequential messages do not have a retry mechanism for sending failures, but have a retry mechanism for consuming failures

Retry consumption of unordered messages

For unordered messages (normal messages, delayed messages, transaction messages), when the Consumer fails to consume the message, the message retry effect can be achieved by setting the return status. Note, however, that retry of unordered messages only applies to cluster consumption. Broadcast consumption does not provide retry failures. That is, after a broadcast consumption fails, the failed message is not retried and subsequent messages continue to be consumed.

3 Consumption retry times and interval

For the retry consumption in the case of unordered message cluster consumption, each message can be retried for a maximum of 16 times by default. However, the retry interval is different and gradually increases. The following table describes the retry interval.

Retry count Time elapsed since the last retry Retry count Time elapsed since the last retry
1 10 seconds 9 7 minutes
2 30 seconds 10 Eight minutes
3 1 minute 11 9 minutes
4 2 minutes 12 Ten minutes
5 3 minutes 13 Twenty minutes
6 4 minutes 14 30 minutes
7 5 minutes 15 1 hour
8 6 minutes 16 2 hours

If a message fails to be consumed continuously, a 16th retry will be made 4 hours and 46 minutes after normal consumption. If it still fails, the message is posted to the dead-letter queue

Example Change the retry times of consumption

DefaultMQPushConsumer consumer = new DefaultMQPushConsumer("cg");
// Change the consumption retry count
consumer.setMaxReconsumeTimes( 10 );
Copy the code

For the modified number of retries, the following policies apply:

  • If the value is less than 16, retry at a specified interval
  • If the value is greater than 16, the retry interval after 16 times is 2 hours

For a Consumer Group, changing the consumption retry count for only one Consumer applies to all other Consumer instances in the Group. If multiple consumers have made modifications, the overwriting method will take effect. That is, the last value to be changed overwrites the previous value.

4 Retry Queue

For messages requiring retry consumption, the Consumer does not wait for a specified period of time to pull the original message for consumption, but puts the messages requiring retry consumption into a special Topic queue and then consumes them again. This particular queue is the retry queue.

When there is a message that needs to be retried, the Broker sets up a RETRY queue with Topic name %RETRY%consumerGroup@consumerGroup for each consumer group.

1) This retry queue is for message groups, not per Topic (a Topic message can be consumed by multiple consumer groups, so a retry queue is created for each of these consumer groups)

2) A retry queue is created for the consumer group only when there is a message requiring retry consumption

Note that the time interval between consumption retries is very similar to the delay level for delayed consumption, and The Times are the same except for the first two times without a delay level

The Broker handles retry messages through delayed messages. The message is saved to the SCHEDULE_TOPIC_XXXX delay queue. After the delay expires, the message is delivered to the %RETRY%consumerGroup@consumerGroup RETRY queue.

5 Consumption retry configuration mode

In cluster consumption mode, if you want to consume retry after message consumption fails, you need to explicitly configure one of the following three methods in the implementation of the message listener interface:

  • Method 1: return ConsumeConcurrentlyStatus. RECONSUME_LATER (recommended)
  • Method 2: Return Null
  • Method 3: Throw an exception

6 Consumption Does not retry configuration

Cluster consumption mode, message consumption after failure if don’t want to try again, then after capturing the abnormal return and also after the success of the consumption of the same result, namely ConsumeConcurrentlyStatus. CONSUME_SUCCESS, do not try again to consume.

Dead letter queue

1 What is a dead letter queue

When a message fails to be consumed for the first time, the message queue automatically retries the consumption. If the consumption still fails after the maximum number of retries is reached, it indicates that the consumer cannot consume the message correctly under normal circumstances. In this case, the message queue does not immediately discard the message, but sends it to the special queue corresponding to the consumer. This Queue is called a dead-letter Queue (DLQ), and the messages in it are called dead-letter messages (DLM).

Dead-letter queues are used to process messages that cannot be consumed normally.

2 Characteristics of dead-letter queues

Dead-letter queues have the following characteristics:

  • Messages in a dead-letter queue are no longer normally consumed by consumers, that is, the DLQ is invisible to consumers
  • The validity period of dead letters is the same as that of normal messages. The dead letters are automatically deleted after 3 days (the expiration time of commitlog files)
  • A dead letter queue is a special Topic named %DLQ%consumerGroup@consumerGroup, that is, each consumer group has a dead letter queue
  • If a consumer group does not generate a dead-letter message, no dead-letter queue is created for it

3 Processing of dead-letter messages

In fact, when a message enters the dead-letter queue, it means that something is wrong in the system that prevents consumers from consuming the message, such as a Bug in the code. Therefore, dead-letter messages often require special processing by developers. The most critical step is to identify suspicious factors and resolve possible bugs in the code before the original dead-letter message is delivered and consumed again.