I’m participating in nuggets Creators Camp # 4, click here to learn more and learn together!

I wanted to write the RocketMQ core foundation, which involves a lot of concepts and code;

Because this period of time is looking for a house to find a job in advance, really can not spare time;

Skipped the second and started the third;

This article is mainly about the introduction of MQ will appear after the problems and solutions; In fact, every question in this can be taken out and discussed in detail

In view of the time problem, I first describe the problem and solution ideas;

RocKetMQ Primer (1)- Juejin Nuggets

What are the problems with introducing MQ?

The introduction of MQ reduces the coupling between subsystems, and the asynchronous processing mechanism reduces the response time of the system. Meanwhile, it can effectively deal with the problem of peak request and improve the stability of the system.

However, the introduction of MQ also brings some problems.

1.1 Duplicate messages

Repeat consumption is a common problem with MQ, no matter which MQ you use.

What are the scenarios where duplicate messages occur?

  • Message producers produce duplicate messages
  • Message consumer confirmation failed
  • Message consumer confirmation timed out
  • The service system initiates a retry

If repeated messages are not properly processed, services will be greatly affected, resulting in repeated data or data anomalies. For example, the member system has opened more than one month.

1.2 Data consistency

Many times, data consistency issues arise if mq’s consumer business processes exceptions. For example, a complete business process is to give 100 credits after placing a successful order. An order is written to the repository, but the message consumer fails to send points, resulting in data inconsistencies where part of the business process is written to the repository and the other part is not.

Case 1:

Situation 2:

If an order and a credit are placed in the same transaction and either succeed or fail at the same time, there will be no data consistency problem.

However, because of cross-system calls, for the sake of performance, strong consistency is generally not used, but to achieve final consistency.

1.3 Message order problem

Some business data is stateful, such as order: placed, paid, completed, returned, etc. If order data is the body of the message, order issues are involved. It is fine for a consumer to receive two messages for the same order, the first with the status of order and the second with the status of pay. But if the state of the first message is pay, and the state of the second message is order then there’s a problem, you pay before you order, right?There is no guarantee of order if the consumer consumes the messages using multiple threads.

1.4 Message Accumulation

The entire MQ mechanism works best when message consumers can read messages at a pace that keeps up with message producers. Many times, however, messages are consumed at a slower rate than they are produced due to some batch process or other reason. This leads directly to message stacking problems that affect business functions.

For example, if messages are piled up, users can only become members after placing orders for a long time, which will definitely cause a large number of complaints from users.

Second, how to solve these problems?

2.1 Duplicate messages

We can solve this problem among consumers, whether it is because of duplicate messages generated by producers or by consumers, which requires idempotent design in business processing by consumers.

Here we explain the use of consuming message tables to solve such problems with MQ. In the consumption message table, the messageId is used as the unique index. Before processing the business logic, the user queries whether the message has been processed according to the messageId. If the message has been processed, the user returns success directly.

2.2 Data consistency

Case 1:

For the first case, it belongs to the situation that the data is inconsistent due to the abnormal occurrence of message consumers when consuming messages.

After a message consumption failure, the message will be automatically retried, so we just need to keep the interface idempotent. It retries the consumer at 1S,5S,10S,30S,1M,2M····2H, with a maximum of 16 retries. RocketMQ delivers the message to a dead-letter queue (dead-letter queue topic: %DLQ%+consumerGroup), then we need to look at the dead-letter queue and do manual compensation for the dead-letter message business.

Situation 2:

For the second case, a message was sent to MQ in the order system, but the local transaction was abnormal, so the order was not created, but the message was sent to MQ, and the integral was increased eventually, resulting in inconsistent data.

In this case, we can use RocketMQ transaction messages to solve the problem.

2.3 Sequential Messages

RocketMQ supports sequential messaging by default, but requires some setup at the producer and consumer level.

To increase the concurrency of sending and consuming, RocketMQ has four write queues and four read queues for each Topic.

2.4 Message Accumulation

If consumers consume messages at a slower rate than producers produce them, message accumulation can occur.

First of all, we need to locate the problem of message backlog. If the consumption situation is normal before, the message backlog is caused by internal abnormalities of consumer consumption. When we solve the consumption anomaly, the consumer speed should be greater than or equal to the producer speed, but due to the backlog of too many messages, new messages cannot be consumed in time. At this time, we need to judge the importance of the message. If the message is not very important, we can delete the previous message. If it is important and cannot be deleted, we need to increase the consumer cluster to increase the speed of consumption.

The following situations may occur:

1. It’s true that consumers are slow

  • Increase the number of consumer clusters

2. There is a problem with the internal logic of consumers, resulting in slower consumption than usual

  • Locate the internal problem of the consumer, the problem is solved, and return to normal speed

    At this point, the normal speed is restored. The current messages in the queue are all important, not important

3. The production speed is faster than usual due to the rush of business

  • Take the current limit
  • Increase the number of consumer clusters