Original: Monkey World (WeChat official account ID: Cxytiandi). Feel free to share; please credit the source.

A reader told me that at a recent job interview, the interviewer asked him how he would handle an MQ failure. He said he was caught off guard, because he had never thought about this in his daily work, so he answered: if it goes down, it throws errors, you restart it right away, what else is there to handle?

The interviewer was not asking this just to be difficult. MQ can fail, and that is normal. The probability is simply very low, since MQ is usually deployed in cluster mode.

If you are chatting about this with a friend or colleague, you can answer any way you like. In an interview, you need to think carefully about your response. Don't be too casual, or the result may not be good.

Step 1: Encapsulate MQ operations in one place

If MQ fails, your message-sending logic is affected. One way to think about it: when a database dies, nothing works, but MQ is different. MQ is designed for decoupling multiple systems, asynchronous processing, and so on, so the main flow should not be affected even if MQ fails. That makes this a degradation problem, nothing special.

Degradation must be handled in one place. It is unrealistic to add failure handling at every place that sends a message, and nobody sensible would do that. So the first step is to encapsulate MQ operations and implement the degradation logic once inside that wrapper, so that callers do not need to care about it.
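To make the idea concrete, here is a minimal sketch of such a wrapper. Everything in it is an assumption for illustration: `DegradingProducer`, `send_fn`, and `fallback_fn` are hypothetical names, and `send_fn` stands in for whatever call your real MQ client exposes.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, bytes]  # (topic, payload)


class DegradingProducer:
    """Single entry point for sending messages, with fallback on failure.

    send_fn and fallback_fn are hypothetical hooks: the first wraps the real
    MQ client call, the second stores a message that could not be sent.
    """

    def __init__(self,
                 send_fn: Callable[[str, bytes], None],
                 fallback_fn: Callable[[str, bytes], None]) -> None:
        self._send_fn = send_fn
        self._fallback_fn = fallback_fn

    def send(self, topic: str, payload: bytes) -> bool:
        """Return True if MQ accepted the message, False if it was degraded."""
        try:
            self._send_fn(topic, payload)
            return True
        except Exception:
            # Assumption: any exception means the send failed. Real code would
            # catch the specific exceptions of the MQ client in use.
            self._fallback_fn(topic, payload)
            return False


# Demo with a fake broker that is always down.
def broken_send(topic: str, payload: bytes) -> None:
    raise ConnectionError("broker unreachable")


saved: List[Message] = []
producer = DegradingProducer(broken_send, lambda t, p: saved.append((t, p)))
delivered = producer.send("order.created", b'{"id": 1}')
```

Callers only ever see `producer.send(...)`. Where the message ends up when MQ is down is decided in one place and can be changed without touching business code.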

Step 2: Degrade by storing the data

Degradation can be done either by storing the unsent message in the database or by writing it directly to the local disk.

Write to the database

Writing to the database is relatively simple. The application already uses a database, so you only need to add a separate table. When sending a message fails, store the message in that table.
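As a rough sketch of that separate table, the example below uses Python's built-in `sqlite3` with an in-memory database so it runs anywhere. The table name `failed_message`, its columns, and the helper names are assumptions made for illustration; in practice you would use your application's own database and schema conventions.

```python
import sqlite3
import time


def init_store(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds messages MQ did not accept."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS failed_message (
               id         INTEGER PRIMARY KEY AUTOINCREMENT,
               topic      TEXT    NOT NULL,
               payload    BLOB    NOT NULL,
               created_at REAL    NOT NULL,
               sent       INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.commit()


def save_failed(conn: sqlite3.Connection, topic: str, payload: bytes) -> int:
    """Store one unsent message and return its row id."""
    cur = conn.execute(
        "INSERT INTO failed_message (topic, payload, created_at) VALUES (?, ?, ?)",
        (topic, payload, time.time()),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory database, for the demo only
init_store(conn)
row_id = save_failed(conn, "order.created", b'{"id": 1}')
pending = conn.execute(
    "SELECT topic, payload FROM failed_message WHERE sent = 0 ORDER BY id"
).fetchall()
```

The `sent` flag is one possible way to let a later resend job tell pending rows from finished ones; deleting rows after a successful resend would work as well.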

Write to disk

Writing to disk serves the same purpose as the database, with the advantage that it does not depend on the database. The downside is that you have to think about the file format, for example whether to spread messages across multiple files. Overall, there is more to consider than with a database.
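One simple format choice, sketched below, is a single append-only file with one JSON record per line and the payload Base64-encoded. This is only one option under stated assumptions: the file name `failed-messages.jsonl` and the helpers `append_failed` and `read_failed` are hypothetical, and the sketch ignores file rotation and concurrent writers.

```python
import base64
import json
import os
import tempfile
from typing import List, Tuple


def append_failed(path: str, topic: str, payload: bytes) -> None:
    """Append one unsent message as a single JSON line."""
    record = {
        "topic": topic,
        # Base64 keeps arbitrary bytes safe inside a text file.
        "payload": base64.b64encode(payload).decode("ascii"),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk


def read_failed(path: str) -> List[Tuple[str, bytes]]:
    """Read back every stored message in the order it was written."""
    messages = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                record = json.loads(line)
                payload = base64.b64decode(record["payload"])
                messages.append((record["topic"], payload))
    return messages


spool_dir = tempfile.mkdtemp()  # temporary directory, for the demo only
spool_file = os.path.join(spool_dir, "failed-messages.jsonl")
append_failed(spool_file, "order.created", b'{"id": 1}')
append_failed(spool_file, "order.paid", b'{"id": 1}')
restored = read_failed(spool_file)
```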

Write to the log

Logging is probably the easiest way, but human intervention is needed to retrieve failed messages and resend them later.
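If you go the logging route, it helps to write failed messages through a dedicated logger in a machine-readable form, so that whoever retrieves them later can filter and parse them. A minimal sketch with Python's standard `logging` module follows; the logger name `mq.failed` and the JSON layout are assumptions, and the in-memory buffer stands in for a real log file.

```python
import io
import json
import logging

# Dedicated logger so failed messages are easy to separate from other logs.
failed_log = logging.getLogger("mq.failed")
failed_log.setLevel(logging.ERROR)
failed_log.propagate = False  # keep these records out of the root logger

buffer = io.StringIO()  # stands in for a log file in this demo
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
failed_log.addHandler(handler)


def log_failed(topic: str, payload_text: str, error: Exception) -> None:
    """Log one unsent message as a JSON object on a single line."""
    failed_log.error(json.dumps({
        "topic": topic,
        "payload": payload_text,
        "error": str(error),
    }))


log_failed("order.created", '{"id": 1}', ConnectionError("broker unreachable"))
line = buffer.getvalue().strip()
```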

Step 3: Resend the message

A separate scheduled task can be set up to periodically resend the stored messages that failed. If your MQ service goes down and recovers a few minutes later, the retries will then succeed.
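A single pass of such a resend job might look like the sketch below. It is deliberately simplified: `resend_pending` is a hypothetical helper, the pending messages live in a plain dict rather than a table or file, and the scheduling itself (cron, a timer thread, your framework's scheduler) is left out. The job stops at the first failure, on the assumption that MQ is still down, and tries again on the next run.

```python
from typing import Callable, Dict, List, Tuple


def resend_pending(pending: Dict[int, Tuple[str, bytes]],
                   send_fn: Callable[[str, bytes], None]) -> int:
    """Try to resend stored messages in id order; return how many were sent."""
    sent = 0
    for msg_id in sorted(pending):  # sorted() copies the keys, so deleting is safe
        topic, payload = pending[msg_id]
        try:
            send_fn(topic, payload)
        except Exception:
            break  # assume MQ is still down; leave the rest for the next run
        del pending[msg_id]  # remove only after a successful send
        sent += 1
    return sent


# Demo: a fake broker that is down on the first run and back on the second.
state = {"up": False}
delivered: List[Tuple[str, bytes]] = []


def fake_send(topic: str, payload: bytes) -> None:
    if not state["up"]:
        raise ConnectionError("broker unreachable")
    delivered.append((topic, payload))


pending = {1: ("order.created", b"a"), 2: ("order.paid", b"b")}
first_run = resend_pending(pending, fake_send)   # MQ still down
state["up"] = True
second_run = resend_pending(pending, fake_send)  # MQ has recovered
```

Because a message is removed only after the send succeeds, a crash between the two steps can cause the same message to be sent twice, so consumers should tolerate duplicates.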

Resending can also be handled manually. The most important point is that when MQ fails and messages cannot be sent, those messages must be stored so that none are lost.

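The complete process is as follows: the application sends messages through the unified wrapper; if a send fails, the message is stored (database, disk, or log); a scheduled task, or a person, resends the stored messages once MQ has recovered.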

Conclusion

This article is only meant to give you some ideas. In fact, for any middleware dependency you need to think about failure cases and how to fall back. Of course, monitoring matters most: detect the failure promptly and repair it quickly. Then add fallback logic that resends the messages, so that the business data stays complete.

The thinking matters more than the code. If this was useful to you, please forward it!

About the author: Yin Jihuan, a simple technology lover, author of Spring Cloud Microservices: Full-Stack Technology and Case Analysis and Spring Cloud Microservices: Introduction, Practice, and Advanced Topics, and founder of the WeChat official account Monkey World.