As we know, a message travels from producer to consumer in three steps:

  • The producer sends the message to RabbitMQ
  • RabbitMQ delivers the message to the consumer
  • The consumer processes the message

Any of these three steps can lose a message. Losing a message is not the worst part; the worst part is not knowing that it was lost, so measures are needed to keep the system reliable. Reliable here does not mean that 100% of messages survive: disk damage or a fire in the machine room can still destroy data, although the probability is very small. A system in which 99.99999% of messages are not lost counts as reliable.

The following is a detailed analysis of the problems and solutions.

Reliable delivery on the producer side

The producer must make sure that messages are delivered to RabbitMQ correctly. Messages can be lost on the producer side for many reasons. For example, a network fault may occur during transmission, or RabbitMQ may go down just as the message arrives, and the message is lost without us knowing what happened. RabbitMQ itself provides some mechanisms for dealing with these situations.

Transaction messaging mechanism

There are three transaction-related methods in RabbitMQ: txSelect(), txCommit(), and txRollback(). After starting a transaction with txSelect, we can publish messages to the broker. If txCommit succeeds, the messages have reached the broker. If the broker crashes or an exception is thrown for some other reason before txCommit completes, we can catch the exception and roll back the transaction with txRollback.

try {
    channel.txSelect()
    channel.basicPublish(EXCHANGE_NAME, ROUTING_KEY, MessageProperties.PERSISTENT_TEXT_PLAIN, msg.toByteArray(StandardCharsets.UTF_8))
    channel.txCommit()
} catch (e: Exception) {
    channel.txRollback()
    // Resend the message; the channel is still transactional, so commit again
    channel.basicPublish(EXCHANGE_NAME, ROUTING_KEY, MessageProperties.PERSISTENT_TEXT_PLAIN, msg.toByteArray(StandardCharsets.UTF_8))
    channel.txCommit()
}

Transactions do solve the problem of message confirmation between producer and broker. If the message is successfully accepted by the broker, the transaction can commit successfully. Otherwise, we can catch the exception and roll back the transaction while resending the message.

In practice, transactions are rarely used because they severely degrade performance. So is there a better way for the producer to know that a message was sent correctly without incurring a significant performance penalty? The AMQP protocol itself offers nothing better, but RabbitMQ provides a better solution: putting the channel into confirm mode.

Confirm message confirmation mechanism

What is the confirm mechanism? As the name implies, RabbitMQ sends an acknowledgement back to the producer so that the producer knows the message has been received. If no acknowledgement arrives, the message may have been lost and needs to be resent.

Turn on the confirmation mode with this code:

channel.confirmSelect() // enable producer confirm mode

Then asynchronously listen for confirmed and unconfirmed messages:

channel.addConfirmListener(object : ConfirmListener {
    override fun handleAck(deliveryTag: Long, multiple: Boolean) {
        // RabbitMQ has received the message
    }

    override fun handleNack(deliveryTag: Long, multiple: Boolean) {
        // RabbitMQ sends a nack if it loses the message due to an internal error
    }
})

This lets the producer know whether a message was delivered to RabbitMQ. That is still not enough, though; the extreme case is discussed later.
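The deliveryTag and multiple arguments of the two callbacks can be turned into a resend list with a little bookkeeping. The following is a minimal sketch of one way to do that, not code from the RabbitMQ client: OutstandingConfirms and its method names are hypothetical, and it assumes the producer records each message under the channel's next publish sequence number (channel.nextPublishSeqNo) right before publishing.

```kotlin
import java.util.concurrent.ConcurrentSkipListMap

/**
 * Tracks published-but-unconfirmed messages by publish sequence number.
 * Hypothetical helper for illustration, not part of the RabbitMQ client.
 */
class OutstandingConfirms {
    private val pending = ConcurrentSkipListMap<Long, String>()

    /** Call just before basicPublish, with the value of channel.nextPublishSeqNo. */
    fun onPublish(seqNo: Long, body: String) {
        pending[seqNo] = body
    }

    /** Call from handleAck: RabbitMQ has received these messages. */
    fun onAck(deliveryTag: Long, multiple: Boolean) {
        take(deliveryTag, multiple)
    }

    /** Call from handleNack: returns the bodies the caller should resend. */
    fun onNack(deliveryTag: Long, multiple: Boolean): List<String> =
        take(deliveryTag, multiple)

    fun pendingCount(): Int = pending.size

    // multiple = true covers every sequence number up to and including deliveryTag.
    private fun take(deliveryTag: Long, multiple: Boolean): List<String> {
        if (!multiple) {
            return listOfNotNull(pending.remove(deliveryTag))
        }
        val covered = pending.headMap(deliveryTag, true)
        val bodies = covered.values.toList()
        covered.clear() // removes the covered entries from the backing map
        return bodies
    }
}
```

With this in place, handleAck would call onAck(deliveryTag, multiple) and handleNack would resend whatever onNack(deliveryTag, multiple) returns. Anything still pending after a reasonable wait is a candidate for resending as well.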

Message persistence

What about message persistence? When RabbitMQ receives a message, it holds it in memory at first, which is a problem: if RabbitMQ goes down, the data is gone after the restart. The data should therefore be persisted to disk so that it can be recovered even after RabbitMQ restarts. So how do we persist it?

A message arriving at RabbitMQ first goes to an exchange, is then routed to a queue, and is finally delivered to the consumer.

So you need to persist exchanges, queues, and messages.

Exchange persistence:

// The third argument, true, makes the exchange durable
channel.exchangeDeclare(EXCHANGE_NAME, EXCHANGE_TYPE, true)

Queue persistence:

// The second argument, true, makes the queue durable
channel.queueDeclare(QUEUE_NAME, true, false, false, null)

Message persistence:

// The third argument, MessageProperties.PERSISTENT_TEXT_PLAIN, marks the message as persistent
channel.basicPublish(EXCHANGE_NAME, ROUTING_KEY, MessageProperties.PERSISTENT_TEXT_PLAIN, msg.toByteArray(StandardCharsets.UTF_8))

This way, if RabbitMQ goes down after receiving a message, the message is recovered after the restart.

That covers the mechanisms RabbitMQ itself provides, but they are still not enough to guarantee reliable delivery to RabbitMQ. This is the extreme case mentioned earlier: RabbitMQ can go down after receiving a message but before persisting it to disk, and the message is lost.

So in addition to the mechanisms provided by RabbitMQ, we have to build our own message compensation mechanism to deal with extreme cases. One such solution is message storage.

Message storage

Message storage, as the name implies, means saving the message to be sent in a database.

Before a message is sent, it is stored in the database with a status field. status=0 means the producer has sent the message to RabbitMQ but has not yet received an acknowledgement; status=1 means RabbitMQ has confirmed that it received the message. A message can be stuck at status=0 in two situations: it never arrived safely in RabbitMQ, or RabbitMQ received it but the acknowledgement never reached the producer. The producer therefore runs a timer that periodically scans the message table and resends every message that still has status=0 after a fixed waiting time. The waiting time is needed because the timer might otherwise pick up a message that has only just been sent and whose acknowledgement is still on its way. In the second situation the resend produces a duplicate message, so the consumer must be idempotent. A resend can fail too, so set a maximum number of resends and handle messages that exceed it in some other way.
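The timer logic described above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory map stands in for the database table, the names StoredMessage, MessageStore, and sweep are hypothetical, and status=2 is an assumed extra value marking messages that exceeded the resend limit.

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical row of the message table; the field names are assumptions.
data class StoredMessage(
    val id: String,
    val body: String,
    var status: Int = 0,      // 0 = sent but unconfirmed, 1 = confirmed by RabbitMQ
    var retries: Int = 0,
    var lastSentAt: Instant
)

// In-memory stand-in for the database table.
class MessageStore(
    private val resendAfter: Duration,
    private val maxRetries: Int
) {
    private val rows = LinkedHashMap<String, StoredMessage>()

    /** Store the message before it is published. */
    fun save(id: String, body: String, now: Instant) {
        rows[id] = StoredMessage(id, body, lastSentAt = now)
    }

    /** Call when the confirmation for this message arrives. */
    fun markConfirmed(id: String) {
        rows[id]?.status = 1
    }

    /**
     * One timer tick. Returns the messages to resend and the messages that
     * used up their resends and need some other handling.
     */
    fun sweep(now: Instant): Pair<List<StoredMessage>, List<StoredMessage>> {
        // Only unconfirmed messages that have waited long enough are overdue.
        val overdue = rows.values.filter {
            it.status == 0 && Duration.between(it.lastSentAt, now) >= resendAfter
        }
        val (resend, giveUp) = overdue.partition { it.retries < maxRetries }
        resend.forEach { it.retries++; it.lastSentAt = now }
        giveUp.forEach { it.status = 2 } // assumed "failed" status, not in the scheme above
        return resend to giveUp
    }
}
```

A scheduled task would call sweep periodically, republish every message in the first list, and raise an alert or log the second list for manual handling.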

This allows messages to be reliably delivered to RabbitMQ, with the producer aware of the outcome.

Avoiding message loss on the consumer side

Now that messages can be reliably delivered from the producer to RabbitMQ, it is time to look at the consumer side and how to avoid losing messages there.

With automatic acknowledgement, messages can be lost in the following three cases:

  • A network fault disconnects the consumer from RabbitMQ before the consumer receives the message RabbitMQ sent, and the message is lost.
  • RabbitMQ sends the message, but the consumer goes down before receiving it, and the message is lost.
  • The consumer receives the message correctly, but an exception or crash occurs while processing it, and the message is lost as well.

In all of these cases the message is lost because of RabbitMQ's automatic ack mechanism: in this mode RabbitMQ deletes a message as soon as it has been sent, regardless of whether the consumer received or processed it.

Therefore, you need to switch from automatic ack to manual ack.

Manual acknowledgement on the consumer side:

channel.basicConsume(QUEUE_NAME, false, DeliverCallback { consumerTag, delivery ->
    try {
        // Process the message

        // Manually acknowledge it
        channel.basicAck(delivery.envelope.deliveryTag, false)
    } catch (e: java.lang.Exception) {
        // Error handling: the message can be requeued for redelivery or simply discarded
    }
}, CancelCallback {})

With the autoAck parameter set to false, the RabbitMQ server in effect splits the queue into two parts: messages waiting to be delivered to a consumer, and messages that have been delivered but whose acknowledgement has not yet arrived. If RabbitMQ has not received the acknowledgement and the consumer disconnects or goes down (RabbitMQ detects this), RabbitMQ puts the message back in the queue (at the head of the queue) for delivery to the next consumer, which may well be the same one. For this reason too, the consumer needs to be idempotent.
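Because both producer resends and broker redeliveries can hand the consumer the same message twice, the consumer needs some way to recognize duplicates. The sketch below shows one simple approach. It assumes every message carries a unique ID set by the producer (for example, the key of the row in the message table); IdempotentHandler is a hypothetical name, and the in-memory set stands in for a store that would have to survive restarts in a real system, such as a database unique key.

```kotlin
import java.util.concurrent.ConcurrentHashMap

/**
 * Runs the handler at most once per message ID as long as it succeeds.
 * Hypothetical helper; the in-memory set is an assumption made for this sketch.
 */
class IdempotentHandler(private val handler: (String) -> Unit) {
    private val processed = ConcurrentHashMap.newKeySet<String>()

    /** Returns true if the handler ran, false if the message was a duplicate. */
    fun handle(messageId: String, body: String): Boolean {
        // add() returns false when the ID is already in the set
        if (!processed.add(messageId)) return false
        try {
            handler(body)
            return true
        } catch (e: Exception) {
            // Forget the ID so that a later redelivery can try again
            processed.remove(messageId)
            throw e
        }
    }
}
```

In the delivery callback, the consumer would call handle with the message ID and body, then acknowledge the message whether handle returned true or false, since a duplicate has already been processed.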

With that, the full path from the producer through RabbitMQ to the consumer is protected against message loss.