I have used Kafka in previous projects, and in a recent internship interview I was asked how Kafka guarantees message ordering, so this article summarizes three frequently asked Kafka questions: reliability, idempotence, and ordering.

Reliability

For a message queue, failing to guarantee message reliability can cause serious production accidents. Imagine paying for groceries with your phone: the payment message is written to Kafka, but for some reason it is lost. The merchant never receives the money while your account already shows a charge. That is absolutely unacceptable, which is why message-queue reliability is critical.

Reliability must be guaranteed at three points: the producer, the Broker, and the consumer. Let's start with a brief introduction to Kafka's model.

Kafka is a message queue based on the publish-subscribe model. It consists of three parts: producers, Brokers, and consumers. In Kafka's model, producers produce messages and send them to a Broker for storage, and consumers pull the messages of the topics they subscribe to from the Broker. As shown below:


From the figure above we can see that messages can be lost in three places in Kafka: the producer, the Broker, and the consumer. Let's look at each case.

Ensure producer message reliability

Producers and Brokers communicate over a network, so network failures must be tolerated on the producer side. With acks=all, the Broker sends an acknowledgement to the producer only after the message has been received and synchronized to all in-sync replicas. If the producer receives no acknowledgement, it resends the message to the Broker, which ensures message reliability between the producer and the Broker.
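The retry loop can be sketched as follows. `FlakyBroker` and `send_with_retries` are illustrative stand-ins, not Kafka APIs; a real client performs this retry internally when configured with acks=all and a retry count.

```python
def send_with_retries(send_fn, message, retries=3):
    """Keep re-sending until the broker acknowledges, or retries run out."""
    for attempt in range(retries + 1):
        if send_fn(message):          # True means an acknowledgement arrived
            return attempt + 1        # number of sends it took
    raise RuntimeError("message lost: no acknowledgement after retries")

class FlakyBroker:
    """Toy broker that drops the first `failures` sends, then acknowledges."""
    def __init__(self, failures):
        self.failures = failures
        self.log = []                 # messages that were actually persisted

    def send(self, message):
        if self.failures > 0:
            self.failures -= 1        # simulated network/broker failure
            return False              # no ack: producer must retry
        self.log.append(message)
        return True

broker = FlakyBroker(failures=2)
attempts = send_with_retries(broker.send, "payment-123", retries=3)
```

In kafka-python, for instance, this corresponds roughly to constructing the producer with `KafkaProducer(acks='all', retries=5)`.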

Ensure Broker message reliability

Even after a Broker receives a message from a producer, the message can still be lost; the most common case is that the message reaches the Broker and then the server goes down. A Kafka cluster consists of multiple Broker nodes, and to increase throughput a topic is usually split into multiple partitions distributed across different Brokers. If a partition is lost, part of the topic's data is lost, so each partition usually needs multiple replicas to ensure high availability.


As shown in the figure above, a topic is split into three partitions, and each partition is replicated, so if Broker1 fails at this point there are still complete copies of all three partitions on Broker2 and Broker3. In that case, only a new partition leader needs to be elected. Note that the leader must synchronize the data to its followers before acknowledging the message to the producer; otherwise messages can be lost.
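A toy model makes the point concrete: if the leader acknowledges only after every replica has the message, then losing the leader's Broker loses no data, because any surviving replica can take over. The class names below are illustrative, not Kafka APIs.

```python
class Replica:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []                 # this replica's copy of the partition

class Partition:
    def __init__(self, broker_ids):
        self.replicas = [Replica(b) for b in broker_ids]
        self.leader = self.replicas[0]

    def append(self, message):
        # Leader writes, then syncs to every follower BEFORE acking
        # (the acks=all behavior described above).
        for replica in self.replicas:
            replica.log.append(message)
        return True                   # acknowledgement to the producer

    def fail_broker(self, broker_id):
        # Drop the failed replica and re-elect: because every message was
        # fully synced before acking, any surviving replica is up to date.
        self.replicas = [r for r in self.replicas if r.broker_id != broker_id]
        self.leader = self.replicas[0]

p = Partition(["Broker1", "Broker2", "Broker3"])
p.append("order-42")
p.fail_broker("Broker1")              # the leader's Broker goes down
```

If `append` had acknowledged before syncing the followers, the message would exist only on Broker1 and would vanish with it.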

Ensure consumer message reliability

The most likely place to lose a message on the consumer side is between pulling and processing: if the offset is committed automatically and the consumer crashes before it finishes processing, that message is lost. The solution is to turn off automatic offset commits and commit the offset manually only after the message has actually been processed successfully.
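The effect of commit-after-processing can be sketched with an in-memory stand-in for the consumer group's committed offset. The names here are illustrative; the point is that when the consumer "crashes" mid-message, the committed offset still points at that message, so a restarted consumer re-reads it instead of silently skipping it.

```python
def consume(messages, start_offset, process, commit):
    """Process messages in order, committing the offset only after success."""
    offset = start_offset
    for msg in messages[start_offset:]:
        process(msg)                  # may raise (the consumer "crashes")
        offset += 1
        commit(offset)                # manual commit, only after success

committed = {"offset": 0}             # stand-in for the broker-side offset store
processed = []

def process(msg):
    if msg == "bad":
        raise RuntimeError("consumer crashed mid-processing")
    processed.append(msg)

try:
    consume(["a", "bad", "c"], committed["offset"], process,
            lambda o: committed.update(offset=o))
except RuntimeError:
    pass                              # simulated crash

# On restart, consumption resumes from committed["offset"] == 1,
# so "bad" is re-delivered rather than lost.
```

In kafka-python this corresponds roughly to creating the consumer with `enable_auto_commit=False` and calling `consumer.commit()` after each message is handled.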

Idempotence

Ensuring idempotence essentially means ensuring that a message is not applied more than once, since Kafka may deliver the same message repeatedly. How to guarantee it depends on the specific business. For example, when inserting order information into MySQL, you can deduplicate by first querying whether the order ID already exists in the database. If the operation is setting a key in Redis, the SET itself is naturally idempotent, so no extra handling is needed in that case. In short, idempotence should be analyzed case by case according to business requirements.
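The order-ID deduplication idea can be sketched like this. The in-memory set stands in for a MySQL unique-key lookup or a Redis SETNX; the handler name and payload are made up for illustration.

```python
processed_ids = set()                 # stand-in for a DB unique index / Redis key
applied = []                          # the real side effects that were executed

def handle_order(order_id, payload):
    """Apply an order message at most once, keyed by order ID."""
    if order_id in processed_ids:
        return False                  # duplicate delivery: skip the side effect
    processed_ids.add(order_id)
    applied.append(payload)           # e.g. the actual MySQL insert
    return True

first = handle_order("o-1", "pay 9.99")
second = handle_order("o-1", "pay 9.99")   # Kafka redelivered the same message
```

In production the check-and-mark must be atomic (a unique-key insert or `SETNX`), otherwise two consumers racing on the same redelivered message could both pass the check.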

Ordering

In some scenarios a message queue must be strict about consumption order. Imagine you see your boyfriend posting pictures of other girls on WeChat Moments, so you comment below, "Cheating man who plays with women's feelings," and then delete him on WeChat. If the two messages are consumed in reverse order, the friend is deleted first, and the comment can no longer be posted.

When discussing reliability we mentioned that a Kafka topic is composed of multiple partitions, so the most extreme way to guarantee ordering is to give each topic only one partition. The problem is that a single-partition topic loses almost all of its throughput, which defeats the purpose Kafka was designed for.


A better method is recommended. Since order cannot be guaranteed across partitions, the key is to make sure related messages land in the same partition. Besides the extreme solution above, you can specify a partition explicitly when sending a message, or specify a key: messages with the same key are always sent to the same partition. The key can be something like a userID. In the scenario above, the girlfriend's userID is attached to both the comment operation and the delete-friend operation, so the two messages go to the same partition and are consumed in order.
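Key-based routing works because the partition is a deterministic function of the key's hash. A minimal sketch, using CRC32 as a stand-in for the murmur2 hash that Kafka's default partitioner actually uses; the key string is made up for illustration:

```python
import zlib

NUM_PARTITIONS = 3

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Any stable hash gives the property we need: same key -> same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

comment_partition = partition_for("userID-1001")    # the comment message
unfriend_partition = partition_for("userID-1001")   # the delete-friend message
```

Because both messages map to the same partition, they are appended to one partition log and consumed in the order they were produced.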