Data reliability

As commercial-grade messaging middleware, Kafka must provide strong guarantees of message reliability. This article introduces data reliability from three perspectives: the Producer sending messages to Brokers, topic partition replicas, and Leader election.

Topic partition replicas

Before version 0.8.0, Kafka had no replicas. At that time, Kafka was only used to store unimportant data, because without replicas data could be lost. However, as business requirements grew, so did the need for stronger guarantees, so to ensure data reliability Kafka introduced partition replicas in version 0.8.0 (see KAFKA-50 for details). That is, each partition can be configured with several replicas: the replication factor can be specified when creating the topic (replication-factor) or configured at the Broker level (default.replication.factor), and it is usually set to 3.

Kafka ensures that events within a single partition are ordered and that a partition is either online (available) or offline (unavailable). Among a partition's replicas, one is the Leader and the rest are followers. All read and write operations are performed by the Leader, and the followers regularly copy data from the Leader. When the Leader dies, one of the followers becomes the new Leader. Partition replicas thus introduce data redundancy, which is what provides data reliability in Kafka.
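The leader/follower flow above can be sketched in a few lines of pure Python. This is a hypothetical model for illustration only, not Kafka's actual implementation: writes go to the Leader, followers copy missing messages from it, and a follower takes over when the Leader fails.

```python
# Illustrative model of a partition with one leader and two followers.
# All names (Replica, Partition, fail_over) are hypothetical, not Kafka APIs.

class Replica:
    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []  # messages stored on this replica

class Partition:
    def __init__(self, replication_factor=3):
        replicas = [Replica(i) for i in range(replication_factor)]
        self.leader = replicas[0]
        self.followers = replicas[1:]

    def write(self, message):
        # All writes are handled by the Leader.
        self.leader.log.append(message)

    def replicate(self):
        # Followers periodically copy any messages they are missing.
        for f in self.followers:
            f.log.extend(self.leader.log[len(f.log):])

    def fail_over(self):
        # When the Leader dies, one follower becomes the new Leader.
        self.leader = self.followers.pop(0)

p = Partition()
p.write("m1")
p.write("m2")
p.replicate()   # followers now hold m1 and m2 as well
p.fail_over()   # the new Leader already has both messages
```

Because the followers replicated before the failover, no data is lost when the new Leader takes over; that redundancy is the whole point of the multi-replica design.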

Kafka’s partitioned multi-replica architecture is the core of Kafka’s reliability guarantees. Writing messages to multiple replicas ensures that Kafka can preserve messages even in the event of a crash.

Producer sends messages to the Broker

If we want to send messages to a Kafka topic, we need to do so using a Producer. A Kafka topic corresponds to multiple partitions, and each partition corresponds to multiple replicas. To let users control data reliability, Kafka provides a message acknowledgement mechanism in the Producer: we can configure how many replicas of the corresponding partition must receive a message before the send is considered successful. This is specified by the acks parameter when creating the Producer (prior to version 0.8.2.x it was set by the request.required.acks parameter).

This parameter supports the following three values:

acks=0: means that if the producer can send the message over the network, the message is considered to have been successfully written to Kafka. Errors can still occur in this case, such as sending an object that cannot be serialized or a network card failure, but if the partition is offline or the entire cluster is unavailable for a long time, no error will be received. Running in acks=0 mode is very fast (which is why so many benchmarks are based on this mode), and you can get impressive throughput and bandwidth utilization, but you will certainly lose some messages in this mode.

acks=1: means that once the Leader receives a message and writes it to the partition data file (not necessarily to disk), it returns an acknowledgement or error response. In this mode, if a normal Leader election takes place, the producer receives a LeaderNotAvailableException; if the producer handles this error properly, it retries sending, and the message eventually arrives safely at the new Leader. However, it is still possible to lose data in this mode, for example if a message has been successfully written to the Leader but the Leader crashes before the message is copied to a follower replica.

acks=all (this has the same meaning as request.required.acks=-1): means that the Leader waits for all in-sync replicas to receive the message before returning an acknowledgement or error response. Combined with the min.insync.replicas parameter, this determines the minimum number of replicas that must receive the message before an acknowledgement is returned. The producer retries until the message is successfully committed. However, this is also the slowest mode, because the producer must wait for all in-sync replicas to receive the current message before it can continue sending other messages.

Configure acks according to the actual application scenario to achieve the required level of data reliability.
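As a concrete illustration of the settings discussed above, the sketch below collects them into plain Python dictionaries. The property names follow the Kafka Java client's configuration keys; the dictionaries themselves are illustrative only and are not passed to any real client here. Note that min.insync.replicas is a broker/topic-level setting, shown alongside the producer settings for completeness.

```python
# Illustrative reliability-oriented configuration (not executed against a
# real cluster). Property names follow the Kafka Java client's config keys.

producer_config = {
    "acks": "all",  # wait for all in-sync replicas before acknowledging
    "retries": 3,   # retry transient failures such as leader elections
}

topic_config = {
    # With acks=all, at least this many in-sync replicas must receive a
    # write; otherwise the broker rejects it with a NotEnoughReplicas error.
    "min.insync.replicas": 2,
}
```

With replication factor 3, acks=all, and min.insync.replicas=2, a write succeeds as long as at least two replicas are in sync, which tolerates one broker failure without losing acknowledged messages.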

In addition, a Producer can send messages in synchronous mode (the default, producer.type=sync) or asynchronous mode (producer.type=async). Setting it to asynchronous greatly improves sending performance but increases the risk of data loss. To ensure message reliability, producer.type must be set to sync.

Leader election

Before introducing Leader election, let’s take a look at the In-Sync Replicas (ISR) list. The Leader of each partition maintains an ISR list, which contains the Broker numbers of the follower replicas. Only followers that keep up with the Leader can be added to the ISR; this is controlled by the replica.lag.time.max.ms parameter. Only members of the ISR can be elected as the Leader.
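The ISR membership rule can be sketched as a small pure function. This is a hypothetical illustration of the idea, not Kafka's implementation: a follower stays in the ISR only if it has fully caught up with the Leader within the last replica.lag.time.max.ms milliseconds.

```python
# Hypothetical sketch of ISR membership (names and structure illustrative).
# A follower is in sync if it caught up within max_lag_ms; the Leader is
# always a member of its own ISR.

def in_sync_replicas(leader_id, last_caught_up_ms, now_ms, max_lag_ms=10_000):
    """last_caught_up_ms maps broker id -> last time (ms) it fully caught up.

    max_lag_ms plays the role of replica.lag.time.max.ms (illustrative value;
    check your broker's actual setting).
    """
    isr = [leader_id]
    for broker_id, ts in last_caught_up_ms.items():
        if broker_id != leader_id and now_ms - ts <= max_lag_ms:
            isr.append(broker_id)
    return sorted(isr)

# Broker 2 caught up 6 s ago and stays in the ISR; broker 3 has lagged
# for 15 s and drops out, so it is not eligible for Leader election.
isr = in_sync_replicas(1, {2: 99_000, 3: 90_000}, now_ms=105_000)
```

Since only ISR members can be elected Leader, a replica that falls too far behind (broker 3 here) cannot become the Leader and cause acknowledged messages to disappear.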

Data consistency

The data consistency discussed here means that consumers read the same data whether they read from the old Leader or from a newly elected Leader. So how does Kafka achieve this?

Assume that the partition has 3 replicas, where replica 0 is the Leader and replicas 1 and 2 are followers, all in the ISR list. Message4 has already been written to replica 0, but the Consumer can only read up to Message2, because all replicas in the ISR have only synchronized up to Message2. Only messages below the High Water Mark can be read by the Consumer, and the High Water Mark is determined by the replica with the smallest offset in the ISR list, replica 2 in this example. This is very similar to the barrel principle: the shortest stave determines how much water the barrel can hold.

The reason for this is that messages that have not been replicated to enough replicas are considered “unsafe” and are likely to be lost if the Leader crashes and another replica becomes the new Leader. If we allowed consumers to read these messages, we might break consistency. Imagine that a consumer reads and processes Message4 from the current Leader (replica 0); the Leader then crashes and replica 1 is elected as the new Leader. Another consumer now tries to read that message from the new Leader and finds that it does not exist. This leads to data inconsistency.
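The High Water Mark rule described above boils down to taking the minimum offset across the ISR. Here is a minimal sketch of that rule, purely illustrative and not Kafka's implementation:

```python
# Minimal sketch of the High Water Mark (HWM) rule: consumers may only read
# messages below the smallest log end offset among the in-sync replicas.
# Function names and data layout are hypothetical.

def high_water_mark(isr_log_end_offsets):
    # Bounded by the slowest in-sync replica (the "barrel principle").
    return min(isr_log_end_offsets.values())

def readable_messages(log, isr_log_end_offsets):
    # Messages at offsets >= HWM are not yet safely replicated.
    return log[:high_water_mark(isr_log_end_offsets)]

# Replica 0 (Leader) holds 4 messages, replica 1 has replicated 3, and
# replica 2 only 2, so consumers can read just the first two messages.
log = ["Message1", "Message2", "Message3", "Message4"]
offsets = {0: 4, 1: 3, 2: 2}
visible = readable_messages(log, offsets)
```

This matches the example in the text: even though Message4 is on the Leader, consumers only see up to Message2 until the slowest ISR member catches up.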

Of course, the High Water Mark mechanism also means that if message replication between brokers is slow for some reason, messages take longer to reach consumers (since we must wait for replication to complete first). This delay is bounded by the replica.lag.time.max.ms parameter, which specifies the maximum lag a replica is allowed when copying messages before it is removed from the ISR.