I’m Little Xia Lufei. Learning shapes our lives. Technology changes the world.

Table of Contents

  • Introduction
  • 1. Producer Partition Strategy
  • 2. Reliability of the Producer
    • 2.1 ISR Mechanism of the Producer
    • 2.2 ACK Mechanism of the Producer
  • 3. Data Consistency of the Producer
  • 4. Exactly-Once Producer

Introduction

In this section, we will focus on the Kafka Producer: how Kafka sends messages, and how it ensures the reliability of the message-sending process.

1. Producer Partition Strategy

When it comes to Kafka producers, the first thing that comes to mind is the Producer's partitioning strategy. There are two main reasons why Kafka partitions messages on the producer side:

  1. It improves the scalability of message queues in a cluster: a topic can be composed of multiple partitions spread across brokers.
  2. It increases concurrency, because reads and writes can be performed at the granularity of a single partition.

How does the Producer choose a partition when sending messages? All the data to be sent is encapsulated in a ProducerRecord object, and the partition is then determined according to the following rules:

  • If a partition is specified, that value is used directly as the partition number.
  • If no partition is specified but a key is present, the partition is obtained by taking the hash of the key modulo the number of partitions available under the topic.
  • If neither a partition nor a key is specified, a random integer is generated on the first call (and incremented on each subsequent call); this integer modulo the number of partitions available under the topic gives the partition number. This is the well-known round-robin algorithm.
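The three rules above can be sketched in a few lines of Python. This is a simplified illustration, not Kafka's actual implementation: the real Java client uses murmur2 hashing and, since 2.4, a sticky partitioner for keyless records.

```python
import itertools

class SimplePartitioner:
    """Toy model of the Producer partitioning rules described above."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        self._counter = itertools.count()  # incremented on each keyless send

    def partition(self, key=None, explicit_partition=None) -> int:
        # Rule 1: an explicitly specified partition wins.
        if explicit_partition is not None:
            return explicit_partition
        # Rule 2: with a key, take hash(key) mod the partition count.
        if key is not None:
            return hash(key) % self.num_partitions
        # Rule 3: no key, no partition -> round-robin over partitions.
        return next(self._counter) % self.num_partitions

p = SimplePartitioner(num_partitions=3)
assert p.partition(explicit_partition=1) == 1
assert p.partition(key="order-42") == p.partition(key="order-42")  # stable per key
print([p.partition() for _ in range(4)])  # round-robin over 3 partitions
```

Note that keyed messages always land on the same partition, which is what gives Kafka per-key ordering.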

2. Reliability of the Producer

The producer's sole responsibility is to send messages, but a sent message can still be lost. How does Kafka ensure that messages are not lost? To guarantee data reliability, Kafka takes two measures at the producer level. First, at the storage level, each partition has replicas, divided into a leader and followers; the producer sends messages only to the leader, which then replicates them to the followers. Second, at the sending level, Kafka considers a message delivered only after the leader replies with an ACK; otherwise the producer retries.

2.1 ISR Mechanism of the Producer

As shown in the figure below, the producer sends messages to the leader on the server side, and the leader then synchronizes the messages to all of its followers.



By default, Kafka returns an ACK to the producer only after all followers have synchronized the message. But consider this scenario: the leader has several followers, and one of them repeatedly fails to complete synchronization because of a fault. Must the leader then wait forever? How can this problem be solved?

In fact, the leader solves this problem with the In-Sync Replica set (ISR) mechanism. The ISR is the set of followers that are in sync with the leader. After a follower finishes fetching a message from the leader, it replies to the leader with an ACK. If a follower fails to catch up with the leader for too long, it is removed from the ISR; the time threshold is set by replica.lag.time.max.ms (default: 10s). When the leader fails, a new leader is elected from the ISR.
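The eviction rule can be modeled in a few lines. This is a toy sketch, assuming we only track the last time each follower was fully caught up; real brokers track fetch requests and log end offsets per replica.

```python
# Toy model of ISR maintenance based on replica.lag.time.max.ms (default 10s):
# a follower stays in the ISR only if it has fully caught up with the leader
# within the last `max_lag_ms` milliseconds.

REPLICA_LAG_TIME_MAX_MS = 10_000

def current_isr(followers: dict, now_ms: int,
                max_lag_ms: int = REPLICA_LAG_TIME_MAX_MS) -> set:
    """followers: replica_id -> last time (ms) the replica was caught up."""
    return {rid for rid, last_caught_up in followers.items()
            if now_ms - last_caught_up <= max_lag_ms}

followers = {"broker-2": 95_000, "broker-3": 82_000}
# At t=100s, broker-3 last caught up 18s ago -> evicted from the ISR.
print(current_isr(followers, now_ms=100_000))  # {'broker-2'}
```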

In older versions of Kafka, a follower could be kicked out of the ISR not only for slow response time, but also when the number of messages it lagged behind the leader exceeded a threshold (replica.lag.max.messages). The problem is that Kafka sends messages in batches. Suppose the batch size is 12 and the lag threshold is 10: each time the producer sends a batch to the leader, every follower momentarily lags by 12 messages and is kicked out of the ISR, only to catch up and be re-admitted shortly afterwards. This constant churn cost Kafka and ZooKeeper a great deal of performance, which is why the message-count criterion was removed.

2.2 ACK Mechanism of the Producer

Now that we know producers send messages with an acknowledgement mechanism, when exactly does acknowledgement happen in Kafka? Kafka controls this with the acks configuration, which has three values corresponding to different ACK mechanisms:

  1. acks=0: the producer does not wait for the broker's response. Latency is lowest, but data may be lost. This is suitable for high-throughput scenarios that can tolerate message loss.
  2. acks=1: the producer waits for the broker's response; the leader replies as soon as it has successfully written the message to its local log. Messages may be lost if the leader fails before it finishes synchronizing them to the followers.
  3. acks=-1: the leader does not respond until it and all of its followers have successfully written the message. This mechanism ensures data is not lost. However, if the leader crashes after all the followers have synchronized but before the ACK is sent, the producer considers the send failed and resends the data to the new leader. In this case, messages may be duplicated.
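The three settings map directly onto producer configuration. The sketch below uses kafka-python style config keys as an assumption; other clients spell the option names differently, but the acks values are the same.

```python
# Map a desired durability level to a (kafka-python style) producer config.
# acks=0  -> fire and forget, lowest latency, may lose messages
# acks=1  -> leader-only write, may lose messages on leader failover
# acks=-1 -> full in-sync ack ('all'), no loss, but possible duplicates on retry

def make_producer_config(durability: str) -> dict:
    if durability == "fastest":
        return {"acks": 0}
    if durability == "leader":
        return {"acks": 1}
    if durability == "safest":
        return {"acks": "all", "retries": 3}
    raise ValueError(f"unknown durability level: {durability}")

print(make_producer_config("safest"))  # {'acks': 'all', 'retries': 3}
```

The "safest" config accepts duplicate delivery in exchange for no loss, which is exactly the At-Least-Once trade-off discussed in section 4.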

3. Data Consistency of the Producer

How does Kafka ensure data consistency between its leader and follower? As shown in the figure below, two concepts are introduced here:

  • LEO: Log End Offset, the offset after the last message in each replica.
  • HW: High Watermark, the smallest LEO among all replicas in the ISR; consumers can only see messages up to the HW.

  1. When a follower fails, it is temporarily kicked out of the ISR. After recovering, the follower reads the HW recorded on its local disk, truncates the part of its log above the HW, and synchronizes from the leader. Once the follower catches up with the leader, it rejoins the ISR.
  2. When the leader fails, a new leader is elected from the ISR. All the remaining replicas then truncate the parts of their logs above the HW and resynchronize from the new leader.

Note that LEO and HW are mechanisms to ensure data consistency between replicas; they do not guarantee that data is neither lost nor duplicated.
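The two recovery rules above reduce to a minimum and a truncation. A minimal sketch, using a plain list as a stand-in for a partition log:

```python
# LEO/HW sketch: each replica's LEO is the offset after its last message;
# the HW is the minimum LEO across the ISR. On failover, replicas truncate
# their logs back to the HW before re-syncing, which restores consistency.

def high_watermark(leos: dict) -> int:
    """leos: replica_id -> log end offset, for every replica in the ISR."""
    return min(leos.values())

def truncate_to_hw(log: list, hw: int) -> list:
    """A recovering replica drops messages above the HW, then re-syncs."""
    return log[:hw]

leos = {"leader": 12, "follower-1": 10, "follower-2": 8}
hw = high_watermark(leos)
print(hw)                                   # 8
print(truncate_to_hw(list(range(12)), hw))  # offsets 0..7 survive
```

Because every surviving replica agrees on everything below the HW, truncating to it guarantees the replicas converge, even though the messages between HW and the old leader's LEO may be lost or re-sent.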

4. Exactly-Once Producer

Having covered the above, you should now have a reasonable understanding of the Kafka producer's reliability and data consistency. Here is a further question: can Kafka guarantee Exactly-Once delivery, and if so, how? In transactional businesses the bar is high: we must never lose transaction data, nor initiate duplicate transactions. That is Exactly-Once semantics. With acks=0, messages may be lost, which corresponds to At-Most-Once semantics. With acks=-1, messages are not lost but may be duplicated, which corresponds to At-Least-Once semantics. To guarantee Exactly-Once, idempotence must be ensured on top of At-Least-Once semantics.

Idempotence: performing the same operation once or many times yields the same result, with no extra side effects from the repeated requests.

But how does Kafka make messages idempotent? Kafka 0.11 added a feature that assigns each producer a unique PID and attaches a sequence number to every message sent to a partition. The broker caches the <PID, PartitionID, SeqNumber> triple and persists only one message for any given triple, so retried duplicates are discarded.

However, the PID changes when the producer restarts, and different partitions have different partition IDs. Therefore, Kafka's idempotence cannot guarantee Exactly-Once across sessions or across partitions.
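The broker-side deduplication can be sketched as follows. This is a simplified model, assuming the broker only remembers the highest sequence number per (PID, partition); real brokers keep a bounded snapshot of recent producer state.

```python
# Sketch of broker-side idempotence: the broker tracks the last persisted
# sequence number per (PID, partition) and silently drops duplicates,
# so a producer retry after a lost ACK does not create a second copy.

class IdempotentLog:
    def __init__(self):
        self.last_seq = {}   # (pid, partition) -> last persisted sequence number
        self.messages = []

    def append(self, pid: int, partition: int, seq: int, payload: str) -> bool:
        key = (pid, partition)
        if seq <= self.last_seq.get(key, -1):
            return False                      # duplicate: already persisted
        self.last_seq[key] = seq
        self.messages.append(payload)
        return True

log = IdempotentLog()
assert log.append(pid=1, partition=0, seq=0, payload="tx-100")
assert not log.append(pid=1, partition=0, seq=0, payload="tx-100")  # retry dropped
assert log.append(pid=1, partition=0, seq=1, payload="tx-101")
print(log.messages)  # ['tx-100', 'tx-101']
```

Notice why the cross-session limitation arises: if the producer restarts and receives a new PID, its retried message carries a different key and the broker has no way to recognize it as a duplicate.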
