Rebalance Kafka Consumer Rebalance Rebalance Kafka Consumer Rebalance

This article gives you a brief overview of Kafka Consumer Rebalance and the new version of its optimization strategy

Consumer Groups from previous versions of Kafka

Consumer Group

As shown in the figure above, the Consumer marks itself with the Consumer Group name, and each record published to the topic is passed to a Consumer instance in each subscriber Consumer Group. The Consumer instance can be in a separate process or on a separate machine.

If all Consumer instances belong to the same Consumer Group, these Consumer instances will consume Kafka in a balanced and reloaded manner.

If all Consumer instances have different Consumer groups, each record is broadcast to all Consumer processes.

Group Coordinator

A Group Coordinator is a service that each Broker starts when it starts. A Group Coordinator is used to store Meta information about a Group and record the Offset information of the corresponding Partition to a Kafka built-in Topic(__consumer_offsets). Kafka uses Zookeeper to store Offset information (consumers/{group}/offsets/{topic}/{Partition}). Since Zookeeper is not suitable for frequent write operations, the Offset of the corresponding Partition was recorded by built-in Topic after 0.9. As shown below:

This was true before Kafka 0.8.2

And then it went like this:

Each Group selects a Coordinator to complete the Offset information of each Partition in the Group. The selection rule is as follows:

  1. To calculateGroupCorresponding to the__consumer_offsetsOn thePartition
  2. Locate the Broker corresponding to the leader of the Partition based on the Partition. The Group Coordinator on the Broker is the Coordinator of the Group

Partition calculation rules:

partition-Id(__consumer_offsets) = Math.abs(groupId.hashCode() % groupMetadataTopicPartitionCount)
Copy the code

Including groupMetadataTopicPartitionCount offsets. The corresponding topic. Num. Partitions parameter values, the default value is 50 partition

Consumer Rebalance Protocol

Timing of rebalance

  1. The number of group members changed. Procedure For example, there are newconsumerThe instance joins or leaves the consumer group.
  2. Subscribe to theTopicThe number changes.
  3. To subscribe toTopicThe number of partitions in the.

A case in which the consumer process hangs

  1. sessionoverdue
  2. heartbeatoverdue

When Rebalance occurs, all Consumer instances in a Group are coordinated together. Kafka ensures the most equitable allocation possible. But the Rebalance process can have a serious impact on the Consumer Group. All Consumer instances under the Consumer Group will stop working during the Rebalance and wait for the Rebalance to complete.

The Consumer Rebalance protocol

How to perform after Rebalance occurs

1. New consumers join the Consumer Group

2. Select the leader from the Consumer Group

3. The leader allocates partitions

Issues

Known Issue #1: Stop-the-world Rebalance

Kafka
Rebalance
Consumer Group
Stop-the-world

Known Issue #2: Back-and-forth Rebalance

Rebalance

Compare current Rebalance with improved Rebalance

Progressive Rebalance protocol

  • Rebalance more times than before, but each time the Rebalance is cheaper
  • Fewer partitions are migrated than before
  • Consumer can continue to run during Rebalance

Refer to the article

  • Incremental Cooperative Rebalancing in Apache Kafka: Why Stop the World When You Can Change It?
  • KIP-429: Kafka Consumer Incremental Rebalance Protocol
  • Incremental Cooperative Rebalancing: Support and Policies

Pay attention to our