This is the third day of my participation in the August More text Challenge. For details, see:August is more challenging

The role of rebalance

Agree on which theme zones to consume within the consumer group. Rebalancing requires the Coordinator component on the Kafka Broker to reassign partitions to consumer groups with the help of a Coordinator.

Three conditions that trigger rebalancing

  1. The number of members in the group changes. Procedure
  2. The number of topics subscribed to by consumer groups has changed
  3. The number of partitions subscribed to the topic changed.

The process of rebalancing

A rebalance is a heartbeat thread on the consumer side that notifies other consumer instances that a rebalance has occurred. With rebalancing enabled, Borker coordinates the entire rebalancing mechanism by maintaining a consumer group state machine.

The five states of consumers

state meaning
Empty There are no members in the group, but consumers may have submitted displacement data, and these only have not expired
Dead The group does not have any members, but the metadata information of the group has been removed by the coordinator. The coordinator keeps the information of all groups registered with the coordinator
PreparingRebalance The consumer group is ready to turn rebalance on, at which point all members need to rejoin the consumer group
CompletingRebalance All members of the consumer group have joined, and each member is waiting for the allocation plan
Stable The stable state of the consumer group, which indicates that the rebalance has been completed and that the members of the group are able to consume data normally

The flow of five states

  1. A consumer group starts out in the Empty state. When the rebalancing process is on, it is in the PreparingRebalance state waiting for members to join, then changes to the CompletingRebalance state waiting for the allocation scheme, and finally flows to the Stable state to complete the rebalancing.

  2. When a new member joins or an existing member exits, the state of a consumer group jumps directly from Stable to the PreparingRebalance state, at which point all existing members must reapply to join the group. When all members exit the group, the consumer group status changes to Empty.

Detailed process

On the Consumer side, rebalancing is a two-step process: joining the group and waiting for the Leader Consumer to assign the solution. These two steps correspond to two specific types of requests: JoinGroup requests and SyncGroup requests.

  1. When a member of the group joins the group, it sends a JoinGroup request to the coordinator. In this request, each member reports the topics to which they subscribe, so that the coordinator can collect subscription information from all members. Once the JoinGroup requests from all the members have been collected, the coordinator selects one of these members to be the leader of the consumer group. The task of the leader consumer is to collect subscription information of all members, and then make specific partition consumption allocation plan according to this information.

  2. After the leader is selected, the coordinator encapsulates the subscription information of the consumer group into the response body of the JoinGroup request and sends it to the leader. After the leader makes a unified allocation scheme, the leader sends a SyncGroup request to the coordinator and sends the just made allocation scheme to the coordinator.

  3. Other members also send SyncGroup requests to the coordinator, but there is no actual content in the request body. The main purpose of this step is for the coordinator to receive the allocation and then uniformly distribute it to all members as a SyncGroup response, so that all members in the group know which partitions they should consume.

Broker side rebalancing process

Scenario 1: New members join

When the coordinator receives a new JoinGroup request, it notifying all existing members of the group through a heartbeat request response, forcing them to start a new round of rebalancing. The process is the same as the previous client rebalancing process.

Scenario 2: A member leaves the group

The thread or process on which the consumer instance resides actively notifies the coordinator that it is exiting by calling the close() method. This scenario involves a third type of request: a LeaveGroup request. After receiving the LeaveGroup request, the coordinator will still notify other members in the form of heartbeat response

Scenario 3: Group member crashes and leaves the group

The difference between this and active defection is that it is a passive defection, in which the broker is unaware of the defection and determines the defection after a heartbeat timeout.