Broker

Broker registration

Kafka relies heavily on ZooKeeper: every time a broker starts, it registers itself with an ephemeral (temporary) node in ZooKeeper.

The broker registers itself under the path {chroot}/brokers/ids/, or /brokers/ids/ if no chroot is configured. Whether a chroot is in effect depends on whether one is set on the zookeeper.connect parameter in server.properties.

The information registered by the broker to ZooKeeper is saved in JSON format. This includes the following information:

{
    "listener_security_protocol_map": { "PLAINTEXT": "PLAINTEXT" },
    "endpoints": [ "PLAINTEXT://192.168.110.110:9092" ],
    "jmx_port": 9999,
    "host": "192.168.110.110",
    "timestamp": "1575450860777",
    "port": 9092,
    "version": 4
}
  • listener_security_protocol_map: the security protocol used by each of the broker's listeners; Kafka allows a broker to use different security protocols when communicating with clients and with other brokers
  • endpoints: the broker's list of service endpoints; each endpoint specifies its transport security protocol, host name, and port
  • jmx_port: the broker's JMX monitoring port; the JMX_PORT environment variable must be set before starting the broker
  • host: the broker's host name or IP address
  • timestamp: the broker's startup time
  • port: the broker's service port number
  • version: the version of the registration information format (note that this is not the Kafka version)
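As a minimal sketch, the registration payload above can be built and read back with the standard json module. The parse_broker_registration helper below is hypothetical, purely for illustration of the fields; it is not part of any Kafka client API.

```python
import json

# The registration payload shown above, as a broker might publish it
# (values are illustrative, matching the example in the text).
registration = json.dumps({
    "listener_security_protocol_map": {"PLAINTEXT": "PLAINTEXT"},
    "endpoints": ["PLAINTEXT://192.168.110.110:9092"],
    "jmx_port": 9999,
    "host": "192.168.110.110",
    "timestamp": "1575450860777",
    "port": 9092,
    "version": 4,
})

def parse_broker_registration(payload: str) -> dict:
    """Decode a /brokers/ids/<id> znode payload and pull out the
    fields a client would typically care about."""
    info = json.loads(payload)
    return {
        "host": info["host"],
        "port": info["port"],
        "endpoints": info["endpoints"],
        # Registration format version, not the Kafka version.
        "version": info["version"],
    }

print(parse_broker_registration(registration))
```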

Broker lifecycle management

Kafka uses ZooKeeper ephemeral nodes to manage the broker life cycle.

Birth: when a broker starts, a corresponding ephemeral node is created in ZooKeeper, along with a listener.

The listener on the ephemeral node monitors the node's status in real time. Once the broker's state changes, the listener synchronizes the status information to the Kafka cluster.

Death: once the broker crashes, its session with ZooKeeper expires, which causes the ephemeral node to be deleted, the listener to be triggered, and the follow-up cleanup to be carried out.
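The birth/death mechanism above can be sketched with an in-memory simulation: the ephemeral node exists only while its owning session is alive, and watchers fire when it disappears. FakeZooKeeper and its methods are invented names for illustration only; they are not the real ZooKeeper client API.

```python
# Illustrative simulation of ephemeral-node lifecycle, not ZooKeeper itself.

class FakeZooKeeper:
    def __init__(self):
        self.nodes = {}      # path -> (session_id, data)
        self.watchers = {}   # path -> list of callbacks

    def create_ephemeral(self, path, data, session_id):
        # The node's lifetime is bound to the creating session.
        self.nodes[path] = (session_id, data)

    def watch(self, path, callback):
        self.watchers.setdefault(path, []).append(callback)

    def expire_session(self, session_id):
        # When a session expires, all of its ephemeral nodes are removed
        # and any watchers on those paths are notified.
        for path in [p for p, (sid, _) in self.nodes.items() if sid == session_id]:
            del self.nodes[path]
            for cb in self.watchers.pop(path, []):
                cb(path)

events = []
zk = FakeZooKeeper()

# "Birth": broker 0 starts, registers an ephemeral node; the cluster watches it.
zk.create_ephemeral("/brokers/ids/0", b"{}", session_id=1)
zk.watch("/brokers/ids/0", lambda path: events.append(("deleted", path)))

# "Death": the broker crashes, its session expires, the node vanishes,
# and the watcher fires so the cluster can react.
zk.expire_session(1)
print(events)
```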

Controller Broker

From the client's perspective, Kafka is a peer-to-peer distributed architecture: every broker holds the metadata of the entire cluster and can provide complete service on its own. From the server's perspective, however, Kafka is a master-slave architecture, and the reason is the Controller Broker.

What is a Controller Broker

A Kafka broker plays one of two roles: Controller or Follower. As the names suggest, the Controller is the master and the Followers are the slaves. Both roles do mostly the same work, but the Controller is additionally responsible for some management duties.

The responsibilities of Controller Broker

A Controller Broker does everything a Follower Broker does, and on top of that it is responsible for:

Topic management

  • Create and delete topics
  • Leader replica election. For example, when the leader replica of a partition fails, the Controller Broker is responsible for electing a new leader replica for that partition
  • Partition reassignment. For example, when the kafka-topics.sh script is used to increase the number of partitions for a topic, the Controller is responsible for reassigning the partitions
  • Synchronizing metadata. For example, when the ISR set of a partition changes, the Controller is responsible for notifying all brokers to update their metadata
  • Preferred leader election. The preferred leader election switches the leader back to the preferred replica, to avoid overloading some brokers
  • …

Managing Broker clusters

  • Bringing newly added brokers online
  • Taking failed brokers offline
  • …

The Controller Broker election

After each broker starts, it tries to create an ephemeral node /controller in ZooKeeper. The first broker to create the node becomes the Controller; all the other brokers are Followers.

The temporary node information of /controller is as follows

{
    "version": 1,
    "brokerid": 2,
    "timestamp": "1569267532839"
}
  • version: fixed at 1 (hard-coded in the source)
  • brokerid: the id of the broker that was elected Controller
  • timestamp: the registration time

When the Controller Broker loses its connection to ZooKeeper, the ephemeral node is deleted. The other brokers watch this node, so when it is deleted they receive an event notification and initiate a new Controller election.
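The "first create wins" election above can be sketched as a pure-Python simulation: an atomic create-if-absent on the /controller node, where only the first broker succeeds, and a re-run of the election once the node is gone. The ControllerElection class and elect helper are illustrative names, not real Kafka or ZooKeeper APIs.

```python
# Illustrative simulation of the /controller election; real brokers
# do this through ZooKeeper's atomic ephemeral-node creation.

class NodeExists(Exception):
    """Raised when the /controller node has already been created."""

class ControllerElection:
    def __init__(self):
        self.controller = None  # contents of the /controller znode

    def try_create(self, broker_id, timestamp):
        # Atomic create-if-absent: only the first broker succeeds.
        if self.controller is not None:
            raise NodeExists
        self.controller = {"version": 1, "brokerid": broker_id,
                           "timestamp": timestamp}

def elect(zk, broker_ids, timestamp):
    """Each broker tries to create /controller; the first one wins,
    the rest become followers."""
    winner = None
    for bid in broker_ids:
        try:
            zk.try_create(bid, timestamp)
            winner = bid
        except NodeExists:
            pass  # already taken: this broker becomes a follower
    return winner

zk = ControllerElection()
print(elect(zk, [2, 0, 1], "1569267532839"))  # broker 2 created the node first

# The Controller dies: the ephemeral node is removed and the
# remaining brokers re-run the election.
zk.controller = None
print(elect(zk, [0, 1], "1569267600000"))  # broker 0 wins this round
```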

Split brain in master-slave architecture

In a master-slave architecture, if the master node goes down, a slave node is promoted to master. The problem is determining whether the master is really dead or only temporarily unreachable. If the master was only temporarily disconnected and a slave has already been promoted, then when the original master comes back there are two masters in the cluster, and the other slave nodes do not know which one to obey. This is called split brain!

Controller split brain

First, each broker keeps in memory the brokerid of the current /controller node, known as the activeControllerId.

When a new broker is elected Controller, the /controller node changes. The change triggers the listening event, and every broker (including the original Controller Broker) updates the activeControllerId it holds in memory.

If the original Controller Broker can still handle the event, it updates its activeControllerId and checks whether its own broker id matches the new activeControllerId. If not, it "abdicates" and closes the relevant resources, for example shutting down its state machines and deregistering the corresponding listeners.

Ideally, the old Controller Broker can still handle the event, and the transition is smooth. In most cases, though, a Controller switch happens precisely because the old Controller Broker has been unresponsive for a long time, and in that state it cannot handle the listening event either. If it is truly down, that is fine; but if it recovers later, there will be two Controllers in the cluster. That is the Controller's split brain.

How Kafka solves split brain

Besides the ephemeral /controller node, Kafka also registers a persistent /controller_epoch node in ZooKeeper, which holds an integer value, the controller_epoch. It records the generation of the current Controller, and is therefore called the Controller's "epoch". Its initial value is 1, and it is incremented each time the Controller changes.

Every request that interacts with the Controller carries the controller_epoch field. If the controller_epoch value in a request is smaller than the controller_epoch value in the broker's memory, the request is considered to have been sent by an expired Controller and is treated as invalid. If the request's controller_epoch value is greater than the value in memory, a new Controller has been elected.

Kafka uses controller_epoch to guarantee that only one Controller is effective at a time, and hence that Controller operations are valid.
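The fencing rule described above can be sketched as follows: each broker remembers the latest epoch it has seen and rejects any request stamped with an older one, so messages from a "zombie" Controller are ignored. The Broker class here is an illustrative simulation of the rule, not Kafka's actual request-handling code.

```python
# Illustrative simulation of controller_epoch fencing.

class Broker:
    def __init__(self):
        self.controller_epoch = 1  # initial epoch, as described above

    def handle_controller_request(self, request_epoch):
        if request_epoch < self.controller_epoch:
            # Stamped by an expired (zombie) Controller: fence it off.
            return "rejected: expired controller"
        if request_epoch > self.controller_epoch:
            # A new Controller has been elected; remember its epoch.
            self.controller_epoch = request_epoch
        return "accepted"

broker = Broker()
print(broker.handle_controller_request(2))  # new Controller, epoch bumped to 2
print(broker.handle_controller_request(1))  # zombie Controller is rejected
```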