As a Kafka novice, how do I get started?

Messaging middleware usage scenarios

1. Why use message-oriented middleware?

In enterprise-level projects, developers routinely face scenarios such as high concurrency, microservice interaction, and asynchronous invocation. Message-oriented middleware handles these complex scenarios through characteristics such as peak shaving, valley filling, and decoupling, thereby decoupling business logic, improving code maintainability, and improving system stability.

2. What common scenarios use message-oriented middleware?

  • Service scenario decoupling: for example, sending an SMS, email, or in-site notification after an order is placed…… asynchronous delivery after payment…… asynchronous data synchronization……
  • Bulk log collection: in a microservices scenario, log volume is huge; the well-known ELK stack uses Kafka to collect logs.
  • High-concurrency scenarios: in a flash-sale scenario, for example, a flood of requests arrives at once. If one system handled the entire process, it would essentially break down. Instead, the upstream system does only part of the work and hands the request off to a message queue for downstream systems to finish, quickly freeing itself to keep accepting requests.

Common messaging middleware

ActiveMQ

Developed in Java under the Apache banner; a veteran message middleware that supports the JMS specification and is typically found in older systems.

Low throughput, limited support for message backlogs, only an active-standby cluster mode, and relatively low availability.

RocketMQ

Produced by Alibaba and developed in Java; it has been through the baptism of Alibaba's production traffic and has stood the test.

RocketMQ is recommended for business systems. It supports message transactions, sequential messages, and message backtracking; its throughput is lower than Kafka's, and it offers distributed clustering and high availability.

RabbitMQ

Developed in Erlang; I have not worked with it much.

Kafka

Produced by Apache and developed in Scala; the de facto standard for processing big data messages, with high throughput, distributed clusters, high availability, message partitioning, and consumer groups.

Introduction to Kafka

Kafka is the mainstream distributed message engine and stream processing platform, often used as an enterprise message bus or real-time data pipeline. Originally conceived as a high-throughput distributed messaging system, Kafka has evolved into a full-fledged distributed message engine and stream processing platform.

Architecture diagram

Kafka is designed to follow the producer-consumer model, where producers send messages to specific partitions of a topic within the broker, and consumers pull data from one or more partitions for consumption.

Kafka relies on Zookeeper to provide distributed coordination services. Zookeeper is responsible for storing and managing metadata in Kafka clusters, including broker information, topic information, topic partition and replica information, and so on.

The architecture diagram is as follows:

Zookeeper plays a key role in Kafka:

  • For brokers:

    • Status: ZooKeeper records the live status of all brokers. Brokers send heartbeat requests to ZooKeeper to report their status. Zookeeper maintains a list of running brokers that belong to a cluster.
    • Controller election: A Kafka cluster has multiple brokers, one of which is elected as a controller. The controller manages the status of all partitions and replicas in the cluster. For example, if the leader of a partition fails, the controller elects a new leader. Zookeeper is responsible for selecting controllers from multiple brokers.
    • Quota permissions: Kafka allows some clients to have different quotas for production and consumption. The quota configuration information is stored in ZooKeeper. Access control information for all topics is also maintained by ZooKeeper.
    • Record the ISR: the In-Sync Replica set (ISR) is the subset of a partition's replicas that synchronize most actively with the leader. A message is considered “synchronized” only once it has been received by all members of the ISR, and only replicas in the ISR are eligible to be elected leader. Zookeeper records the ISR information and updates it in real time; if a member is found to be lagging, it is removed immediately.
    • Node and topic registries: ZooKeeper keeps registries for all nodes and topics, making it easy to find out which topics each broker holds (a small lookup sketch follows this list). Nodes and topics exist as temporary (ephemeral) nodes in ZooKeeper, and their information is lost as soon as the session with ZooKeeper is closed.
    • Topic configuration: ZooKeeper keeps topic-specific configurations, such as the list of topics, number of partitions per topic, location of replicas, and so on.
  • For consumers:

    • Offset: in older versions of Kafka, consumer offsets were stored in ZooKeeper by default. Because of ZooKeeper's strong consistency, this approach has high write latency. In newer versions Kafka does the work itself: offsets are kept in the internal __consumer_offsets topic, managed by a dedicated offset manager.

    • Registration: like brokers, consumers need to register. A consumer registers itself automatically by creating a temporary node that is destroyed when the consumer goes down.

    • Partition registration: each partition in Kafka can be consumed by only one consumer within a consumer group, so Kafka must track the relationship between all partitions and consumers.
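
To make these ZooKeeper structures concrete, here is a minimal, hypothetical lookup sketch using the plain ZooKeeper Java client. The localhost:2181 address is an assumption; /brokers/ids and /controller are the paths Kafka actually registers:

```java
import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public class BrokerRegistryLookup {
    public static void main(String[] args) throws Exception {
        // Connect to ZooKeeper (address assumed); no watcher needed for a one-off read.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, null);
        // Live brokers register ephemeral znodes under /brokers/ids.
        List<String> brokerIds = zk.getChildren("/brokers/ids", false);
        System.out.println("Live broker ids: " + brokerIds);
        // The elected controller is recorded in the /controller znode.
        byte[] controller = zk.getData("/controller", false, null);
        System.out.println("Controller info: " + new String(controller));
        zk.close();
    }
}
```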

Terminology

  • Producer: produces and sends messages.
  • Broker: a Kafka instance. Multiple brokers form a Kafka cluster; usually one Kafka instance is deployed per machine.
  • Consumer: pulls and consumes messages. A topic can be consumed by several consumers (or consumer instances), usually one per machine.
  • Consumer Group: Several consumers form a Consumer Group. A message can only be consumed by one Consumer in the Consumer Group. Consumers in the same consumer group can consume data from different partitions of the same topic to improve Kafka throughput.
  • Topic: a logical storage unit for server messages. A Topic typically contains several partitions.
  • Partition: a partition of a Topic, distributed among brokers to balance publish and subscribe load. Several partitions can be consumed by several consumers at the same time to achieve high consumer throughput. A partition has multiple replicas; this is Kafka’s design for reliability and availability.
  • Replication: each partition has multiple replicas that act as backups. If the leader replica fails, a follower is selected to become the new leader. The default maximum number of replicas in Kafka is 10, the number of replicas cannot be greater than the number of brokers, a follower and its leader are always on different machines, and a machine can hold at most one replica of a given partition.
  • Message: the data actually stored by the Kafka server. Each message consists of a key, a value, and a timestamp, and is stored as files on the broker.
  • Offset: the position of a message within its partition, maintained by Kafka itself. Consumers also keep a copy of the offsets to track how far they have consumed (both fields appear in the consumer sketch below).
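
To make these terms concrete, here is a minimal consumer sketch using the plain Java client. The broker address localhost:9092, the topic demo-topic, and the group demo-group are placeholder assumptions:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TermsDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // broker address (assumed)
        props.put("group.id", "demo-group");              // consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic")); // the topic
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> r : records) {
                // Each record carries the partition it came from and its offset there.
                System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                        r.partition(), r.offset(), r.key(), r.value());
            }
        }
    }
}
```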

Advantages & Characteristics

Kafka is a highly scalable distributed architecture with high throughput, data redundancy, and support for large message backlogs.

  • High throughput and low latency: these are Kafka’s salient features. Kafka can achieve throughput of millions of messages per second with millisecond-level latency.
  • Persistent storage: Kafka’s messages are ultimately persisted on disk, providing sequential reads and writes for performance, and improving data reliability through Kafka’s replication mechanism.
  • Distributed scalability: Kafka’s data is distributed across different broker nodes, organized by topic and distributed by partition. Overall scalability is very good.
  • High fault tolerance: Kafka can still provide service when any broker node in the cluster goes down.

Sequential reads and writes, Page Cache, zero-copy technology, and partitioning are all features that make Kafka read and write very fast. These features also give Kafka high performance, high throughput, and low latency.

Production-consumption mode

Management console

Kafka-manager: Kafka’s cluster management tool. It makes it easy to see which topics exist in a cluster and how each topic’s partitions are distributed.

Build the Kafka environment

Setup process……

Springboot + Kafka example

Project Example……

Messaging middleware comparison

Reference: Fully compare the advantages and disadvantages of Kafka, RabbitMQ, RocketMQ and ActiveMQ

Advanced features

Advanced features of Kafka

Kafka message sending mechanism

The mechanism for sending messages on the producer side of Kafka is very important; it is the basis of Kafka’s high throughput. The basic flow on the producer side is shown below:

The design mainly involves the following aspects (a producer sketch covering these parameters follows the list):

  • Asynchronous send

    Kafka introduced a new version of the Producer API in release 0.8.2, in which the producer sends messages entirely asynchronously. The ProducerRecord constructed on the producer side is serialized by the keySerializer and valueSerializer, then run through the Partitioner to determine which partition of the topic the message falls into. Finally the message lands in the client’s Accumulator message buffer, from which a dedicated thread called the Sender delivers it to the broker.

    The maximum size of the accumulator is controlled by the parameter buffer.memory (32 MB by default), so the buffer can be sized to match actual service conditions.

  • Batch send

    Messages sent to the buffer are sent to the broker in batches. The batch size is controlled by the parameter batch.size (16KB by default). This means that messages are normally sent to the broker in batches when they reach 16KB. Therefore, decreasing the batch size helps reduce message latency while increasing the batch size helps improve throughput.

    Do generated messages have to fill a batch before they are sent to the server? The answer is no. The producer exposes another important parameter, linger.ms, which controls the maximum time a batch may sit idle (0 ms by default); any batch exceeding this idle time is sent to the broker even if it is not full.

  • Message retry

    In addition, Kafka supports a retry mechanism. If a message fails to be sent for some reason, such as network jitter, the producer attempts to send it again. This is controlled by the retries parameter, which sets the number of retries; the default value is 0, meaning no retries.
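
Here is a minimal producer sketch wiring the parameters above together. The broker address and topic name are placeholder assumptions; the buffer.memory and batch.size values shown are the documented defaults, spelled out for visibility, while linger.ms and retries are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ProducerTuningDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("buffer.memory", 33554432); // accumulator size: 32 MB (the default)
        props.put("batch.size", 16384);       // batch threshold: 16 KB (the default)
        props.put("linger.ms", 5);            // wait up to 5 ms for a batch to fill
        props.put("retries", 3);              // retry transient failures such as network jitter

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("demo-topic", "key", "value");
            // send() is asynchronous: it appends to the accumulator and returns at once;
            // the callback fires once the Sender thread gets the broker's response.
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("sent to partition=%d offset=%d%n",
                            metadata.partition(), metadata.offset());
                }
            });
        } // close() flushes the accumulator before returning
    }
}
```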

Kafka replica mechanism

The concept of replicas was mentioned above. Replication is the basis on which Kafka achieves high reliability and high availability. Kafka has two types of replicas: leader and follower.

The role of Kafka replicas

By default, Kafka creates only one replica per partition, controlled by the broker parameter default.replication.factor (default value 1). You can modify this default, or specify --replication-factor when creating a topic on the command line (a creation sketch follows the list below). Three replicas are recommended for production. Replicas serve two main purposes:

  • Redundant message storage, which improves the reliability of Kafka’s data;
  • Improved availability of the Kafka service: if the leader replica fails or its broker goes down, a follower replica can take part in the leader election and continue to provide read and write services.
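
As a sketch of topic creation with replication, here is a minimal AdminClient example; the broker address and topic name are assumptions, and a replication factor of 3 requires at least three brokers. It is equivalent to passing --replication-factor 3 to the kafka-topics command-line tool:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, replication factor 3 (needs at least 3 brokers).
            NewTopic topic = new NewTopic("demo-topic", 3, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```
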
About read-write separation

Kafka does not support read-write separation: all read and write requests from producers and consumers are handled by the leader replica. The follower replica's main job is to pull messages from the leader asynchronously and keep its data in sync.

This is a deliberate design for read-write consistency: replica synchronization is an asynchronous process, so a follower that has not fully caught up with the leader might not be able to serve the latest messages.

The ISR replica set

To track the synchronization of a partition’s replicas, Kafka introduces the concept of In-Sync Replicas (ISR). The ISR is the list of replicas in a partition that are keeping in sync with the leader, and it always contains the leader replica itself.

The ISR list is persisted in Zookeeper, and any replica in the ISR list is eligible to participate in leader election.

The ISR list changes dynamically, and not every partition replica is in it. Which replicas make the list is controlled by the parameter replica.lag.time.max.ms, which sets the maximum interval a replica may lag behind the leader (10 seconds by default). A follower whose messages lag the leader by more than this interval is removed from the ISR. Kafka is designed this way to reduce message loss: only followers that stay in sync with the leader in real time are eligible to take part in leader election. (A sketch that inspects a topic’s ISR follows.)
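
Here is a minimal sketch that reads a topic's leader and ISR through the AdminClient; the broker address and topic name are assumptions:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class IsrDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(Collections.singleton("demo-topic"))
                    .all().get().get("demo-topic");
            for (TopicPartitionInfo p : desc.partitions()) {
                // isr() lists the replicas currently in sync with the leader.
                System.out.printf("partition=%d leader=%s isr=%s%n",
                        p.partition(), p.leader(), p.isr());
            }
        }
    }
}
```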

Kafka controller

The Controller is the core component of Kafka. Its main function is to manage and coordinate the whole Kafka cluster with the help of Zookeeper. Any broker in the cluster can act as the controller, but only one broker at a time can hold the role.

Zookeeper comes in here because the controller depends on Zookeeper's ZNode model and Watcher mechanism. Zookeeper's data model is a ZNode tree, similar to a Unix file system. A ZNode is a data node in Zookeeper and its smallest unit of storage; each ZNode can store data and mount child nodes, with the root node being /. The basic topology diagram is as follows:

Zookeeper has two types of ZNodes: persistent nodes and temporary (ephemeral) nodes. A persistent node continues to exist after the client disconnects from Zookeeper, until it is explicitly deleted. A temporary node's life cycle is bound to the client session: once the client disconnects from Zookeeper, the node is automatically deleted.

The Watcher mechanism is an important feature of Zookeeper. It can bind monitoring events on ZNode nodes, such as monitoring node data changes, node deletion, and status changes of child nodes. Through this mechanism, distributed locking and cluster management functions can be implemented based on Zookeeper.

  • Controller election

    When any broker in the cluster starts, it attempts to create a /controller node in Zookeeper. The first broker to successfully create the /controller node becomes the controller, and the other brokers listen for changes to that node. When the running controller goes down or terminates unexpectedly, the other brokers sense it quickly and each tries again to create the /controller node; the broker that succeeds becomes the new controller. (A minimal sketch of this ephemeral-node election pattern appears at the end of this section.)

  • Controller Functions

    The Kafka controller manages and coordinates the Kafka cluster in the following ways:

    • Topic management: Creating, deleting, and adding topic partitions are all performed by the controller.
    • Partition reassignment: Kafka’s partition-reassignment script moves topic partitions between brokers; this too is carried out by the controller.
    • Preferred leader election: the preferred replica is the first replica in a partition’s assigned replica list. Preferred leader election is the scheme by which Kafka promotes the preferred replica to leader when leader load becomes unbalanced; this is also the controller’s responsibility.
    • Cluster member management: the controller monitors new brokers joining, active broker shutdowns, and passive broker failures, using Zookeeper’s ZNode model and Watcher mechanism to watch for temporary-node changes under /brokers/ids.
    • Data service: The controller holds the most complete cluster metadata information, and all brokers regularly receive metadata update requests from the controller to update their cached data in memory.

    The controller is the heart of Kafka, managing and coordinating the entire Kafka cluster, so the performance and stability of the controller itself becomes critical.

    The community has done a lot of work in this area. In particular, the controller was redesigned in version 0.11: the biggest improvement replaced the internal multi-threaded design with a single thread plus an event queue, eliminating multi-thread resource contention and thread-safety problems. Another improvement changed the previously synchronous Zookeeper operations to asynchronous ones, removing a Zookeeper performance bottleneck and greatly improving the controller’s stability.
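
To illustrate the election pattern (not Kafka's actual implementation), here is a minimal "first ephemeral znode wins" sketch against ZooKeeper. The /demo-controller path and the address are hypothetical; Kafka's real node is /controller:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ControllerElectionSketch {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, null);
        try {
            // The first creator wins. The znode is EPHEMERAL, so it disappears
            // automatically if this "broker's" session dies, freeing the role.
            zk.create("/demo-controller", "broker-1".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
            System.out.println("I am the controller");
        } catch (KeeperException.NodeExistsException e) {
            // Lost the race: watch the node and re-run the election when it vanishes.
            zk.exists("/demo-controller",
                    event -> System.out.println("controller changed: " + event));
            System.out.println("Another broker is the controller");
        }
    }
}
```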

Kafka Rebalance

The consumer terminology above introduced the concept of the Consumer Group: a topic can be consumed by several consumers, several consumers form a consumer group, and a message can be consumed by only one consumer within the group. We use the following diagram to represent Kafka’s consumption model.

Rebalance concept

On Kafka's consumer side, one of the hardest issues to avoid is consumer rebalancing. A Rebalance is the process by which all consumers in a consumer group agree on how to divide the partitions of the subscribed topics. While a Rebalance is in progress, all consumer instances stop consuming and wait for it to complete. Because consumption halts until rebalancing finishes, a Rebalance can seriously hurt consumer-side TPS and should be avoided where possible.

Conditions that trigger a Rebalance

A Rebalance happens in three situations:

  • The number of consumer members in the consumer group has changed
  • The number of subscribed topics changes
  • The number of partitions of a subscribed topic changes

The latter two situations are usually planned, such as increasing the number of topic partitions to improve message throughput, and are usually unavoidable. Below we focus on how to avoid rebalances caused by changes in the number of consumers in a group.
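
A change in group membership is often not deliberate scaling but a consumer being marked dead by the coordinator. As a hedged sketch, these are the consumer parameters commonly tuned to avoid that; the values shown are illustrative, not prescriptive:

```java
import java.util.Properties;

public class RebalanceTuning {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "demo-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // A consumer is evicted (triggering a Rebalance) if no heartbeat arrives within
        // session.timeout.ms; a common rule of thumb is heartbeat.interval.ms at about
        // one third of the session timeout.
        props.put("session.timeout.ms", 10000);
        props.put("heartbeat.interval.ms", 3000);
        // A consumer is also considered dead if one poll() loop exceeds
        // max.poll.interval.ms, so size it to the slowest processing batch.
        props.put("max.poll.interval.ms", 300000);
        return props;
    }
}
```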

Kafka coordinator

Before discussing how to avoid the Rebalance problem, let’s take a look at Kafka’s coordinators. Like the Kafka controller, coordinators are core components of Kafka.

There are two main types of Kafka coordinators:

  • Group Coordinator
  • Consumer Coordinator

To better manage consumer group membership, offset management, and Rebalance, a Group Coordinator was introduced on the broker side and a Consumer Coordinator on the consumer side. When each broker starts, it creates a GroupCoordinator instance to handle metadata operations such as consumer group registration, tracking consumer members, and offsets; every broker has its own coordinator component. Likewise, when each consumer is instantiated, it creates a ConsumerCoordinator instance responsible for communication between the consumers in the group and the server-side group coordinator. The coordinator principle can be represented as follows:

The Consumer Coordinator on the client communicates with the Group Coordinator on the server through heartbeat.

Conclusion

This article has summarized the common uses of message-oriented middleware and introduced Kafka’s architecture, core concepts, and core principles: the message sending mechanism, partition replicas, the Kafka controller, consumer rebalancing, and more. It also covered building a Kafka environment and a simple Springboot + Kafka example.

I believe this introduction has given you a general understanding of Kafka. Of course, due to space constraints, the ISR, advanced features, and parameters were not covered in full detail; those are left for you to explore in practice.

Reference documentation

How to learn Kafka quickly and comprehensively? A painstakingly compiled 5,000 words

Fully compare the advantages and disadvantages of Kafka, RabbitMQ, RocketMQ and ActiveMQ

Kafka Topic Architecture: Replication, Failover, and Parallel Processing

Kafka client consumer configuration parameters

Some questions about understanding Kafka

Deploying and installing the Kafka cluster management tool Kafka-Manager

Kafka ISR replica synchronization mechanism

Kafka offset management