Kafka is a mainstream message streaming system with quite a few concepts. This article walks through Kafka's core concepts with diagrams, so that we end up with a clear picture of them in our minds.

Basics

Kafka is a stream processing system that lets back-end services communicate with each other easily. It is a common component of microservice architectures.

Producers and consumers

A Producer service sends messages to Kafka, and a Consumer service listens to Kafka to receive them.

A service can be both a producer and a consumer.


A Topic is the destination that producers send messages to and the source that consumers listen to.

A service can listen to multiple Topics and send messages to multiple Topics.
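To make the relationship between producers, consumers, and Topics concrete, here is a minimal in-memory sketch. It is not Kafka's API: `ToyBroker`, the topic names, and the two services are hypothetical names invented for illustration.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for Kafka: maps each Topic name to its listeners."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        # A consumer registers interest in a Topic.
        self.subscribers[topic].append(callback)

    def send(self, topic, message):
        # A producer sends to a Topic; every listener of that Topic sees it.
        for callback in self.subscribers[topic]:
            callback(message)


broker = ToyBroker()
billing_inbox, audit_inbox = [], []

# The (hypothetical) billing service listens to one Topic...
broker.subscribe("orders", billing_inbox.append)
# ...while the audit service listens to two.
broker.subscribe("orders", audit_inbox.append)
broker.subscribe("payments", audit_inbox.append)

broker.send("orders", "order-1 created")
broker.send("payments", "order-1 paid")

print(billing_inbox)  # ['order-1 created']
print(audit_inbox)    # ['order-1 created', 'order-1 paid']
```

The producer only names a Topic and never addresses a consumer directly, which is what keeps the services decoupled.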

Kafka also has the concept of a consumer group.

This is a set of services that together act as a single consumer.

When a message arrives for a consumer group, Kafka routes it to just one of the services in the group.

This load-balances the messages and makes it easy to scale out the consumer.
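The routing rule for a consumer group can be sketched in a few lines. This is a simplified model under a loud assumption: the members simply take turns per message, whereas real Kafka balances load by assigning partitions to group members. `ToyConsumerGroup` and the worker names are hypothetical.

```python
from itertools import cycle


class ToyConsumerGroup:
    """Delivers each message to exactly one member of the group."""

    def __init__(self, member_names):
        self.received = {name: [] for name in member_names}
        # Assumption for illustration: members take turns, one message each.
        self._next_member = cycle(member_names)

    def deliver(self, message):
        member = next(self._next_member)
        self.received[member].append(message)
        return member


group = ToyConsumerGroup(["worker-1", "worker-2", "worker-3"])
for i in range(6):
    group.deliver(f"msg-{i}")

for member, messages in group.received.items():
    print(member, messages)
```

Each of the six messages ends up with exactly one worker, so adding workers to the group spreads the same load over more services.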

A Topic acts as a queue of messages.

First, a message is sent.

The message is then recorded and stored in the queue and is not allowed to be modified.

Next, the message is sent to the consumer of this Topic.

However, the message will not be deleted and will remain in the queue.

Now another message is sent.

As before, this message is delivered to the consumer, cannot be changed, and remains in the queue.

(How long do messages stay in the queue? That is configurable in Kafka.)
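The behaviour described above (messages are appended, never modified, and not removed when read) can be modelled as an append-only log. The sketch below is an illustration only: `ToyLog`, its method names, and the integer timestamps are assumptions, and it assumes timestamps never decrease so that expired messages always form a prefix of the log.

```python
class ToyLog:
    """Append-only message log: reading never removes anything."""

    def __init__(self):
        self._records = []  # (timestamp, message) pairs, oldest first
        self._base = 0      # position of the oldest record still kept

    def append(self, message, timestamp):
        self._records.append((timestamp, message))
        return self._base + len(self._records) - 1  # position of the new message

    def read_from(self, position):
        # The log is untouched by reads; the reader just picks a start position.
        start = max(position - self._base, 0)
        return [message for _, message in self._records[start:]]

    def expire_older_than(self, cutoff):
        # Retention sketch: drop records stamped before the cutoff.
        kept = [r for r in self._records if r[0] >= cutoff]
        self._base += len(self._records) - len(kept)
        self._records = kept


log = ToyLog()
log.append("A", timestamp=100)
log.append("B", timestamp=200)

first_read = log.read_from(0)
second_read = log.read_from(0)  # same result: reading did not delete anything
log.expire_older_than(150)      # only retention removes messages
after_expiry = log.read_from(0)

print(first_read, second_read, after_expiry)  # ['A', 'B'] ['A', 'B'] ['B']
```

There is deliberately no method for editing a stored message, mirroring the rule that recorded messages cannot be modified.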

Partitions

In the description of a Topic, we treat a Topic as a queue. In fact, a Topic is composed of multiple queues, called partitions.

This makes Topics easy to scale.

When a producer sends a message, the message is routed to a Partition in the Topic.

The consumer listens on all partitions.

By default, a producer simply addresses the Topic, and the Partition that a message lands in is chosen for it, using a round-robin strategy by default.

It can also be configured so that messages of the same kind always land in the same Partition.
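The round-robin placement described above can be sketched as follows. `RoundRobinTopic` is a hypothetical name, and the sketch only models the rotation itself, not a real Kafka client.

```python
from itertools import count


class RoundRobinTopic:
    """A Topic made of several partitions, filled in rotation."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self._counter = count()  # 0, 1, 2, ... one tick per message

    def send(self, message):
        index = next(self._counter) % len(self.partitions)
        self.partitions[index].append(message)
        return index


topic = RoundRobinTopic(num_partitions=3)
for message in ["A", "B", "C", "D"]:
    topic.send(message)

print(topic.partitions)  # [['A', 'D'], ['B'], ['C']]
```

Consecutive messages land in different partitions, which spreads load evenly but scatters related messages.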

For example, when processing user messages, you can have all messages for a certain user in one Partition.

For example, user 1 sends three messages: A, B, and C. By default, these three messages are in different partitions (P1, P2, and P3).

After configuration, you can ensure that all messages from user 1 are sent to the same partition (for example, P1).
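One common way to get this behaviour is to derive the partition from a key such as the user ID. The sketch below is an assumption-laden illustration: `partition_for` is a hypothetical helper, and `zlib.crc32` is used only as a stable stand-in hash, not as the hash Kafka itself uses.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index; the same key always gets the same index."""
    # crc32 is a stand-in hash chosen because it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partitions = [[] for _ in range(3)]
outgoing = [("user-1", "A"), ("user-2", "X"), ("user-1", "B"), ("user-1", "C")]
for user, message in outgoing:
    partitions[partition_for(user, 3)].append((user, message))

user1_partition = partition_for("user-1", 3)
user1_messages = [m for u, m in partitions[user1_partition] if u == "user-1"]
print(user1_partition, user1_messages)
```

Whatever index `user-1` hashes to, all three of its messages sit in that one partition, in the order they were sent.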

What is this feature for?

It provides message ordering.

Ordering is not guaranteed across different partitions. Only messages within a single Partition are guaranteed to be ordered.
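A small simulation shows why ordering holds only within a partition. The sequence numbers and the partition names `p0` and `p1` are made up for illustration, and the consumer's read order is just one possible interleaving.

```python
# Each entry is (send order, partition the message was placed in).
sent = [(0, "p0"), (1, "p1"), (2, "p0"), (3, "p1"), (4, "p0")]

by_partition = {"p0": [], "p1": []}
for sequence, partition in sent:
    by_partition[partition].append(sequence)

# A consumer may drain the partitions in any interleaving; here p1 goes first.
consumed = by_partition["p1"] + by_partition["p0"]

ordered_overall = consumed == sorted(consumed)
ordered_within = all(p == sorted(p) for p in by_partition.values())

print(consumed)         # [1, 3, 0, 2, 4]
print(ordered_overall)  # False: no ordering across partitions
print(ordered_within)   # True: each partition keeps send order
```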

Architecture

Kafka is a cluster architecture, and ZooKeeper is an important component.

ZooKeeper manages all topics and partitions.

Topics and partitions are stored on physical nodes, which are maintained by ZooKeeper.

For example, there are two topics, each with two partitions.

This is the logical form, but the actual storage in a Kafka cluster might look like this:

Partition #1 of Topic A has 3 replicas, one on each Node.

This increases Kafka’s reliability and system resiliency.

Among the three replicas of Partition #1, ZooKeeper assigns one as the Leader, which receives messages from the producer.

The other two replicas of Partition #1 act as followers, and the messages received by the Leader are copied to them.

Thus, every replica of the Partition holds the full set of messages.

Even if a Node fails, there is no need to worry about the messages being lost or corrupted.
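The Leader and follower arrangement can be sketched as a toy model. Everything here is an assumption made for illustration: `ToyReplicatedPartition` and the node names are hypothetical, copying happens immediately inside `send`, and "pick the next surviving replica" stands in for the real leader election.

```python
class ToyReplicatedPartition:
    """One partition kept as a full copy on every node."""

    def __init__(self, nodes):
        self.replicas = {node: [] for node in nodes}
        self.leader = nodes[0]

    def send(self, message):
        # The Leader takes the write, then the message is copied to each follower.
        self.replicas[self.leader].append(message)
        for node, log in self.replicas.items():
            if node != self.leader:
                log.append(message)

    def fail(self, node):
        del self.replicas[node]
        if node == self.leader:
            # Stand-in for leader election: promote any surviving replica.
            self.leader = next(iter(self.replicas))

    def read(self):
        return list(self.replicas[self.leader])


partition = ToyReplicatedPartition(["node-1", "node-2", "node-3"])
partition.send("A")
partition.send("B")

partition.fail("node-1")  # the Leader's node goes down
partition.send("C")       # writes continue on the new Leader

print(partition.leader)   # node-2
print(partition.read())   # ['A', 'B', 'C']
```

Because each follower already held the full message set, the promoted replica can serve every message that was sent before the failure.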

The Partition distribution for Topic A and Topic B might look like this:

Thank you for reading; I hope this helps 🙂

Translation from:

Timothystepro.medium.com/visualizing…