Abstract

In this article, I’ll start with why message queues are needed, walk through two small examples, and describe some common usage scenarios for message queues.

For example, a message queue can decouple the parts of a complex system, or smooth out traffic in high-concurrency scenarios.

I’ll then introduce some important Kafka terms, such as topic, broker, and partition.

Note that Kafka is not just message middleware. It is also an excellent distributed stream processing platform, but in this article we will focus on Kafka as a message queue and how it works.

1. Application scenarios

Before we look at Kafka, let’s think about where messaging middleware is needed.

I’ll use two small examples to briefly describe usage scenarios for message-oriented middleware.

1.1 Decoupling

Suppose our system needs to call many other systems. We can of course call them one by one from our own code. But what happens when the set of systems we call changes, for example when new systems are added or some systems stop providing services? Do we modify the calling code and redeploy every time?

That is particularly cumbersome, and it leaves our system tightly coupled to all the other systems.

So we can wrap our “service invocation” into a message and send it to the messaging middleware.

The other systems just need to pull the messages they care about from the messaging middleware and process them.

In addition, users do not have to wait for the downstream work to finish: the system can return as soon as it has dropped the message into the messaging middleware, which gives faster response times.

This is what we call decoupling, where one system just sends the data to the middleware, and the other system just takes the data from the middleware and processes it.
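
The idea can be sketched in a few lines of Python. This is only an in-memory illustration and not Kafka itself: the MessageLog class, its methods, and the service names are all made up for this example.

```python
class MessageLog:
    """A toy in-memory stand-in for messaging middleware (hypothetical, not Kafka)."""

    def __init__(self):
        self._messages = []  # every message ever published, in order
        self._cursors = {}   # reader name -> index of the next unread message

    def publish(self, message):
        # The sending system only appends; it does not know who will read.
        self._messages.append(message)

    def pull(self, reader):
        # Each downstream system pulls from its own position, at its own pace.
        start = self._cursors.get(reader, 0)
        batch = self._messages[start:]
        self._cursors[reader] = len(self._messages)
        return batch


log = MessageLog()
log.publish({"type": "order_created", "order_id": 1})
log.publish({"type": "order_created", "order_id": 2})

# Two downstream systems read the same messages independently.
inventory_batch = log.pull("inventory")
email_batch = log.pull("email")

# A downstream system added later needs no change on the sending side.
log.publish({"type": "order_created", "order_id": 3})
analytics_batch = log.pull("analytics")
```

Notice that adding the "analytics" reader did not touch the publishing code at all, which is the whole point of decoupling.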

1.2 Peak shaving and valley filling

As soon as you hear this term, you know we are talking about high concurrency.

You are probably no stranger to it: whenever you search for keywords such as “high concurrency” or “seckill” (flash sale), you will find results that mention it.

The key to peak shaving and valley filling is to make the traffic flow more evenly.

Let’s take a “seckill” (flash sale) as an example.

In this mall, the upstream service for the “seckill” event is order placement. It needs to call many downstream services, such as the inventory service to deduct stock, the order service to create the order, and the payment service, which may in turn call a third-party payment interface. In simple terms, the upstream service processes requests much faster than the downstream services can.

At this point, if we place no limit on the upstream service and let every request go straight to the downstream services, the downstream services may be overwhelmed and the whole system may fail.

So, we can use messaging middleware to do “peak shaving and valley filling.”

The upstream service simply drops the necessary information, such as “which user bought which item at what time”, into the messaging middleware, and then the downstream service pulls the information from the messaging middleware at its own speed and consumes it. In this way, the whole system can process requests in an orderly manner.
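
To make the smoothing effect concrete, here is a small simulation in Python. It is only a sketch under simple assumptions: time is divided into ticks, the simulate function and its parameters are invented for this example, and the queue is a plain in-memory deque rather than real middleware.

```python
from collections import deque


def simulate(arrivals_per_tick, capacity_per_tick):
    """Return how many messages the downstream service handles in each tick.

    arrivals_per_tick: number of requests the upstream enqueues in each tick.
    capacity_per_tick: most messages the downstream can consume per tick.
    """
    if capacity_per_tick <= 0:
        raise ValueError("capacity_per_tick must be positive")
    queue = deque()
    handled = []
    tick = 0
    # Keep ticking until all arrivals are enqueued and the queue is drained.
    while tick < len(arrivals_per_tick) or queue:
        if tick < len(arrivals_per_tick):
            # The upstream service just enqueues and moves on.
            queue.extend(range(arrivals_per_tick[tick]))
        # The downstream pulls at its own speed, never more than its capacity.
        count = min(capacity_per_tick, len(queue))
        for _ in range(count):
            queue.popleft()
        handled.append(count)
        tick += 1
    return handled


# A flash-sale burst: 100 requests in the first tick, then almost nothing.
load = simulate([100, 5, 0], capacity_per_tick=30)
print(load)  # [30, 30, 30, 15]
```

The burst of 100 requests never reaches the downstream service all at once. It sees at most 30 per tick, and the backlog is worked off in the quieter ticks that follow.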

These are just a few simplified examples that omit many details. Real business is not so simple, and there is a lot more to consider. For example, after we put a message into the messaging middleware, when will it be consumed? Could it sit there indefinitely without being consumed? If so, do we need to send the message again?

And if we send the message again, could that lead to duplicate purchases?

Or is it possible that the messages we send are lost?

There are many questions like these, and we will look at them later in this series.

2. Concepts

Having said what message queues are for, I’d like to introduce you to a few common terms in Kafka.

One thing to note, though, is that Kafka is more than just a messaging engine. It is also an excellent distributed stream processing platform, even though I talk mostly about message queues throughout this article and this series.

My ability and knowledge are limited, so I will only cover the message queue side here.

2.1 Producers and consumers

This is easy to understand: the producer is the object that sends the message.

The producer is responsible for sending the message or record that needs to be processed to the message queue. What happens to the message after that is not its concern.

The consumer is the object that processes the message.

The consumer is responsible for pulling pending messages from the message queue and processing them.
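
As a minimal sketch of these two roles, the following Python snippet uses the standard library’s queue.Queue as a stand-in for the message queue. It is not a Kafka client; the producer and consumer functions, the STOP sentinel, and the sample records are assumptions made for illustration.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer there is nothing more to read


def producer(q, records):
    # The producer only sends; it does not care how records are processed.
    for record in records:
        q.put(record)
    q.put(STOP)


def consumer(q, results):
    # The consumer pulls pending records and processes them one by one.
    while True:
        record = q.get()
        if record is STOP:
            break
        results.append(record.upper())  # "processing" is just upper-casing here


q = queue.Queue()
results = []
worker = threading.Thread(target=consumer, args=(q, results))
worker.start()
producer(q, ["order-1", "order-2", "order-3"])
worker.join()
print(results)  # ['ORDER-1', 'ORDER-2', 'ORDER-3']
```

The producer and the consumer never call each other directly. The only thing they share is the queue.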

2.2 Topic

In the previous section, I only talked about “dropping a message into a message queue” and “pulling a message from a message queue”, so now we need to solve the following problems:

  • How do I determine where to send the message/where to get the message?

  • If there are different kinds of consumers, will the message get mixed up?

If there is only one kind of message, there is no problem. But what if there are different kinds of messages, say order messages and log messages? Could the order service end up receiving a log message? Should I deploy multiple Kafka instances?

This is why Kafka has the concept of a topic.

Before I explain how this works, you can think of it this way: one topic, one queue. When sending a message, we select the appropriate topic and send the message to this topic.

The producer sends the message to the chosen topic, and the consumer pulls messages from a specific topic and processes them.

As a result, our message queue can handle a wider variety of messages.
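
The “one topic, one queue” picture can be sketched in Python as a dictionary that maps topic names to lists. This is a simplified illustration only: the send and pull helpers and the topic names are hypothetical, and a real Kafka topic is more than a single list.

```python
from collections import defaultdict

# topic name -> list of messages; a toy stand-in for Kafka topics
topics = defaultdict(list)


def send(topic, message):
    """Producer side: append the message to the chosen topic."""
    topics[topic].append(message)


def pull(topic, offset):
    """Consumer side: return the messages of one topic starting at `offset`."""
    return topics[topic][offset:]


send("orders", "order 1001 created")
send("logs", "GET /index 200")
send("orders", "order 1002 created")

# The order service reads only the "orders" topic, so it never sees log messages.
order_messages = pull("orders", 0)
log_messages = pull("logs", 0)
```

Because each consumer names the topic it reads from, order messages and log messages cannot get mixed up, and a single deployment can carry both.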

2.3 Broker

We mentioned producers and consumers, which are called Kafka clients.

If there is a client, there must be a server, which is the Broker mentioned in this section.

The Broker is Kafka’s server. You can think of it as a queue where producers send messages to the Broker and consumers get messages from the Broker.

2.4 Partition

As mentioned above, a Broker holds several topics. Producers send messages to one topic in the Broker, and consumers pull messages from one topic in the Broker.

It is easy to see that this design has a performance bottleneck.

The bottleneck is the IO speed of the machine on which the Broker runs.

Suppose we have a very large number of topics on our Broker. Because of insufficient IO speed, producers may not be able to send messages to the Broker in time, and consumers may not be able to pull messages from the Broker in time.

Hence the concept of partitioning, which means we can divide a topic into many parts. Note that when a producer sends a message to a topic, the message is not sent to all partitions of the topic, but to exactly one of them.

In other words, partitioning is about scaling out, not about copying data. This kind of partitioning is therefore also called data partitioning or data sharding.

In addition, these partitions can be placed on different machines, so the load is spread across them and overall throughput can increase considerably.
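
Here is a rough Python sketch of how a message might be routed to exactly one partition. The rules shown (hash the key if there is one, otherwise rotate through the partitions) and the use of zlib.crc32 are assumptions chosen for illustration. Kafka’s actual partitioner differs in its details, and the names choose_partition and send are invented for this example.

```python
import zlib
from itertools import count

NUM_PARTITIONS = 3
partitions = [[] for _ in range(NUM_PARTITIONS)]  # one list per partition
_round_robin = count()


def choose_partition(key):
    """Pick exactly one partition for a message."""
    if key is None:
        # No key: rotate through the partitions to spread the load evenly.
        return next(_round_robin) % NUM_PARTITIONS
    # With a key: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def send(key, value):
    index = choose_partition(key)
    partitions[index].append(value)  # stored in one partition, not copied to all
    return index


first = send("user-42", "order A")
second = send("user-42", "order B")
for i in range(3):
    send(None, f"log line {i}")
```

Every message is stored once, in a single partition, and messages with the same key stay together in the same partition in the order they were sent.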

2.5 Replica

The concept of “high availability” is also particularly important in Internet systems.

In general, availability is achieved through redundancy.

We mentioned the concept of partitioning, which lets you divide a topic into multiple parts in order to scale out.

But if a partition dies, isn’t a lot of information about that topic lost?

Hence the concept of a replica. Put simply, replication means keeping several copies of each partition. The replicas of a partition consist of one leader and several followers. In many systems, the leader handles writes and the followers handle reads. Unlike MySQL, however, the followers in Kafka exist mainly for redundancy, and by default all reads and writes go through the leader.
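
The leader and follower roles can be illustrated with a toy Python model. This is a heavily simplified sketch, not how Kafka implements replication: the ReplicatedPartition class and the broker names are hypothetical, followers are updated synchronously on every write, and the new leader is simply the next surviving replica.

```python
class ReplicatedPartition:
    """A toy partition with one leader and several followers (hypothetical)."""

    def __init__(self, replica_names):
        self.logs = {name: [] for name in replica_names}
        self.leader = replica_names[0]

    def write(self, message):
        # All writes go to the leader...
        self.logs[self.leader].append(message)
        # ...and the followers copy the leader's log purely for redundancy.
        for name, log in self.logs.items():
            if name != self.leader:
                log[:] = self.logs[self.leader]

    def read(self):
        # Reads are served by the leader as well.
        return list(self.logs[self.leader])

    def fail_leader(self):
        # The leader's machine dies; promote one of the surviving followers.
        if len(self.logs) < 2:
            raise RuntimeError("no follower left to promote")
        del self.logs[self.leader]
        self.leader = next(iter(self.logs))


partition = ReplicatedPartition(["broker-1", "broker-2", "broker-3"])
partition.write("m1")
partition.write("m2")
partition.fail_leader()
partition.write("m3")
print(partition.leader, partition.read())  # broker-2 ['m1', 'm2', 'm3']
```

Even though the original leader is gone, the messages written before the failure are still available, because a follower that held a copy took over.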

Final words

First of all, thank you for being here.

In the Getting Started with Kafka series, my goal is to make complex concepts as easy to understand as possible, just like in the MySQL series. I will also go a little deeper into the principles, so that beyond knowing how to use Kafka, we also understand how it is implemented and why it is designed that way.

If there is anything I have not understood well enough or have explained incorrectly, you are welcome to point it out in the comments!

Thanks again for being here!

PS: If you have other questions, you can also reach me through my official account. Feel free to come and chat ~