This article is part of the blogger's study notes; the content is collected from the Internet. If anything here infringes your rights, please contact me and it will be removed.

Personal note: www.dbses.cn/technotes

We can easily imagine two approaches to handling requests.

  • Sequential processing of requests

The pseudocode might look something like this:

while (true) {
    Request request = accept(connection);
    handle(request);
}

Throughput is poor: each request must wait for the previous one to finish before it can be processed. This approach is only suitable for systems that receive requests very infrequently.

  • Processing requests asynchronously

The pseudocode might look something like this:

while (true) {
    Request request = accept(connection);
    Thread thread = new Thread(() -> handle(request));
    thread.start();
}

The advantage of this approach is that it is completely asynchronous and processing of each request does not block the next one.

The downside is that creating a thread for every request is very expensive and, under heavy traffic, can overwhelm the entire service. This approach, too, only suits business scenarios where requests arrive infrequently.

How Kafka handles requests

Kafka uses the Reactor pattern.

Simply put, the Reactor pattern is an implementation of event-driven architecture that is particularly well suited to scenarios where multiple clients send requests to a server concurrently.

The defining characteristic of the Reactor pattern is dispatching: requests arrive in one place and are handed off to workers for processing.

The Reactor has a request dispatcher, also known as the Acceptor, which dispatches requests to multiple worker threads.

The Acceptor thread only distributes requests and performs no business logic, so it is very lightweight and achieves high throughput. The worker threads, meanwhile, can be added or removed according to actual business needs, dynamically adjusting the system's processing capacity.
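To make the dispatching idea concrete, here is a minimal Java sketch (all class and method names are invented for this illustration; none of them are Kafka classes). A single acceptor loop takes requests and submits them to a fixed pool of worker threads, so accepting new requests is never blocked by handling them:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Minimal Reactor-style sketch: one acceptor loop dispatches each request
// to a fixed pool of worker threads and does no business logic itself.
public class ReactorSketch {

    // Stand-in for a client request; a real server would wrap a socket read here.
    record Request(String payload) {}

    private final ExecutorService workers = Executors.newFixedThreadPool(8);

    private Request accept() {
        return new Request("ping");      // pretend source of incoming requests
    }

    private void handle(Request request) {
        System.out.println("handled " + request.payload());
    }

    public void run() {
        while (true) {
            Request request = accept();             // lightweight: just take the request
            workers.submit(() -> handle(request));  // hand the real work to a worker
        }
    }
}

The acceptor does almost no work per request, which is what keeps it lightweight; the worker pool is what you grow or shrink to adjust processing capacity.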

Kafka’s request processing model is similar:

SocketServer component: it contains the corresponding Acceptor threads and a network thread pool.

Acceptor threads: distribute inbound requests to all network threads in a round-robin fashion, so the load is spread fairly.

Network thread pool: Handles work assigned by Acceptor threads.

Kafka provides the broker-side parameter num.network.threads to adjust the number of threads in the network thread pool. The default value is 3, which means each broker creates three network threads at startup to handle requests sent by clients.
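For reference, this parameter lives in the broker configuration; a minimal server.properties fragment with the default value mentioned above would look like this:

# broker-side configuration (server.properties)
num.network.threads=3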

Kafka network threads process requests

A client request is dispatched by the Acceptor thread on the broker side to one of the network threads, but the network thread does not process it by itself: Kafka adds another layer of asynchronous processing with a separate thread pool.

The main steps are as follows:

  1. When the network thread gets the request, it places it in a shared request queue.

  2. The Broker side has a pool of IO threads that pull requests from this queue to perform the actual processing.

    The broker-side parameter num.io.threads controls the number of threads in this pool. The default value is 8, which means each broker automatically creates eight IO threads to process requests.

  3. The IO thread processes the request. For a PRODUCE request, the messages are written to the underlying log on disk; for a FETCH request, the messages are read from the page cache or from disk.

  4. After the IO thread finishes processing the request, it puts the generated Response into the Response queue of the network thread that received the request, and that network thread then returns the Response to the client (see the sketch after this list).
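The flow described in the steps above can be sketched roughly as follows (the class and field names are invented for this illustration and are not Kafka's actual implementation): one request queue shared by all network threads, and one response queue owned by each network thread.

import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative model of the request/response flow: network threads put requests
// into ONE shared queue; IO threads take them, do the work, and put the Response
// into the queue owned by the network thread that received the original request.
public class RequestChannelSketch {

    record Request(int networkThreadId, String body) {}
    record Response(int networkThreadId, String body) {}

    // Shared by all network threads (step 1).
    private final BlockingQueue<Request> requestQueue = new ArrayBlockingQueue<>(500);

    // One response queue per network thread (step 4); three here, matching num.network.threads=3.
    private final List<BlockingQueue<Response>> responseQueues = List.of(
            new ArrayBlockingQueue<>(500),
            new ArrayBlockingQueue<>(500),
            new ArrayBlockingQueue<>(500));

    // Called by a network thread after it has read a request from its socket.
    void sendRequest(Request request) throws InterruptedException {
        requestQueue.put(request);
    }

    // Run by each IO thread (num.io.threads = 8 by default): take a request,
    // do the disk / page-cache work, then route the Response back.
    void ioThreadLoop() throws InterruptedException {
        while (true) {
            Request request = requestQueue.take();
            Response response = new Response(request.networkThreadId(), "ok: " + request.body());
            responseQueues.get(request.networkThreadId()).put(response);
        }
    }
}

The key point is that a Response is always routed back to the specific network thread that owns the client connection, which is why each network thread has its own response queue.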

Why don't the network threads simply process the requests themselves? Why are steps 2 and 3 needed? Because the actual processing, especially disk I/O, can be slow; keeping it off the network threads prevents one slow request from blocking the receiving and sending of all the other requests on that thread.

Difference between request queue and response queue

The request queue is shared by all network threads, while the response queue is exclusive to each network thread.

The reason for this design is that the Dispatcher only distributes requests and plays no part in returning Responses. Each network thread sends its own Responses back to its own clients, so Responses do not need to live in a shared queue.

The Purgatory component

There is also a component called Purgatory, Kafka's famous "purgatory" component, which is used to cache delayed requests.

Delayed requests are requests that cannot be completed immediately because some condition has not yet been met. For example, for a PRODUCE request with acks=all, the request cannot be answered until all replicas in the ISR have received the message, so the IO thread handling it must wait for write results from other brokers. While a request cannot be completed, it is parked temporarily in Purgatory. Later, once the completion condition is met, an IO thread continues processing the request and places the Response into the Response queue of the corresponding network thread.
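As a rough illustration of the idea only (Kafka's real Purgatory is far more sophisticated, using watcher lists and a hierarchical timing wheel; every name below is invented for this sketch), a delayed request is parked until either its completion condition is met or it expires:

import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Simplified sketch of the Purgatory idea: park requests that cannot complete
// yet, re-check them when relevant state changes, and complete or expire them.
public class PurgatorySketch {

    interface DelayedRequest {
        boolean tryComplete();   // e.g. "have all ISR replicas acknowledged the write?"
        void onComplete();       // e.g. put the Response into the network thread's queue
        boolean isExpired();     // e.g. the request timed out
        void onExpired();        // e.g. return a timeout error to the client
    }

    private final Queue<DelayedRequest> pending = new ConcurrentLinkedQueue<>();

    // Called when a request (e.g. a PRODUCE with acks=all) cannot complete immediately.
    void park(DelayedRequest request) {
        if (request.tryComplete()) {
            request.onComplete();
        } else {
            pending.add(request);
        }
    }

    // Called when something changes, e.g. a follower fetch advances the high watermark:
    // re-check the parked requests and finish the ones that are now ready.
    void checkAndComplete() {
        Iterator<DelayedRequest> it = pending.iterator();
        while (it.hasNext()) {
            DelayedRequest request = it.next();
            if (request.tryComplete()) {
                it.remove();
                request.onComplete();
            } else if (request.isExpired()) {
                it.remove();
                request.onExpired();
            }
        }
    }
}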