Copyright Notice: This technical column is the author's (Qin Kaixin) summary and distillation of day-to-day work. It draws cases from real business environments to share business application tuning suggestions, cluster capacity planning, and related content. Please continue to follow this blog series. Looking forward to joining the most combative team in the IoT era. QQ email: [email protected]; for academic exchange, please feel free to get in touch.

1 A word on availability

  • Availability reflects a Kafka cluster's ability to cope with crashes; tuning for availability lets Kafka recover from failures more quickly.

2 Look at the problem from the Controller's perspective

  • The Controller is a broker that is responsible for producing the replica assignment scheme and for bringing partitions online (according to that scheme).
  • Replica state machine
  • Partition state machine
  • Note that the more partitions a topic has, the more leader elections must be performed once a broker hosting leader replicas crashes. Although a single leader election is short, on the order of a few milliseconds, the Controller processes these requests in a single thread. The Controller uses ZooKeeper to detect broker status in real time; once a broker fails, the Controller senses it immediately and elects a new leader for each affected partition, but after the new assignment is produced it must also spend network I/O publishing metadata updates to the brokers.
  • It is therefore recommended not to make the partition count too large, as it can hurt availability; a quick sketch for checking per-topic partition counts follows.
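  • As a rough sanity check for the advice above, the sketch below is a minimal example, assuming the Java AdminClient (available since 0.11) and a placeholder bootstrap address. It lists how many partitions each topic carries, which is roughly the number of leader elections the single-threaded Controller may have to perform if a broker hosting those leaders crashes.

    import java.util.Map;
    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.TopicDescription;

    public class PartitionCountCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

            try (AdminClient admin = AdminClient.create(props)) {
                // Every partition whose leader lives on a crashed broker needs a new
                // leader election by the Controller, so keep an eye on how many
                // partitions each topic carries.
                Set<String> names = admin.listTopics().names().get();
                Map<String, TopicDescription> topics = admin.describeTopics(names).all().get();
                topics.forEach((name, description) ->
                        System.out.println(name + ": " + description.partitions().size() + " partitions"));
            }
        }
    }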

3 Look at the problem from the Producer's perspective

  • The first parameter is acks. If acks is set to all, the broker-side parameter min.insync.replicas affects producer availability: once the ISR shrinks below min.insync.replicas, the producer can no longer successfully send messages to that partition (the broker rejects the writes), which reduces availability. A producer-side sketch follows.
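  • A minimal producer-side sketch of the trade-off above, assuming the Kafka Java client with a placeholder bootstrap address and topic name: acks=all only has its intended effect together with the broker/topic setting min.insync.replicas, and when the ISR falls below that value the send fails rather than succeeding silently.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class AcksAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            // acks=all: the leader waits for the full ISR before acknowledging.
            // If the ISR shrinks below the topic's min.insync.replicas, sends to
            // that partition fail until enough replicas catch up.
            props.put(ProducerConfig.ACKS_CONFIG, "all");
            props.put(ProducerConfig.RETRIES_CONFIG, 3);

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("demo-topic", "key", "value"),
                        (metadata, exception) -> {
                            if (exception != null) {
                                // e.g. NotEnoughReplicasException when ISR < min.insync.replicas
                                exception.printStackTrace();
                            }
                        });
            }
        }
    }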

4 Look at the problem from the Broker's perspective

  • On the Broker side, availability is most directly tested when a broker crashes. After a crash, a new leader must be elected for the affected partitions, and this can happen in two ways: (1) elect only from the ISR; (2) allow a replica outside the ISR to become leader, with unclean.leader.election.enable deciding whether non-ISR replicas are eligible. In the second case there is a risk of data loss.
  • Another parameter is for broker crash-recovery tuning: num.recovery.threads.per.data.dir. The scenario is that when a broker restarts after a crash, it scans and loads the partition data on disk so it can clean up and resynchronize with other brokers; this process is called log loading and log recovery, and it is single-threaded by default. If a broker's log.dirs is configured with 10 log directories, a single thread may grind through the scan and load far too slowly. In practice, it is recommended to set this parameter to the number of disks (log directories) configured in log.dirs. A short configuration sketch follows.
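  • unclean.leader.election.enable and min.insync.replicas can also be overridden per topic. The sketch below is a minimal example, assuming the Java AdminClient and a hypothetical topic name and bootstrap address; num.recovery.threads.per.data.dir, by contrast, is a broker-level setting that belongs in server.properties.

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class AvailabilityTopic {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder

            Map<String, String> topicConfigs = new HashMap<>();
            // favor availability: allow a non-ISR replica to become leader
            // (accepts possible data loss)
            topicConfigs.put("unclean.leader.election.enable", "true");
            // keep the write-availability bar low when producers use acks=all
            topicConfigs.put("min.insync.replicas", "1");

            NewTopic topic = new NewTopic("demo-topic", 6, (short) 3).configs(topicConfigs);

            try (AdminClient admin = AdminClient.create(props)) {
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
            // Broker-side (server.properties), not settable per topic:
            //   num.recovery.threads.per.data.dir = <number of disks listed in log.dirs>
        }
    }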

5 Look at the problem from the Consumer’s perspective

  • The coordinator of a consumer group can automatically detect when one or more consumer instances in the group "fail" (crash) and trigger a rebalance; the crashed consumers' partitions are then reassigned to the surviving consumers.

  • The consumer parameter session.timeout.ms defines the window within which a failure is detected. For high availability, set it to a low value, such as 5-10 seconds; once a consumer's heartbeats have been missing for longer than this, a new round of rebalancing is started.

  • max.poll.interval.ms sets the maximum time a consumer instance may spend processing the messages returned by one poll(), i.e. the maximum delay allowed between successive poll() calls (the longest the consumer can go without fetching more records). If poll() is not called again before this timeout expires, the consumer is considered failed: it stops heartbeating, tells the coordinator it is leaving the group, and the group is rebalanced so its partitions can be reassigned to other members.

  • heartbeat.interval.ms: when the coordinator decides to start a new round of rebalancing, it piggybacks that decision on the consumers' heartbeat responses in the form of a REBALANCE_IN_PROGRESS error, so a small heartbeat interval lets consumers learn of the new partition assignment quickly. A consumer-configuration sketch that pulls these parameters together follows the monologue below.

  • A classic Kafka tuning monologue, hahaha:

    It had been online for two months without a problem, so why did problems suddenly appear? I figured the business system must have done something, so I went and asked. Sure enough, the risk-control system's Kafka had crashed the day before and its data had been re-pushed, which caused a backlog. But I thought even a backlog would simply be consumed slowly; nothing should go wrong. The default value of max.poll.records is 2147483647 (0.10.0.1), and if the records returned by one poll are not all consumed within the allotted time, the manual commit fails, the offsets are effectively rolled back, and the messages are consumed repeatedly, so the data keeps piling up. After consulting the company's Kafka guru, he said my Kafka version differed from his cluster's and asked me to upgrade. I upgraded to 0.10.2.1 and found on the official site that max.poll.records now defaults to 500, and that this version has a max.poll.interval.ms parameter, defaulting to 300s, which roughly means a consumer must finish processing one poll within that time. After updating the Kafka version and changing max.poll.records to 50, I deployed and waited to see the effect. It was already 9 o'clock at night when I finished, so I clocked out and went home, planning to check the result the next day. I got up early the next morning full of joy, expecting the problem to be solved, only to find the backlog was still there. Oh my God. I couldn't figure out what the problem was, so I added code to print the execution time before and after each piece of business processing and deployed again. Looking at the results, most of the time was spent on database I/O, and execution was very slow, mostly around 2s. Then it occurred to me that when we first went live the data volume was small and queries were fast, and now that the volume is large they are slow. The first thing that came to mind was to check whether the commonly queried fields were indexed; they were not, so I added indexes immediately. After adding the indexes, processing speed increased several times. But although single-record processing was now fine, the backlog still existed. I later found that the business system pushes roughly 3-4 records per second, while my Kafka consumer was single-threaded and could only keep about that pace, so with the earlier backlog on top, consumption was still slow. So I switched the business to multi-threaded consumption with a thread pool, opened 10 threads, deployed, and watched. The backlog was gone in a few minutes. Mission accomplished; at that moment I felt much more at ease. It's not easy!
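  • Pulling the consumer parameters above and the monologue's thread-pool fix together, the sketch below is a minimal example, assuming Kafka Java clients 2.0+ (for poll(Duration)) and placeholder topic, group id, bootstrap address, and tuning values: it polls small batches, hands the records to a 10-thread pool, waits for the whole batch, and only then commits manually, giving at-least-once processing.

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class AvailabilityConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // placeholder
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");   // commit manually
            props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "10000");   // detect crashes in ~10s
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, "3000"); // learn of rebalances quickly
            props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000");// max time to process one poll
            props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "50");        // keep batches small
            props.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, "1048576");

            ExecutorService pool = Executors.newFixedThreadPool(10);
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singleton("demo-topic"));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    List<Future<?>> batch = new ArrayList<>();
                    for (ConsumerRecord<String, String> record : records) {
                        batch.add(pool.submit(() -> process(record))); // parallel business processing
                    }
                    for (Future<?> f : batch) {
                        f.get(); // wait so the commit below only covers processed records
                    }
                    consumer.commitSync(); // manual commit after the whole batch succeeded
                }
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                pool.shutdown();
            }
        }

        private static void process(ConsumerRecord<String, String> record) {
            // placeholder for the real business logic (DB writes, etc.)
            System.out.println(record.offset() + ": " + record.value());
        }
    }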

6 Parameter List

The Broker side

  • Avoid too many partitions
  • Set unclean.leader.election.enable=true (favors availability; disregards possible data loss)
  • Set min.insync.replicas=1 (reduces synchronization pressure)
  • Set num.recovery.threads.per.data.dir = the number of disks configured in the broker's log.dirs

The Producer side

  • If acks is set to all (i.e. -1), pair it with min.insync.replicas=1 on the broker side

The Consumer side

  • Set session.timeout.ms to a low value, such as 10000 (10 seconds)
  • Set max.poll.interval.ms according to the average message-processing time
  • Tune max.poll.records and max.partition.fetch.bytes to reduce the total processing time per poll and avoid frequent rebalances

7 Summary

Pay particular attention to num.recovery.threads.per.data.dir, max.partition.fetch.bytes, and max.poll.records; they are very interesting.

Qin Kaixin in Shenzhen 201812042237
