Kafka is a distributed publish/subscribe message queue, used mainly for real-time processing of big data.

Kafka’s designers and maintainers have built an excellent, performance-oriented system. With its log-structured architecture, the offloading of work to clients, broker-side batching, compression, zero-copy I/O, and stream-level parallelism, Kafka outperforms almost all other message-oriented middleware, commercial or open source.
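To make the batching and compression point concrete, here is a minimal sketch of why compressing a whole batch of messages tends to pay off more than compressing each message on its own. It is not Kafka code: it uses Python's standard `zlib` module, and the message contents and helper names (`make_messages`, `compressed_size_individually`, `compressed_size_batched`) are hypothetical, chosen only for illustration.

```python
import json
import zlib


def make_messages(n):
    """Build n small JSON messages that share most of their structure (hypothetical data)."""
    return [
        json.dumps({"user_id": i, "event": "page_view", "page": "/home"}).encode("utf-8")
        for i in range(n)
    ]


def compressed_size_individually(messages):
    """Total bytes when every message is compressed on its own."""
    return sum(len(zlib.compress(m)) for m in messages)


def compressed_size_batched(messages):
    """Total bytes when the messages are joined into one batch and compressed once."""
    batch = b"\n".join(messages)
    return len(zlib.compress(batch))


messages = make_messages(1000)
raw = sum(len(m) for m in messages)
individual = compressed_size_individually(messages)
batched = compressed_size_batched(messages)

# Small messages barely shrink alone; the batch exposes the repetition between them.
print(f"raw={raw} individual={individual} batched={batched}")
```

Because similar messages repeat the same field names and values, the compressor can only exploit that redundancy when it sees many messages together, which is one reason batching and compression work well in combination.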

Architecture diagram

Today I’d like to introduce the “Kafka Interview Summary”, carefully compiled by Zhu Xiaosi. He has written several books and tutorials on message queue products, and this is a great collection of Kafka questions that you shouldn’t miss.

The interview questions are divided into three parts: basic, intermediate, and advanced. I recommend studying them carefully; once you understand these questions, a Kafka interview should pose no problem.

If you feel you already have a good understanding of Kafka, you can also use this question bank as a reference to judge whether you could handle a Kafka interview at a first-tier Internet company.

[Collect information here!!]

Contents

  1. Basics
  2. Intermediate
  3. Advanced

Basics

1. What is Kafka used for? What are its usage scenarios?

2. What do ISR and AR stand for in Kafka? What does the expansion and shrinking of the ISR refer to?

3. What do HW, LEO, LSO and LW in Kafka stand for respectively?

4. How does Kafka guarantee message ordering?

5. What are partitioners, serializers, and interceptors in Kafka? In what order are they applied?

6. What is the structure of the Kafka producer client?

7. How many threads does the Kafka producer client use? What are they?

8. What were the design flaws of Kafka’s old Scala consumer client?

9. Is the statement “if the number of consumers in a consumer group exceeds the number of partitions of the topic, some consumers will receive no data” true? If so, is there a workaround?

10. What situations lead to duplicate consumption?

11. What situations lead to messages being missed during consumption?

12. KafkaConsumer is not thread-safe, so how can multithreaded consumption be implemented?

13. Describe the relationship between consumers and consumer groups.

14. What does Kafka do behind the scenes after you create (or delete) a topic using kafka-topics.sh?

15. Can the number of topic partitions be increased? And if so, how? And if not, why?

16. Can the number of topic partitions be reduced? If so, how? And if not, why?

17. How to choose an appropriate number of partitions when creating a topic?

Intermediate

1. What internal topics does Kafka currently have? What are their characteristics, and what is the role of each?

2. What is a preferred replica? What special role does it play?

3. Where does the concept of partition assignment appear in Kafka? Briefly describe the general process and principles.

4. Briefly describe Kafka’s log directory structure.

5. What index files are in Kafka?

6. If I specify an offset, how does Kafka find the corresponding message?

7. If I specify a timestamp, how does Kafka find the corresponding message?

8. Talk about your understanding of Kafka’s Log Retention

9. Talk about your understanding of Kafka’s Log Compaction

10. Talk about your understanding of Kafka’s underlying storage

11. Talk about how Kafka’s delayed operations work

12. Talk about what the Kafka controller does

13. What were the design flaws of Kafka’s old Scala consumer client?

14. What is the principle behind consumer rebalancing? (Hint: consumer coordinator and group coordinator)

15. How is idempotence implemented in Kafka?

Advanced

1. How are transactions implemented in Kafka?

2. What is a failed (out-of-sync) replica? What measures are there for dealing with it?

3. How do HW and LEO evolve across multiple replicas?

4. What reliability improvements has Kafka made? (HW, LeaderEpoch)

5. Why does Kafka not support read/write separation?

6. How can a delay queue be implemented in Kafka?

7. How can a dead letter queue and a retry queue be implemented in Kafka?

8. How can messages be audited in Kafka?

9. How does message tracing work in Kafka?

10. What is Lag? (Note the difference between READ_UNCOMMITTED and READ_COMMITTED)

11. Which Kafka metrics deserve the most attention?

12. Which design choices give Kafka such high performance?

* Note: this material was collected from the Internet and compiled by Zhu Xiaosi. It will be removed on request if it infringes any rights.