Preface:

Apache Kafka (Kafka for short) is a distributed messaging system originally developed at LinkedIn. It is now a top-level Apache project and has become one of the most widely used messaging systems in the open source world. The Kafka community is also very active, and since version 0.9 Kafka's tagline has changed from "a high-throughput distributed messaging system" to "a distributed streaming platform".


Table of Contents:

  • Introduction to Kafka
  • Producers
  • Consumers: high-level API and low-level API
  • The new consumer
  • The coordinator
  • Storage layer
  • The controller
  • Building data pipelines with Kafka
  • Kafka stream processing
  • Advanced Features

Consumers: high-level API and low-level API

The controller

Compared with the components covered in previous chapters, the Kafka controller has far more responsibilities: it selects partition leaders (PartitionLeaderSelector), drives the partition state machine (PartitionStateMachine) and the replica state machine (ReplicaStateMachine), and manages listeners of various types.
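To make the state-machine idea concrete, below is a toy sketch of a partition state machine in Java. The state names mirror those used inside the Kafka broker, but the class itself is only illustrative; the real PartitionStateMachine also elects leaders, updates ZooKeeper, and sends LeaderAndIsr requests to brokers on each transition.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;

// Toy model of the controller's partition state machine.
// State names mirror Kafka's; the transition logic is heavily simplified.
public class ToyPartitionStateMachine {

    public enum PartitionState { NON_EXISTENT, NEW, ONLINE, OFFLINE }

    // Legal previous states for each target state.
    private static final Map<PartitionState, EnumSet<PartitionState>> VALID_PREVIOUS =
            new EnumMap<>(PartitionState.class);
    static {
        VALID_PREVIOUS.put(PartitionState.NEW,
                EnumSet.of(PartitionState.NON_EXISTENT));
        VALID_PREVIOUS.put(PartitionState.ONLINE,
                EnumSet.of(PartitionState.NEW, PartitionState.ONLINE, PartitionState.OFFLINE));
        VALID_PREVIOUS.put(PartitionState.OFFLINE,
                EnumSet.of(PartitionState.NEW, PartitionState.ONLINE, PartitionState.OFFLINE));
        VALID_PREVIOUS.put(PartitionState.NON_EXISTENT,
                EnumSet.of(PartitionState.OFFLINE));
    }

    private final Map<String, PartitionState> states = new HashMap<>();

    // In the real controller, a successful transition is where a leader
    // election runs and LeaderAndIsr requests go out to the brokers.
    public void transitionTo(String partition, PartitionState target) {
        PartitionState current = states.getOrDefault(partition, PartitionState.NON_EXISTENT);
        if (!VALID_PREVIOUS.get(target).contains(current)) {
            throw new IllegalStateException(
                    "Illegal transition for " + partition + ": " + current + " -> " + target);
        }
        states.put(partition, target);
    }
}
```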

Building data pipelines with Kafka

Kafka has two core roles: producer and consumer. If you run multiple Kafka clusters and need to synchronize data between them, you can write an application whose consumer reads messages from the source cluster and whose producer writes them to the target cluster. However, Kafka itself ships with a built-in tool for cluster synchronization: MirrorMaker. This chapter also analyzes Uber's open source uReplicator, an improved version of MirrorMaker. For moving data between Kafka and other storage systems, Kafka provides Kafka Connect to import and export data.
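At its core, MirrorMaker is exactly this consume-then-produce loop. The sketch below mirrors a single topic between two clusters using the standard Java client API; the topic name and bootstrap addresses are placeholders, and the real MirrorMaker adds multiple consumer threads, batching, and failure handling on top of this idea.

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class MiniMirror {
    public static void main(String[] args) {
        // Consumer attached to the source cluster (addresses are placeholders).
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "source-cluster:9092");
        consumerProps.put("group.id", "mini-mirror");
        consumerProps.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        consumerProps.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        // Producer attached to the target cluster.
        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "target-cluster:9092");
        producerProps.put("key.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        producerProps.put("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList("events"));
            // Runs until the process is killed.
            while (true) {
                // Read a batch from the source cluster, re-publish to the target.
                ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<byte[], byte[]> record : records) {
                    producer.send(new ProducerRecord<>("events", record.key(), record.value()));
                }
            }
        }
    }
}
```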


Kafka stream processing

Prior to the 0.10 release, Kafka served only as a message storage system, and developers had to rely on third-party stream processing engines to process data from Kafka clusters. Since version 0.10, Kafka has shipped with a built-in stream processing client library, Kafka Streams, allowing developers to build complex stream computing tasks directly on top of Kafka.

The Kafka Streams framework offers developers two APIs: a low-level Processor API and a high-level Streams DSL. With the former, developers implement the stream processing logic themselves; with the latter, the framework provides generic stream operators that cover common stream computing needs. Let's start with a simple low-level Processor example program and then step through the internal implementation of the Kafka stream processing framework via the classes involved in the example.
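In that spirit, here is a small, self-contained Processor API example (not the book's own sample program): a processor that upper-cases each record value and forwards it downstream. Topic names and the application id are placeholders, and the code uses the Processor API as it looked around Kafka 1.0 (later releases moved to org.apache.kafka.streams.processor.api.Processor).

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.AbstractProcessor;

import java.util.Properties;

public class UpperCaseApp {

    // A low-level processor: receives one record at a time and forwards a result.
    static class UpperCaseProcessor extends AbstractProcessor<String, String> {
        @Override
        public void process(String key, String value) {
            context().forward(key, value == null ? null : value.toUpperCase());
        }
    }

    public static void main(String[] args) {
        // Wire source -> processor -> sink by node name.
        Topology topology = new Topology();
        topology.addSource("Source", "input-topic");
        topology.addProcessor("UpperCase", UpperCaseProcessor::new, "Source");
        topology.addSink("Sink", "output-topic", "UpperCase");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        KafkaStreams streams = new KafkaStreams(topology, props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```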

Advanced Features

This chapter only briefly introduces some of the advanced features of the new version of Kafka, without delving into the details of the code. Toward the end of this writing, Kafka's version number finally moved from 0.x to 1.0.0. The new version not only supports transactions; the connector and stream processing features have also become increasingly rich and stable, so users can safely use these new features in production. Besides the Confluent blog, the official KIP (Kafka Improvement Proposal) documents are also a good resource for readers interested in the latest technical improvements in the Kafka community.
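As a small taste of the transaction feature, here is a minimal sketch of Kafka's transactional producer API (available since version 0.11): the writes to the two topics either all become visible to read_committed consumers or none do. Topic names, the transactional id, and the broker address are placeholders.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

import java.util.Properties;

public class TransactionalSend {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // A stable transactional id lets the broker fence off zombie producers.
        props.put("transactional.id", "order-writer-1");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions(); // register with the transaction coordinator
        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "order-42", "created"));
            producer.send(new ProducerRecord<>("audit-log", "order-42", "order created"));
            producer.commitTransaction(); // both records commit atomically
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
            // Fatal errors: the producer cannot be reused; fall through to close().
        } catch (KafkaException e) {
            producer.abortTransaction(); // neither record becomes visible
        } finally {
            producer.close();
        }
    }
}
```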

Reader benefits:

Due to space limitations, the full book cannot be reproduced here; only a few key sections are shared. To get the complete ebook, follow the author's account (Java Zhou) and send a private message.