1. Foreword

Kafka version: 2.7. This version is relatively new. Why not go even newer, say 2.8? Because 2.8 removes ZooKeeper, it is not yet considered stable and is not recommended for production use, so the source build and study here are based on version 2.7.

2. Environment preparation

  • JDK: 1.8.0_241
  • Scala: 2.12.8
  • Gradle: 6.6
  • Zookeeper: 3.4.14

3. Environment construction

3.1 JDK environment Construction

There is nothing to explain here; anyone doing Java development already has a JDK installed.

3.2 Scala Environment Construction

Download link: www.scala-lang.org/download/2….

My machine runs macOS; simply pick the package that matches your own system.

3.2.1 Configuring Scala Environment Variables

Enter the following command on the terminal to edit:

vim ~/.bash_profile

# The path here is where you installed Scala
SCALA_HOME=/Users/Riemann/Tools/scala-2.12.8
export SCALA_HOME
export PATH=$PATH:$SCALA_HOME/bin

# Make the environment variables take effect by running this on the command line:
source ~/.bash_profile

3.2.2 Validation

Enter the following command:

scala -version

If the following message is displayed, the Scala environment is set up successfully.
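
For reference, the output looks roughly like this (the copyright line may differ slightly depending on the exact build):

$ scala -version
Scala code runner version 2.12.8 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.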

3.3 Gradle environment construction

First, go to the Gradle distributions page: services.gradle.org/distributio…

As the download page shows, select the distribution you want to install: gradle-x.x-bin.zip is the binary distribution to download, gradle-x.x-src.zip is the source code, and gradle-x.x-all.zip contains everything (binaries, sources and documentation). I have gradle-6.6 locally.

The Gradle files do not need to be installed. You can unzip the files in the local directory, as shown below.
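
For example, assuming the archive sits in the current directory and using the install path referenced later in this article:

unzip gradle-6.6-bin.zip -d /Users/Riemann/Tools/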

3.3.1 Configuring Gradle Environment Variables

Enter the following command on the terminal to edit:

vim ~/.bash_profile

# The path here is where you installed Gradle
GRADLE_HOME=/Users/Riemann/Tools/gradle-6.6
export GRADLE_HOME
export PATH=$PATH:$GRADLE_HOME/bin

# Make the environment variables take effect by running this on the command line:
source ~/.bash_profile

3.3.2 Validation

Enter the following command:

gradle -v

If the following information is displayed, the Gradle environment is set up successfully.
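
For reference, the output is roughly the following banner (build time, plugin versions and OS details will differ on your machine):

$ gradle -v

------------------------------------------------------------
Gradle 6.6
------------------------------------------------------------

Build time: ...
JVM:        1.8.0_241 (...)
OS:         Mac OS X ...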

3.4 Setting up a Zookeeper Environment

I set up the ZooKeeper environment on a Linux machine. The steps are given below and are much the same whatever your operating system is.

3.4.1 Download

wget http://mirrors.hust.edu.cn/apache/zookeeper/zookeeper-3.4.14/zookeeper-3.4.14.tar.gz

3.4.2 Decompression

tar -zxvf zookeeper-3.4.14.tar.gz

3.4.3 Go to the ZooKeeper-3.4.14 directory and create a data folder

cd zookeeper-3.4.14
mkdir data

3.4.4 Modifying a Configuration File

cd conf
mv zoo_sample.cfg zoo.cfg

3.4.5 Modifying the dataDir property in zoo.cfg

dataDir=/root/zookeeper-3.4.14/data

3.4.6 Starting the ZooKeeper service

Go to the bin directory and run the command to start the service

./zkServer.sh start

If the following information is displayed, the startup is successful
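
For reference, the start command and a status check produce output roughly like the following on ZooKeeper 3.4.x (paths follow the install location used above):

$ ./zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /root/zookeeper-3.4.14/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

$ ./zkServer.sh status
Mode: standalone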

3.5 Kafka source code environment construction

Download the source package of the corresponding version from the official website: kafka.apache.org/downloads

After downloading and unpacking, the source tree still needs its dependency JARs to be imported. I import the project with IDEA; after importing, set the project's Gradle home to the Gradle installation configured earlier.

3.5.1 Importing the Kafka source code into IDEA

3.5.2 Modifying build.gradle

At this point the dependency JARs still cannot be resolved. The download mirror has to be switched to a domestic (Chinese) repository, otherwise downloads are very slow and frequently fail with "time out" errors.

Go to the Kafka source directory and modify the build.gradle file, adding the Aliyun repository configuration on top of the original configuration.

buildscript {
    repositories {
        maven {
            url 'http://maven.aliyun.com/nexus/content/groups/public/'
        }
        maven {
            url 'http://maven.aliyun.com/nexus/content/repositories/jcenter'
        }
    }
}
 
allprojects {
    repositories {
        maven {
            url 'http://maven.aliyun.com/nexus/content/groups/public/'
        }
        maven {
            url 'http://maven.aliyun.com/nexus/content/repositories/jcenter'
        }
    }
}

3.5.3 Code construction

You can trigger the build from IDEA's Gradle panel, or run the Gradle wrapper from the command line; both work.

./gradlew clean build -x test

Copy the JAR file (available from the link below) into the gradle/wrapper subdirectory of the Kafka source tree, then run the gradlew build command again to build the project.

Link: pan.baidu.com/s/1W6EHysWY… Extraction code: HPJ5
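
A minimal sketch of that step, assuming the JAR is the Gradle wrapper JAR and was saved to the Downloads folder (both are assumptions; adjust the source path to wherever you saved it):

# hypothetical download location -- adjust as needed
cp ~/Downloads/gradle-wrapper.jar gradle/wrapper/
# re-run the build
./gradlew clean build -x test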

Other Gradle commands:

# Build the jar packages
./gradlew jar

# Generate IDE project files, depending on whether you use IDEA or Eclipse
./gradlew idea
./gradlew eclipse

# Build the source package
./gradlew srcJar

# Build the Javadoc documentation
./gradlew aggregatedJavadoc

# Clean the build output
./gradlew clean

4. Code structure

4.1 Code installation package structure

  • Bin directory: Kafka tool scripts. Kafka-server-start and kafka-console-producer scripts are stored in this directory.

  • Checkstyle directory: Code specification, automated detection.

    What is Checkstyle? Debates about code formatting never end; there is no absolute right or wrong, and what matters is agreeing on a specification. Over time a set of conventions evolved, the most famous being the Google Style Guide, and Checkstyle is an automated tool built around such style rules that checks whether code conforms to the specification.

    The files in this directory define the specification of the project code format. We can see the checkstyle configuration and automatic code formatting configuration in build.gradle:

    Checkstyle configuration:

Scala automated code formatting configuration:

  • Clients directory: Stores Kafka client code, such as producer and consumer code.

  • The config directory: stores Kafka configuration files. The most important configuration file is server.properties.

  • Connect directory: holds the source code of the Kafka Connect component, which is used for real-time data transfer between Kafka and external systems.

  • Core directory: holds the broker-side code; all of the Kafka server-side code lives in this directory.

  • Docs directory: Kafka design documents and component-related structure diagrams.

  • Examples directory: Kafka sample directory.

  • The generator directory: Kafka's message code generator, which turns the JSON message/RPC definitions into Java classes; the build.gradle file wires it in through the processMessages task.

  • Gradle directory: Gradle scripts, dependency package definitions and other related files.

  • Jmh-benchmarks Directory: Kafka code microbenchmarks related classes.

    JMH, short for Java Microbenchmark Harness, is a toolkit specifically for code microbenchmarking. What is a microbenchmark? Simply put, it is benchmarking at the method level, with precision down to the microsecond. When you have located a hot method and want to further optimize its performance, you can use JMH to quantify the effect of the optimization. (A rough sketch of how to run the benchmarks in this module follows the directory list below.)

    Typical application scenarios of JMH are as follows:

    • You want to know exactly how long a method takes to execute and the correlation between the execution time and the input.
    • Compare the throughput of different implementations of interfaces under given conditions to find the optimal implementation.
  • Kafka-logs directory: directory generated by configuring log.dirs in the server.properties file.

  • Log4j-appender directory:

    A log4j appender that produces log messages to Kafka

    There is only one KafkaLog4jAppender class in this directory.

  • Raft directory: Raft consistency protocol.

  • Streams directory:

    Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in Kafka clusters.

    It provides a Kafka-based stream-processing library: developers call its classes directly and largely control how the application runs, which makes it easy to use and to debug.

    Kafka Streams is a library for building a stream handler, especially one whose input is a Kafka topic and whose output is another Kafka Topic (either by calling an external service, updating a database, or whatever). It allows you to do it in a distributed and fault-tolerant way.

  • Tests directory: This directory describes how to perform Kafka system integration and performance tests.

  • Tools directory: tool class module.

  • Vagrant directory: Describes how to run Kafka in the Vagrant virtual environment, with script files and documentation.

    Vagrant is a Ruby-based tool for creating and deploying virtualized development environments. It uses Oracle’s open source VirtualBox virtualization system and Chef to create automated virtual environments.
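
As mentioned under the jmh-benchmarks entry above, here is a rough sketch of running a single benchmark. Recent Kafka versions ship a jmh.sh helper script in the jmh-benchmarks directory, and the benchmark name below is only an example; if your checkout does not have the script, see jmh-benchmarks/README.md for the Gradle-based alternative:

# run one benchmark class by name; run with no argument to execute all benchmarks
./jmh-benchmarks/jmh.sh LRUCacheBenchmark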

4.2 Project Structure

The core directory is Kafka's core module. It contains the core classes for cluster management, partition management, storage management, replica management, consumer-group management, network communication, consumption management, and so on.

  • Admin package: the function of executing administrative commands;
  • API package: encapsulates request and response DTO objects;
  • Cluster package: cluster objects. For example, the Replica class represents a partition replica and the Partition class represents a partition.
  • Common package: common classes shared within the module;
  • Controller package: the KafkaController (KC) code. A cluster has only one leader (active) controller, which is responsible for partition management, replica management, and synchronizing cluster metadata across the cluster.
  • Coordinator package: holds the GroupCoordinator code for the consumer side and the TransactionCoordinator code for transactions. Reading this package, especially the consumer-side GroupCoordinator code, is key to understanding the design of the broker-side coordinator component.
  • Log package: Kafka logs, log segments, index files, etc. This package also encapsulates the implementation of log compaction and is an important source package.
  • Network package: encapsulates the code of the Kafka server network layer; in particular, SocketServer.scala implements Kafka's Reactor-style network model.
  • Consumer package: This package will be discarded later and replaced with the consumer-related classes in the Clients package.
  • Server package: As the name suggests, this is the main server code for Kafka. It contains many classes, and many key Kafka components are stored here, such as the state machine, Purgatory delay mechanism, and so on.
  • Tools package: tool class.

5. Environment verification

Let’s verify that the Kafka source environment is set up successfully.

5.1 First, create a resources directory under core/src/main and copy the log4j.properties configuration file from the config directory into it.
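
In shell form, run from the Kafka source root (log4j.properties lives in the config directory of the source tree):

mkdir -p core/src/main/resources
cp config/log4j.properties core/src/main/resources/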

5.2 Modifying the server.properties file in the config directory

log.dirs=/Users/Riemann/Code/framework-source-code-analysis/kafka-2.7.0-src/kafka-logs

The rest of the configuration in the server.properties file remains unchanged for now.

5.3 Configuring the kafka.Kafka entry class in IDEA

The specific configuration is shown in the figure below:
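
The screenshot is not reproduced here, so the key fields of the run configuration are sketched below; the values follow this article's paths and are assumptions, adjust them to your own environment:

Main class:              kafka.Kafka
Program arguments:       config/server.properties
Working directory:       the root of the Kafka source tree
Use classpath of module: core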

5.4 Starting Kafka Broker

If the startup is successful, the console output is normal and you can see the following output:
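
The exact log lines depend on your configuration, but a successful start normally ends with something like the following (the broker id may differ):

INFO [KafkaServer id=0] started (kafka.server.KafkaServer)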

5.5 The following exceptions may occur

5.5.1 Exception 1

log4j:WARN No appenders could be found for logger (kafka.utils.Log4jControllerRegistration$).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Add slf4j-log4j12-1.7.30.jar and log4j-1.2.17.jar to the project structure in IDEA, or add the corresponding dependencies to build.gradle.

Method 1: add the two JARs directly to the project in IDEA.

Method 2: add the dependencies to build.gradle:

compile group: 'log4j', name: 'log4j', version: '1.2.17'
compile group: 'org.slf4j', name: 'slf4j-api', version: '1.7.30'
compile group: 'org.slf4j', name: 'slf4j-log4j12', version: '1.7.30'

Add them under the core module's dependencies in the build.gradle file:

5.5.2 Exception 2

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

This warning means no SLF4J binding is on the classpath; adding the slf4j-log4j12 dependency from the previous step resolves it as well.

5.6 Sending and Consuming Messages

We now use Kafka's command-line scripts to verify the source environment built above.

First, go to the ${KAFKA_HOME}/bin directory and create a topic named topic_test with the kafka-topics.sh script.
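
The original screenshot is omitted here; a minimal sketch of the command for a local single-broker setup (the partition and replication-factor values are illustrative choices):

./kafka-topics.sh --bootstrap-server localhost:9092 --create --topic topic_test --partitions 1 --replication-factor 1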

We then start a command-line consumer with kafka-console-consumer.sh to consume topic_test:

./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic topic_test

Next, we start a command-line producer with kafka-console-producer.sh to send messages to topic_test.
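
A sketch of the producer command, assuming the broker listens on the default localhost:9092:

./kafka-console-producer.sh --broker-list localhost:9092 --topic topic_test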

When we type a message and press Enter, it is sent to the topic_test topic and shows up in the consumer.

The Kafka broker-side source code will be analyzed in future articles.