SpringBoot e-commerce project mall (35K+ stars): github.com/macrozheng/…

Abstract

Everyone knows the ELK log collection stack, but there is another one called EFK that many people have probably never heard of. The F stands for Fluentd, which provides log collection capabilities similar to Logstash but with less than a tenth of the memory footprint: it is very lightweight and performs well. This article introduces Fluentd in detail, focusing on using it to collect SpringBoot application logs. I hope it helps you!

Introduction to Fluentd

Fluentd is an open source log collector designed to provide users with a unified logging layer. Combined with Elasticsearch and Kibana, it forms the EFK log collection system. What is a unified logging layer? Take a look at the picture below!

Installation

In the article "Are you still logging in to the server to check logs? Isn't building a log collection system better!" we built an ELK log collection system. Here we will install Fluentd in the Docker environment as follows:

  • Download Fluentd Docker image;
docker pull fluent/fluentd:v1.10
  • Copy the default configuration file fluent.conf to the /mydata/fluentd/ directory; its contents are as follows:
<source>
  @type  forward
  @id    input1
  @label @mainstream
  port  24224
</source>

<filter **>
  @type stdout
</filter>

<label @mainstream>
  <match docker.**>
    @type file
    @id   output_docker1
    path         /fluentd/log/docker.*.log
    symlink_path /fluentd/log/docker.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   1m
    time_format       %Y%m%dT%H%M%S%z
  </match>
  <match **>
    @type file
    @id   output1
    path         /fluentd/log/data.*.log
    symlink_path /fluentd/log/data.log
    append       true
    time_slice_format %Y%m%d
    time_slice_wait   10m
    time_format       %Y%m%dT%H%M%S%z
  </match>
</label>
  • Run the Fluentd service, opening the four ports 24221–24224 to receive different types of logs:
docker run -p 24221:24221 -p 24222:24222 -p 24223:24223 -p 24224:24224 --name efk-fluentd \
-v /mydata/fluentd/log:/fluentd/log \
-v /mydata/fluentd/fluent.conf:/fluentd/etc/fluent.conf \
-d fluent/fluentd:v1.10
  • The first startup may fail; change the permissions on the log directory and restart the container:
chmod 777 /mydata/fluentd/log/
  • Enter the Fluentd container as the root user:
docker exec -it --user root efk-fluentd /bin/sh
  • Install the Elasticsearch plugin for Fluentd:
fluent-gem install fluent-plugin-elasticsearch
  • If you would rather install the whole EFK stack in one go with docker-compose, use the script below. Note that when the services are started with user: root, there is no need to change directory permissions!
version: '3'
services:
  elasticsearch:
    image: elasticsearch:6.4.0
    container_name: efk-elasticsearch
    user: root
    environment:
      - "cluster.name=elasticsearch" # set the cluster name to elasticsearch
      - "discovery.type=single-node" # start in single-node mode
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m" # set the JVM memory size
      - TZ=Asia/Shanghai
    volumes:
      - /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins # mount the plugin directory
      - /mydata/elasticsearch/data:/usr/share/elasticsearch/data # mount the data directory
    ports:
      - 9200:9200
      - 9300:9300
  kibana:
    image: kibana:6.4.0
    container_name: efk-kibana
    links:
      - elasticsearch:es # access elasticsearch via the "es" domain name
    depends_on:
      - elasticsearch # kibana starts after elasticsearch has started
    environment:
      - "elasticsearch.hosts=http://es:9200" # address used to access elasticsearch
      - TZ=Asia/Shanghai
    ports:
      - 5601:5601
  fluentd:
    image: fluent/fluentd:v1.10
    container_name: efk-fluentd
    user: root
    environment:
      - TZ=Asia/Shanghai
    volumes:
      - /mydata/fluentd/log:/fluentd/log
      - /mydata/fluentd/fluent.conf:/fluentd/etc/fluent.conf
    depends_on:
      - elasticsearch # fluentd starts after elasticsearch has started
    links:
      - elasticsearch:es # access elasticsearch via the "es" domain name
    ports:
      - 24221:24221
      - 24222:24222
      - 24223:24223
      - 24224:24224
  • Replace the original configuration file with the new fluent.conf and restart the Fluentd service; the new configuration file is shown below.

Fluentd configuration details

Next we explain the Fluentd configuration file: first the complete configuration, then a detailed look at its key points.

Complete configuration

<source>
  @type  tcp
  @id    debug-input
  port  24221
  tag debug
  <parse>
	@type json
  </parse>
</source>

<source>
  @type  tcp
  @id    error-input
  port  24222
  tag error
  <parse>
	@type json
  </parse>
</source>

<source>
  @type  tcp
  @id    business-input
  port  24223
  tag business
  <parse>
	@type json
  </parse>
</source>

<source>
  @type  tcp
  @id    record-input
  port  24224
  tag record
  <parse>
	@type json
  </parse>
</source>

<filter record>
  @type parser
  key_name message
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
  </parse>
</filter>

<match fluent.**>
  @type stdout
  output_type json
</match>

<match **>
  @type elasticsearch
  host 192.168.3.101
  port 9200
  type_name docker
  logstash_format true
  logstash_prefix docker-${tag}-logs
  logstash_dateformat %Y-%m-%d
  flush_interval 5s
  include_tag_key true
</match>

Configuration Essentials

<source>

The log collection source can be TCP, UDP, tail (file), forward (TCP + UDP), or HTTP.

Here we collect logs from TCP requests on port 24221 and tag them debug.

<source>
  @type  tcp
  @id    debug-input
  port  24221
  tag debug
  <parse>
	@type json
  </parse>
</source>
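A client feeding this source simply writes one JSON object per line over a plain TCP connection; the json parser above turns each line into a structured event. As a rough illustration (the helper names and the example host below are my own, not part of the project), the same thing can be done by hand:

```python
import json
import socket


def make_log_line(level: str, message: str) -> bytes:
    """Build one newline-delimited JSON log event, the format the
    tcp source with a json <parse> section expects."""
    event = {"level": level, "message": message}
    return (json.dumps(event) + "\n").encode("utf-8")


def send_debug_log(host: str, message: str, port: int = 24221) -> None:
    # One JSON object per line over a plain TCP connection.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(make_log_line("DEBUG", message))
```

For example, `send_debug_log("192.168.3.101", "hello fluentd")` (with the host adjusted to your server) should produce a debug-tagged event in Fluentd.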

<parse>

Define how raw data is parsed to convert logs to JSON.

For example, converting debug logs to JSON can be configured as follows.

<source>
  @type  tcp
  @id    debug-input
  port  24221
  tag debug
  <parse>
	@type json
  </parse>
</source>

<filter>

The collected logs can be processed in a number of ways, such as printing them to the console or parsing them.

Configuration to print all logs to the console:

<filter **>
  @type stdout
</filter>

For logs from the source tagged record, we parse the message attribute into JSON; without this, message would remain a plain string.

<filter record>
  @type parser
  key_name message
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
  </parse>
</filter>
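In plain terms, this filter takes the string stored under message, parses it as JSON, merges the resulting fields into the event (reserve_data true keeps the existing fields) and drops the message key itself (remove_key_name_field true). A minimal Python sketch of that transformation (an illustration of the behavior, not Fluentd's actual implementation):

```python
import json


def apply_record_filter(event: dict) -> dict:
    """Rough emulation of the <filter record> parser block with
    key_name message, reserve_data true, remove_key_name_field true."""
    raw = event.get("message")
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return event  # message is absent or not JSON: leave the event as-is
    if not isinstance(parsed, dict):
        return event
    merged = dict(event)   # reserve_data true: keep the original fields
    merged.pop("message")  # remove_key_name_field true: drop the raw string
    merged.update(parsed)  # merge the parsed JSON fields into the event
    return merged
```

For example, `apply_record_filter({"host": "app1", "message": '{"uri": "/order", "ms": 12}'})` yields `{"host": "app1", "uri": "/order", "ms": 12}`.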

<match>

Defines where collected logs are output: stdout (the console), a file, Elasticsearch, MongoDB, and so on.

logstash_format, logstash_prefix, and logstash_dateformat are used to control how log index names are generated. With the current configuration, debug log indexes are named in the form docker-debug-logs-2020-06-03. flush_interval controls the interval at which logs are flushed to Elasticsearch.

<match **>
  @type elasticsearch
  host 192.168.3.101
  port 9200
  type_name docker
  logstash_format true
  logstash_prefix docker-${tag}-logs
  logstash_dateformat %Y-%m-%d
  flush_interval 5s
  include_tag_key true
</match>
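Putting those three options together: the index name is the logstash_prefix (with ${tag} substituted by the event's tag) followed by the date rendered with logstash_dateformat. A small Python sketch of how the name comes out (an illustration only, not Fluentd's code):

```python
from datetime import date


def index_name(tag: str, day: date,
               prefix_template: str = "docker-${tag}-logs",
               dateformat: str = "%Y-%m-%d") -> str:
    """Mimic the logstash_prefix + logstash_dateformat naming scheme."""
    prefix = prefix_template.replace("${tag}", tag)
    return f"{prefix}-{day.strftime(dateformat)}"
```

For example, `index_name("debug", date(2020, 6, 3))` produces "docker-debug-logs-2020-06-03", matching the index format mentioned above.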

Replacing the Configuration File

Replace the original /mydata/fluentd/fluent.conf configuration file and restart the service; Fluentd can then start collecting logs.

docker restart efk-fluentd

Used with SpringBoot

Fluentd collects logs the same way Logstash does, over TCP ports, so we only need to change the Logstash address and ports in the Logback configuration file to point to Fluentd.

  • Modify the logback-spring.xml configuration file;
<!--DEBUG log output to LogStash-->
<appender name="LOG_STASH_DEBUG" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
    <destination>${LOG_STASH_HOST}:24221</destination>
</appender>

<!--ERROR log output to LogStash-->
<appender name="LOG_STASH_ERROR" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
    <destination>${LOG_STASH_HOST}:24222</destination>
</appender>

<!--Business log output to LogStash-->
<appender name="LOG_STASH_BUSINESS" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
    <destination>${LOG_STASH_HOST}:24223</destination>
</appender>

<!--Interface access log output to LogStash-->
<appender name="LOG_STASH_RECORD" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
    <destination>${LOG_STASH_HOST}:24224</destination>
</appender>
  • If your Fluentd is not deployed on the same server as the original Logstash, you also need to change the logstash.host property in the application-dev.yml configuration file:
logstash:
  host: localhost
  • Start and run our SpringBoot application.

View logs in Kibana

At this point, our EFK log collection system is complete; all that is left is to use it in Kibana.

  • Create an Index Pattern under Management -> Kibana -> Index Patterns; the Kibana service is available at http://192.168.3.101:5601

  • After creating the index patterns, you can see that the log collection works exactly the same as in the ELK system we built before.

Logstash vs Fluentd

Let's compare the two log collection tools across several aspects.

| Aspect | Logstash | Fluentd |
| --- | --- | --- |
| Memory footprint | About 1 GB at startup | About 60 MB at startup |
| CPU usage | Higher | Lower |
| Plugin support | Rich | Rich |
| General log parsing | Grok (regex-based) parsing | Regular-expression parsing |
| Specific log types | Supports mainstream formats such as JSON | Supports mainstream formats such as JSON |
| Data filtering | Supported | Supported |
| Buffered data sending | Via plugins | Via plugins |
| Runtime environment | JRuby implementation, requires a JVM | CRuby and C implementation, requires Ruby |
| Threading | Full multithreading support | Multithreading limited by the GIL |

References

Official documentation: docs.fluentd.org/

Project source code:

github.com/macrozheng/…

Official Account

The full set of mall project tutorials is being serialized; follow the official account to get them as soon as they are published.