Recently, a member of our group mentioned that his log pipeline runs from the client to Kafka, and then from Kafka to Elasticsearch. Switching from ES to Loki requires a tool that supports both Kafka and Loki plugins. He looked into today's mainstream options, Fluentd, Logstash, and Vector, to see which is reliable enough to meet his needs.

  • Fluentd

A CNCF-graduated cloud-native log collection client. It integrates well with Kubernetes, has a rich plugin ecosystem, and is backed by major vendors. The downside is that, being written in Ruby, its own resource consumption becomes a limiting factor under heavy log volume (FluentBit is recommended in that case).

  • Logstash

The veteran log collection and aggregation tool of the ELK stack. It is widely used and has abundant plugins. Its disadvantages are high overall resource consumption and low single-machine throughput when processing logs concurrently.

  • Vector

A newly open-sourced, lightweight log collection client with a high degree of product integration and low resource consumption (see the official performance report). The downside is that the product does not yet seem to have widespread best practices.

Below is a comparison from the performance tests Vector ran against each of these products:

Test              Vector      FluentBit   Fluentd     Logstash
TCP to Blackhole  86 MiB/s    64.4 MiB/s  27.7 MiB/s  40.6 MiB/s
File to TCP       76.7 MiB/s  35 MiB/s    26.1 MiB/s  3.1 MiB/s
Regex Parsing     13.2 MiB/s  20.5 MiB/s  2.6 MiB/s   4.6 MiB/s
TCP to HTTP       26.7 MiB/s  19.6 MiB/s  <1 MiB/s    2.7 MiB/s
TCP to TCP        69.9 MiB/s  67.1 MiB/s  3.9 MiB/s   10 MiB/s

When we need to ship logs from Kafka into Loki, what approaches can we use? Let's start with the simplest one 😄

Vector

Vector already has the Kafka source and the Loki sink built in, so we only need to download Vector and configure it to use them directly.

Installing Vector

curl --proto '=https' --tlsv1.2 -sSf https://sh.vector.dev | sh

Or you can just use the Docker image

docker pull timberio/vector:0.10.0-alpine
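If you go the Docker route, here is a minimal sketch of running the image with a local configuration file. It assumes the image reads its configuration from /etc/vector/vector.toml; the local file name is illustrative:

docker run -d \
  -v $(pwd)/vector.toml:/etc/vector/vector.toml \
  timberio/vector:0.10.0-alpine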

Vector configuration

[sources.in]
type = "kafka"
bootstrap_servers = "<kafka address>"
group_id = "<consumer group id>"
topics = ["^(prefix1|prefix2)-.+", "topic-1", "topic-2"]   # topic names, regex supported

[sinks.out]
type = "loki"
inputs = ["in"]                      # consume events from sources.in
endpoint = "http://<loki address>"
labels.key1 = "value"                # a custom static label
labels.key2 = "{{ event_field }}"    # a dynamic value taken from an event field
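Once the configuration is saved (the file name below is just an example), Vector can be started against it directly:

vector --config kafka-loki.toml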

More information about the Vector Loki sink can be found at: vector.dev/docs/refere…

Fluentd

Input – fluent-plugin-kafka

The fluent-plugin-kafka plugin is Fluentd's official plugin for working with Kafka, and it can be used in both the input and output stages. It is installed as follows:

gem install fluent-plugin-kafka

When used in the input stage, Fluentd acts as a Kafka consumer, fetching messages from the given topics and processing them. It is configured as follows:

<source>
  @type kafka

  brokers     <kafka broker addresses>
  topics      <comma-separated list of topics>
  format      <log format, default json; text | json | ltsv | msgpack are supported>
  message_key <log key, used only for text logs>
  add_prefix  <prefix added to the fluentd tag>
  add_suffix  <suffix added to the fluentd tag>
</source>

If you want to consume messages from different topics starting at specific offsets, you need a configuration like the following:

<source>
  @type kafka

  brokers <kafka broker addresses>
  format  <log format, default json; text | json | ltsv | msgpack are supported>
  <topic>
    topic     <single topic name>
    partition <partition number>
    offset    <offset to start consuming from>
  </topic>
  <topic>
    topic     <single topic name>
    partition <partition number>
    offset    <offset to start consuming from>
  </topic>
</source>

If you're familiar with Fluentd, you know that it routes its pipelines by tag name. By default, the Kafka input tags each message with its topic name. If you want to apply some global filtering, you can add a tag prefix or suffix (add_prefix / add_suffix) so that one pattern matches all topics, as in the sketch below.
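For instance, here is a minimal sketch of that idea. The prefix kafka and the record_transformer filter are illustrative, not part of the original setup:

<source>
  @type kafka
  brokers <kafka broker addresses>
  topics topic-1,topic-2
  add_prefix kafka         # messages are now tagged kafka.topic-1, kafka.topic-2
</source>

<filter kafka.**>          # one pattern matches every topic behind the prefix
  @type record_transformer
  <record>
    source kafka           # e.g. stamp a common field onto all Kafka logs
  </record>
</filter>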

Output – fluent-plugin-grafana-loki

fluent-plugin-grafana-loki is a plugin contributed by Grafana Labs for sending logs from Fluentd to Loki. I introduced it in an earlier article on Loki and Fluentd, so I won't go into much detail here.
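Like the Kafka plugin, it is distributed as a gem; if it is not already installed, installation should be a single command:

gem install fluent-plugin-grafana-loki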

The configuration below is copied directly from that earlier article. The main difference is the tag matching, as shown here:

<match your-kafka-topic>  # the fluentd tag here is the Kafka topic name
  @type loki
  @id loki.output
  url "http://loki:3100"
  <label>
    key1          # if your logs are in JSON format, you can extract
    key2          # fields as needed to use as your Loki labels
  </label>
  <buffer label>
    @type file
    path /var/log/fluentd-buffers/loki.buffer
    flush_mode interval
    flush_thread_count 4
    flush_interval 3s
    retry_type exponential_backoff
    retry_wait 2s
    retry_max_interval 60s
    retry_timeout 12h
    chunk_limit_size 8M
    total_limit_size 5G
    queued_chunks_limit_size 64
    overflow_action drop_oldest_chunk
  </buffer>
</match>
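Before starting Fluentd with the assembled pipeline, the configuration can be sanity-checked with a dry run (the path is illustrative):

fluentd --dry-run -c /etc/fluentd/fluent.conf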

Logstash

Input – logstash-input-kafka

logstash-input-kafka is the official Kafka consumer plugin provided by Elastic. The input-stage configuration is relatively simple:

input {
  kafka {
    bootstrap_servers => "<kafka broker address>"
    topics => ["<topic name>"]
    codec => "<log codec, default plain>"
    tags => ["<custom tags>"]
  }
}

More parameters are described at: www.elastic.co/guide/en/lo…

Output – logstash-output-loki

logstash-output-loki is likewise a Logstash plugin contributed by Grafana Labs, handling output to Loki. It is installed with the following command:

logstash-plugin install logstash-output-loki

Logstash's output to Loki takes only a few parameters. The main ones are as follows:

output {
  loki {
    url => "http://<loki address>/loki/api/v1/push"
    batch_size => 112640    # maximum size in bytes of a single push batch
    retries => 5
    min_delay => 3
    max_delay => 500
    message_field => "<the field logstash reads the log line from, default message>"
  }
}
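As with the others, it is worth testing the pipeline definition before starting it. The -t flag asks Logstash to only check the configuration and exit (the file name is illustrative):

logstash -t -f kafka-loki.conf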

Conclusion

None of the three tools above does any filtering or parsing here; they simply act as a pipeline dumping logs from Kafka into Loki. Real environments can be more complex and require further parsing of the logs. In my experience, though, Vector's configuration for shipping logs from Kafka to Loki is the most straightforward. Between Fluentd and Logstash, the overall difference comes down to which you find easier to use.


Follow the public account "Cloud Native Xiao Bai" and reply [Enter group] to join the Loki learning group.