Introduction to Fluent Bit

Fluent Bit, packaged as td-agent-bit, is an open source log processor and forwarder. It collects data such as metrics and logs from different sources, enriches that data with filters, and forwards it to multiple target systems through output plugins. Fluent Bit is also well suited to log collection in container environments such as Kubernetes.

Fluent Bit is written in C, which keeps its CPU and memory footprint small. It is a sub-project of Fluentd, which is a CNCF project.

Collecting WAF syslog from Chaitin SafeLine

SafeLine's syslog forwarding feature can be used to send WAF logs to other systems. Collectors that can receive syslog include:

  • rsyslog
  • filebeat
  • logstash
  • fluentd

Each of these collectors has its own advantages and disadvantages. This article covers only Fluent Bit.

If you are impatient, you can skip straight to the configuration and copy and paste it. That works fine: as long as your data source is the same as mine (see note 1), the configuration can be used as-is.

td-agent-bit.conf

td-agent-bit.conf is the main Fluent Bit configuration file. Here we use @INCLUDE to pull in our custom input configuration files:

[SERVICE]
    flush        5
    daemon       Off
    log_level    info
    parsers_file parsers.conf
    plugins_file plugins.conf
    http_server  Off
    http_listen  0.0.0.0
    http_port    2020
    storage.metrics on
@INCLUDE input-*.conf

input-syslog-waf.conf

As a matter of habit, I configure each data source in its own file. This keeps the main configuration file short and makes each data source easier to debug in isolation.

[INPUT]
    Name              tail
    Tag               waf
    Path              /data/syslog/host/10.2.34.11/*
    DB                /var/log/td-agent/waf.db
    Parser            waf-syslog
    Mem_Buf_Limit     24m
    Refresh_Interval  1

[FILTER]
    Name          nest
    Match         *
    Operation     lift
    Nested_under  message
    Remove_prefix Message

[OUTPUT]
    Name                es
    Match               waf*
    Host                10.2.53.2
    Port                9200
    HTTP_User           elastic
    HTTP_Passwd         mypassword
    Logstash_Format     on
    Logstash_Prefix     waf-threat
    Logstash_DateFormat %Y.%m.%d
    Trace_Output        on
    Trace_Error         on
    Retry_Limit         3
    Tag_key             tag

[OUTPUT]
    Name  stdout
    Match *
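To make the [FILTER] section more concrete, here is a minimal Python sketch of what a "lift" operation does to a record: the keys nested under message are moved up to the top level. This only illustrates the idea and is not Fluent Bit's implementation; the function name lift and the sample record are made up for this example.

```python
def lift(record, nested_under):
    """Move the keys of record[nested_under] up to the top level."""
    nested = record.get(nested_under)
    if not isinstance(nested, dict):
        return dict(record)  # nothing to lift, return an unchanged copy
    # Keep every other top-level key, then merge in the nested keys.
    lifted = {k: v for k, v in record.items() if k != nested_under}
    lifted.update(nested)
    return lifted


# Hypothetical record as it might look after the parser has decoded "message".
parsed = {"message": {"action": "allow", "src_port": 35928}}
print(lift(parsed, "message"))
```

After lifting, fields such as action and src_port sit at the top level of the record, which is the shape of the document that ends up in ES.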

parsers.conf

parsers.conf is the parser configuration file that ships with the package. I added the following section to it to parse the WAF logs:

[PARSER]
    Name        waf-syslog
    Format      regex
    Regex       (?<message>\{.*})
    Time_Key    timestamp
    Time_Format %Y-%m-%dT%H:%M:%S %z
    # Command       | Decoder      | Field   | Optional Action
    # ==============|==============|=========|================
    Decode_Field_As   escaped_utf8   message   do_next
    Decode_Field_As   json           message
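The parser's two main steps, capturing the JSON part of the line with a regex and then decoding it as JSON, can be sketched in Python as follows. This is a rough illustration under a few assumptions: the regex is rewritten in Python's named-group syntax, the escaped_utf8 decoding step is not reproduced, and the helper name parse_waf_line and the shortened sample line are made up for this example.

```python
import json
import re

# Python spelling of the parser's Regex: capture from the first "{" to the last "}".
MESSAGE_RE = re.compile(r"(?P<message>\{.*\})")


def parse_waf_line(line):
    """Return the JSON payload of one syslog line as a dict, or None if absent."""
    match = MESSAGE_RE.search(line)
    if match is None:
        return None
    # Rough equivalent of "Decode_Field_As json message".
    return json.loads(match.group("message"))


# Abbreviated syslog line; most fields are omitted for brevity.
sample = (
    "2021-09-21T23:59:24+08:00 0684c3476940 /mario/mario[1] "
    '{"action":"allow","dest_port":80,"timestamp":1632239964}'
)
record = parse_waf_line(sample)
print(record["action"], record["dest_port"])
```

The syslog prefix (timestamp, host name, program name) falls outside the capture group, so only the JSON body is kept.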

Raw data sample

The raw data is what SafeLine sends over syslog, as received by rsyslog without any filtering:

Because the data contains sensitive information, the domain names and IP addresses in it have been replaced with ***.

2021-09-21T23:59:24+08:00 0684c3476940 /mario/mario[1] {"action":"allow","attack_type":"none","body":"","cookie":"","country":"CN","decode_path":"","dest_ip":"10.2.120.250","dest_port":80,"event_id":"23756fd838ee47c8bf54d31766915a64","host":"***","location":"","method":"GET","module":"","node":"Chaitin-safeline","payload":"","protocol":"HTTP","province":"Beijing","reason":"whitelist","referer":"","req_header_raw":"GET /p/login/index HTTP/1.1\\r\\nX-Forwarded-Proto: HTTP\\r\\nHost: ***\\r\\nX-Forwarded-For: ***\\r\\nX-Forwarded-For: ***\\r\\nX-Real-IP: ***\\r\\nUser-Agent: Go-http-client/1.1\\r\\nAccept-Charset: utf-8\\r\\nAccept-Encoding: gzip\\r\\n\\r\\n","resp_body":"","resp_header_raw":"","resp_reason_phrase":"","resp_status_code":"","risk_level":"none","rule_id":"/1@time-1631893715","selector_id":"","session":"","src_ip":"***","src_port":35928,"timestamp":1632239964,"timestamp_human":"2021-09-21 23:59:24","urlpath":"/p/login/index","user_agent":"Go-http-client/1.1"}

Example of filtered data

The filtered data is output to ES, so the record below is shown in the format ES returns:

The raw sample above and the filtered record here are not the same event, but the processing applied is the same.

{"_index": "waf-threat-2021.09.22", "_type": "_doc", "_id": "KOxODnwBGLD5dr227ftA", "_version": 1, "_score": null, "_source": {"@timestamp": "2021-09-22T16:20:23.511Z", "action": "allow", "attack_type": "none", "body": "", "cookie": "", "country": "CN", "decode_path": "", "dest_ip": "10.2.120.250", "dest_port": 81, "event_id": "a8d64637368f4c65b9c24a62e7df7949", "host": "***", "location": "", "method": "GET", "module": "", "node": "Chaitin-safeline", "payload": "", "protocol": "HTTP", "province": "Beijing", "reason": "whitelist", "referer": "", "req_header_raw": "GET / HTTP/1.0\r\nHost: ***\r\nX-Forwarded-For: ***\r\nX-Forwarded-Proto: HTTPS\r\nX-Real-IP: ***\r\nConnection: close\r\nUser-Agent: Go-http-client/1.1\r\nAccept-Charset: utf-8\r\nAccept-Encoding: gzip\r\n\r\n", "resp_body": "", "resp_header_raw": "", "resp_reason_phrase": "", "resp_status_code": "", "risk_level": "none", "rule_id": "/1@time-1631893715", "selector_id": "", "session": "", "src_ip": "***", "src_port": 33754, "timestamp": 1632327623, "timestamp_human": "2021-09-23 00:20:23", "urlpath": "/", "user_agent": "Go-http-client/1.1"}, "fields": {"@timestamp": ["2021-09-22T16:20:23.511Z"]}, "sort": [1632327623511]}
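One detail that is easy to trip over is that the index date and timestamp_human can fall on different days. The Python sketch below shows how a daily index name can be derived from Logstash_Prefix, Logstash_DateFormat, and an epoch time. It assumes the date suffix is computed in UTC, and it uses the record's timestamp field purely for illustration (Fluent Bit works from the record's own time). The helper name logstash_index is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def logstash_index(prefix, epoch_seconds, date_format="%Y.%m.%d"):
    """Build a Logstash-style daily index name; assumes the date is taken in UTC."""
    utc_time = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{prefix}-{utc_time.strftime(date_format)}"


event_epoch = 1632327623  # "timestamp" value from the filtered sample

index_name = logstash_index("waf-threat", event_epoch)
print(index_name)

# The same instant rendered in UTC+8, the offset seen in the raw syslog line.
local_time = datetime.fromtimestamp(event_epoch, tz=timezone(timedelta(hours=8)))
print(local_time.strftime("%Y-%m-%d %H:%M:%S"))
```

Under that assumption, an event logged shortly after midnight local time (UTC+8) still lands in the previous day's index, which is worth remembering when searching ES by index pattern.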

ES screenshot

Fluent Bit key concepts

  • Details to be added

Events and records

Filter

Tag

Timestamp

Match

Structured messages


  1. For the raw data, see the "Raw data sample" section.