Original: XjjDog (WeChat official account ID: XjjDog). Welcome to share; please keep the source attribution.

"The sword's edge comes from grinding; the plum blossom's fragrance from bitter cold."

The poet Bai Juyi traveled down to Jiangnan in March. Seeing the peach blossoms along the road, emotion surged in his heart and he wrote this timeless line, expressing his yearning for beautiful things and the bitterness behind them. Forget it... I can't keep making this up.

The first half of the line is fine; it is causal. The second half, though, makes no sense; it is a classic case of reasoning backwards from effect to cause. Even "shit stinks because it came out of my ass" would be an improvement.

This is the difference between theory and practice: supposition alone cannot support the feel of reality.

In this article, we introduce a common, run-of-the-mill ELKB solution, along with a nice set of configuration files to reduce rework.

ELKkB

Not long ago, ELKB was just ELK. The Beats line was developed only in recent years, to replace collection components such as Flume. And to make the pipeline smoother and more scalable, a Kafka component is often added. So the whole thing looks something like this.

Just a quick comment on a few components.

1) filebeat. Used to collect logs. In our testing it is easy to use and consumes fewer resources than Flume. Its resource usage is not self-tuning, however, and some parameters need adjusting: Filebeat consumes both memory and CPU, so watch it carefully.

2) kafka. A popular message queue that provides storage and buffering in the log collection pipeline. Kafka can run into serious performance problems when it has too many topics, so categorize the logs you collect; to take it a step further, split them across separate Kafka clusters. Kafka's CPU requirements are low, while large memory and high-speed disks significantly improve its performance.

3) logstash. Mainly used for data filtering and reshaping. This component is greedy and consumes a lot of resources, so don't co-locate it with application processes. It is, however, a stateless compute node and can be scaled out as needed.

4) elasticsearch. Can store very large volumes of log data. Note that a single index should not grow too large; split indices by day or month depending on volume, which also makes them easy to delete.

5) kibana. The presentation component, tightly integrated with ES. xjjdog has a dedicated article on it: Your Wild Flowers, My Kibana.

The more components you choose, the more elegant the pipeline becomes. The addition of Kafka, in particular, makes both ends of the chain independently replaceable, which is rather magical. One path of evolution is: ELK -> ELKB -> ELKkB.

Practice journey

Log format

To wire our components together, we need some sample data. Nginx logs are the most common choice, since nginx has become the default load balancer for HTTP services.

First, you need to tidy up its log format, and I have a handy configuration here.

log_format  main  '$time_iso8601|$hostname|$remote_addr|$upstream_addr|$request_time|'
                  '$upstream_response_time|$upstream_connect_time|$status|$upstream_status|'
                  '$bytes_sent|$remote_user|$uri|$query_string|$http_user_agent|$http_referer|$scheme|'
                  '$request_method|$http_x_forwarded_for';

access_log logs/access.log main;

Eventually, the generated log looks something like this, with fairly complete content. Logs in this format are much more convenient to handle, whether by programs or by scripts.

2019-11-28T11:26:24+08:00|nginx100.server.ops.pro.dc|101.116.237.77|10.32.135.2:41015|0.000|0.060|0.062|200|200|13701||/api/exec|v=10&token=H8DH9Snx9877SDER5627|-|-|http|POST|112.40.255.152

The collector

Next, you need to configure the Filebeat component. As mentioned above, since it is deployed on business machines, its resource usage must be tightly controlled. The complete configuration file is available in the attachment.

For example, CPU resource limits.

max_procs: 1

Memory resource limits.

queue.spool:
  file:
    path: "${path.data}/spool.dat"
    size: 512MiB
    page_size: 32KiB
  write:
    buffer_size: 10MiB
    flush.timeout: 5s
    flush.events: 1024

In addition, you can add some additional fields.

fields:
  env: pro

Next, you need to configure Kafka. Since logs are usually high-volume and not individually critical, a replication factor above 2 adds little value and only increases recovery time.
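As a sketch of how the collector hands off to Kafka (broker addresses and the topic name here are invented placeholders), Filebeat's Kafka output section might look like this:

```yaml
output.kafka:
  # hypothetical broker addresses; replace with your own cluster
  hosts: ["kafka1:9092", "kafka2:9092"]
  # hypothetical topic; categorize logs into topics as discussed above
  topic: "nginx-access"
  partition.round_robin:
    reachable_only: true
  compression: gzip
  # ack from the leader only; logs rarely justify stronger guarantees
  required_acks: 1
```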

The filter

The Logstash configuration is probably the most confusing part, and it is where we will focus our attention, because this is where the nginx log above gets parsed into a JSON document that Elasticsearch can recognize.

Through the Input section, you can attach data sources. In our case, the data source is Kafka. If you have multiple Kafka clusters, or multiple data sources, they can all be defined here.
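As a rough sketch (the topic, group id, and broker addresses are made up for illustration), an Input section consuming from Kafka could look like:

```
input {
  kafka {
    bootstrap_servers => "kafka1:9092,kafka2:9092"
    topics            => ["nginx-access"]
    group_id          => "logstash-nginx"
    consumer_threads  => 2
  }
}
```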

You can then define data-cleaning actions in the Filter section. The syntax is awkward and the APIs are clunky, especially for date handling, and if the code is not formatted well, the nested layers get confusing. The scripting language is said to be Ruby.

Note that event is a built-in variable representing the current row of data, including some underlying properties. You can read values from it with the `get` method.

For example, get the message body, which is the raw line content.

body = event.get('message')

It is then parsed into the corresponding key/value pairs. The separator `|` is our nginx log separator. Feeling the urge to roll up your sleeves?

# @mymapper is an array of field names matching the nginx log_format order
reqhash = Hash[@mymapper.zip(body.split('|'))]

query_string = reqhash['query_string']

reqhash.delete('query_string')
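The zip trick is plain Ruby, so it is easy to try outside of Logstash. A minimal sketch (the field list is abbreviated and the sample line is invented for illustration):

```ruby
# An abbreviated field list; the real one mirrors the nginx log_format order.
mymapper = ['time_local', 'hostname', 'remote_addr', 'status', 'uri', 'query_string']

# An invented sample line in the same pipe-separated shape as our nginx log.
line = '2019-11-28T11:26:24+08:00|nginx100|101.116.237.77|200|/api/exec|v=10&token=abc'

# zip pairs each field name with the corresponding column; Hash[] builds the map.
reqhash = Hash[mymapper.zip(line.split('|'))]

puts reqhash['status']   # => 200
puts reqhash['uri']      # => /api/exec

# query_string is pulled out for separate parsing, then removed from the map.
query_string = reqhash.delete('query_string')
puts query_string        # => v=10&token=abc
```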

Date processing is also a heartbreaking journey.

time_local = event.get('time_local')
datetime = DateTime.strptime(time_local, '%Y-%m-%dT%H:%M:%S%z')
source_timestamp = datetime.to_time.to_i * 1000
source_date = datetime.strftime('%Y-%m-%d')
event.set('source_timestamp', source_timestamp)
event.set('source_date', source_date)
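The same conversion can be verified in plain Ruby, away from Logstash; the timestamp below is taken from the sample log line above:

```ruby
require 'date'

time_local = '2019-11-28T11:26:24+08:00'

# Parse the ISO 8601 timestamp produced by nginx's $time_iso8601.
datetime = DateTime.strptime(time_local, '%Y-%m-%dT%H:%M:%S%z')

# Epoch milliseconds, a form Elasticsearch date fields accept directly.
source_timestamp = datetime.to_time.to_i * 1000
puts source_timestamp   # => 1574911584000

# A plain date string, handy for routing into daily indices.
source_date = datetime.strftime('%Y-%m-%d')
puts source_date        # => 2019-11-28
```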

If you want to parse Query Param, you can do this too, but it’s still a bit tricky.

query_string = reqhash['query_string']
query_param = CGI.parse(query_string)
query_param.each { |key,value| query_param[key]=value.join() }
reqhash['query_param'] = query_param
buffer_map = LogStash::Event.new(reqhash)
event.append(buffer_map)
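CGI.parse comes from Ruby's standard library and can also be tried on its own; the query string below is the one from the sample log line:

```ruby
require 'cgi'

query_string = 'v=10&token=H8DH9Snx9877SDER5627'

# CGI.parse returns each value as an array, e.g. {"v"=>["10"], ...}
query_param = CGI.parse(query_string)

# Flatten the arrays into plain strings, as the filter above does.
query_param.each { |key, value| query_param[key] = value.join }

puts query_param['v']      # => 10
puts query_param['token']  # => H8DH9Snx9877SDER5627
```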

Where do all these odd functions come from? Logstash won't tell you; I dug them out of the Ruby documentation. Perhaps Logstash doesn't deign to tell us.

https://ruby-doc.org/core-2.5.1/

If your log format is oddly defined or deeply nested, be careful: parsing it is destined to take some effort.

Fortunately, Logstash has an output called stdout that lets you debug the pipeline in real time, so you can see the results visually. Consider it a test of getting the code right on the first try.
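As a sketch (the ES host and index prefix are made up), the Output section can write to stdout for debugging and to daily Elasticsearch indices at the same time:

```
output {
  stdout {
    codec => rubydebug    # pretty-print each event while debugging
  }
  elasticsearch {
    hosts => ["http://es1:9200"]
    # one index per day, which keeps indices small and easy to delete
    index => "nginx-access-%{+YYYY.MM.dd}"
  }
}
```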

End

Here are the configuration files from this setup, shared for your reference. The repository address (you can also click through from the original post):

https://github.com/xjjdog/elkb.cnf.pro

If you are not familiar with the theory, some earlier articles cover it, from the overall idea to the research: among so many monitoring components, there is always one for you.

In fact, the difficulty of implementing ELKB lies not in the design but in the integration. First, a bypass application of this kind must not affect the normal operation of the original services; it cannot become a burden. Second, setups like this usually involve a very large number of machines, so how to deploy and update them is a problem that needs real study.

As for the filtering part, whether it is Flume or Logstash, it remains a major headache.

xjjdog is a public account that keeps programmers from getting sidetracked. It focuses on infrastructure and Linux. Ten years of architecture, tens of billions of daily requests; come discuss the world of high concurrency and get a different taste of it. My personal WeChat is xjjdog0; feel free to add me as a friend for further discussion.