What makes a log analysis platform?


As service volume grows, the application servers generate hundreds of millions of log lines every day, and a single log file can reach several gigabytes. At that scale, the familiar Linux tools (cat, grep, awk) become increasingly inadequate. On top of the server logs, program error logs are scattered across different servers, which makes searching them cumbersome.


Pain points to be addressed:

1. A large number of logs of different types become a burden on operations staff and are inconvenient to manage;


2. A single log file is huge and cannot be analyzed by common text tools, making retrieval difficult;


3. Logs are spread across multiple servers; when a service failure occurs, you have to log in to each server in turn to check the logs.


To solve the above problems, we will build the log analysis platform step by step. The architecture diagram is as follows:

Architecture overview (the architecture is divided into five layers, from left to right):


The first layer, data acquisition layer

On the left is the business server cluster, where FileBeat is installed to collect logs and send the collected logs to the two Logstash services.


The second layer, data processing layer, data cache layer

The Logstash service formats the received logs and forwards them to the local Kafka Broker + ZooKeeper cluster.


Third layer, data forwarding layer

A single Logstash node pulls data from the Kafka Broker cluster in real time and forwards it to the ES DataNodes.


The fourth layer, persistent data storage

The ES DataNode writes the received data to disks and builds an index library.


The fifth layer, data retrieval, data display

The ES Master coordinates the ES cluster, while Kibana handles data retrieval requests and data display.


To save server resources, the author co-located several services that could otherwise run separately on the same hosts. Feel free to split and extend the architecture to suit your actual environment.


Let's get to work!


Operating system environment: CentOS Release 6.5


Role allocation for each server:

IP             Role                           Cluster
10.10.1.2      Business server + FileBeat     Business server cluster
10.10.1.30     Logstash + Kafka + ZooKeeper   Kafka Broker cluster
10.10.1.31     Logstash + Kafka + ZooKeeper   Kafka Broker cluster
10.10.1.32     Kafka + ZooKeeper              Kafka Broker cluster
10.10.1.50     Logstash                       Data forwarding
10.10.1.60     ES DataNode                    Elasticsearch cluster
10.10.1.90     ES DataNode                    Elasticsearch cluster
10.10.1.244    ES Master + Kibana             Elasticsearch cluster

Software package versions:

jdk-8u101-linux-x64.rpm
logstash-2.3.2.tar.gz
filebeat-1.2.3-x86_64.rpm
kafka_2.11-0.10.0.1.tgz
zookeeper-3.4.9.tar.gz
elasticsearch-2.3.4.rpm
kibana-4.5.3-linux-x64.tar.gz


1. Install and deploy the Elasticsearch cluster


Configure ES Master node 10.10.1.244


1. Install JDK 1.8 and elasticsearch-2.3.4


Oracle’s JDK download address: http://www.oracle.com/technetwork/java/javase/downloads/index.html

Elasticsearch: https://www.elastic.co/

# Setup commands
yum install jdk-8u101-linux-x64.rpm elasticsearch-2.3.4.rpm -y
# ES is installed in /usr/share/elasticsearch by default


2. System tuning and JVM tuning

# Configure the system-wide maximum number of open file descriptors
vim /etc/sysctl.conf
fs.file-max=65535

# Configure the per-process maximum number of open file descriptors
vim /etc/security/limits.conf
# End of file
*  soft  nofile  65535
*  hard  nofile  65535

# Configure JVM memory
vim /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=4g    # the machine has 8 GB of memory available
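After editing these files, the kernel parameter can be applied and the new limits checked. A quick verification sketch (run ulimit in a fresh login session so the limits.conf change takes effect):

sysctl -p              # apply the fs.file-max change
sysctl fs.file-max     # confirm the kernel-wide limit
ulimit -n              # confirm the per-process limit for the current session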


3. Write ES Master node configuration file

# /etc/elasticsearch/elasticsearch.yml
# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
cluster.name: bigdata
# ------------------------------------ Node ------------------------------------
node.name: server1
node.master: true
node.data: false
# ----------------------------------- Index ------------------------------------
index.number_of_shards: 5
index.number_of_replicas: 0
index.refresh_interval: 120s
# ----------------------------------- Paths ------------------------------------
path.data: /home/elk/data
path.logs: /var/log/elasticsearch/elasticsearch.log
# ----------------------------------- Memory -----------------------------------
bootstrap.mlockall: true
indices.fielddata.cache.size: 50mb
# ------------------------------ Network And HTTP ------------------------------
network.host: 0.0.0.0
http.port: 9200
# ----------------------------------- Translog ---------------------------------
index.translog.flush_threshold_ops: 50000
# ---------------------------------- Discovery ---------------------------------
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 200s
discovery.zen.fd.ping_timeout: 200s
discovery.zen.fd.ping.interval: 30s
discovery.zen.fd.ping.retries: 6
discovery.zen.ping.unicast.hosts: ["10.10.1.60:9300", "10.10.1.90:9300", "10.10.1.244:9300"]
discovery.zen.ping.multicast.enabled: false
# ------------------------------------ Merge ------------------------------------
indices.store.throttle.max_bytes_per_sec: 100mb


Note: if the paths specified by path.data and path.logs do not exist, you need to create them and grant the elasticsearch user permission on them. (The same applies to the ES DataNodes.)


4. Install the head, kopf, and bigdesk open-source plugins

There are two installation methods:

1. Use the plugin command of ES

# head
/usr/share/elasticsearch/bin/plugin install mobz/elasticsearch-head
# kopf
/usr/share/elasticsearch/bin/plugin install lmenezes/elasticsearch-kopf
# bigdesk
/usr/share/elasticsearch/bin/plugin install hlstudio/bigdesk

2. Download the source code package of the plug-in by yourself


Plugins installed with the plugin command end up under this path: /usr/share/elasticsearch/plugins


The mobz/elasticsearch-head argument is actually a GitHub path.

Its GitHub address is github.com/mobz/elasti… You can copy it into a browser and open it to find the plugin's source repository.


Knowing this, if you want other plugins, just search GitHub and you will find plenty. Pick one, take its user/repository path, and install it with the ES plugin command.


If the installation fails, manually download the plugin's source package, unpack it, and mv the entire directory into the ES plugin installation path:

/usr/share/elasticsearch/plugins/


So how do you access the installed plug-ins?

http://ES_server_ip:port/_plugin/plugin_name

Example:

http://127.0.0.1:9200/_plugin/head/

http://127.0.0.1:9200/_plugin/kopf/


At this point, the ES Master has been configured.


Configure ES DataNode 10.10.1.60


The installation and system tuning steps are the same as above. Plugins do not need to be installed, but the configuration file is different.


Writing configuration files

# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
cluster.name: bigdata
# ------------------------------------ Node ------------------------------------
node.name: server2
node.master: false
node.data: true
# ----------------------------------- Index ------------------------------------
index.number_of_shards: 5
index.number_of_replicas: 0
index.refresh_interval: 120s
# ----------------------------------- Paths ------------------------------------
path.data: /home/elk/data,/disk2/elk/data2
path.logs: /var/log/elasticsearch/elasticsearch.log
# ----------------------------------- Memory -----------------------------------
bootstrap.mlockall: true
indices.fielddata.cache.size: 50mb
# ------------------------------ Network And HTTP ------------------------------
network.host: 0.0.0.0
http.port: 9200
# ----------------------------------- Translog ---------------------------------
index.translog.flush_threshold_ops: 50000
# ---------------------------------- Discovery ---------------------------------
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 200s
discovery.zen.fd.ping_timeout: 200s
discovery.zen.fd.ping.interval: 30s
discovery.zen.fd.ping.retries: 6
discovery.zen.ping.unicast.hosts: ["10.10.1.244:9300"]
discovery.zen.ping.multicast.enabled: false
# ------------------------------------ Merge ------------------------------------
indices.store.throttle.max_bytes_per_sec: 100mb


10.10.1.60 is also ready.


Configure another ES DataNode 10.10.1.90


Writing configuration files

# ---------------------------------- Cluster -----------------------------------
# Use a descriptive name for your cluster:
cluster.name: bigdata
# ------------------------------------ Node ------------------------------------
node.name: server3
node.master: false
node.data: true
# ----------------------------------- Index ------------------------------------
index.number_of_shards: 5
index.number_of_replicas: 0
index.refresh_interval: 120s
# ----------------------------------- Paths ------------------------------------
path.data: /home/elk/single
path.logs: /var/log/elasticsearch/elasticsearch.log
# ----------------------------------- Memory -----------------------------------
bootstrap.mlockall: true
indices.fielddata.cache.size: 50mb
# ------------------------------ Network And HTTP ------------------------------
network.host: 0.0.0.0
http.port: 9200
# ----------------------------------- Translog ---------------------------------
index.translog.flush_threshold_ops: 50000
# ---------------------------------- Discovery ---------------------------------
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.timeout: 200s
discovery.zen.fd.ping_timeout: 200s
discovery.zen.fd.ping.interval: 30s
discovery.zen.fd.ping.retries: 6
discovery.zen.ping.unicast.hosts: ["10.10.1.244:9300"]
discovery.zen.ping.multicast.enabled: false
# ------------------------------------ Merge ------------------------------------
indices.store.throttle.max_bytes_per_sec: 100mb


5. Now start the Elasticsearch service on each of the three ES nodes

# 10.10.1.244
/etc/init.d/elasticsearch start
# 10.10.1.60
/etc/init.d/elasticsearch start
# 10.10.1.90
/etc/init.d/elasticsearch start
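Once all three nodes are up, the cluster state can be checked from any node with the cluster health API. A quick sketch, using the master node's address from the table above:

curl -XGET 'http://10.10.1.244:9200/_cluster/health?pretty'
# "number_of_nodes" should be 3 and "status" should be "green"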


6. Access the HEAD plug-in to view the cluster status



The Elasticsearch cluster setup is now complete.



2. Configure the ZooKeeper cluster at layer 2 in the architecture diagram


Configure the 10.10.1.30 node


1. Install and configure ZooKeeper

ZooKeeper's official website: http://zookeeper.apache.org/


# Install the JDK
rpm -ivh jdk-8u101-linux-x64.rpm
# Unpack ZooKeeper
tar xf zookeeper-3.4.9.tar.gz


Writing configuration files

# conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/u01/zookeeper/zookeeper-3.4.9/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
server.11=10.10.1.30:2888:3888
server.12=10.10.1.31:2888:3888
server.13=10.10.1.32:2888:3888
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
# autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
# autopurge.purgeInterval=1


Synchronize the configuration file to the other two nodes

Note: in a ZooKeeper cluster, every node uses the same configuration file, so you can simply sync it over without any further changes.

If you are not familiar with ZooKeeper, you can refer to: http://tchuairen.blog.51cto.com/3848118/1859494

scp zoo.cfg 10.10.1.31:/usr/local/zookeeper-3.4.9/conf/
scp zoo.cfg 10.10.1.32:/usr/local/zookeeper-3.4.9/conf/


2. Create the myid file

# 10.10.1.30
echo 11 > /usr/local/zookeeper-3.4.9/data/myid
# 10.10.1.31
echo 12 > /usr/local/zookeeper-3.4.9/data/myid
# 10.10.1.32
echo 13 > /usr/local/zookeeper-3.4.9/data/myid


3. Start the service & View the node status

# 10.10.1.30
bin/zkServer.sh start
bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader

# 10.10.1.31
bin/zkServer.sh start
bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower

# 10.10.1.32
bin/zkServer.sh start
bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
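Each node can also be queried directly with ZooKeeper's four-letter commands (this assumes nc/netcat is available on the host):

echo ruok | nc 10.10.1.30 2181    # a healthy node answers "imok"
echo stat | nc 10.10.1.31 2181    # prints the node's mode, connections and znode count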


The ZooKeeper cluster configuration is complete


3. Configure the Kafka Broker cluster at the second layer of the architecture diagram


Kafka website: http://kafka.apache.org/

If you are not familiar with Kafka, you can refer to: http://tchuairen.blog.51cto.com/3848118/1855090


Configure the 10.10.1.30 node

1. Install and configure Kafka

tar xf kafka_2.11-0.10.0.1.tgz


Writing configuration files

############################# Server Basics #############################
broker.id=1

############################# Socket Server Settings #############################
num.network.threads=3
# The number of threads doing disk I/O
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################
log.dirs=/usr/local/kafka/kafka_2.11-0.10.0.1/data
num.partitions=6
num.recovery.threads.per.data.dir=1

############################# Log Flush Policy #############################
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000

############################# Log Retention Policy #############################
log.retention.hours=60
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000

############################# Zookeeper #############################
zookeeper.connect=10.10.1.30:2181,10.10.1.31:2181,10.10.1.32:2181
zookeeper.connection.timeout.ms=6000


Note: the configuration files on the other two nodes are basically the same; only broker.id needs to be changed. It uniquely identifies a node, so it must not be duplicated, otherwise the nodes will conflict.


Synchronize the configuration file to the other two nodes

scp server.properties 10.10.1.31:/usr/local/kafka/kafka_2.11-0.10.0.1/config/
scp server.properties 10.10.1.32:/usr/local/kafka/kafka_2.11-0.10.0.1/config/

# Modify broker.id
# 10.10.1.31
broker.id=2
# 10.10.1.32
broker.id=3


2. Configure IP address resolution for the host name

vim /etc/hosts

10.10.1.30 server1
10.10.1.31 server2
10.10.1.32 server3


3. Start the service

bin/kafka-server-start.sh config/server.properties
# The other two nodes are started the same way
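To confirm that the brokers have registered with ZooKeeper, you can create a test topic and then list the topics. A sketch (the topic name logtest is only an example):

bin/kafka-topics.sh --create --zookeeper 10.10.1.30:2181 --replication-factor 2 --partitions 3 --topic logtest
bin/kafka-topics.sh --list --zookeeper 10.10.1.30:2181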


The Kafka+ZooKeeper cluster is configured


4. Configure the Logstash service at the second layer of the architecture diagram


Configure the 10.10.1.30 node


1. Install and configure logstash

# Unpack
tar xf logstash-2.3.2.tar.gz


Configure the GeoLiteCity database so that client IP addresses can be resolved to cities and displayed on a map.

Website address: http://dev.maxmind.com/geoip/legacy/geolite/

Download address: http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz


Unpack it:

gunzip GeoLiteCity.dat.gz



Writing configuration files

input {
    beats {
        port => 5044
        codec => "json"
    }
}

filter {
    if [type] == "nginxacclog" {
        geoip {
            source => "clientip"    # matches the client-address key in the log
            target => "geoip"
            database => "/usr/local/logstash/GeoLiteCity.dat"
            add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
            add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
        }
        mutate {
            convert => [ "[geoip][coordinates]", "float" ]
        }
    }
}

output {
    kafka {
        workers => 2
        bootstrap_servers => "10.10.1.30:9092,10.10.1.31:9092,10.10.1.32:9092"
        topic_id => "peiyinlog"
    }
}


2. Start the service

/usr/local/logstash/bin/logstash agent -f logstash_in_kafka.conf &
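Before backgrounding the agent, it is worth validating the file first; Logstash 2.x provides a --configtest flag for this:

/usr/local/logstash/bin/logstash agent -f logstash_in_kafka.conf --configtest
# should report "Configuration OK"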


The configuration of node 10.10.1.31 is the same as above (omitted).

The Logstash configuration at the second layer, the data processing layer, is complete



5. Configure the data collection layer: business servers + Filebeat


1. Customize Nginx log format

log_format json '{"@timestamp":"$time_iso8601",'
                '"slbip":"$remote_addr",'
                '"clientip":"$http_x_forwarded_for",'
                '"serverip":"$server_addr",'
                '"size":$body_bytes_sent,'
                '"responsetime":$request_time,'
                '"domain":"$host",'
                '"method":"$request_method",'
                '"requesturi":"$request_uri",'
                '"url":"$uri",'
                '"appversion":"$HTTP_APP_VERSION",'
                '"referer":"$http_referer",'
                '"agent":"$http_user_agent",'
                '"status":"$status",'
                '"devicecode":"$HTTP_HA"}';

# Call it in the virtual host configuration
access_log /alidata/log/nginx/access/access.log json;
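After changing the log format, validate the configuration and reload Nginx so that the json format takes effect (a sketch; use whatever reload mechanism manages Nginx on your hosts):

nginx -t           # check the configuration syntax
nginx -s reload    # reload the workers to pick up the new log_format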


2. Install Filebeat

Filebeat is also an Elastic product and can be downloaded from the Elastic website.

yum install filebeat-1.2.3-x86_64.rpm -y


3. Write the Filebeat configuration file

################### Filebeat Configuration Example #########################

############################# Filebeat ######################################
filebeat:
  prospectors:
    -
      paths:
        - /var/log/messages
      input_type: log
      document_type: messages
    -
      paths:
        - /alidata/log/nginx/access/access.log
      input_type: log
      document_type: nginxacclog
    -
      paths:
        - /alidata/www/logs/laravel.log
      input_type: log
      document_type: larlog
    -
      paths:
        - /alidata/www/logs/500_error.log
      input_type: log
      document_type: peiyinlar_500error
    -
      paths:
        - /alidata/www/logs/deposit.log
      input_type: log
      document_type: lar_deposit
    -
      paths:
        - /alidata/www/logs/call_error.log
      input_type: log
      document_type: call_error
    -
      paths:
        - /alidata/log/php/php-fpm.log.slow
      input_type: log
      document_type: phpslowlog
      multiline:
        pattern: '^[[:space:]]'
        negate: true
        match: after
  registry_file: /var/lib/filebeat/registry

############################# Output ##########################################
output:
  logstash:
    hosts: ["10.26.95.215:5044"]

############################# Shipper #########################################
shipper:
  name: "host_6"

############################# Logging #########################################
logging:
  files:
    rotateeverybytes: 10485760 # = 10MB


4. Start the service

/etc/init.d/filebeat start
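To confirm that Filebeat has picked up the files, inspect its registry (the path comes from registry_file in the configuration above); it records each harvested file and the offset already shipped:

cat /var/lib/filebeat/registry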


At the data collection layer, the Filebeat configuration is complete.


Log data from the business servers is now being shipped into the cache layer.



6. Configure the third layer in the architecture diagram: the data forwarding layer


The Logstash setup is already covered.


Write the Logstash configuration file

# kafka_to_es.conf
input {
    kafka {
        zk_connect => "10.10.1.30:2181,10.10.1.31:2181,10.10.1.32:2181"
        group_id => "logstash"
        topic_id => "peiyinlog"
        reset_beginning => false
        consumer_threads => 50
        decorate_events => true
    }
}

# Remove some unnecessary fields
filter {
    if [type] == "nginxacclog" {
        mutate {
            remove_field => ["slbip","kafka","domain","serverip","url","@version","offset","input_type","count","source","fields","beat.hostname","host","tags"]
        }
    }
}

output {
    if [type] == "nginxacclog" {
        # stdout { codec => rubydebug }
        elasticsearch {
            hosts => ["10.10.1.90:9200","10.10.1.60:9200"]
            index => "logstash-nginxacclog-%{+YYYY.MM.dd}"
            manage_template => true
            flush_size => 50000
            idle_flush_time => 10
            workers => 2
        }
    }
    if [type] == "messages" {
        elasticsearch {
            hosts => ["10.10.1.90:9200","10.10.1.60:9200"]
            index => "logstash-messages-%{+YYYY.MM.dd}"
            manage_template => true
            flush_size => 50000
            idle_flush_time => 30
            workers => 1
        }
    }
    if [type] == "larlog" {
        elasticsearch {
            hosts => ["10.10.1.90:9200","10.10.1.60:9200"]
            index => "logstash-larlog-%{+YYYY.MM.dd}"
            manage_template => true
            flush_size => 2000
            idle_flush_time => 10
        }
    }
    if [type] == "deposit" {
        elasticsearch {
            hosts => ["10.10.1.90:9200","10.10.1.60:9200"]
            index => "logstash-deposit-%{+YYYY.MM.dd}"
            manage_template => true
            flush_size => 2000
            idle_flush_time => 10
        }
    }
    if [type] == "phpslowlog" {
        elasticsearch {
            hosts => ["10.10.1.90:9200","10.10.1.60:9200"]
            index => "logstash-phpslowlog-%{+YYYY.MM.dd}"
            manage_template => true
            flush_size => 2000
            idle_flush_time => 10
        }
    }
}


Start the service

/usr/local/logstash/bin/logstash agent -f kafka_to_es.conf &


The data forwarding layer has been configured


At this point, data is being pulled out of Kafka and written to the ES DataNodes.


Log on to any Kafka host to see how the data is being cached and consumed.
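For example (a sketch; the topic name comes from the configs above), you can describe the topic and peek at the messages flowing through it:

# Describe the topic that the processing-layer Logstash writes to
bin/kafka-topics.sh --describe --zookeeper 10.10.1.30:2181 --topic peiyinlog

# Tail the messages currently buffered in the topic
bin/kafka-console-consumer.sh --zookeeper 10.10.1.30:2181 --topic peiyinlog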



7. Modify the ES index template configuration


Why this step? When Logstash writes data to ES, it automatically applies an index template. Let's take a look at it.
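For example, the default template that Logstash registers (named logstash in Logstash 2.x) can be fetched from any ES node:

curl -XGET 'http://10.10.1.244:9200/_template/logstash?pretty'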


This template is actually pretty good, but there is one parameter worth noting: "refresh_interval":"5s". It controls how often the index is refreshed; the more frequently the index refreshes, the closer to real time your searches are. Here it is 5 seconds. For logging scenarios we usually do not need that level of real-time visibility, so the interval can be increased appropriately to improve the write throughput of the ES index.


Upload a custom template

curl -XPUT http://10.10.1.244:9200/_template/logstash2 -d '
{
    "order": 1,
    "template": "logstash-*",
    "settings": {
        "index": {
            "refresh_interval": "120s"
        }
    },
    "mappings": {
        "_default_": {
            "_all": {
                "enabled": false
            }
        }
    }
}'
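A GET on the same endpoint confirms that the custom template has been stored:

curl -XGET 'http://10.10.1.244:9200/_template/logstash2?pretty'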


In this custom template I set order to a higher value than the default Logstash template, and it uses the same matching pattern (logstash-*), so its settings override those of the original Logstash template.

I’m just going to describe it briefly. To understand this in more detail, check out my ES tuning article.


8. Configure Kibana data display layer


10.10.1.244 node

Kibana is part of the ELK stack, also made by Elastic, and can be downloaded from the Elastic website.


Installation

# Unpack
tar xf kibana-4.5.3-linux-x64.tar.gz


Modifying a Configuration File

# vim kibana-4.5.3-linux-x64/config/kibana.yml

# Kibana is served by a back end server. This controls which port to use.
server.port: 5601

# The host to bind the server to.
server.host: "0.0.0.0"

# The Elasticsearch instance to use for all your queries.
elasticsearch.url: "http://10.10.1.244:9200"


Start the service
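A minimal way to start Kibana 4.x in the background (assuming the directory unpacked above):

cd kibana-4.5.3-linux-x64
nohup ./bin/kibana &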

Open your browser and visit http://10.10.1.244:5601/


Customise the Index pattern of Elasticsearch


By default, Kibana assumes you want to access Elasticsearch data imported via Logstash, so you can use the default logstash-* as your index pattern. The wildcard (*) matches any number of characters in the index name.


Select an index field that contains a timestamp (field type date) to use for time-based processing. Kibana will read the index mapping and list all fields of type date. If your index does not contain time-based data, turn off the "Index contains time-based events" option.


If a new index is generated periodically and the index name contains a timestamp, select the "Use event times to create index names" option and then choose the index pattern interval. This can improve search performance, because Kibana only searches the indices that fall within the time range you specify. It is especially useful when Logstash is used to export data to Elasticsearch.


Since our indices are named by date and split by day, the index pattern is set accordingly.

Data display

All done!