1. Introduction

ELK has been a widely used log retrieval platform for several years and remains one of the mainstream solutions in the industry. However, an ELK cluster involves many different services, which makes it hard for beginners to get started. I wrote this tutorial hoping to help beginners break through from 0 to 1: building a usable environment with the simplest configuration and the most minimal deployment scheme. Along the way, a newcomer can gradually form a preliminary understanding of the ELK technology stack and find a direction in the fog.

2. Introduction to the experimental environment

2.1 Architecture Diagram
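
The original diagram is not reproduced here; a rough text sketch of the data flow it depicts (matching the solution overview in 2.3 and the host addresses used throughout this article) is:

Nginx + Filebeat (host "hexo") --> Logstash (192.168.0.211:5400) --> Elasticsearch (192.168.0.212:9200)
                                                                               ^
                        Browser --> Kibana (192.168.0.213:5601) --------------/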

2.2 Cluster Version Information

  • Operating system Version: Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-80-generic x86_64)
  • Elasticsearch version: 7.13.4
  • Logstash version: 7.14
  • Kibana version: 7.14
  • Filebeat version: 7.13.4

2.3 Solution Overview

  1. Access logs generated by the Nginx server are sent to Logstash via Filebeat.
  2. Actions on the collected logs are defined in the Logstash configuration file and archived to ES.
  3. The user accesses Kibana through a browser and reads data from ES.
  4. All servers communicate with each other and reside on the same subnet.

2.4 Experimental Objectives

  1. Nginx, Filebeat, Logstash, ES, Kibana all work well.
  2. Each service can fulfill its task according to the most basic requirements.
  3. Logs can be accessed and retrieved normally in Kibana.

3. Setting up the environment

3.1 Nginx deployment

This experiment reuses the test environment of my Hexo website. Setting up the Nginx service itself is relatively simple and will not be described in detail; the one assumption that matters later is the access-log path, sketched below.
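
For reference, a minimal sketch of the relevant piece of Nginx configuration. This server block is hypothetical (the listen port and site root are assumptions, not taken from the original setup); the only detail this tutorial depends on is the access_log path, which Filebeat will tail in section 3.5.

server {
    listen 80;                                 # assumed port
    root /var/www/hexo;                        # assumed location of the Hexo static site

    # The path Filebeat is configured to collect in section 3.5
    access_log /var/log/nginx/hexo_access.log;
}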

3.2 Setting Up the Elasticsearch Service

Following the dependencies between the services, we first install ES, which does not depend on any of the others. Starting from the data store like this is common practice when building a cluster.

In this article, we install ES using Ubuntu's own APT package manager.

3.2.1 Installing Elasticsearch

Official documentation: www.elastic.co/guide/en/el…

# Method 1: install online using APT

# Install the Elasticsearch PGP key
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

sudo apt-get install apt-transport-https

# Add the Elasticsearch repository to the APT sources
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-7.x.list

# Install Elasticsearch. In this article, version 7.13.4 is installed
sudo apt-get update && sudo apt-get install elasticsearch


# Method 2: install using the deb package
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.13.4-amd64.deb
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.13.4-amd64.deb.sha512

# Verify file integrity
shasum -a 512 -c elasticsearch-7.13.4-amd64.deb.sha512

# Install
sudo dpkg -i elasticsearch-7.13.4-amd64.deb
3.2.2 Configuring Elasticsearch

After installing from the package, the default configuration files are stored in /etc/elasticsearch. For this most-minimal deployment we only adjust /etc/elasticsearch/elasticsearch.yml; a sample configuration follows:

cluster.name: rc-application
node.name: node-rc-1
path.data: /data/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 192.168.0.212
http.port: 9200
cluster.initial_master_nodes: ["node-rc-1"]
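
Note that path.data above points to a non-default directory (the package default is /var/lib/elasticsearch), so it must exist and be writable before the first start. A minimal preparation sketch, assuming the elasticsearch user created by the package:

sudo mkdir -p /data/elasticsearch
sudo chown -R elasticsearch:elasticsearch /data/elasticsearch
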
3.2.3 Starting Elasticsearch

On this Ubuntu system, a systemd unit is created automatically during installation, so the service is managed as follows:

# Enable start on boot
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service

# start/stop/restart work like any other system service
sudo systemctl start elasticsearch.service
sudo systemctl stop elasticsearch.service
sudo systemctl restart elasticsearch.service
3.2.4 Verifying Elasticsearch
  • Viewing Service Status

    root@ES:~# systemctl status elasticsearch.service
    ● elasticsearch.service - Elasticsearch
         Loaded: loaded (/lib/systemd/system/elasticsearch.service; disabled; vendor preset: enabled)
         Active: active (running) since Tue 2021-08-03 08:07:57 UTC; 25min ago
           Docs: https://www.elastic.co
       Main PID: 10100 (java)
          Tasks: 61 (limit: 9448)
         Memory: 4.2g
         CGroup: /system.slice/elasticsearch.service
                 ├─10100 /usr/share/elasticsearch/jdk/bin/java -Xshare:auto -Des.networkaddress.cache.ttl=60 -Des.ne>
                 └─10303 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

    Aug 03 08:07:25 ES systemd[1]: Starting Elasticsearch...
    Aug 03 08:07:57 ES systemd[1]: Started Elasticsearch.
  • Check port usage. ES uses port 9200 by default

    root@ES:~# netstat -tunpl | grep 9200
    tcp6       0      0 192.168.0.212:9200      :::*                    LISTEN      10100/java
  • Query the ES service status by running curl against the root path of the service

    root@ES:~# curl http://192.168.0.212:9200
    {
      "name" : "node-rc-1",
      "cluster_name" : "rc-application",
      "cluster_uuid" : "y_q0vv9vQ3K7ukSdl0fO7g",
      "version" : {
        "number" : "7.13.4",
        "build_flavor" : "default",
        "build_type" : "deb",
        "build_hash" : "c5f60e894ca0c61cdbae4f5a686d9f08bcefc942",
        "build_date" : "2021-07-14T18:33:36.673943207Z",
        "build_snapshot" : false,
        "lucene_version" : "8.8.2",
        "minimum_wire_compatibility_version" : "6.8.0",
        "minimum_index_compatibility_version" : "6.0.0-beta1"
      },
      "tagline" : "You Know, for Search"
    }

3.3 Installing Kibana

As with Elasticsearch, we install it from the official package repository.

3.3.1 Installation Process

Official documentation: www.elastic.co/guide/en/ki…

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update && sudo apt-get install kibana
3.3.2 Kibana Configuration File: /etc/kibana/kibana.yml

The configuration example is as follows:

server.port: 5601
server.host: "192.168.0.213"
server.name: "my-kibana"
elasticsearch.hosts: "http://192.168.0.212:9200"	# Address of the Elasticsearch service
logging.dest: /var/log/kibana/kibana.log
logging.verbose: false
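
logging.dest points to /var/log/kibana/kibana.log, and the package does not create that directory. A minimal preparation sketch, assuming the kibana user created by the package:

sudo mkdir -p /var/log/kibana
sudo chown -R kibana:kibana /var/log/kibana
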
3.3.3 Starting Kibana
root@Kibana:/etc/kibana# systemctl start kibana.service 
root@Kibana:/etc/kibana# systemctl status kibana.service
● kibana.service - Kibana
     Loaded: loaded (/etc/systemd/system/kibana.service; disabled; vendor preset: enabled)
     Active: active (running) since Tue 2021-08-03 08:49:27 UTC; 3s ago
       Docs: https://www.elastic.co
   Main PID: 9978 (node)
      Tasks: 18 (limit: 9448)
     Memory: 147.8M
     CGroup: /system.slice/kibana.service
             ├─9978 /usr/share/kibana/bin/../node/bin/node /usr/share/kibana/bin/../src/cli/dist --logging.dest=>
             └─9996 /usr/share/kibana/node/bin/node --preserve-symlinks-main --preserve-symlinks /usr/share/kiba>

Aug 03 08:49:27 Kibana systemd[1]: Started Kibana.
Now we verify Kibana.

Open the Kibana address http://192.168.0.213:5601 in a browser. If the page opens normally, the service is working.

3.4 Setting Up the Logstash Service

Of the services in an ELK cluster, I find Logstash the trickiest to debug. Let's take it step by step.

3.4.1 Installing Logstash

Install using APT, following the official documentation.

Official documentation: www.elastic.co/guide/en/lo…

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt-get update && sudo apt-get install logstash
3.4.2 configuration Logstash

Official documentation: www.elastic.co/guide/en/lo…

3.4.2.1 Configuring the Logstash Service: /etc/logstash/logstash.yml

Here is the configuration example:

path.data: /data/logstash	# Verify this path exists in advance and that the logstash user can read and write it
pipeline.workers: 2
pipeline.batch.size: 125
path.config: /etc/logstash/conf.d/*.conf	# This is the Logstash default, made explicit here
path.settings: /etc/logstash				# This is the Logstash default, made explicit here
config.test_and_exit: false
http.host: 192.168.0.211
http.port: 9600-9700
log.level: debug
path.logs: /var/log/logstash
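
As the comment on path.data says, the directory must exist and be writable by the logstash user before startup. A minimal preparation sketch, assuming the logstash user created by the package:

sudo mkdir -p /data/logstash
sudo chown -R logstash:logstash /data/logstash
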
3.4.2.2 Configuring the Logstash Pipeline: /etc/logstash/conf.d/test.conf

Let’s start with a simple configuration to test if Logstash can push content to ES properly.

input { stdin { } }

output {
  elasticsearch { hosts => ["192.168.0.212:9200"] }
  stdout { codec => rubydebug }
}
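
As an optional sanity check, Logstash can parse the pipeline file and exit without actually starting (-t is shorthand for the --config.test_and_exit flag):

# Parse the pipeline configuration, report errors, and exit
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/test.conf -t
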
3.4.3 Starting and Testing Logstash
  1. Start Logstash from the command line:

    /usr/share/logstash/bin/logstash  -f /etc/logstash/conf.d/test.conf 
  2. Type into stdin and watch the Logstash output. If you enter "this is a test", output like the following appears:

    {
        "@timestamp" => 2021-08-03T10:33:10.584Z,
          "@version" => "1",
           "message" => "this is a test",
              "host" => "Logstash"
    }
  3. Query ES to check the content. We can see that ES has stored the test message:

    # Check the ES indices
    curl -s "192.168.0.212:9200/_cat/indices?v"
    health status index                            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   logstash-2021.08.03-000001       YIqdbC81Tq2zIulxfFUI4w   1   1          3            0     13.4kb         13.4kb
    green  open   .apm-custom-link                 NnAWZ3THRWiHqJf2_B830Q   1   0          0            0       208b           208b
    green  open   .apm-agent-configuration         YOjjGNX9SaacWHnJshD6Sw   1   0          0            0       208b           208b
    green  open   .kibana-event-log-7.13.4-000001  _uadzi-GSBmRc9inJd1eJg   1   0          1            0      5.6kb          5.6kb
    green  open   .kibana_7.13.4_001               Lxjp8ppnrrsdeo8-1ape7a   1   0         18            0      2.1mb          2.1mb
    green  open   .kibana_task_manager_7.13.4_001  yotlqk02SH6P6xISCZZw-g   1   0         10         2062    270.3kb        270.3kb

    # A new index, logstash-2021.08.03-000001, has appeared; search it
    curl -XGET '192.168.0.212:9200/logstash-2021.08.03-000001/_doc/_search/?pretty'
    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 3,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "logstash-2021.08.03-000001",
            "_type" : "_doc",
            "_id" : "5_yTC3sB-S8uwXpkCUFc",
            "_score" : 1.0,
            "_source" : {
              "@timestamp" : "2021-08-03T10:33:10.584Z",
              "@version" : "1",
              "message" : "this is a test",
              "host" : "Logstash"
            }
          }
        ]
      }
    }
  4. After confirming that the Logstash process works properly, exit it with Ctrl+C. We will wire it up to Filebeat later.

3.5 Installing Filebeat for the Nginx Server

ELK (Elasticsearch, Logstash, Kibana) has now been deployed through the steps above. In common practice, however, Logstash is rarely installed directly on business machines because it needs a Java runtime to run. The mainstream approach is to install the lightweight Filebeat on the business machine to ship the logs.

3.5.1 Installing Filebeat

Official documentation: www.elastic.co/guide/en/be…

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.13.4-amd64.deb
sudo dpkg -i filebeat-7.13.4-amd64.deb

# After the installation is complete, a systemd unit is created automatically
root@hexo:/tmp# systemctl status filebeat
● filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
     Loaded: loaded (/lib/systemd/system/filebeat.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: https://www.elastic.co/beats/filebeat
3.5.2 Configuring Filebeat

Taking collection of the Nginx access.log as the example, we use the following configuration:

# /etc/filebeat/filebeat.yml

# Configure Filebeat's own log output
logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
  
# Filebeat log collection and output configuration
filebeat.inputs:
- type: log
  enabled: true
  paths:
   - /var/log/nginx/hexo_access.log

setup.template.settings:
  index.number_of_shards: 1

output.logstash:
  # Address and port of the Logstash server; we will configure the matching beats input on the Logstash side in section 4.1
  hosts: ["192.168.0.211:5400"]
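
Filebeat also ships with built-in self-checks that are handy at this point:

# Validate the configuration file
sudo filebeat test config

# Test connectivity to the configured output (expected to fail for now,
# since the Logstash beats input is only set up in section 4.1)
sudo filebeat test output
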
3.5.3 Starting Filebeat

Filebeat can be started with systemctl start filebeat.service. However, since the Logstash side of the connection is not in place yet, Filebeat will report an error:

2021-08-03T22:12:45.795+0800	ERROR	[publisher_pipeline_output]	pipeline/output.go:154	Failed to connect to backoff(async(tcp://192.168.0.211:5400)): dial tcp 192.168.0.211:5400: connect: connection refused

4. Joint Debugging

Through the steps above, we have finished building a typical ELK + Filebeat + Nginx business cluster. Next we debug the services together so that ELK can do its real job: log retrieval and statistics.

4.1 Connecting Filebeat and Logstash

As the architecture diagram shows, Filebeat sends the logs to Logstash, which then forwards them to ES. Let's focus on the Logstash side.

4.1.1 Modifying the Logstash Configuration
  1. Delete the test configuration /etc/logstash/conf.d/test.conf used earlier.

  2. Create a new configuration /etc/logstash/conf.d/nginx-es.conf:

    input {
            beats {
                    host => "0.0.0.0"
                    port => 5400
            }
    }

    output {
            elasticsearch {
                    hosts => ["192.168.0.212:9200"]
                    index => "rc_index_pattern-%{+YYYY.MM.dd}"
            }
    }
  3. Start the Logstash service with systemctl start logstash.service (see the quick check after this list).
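
Before moving on, it is worth confirming that Logstash is actually listening on the beats port (5400, as configured above):

# The beats input should now show up as a listening TCP socket
netstat -tunpl | grep 5400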

4.1.2 Verifying the Result
  1. Use a browser to access the Nginx service so that the access log produces content.

  2. The Filebeat log in /var/log/filebeat shows output like the following:

    2021-08-06T16:56:02.801+0800	INFO	[registrar]	registrar/registrar.go:109	States Loaded from registrar: 1
    2021-08-06T16:56:02.801+0800	INFO	[crawler]	beater/crawler.go:71	Loading Inputs: 1
    2021-08-06T16:56:02.802+0800	INFO	log/input.go:157	Configured paths: [/var/log/nginx/hexo_access.log]
    2021-08-06T16:56:02.802+0800	INFO	[crawler]	beater/crawler.go:141	Starting input (ID: …)
    2021-08-06T16:56:02.802+0800	INFO	[crawler]	beater/crawler.go:108	Loading and starting Inputs completed. Enabled inputs: 1
    2021-08-06T16:56:02.802+0800	INFO	log/harvester.go:302	Harvester started for file: /var/log/nginx/hexo_access.log
    2021-08-06T16:56:03.803+0800	INFO	[publisher_pipeline_output]	pipeline/output.go:143	Connecting to backoff(async(tcp://192.168.0.211:5400))
    2021-08-06T16:56:03.803+0800	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
    2021-08-06T16:56:03.803+0800	INFO	[publisher]	pipeline/retry.go:223	  done
    2021-08-06T16:56:03.803+0800	INFO	[publisher_pipeline_output]	pipeline/output.go:151	Connection to backoff(async(tcp://192.168.0.211:5400)) established

    The /var/log/nginx/hexo_access.log log is being pushed to Logstash successfully.

  3. The index rc_index_pattern-2021.08.06 has been created automatically in ES and contains the Nginx log content:

    rondo@ES:~$ curl -XGET '192.168.0.212:9200/rc_index_pattern-2021.08.06/_doc/_search/?pretty'
    {
      "took" : 728,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 76,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "rc_index_pattern-2021.08.06",
            "_type" : "_doc",
            "_id" : "0_yngnsb-S8uwXpkwkJF",
            "_score" : 1.0,
            "_source" : {
              "host" : {
                "name" : "hexo"
              },
              "@timestamp" : "2021-08-06T08:50:04.192Z",
              "input" : {
                "type" : "log"
              },
              "tags" : [
                "beats_input_codec_plain_applied"
              ],
              "message" : "202.105.107.186 - - [06/Aug/2021:16:49:57 +0800] \"GET /css/main.css HTTP/1.1\" 200 56117 \"http://192.168.0.125:80/\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36\"",
              "@version" : "1",
              "log" : {
                "offset" : 197,
                "file" : {
                  "path" : "/var/log/nginx/hexo_access.log"
                }
              },
              "agent" : {
                "id" : "d2f43da1-5024-4000-9251-0bcc8fc10697",
                "type" : "filebeat",
                "version" : "7.13.4",
                "hostname" : "hexo",
                "ephemeral_id" : "77089197-d22d-497f-a064-76fbb2e369b3",
                "name" : "hexo"
              },
              "ecs" : {
                "version" : "1.8.0"
              }
            }
          },
          ......

4.2 Kibana Setup

  1. Click the ≡ icon in the upper left corner of the page and click Stack Management at the bottom of the pop-up menu

  2. From the Index Patterns menu, click Create Index Pattern

  3. The page lists the existing indices. Type a pattern into the input box to match the index you want, then click Next step.

  4. Choose the timestamp field the index pattern should use, then create the index pattern.

  5. After the index Pattern is created, click the menu bar in the upper left corner to enter Discover

  6. You can view the index by selecting the appropriate time range in the upper left corner of the page.

  7. Click the > button to the left of each message to view the message details, which will include details of the Nginx log.
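
As a quick retrieval test, the search bar at the top of Discover accepts KQL (Kibana Query Language). For example, a hypothetical filter on the fields we saw indexed earlier:

message : "GET" and agent.name : "hexo"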

5. Summary

After this series of operations, we have an ELK cluster and have simulated a realistic production scenario, following the path from business log (Nginx) generation to storage in ES: a closed loop for a log platform. But this is only an embryonic form. We will keep optimizing this cluster in future posts, covering configuration tuning of the individual services, horizontal scaling, and other scenarios. Stay tuned.

Day1 — Build a minimal ELK cluster and collect Nginx logs