An overview of ELK


Log processing plays an important role in system infrastructure, and ELK is a commonly used log processing platform. ELK is short for ElasticSearch, Logstash, and Kibana, the three core components. As the stack evolved and components were added, the name ELKF also emerged, where the F stands for FileBeat. So when we say ELK, we generally mean the whole family of associated components.

Why collect logs?

When our system is not very complex, for example a single-machine architecture such as a Spring Boot project, we can use logback to rotate logs and record the entire service's runtime output in a file for later troubleshooting. But this approach has several problems:

1. The logs sit on the server, but it is the developers who need to read them; server access is usually managed separately, so some teams build a log-download feature for developers' convenience. Logs, however, are generated in real time, and viewing them by downloading files is not real-time.

2. In a distributed system, multiple services may participate in a single request, and the log files are scattered across servers. Logs therefore need to be centralized to make troubleshooting practical.

ELK provides such a solution. The overall idea: Logstash collects the logs and ships them to ElasticSearch, an efficient search engine; Kibana, the official UI component, then reads the data from ElasticSearch for display.
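As a minimal sketch of that pipeline (the port, ES address, and index name here are illustrative, not from this deployment), a Logstash configuration could look like:

input {
  beats {
    port => 5044                          # Filebeat ships logs to this port
  }
}
output {
  elasticsearch {
    hosts => ["http://10.90.x.x:9200"]    # the ES cluster address
    index => "app-logs-%{+YYYY.MM.dd}"    # one index per day
  }
}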

Next we set up an ELK environment, deploying everything with Docker, collect the logs of K8S containers into ES, process the logs with Logstash, and view them on the Kibana page. Below I summarize the key points.

Resource planning

ES cluster: 3 or more nodes. Logstash: 1 (can share a machine with Kibana).

Since JVM parameters are fixed at container startup when deploying with Docker (they can be changed by unconventional means, but this is generally not recommended), we need to plan resources in advance.

For a system producing around 100 million log entries per day, three nodes are recommended, each with 4 cores, 8 GB of memory, and a 100 GB disk, since the data volume is not large.

One very important configuration: the ES JVM maximum heap should not exceed half of the machine's memory (and the official advice is not to set it to 64 GB even if the physical machine has 128 GB).
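For example, for the 8 GB nodes planned above, half of memory means a 4 GB heap, passed at container start (the full docker run command appears later; Xms and Xmx should be set to the same value):

-e ES_JAVA_OPTS="-Xms4g -Xmx4g"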

It is recommended that ES be deployed on its own machines, with no other services competing for resources. Logstash is also deployed independently, and can share a machine with Kibana if server resources allow.

Install the Docker service in advance

ElasticSearch deployment

ElasticSearch can be scaled out easily using the official image, and clustering improves the system's data processing capacity. We deploy three nodes.

For the image address of each version, see the official Elastic Docker registry: https://www.docker.elastic.co/

Server Configuration

Pull the image:

docker pull elasticsearch:7.5.0

mkdir -p /data/es/data
mkdir -p /data/es/config

If ES fails to start with the error "max virtual memory areas vm.max_map_count [65530] is too low", raise the limit:

vm.max_map_count=262144   # add this line to /etc/sysctl.conf
sysctl -p

YAML configuration file and service startup

Create the configuration file on each node:

cd /data/es/config
vim es.yml

Modify the node name (node.name) and the publish address (network.publish_host) for each node; the other settings are identical on every node.
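For example, node-b's es.yml would differ from node-a's only in these two lines (address illustrative):

node.name: node-b
network.publish_host: 10.90.x.x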

Run the start command, adjusting the heap size and file mappings as required:

docker run -e ES_JAVA_OPTS="-Xms8g -Xmx8g" -d -p 9200:9200 -p 9300:9300 -v /data/es/config/es.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /data/es/data:/usr/share/elasticsearch/data --name es --restart=always elasticsearch:7.5.0

After the cluster starts properly, you can view the following statistics:

curl -XGET 'http://10.90.x.x:9200/_cluster/stats?pretty'
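You can also check overall cluster health; a status of green means all primary and replica shards are allocated:

curl -XGET 'http://10.90.x.x:9200/_cluster/health?pretty'

The full es.yml used here is as follows.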

# cluster name

cluster.name: es-cluster

# node name

node.name: node-a

# Whether the node is eligible to be elected master

node.master: true

# Whether to store data

node.data: true

# Maximum number of nodes allowed to share this machine's data path

node.max_local_storage_nodes: 10

# Bind address and publish address

network.host: 0.0.0.0

network.publish_host: 10.90.x.x

# port

http.port: 9200

# Communication port between internal nodes

transport.tcp.port: 9300

http.cors.enabled: true

http.cors.allow-origin: "*"

# New in es7.x: the addresses of the master-eligible nodes, which can be elected master after the service starts

discovery.seed_hosts: ["10.90.x.x:9300", "10.90.x.x:9300", "10.90.x.x:9300"]

# New in es7.x: required when bootstrapping a brand-new cluster so the initial master can be elected

cluster.initial_master_nodes: ["node-a"]

# Data storage path

path.data: /usr/share/elasticsearch/data

# Log storage path

path.logs: /usr/share/elasticsearch/data/logs



# Optimize configuration



# fielddata cache; unbounded by default

indices.fielddata.cache.size: 30%

# fielddata circuit breaker, default 40%

# indices.breaker.fielddata.limit: 40%



# query cache, defaults to 10% of the heap

# indices.queries.cache.size: 10%



# parent (total) circuit breaker, default 70%

# indices.breaker.total.limit: 70%



# request circuit breaker (e.g. aggregation data), default 60%

# indices.breaker.request.limit: 1%



# write thread pool queue size; increase it if write-rejection exceptions occur (commented out by default)

thread_pool.write.queue_size: 1000



# Enable security authentication; leave these commented out on the first install

xpack.security.enabled: true

xpack.security.transport.ssl.enabled: true

xpack.license.self_generated.type: basic

xpack.security.transport.ssl.verification_mode: certificate

xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/data/elastic-certificates.p12

xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/data/elastic-certificates.p12



# Enable monitoring data collection so that cluster monitoring can be viewed in Kibana

xpack.monitoring.collection.enabled: true

# Monitoring data retention; this parameter has no effect with the basic (free) license

xpack.monitoring.history.duration: 1d


Other configuration

# Maximum number of shards of a single index allowed on each node; if an index needs 3 shards on a node, a limit of 2 will cause an allocation error

index.routing.allocation.total_shards_per_node: 2



# Shard and replica settings; note that 5 shards was the default before 7.0, while 7.x defaults to 1 shard and 1 replica

index.number_of_shards: 5

index.number_of_replicas: 1

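For reference, shard and replica counts can also be set per index at creation time instead of globally; a sketch with a hypothetical index name:

curl -XPUT 'http://10.90.x.x:9200/app-logs' -H 'Content-Type: application/json' -d '{"settings": {"number_of_shards": 5, "number_of_replicas": 1}}'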

Once the service is deployed, add security authentication so that ES is not left exposed.

Configuring Security Authentication

Enter the container to execute the commands; you will set two passwords along the way.

1. bin/elasticsearch-certutil ca

You will be prompted for an output file name, say elastic-stack-ca.p12, and then for a password; the CA file is then created.

2. bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12

Run the command against the CA file you just created.

Enter the CA password (the one set for elastic-stack-ca.p12), then an output file name, elastic-certificates.p12, and a password for it.

You now have two files:

elastic-certificates.p12

elastic-stack-ca.p12


3. Run two more commands, entering the password from step 2:

bin/elasticsearch-keystore add xpack.security.transport.ssl.keystore.secure_password
bin/elasticsearch-keystore add xpack.security.transport.ssl.truststore.secure_password

The passwords are saved into the elasticsearch.keystore file in the config directory; if that file has been deleted, the two commands above will prompt you to create it.
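To verify that the two entries were stored, you can list the keystore contents:

bin/elasticsearch-keystore list

It should print the keystore.secure_password and truststore.secure_password entries added above.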

4. Copy elasticsearch.keystore and the elastic-certificates.p12 file

Copy elasticsearch.keystore into each container, and place elastic-certificates.p12 in the host directory that is mapped into each container.

scp -r elastic-certificates.p12 [email protected]:/data/es/data

docker cp /data/es/data/elasticsearch.keystore es-cluster:/usr/share/elasticsearch/config/
docker cp /data/es/data/elasticsearch.keystore es:/usr/share/elasticsearch/config/

5. Modify the configuration file and grant the certificate file 777 permissions

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.license.self_generated.type: basic
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /usr/share/elasticsearch/data/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: /usr/share/elasticsearch/data/elastic-certificates.p12

At this point every ES node has elastic-certificates.p12 available (with 777 permissions) and referenced in its configuration file; elasticsearch.keystore has likewise been copied into every ES container, but it is not volume-mapped.
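A sketch of the permission change, run on each host against the mapped data directory:

chmod 777 /data/es/data/elastic-certificates.p12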

Next, restart each ES.

6. Initialize the password

After the ES restart completes, a password is required for access.

Enter the container of the master node to initialize the password

bin/elasticsearch-setup-passwords interactive

You need to initialize the passwords for all 6 built-in accounts, and these are the accounts you will use to log in later. For convenience you can of course set them all to the same password.

Then restart each ES (the passwords only need to be initialized on the master node).
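Once the passwords are set, anonymous requests are rejected; authenticate with any of the initialized accounts, for example:

curl -u elastic 'http://10.90.x.x:9200/_cluster/health?pretty'

curl will prompt for the elastic account's password.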

7. Configure the password for Kibana

Due to the added security authentication, Kibana needs to be configured to connect to ES

elasticsearch.username: "kibana"
elasticsearch.password: "the password created for the kibana account"

Modify the configuration file and restart Kibana.

You will need to log in with the elastic account; logging in with the kibana account returns a 403.
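For reference, a minimal kibana.yml for this setup might look like the following (the ES address is illustrative):

elasticsearch.hosts: ["http://10.90.x.x:9200"]
elasticsearch.username: "kibana"
elasticsearch.password: "the kibana account password"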

8. Summary

Certificates can be reused, so you don't have to create new ones every time you install a cluster. There are three files in total:

elastic-certificates.p12

elastic-stack-ca.p12

elasticsearch.keystore


The .p12 files exist only in the container where the certificates were created. A total of two passwords are entered; in principle, the two passwords are only needed when creating the certificates.

Installing ES plug-ins offline

Some plug-ins, such as the Chinese word-segmentation plug-in, are installed after the ES deployment is complete.

For offline installation, the main point is to use a file resource path. Later versions do not necessarily use the plugin command; look in the bin directory and try the commands yourself.

Chinese word-segmentation plug-in: https://github.com/medcl/elasticsearch-analysis-ik/

Inside the container, run:

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.5.0/elasticsearch-analysis-ik-7.5.0.zip

However, you may need to install fully offline: cp the file into the container and reference it with a file:/// path.

bin/elasticsearch-plugin install file:///usr/local/elasticsearch/elasticsearch-head-master.zip

After the installation is successful, restart the container
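Putting the offline steps together, a sketch using the IK release above and the container name es from earlier (the /tmp path is just a convenient location; --batch skips the interactive confirmation):

docker cp elasticsearch-analysis-ik-7.5.0.zip es:/tmp/
docker exec -it es ./bin/elasticsearch-plugin install --batch file:///tmp/elasticsearch-analysis-ik-7.5.0.zip
docker restart es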

How to view installed ES plug-ins

There is some doubt here: higher versions may only list the plug-ins you installed yourself rather than the built-in ones (a guess); ES does ship with some plug-ins of its own.

Request http://<ES address>:9200/_cat/plugins, for example:

http://xxx:9200/_cat/plugins

After the Chinese word-segmentation plug-in is installed on all three nodes, the endpoint returns:

node-1 analysis-ik 7.5.0
node-2 analysis-ik 7.5.0
node-3 analysis-ik 7.5.0
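To confirm the word-segmentation plug-in actually works, you can call the analyze API with the ik_max_word analyzer shipped by the plug-in (the sample text is arbitrary):

curl -XPOST 'http://xxx:9200/_analyze' -H 'Content-Type: application/json' -d '{"analyzer": "ik_max_word", "text": "中华人民共和国"}'

The response lists the Chinese tokens the analyzer produced.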