Apache Kylin starter series directory

  • Apache Kylin Starter 1 – Basic Concepts
  • Apache Kylin Starter 2 – Principles and Architecture
  • Apache Kylin Starter 3 – Installation and Configuration Parameters
  • Apache Kylin Starter 4 – Building the Model
  • Apache Kylin Starter 5 – Building the Cube
  • Apache Kylin Starter 6 – Optimizing the Cube
  • Building a Kylin Query-Time Monitoring Page with ELKB

1. ELKB overview

ELKB refers to Elasticsearch, Logstash, Kibana, and Beats. In this setup, Kylin logs are collected by FileBeat, shipped to Logstash for filtering, and finally written into Elasticsearch (ES). Kibana can then be used to quickly build a series of charts: by running aggregation analysis over Kylin's query logs, we can build a monitoring page for its query metrics across multiple dimensions.

2. Role allocation

Role           IP             Port
Elasticsearch  192.168.3.214  9200
Logstash       192.168.3.213  5044
Kibana         192.168.3.214  5601
Beats          192.168.3.213  5044

3. Elasticsearch installation and configuration

For the Elasticsearch setup, see the earlier article: Create Elasticsearch 6.3 search cluster on CentOS 7.4.

1. Kibana installation

Kibana installation is straightforward and consists of the following steps:

  1. Download the tar installation package (version 6.4.2 is used here).
  2. Unzip it to the target directory: tar -zxvf kibana-6.4.2-linux-x86_64.tar.gz -C /opt/;
  3. Modify the configuration file: /opt/kibana-6.4.2-linux-x86_64/config/kibana.yml;
  4. Start Kibana: /opt/kibana-6.4.2-linux-x86_64/bin/kibana.

2. Kibana configuration

Kibana needs little configuration; only the node information and the ES connection need to be set:

server.port: 5601
server.host: "192.168.3.214"
server.name: "kibana-edps"
elasticsearch.url: "http://192.168.3.214:9200"

4. FileBeat installation and configuration

1. Install FileBeat

On CentOS 7, FileBeat can be installed from the RPM package:

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.4.2-x86_64.rpm
sudo rpm -vi filebeat-6.4.2-x86_64.rpm

For other system installation methods, see the official documents.

2. Configure FileBeat

After a successful RPM installation, the FileBeat configuration file is located at /etc/filebeat/filebeat.yml. Open and edit it:

# Configure Filebeat inputs
filebeat.inputs:
- type: log
  # Enable log collection
  enabled: true
  # Set the log paths
  paths:
    - /opt/apache-kylin-2.4.0-bin-cdh57/logs/kylin.log
  # Set the lines to exclude (regex matches are dropped)
  #exclude_lines: ['^DBG']
  # Set the lines to include (regex match)
  include_lines: ['Query Id: ']
  # Set the files to exclude (regex match)
  #exclude_files: ['.gz$']

  # Additional static fields
  #fields:
  #  level: debug
  #  review: 1

  # Set the regex that splits log entries
  multiline.pattern: '\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2}:\d{2},\d{3}\s*\w+\s*\['
  multiline.negate: true
  multiline.match: after

#==================== Elasticsearch template setting ==========================
# Disable automatic template loading
setup.template.enabled: false
#setup.template.name: "log"
#setup.template.pattern: "log-*"
#setup.dashboards.index: "log-*"
#setup.template.settings:
# index.number_of_shards: 3
# index.number_of_replicas: 0
# index.codec: best_compression
# _source.enabled: false

#============================== Kibana =====================================
setup.kibana:
  # Kibana address
  host: "192.168.3.214:5601"

#-------------------------- Elasticsearch output ------------------------------
# Use ES as output
#output.elasticsearch:
  #hosts: ["192.168.3.214:9200"]
  #index: "log-kylin-cdh3"
  
  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
# Use LogStash as output
output.logstash:
  hosts: ["192.168.3.213:5044"]

#============================== Xpack Monitoring ===============================
# Set monitoring information
xpack.monitoring:
  enabled: true
  elasticsearch:
    hosts: ["http://192.168.3.214:9200"]
    username: beats_system
    password: beatspassword

3. Some notes on the configuration file

  1. Most Filebeat modules rely on an ES Ingest node for further processing of the data;
  2. The multiline section is particularly important for log collection: the regex that splits one log entry from the next must be set correctly;
  3. Whether ES or Logstash is used as output, Filebeat supports multiple addresses, which helps with load balancing.
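The multiline split can be sanity-checked offline. A minimal sketch in Python, where the sample log lines are illustrative rather than taken from a real Kylin log:

```python
import re

# The same pattern as multiline.pattern in filebeat.yml:
# a timestamp, a log level, and an opening bracket mark the start of a new entry.
pattern = re.compile(r'\d{4}-\d{2}-\d{2}\s*\d{2}:\d{2}:\d{2},\d{3}\s*\w+\s*\[')

lines = [
    "2018-10-16 10:22:30,123 INFO  [Query abc] service.QueryService:",  # starts an entry
    "SQL: select count(*) from kylin_sales",                            # continuation line
]

for line in lines:
    kind = "new entry" if pattern.match(line) else "continuation"
    print(kind, "->", line)
```

With multiline.negate: true and multiline.match: after, every line that does not match the pattern is appended to the preceding matching line, so one multi-line query log becomes a single event.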

5. Logstash installation and configuration

The Logstash setup is simple:

  1. Download the Logstash 6.4.2 installation package (see the official download page);
  2. Unzip it to the target directory: tar -zxvf logstash-6.4.2.tar.gz -C /opt/.

1. Configure the Logstash

Above, Kylin's logs were sent to Logstash by FileBeat. Here, Logstash filters the logs and writes them to ES.

First, create a configuration file for Kylin log processing: vi /opt/logstash-6.4.2/config/kylin_log.conf:

input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    # Capture names other than query_dtm and query_id are illustrative
    match => { "message" => "(?<query_dtm>[^,]+),[\s\S]+?Query Id:\s*(?<query_id>\S+)\s*SQL:\s*(?<sql>[\s\S]+?)\nUser:\s*(?<user>[\s\S]+?)\nSuccess:\s*(?<success>[\s\S]+?)\nDuration:\s*(?<duration>[\s\S]+?)\nProject:\s*(?<project>[\s\S]+?)\n[\s\S]+?\nStorage cache used:\s*(?<cache_used>[\s\S]+?)\n[\s\S]+" }
    remove_field => [ "message", "tags", "@timestamp", "@version", "prospector", "beat", "input", "source", "offset", "host" ]
  }
  date {
    match => [ "query_dtm", "YYYY-MM-dd HH:mm:ss", "ISO8601" ]
    target => "sql_dtm"
  }
}

output {
  elasticsearch { 
    hosts => ["192.168.3.214:9200"]
    index => "log-kylin-cdh3"
    document_id => "%{query_id}"
  }
  stdout {}
}
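To see what the grok expression extracts, an abbreviated Python equivalent can be run against a fabricated log block (the sample text, and the capture names other than query_dtm and query_id, are assumptions rather than output from a real Kylin instance):

```python
import re

# Abbreviated Python version of the grok expression above;
# it stops at Project instead of continuing to "Storage cache used".
pattern = re.compile(
    r"(?P<query_dtm>[^,]+),[\s\S]+?Query Id:\s*(?P<query_id>\S+)\s*"
    r"SQL:\s*(?P<sql>[\s\S]+?)\nUser:\s*(?P<user>[\s\S]+?)\n"
    r"Success:\s*(?P<success>[\s\S]+?)\nDuration:\s*(?P<duration>[\s\S]+?)\n"
    r"Project:\s*(?P<project>[\s\S]+?)\n"
)

# Fabricated sample of a Kylin "Query Id" log block
message = (
    "2018-10-16 10:22:30,123 INFO  [Query abc] ==========================\n"
    "Query Id: 1a2b3c\n"
    "SQL: select count(*) from kylin_sales\n"
    "User: ADMIN\n"
    "Success: true\n"
    "Duration: 0.35\n"
    "Project: learn_kylin\n"
)

m = pattern.search(message)
print({k: m.group(k) for k in ("query_dtm", "query_id", "sql", "user", "project")})
```

Note that the timestamp capture [^,]+ stops at the comma, so the milliseconds never reach the date filter; that is why "YYYY-MM-dd HH:mm:ss" is the right match pattern.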

2. Some notes on the configuration file

  1. The grok regular expression matches Kylin's query log and extracts the query ID, time, cache hit, SQL statement, user, and other information;
  2. Useless fields passed along by FileBeat are removed: "message", "tags", "@timestamp", "@version", "prospector", "beat", "input", "source", "offset", "host";
  3. Since Logstash handles time in UTC by default, the date plugin adds a new UTC time field (sql_dtm) while preserving the original time (query_dtm);
  4. Both the ES output and the console output are used (the latter for easy monitoring);
  5. Line breaks in the log appear as \n in the console output, so the regular expression must be written with \n, not \\n;
  6. The grok regular expression needs no extra escaping (very important).

After all configuration is complete, start Logstash with: /opt/logstash-6.4.2/bin/logstash -f /opt/logstash-6.4.2/config/kylin_log.conf.

6. Kibana monitoring page

1. Create an index pattern

First log in to Kibana at http://192.168.3.214:5601, then create an index pattern: Management -> Kibana -> Index Patterns -> Create Index Pattern.

The date field must be set to the UTC time field; otherwise, dates may not line up with query times in later searches.
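For intuition, here is what that UTC shift looks like, assuming the Kylin server logs in UTC+8 (an assumption about this deployment's timezone):

```python
from datetime import datetime, timedelta, timezone

# Assumed server timezone: UTC+8 (China Standard Time)
cst = timezone(timedelta(hours=8))

# query_dtm as parsed out of the Kylin log (local time)
local = datetime.strptime("2018-10-16 10:22:30", "%Y-%m-%d %H:%M:%S").replace(tzinfo=cst)

# What Logstash's date filter stores in ES (sql_dtm): UTC
utc = local.astimezone(timezone.utc)
print(utc.isoformat())  # 2018-10-16T02:22:30+00:00
```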

2. Query log details

After the index rule is established, you can click Discover on the left to view log details.

3. Create visual components

If the logs look correct, everything is working so far. Next, build the Visualize components: click Visualize in the left menu to open the Visualize page (it ships with a number of preset components by default; these were all deleted here).

The steps for building visual components need little explanation; the key is understanding ES aggregation functions, which will let you use the components well. Recommended aggregations to know:

  1. Count, Min, Max, Avg (frequently used);
  2. Date Histogram, Date Range (fixed date interval and time range);
  3. Histogram, Range (fixed numeric interval and range);
  4. Terms (well suited to word clouds, pie charts, etc.)
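As a concrete example, an "average query duration per project" chart boils down to a Terms bucket with a nested Avg metric. A sketch of the request body such a visualization sends (the field names project.keyword and duration are assumptions based on the Logstash config above, and the mapping must index duration as a number):

```python
import json

# Terms bucket per project, Avg metric on query duration
agg_body = {
    "size": 0,  # no raw hits, aggregation results only
    "aggs": {
        "by_project": {
            "terms": {"field": "project.keyword", "size": 10},
            "aggs": {"avg_duration": {"avg": {"field": "duration"}}},
        }
    },
}

print(json.dumps(agg_body, indent=2))
```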

4. Create a dashboard

Click Dashboard in the left menu to open the dashboard management interface, click the "Create New Dashboard" button to create a new dashboard, click "Add" in the upper right to add the newly created components, then drag and resize to adjust the layout, and finally save.

7. Read more

  • Elasticsearch aggregation is fully understood
  • FileBeat Multiline
  • Logstash timestamp conversion
  • Time processing (Date)
  • Logstash best practice

Any Code, Code Any!

Scan the QR code to follow "AnyCode" and move forward together on the programming road.