“This is the 25th day of my participation in the Gwen Challenge in November. See details: The Last Gwen Challenge in 2021”

What is black-box monitoring?

In the previous article, we mainly covered white-box monitoring with Prometheus, which collects resource usage and other runtime data from hosts, the infrastructure that supports businesses and services. White-box monitoring exposes the internal running state of a system, and its metrics let you anticipate possible problems and address potential risks before they surface. For a complete monitoring picture, however, extensive white-box monitoring should be complemented with appropriate black-box monitoring. Black-box monitoring tests the externally visible behavior of a service from the user's point of view. Common black-box checks include HTTP probes and TCP probes, which test the reachability and response efficiency of a site or service.

Introduction to Blackbox_Exporter

blackbox_exporter is the official black-box monitoring solution from the Prometheus project. It probes endpoints over the network using HTTP, HTTPS, DNS, TCP, and ICMP.

Github address: github.com/prometheus/…

Currently supported application scenarios:

  • ICMP test
    • Host liveness probing
  • TCP test
    • Port status monitoring for business components
    • Application-layer protocol definition and monitoring
  • HTTP test
    • Define request header information
    • Check HTTP status / HTTP response headers / HTTP body
  • POST test
    • Interface connectivity
  • SSL certificate expiration time
  • Custom tests (extensions)

Three. Installation

1. Binary package

Blackbox_exporter can be downloaded from github.com/prometheus/…

Using Linux as an example, download the precompiled binary package, extract it, and run it:

# cd /data/blackbox_exporter/
# ./blackbox_exporter --version
blackbox_exporter, version 0.16.0 (branch: HEAD, revision: 991f89846ae10db22a3933356a7d196642fcb9a9)
  build user:       root@64f600555645
  build date:       20191111-16:27:24
  go version:       go1.13.4
# nohup ./blackbox_exporter &

2. Docker

docker run -id --name blackbox-exporter -p 9115:9115  prom/blackbox-exporter

Four. How it works

Official explanation: github.com/prometheus/…

Running blackbox_exporter requires the user to provide configuration for each probe, such as custom HTTP headers, the TLS settings the probe should use, or how the probe authenticates. Each probe configuration is called a module and is supplied to blackbox_exporter in a YAML configuration file. Each module contains the probe type (prober), the probe timeout (timeout), and the settings specific to that probe type:

  # Probe type: http, tcp, dns, icmp
  prober: <prober_string>

  # How long the probe waits before giving up, for example 5s.
  [ timeout: <duration> ]

  # Probe-specific settings. At most one of these can be configured.
  [ http: <http_probe> ]
  [ tcp: <tcp_probe> ]
  [ dns: <dns_probe> ]
  [ icmp: <icmp_probe> ]
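The module layout above can be checked mechanically. The following is a minimal Python sketch and not part of blackbox_exporter: the validate_module helper and its messages are hypothetical, and it encodes only the two constraints stated above (a known prober, and at most one probe-specific block).

```python
# Hypothetical helper: checks one module (already parsed from YAML into a dict)
# against the constraints described in the text above.
PROBERS = ("http", "tcp", "dns", "icmp")


def validate_module(module):
    """Return a list of problems found in one module definition."""
    problems = []
    prober = module.get("prober")
    if prober not in PROBERS:
        problems.append(f"unknown prober: {prober!r}")
    # Collect every probe-specific block that is present in the module.
    configured = [name for name in PROBERS if name in module]
    if len(configured) > 1:
        problems.append("more than one probe block: " + ", ".join(configured))
    return problems


ok = validate_module({"prober": "http", "timeout": "5s", "http": {"method": "GET"}})
bad = validate_module({"prober": "https", "http": {}, "tcp": {}})
print(ok)
print(bad)
```

An empty list means the module passed both checks. The second call fails both: "https" is not a prober name (HTTPS targets use the http prober), and two probe blocks are present.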

<http_probe> Configurable parameters:

  # Status codes accepted by this probe. Defaults to 2xx.
  [ valid_status_codes: <int>, ... | default = 2xx ]

  # HTTP versions accepted by this probe.
  [ valid_http_versions: <string>, ... ]

  # The HTTP method the probe will use.
  [ method: <string> | default = "GET" ]

  # HTTP headers set for the probe.
  headers:
    [ <string>: <string> ... ]

  # Whether the probe refuses to follow redirects.
  [ no_follow_redirects: <boolean> | default = false ]

  # Probe fails if SSL is present.
  [ fail_if_ssl: <boolean> | default = false ]

  # Probe fails if SSL is not present.
  [ fail_if_not_ssl: <boolean> | default = false ]

  # Probe fails if the response body matches the regular expression.
  fail_if_body_matches_regexp:
    [ - <regex>, ... ]

  # Probe fails if the response body does not match the regular expression.
  fail_if_body_not_matches_regexp:
    [ - <regex>, ... ]

  # Probe fails if a response header matches the regular expression. For headers with multiple values, it fails if *at least one* matches.
  fail_if_header_matches:
    [ - <http_header_match_spec>, ... ]

  # Probe fails if a response header does not match the regular expression. For headers with multiple values, it fails if *none* match.
  fail_if_header_not_matches:
    [ - <http_header_match_spec>, ... ]

  # TLS configuration for the HTTP probe.
  tls_config:
    [ <tls_config> ]

  # HTTP basic authentication credentials for the target.
  basic_auth:
    [ username: <string> ]
    [ password: <secret> ]
    [ password_file: <filename> ]

  # Bearer token for the target.
  [ bearer_token: <secret> ]

  # Bearer token file for the target.
  [ bearer_token_file: <filename> ]

  # HTTP proxy server used to connect to the target.
  [ proxy_url: <string> ]

  # IP protocol of the HTTP probe (ip4, ip6).
  [ preferred_ip_protocol: <string> | default = "ip6" ]
  [ ip_protocol_fallback: <boolean> | default = true ]

  # Body of the HTTP request used in the probe.
  body: [ <string> ]

Five. Application scenarios

1. ICMP test (host probe)

You can check whether a server is alive with ping (ICMP). Configure an icmp module in the blackbox.yml configuration file:

modules:
  icmp:
    prober: icmp

The Prometheus configuration file is as follows:

  - job_name: 'blackbox-ping'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
    - targets:
      - 172.16.106.208  # IP address of the monitored host
      - 172.16.106.80
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 172.16.106.84:9115  # host and port where blackbox_exporter runs

This configures a scrape job named blackbox-ping. The module to use is set in params, and the target parameter is filled in by relabeling. Without relabeling, both the module and the target would have to be written into params for every job, so N target sites that each need M probes would require N * M scrape jobs in Prometheus, which is clearly unacceptable from a configuration-management perspective. An earlier article introduced Prometheus's relabeling capabilities, and relabeling is what keeps this configuration small. One scrape job is defined per probe type (such as ICMP), the job's targets are simply the sites to be probed, and relabel_configs rewrites each target's labels dynamically before samples are scraped:

  • Step 1: take the target instance's address (__address__) and write it to the __param_target label. A label of the form __param_<name> adds a <name> parameter to the request URL when the job scrapes, which is equivalent to setting it in params;
  • Step 2: read the value of __param_target and write it to the instance label;
  • Step 3: overwrite the target instance's __address__ label with the access address of the blackbox_exporter instance.

The above three Relabel steps greatly simplify the configuration of Prometheus tasks.
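To make the effect of the three steps concrete, here is a small Python simulation of what happens to one target's labels and which URL ends up being scraped. It is only a sketch of the idea, not Prometheus code: the relabel and scrape_url functions are hypothetical, and the addresses are placeholder values.

```python
from urllib.parse import urlencode


def relabel(target, module, exporter_address):
    """Apply the three relabel steps to the labels of a single target."""
    # Initial labels: the configured target plus the module set through params.
    labels = {"__address__": target, "__param_module": module}
    # Step 1: copy the target address into the "target" URL parameter.
    labels["__param_target"] = labels["__address__"]
    # Step 2: keep the real target visible as the instance label.
    labels["instance"] = labels["__param_target"]
    # Step 3: send the scrape itself to the exporter, not to the target.
    labels["__address__"] = exporter_address
    return labels


def scrape_url(labels, metrics_path="/probe"):
    """Build the URL that would be requested for the relabeled target."""
    prefix = "__param_"
    params = {k[len(prefix):]: v for k, v in labels.items() if k.startswith(prefix)}
    return f"http://{labels['__address__']}{metrics_path}?{urlencode(params)}"


labels = relabel("192.0.2.10", "icmp", "192.0.2.84:9115")
url = scrape_url(labels)
print(labels["instance"])
print(url)
```

The request goes to the exporter with module and target as query parameters, while the instance label on the resulting samples still names the probed host.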

2. TCP test (monitor host port survival status)

Configure a TCP module in the blackbox.yml configuration file:

modules:
  tcp_connect:
    prober: tcp

The Prometheus configuration file is as follows:

  - job_name: 'blackbox-tcp'
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
    - targets:
      - 172.16.106.208:6443
      - 172.16.106.80:6443
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 172.16.106.84:9115
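Conceptually, a plain TCP probe like tcp_connect succeeds when a connection to the target port can be opened within the timeout. A rough Python equivalent follows, offered only as an illustration: the tcp_probe function is hypothetical and is not how blackbox_exporter is implemented.

```python
import socket


def tcp_probe(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out, or failed to resolve.
        return False


# Example (placeholder address): tcp_probe("192.0.2.10", 6443, timeout=1.0)
```

The boolean result plays the same role as the probe_success metric: 1 when the port accepts connections, 0 otherwise.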

3. HTTP detection (monitoring website status)

HTTP probes are among the most commonly used probes in black-box monitoring. They provide effective monitoring of websites and HTTP services, covering both availability and user-experience concerns such as response time. Besides raising timely alerts when a service misbehaves, they also help operations engineers analyze and optimize the site experience.

Configure HTTP modules in the blackbox.yml configuration file:

modules:
  http_2xx:
    prober: http
    http:
      method: GET
  http_post_2xx:
    prober: http
    http:
      method: POST

The Prometheus configuration file is as follows:

  - job_name: 'blackbox-http'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
    - targets:
      - http://monitor.mall.demo.com/login
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 172.16.106.84:9115  # host and port where blackbox_exporter runs

The probe type is specified by the prober option. The http option customizes how the probe behaves. If nothing is set under http, the HTTP probe's defaults apply: the probe sends an HTTP GET to the target and checks that the returned status code is 2xx. Otherwise the probe fails.

The collected data is as follows:

# DNS resolution time, in seconds
probe_dns_lookup_time_seconds 0.000199105
# Total probe duration from start to finish, in seconds (the page's response time)
probe_duration_seconds 0.010889113
# HELP probe_failed_due_to_regex Indicates if probe failed due to regex
# TYPE probe_failed_due_to_regex gauge
probe_failed_due_to_regex 0
# Length of the HTTP content response
probe_http_content_length -1
# Duration of each phase of the request
probe_http_duration_seconds{phase="connect"} 0.001083728     # connection time
probe_http_duration_seconds{phase="processing"} 0.008365885  # request processing time
probe_http_duration_seconds{phase="resolve"} 0.000199105     # DNS resolution time
probe_http_duration_seconds{phase="tls"} 0                   # TLS handshake time
probe_http_duration_seconds{phase="transfer"} 0.000446424    # transfer time
# Number of redirects
probe_http_redirects 0
# Whether SSL was used for the final redirect
probe_http_ssl 0
# Returned status code
probe_http_status_code 200
# Uncompressed response body length
probe_http_uncompressed_body_length 1766
# HTTP protocol version
probe_http_version 1.1
# HELP probe_ip_addr_hash Specifies the hash of IP address. It's useful to detect if the IP address changes.
probe_ip_addr_hash 3.24030434e+09
# IP protocol version used
probe_ip_protocol 4
# Whether the probe succeeded
probe_success 1
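Output like this is plain text and easy to post-process. As a small illustration, the Python sketch below parses a few of the samples shown above and reports which phase of the request took longest. The parse_samples helper is hypothetical and handles only the simple "name{labels} value" lines used here, not the full exposition format.

```python
import re

SAMPLE = """\
# TYPE probe_http_duration_seconds gauge
probe_http_duration_seconds{phase="connect"} 0.001083728
probe_http_duration_seconds{phase="processing"} 0.008365885
probe_http_duration_seconds{phase="resolve"} 0.000199105
probe_http_duration_seconds{phase="tls"} 0
probe_http_duration_seconds{phase="transfer"} 0.000446424
probe_http_status_code 200
probe_success 1
"""

# metric name, optional {labels}, then the sample value
LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')


def parse_samples(text):
    """Map each sample (name plus labels, if any) to its float value."""
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE/comment lines
        match = LINE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        key = name if labels is None else name + "{" + labels + "}"
        samples[key] = float(value)
    return samples


samples = parse_samples(SAMPLE)

# Collect the per-phase durations, keyed by phase name.
phases = {}
for key, value in samples.items():
    m = re.match(r'probe_http_duration_seconds\{phase="([^"]+)"\}$', key)
    if m:
        phases[m.group(1)] = value

slowest = max(phases, key=phases.get)
total = sum(phases.values())
print(slowest, total)
```

In this sample most of the time is spent in the processing phase, that is, waiting for the server to respond.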

4. Customize HTTP requests

HTTP services come in different forms: some are simple web pages, while others are REST-based API services. Probing them requires administrators to customize the HTTP probe's behavior, including the request method, headers, and request parameters. Services with security authentication enabled also need auth support in the probe, and HTTPS services may need custom certificates. As shown below, method defines the request method. For services that require request parameters, headers defines the request headers and body defines the request content:

http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'

If an HTTP service has security authentication enabled, blackbox_exporter has built-in support for basic_auth, so the credentials can be set directly:

http_basic_auth_example:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Host: "login.example.com"
      basic_auth:
        username: "username"
        password: "mysecret"
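For context, HTTP basic authentication sends the credentials in an Authorization header as the Base64 encoding of username:password. The Python sketch below shows that encoding. The basic_auth_header helper is hypothetical and only illustrates what a request configured with basic_auth carries.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value used by HTTP basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


header = basic_auth_header("username", "mysecret")
# Decoding the token again recovers the original "username:password" pair.
decoded = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
print(header)
print(decoded)
```

Base64 is an encoding, not encryption, which is one reason to prefer password_file over an inline password and to probe such endpoints over HTTPS.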

For services that use bearer tokens, you can specify the token string directly with the bearer_token option, or point to a token file with bearer_token_file. For HTTPS services that require custom certificates, use the tls_config option to specify the certificate information:

 http_custom_ca_example:
    prober: http
    http:
      method: GET
      tls_config:
        ca_file: "/certs/my_cert.crt"

5. Customize probe behavior

By default, the HTTP probe verifies only the HTTP status code. If the status code is 2xx (200 <= StatusCode < 300), the probe succeeds and the probe_success metric it returns is 1. If you need to accept specific status codes or have requirements on the HTTP version, define valid_status_codes and valid_http_versions as shown below:

  http_2xx_example:
    prober: http
    timeout: 5s
    http:
      valid_http_versions: ["HTTP/1.1", "HTTP/2"]
      valid_status_codes: []  # empty list: use the default, 2xx
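The acceptance rule described above can be written down in a few lines. This Python sketch mirrors the documented behavior (an empty list means "use the default") and is not the exporter's own code. The function names are hypothetical.

```python
def status_ok(status_code, valid_status_codes=()):
    """Accept any 2xx code by default, otherwise only the listed codes."""
    if not valid_status_codes:
        return 200 <= status_code < 300
    return status_code in valid_status_codes


def version_ok(protocol, valid_http_versions=()):
    """Accept any HTTP version by default, otherwise only the listed ones."""
    return not valid_http_versions or protocol in valid_http_versions


print(status_ok(204))                 # default rule: any 2xx passes
print(status_ok(302, [200, 302]))     # explicit list: 302 is allowed
print(status_ok(204, [200, 302]))     # explicit list replaces the 2xx default
print(version_ok("HTTP/1.0", ["HTTP/1.1", "HTTP/2"]))
```

Note that an explicit valid_status_codes list replaces the 2xx default instead of extending it, so 204 is rejected in the third call.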

By default, the samples returned by blackbox_exporter also include the probe_http_ssl metric, which indicates whether the current probe used SSL:

# HELP probe_http_ssl Indicates if SSL was used for the final redirect
# TYPE probe_http_ssl gauge
probe_http_ssl 0

If you have a mandatory policy on whether SSL must be enabled for an HTTP service, use fail_if_ssl or fail_if_not_ssl. With fail_if_ssl set to true, the probe fails if the site uses SSL and succeeds otherwise. fail_if_not_ssl is the opposite.

  http_2xx_example:
    prober: http
    timeout: 5s
    http:
      valid_status_codes: []
      method: GET
      no_follow_redirects: false
      fail_if_ssl: false
      fail_if_not_ssl: false
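The two SSL switches are symmetric, which a tiny truth-table style sketch makes explicit. This is an illustrative Python model of the rule described above. The ssl_check_passes function is hypothetical.

```python
def ssl_check_passes(used_ssl, fail_if_ssl=False, fail_if_not_ssl=False):
    """Return False when the SSL state of the response violates the policy."""
    if used_ssl and fail_if_ssl:
        return False
    if not used_ssl and fail_if_not_ssl:
        return False
    return True


# With both options left at false (as in the module above), nothing is enforced.
for used_ssl in (True, False):
    print(used_ssl, ssl_check_passes(used_ssl))

# Enforcing HTTPS: a plain-HTTP response now fails the probe.
print(ssl_check_passes(False, fail_if_not_ssl=True))
```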

Besides the HTTP status code, the HTTP protocol version, and whether SSL is enabled, the response body can also be used as a success criterion. With fail_if_body_matches_regexp and fail_if_body_not_matches_regexp (the option names listed in the <http_probe> parameters earlier), you can define sets of regular expressions that the returned content must not match or must match, respectively.

  http_2xx_example:
    prober: http
    timeout: 5s
    http:
      method: GET
      fail_if_body_matches_regexp:
        - "Could not connect to database"
      fail_if_body_not_matches_regexp:
        - "Download the latest version here"
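The body checks can be modeled as two passes over the response text. The sketch below uses Python's re module purely for illustration: body_check_passes is a hypothetical function, and blackbox_exporter itself uses Go regular expressions, whose syntax differs from Python's in some details.

```python
import re


def body_check_passes(body, fail_if_matches=(), fail_if_not_matches=()):
    """Apply both regex lists to a response body and report whether it passes."""
    # Any "must not appear" pattern that is found fails the probe.
    if any(re.search(pattern, body) for pattern in fail_if_matches):
        return False
    # Any "must appear" pattern that is missing fails the probe.
    if any(not re.search(pattern, body) for pattern in fail_if_not_matches):
        return False
    return True


must_not_match = ["Could not connect to database"]
must_match = ["Download the latest version here"]

good_page = "<a href='/dl'>Download the latest version here</a>"
error_page = "Error: Could not connect to database"

print(body_check_passes(good_page, must_not_match, must_match))
print(body_check_passes(error_page, must_not_match, must_match))
```

This catches the common case of an application that answers with status 200 while rendering an error page.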

As a final note, HTTP probes use IPv6 by default. In most cases, you can set preferred_ip_protocol: ip4 to have the probe use IPv4. In the samples returned by blackbox_exporter, the probe_ip_protocol metric shows which protocol was actually used:

# HELP probe_ip_protocol Specifies whether probe ip protocol is IP4 or IP6
# TYPE probe_ip_protocol gauge
probe_ip_protocol 6
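How preferred_ip_protocol and ip_protocol_fallback interact can be sketched as a simple selection over the addresses a name resolves to. This is an assumption-laden Python model of the documented options, not the exporter's resolver: choose_ip is a hypothetical function, and the addresses are documentation placeholders.

```python
import ipaddress


def choose_ip(addresses, preferred="ip6", fallback=True):
    """Pick an address of the preferred family, optionally falling back."""
    wanted_version = 6 if preferred == "ip6" else 4
    parsed = [ipaddress.ip_address(address) for address in addresses]
    for ip in parsed:
        if ip.version == wanted_version:
            return str(ip)
    # No address of the preferred family: use another one only if allowed.
    if fallback and parsed:
        return str(parsed[0])
    return None


resolved = ["192.0.2.1", "2001:db8::1"]
print(choose_ip(resolved))                        # default prefers IPv6
print(choose_ip(resolved, preferred="ip4"))       # explicit IPv4 preference
print(choose_ip(["192.0.2.1"]))                   # IPv4-only host, fallback on
print(choose_ip(["192.0.2.1"], fallback=False))   # IPv4-only host, fallback off
```

With the defaults, an IPv4-only host is still probed thanks to the fallback, which is why probe_ip_protocol can report 4 even though IPv6 is preferred.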

Check the configuration file

Check whether the configuration file is correctly written

cd /data/prometheus
./promtool check config prometheus.yml

Integrate Grafana

Import the blackbox_exporter dashboard template (ID 9965) into Grafana: grafana.com/grafana/das…

Note: This template requires the installation of the pie chart plug-in, download address: grafana.com/grafana/plu…

Install the plug-in and restart Grafana to take effect.

grafana-cli plugins install grafana-piechart-panel
service grafana-server restart

Check the following data:

Eight. Summary

The biggest difference between black-box monitoring and white-box monitoring is that black-box monitoring is fault-oriented. When a fault occurs, black-box monitoring can quickly discover a fault, while white-box monitoring focuses on proactively discovering or predicting potential problems. A good monitoring goal is to be able to detect potential problems from a white box perspective and quickly detect problems that have occurred from a black box perspective.

Examples:

  • Github.com/zuozewei/bl…
