Monitoring a Kubernetes cluster with Prometheus. First published at: blog.ihypo.net/15649063510…

Once you start building on Kubernetes, you open a Pandora's box: you don't know what is inside it, just as you don't know what is going on in the Kubernetes cluster or in the applications on it that you depend on.

Regardless of the architecture you choose and the underlying runtime, observability is always a high priority. There is a saying: if you don't know how to operate it, don't try to deploy it. It is a plain way of thinking that starts from the end result.

So, if we embrace Kubernetes, what does "observability" look like? For a microservice architecture, I think it comes down to a few areas:

  1. Observability of cluster and application state
  2. Cluster and application logs
  3. Observability of inter-application traffic, invocation relationships, and request state

In a nutshell: monitoring, logging, and tracing. For the monitoring part, Prometheus is a mature solution on Kubernetes.

Prometheus

Prometheus is an open source monitoring and alerting system built on a time series database, originally developed at SoundCloud. Prometheus periodically scrapes the state of monitored components over HTTP, so a monitored component only needs to expose an HTTP interface. Components that do not provide such an interface by default, such as Linux and MySQL, can be monitored through an exporter, which collects the component's information and exposes the metrics interface on its behalf.

The SoundCloud blog has a brief explanation of the Prometheus architecture and how it works. According to that article, Prometheus has four characteristics:
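
Outside Kubernetes, this pull model is what a plain Prometheus configuration file expresses. A minimal sketch, assuming a Linux host that runs node_exporter on its default port 9100 (the job name and the target address are illustrative, not taken from the original setup):

```yaml
# prometheus.yml
global:
  scrape_interval: 15s        # how often Prometheus pulls from each target
scrape_configs:
  - job_name: node            # illustrative job name
    metrics_path: /metrics    # the default path, written out for clarity
    static_configs:
      - targets: ["localhost:9100"]   # assumed address of node_exporter
```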

  1. A multi-dimensional data model
  2. Operational simplicity (easy deployment and maintenance)
  3. Scalable data collection and decentralized architecture
  4. A powerful query language

The first and fourth items are also features of time series databases. To keep deployment simple, however, Prometheus does not bundle any external storage by default and implements its own instead. For the fourth item, Prometheus implements the PromQL query language, which supports powerful queries.

As versions have evolved, Prometheus’s features have gone beyond that.

As shown in the architecture diagram for Prometheus, there are four main components:

  1. Prometheus Server
  2. PushGateway
  3. AlertManager
  4. WebUI

Of these, Prometheus Server is the most important: it is the component that collects the data. Prometheus Server collects data from monitored objects by pulling. When a monitored object needs to push its data instead, PushGateway can be introduced: the monitored object actively pushes its state to PushGateway, and Prometheus Server scrapes PushGateway periodically.

Both the AlertManager and the WebUI are optional. The AlertManager handles alerting based on the collected data, while the WebUI displays monitoring data in real time.
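
From the Prometheus Server side, the gateway is just one more scrape target. A minimal sketch of the corresponding scrape configuration, assuming the gateway is reachable at a hypothetical host name on its default port 9091:

```yaml
scrape_configs:
  - job_name: pushgateway
    honor_labels: true        # keep the job/instance labels that the pushing clients set
    static_configs:
      - targets: ["pushgateway.example.internal:9091"]   # hypothetical address
```

Setting `honor_labels` matters here because otherwise the labels attached by the pushing jobs would be overwritten by those of the gateway itself.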

Prometheus Operator

Prometheus can be deployed in a variety of ways, and thanks to its simple working principle, it is only necessary to deploy Prometheus Server into an environment that can access the monitored object.

For K8s, however, the network environment inside the cluster is relatively closed and Pod IP addresses change frequently, so CoreOS open-sourced a way to deploy and manage Prometheus through an Operator (CRD): github.com/coreos/prom… .

As mentioned in the previous article on CRD (how to extend the Kubernetes cluster using CRD), the capabilities a CRD provides depend on its CRD controller. Prometheus Operator is one such controller: it watches for changes to the custom resources and carries out the follow-up management work.

Install the Operator

Installing Prometheus Operator is simple: run kubectl apply on the bundle.yaml in the root directory of the Git repository:

git clone https://github.com/coreos/prometheus-operator.git
kubectl apply -f prometheus-operator/bundle.yaml

Basic concepts

Prometheus Operator takes over the deployment and management of Prometheus. Building on CRDs in K8s, it introduces several new CRs (custom resources):

  1. Prometheus: Describes the Prometheus Server cluster to be deployed
  2. ServiceMonitor/PodMonitor: Describes the list of targets for Prometheus Server
  3. Alertmanager: Describes an Alertmanager cluster
  4. PrometheusRule: Describes the alerting rules for Prometheus
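
As a rough illustration of the last item, a PrometheusRule might look like the sketch below. The resource name, the `role: alert-rules` label, the group name, and the alert name are all hypothetical. The rule is only loaded if the Prometheus resource selects it through a matching `ruleSelector`.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules          # hypothetical name
  labels:
    role: alert-rules          # hypothetical label, to be matched by spec.ruleSelector
spec:
  groups:
  - name: example.rules        # hypothetical group name
    rules:
    - alert: TargetDown        # hypothetical alert name
      expr: up == 0            # PromQL: the most recent scrape of the target failed
      for: 5m                  # the condition must hold for 5 minutes before firing
      labels:
        severity: warning
      annotations:
        summary: "Target {{ $labels.instance }} is down"
```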

The design philosophy for Prometheus Operator can be found in the documentation: github.com/coreos/prom… .

How it works

Prometheus Operator watches for changes to the custom resources (CRs) above and performs the corresponding management logic.

When you create a resource of type Prometheus (here Prometheus means the custom resource defined by Prometheus Operator), it selects the associated ServiceMonitors through a label selector. Each ServiceMonitor in turn selects the Services to be monitored through its own label selector. Finally, the list of Pod IP addresses to scrape is obtained from the Endpoints that correspond to each Service.
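
PodMonitor follows the same label-based selection but skips the Service and matches Pods directly. A minimal sketch, assuming Pods labeled `app: example-app` that expose a container port named `web` (both names are illustrative), and a Prometheus resource whose `podMonitorSelector` matches `team: frontend`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend             # assumed to match spec.podMonitorSelector on the Prometheus resource
spec:
  selector:
    matchLabels:
      app: example-app         # selects Pods by label, no Service involved
  podMetricsEndpoints:
  - port: web                  # name of the container port that serves metrics
```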

Monitoring Application Demo

This demo follows the official prometheus-operator guide and shows how to use Prometheus Operator to monitor an application. For more details, see github.com/coreos/prom… .

Deploy monitored objects

Deploy an application with 3 replicas through a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080

Then create a Service that provides stable access:

kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080

Note that the Service carries the label app=example-app, which is what the ServiceMonitor selects on.

Deploy the monitoring

According to the Label defined in the Service, we can define the ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web

The ServiceMonitor carries the label team=frontend, which is what Prometheus uses to select ServiceMonitors. With that in place, the Prometheus resource can be created:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
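
The Prometheus resource above references a service account named prometheus, which has to exist and be allowed to discover targets. A sketch along the lines of the operator's getting-started guide, assuming everything lives in the `default` namespace (adjust the namespace in the binding if yours differs):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes", "nodes/metrics", "services", "endpoints", "pods"]
  verbs: ["get", "list", "watch"]   # needed for target discovery
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default                # assumed namespace
```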

At this point, an instance of Prometheus is started:

# kubectl get po
NAME                                   READY   STATUS    RESTARTS   AGE
example-app-66db748757-bfqx4           1/1     Running   0          103m
example-app-66db748757-jqsh5           1/1     Running   0          103m
example-app-66db748757-jtbpc           1/1     Running   0          103m
prometheus-operator-7447bf4dcb-lzbf4   1/1     Running   0          18h
prometheus-prometheus-0                3/3     Running   0          100m

Prometheus itself provides a web UI, so we can create a Service that exposes it outside the cluster (preferably not on a public network):

apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus

You can then view the monitoring information for the demo application in the web UI.

Cluster monitoring

As this demo shows, Prometheus collects data by making HTTP requests through a Service. Cluster monitoring therefore only requires that Prometheus be able to reach the monitoring interfaces of the Kubernetes components. Prometheus also supports deploying Node Exporter as a DaemonSet to collect information from the cluster nodes directly.

How monitoring data is collected from the Kubernetes components depends on how the cluster was deployed. If the components run as binaries, Prometheus can be installed on the node to collect the data. If they run as containers, you can create a Service for each Kubernetes component, and the remaining steps are the same as for monitoring applications in the cluster. See the related documentation: coreos.com/operators/p… .
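
A node-level collector deployed as a DaemonSet might look like the following minimal sketch. It assumes node_exporter is the collector, uses its default port 9100, and leaves out the host mounts and flags that a production manifest would normally add. The resource name and labels are illustrative.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter            # illustrative name
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true          # expose the exporter on the node's own address
      hostPID: true              # let the exporter see host processes
      containers:
      - name: node-exporter
        image: quay.io/prometheus/node-exporter
        ports:
        - name: metrics
          containerPort: 9100    # node_exporter's default port
```

A DaemonSet schedules one such Pod on every node, so each node gets its own scrape target.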
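
For the container case, such a Service could look like the sketch below, using kube-scheduler as the example. It assumes the scheduler Pods carry the label `component: kube-scheduler` and serve plain-HTTP metrics on port 10251, which holds for older releases only; newer releases serve metrics over HTTPS on a different port. The Service name and its `k8s-app` label are illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  namespace: kube-system
  name: kube-scheduler-prometheus-discovery   # illustrative name
  labels:
    k8s-app: kube-scheduler      # label a ServiceMonitor can select on
spec:
  type: ClusterIP
  clusterIP: None                # headless: only used for endpoint discovery
  selector:
    component: kube-scheduler    # assumed label on the scheduler Pods
  ports:
  - name: http-metrics
    port: 10251                  # assumed metrics port (older releases)
    targetPort: 10251
    protocol: TCP
```

A ServiceMonitor that selects `k8s-app: kube-scheduler` would also need a `namespaceSelector` for `kube-system` if it is created in a different namespace.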