K8s cluster information monitoring based on Prometheus

1. Deployment

  • Get official source code
git clone https://github.com/coreos/kube-prometheus.git
  • Deploy the CRDs and the monitoring stack
# CRDs
cd kube-prometheus/manifests/setup
kubectl apply -f .

# Monitoring components (one level up, in kube-prometheus/manifests)
cd ..
kubectl apply -f .

Check the CRDs: kubectl get crd | grep coreos

alertmanagers.monitoring.coreos.com     2021-12-15T06:56:28Z
podmonitors.monitoring.coreos.com       2021-12-15T06:56:28Z
prometheuses.monitoring.coreos.com      2021-12-15T06:56:28Z
prometheusrules.monitoring.coreos.com   2021-12-15T06:56:28Z
servicemonitors.monitoring.coreos.com   2021-12-15T06:56:28Z

Check the pods: kubectl get pod -n monitoring

alertmanager-main-0                   2/2     Running   0          42d
alertmanager-main-1                   2/2     Running   0          42d
alertmanager-main-2                   2/2     Running   0          42d
kube-state-metrics-78b46c84d8-klllv   3/3     Running   0          42d
prometheus-adapter-5cd5798d96-kj6r5   1/1     Running   0          42d
prometheus-k8s-0                      3/3     Running   1          42d
prometheus-operator-99dccdc56-lf6vm   1/1     Running   0          42d

Check the services: kubectl get svc -n monitoring

alertmanager-main       ClusterIP   11.1.126.171   <none>   9093/TCP                     42d
alertmanager-operated   ClusterIP   None           <none>   9093/TCP,9094/TCP,9094/UDP   42d
kube-state-metrics      ClusterIP   None           <none>   8443/TCP,9443/TCP            42d
prometheus-adapter      ClusterIP   11.1.178.197   <none>   443/TCP                      42d
prometheus-k8s          NodePort    11.1.253.224   <none>   9090:8098/TCP                42d
prometheus-operated     ClusterIP   None           <none>   9090/TCP                     42d
prometheus-operator     ClusterIP   None           <none>   8080/TCP                     42d
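
The three checks above can be complemented by blocking until the pods are actually ready. The following is a minimal sketch, not part of the original procedure: the function name wait_for_monitoring and the 300s default timeout are assumptions chosen for illustration, while kubectl wait itself is a standard kubectl subcommand.

```shell
# Wait until every pod in a namespace reports the Ready condition.
# $1: namespace (defaults to "monitoring"), $2: timeout (defaults to 300s).
# The function name and defaults are illustrative assumptions.
wait_for_monitoring() {
  wfm_ns="${1:-monitoring}"
  wfm_timeout="${2:-300s}"
  kubectl -n "$wfm_ns" wait --for=condition=Ready pod --all --timeout="$wfm_timeout"
}

# Example:
#   wait_for_monitoring && kubectl get svc -n monitoring
```

If the command times out, kubectl get pod -n monitoring shows which pod is still pending or restarting.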
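  • Change the Prometheus service to NodePort (nodePort 8098 lies outside the default 30000-32767 range, so the API server's --service-node-port-range must allow it):
prometheus-service.yaml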
apiVersion: v1
kind: Service
metadata:
  labels:
    prometheus: k8s
  name: prometheus-k8s
  namespace: monitoring
spec:
  type: NodePort
  ports:
  - name: web
    port: 9090
    nodePort: 8098
    targetPort: 9090
  selector:
    app: prometheus
    prometheus: k8s
  sessionAffinity: ClientIP
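
Once the service is of type NodePort, Prometheus can be reached from outside the cluster on any node IP. As a hedged sketch, the helper below runs an instant query through the Prometheus HTTP API (/api/v1/query). The function name prom_query and the address 192.0.2.10 are placeholders, not values from this setup.

```shell
# Run an instant PromQL query against a Prometheus exposed on a NodePort.
# $1: node IP, $2: NodePort, $3: PromQL expression.
# curl -G sends the URL-encoded expression as a GET query parameter.
prom_query() {
  curl -sG "http://$1:$2/api/v1/query" --data-urlencode "query=$3"
}

# Example (192.0.2.10 is a placeholder node IP; 8098 is the nodePort set above):
#   prom_query 192.0.2.10 8098 'up'
```

A JSON response with "status":"success" indicates that the service is reachable and that Prometheus is answering queries.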

2. Prometheus persistent storage

Prometheus runs as a pod. If the pod dies, the data stored inside the container is lost and historical data can no longer be queried, so persistent storage must be configured. This article uses ChubaoFS CSI for that. Official link: chubaofs.readthedocs.io/zh_CN/lates…

Deploy the Chubao CSI plugin

kubectl apply -f deploy/csi-rbac.yaml
kubectl apply -f deploy/csi-controller-deployment.yaml
kubectl apply -f deploy/csi-node-daemonset.yaml

Create StorageClass

Change masterAddr to match your own ChubaoFS cluster.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: chubaofs-sc-pro
provisioner: csi.chubaofs.com
reclaimPolicy: Delete
parameters:
  masterAddr: "****"
  owner: "****"
  consulAddr: "*****"
  logLevel: "info"

Add the following labels to the nodes

chubaofs-csi-controller=enabled
chubaofs-csi-node=enabled
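
The labels above can be applied with kubectl label. This is a small sketch under the assumption that a node should carry both labels. The function name label_chubaofs_node is hypothetical, and the node name has to come from kubectl get nodes.

```shell
# Attach both ChubaoFS CSI labels to one node.
# $1: node name as shown by `kubectl get nodes`.
label_chubaofs_node() {
  kubectl label node "$1" chubaofs-csi-controller=enabled chubaofs-csi-node=enabled
}

# Example (node-1 is a placeholder):
#   label_chubaofs_node node-1
#   kubectl get nodes -l chubaofs-csi-node=enabled
```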

Configure Prometheus

Add the storage section to the Prometheus YAML:

storage:                                  # ---- persistent storage: set the StorageClass
  volumeClaimTemplate:
    spec:
      storageClassName: chubaofs-sc-test  # ---- must be the name of an existing StorageClass
      resources:
        requests:
          storage: 200Gi

Deployment:

cd /home/work/zhanglei/kube-prometheus-0.3.0/manifests/setup && kubectl apply -f .
cd /home/work/zhanglei/kube-prometheus-0.3.0/manifests && kubectl apply -f .

3. Monitor nodes outside the cluster using Prometheus Operator

A ServiceMonitor can be used to scrape metrics from other plug-ins. This article uses it mainly to monitor the GPU memory allocation and usage of each container under a shared-GPU plug-in.

  • First, define a Service and matching Endpoints that expose the monitoring metrics on port 10255
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: kubelet
  name: test-exporter
  #namespace: monitoring
  namespace: kube-system
spec:
  clusterIP: None
  ports:
  - name: port
    port: 10255
    targetPort: port
---
apiVersion: v1
kind: Endpoints
metadata:
  name: test-exporter
  #namespace: monitoring
  namespace: kube-system
  labels:
    k8s-app: kubelet
subsets:
- addresses:
  - ip: 100.200.200.200
  - ip: 100.200.200.201
  ports:
  - name: port
    port: 10255
    protocol: TCP
  • Then add the endpoints to scrape to the ServiceMonitor
...
endpoints:
- port: port
  interval: 30s
  scheme: http
  path: /metrics/cadvisor
...
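
The fragment above shows only the endpoints section. For orientation, here is a sketch of what a complete, standalone ServiceMonitor for the test-exporter Service could look like. This is an assumption rather than the author's manifest: the original may instead extend an existing ServiceMonitor, and the metadata (name, namespace, label) and the helper name write_servicemonitor are illustrative.

```shell
# Write a standalone ServiceMonitor manifest to the given path.
# Metadata values are illustrative assumptions; the selector and the
# endpoint settings mirror the Service and fragment shown above.
write_servicemonitor() {
  cat > "$1" <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: test-exporter
  namespace: monitoring
  labels:
    k8s-app: test-exporter
spec:
  endpoints:
  - port: port
    interval: 30s
    scheme: http
    path: /metrics/cadvisor
  selector:
    matchLabels:
      k8s-app: kubelet
  namespaceSelector:
    matchNames:
    - kube-system
EOF
}

# Example:
#   write_servicemonitor test-exporter-sm.yaml
#   kubectl apply -f test-exporter-sm.yaml
```

The selector matches Services by label and the endpoint is matched by port name, so only Services in kube-system that carry k8s-app: kubelet and have a port named "port" are scraped.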

2. Build the Grafana monitoring panel

Grafana can be deployed either through Prometheus Operator or with docker run. This article deploys it with docker run.

1. Deployment

docker run -d -p 3000:3000 grafana/grafana

[Note] With this deployment method, remember to back up the container image once the panels have been built, in case the container dies and the data is lost.
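
As an alternative to backing up the whole image, the Grafana data directory can be copied out of the container. The sketch below assumes the data lives in /var/lib/grafana (the default data path of the grafana/grafana image). The function name backup_grafana and the destination path are hypothetical.

```shell
# Copy the Grafana data directory out of a container onto the host.
# $1: container name or ID, $2: destination directory on the host.
backup_grafana() {
  docker cp "$1:/var/lib/grafana" "$2"
}

# Example (container name and destination are placeholders):
#   backup_grafana grafana /data/grafana-backup
```

Mounting a host directory at /var/lib/grafana when starting the container, as in the next section, avoids the need for this kind of manual copy.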

2. Modifying the configuration file in a Docker deployment

  • Anonymous logins

    When Grafana is embedded in another system and the login step should be hidden, anonymous access has to be enabled in the configuration file

    [auth.anonymous]
    # enable anonymous access
    enabled = true
    # specify organization name that should be used for unauthenticated users
    org_name = Main Org.
    # specify role for unauthenticated users
    org_role = Viewer

    With a Docker deployment, mount the Grafana configuration directory from the host so that the configuration file can be modified

    ## Normal start, mounting the data disk
    docker run -d --name grafana -p 3000:3000 -v /data/grafana:/var/lib/grafana grafana/grafana
    ## Copy the configuration file out of the container
    docker cp grafana:/etc/grafana/grafana.ini /data/grafana-data/etc/
    ## Modify the configuration file, e.g. add a domain name, change the port to 80, ...
    ## Kill and remove the old container, then start a new one with the configuration directory mounted
    docker kill grafana
    docker rm grafana
    docker run --user root -d --name grafana -p 318:3000 -v /data/grafana-data/etc:/etc/grafana/ grafana/grafana
  • Cross domain access

    Embedding Grafana in other systems usually requires cross-domain access, which also requires changes to the configuration file

    Settings to modify in the ini configuration file (allow_embedding belongs to the [security] section):
    allow_embedding = true
    kiosk = tv
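
Grafana also reads settings from environment variables of the form GF_<SECTION>_<KEY>, which avoids copying grafana.ini out of the container. The sketch below is an alternative to editing the file, not the method used in this article. It sets the anonymous-access and embedding options discussed above, and the function name run_grafana_embedded is hypothetical.

```shell
# Start Grafana with anonymous viewer access and embedding enabled,
# passing the settings as GF_<SECTION>_<KEY> environment variables
# instead of editing grafana.ini.
run_grafana_embedded() {
  docker run -d --name grafana -p 3000:3000 \
    -e GF_AUTH_ANONYMOUS_ENABLED=true \
    -e GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer \
    -e GF_SECURITY_ALLOW_EMBEDDING=true \
    -v /data/grafana:/var/lib/grafana \
    grafana/grafana
}

# Example:
#   run_grafana_embedded
```

Settings passed this way override the values in grafana.ini, so the two approaches should not be mixed for the same key.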

3. Panel construction

Reference: kalacloud.com/blog/grafan…

4. Grafana permission management

  • Add users and user groups

    Reference: blog.csdn.net/qq_34355232…

  • Add permission management for Dashboard

    First, dashboard permissions are inherited from the permissions of the containing folder, so a small lock icon appears next to some existing permissions to show that they cannot be changed there. To change them, you must change the permissions of the corresponding folder. In other words, the permissions of a Grafana dashboard are, by default, its folder's permissions.

To add permissions to a single dashboard, go to “Dashboard Settings” --> “Permissions” --> “Add Permission” and add permissions for a user or group.