Public account: Operation and Development Story. Author: Hua Zi

One, foreword

Why write a metrics exporter in Golang? It started with one of our customers. Their K8S network uses OVS with two bonds, bond0 and bond1, and each bond has two network cards. After going into production, I was asked to check every day whether the network cards were healthy, because one of them had gone DOWN before. I am lazy and did not want to check by hand. Since Prometheus metrics end up displayed in Grafana anyway, I figured I could just look for abnormal network cards in Grafana. I also happened to be learning Go and wanted some practice. I asked our R&D colleagues, who said it was simple, encouraged me to try, and offered to help if I ran into difficulties. So I decided to give it a try.

Two, environment

component   version   note
k8s         v1.14
ovs         v2.9.5
go          1.14.1

Three, goals

The goal is for Prometheus to pull the status of the OVS bonds on my hosts. That means writing a Go program that gets the OVS bond information of the host and exposes it as metrics for Prometheus to scrape and Grafana to display. For example:

# Get the current bond information
[root@test~]$ ovs-appctl  bond/show |grep '^slave' |grep -v grep  |awk '{print $2""$3}'
a1-b1:enabled
a2-b2:enabled
a3-b3:enabled
a4-b4:disabled
# 5 means the command to obtain bond information failed; 0-4 is the number of NICs in the disabled state
curl http://$IP:$PORT/metrics
ovs_bond_status{component="ovs"} 5
ovs_bond_status{component="ovs",a1b1="enabled",a2b2="disabled",a3b3="enabled",a4b4="disabled"} 2
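
To make the target output concrete, here is a minimal sketch of how one sample line in the Prometheus text format is put together. The helper name formatSample is hypothetical and exists only for illustration; in the real program client_golang produces this output, so nothing like this needs to be written by hand.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatSample (hypothetical helper) renders one sample in the Prometheus
// text exposition format: name{label="value",...} value
// Labels are emitted in sorted order so the output is deterministic.
func formatSample(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		// %q adds the surrounding double quotes; for plain values such as
		// "enabled" this matches the quoting used by the text format.
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

func main() {
	labels := map[string]string{
		"component": "ovs",
		"a1b1":      "enabled",
		"a2b2":      "disabled",
	}
	// Prints: ovs_bond_status{a1b1="enabled",a2b2="disabled",component="ovs"} 1
	fmt.Println(formatSample("ovs_bond_status", labels, 1))
}
```

The label order differs from the example above only because this sketch sorts the labels; Prometheus does not attach meaning to label order.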



Four, approach

  1. Since the metrics are scraped by Prometheus, the bond information must be exposed in the Prometheus metrics format, which is documented on the Prometheus website.

  2. There are two NICs under each bond, four in total, and each NIC is either enabled or disabled. The digits 0 to 4 therefore indicate how many NICs are disabled, and 5 indicates that the command used to obtain the bond information failed.

  3. The output of the command is processed and put into the metric. Note: a metric label name cannot contain "-".

  4. The bond information returned by the shell command is stored in a map, where the key is the NIC name and the value is the NIC status.

  5. See client_golang/prometheus for reference.
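
The note about "-" in point 3 follows from the Prometheus rule that label names must match [a-zA-Z_][a-zA-Z0-9_]*. As a rough sketch of handling this more generally than stripping hyphens only, the hypothetical helper sanitizeLabel below drops every disallowed character. It is an illustration under that assumption, not the code used in this article.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidLabelChars matches every character that is not allowed
// anywhere in a Prometheus label name.
var invalidLabelChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitizeLabel (hypothetical helper) drops disallowed characters and
// prefixes "_" when the result would be empty or start with a digit.
func sanitizeLabel(name string) string {
	clean := invalidLabelChars.ReplaceAllString(name, "")
	if clean == "" || (clean[0] >= '0' && clean[0] <= '9') {
		clean = "_" + clean
	}
	return clean
}

func main() {
	for _, n := range []string{"a1-b1", "eth0.100", "0nic"} {
		fmt.Printf("%s -> %s\n", n, sanitizeLabel(n))
	}
}
```

For the NIC names in this article the result is the same as removing the hyphen: a1-b1 becomes a1b1.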

Five, implementation

Run the shell command to obtain bond information

# Get the current bond information
[root@test~]$ ovs-appctl  bond/show |grep '^slave' |grep -v grep  |awk '{print $2""$3}'
a1-b1:enabled
a2-b2:enabled
a3-b3:enabled
a4-b4:disabled

Process the shell output

// Execute the shell command, process the output, and log any errors.
// Returns a map of NIC name to NIC status.
// If the command fails, or succeeds but returns nothing, the map holds only a "msg" key.
func getBondStatus() (m map[string]string) {
	result, err := exec.Command("bash", "-c", "ovs-appctl bond/show | grep '^slave' | grep -v grep | awk '{print $2\"\"$3}'").Output()
	if err != nil {
		log.Error("result: ", string(result))
		log.Error("command failed: ", err.Error())
		m = make(map[string]string)
		m["msg"] = "failure"
		return m
	} else if len(result) == 0 {
		log.Error("command exec failed, result is null")
		m = make(map[string]string)
		m["msg"] = "return null"
		return m
	}
	// Process the result: first trim the whitespace on both ends
	ret := strings.TrimSpace(string(result))
	// Split on newlines
	tt := strings.Split(ret, "\n")
	//tt := []string{"a1-b1:enabled","a2-b2:disabled"}
	// If the key contains "-", it must be removed
	var nMap = make(map[string]string)
	for i := 0; i < len(tt); i++ {
		// if key contains "-"
		if strings.Contains(tt[i], "-") == true {
			nKey := strings.Split(strings.Split(tt[i], ":")[0], "-")
			nMap[strings.Join(nKey, "")] = (strings.Split(tt[i], ":"))[1]
		} else {
			nMap[(strings.Split(tt[i], ":"))[0]] = (strings.Split(tt[i], ":"))[1]
		}
	}
	return nMap
}
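
The parsing step can also be separated from the command execution so that it can be tried without OVS installed. The sketch below assumes the same "name:status" line format shown earlier; parseBondOutput is a hypothetical helper, and unlike the function above it skips lines without a ":" instead of indexing past the end of the split result.

```go
package main

import (
	"fmt"
	"strings"
)

// parseBondOutput (hypothetical helper) turns lines such as "a1-b1:enabled"
// into a map of NIC name (hyphens removed) to status.
// Lines without a ":" separator are skipped instead of causing a panic.
func parseBondOutput(out string) map[string]string {
	nics := make(map[string]string)
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		parts := strings.SplitN(strings.TrimSpace(line), ":", 2)
		if len(parts) != 2 || parts[0] == "" {
			continue
		}
		name := strings.ReplaceAll(parts[0], "-", "")
		nics[name] = parts[1]
	}
	return nics
}

func main() {
	out := "a1-b1:enabled\na2-b2:disabled\nmalformed line\n"
	// Prints: map[a1b1:enabled a2b2:disabled]
	fmt.Println(parseBondOutput(out))
}
```

With this split, getBondStatus would only need to run the command and hand its output to the parser.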

Define metrics

// define a struct
type ovsCollector struct {
    // More than one can be defined
	ovsMetric *prometheus.Desc
}

func (collector *ovsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- collector.ovsMetric
}

// NIC names (used as label names)
var vLable = []string{}
// NIC status (used as label values)
var vValue = []string{}
// Fixed label indicating OVS
var constLabel = prometheus.Labels{"component": "ovs"}

// define metric
func newOvsCollector() *ovsCollector {
	var rm = make(map[string]string)
	rm = getBondStatus()
	if _, ok := rm["msg"]; ok {
		log.Error("command execute failed:", rm["msg"])
	} else {
		// Get only the names of the network adapters
		for k, _ := range rm {
			// get the net
			vLable = append(vLable, k)
		}
	}
	// metric
	return &ovsCollector{
		ovsMetric: prometheus.NewDesc("ovs_bond_status", "Show ovs bond status", vLable,
			constLabel),
	}
}

Set the metric value

// If the command succeeds, put the NICs, their status, and the number of abnormal NICs into the metric
func (collector *ovsCollector) Collect(ch chan<- prometheus.Metric) {
	var metricValue float64
	var rm = make(map[string]string)
	rm = getBondStatus()
	if _, ok := rm["msg"]; ok {
		log.Error("command exec failed")
		metricValue = 5
		ch <- prometheus.MustNewConstMetric(collector.ovsMetric, prometheus.CounterValue, metricValue)
	} else {
		vValue = vValue[0:0]
		// Take the values in vLable order so each value lines up with its label name
		// (Go map iteration order is not stable between calls)
		for _, k := range vLable {
			v := rm[k]
			vValue = append(vValue, v)
			// Count the disabled NICs
			if v == "disabled" {
				metricValue++
			}
		}
		ch <- prometheus.MustNewConstMetric(collector.ovsMetric, prometheus.CounterValue, metricValue, vValue...)
	}
}
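
MustNewConstMetric matches label values to label names by position, and Go does not guarantee the iteration order of a map, so names and values have to be derived in one consistent order. A minimal sketch of one way to do that is to sort the NIC names and read the values in that order; orderedLabels is a hypothetical helper, not part of client_golang.

```go
package main

import (
	"fmt"
	"sort"
)

// orderedLabels (hypothetical helper) returns the NIC names in sorted order,
// the matching status values in the same order, and the number of disabled NICs.
func orderedLabels(nics map[string]string) (names, values []string, disabled float64) {
	for name := range nics {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		status := nics[name]
		values = append(values, status)
		if status == "disabled" {
			disabled++
		}
	}
	return names, values, disabled
}

func main() {
	nics := map[string]string{"a2b2": "disabled", "a1b1": "enabled", "a3b3": "disabled"}
	names, values, disabled := orderedLabels(nics)
	// Prints: [a1b1 a2b2 a3b3] [enabled disabled disabled] 2
	fmt.Println(names, values, disabled)
}
```

The names slice would feed prometheus.NewDesc and the values slice would feed MustNewConstMetric, keeping both in step.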

Program entry point

func main() {
	ovs := newOvsCollector()
	prometheus.MustRegister(ovs)

	http.Handle("/metrics", promhttp.Handler())

	log.Info("begin to server on port 8080")
	// listen on port 8080
	log.Fatal(http.ListenAndServe(":8080", nil))
}

The complete code

package main

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	log "github.com/sirupsen/logrus"
	"net/http"
	"os/exec"
	"strings"
)

// define a struct from prometheus's struct named Desc
type ovsCollector struct {
	ovsMetric *prometheus.Desc
}

func (collector *ovsCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- collector.ovsMetric
}

var vLable = []string{}
var vValue = []string{}
var constLabel = prometheus.Labels{"component": "ovs"}

// get the value of the metric from a function that executes a command and returns a float64 value
func (collector *ovsCollector) Collect(ch chan<- prometheus.Metric) {
	var metricValue float64
	var rm = make(map[string]string)
	rm = getBondStatus()
	if _, ok := rm["msg"]; ok {
		log.Error("command exec failed")
		metricValue = 5
		ch <- prometheus.MustNewConstMetric(collector.ovsMetric, prometheus.CounterValue, metricValue)
	} else {
		vValue = vValue[0:0]
		// read values in vLable order so they line up with the label names
		for _, k := range vLable {
			v := rm[k]
			vValue = append(vValue, v)
			if v == "disabled" {
				metricValue++
			}
		}
		ch <- prometheus.MustNewConstMetric(collector.ovsMetric, prometheus.CounterValue, metricValue, vValue...)
	}
}

// Define metric's name, help
func newOvsCollector() *ovsCollector {
	var rm = make(map[string]string)
	rm = getBondStatus()
	if _, ok := rm["msg"]; ok {
		log.Error("command execute failed:", rm["msg"])
	} else {
		for k, _ := range rm {
			// get the net
			vLable = append(vLable, k)
		}
	}
	return &ovsCollector{
		ovsMetric: prometheus.NewDesc("ovs_bond_status", "Show ovs bond status", vLable,
			constLabel),
	}
}

func getBondStatus() (m map[string]string) {
	result, err := exec.Command("bash", "-c", "ovs-appctl bond/show | grep '^slave' | grep -v grep | awk '{print $2\"\"$3}'").Output()
	if err != nil {
		log.Error("result: ", string(result))
		log.Error("command failed: ", err.Error())
		m = make(map[string]string)
		m["msg"] = "failure"
		return m
	} else if len(result) == 0 {
		log.Error("command exec failed, result is null")
		m = make(map[string]string)
		m["msg"] = "return null"
		return m
	}
	ret := strings.TrimSpace(string(result))
	tt := strings.Split(ret, "\n")
	var nMap = make(map[string]string)
	for i := 0; i < len(tt); i++ {
		// if key contains "-"
		if strings.Contains(tt[i], "-") == true {
			nKey := strings.Split(strings.Split(tt[i], ":")[0], "-")
			nMap[strings.Join(nKey, "")] = (strings.Split(tt[i], ":"))[1]
		} else {
			nMap[(strings.Split(tt[i], ":"))[0]] = (strings.Split(tt[i], ":"))[1]
		}
	}
	return nMap
}

func main() {
	ovs := newOvsCollector()
	prometheus.MustRegister(ovs)

	http.Handle("/metrics", promhttp.Handler())

	log.Info("begin to server on port 8080")
	// listen on port 8080
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Six, deployment

Because the program will ultimately be deployed to a K8S environment, build the image first. The Dockerfile is shown below:

FROM golang:1.14.1 AS builder
WORKDIR /go/src
COPY ./ .
RUN go build -o ovs_check main.go

# runtime
FROM centos:7.7
COPY --from=builder /go/src/ovs_check /xiyangxixia/ovs_check
ENTRYPOINT ["/xiyangxixia/ovs_check"]


The YAML I use here looks like this:

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ovs-agent
  namespace: kube-system
spec:
  minReadySeconds: 5
  selector:
    matchLabels:
      name: ovs-agent
  template:
    metadata:
      annotations:
        # Tell Prometheus which port and path to scrape
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
      labels:
        name: ovs-agent
    spec:
      containers:
      - name: ovs-agent
        image: ovs_bond:v1
        imagePullPolicy: IfNotPresent
        resources:
            limits:
              cpu: 100m
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
        securityContext:
          privileged: true
          procMount: Default
        volumeMounts:
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /var/run/openvswitch
          name: ovs-run
        - mountPath: /usr/bin/ovs-appctl
          name: ovs-bin
          subPath: ovs-appctl
      serviceAccountName: xiyangxixia
      hostPID: true
      hostIPC: true
      volumes:
      - hostPath:
          path: /lib/modules
          type: ""
        name: lib-modules
      - hostPath:
          path: /var/run/openvswitch
          type: ""
        name: ovs-run
      - hostPath:
          path: /usr/bin/
          type: ""
        name: ovs-bin
  updateStrategy:
    type: RollingUpdate

Seven, testing

[root@test~]$ kubectl get po -n kube-system -o wide | grep ovs
ovs-agent-h8zc6   1/1   Running   0   2d14h   10.211.55.41   master-1   <none>   <none>
[root@test~]$ curl 10.211.55.41:8080/metrics | grep ovs_bond
# HELP ovs_bond_status Show ovs bond status
# TYPE ovs_bond_status counter
ovs_bond_status{component="ovs",a1b1="enabled",a2b2="enabled",a3b3="enabled",a4b4="enabled"} 0
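
The same check can be scripted instead of read by eye. The sketch below is a rough illustration that assumes the metrics text has already been fetched (for example with curl as above); bondStatusValue is a hypothetical helper that pulls the sample value out of the text. By the convention of this article, 0 means all NICs are enabled, 1 to 4 is the number of disabled NICs, and 5 means the command failed.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// bondStatusValue (hypothetical helper) scans metrics text for the first
// ovs_bond_status sample and returns its numeric value.
func bondStatusValue(metrics string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(metrics))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		// Skip "# HELP" / "# TYPE" comment lines and unrelated metrics.
		if !strings.HasPrefix(line, "ovs_bond_status") {
			continue
		}
		// The value is the last whitespace-separated field,
		// because this exporter emits no timestamp.
		fields := strings.Fields(line)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	sample := "# HELP ovs_bond_status Show ovs bond status\n" +
		"# TYPE ovs_bond_status counter\n" +
		"ovs_bond_status{component=\"ovs\",a1b1=\"enabled\",a4b4=\"disabled\"} 1\n"
	v, ok := bondStatusValue(sample)
	// Prints: 1 true
	fmt.Println(v, ok)
}
```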

Eight, summary

That is all for this article. Please forgive my limited skill, which only allows for a rough introduction. Thanks to everyone who has been following the public account!