Since the release of Loki 2.0, LogQL v2 has given Loki log analysis capabilities through a rich set of query functions. When an application does not expose metrics of its own, we can use LogQL to build log-based metrics instead, mainly through aggregation queries.

Common operations

Those familiar with PromQL will know the common aggregations such as sum, rate, and count. Loki has two common types of range aggregation.

The first type, log range aggregation, operates on whole log entries

The following operation functions are supported:

  • rate(log-range): calculates the number of log entries per second
  • count_over_time(log-range): counts the entries of each log stream within the given range
  • bytes_rate(log-range): calculates the number of bytes per second for each log stream
  • bytes_over_time(log-range): counts the bytes used by each log stream within the given range
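
To make the four functions above concrete, here is a minimal Python sketch of what each one computes for a single log stream. The Entry type, the helper names, and the choice of a half-open window (now - window, now] are assumptions made for illustration; this is not Loki's implementation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    ts: float  # Unix timestamp in seconds
    line: str  # raw log line


def in_window(entries, now, window_s):
    # Assumption: the range covers the half-open interval (now - window_s, now].
    return [e for e in entries if now - window_s < e.ts <= now]


def count_over_time(entries, now, window_s):
    # Number of log entries inside the window.
    return len(in_window(entries, now, window_s))


def rate(entries, now, window_s):
    # Entries per second: the count divided by the window length.
    return count_over_time(entries, now, window_s) / window_s


def bytes_over_time(entries, now, window_s):
    # Total size in bytes of the log lines inside the window.
    return sum(len(e.line.encode("utf-8")) for e in in_window(entries, now, window_s))


def bytes_rate(entries, now, window_s):
    # Bytes per second: the byte total divided by the window length.
    return bytes_over_time(entries, now, window_s) / window_s


stream = [Entry(0, "old"), Entry(10, "error a"), Entry(30, "error bb"), Entry(60, "ok")]
print(count_over_time(stream, now=60, window_s=60))
print(rate(stream, now=60, window_s=60))
print(bytes_over_time(stream, now=60, window_s=60))
```

The point of the sketch is that rate is just count_over_time divided by the range length in seconds, and bytes_rate relates to bytes_over_time in the same way.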

For example, get the per-second rate, per host, of non-timeout errors in the MySQL logs over the last minute, keeping only entries whose duration is greater than 10s:

sum by (host) (rate({job="mysql"} |= "error" != "timeout" | json | duration > 10s [1m]))

The second type, unwrapped range aggregation, extracts the value of a label and uses it as the sample

Note that to select the label to sample correctly, the log query must end with an unwrap expression, optionally followed by a label filter expression that discards errors. For example, we often append | __error__ = "" to filter out log lines that failed to parse.

The functions supported on unwrapped ranges include:

  • rate(unwrapped-range): calculates the per-second rate of the sum of all values within the specified interval
  • sum_over_time(unwrapped-range): the sum of all values within the specified interval
  • avg_over_time(unwrapped-range): the average of all values within the specified interval
  • max_over_time(unwrapped-range): the maximum of all values within the specified interval
  • min_over_time(unwrapped-range): the minimum of all values within the specified interval
  • stdvar_over_time(unwrapped-range): the population standard variance of the values within the specified interval
  • stddev_over_time(unwrapped-range): the population standard deviation of the values within the specified interval
  • quantile_over_time(scalar, unwrapped-range): the φ-quantile (0 ≤ φ ≤ 1) of the values within the specified interval
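
The unwrap step and the functions listed above can be sketched in Python as follows. This is a simplified model, not Loki's code: each parsed log line is represented as a plain dict of labels, and the helper names unwrap and unwrapped_aggregations are hypothetical. Entries that carry a non-empty __error__ label are dropped, which mirrors the | __error__ = "" filter; as a simplification, missing or non-numeric values are dropped as well.

```python
import statistics


def unwrap(entries, label):
    """Return the numeric values of `label`, skipping entries that carry an error."""
    samples = []
    for labels in entries:
        if labels.get("__error__", "") != "":
            continue  # same effect as the | __error__ = "" filter
        try:
            samples.append(float(labels[label]))
        except (KeyError, ValueError):
            continue  # simplification: missing or non-numeric values are dropped
    return samples


def unwrapped_aggregations(samples, window_s):
    # Each key names the LogQL function that the value models.
    # Assumes `samples` is non-empty.
    return {
        "rate": sum(samples) / window_s,
        "sum_over_time": sum(samples),
        "avg_over_time": statistics.fmean(samples),
        "max_over_time": max(samples),
        "min_over_time": min(samples),
        "stdvar_over_time": statistics.pvariance(samples),
        "stddev_over_time": statistics.pstdev(samples),
    }


# Hypothetical parsed entries falling inside one 60-second range.
window = [
    {"path": "/a", "request_time": "0.2"},
    {"path": "/a", "request_time": "0.4"},
    {"path": "/b", "request_time": "n/a"},
    {"__error__": "JSONParserErr"},
]
result = unwrapped_aggregations(unwrap(window, "request_time"), window_s=60)
print(result)
```

Only the first two entries contribute samples here, so every aggregation is computed over the values 0.2 and 0.4.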

For example, get the P99 request time of the ingress, grouped by path:

quantile_over_time(0.99, {cluster="ops-tools1",container="ingress-nginx"} | json | __error__ = "" | unwrap request_time [1m]) by (path)

Note that quantile_over_time is not an estimate, unlike the histogram_quantile you may be used to in Prometheus, which interpolates from bucket counts. Loki sorts all the values in the range and then calculates the 99th percentile.
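
As a rough illustration of "sort all the values, then pick the quantile", here is a minimal Python sketch. The function name quantile is hypothetical, and the linear interpolation between the two neighboring ranks is an assumption borrowed from the way Prometheus computes quantiles over raw samples; this article does not state which interpolation Loki uses.

```python
import math


def quantile(q, values):
    """Exact q-quantile (0 <= q <= 1) of `values`, computed from the sorted samples."""
    if not values:
        return math.nan
    ordered = sorted(values)
    # Fractional position of the quantile in the sorted list.
    rank = q * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(len(ordered) - 1, lower + 1)
    # Assumption: interpolate linearly between the two neighboring samples.
    weight = rank - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


# Hypothetical request times (seconds) unwrapped from one 1-minute range.
request_times = [0.12, 0.30, 0.08, 1.90, 0.25]
print(quantile(0.99, request_times))
```

Because every sample in the range takes part in the sort, the cost grows with the number of log lines in the range, which is the price of getting an exact value rather than a bucket-based estimate.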

About grouping

Loki's grouping differs from Prometheus in that some range aggregations can be grouped directly, without wrapping them in a vector aggregation. This applies to avg_over_time, max_over_time, min_over_time, stdvar_over_time, stddev_over_time, and quantile_over_time, and it is useful for aggregating data along specific dimensions.

For example, if we wanted to get the average latency of the ingress response by cluster, we could use:

avg_over_time({container="ingress-nginx",service="hosted-grafana"} | json | unwrap response_latency_seconds | __error__=""[1m]) by (cluster)
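
Conceptually, the by (cluster) clause partitions the unwrapped samples by the value of the cluster label before averaging. A minimal Python sketch of that idea follows; the function name avg_over_time_by and the sample data are hypothetical, and the samples are assumed to already lie inside a single range window.

```python
from collections import defaultdict


def avg_over_time_by(samples, group_label):
    """Average the values per distinct value of `group_label`.

    `samples` is an iterable of (labels, value) pairs from one range window.
    """
    groups = defaultdict(list)
    for labels, value in samples:
        # Every other label is discarded; only the grouping label survives.
        groups[labels.get(group_label, "")].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Hypothetical unwrapped response_latency_seconds samples.
samples = [
    ({"cluster": "a", "path": "/x"}, 1.0),
    ({"cluster": "a", "path": "/y"}, 3.0),
    ({"cluster": "b", "path": "/x"}, 5.0),
]
print(avg_over_time_by(samples, "cluster"))
```

The result has one value per cluster, regardless of how many other labels the parser extracted from each line.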

For other operations, we can also use sum by (..), exactly as in PromQL.

For example, we want to group the request rates for the different status codes of the ingress:

sum by (response_status) (rate({container="ingress-nginx",service="hosted-grafana"} | json | __error__="" [1m]))

As you can see, LogQL is quite powerful: it extracts labels for grouping, and parses and computes over log data to generate new metrics. When using the logfmt and json parsers in metric queries, we should always remember to group the result, because without grouping the query result can include a large number of labels and easily hit the label limits in limits_config.
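
To see why grouping matters, consider this small Python sketch. It assumes that a parser turns every extracted field into a label, so that an ungrouped rate produces one series per unique label set, while sum by (response_status) collapses them. The helper names and the request_id field are hypothetical.

```python
from collections import Counter, defaultdict


def rate_per_series(entries, window_s):
    """One series per unique label set; `entries` holds one label dict per log line."""
    counts = Counter(frozenset(labels.items()) for labels in entries)
    return {series: n / window_s for series, n in counts.items()}


def sum_by(series_rates, label):
    """Collapse the series onto a single label, summing their rates."""
    grouped = defaultdict(float)
    for series, value in series_rates.items():
        grouped[dict(series).get(label, "")] += value
    return dict(grouped)


# Hypothetical parsed ingress lines inside one 60-second range.
entries = [
    {"response_status": "200", "path": "/a", "request_id": "1"},
    {"response_status": "200", "path": "/b", "request_id": "2"},
    {"response_status": "500", "path": "/a", "request_id": "3"},
]
ungrouped = rate_per_series(entries, window_s=60)
grouped = sum_by(ungrouped, "response_status")
print(len(ungrouped), len(grouped))
```

A high-cardinality field such as the request_id here makes every line its own series when the result is left ungrouped, whereas the grouped result has only as many series as there are status codes.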

Conclusion

Loki's range vector operations are very useful for calculating log volumes. Using LogQL parsers and unwrap expressions, we can quickly extract a new set of metrics from logs and see how the system is performing without changing any code.
