Inspired by PromQL, Loki has its own query language, LogQL. The official documentation describes it as a distributed grep that aggregates log sources. Like PromQL, LogQL uses labels and operators for filtering. A query has two main parts:
- Log Stream selector
- Filter expression
We can combine these two parts to get the functionality we want from Loki. Typically we use them to:
- View log content selected by a log stream selector
- Compute metrics from log streams based on filter rules
Log stream selector
The log stream selector uses the same syntax as PromQL label selectors. It uses the labels attached to the collected logs to determine which log streams to query. The following label matching operators are supported:
- =: exactly equal
- !=: not equal
- =~: matches the regular expression
- !~: does not match the regular expression
For example:
{name=~"mysql.+", env="prod"}
{name!~"mysql.+", env="prod"}
{name!~`mysql-\d+`, env="prod"}
Each of the selectors above returns all log streams whose labels match.
Filter expression
When reading full-text logs, we often use tools such as grep to find the content we care about. LogQL filter expressions do the same job. The following filter operators are currently supported:
- |=: the log line contains the string
- !=: the log line does not contain the string
- |~: the log line matches the regular expression
- !~: the log line does not match the regular expression
For example:
{job="mysql"} |= "error"
{name="kafka"} |~ "tsdb-ops.*io:2003"
{name="cassandra"} |~ `error=\w+`
{instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"
To apply more than one filter, we can chain them, much like piping commands in Linux:
{job="mysql"} |= "error" != "timeout"
Log metrics
LogQL also supports computing metrics from log streams with functions, typically to calculate the error rate of messages or to rank the top N sources of application log output over a period of time.
Range vector
LogQL supports a limited set of range vector functions, which are used in the same way as in PromQL. There are four common functions:
- rate: number of log entries per second
- count_over_time: number of entries in each log stream within the given range
- bytes_rate: number of bytes per second in each log stream
- bytes_over_time: number of bytes used by each log stream within the given range
Here’s an example:
# Compute nginx QPS
rate({filename="/var/log/nginx/access.log"}[5m])
# Count kernel OOM kills in the last five minutes
count_over_time({filename="/var/log/message"} |~ "oom_kill_process" [5m])
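The two byte-oriented functions from the list are used the same way. A minimal sketch, reusing the nginx access log path from the example above as the assumed stream:

```logql
# Bytes of log data written per second, averaged over the last 5 minutes
bytes_rate({filename="/var/log/nginx/access.log"}[5m])

# Total bytes of log data written in the last 5 minutes
bytes_over_time({filename="/var/log/nginx/access.log"}[5m])
```

These are handy for spotting which stream is producing the most log volume rather than the most lines.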
Aggregation function
LogQL also supports aggregation operations, which can be used to aggregate elements within a single vector to produce a new vector with fewer elements. The currently supported aggregation functions are as follows:
- sum: sum
- min: minimum value
- max: maximum value
- avg: average value
- stddev: population standard deviation
- stdvar: population standard variance
- count: number of elements
- bottomk: the smallest k elements
- topk: the largest k elements
Aggregate functions are usually described by the following expression:
<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]
To group by labels, use without or by, for example:
# Compute nginx QPS, grouped by pod_name
sum(rate({filename="/var/log/nginx/access.log"}[5m])) by (pod_name)
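For contrast, here is a sketch of the without form, which drops the listed labels and groups by all the remaining ones. It assumes, as the example above does, that the streams carry a pod_name label:

```logql
# Sum nginx QPS across pods: pod_name is dropped, every other label
# (for example filename) stays as a grouping key
sum(rate({filename="/var/log/nginx/access.log"}[5m])) without (pod_name)
```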
Only bottomk and topk take a parameter, for example:
# Top 5 nginx QPS values, grouped by pod_name
topk(5, sum(rate({filename="/var/log/nginx/access.log"}[5m])) by (pod_name))
Mathematical calculations
You may ask: Loki stores logs, which are all text, so how can we do arithmetic on them? The arithmetic in LogQL applies to the numeric results of range vector queries. LogQL supports the following binary operators:
- +: addition
- -: subtraction
- *: multiplication
- /: division
- %: modulo
- ^: exponentiation
For example, to find the error rate in a business log, we can compute:
# Error rate within the logs of app foo
sum(rate({app="foo", level="error"}[1m])) / sum(rate({app="foo"}[1m]))
Set operations
Set operations are only valid between vectors produced by range vector queries. The following are currently supported:
- and: intersection
- or: union
- unless: complement
The author has not yet found a practical case for LogQL set operations, so this part is skipped for now.
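As a contrived illustration only (not a real-world case), the three operators could be sketched as follows, assuming streams labeled app="foo" and app="bar" exist. Series are matched on their full label sets:

```logql
# and: keeps left-hand series that also exist on the right,
# so only the app="bar" series remain
rate({app=~"foo|bar"}[1m]) and rate({app="bar"}[1m])

# or: union of the series on both sides
rate({app="foo"}[1m]) or rate({app="bar"}[1m])

# unless: keeps left-hand series that have no match on the right,
# so only the app="foo" series remain
rate({app=~"foo|bar"}[1m]) unless rate({app="bar"}[1m])
```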
Comparison operations
LogQL supports the same comparison operators as PromQL:
- ==: equal to
- !=: not equal to
- >: greater than
- >=: greater than or equal to
- <: less than
- <=: less than or equal to
Typically we compute a range vector and then compare the result with a threshold, which is very useful for alerting, for example:
count_over_time({app="foo", level="error"}[5m]) > 10
We can also use the bool modifier, which makes the comparison return 0 or 1 instead of filtering the results, for example:
count_over_time({app="foo", level="error"}[5m]) > bool 10
Later on, there are more scenarios that use this together with the Loki ruler. It is best read alongside "The Correct Posture of Loki Alarm".
Operator precedence
LogQL operators follow the usual mathematical order of operations. From highest to lowest precedence:
- ^
- *, /, %
- +, -
- ==, !=, <=, <, >=, >
- and, unless
- or
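To see the precedence rules at work, here is a small sketch that combines arithmetic and comparison. The app and level labels come from the earlier examples; the 5 percent threshold is an arbitrary value chosen for illustration:

```logql
# Percentage of error lines for app foo, compared with a threshold.
# / and * bind tighter than >, and are evaluated left to right.
sum(rate({app="foo", level="error"}[1m])) / sum(rate({app="foo"}[1m])) * 100 > 5

# The same query with the implicit grouping written out
((sum(rate({app="foo", level="error"}[1m])) / sum(rate({app="foo"}[1m]))) * 100) > 5
```

When in doubt, adding explicit parentheses costs nothing and makes an alert rule easier to review.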
Want a better cloud native log query experience? Check out our open source project Dagger.