
About Container Logs

Docker logs fall into two categories: Docker engine logs and container logs. Engine logs are usually written to the system log, whose location varies by operating system. This article focuses on container logs, i.e. the logs generated by the application running inside a container. By default, `docker logs` shows the output of the running container, including STDOUT and STDERR. With the default json-file driver, logs are stored as JSON at /var/lib/docker/containers/<container id>/<container id>-json.log. However, this default setup is not suitable for production:

  • By default, the size of log files is not limited, so a container that keeps writing logs can fill the disk and affect other applications on the host. The Docker log driver does support rotation of log files, but it must be configured.

  • The Docker Daemon collects the containers' standard output. When the log volume is high, the Docker Daemon becomes the bottleneck of log collection and limits collection speed.

  • When the log volume is too large, `docker logs -f` can block the Docker Daemon directly, causing `docker ps` and other commands to stop responding.

Docker provides the logging driver configuration, and users can configure different log drivers according to their requirements; see the official Configure Logging Drivers documentation[1]. However, with any of these drivers the logs are still collected by the Docker Daemon, so collection speed remains a bottleneck.
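For reference, rotation can be enabled for the default json-file driver at the daemon level. A minimal sketch, typically placed in /etc/docker/daemon.json on Linux; the size and file-count limits below are illustrative values, not taken from this article:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
```

Restarting the Docker daemon applies these limits to newly created containers; the Daemon-as-collector bottleneck described above still remains.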

Log driver    Log collection speed
syslog        14.9 MB/s
json-file     37.9 MB/s

Is there a tool that can redirect log output to a file and rotate it automatically, without relying on the Docker Daemon to collect logs? Yes: an S6[2] base image.

s6-log redirects the standard output of CMD to /…/default/current instead of sending it to the Docker Daemon, which avoids the Daemon's log collection bottleneck. This article builds application images on an S6 base image to form a unified log collection scheme.
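As an illustration, an s6 service's log run script might look like the following; the flags are assumptions based on s6-log's documented options, and /var/log/app is a hypothetical directory:

```sh
#!/bin/sh
# s6 log/run script: s6-log reads the service's stdout on stdin and
# writes timestamped, automatically rotated logs into /var/log/app.
#   n20      keep at most 20 rotated files
#   s1000000 rotate when the current file reaches ~1 MB
#   T        prefix each line with an ISO 8601 timestamp
exec s6-log n20 s1000000 T /var/log/app
```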

About Kubernetes Logs

The Kubernetes log collection solution is divided into three levels:

Application (Pod) level

Pod-level logging defaults to standard output and standard error, which is consistent with plain Docker containers. Use `kubectl logs <pod-name> -n <namespace>` to view them; see kubernetes.io/docs/refere… for details.

Node level

Node level logs are managed by configuring the log-driver of the container. Logs that exceed the upper limit are rotated automatically.

Cluster level

There are three types of log collection at the cluster level.

Node agent mode: log collection is performed at the node level, typically by deploying the agent on each node as a DaemonSet. The advantage is that it consumes few resources, since only one agent per node is needed, and it does not intrude on applications. The disadvantage is that it only works for applications whose in-container logs go to standard output.

The second is to use a sidecar container as the logging agent, i.e. a logging container that runs alongside the application container in the same Pod. It takes two forms:

One is a streaming sidecar container that reads the application container's log files and re-emits them on its own standard output. Note, however, that this leaves two copies of the same log on the host: one written by the application itself, and one in the JSON file backing the sidecar's stdout and stderr. This wastes disk space, so use it only when the application container cannot be modified.
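A minimal streaming-sidecar sketch (image names and paths are illustrative, not from the article): the application writes to a file on a shared emptyDir volume, and the sidecar tails that file to its own stdout:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-streaming-sidecar
spec:
  volumes:
    - name: applog          # shared between app and sidecar
      emptyDir: {}
  containers:
    - name: app
      image: my-app:latest  # hypothetical image that writes /var/log/app/app.log
      volumeMounts:
        - name: applog
          mountPath: /var/log/app
    - name: log-streamer    # the streaming sidecar: file -> stdout
      image: busybox
      args: ['/bin/sh', '-c', 'tail -n+1 -F /var/log/app/app.log']
      volumeMounts:
        - name: applog
          mountPath: /var/log/app
```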

The other is to run a log collection agent (such as Logstash or Fluentd) in each Pod, which effectively moves the node-level agent of the first scheme into the Pod. This consumes considerable resources (CPU, memory) per Pod, and since the logs are not written to standard output, `kubectl logs` shows nothing.

The third is for the application container to push logs directly to the storage backend. This is the simplest approach: the application sends log content straight to the log collection service.

Log architecture

Based on the Kubernetes log collection schemes introduced above, the node agent approach is adopted to collect the logs of the containers on each node and build a unified log collection system. The overall architecture is shown in the figure:

The explanation is as follows:

  1. All application containers are built on the S6 base image, so application logs are redirected to a file under a host directory such as /data/logs/namespace/appname/podname/log/xxxx.log

  2. The log-agent bundles tools such as Filebeat and Logrotate; Filebeat acts as the agent that collects the log files

  3. Filebeat sends the collected logs to Kafka

  4. Kafka forwards the logs to the ES log storage / Kibana retrieval layer

  5. Logstash acts as the intermediary that consumes Kafka messages and creates the indices in ES
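Steps 2 and 3 above can be sketched with a minimal filebeat.yml; the path glob, broker addresses, and topic name below are assumptions for illustration:

```yaml
filebeat.inputs:
  - type: log
    paths:
      # matches /data/logs/namespace/appname/podname/log/xxxx.log
      - /data/logs/*/*/*/log/*.log

output.kafka:
  hosts: ["kafka-0:9092", "kafka-1:9092"]  # assumed broker addresses
  topic: "container-logs"                  # assumed topic name
```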

The overall flow is easy to understand, but the following problems need to be solved:

  1. How to dynamically update the Filebeat configuration when a user deploys a new application

  2. How to ensure that every log file is rotated properly

  3. How to support more custom configuration, which requires secondary development of Filebeat

Putting it into practice

To solve these problems, a log-agent application is developed to run on every node of the Kubernetes cluster as a DaemonSet; it bundles Filebeat, Logrotate, and the functional components to be developed.

For the first problem, dynamically updating the Filebeat configuration, github.com/fsnotify/fs… can be used to watch for configuration changes.
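The article relies on the Go library github.com/fsnotify/fsnotify for event-driven change notification. As a language-agnostic illustration of the same reload idea, here is a minimal dependency-free polling sketch in Python; the config path and the reload action are hypothetical:

```python
import os
import time

def modified(prev_mtime, prev_size, path):
    """Return True if the file at `path` changed relative to the recorded
    mtime/size. Comparing size as well guards against coarse mtime
    granularity on some filesystems."""
    st = os.stat(path)
    return st.st_mtime > prev_mtime or st.st_size != prev_size

def watch_file(path, interval, on_change):
    """Poll `path` every `interval` seconds and call on_change() when it
    changes, e.g. to regenerate filebeat.yml and signal Filebeat to
    reload. fsnotify replaces this loop with inotify-style events."""
    st = os.stat(path)
    prev_mtime, prev_size = st.st_mtime, st.st_size
    while True:
        time.sleep(interval)
        if modified(prev_mtime, prev_size, path):
            st = os.stat(path)
            prev_mtime, prev_size = st.st_mtime, st.st_size
            on_change()
```

In the real log-agent, `on_change` would rewrite the Filebeat inputs for the newly deployed application; the event-driven fsnotify version avoids the polling delay.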

For the second problem, github.com/robfig/cron… can be used to trigger Logrotate periodically, with a logrotate configuration such as:

/var/log/xxxx/xxxxx.log {
  su www-data www-data
  missingok
  notifempty
  size 1G
  copytruncate
} 

For the third problem, secondary development of Filebeat, refer to the blog: www.jianshu.com/p/fe3ac68f4…

Conclusion

This article offers a simple idea for tailoring Kubernetes log collection to your company's needs.

Finally

Welcome to follow my WeChat public account [calm as code], where Java-related articles and learning materials are regularly updated.

If you found this article helpful, please like and follow!