This article was adapted from Rancher Labs

Kubernetes helps manage the life cycle of the hundreds of containers deployed in pods. It is highly distributed, and its parts are dynamic: a typical Kubernetes environment spans multiple clusters and nodes hosting hundreds of containers that are constantly started and destroyed as the workload changes.

Proactive monitoring and debugging of errors is important when running large numbers of containerized applications and workloads in Kubernetes. These errors can surface at the container, node, or cluster level. Kubernetes’ logging mechanism is therefore an important component for managing and monitoring services and infrastructure: logging lets you track errors and even tune the performance of the containers hosting your applications.

Configure the stdout (standard output) and stderr (standard error) data streams

Image credit: kubernetes.io

The first step is to understand how logs are generated. With Kubernetes, logs are sent to two data streams, stdout and stderr. These data streams are written to JSON files, and the process is handled internally by Kubernetes. You can configure which logs to send to which data stream. A best practice recommendation is to send all application logs to stdout and all error logs to stderr.
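
As a minimal sketch of that practice (standard-library Python only, nothing Kubernetes-specific assumed), an application can split its own output so that routine messages go to stdout and warnings and errors go to stderr:

```python
import logging
import sys

# Route INFO and below to stdout, WARNING and above to stderr,
# so Kubernetes captures them as two separate streams.
stdout_handler = logging.StreamHandler(sys.stdout)
stdout_handler.setLevel(logging.DEBUG)
stdout_handler.addFilter(lambda record: record.levelno < logging.WARNING)

stderr_handler = logging.StreamHandler(sys.stderr)
stderr_handler.setLevel(logging.WARNING)

logging.basicConfig(level=logging.DEBUG,
                    handlers=[stdout_handler, stderr_handler])

logging.info("application started")    # ends up in the stdout stream
logging.error("something went wrong")  # ends up in the stderr stream
```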

Decide whether to use the Sidecar model

Kubernetes recommends using a Sidecar container to collect logs. In this approach, each application container has a neighboring “streaming” container that transfers the application’s log streams to its own stdout and stderr. The Sidecar model helps avoid exposing logging at the node level, and it lets you control logging at the container level.

However, this model is only suitable for small volumes of logging: because you must run a separate logging container alongside every application container, it produces a large resource footprint at scale. The Kubernetes documentation describes the Sidecar model as having “almost no overhead,” but it’s up to you to try the model and measure the resources it actually consumes before choosing it.
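
If you do want to experiment with the pattern, the sketch below builds the classic “application plus streaming sidecar” Pod with the official Kubernetes Python client (pip install kubernetes). The Pod name, images, volume name, and shell commands are illustrative assumptions, not anything Kubernetes prescribes:

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod

# Shared emptyDir volume so the sidecar can read the application's log file.
shared_logs = client.V1Volume(
    name="logs", empty_dir=client.V1EmptyDirVolumeSource())

# Application container writing to a file instead of stdout.
app = client.V1Container(
    name="app",
    image="busybox",
    command=["/bin/sh", "-c",
             "while true; do date >> /var/log/app/app.log; sleep 5; done"],
    volume_mounts=[client.V1VolumeMount(name="logs", mount_path="/var/log/app")])

# The sidecar does nothing but stream that file to its own stdout,
# where the node's container runtime picks it up as a normal log stream.
sidecar = client.V1Container(
    name="log-streamer",
    image="busybox",
    command=["/bin/sh", "-c", "tail -n+1 -F /var/log/app/app.log"],
    volume_mounts=[client.V1VolumeMount(name="logs", mount_path="/var/log/app")])

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="app-with-logging-sidecar"),
    spec=client.V1PodSpec(containers=[app, sidecar], volumes=[shared_logs]))

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```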

The alternative is to use a logging agent that collects logs at the node level. This reduces overhead and ensures that logs are handled safely. Fluentd has become the most popular choice for large-scale aggregation of Kubernetes logs. It acts as a bridge between Kubernetes and any number of endpoints you want to use for Kubernetes logging. You can also opt for a Kubernetes management platform like Rancher, which ships with Fluentd integration so you don’t have to install and configure it from scratch.

Once Fluentd is in place to aggregate and route the log data, the next step is to decide how to store and analyze it.

Choose a log analysis tool: EFK or a dedicated logging tool

Traditionally, on server-centric systems, application logs are stored in log files on the machine itself. These files can be inspected in well-known locations or shipped to a central server. In Kubernetes, however, container logs are written as JSON files under /var/log on the node’s disk. This kind of log storage is not durable, because Pods are transient: when a Pod is deleted, its log files are lost, which makes troubleshooting difficult if you are left with only partial log data.
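
One stopgap, sketched here with the Kubernetes Python client, is to pull a Pod’s logs through the API and persist them somewhere durable before the Pod goes away; the Pod, container, and namespace names are assumptions for the example:

```python
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Fetch the container's log stream through the API server rather than
# relying on the JSON file on the node's disk.
log_text = core.read_namespaced_pod_log(
    name="app-with-logging-sidecar",
    namespace="default",
    container="log-streamer")

# Write it somewhere that outlives the Pod.
with open("app.log", "w") as f:
    f.write(log_text)
```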

Kubernetes officially recommends two options: send all logs to Elasticsearch, or use a third-party logging tool of your choice. Here again there is a choice to make. Going the Elasticsearch route means adopting a full stack, the EFK stack, consisting of Elasticsearch, Fluentd, and Kibana. Each tool has its own role. As mentioned above, Fluentd aggregates and routes logs. Elasticsearch is a powerful platform for analyzing raw log data and providing readable output. Kibana is an open source data visualization tool that creates custom dashboards from your log data. This is a completely open source stack and a powerful solution for Kubernetes logging.
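
As a rough illustration of how the pieces meet, the sketch below indexes one structured log record into Elasticsearch with the official Python client and queries it back; the endpoint, index name, and field values are assumptions (a local Elasticsearch at http://localhost:9200 and the 8.x client), and Kibana can then visualize whatever lands in that index:

```python
from datetime import datetime, timezone
from elasticsearch import Elasticsearch  # pip install elasticsearch (8.x)

es = Elasticsearch("http://localhost:9200")

# Index a single structured log record.
es.index(
    index="kubernetes-logs",
    document={
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "level": "error",
        "pod": "checkout-5d4f7c9b8-abcde",   # example value
        "message": "payment backend timed out",
    },
)

# Later, query the same index, e.g. for all error-level records.
hits = es.search(index="kubernetes-logs",
                 query={"match": {"level": "error"}})
print(hits["hits"]["total"])
```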

Still, there are some things to keep in mind. In addition to being built and maintained by Elastic, Elasticsearch has a large open source community of developers contributing to it. Although it has proven fast and powerful at handling large-scale data queries, operating it at scale can be problematic. If you run self-managed Elasticsearch, you need someone on the team who knows how to build and run large-scale platforms.

The alternative is to use a cloud-based log analysis tool to store and analyze Kubernetes logs. Tools such as Sumo Logic and Splunk are good examples. Some of these tools use Fluentd to route logs to their platform, while others ship their own custom logging agent that sits at the node level in Kubernetes. These tools are simple to set up and let you build a log-viewing dashboard from scratch in very little time.

Use RBAC to control access to logs

The authentication and authorization mechanism in Kubernetes uses role-based access control (RBAC) to verify a user’s access and permissions. When audit logging is enabled, each request that passes through the API server is annotated with whether the user was authorized (authorization.k8s.io/decision) and the reason for that decision (authorization.k8s.io/reason). Audit logs are disabled by default; enabling them is recommended so you can track authentication and authorization issues, and is done by supplying an audit policy to the API server.
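
Once audit logging is on, the decision and reason annotations make denied requests easy to find. The sketch below scans an API server audit log in JSON-lines format and prints forbidden requests; the log path is an assumption and depends on how you configured audit logging:

```python
import json

AUDIT_LOG = "/var/log/kubernetes/audit/audit.log"  # assumed path

with open(AUDIT_LOG) as f:
    for line in f:
        event = json.loads(line)
        annotations = event.get("annotations", {})
        # "forbid" marks requests RBAC rejected; "allow" marks permitted ones.
        if annotations.get("authorization.k8s.io/decision") == "forbid":
            print(event.get("user", {}).get("username"),
                  event.get("verb"),
                  event.get("requestURI"),
                  annotations.get("authorization.k8s.io/reason"))
```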

Keep the log format consistent

Kubernetes logs are generated by different parts of the Kubernetes architecture. These aggregated logs should be formatted consistently so that log aggregation tools such as Fluentd or Fluent Bit can handle them more easily. Keep this in mind when configuring stdout and stderr, or when assigning labels and metadata in Fluentd, for example. Structured logs passed on to Elasticsearch also reduce latency during log analysis.
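
A minimal sketch of structured logging with the Python standard library is shown below: every line the application emits has the same machine-parseable JSON shape, which Fluentd or Fluent Bit can forward without extra parsing rules. The field names are assumptions you would align with the rest of your platform:

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render every log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])

logging.getLogger("checkout").info("order created")
# -> {"timestamp": "...", "level": "INFO", "logger": "checkout", "message": "order created"}
```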

Set resource limits on the log collection daemon

Because of the sheer volume of logs generated, managing them at the cluster level is difficult. In Kubernetes, a DaemonSet works much like a daemon in Linux: it runs in the background on every node to perform a specific task. Fluentd and Filebeat are two daemons commonly used for log collection. You should set resource requests and limits for each of these daemons so that log collection is tuned to the system resources available.
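
The sketch below caps a Fluentd DaemonSet’s CPU and memory with the Kubernetes Python client. The DaemonSet name, namespace, container name, and the specific request/limit values are assumptions; match them to whatever your log-collection deployment actually uses:

```python
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

# Strategic merge patch: add requests and limits to the fluentd container
# in the DaemonSet's pod template.
patch = {
    "spec": {"template": {"spec": {"containers": [{
        "name": "fluentd",
        "resources": {
            "requests": {"cpu": "100m", "memory": "200Mi"},
            "limits":   {"cpu": "500m", "memory": "500Mi"},
        },
    }]}}}
}

apps.patch_namespaced_daemon_set(
    name="fluentd", namespace="kube-system", body=patch)
```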

Conclusion

Kubernetes consists of multiple layers and components, so good monitoring and tracing let you stay in control when failures occur. Kubernetes encourages logging through seamlessly integrated, “Kubernetes-native” external tools that make logging easier for administrators. The practices described in this article are important for building a robust logging architecture that works under all circumstances, consumes computing resources efficiently, and keeps the Kubernetes environment secure and performant.