We have often deployed Loki as a monolithic application, which made it quick to set up in development and test environments. But sooner or later the classic ops questions catch up with us: How do we deploy this thing to production? Is it highly available and stable? Can it scale out? Today Xiaobai kicks off the prelude and walks you through thinking about Loki distributed deployment.

Loki's main components

Before deploying in a distributed way, it’s worth taking a quick look at some of Loki’s core components.

  • Distributor

The Distributor is the write front end of Loki. It validates the correctness of logs as it receives them and forwards them to Ingesters for storage.

  • Ingester

The Ingester is responsible for writing received log data to back-end stores (such as DynamoDB, S3, Cassandra, etc.), and it also serves recently ingested log data to the Querier component.

  • Querier

The Querier is responsible for fetching logs from Ingesters and the back-end store, processing them with the LogQL query language, and returning the results to the client.

  • Query Frontend

The Query Frontend provides the query API. It splits a large query request into multiple smaller queries executed by Queriers in parallel and aggregates the results. It is an optional component, typically deployed to prevent a large query from causing out-of-memory problems in a single Querier.

Together, these components form the two main data paths in the system, shown below: the red one is the write path and the green one is the query path.

Those of you familiar with Cortex might find this diagram familiar. Indeed, according to Loki's developers, Loki's distributed architecture is derived from the Cortex codebase. After all, they are products of the same company: a mature architecture, no need to reinvent the wheel 😃

Loki distributed architecture

Consistent hashing is a key part of Loki's distributed deployment. As in Cortex, Loki's Distributor hashes the tenant ID and label set of incoming logs and uses the result to decide which Ingester on the hash ring should receive them. Each Ingester runs a lifecycle manager that registers its state and tokens to the hash ring in Consul; the states are PENDING, JOINING, ACTIVE, LEAVING, and UNHEALTHY.

An Ingester in the JOINING state may already process log writes, and an Ingester in the LEAVING state may still serve log query requests.

To ensure eventual consistency in queries, before Loki returns data to the client the Distributor must wait for more than half of the Ingesters in the replica set to respond successfully; with the default replication_factor of 3, for example, that means at least 2 successful responses.

Now let's take a look at what a distributed Loki deployment needs (a configuration sketch follows this list):

  • Log chunk files require persistent object storage; Ceph S3 is recommended
  • The log index requires a horizontally scalable NoSQL database; Cassandra is recommended
  • A Consul cluster is required to hold the state of Loki's major components
  • A unified gateway is required to load-balance Loki's log writes and queries
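To make the storage requirements above concrete, here is a minimal sketch of a `schema_config`/`storage_config` pairing Cassandra (index) with an S3-compatible object store such as Ceph RGW. The hostnames, credentials, and dates are assumed placeholders, not values from the article:

```yaml
# Sketch only: hostnames and credentials below are assumptions.
schema_config:
  configs:
    - from: 2020-10-01          # assumed start date for this schema
      store: cassandra          # index in Cassandra
      object_store: aws         # chunks in an S3-compatible store (e.g. Ceph S3)
      schema: v11
      index:
        prefix: loki_index_
        period: 168h

storage_config:
  cassandra:
    addresses: cassandra.example.com   # assumed Cassandra address
    keyspace: loki
  aws:
    # Assumed Ceph RGW endpoint with embedded access/secret keys
    s3: http://loki:supersecret@ceph-rgw.example.com:7480
    s3forcepathstyle: true
```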

The overall architecture of Loki should look something like this:

By the way, if you read Xiaobai's previous article "Smart Caching to Speed Up Loki Log Queries", this is what the architecture looks like once we introduce caching into Loki's distributed deployment.

How does it feel? A little complicated? But this picture is essentially what Cortex looks like too. Oh well, the experts are in charge of the design; we newbies just need to use it well 😂

Loki component configuration

Which module a Loki process runs is controlled by the `-target` startup flag. If it is not specified, as in a default start, it falls back to `all` and Loki runs as a full all-in-one process.
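As an illustration (not from the original article), a docker-compose-style sketch shows how the same Loki image can be started once per role simply by varying `-target`; the image tag and config path are assumptions:

```yaml
# Sketch: one binary, many roles, selected via -target.
version: "3"
services:
  distributor:
    image: grafana/loki:1.6.0   # assumed version
    command: ["-config.file=/etc/loki/loki.yaml", "-target=distributor"]
  ingester:
    image: grafana/loki:1.6.0
    command: ["-config.file=/etc/loki/loki.yaml", "-target=ingester"]
  querier:
    image: grafana/loki:1.6.0
    command: ["-config.file=/etc/loki/loki.yaml", "-target=querier"]
  query-frontend:
    image: grafana/loki:1.6.0
    command: ["-config.file=/etc/loki/loki.yaml", "-target=query-frontend"]
```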

  • distributor_config: defines the global configuration of the Distributor. It mainly covers the access address of the consistent hash ring stored in Consul or Etcd, and the Ingester heartbeat settings.
```yaml
ring:
  kvstore:
    # Storage backend, supporting consul, etcd, or memory
    store: <string>
    # Key prefix in the KV store
    [prefix: <string> | default = "collectors/"]
    # KV store configuration
    [consul: <consul_config>]
      [host: <string> | default = "localhost:8500"]
      [acl_token: <string>]
      [http_client_timeout: <duration> | default = 20s]
      [consistent_reads: <boolean> | default = true]
    [etcd: <etcd_config>]
      [endpoints: <list of string> | default = []]
      [dial_timeout: <duration> | default = 10s]
      [max_retries: <int> | default = 10]
    [memberlist: <memberlist_config>]
      # ...skipped...
  # Heartbeat timeout for ingesters
  [heartbeat_timeout: <duration> | default = 1m]
```
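Filled in with concrete values, a Consul-backed Distributor ring could look like the following sketch (the Consul address is an assumption):

```yaml
distributor:
  ring:
    kvstore:
      store: consul
      prefix: collectors/
      consul:
        host: consul.example.com:8500   # assumed Consul address
        http_client_timeout: 20s
        consistent_reads: true
```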

Loki can also use the gossip-based memberlist protocol to maintain the hash ring in memory. However, memberlist configuration is more involved, so we recommend using Consul or Etcd as the storage backend of the hash ring.

  • ingester_config: defines the configuration of the lifecycle manager and the log store. Apart from tuning the lifecycle manager, most of the defaults can be kept.
```yaml
lifecycler:
  ring:
    kvstore:
      # See distributor_config for the same options
      [store: <string> | default = "consul"]
      [prefix: <string> | default = "collectors/"]
      [consul: <consul_config>]
      [etcd: <etcd_config>]
      [memberlist: <memberlist_config>]
    [heartbeat_timeout: <duration> | default = 1m]
    [replication_factor: <int> | default = 3]
  # Number of tokens registered on the hash ring;
  # think of them as virtual nodes
  [num_tokens: <int> | default = 128]
  [heartbeat_period: <duration> | default = 5s]
  # Network interfaces from which to read the service IP address
  interface_names:
    [- <string> | default = ["eth0", "en0"]]
```
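For example, a production-leaning lifecycle manager with Consul and 3-way replication might be sketched as follows (the Consul address is an assumption):

```yaml
ingester:
  lifecycler:
    ring:
      kvstore:
        store: consul
        prefix: collectors/
        consul:
          host: consul.example.com:8500   # assumed address
      heartbeat_timeout: 1m
      replication_factor: 3               # tolerates one Ingester outage
    num_tokens: 128
    heartbeat_period: 5s
    interface_names:
      - eth0
```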
  • query_frontend_config: defines the configuration of the query frontend.
```yaml
frontend:
  # Compress HTTP responses
  [compress_responses: <boolean> | default = false]
  # Address of the downstream Querier service
  [downstream_url: <string> | default = ""]
```

How the query frontend splits query requests is determined by `query_range.split_queries_by_interval`. If it is set to 1h, the query frontend splits a one-day query into 24 one-hour queries, distributes them to Queriers in parallel, and then aggregates the returned results.

This is very useful in a production environment, and Xiaobai strongly recommends enabling it.
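A minimal sketch of enabling query splitting; the values are illustrative, not prescribed by the article:

```yaml
query_range:
  # Split long-range queries into 1-hour sub-queries
  split_queries_by_interval: 1h
  align_queries_with_step: true

frontend:
  compress_responses: true
  downstream_url: http://querier:3100   # assumed Querier address
```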

The query frontend also supports result caching for Prometheus-style metric queries; Loki 1.6 does not yet support caching log query results.
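If you do want to cache metric query results in the query frontend, a sketch along these lines (cache sizes are assumptions) should work:

```yaml
query_range:
  cache_results: true
  results_cache:
    cache:
      # In-process FIFO cache; Memcached or Redis are alternatives
      enable_fifocache: true
      fifocache:
        max_size_items: 1024   # assumed size
        validity: 24h
```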

  • gateway: a custom external Nginx service that load-balances Loki's log write and query requests. Its core configuration can look like the following:
```nginx
server {
    listen 80;

    location = / {
        proxy_pass http://querier:3100/ready;
    }
    location = /api/prom/push {
        proxy_pass http://distributor:3100$request_uri;
    }
    location = /api/prom/tail {
        proxy_pass http://querier:3100$request_uri;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    location ~ /api/prom/.* {
        proxy_pass http://querier-frontend:3100$request_uri;
    }
    location = /loki/api/v1/push {
        proxy_pass http://distributor:3100$request_uri;
    }
    location = /loki/api/v1/tail {
        proxy_pass http://querier:3100$request_uri;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    location ~ /loki/api/.* {
        proxy_pass http://querier-frontend:3100$request_uri;
    }
}
```
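Once the gateway is in place, clients only need to know its address. For instance, a Promtail client section could point at it like this (the gateway hostname is an assumption):

```yaml
clients:
  - url: http://loki-gateway/loki/api/v1/push   # assumed gateway address
```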

That wraps up this overview of distributed Loki. In Part 2 of "I Heard Your Loki Is Still a Monolith?", I will cover deploying a Loki cluster with Helm.

About Cloud-Native Xiaobai

The purpose of Cloud-Native Xiaobai is to present cloud-native applications from a practical standpoint, viewing and using cloud native from a beginner's perspective, with each article setting out to solve one real problem and lead you into the cloud-native world.


Follow the WeChat public account "Cloud-Native Xiaobai" and reply "join group" to enter the Loki learning group.