This post is part of a series:

  • Quick start with the latest Docker version of Sentry-CLI – creating releases in 1 minute
  • Quick start with Sentry-CLI in Docker – Source Maps in 30 seconds
  • Sentry For React
  • Sentry For Vue
  • Sentry-CLI usage details
  • Sentry Web Performance Monitoring – Web Vitals
  • Sentry Web Performance Monitoring – Metrics
  • Sentry Web Performance Monitoring – Trends
  • Sentry Web Front-end Monitoring – Best Practices (Official Tutorial)
  • Sentry Backend Monitoring – Best Practices (Official Tutorial)
  • Sentry Monitoring – Discover: the big data query and analysis engine
  • Sentry Monitoring – Dashboards: large-screen data visualization
  • Sentry Monitoring – Environments: separating event data from different deployment environments
  • Sentry Monitoring – Security Policy Reporting
  • Sentry Monitoring – Search in practice
  • Sentry Monitoring – Alerts
  • Sentry Monitor – Distributed Tracing
  • Sentry Monitoring – Distributed Tracing 101 for Full-stack Developers

Snuba is a service that provides a rich data model on top of Clickhouse, together with fast ingestion consumers (consuming directly from Kafka) and a query optimizer.

Snuba was originally developed to replace the combination of Postgres and Redis to search for and provide aggregated data about Sentry errors. Since then, it has evolved into its current form to support most Sentry functionality related to time series on multiple data sets.

Features

  • Provides a database access layer for Clickhouse distributed data storage.
  • Provides a graph logical data model that clients can query through the SnQL language, which offers SQL-like functionality.
  • Supports multiple separate data sets in a single installation.
  • Provides a rule-based query optimizer.
  • Provides a migration system that applies DDL changes to Clickhouse in both single-node and distributed environments.
  • Ingests data directly from Kafka.
  • Supports both point-in-time queries and streaming queries.

Some use cases in Sentry:

  • The events data set powers the Issue page and other features. Search there is provided by Snuba, as are all the aggregation functions.
  • The discover data set powers all Performance Monitoring related features.
  • The sessions data set powers the Releases feature. In particular, this data set ingests a large number of data points and stores pre-aggregated data to allow fast queries over large volumes of data.
  • The outcomes data set powers the Stats page.

Start using Snuba

This is a quick guide to starting Snuba in a Sentry development environment.

Prerequisites

Snuba assumes the following:

  1. A Clickhouse server endpoint is available at CLICKHOUSE_HOST (default: localhost).
  2. A redis instance is running at REDIS_HOST (default: localhost) on port 6379.

A quick way to get these services running is to set up Sentry and use:

sentry devservices up --exclude=snuba

Note that Snuba assumes that everything is running in UTC time. Otherwise, you may encounter time zone mismatches.

Sentry + Snuba

Add/Change the following lines in ~/.sentry/sentry.conf.py:

SENTRY_SEARCH = 'sentry.search.snuba.EventsDatasetSnubaSearchBackend'
SENTRY_TSDB = 'sentry.tsdb.redissnuba.RedisSnubaTSDB'
SENTRY_EVENTSTREAM = 'sentry.eventstream.snuba.SnubaEventStream'

Run:

sentry devservices up

Access the raw ClickHouse client (similar to psql):

docker exec -it sentry_clickhouse clickhouse-client

For example, count the rows in the local errors table: select count() from sentry_local;

Settings

Settings can be found in settings.py

  • CLUSTERS: provides the list of clusters and the hostname, port, and storage sets that should run on each cluster. Local vs. distributed is also configured per cluster.
  • REDIS_HOST: the host where redis is running.
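
For illustration, a minimal single-node entry in CLUSTERS might look roughly like the sketch below. The key names follow the open-source Snuba settings.py, but treat them as an assumption and check the settings.py that ships with your Snuba version:

CLUSTERS = [
    {
        "host": "localhost",          # the Clickhouse endpoint (CLICKHOUSE_HOST)
        "port": 9000,                 # native protocol port
        "http_port": 8123,            # HTTP interface port
        "user": "default",
        "password": "",
        "database": "default",
        "storage_sets": {"events", "outcomes", "sessions"},  # storage sets placed on this cluster
        "single_node": True,          # local (single-node) vs. distributed
    },
]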

Snuba Architecture Overview

Snuba is a time-series-oriented data store service backed by Clickhouse, a columnar, distributed storage database that is well suited to the type of queries Snuba serves.

  • clickhouse.tech/

Data is stored entirely in Clickhouse tables and materialized views, which are ingested through input streams (currently only Kafka Topics) and can be queried through point-in-time queries or subscriptions.

Storage

Clickhouse was chosen as the backing store because it provides a good balance among the real-time performance, the distributed and replicated nature, the flexibility of its storage engines, and the consistency guarantees that Snuba needs.

Snuba data is stored in Clickhouse tables and Clickhouse materialized views. Multiple Clickhouse storage engines are used, depending on the goal of the table.

  • clickhouse.tech/docs/en/eng…

Snuba data is organized in multiple data sets that represent separate partitions of the data model. See the Snuba Data Model section for more details.

Ingestion

Snuba does not provide an API endpoint for inserting rows (unless run in debug mode). Data is loaded from multiple input streams, processed by a series of consumers and written to Clickhouse tables.

A consumer consumes one or more topics and writes to one or more tables. So far, no table is written to by more than one consumer. This allows for some of the consistency guarantees discussed below.

Data ingestion is most effective in batches (for Kafka, but especially for Clickhouse). Our consumers support batching and guarantee that a batch of events taken from Kafka is delivered to Clickhouse at least once. By properly selecting a Clickhouse table engine that deduplicates rows, we can achieve exactly-once semantics if we accept eventual consistency.
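
The following is a minimal sketch of that at-least-once pattern, not Snuba's actual consumer code; the topic, table, and column names are made up for illustration, and it assumes the confluent-kafka and clickhouse-driver Python packages:

import json
from confluent_kafka import Consumer
from clickhouse_driver import Client

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "sketch-consumer",
    "enable.auto.commit": False,   # commit offsets manually, only after the batch is written
})
consumer.subscribe(["events"])      # hypothetical topic
clickhouse = Client("localhost")
batch = []
while True:
    msg = consumer.poll(timeout=1.0)
    if msg is not None and msg.error() is None:
        batch.append(json.loads(msg.value()))
    if len(batch) >= 1000:
        # Write the whole batch to Clickhouse first...
        clickhouse.execute(
            "INSERT INTO sketch_events (event_id, project_id) VALUES",
            [(e["event_id"], e["project_id"]) for e in batch],
        )
        # ...then commit offsets. A crash in between re-delivers the batch (at least once);
        # a deduplicating table engine (e.g. ReplacingMergeTree) turns that into effectively exactly once.
        consumer.commit(asynchronous=False)
        batch = []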

Querying

The simplest query system is point-in-time: the query is expressed in the SnQL language and sent as an HTTP POST call. The query engine processes the query (the process is described in Snuba Query Processing) and translates it into a Clickhouse query.
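
As a hedged sketch of what such a call can look like (the endpoint path, entity name, and required conditions vary between Snuba versions, so treat these as assumptions rather than the exact API), using Python and the requests package against a local Snuba dev server:

import requests

# SnQL: count events per project over a time window (entity and column names are illustrative).
snql = """
MATCH (events)
SELECT count() AS event_count BY project_id
WHERE project_id IN tuple(1)
  AND timestamp >= toDateTime('2023-01-01T00:00:00')
  AND timestamp < toDateTime('2023-01-02T00:00:00')
"""

# The Snuba dev server usually listens on port 1218; the /events/snql path is an assumption.
resp = requests.post("http://localhost:1218/events/snql", json={"query": snql})
print(resp.json())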

Streaming queries (done through the subscription engine) allow clients to receive query results as a push. In this case, an HTTP endpoint allows clients to register a streaming query. The subscription consumer then consumes the topic that is used to fill the relevant Clickhouse table for updates, periodically runs the query through the query engine, and produces the result on the subscriptions Kafka topic.

Data consistency

Different consistency models coexist in Snuba to provide different guarantees.

By default, Snuba is eventually consistent. Monotonic reads are not guaranteed when running a query: since Clickhouse is multi-leader, a query can hit any replica, and there is no guarantee that the replica is up to date. Furthermore, by default there is no guarantee that Clickhouse will reach a consistent state on its own.

Strong consistency can be achieved on specific queries by forcing Clickhouse to reach consistency before executing the query (the FINAL keyword) and by forcing the query to hit the specific replica the consumer writes to. This essentially uses Clickhouse as if it were a single-leader system, which allows sequential consistency.
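
To make the Clickhouse side of this concrete (a sketch only: the table name errors_local is hypothetical, and the example assumes the clickhouse-driver Python package), FINAL forces Clickhouse to merge and deduplicate rows at query time before returning the result:

from clickhouse_driver import Client

client = Client("localhost")
# Eventually consistent read: may observe rows that have not been merged/deduplicated yet.
client.execute("SELECT count() FROM errors_local")
# Consistent read: FINAL forces the merge before the result is returned (at a performance cost).
client.execute("SELECT count() FROM errors_local FINAL")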

Snuba in Sentry deployment

This section explains the role Snuba plays in a Sentry deployment, showing the main data flows. If you deploy Snuba separately, this is not relevant for you.

Errors and Transactions data flows

The main section at the top of the diagram illustrates the ingestion process for the Events and Transactions entities. These two entities serve most of the issue/error related functionality in Sentry and the whole Performance product.

There is only one Kafka topic (events) shared between Errors and Transactions that feeds this pipeline. This topic contains both error messages and transaction messages.

The Errors consumer consumes the events topic and writes messages to the Clickhouse errors table. Once the write is committed, it also produces a record on the snuba-commit-log topic.

Errors alerts are generated by the Errors Subscription Consumer. This is a synchronized consumer that consumes both the main events topic and the snuba-commit-log topic, so that it can proceed in lockstep with the main consumer.

The synchronized consumer then generates alerts by querying Clickhouse and produces the results on the result topic.

Transactions exist in an identical but separate pipeline.

The Errors pipeline has one additional step: writing to the replacements topic. Sentry produces errors mutations (merge/unmerge/reprocess/etc.) on the events topic. The Errors consumer then forwards them to the replacements topic, where they are executed by the Replacement consumer.

The events topic must be semantically partitioned by Sentry project ID so that events within a project can be processed in order. As of today, this is required by alerts and replacements.
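
A minimal sketch of what semantic partitioning by project ID means on the producing side (this is not Sentry's actual producer code; the topic and field names are illustrative, and it assumes the confluent-kafka Python package): using the project ID as the Kafka message key routes all events of a project to the same partition, so they are consumed in order.

import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish_event(event: dict) -> None:
    # Keying by project_id sends every event of a project to the same partition,
    # which is what lets alerts and replacements rely on per-project ordering.
    producer.produce(
        topic="events",
        key=str(event["project_id"]),
        value=json.dumps(event).encode("utf-8"),
    )

publish_event({"project_id": 1, "event_id": "abc123"})
producer.flush()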

Sessions and Outcomes

Sessions and Outcomes work in a very similar, simpler way. Specifically, Sessions powers the Release Health feature, while Outcomes mainly provides data to the Sentry stats page.

Each of these two pipelines has its own Kafka topic and Kafka consumer, and each writes to its own table in Clickhouse.

Change Data Capture pipeline

This pipeline is still under construction. It consumes the cdc topic and populates two separate tables in Clickhouse.