An overview of the

Microservices provide a powerful architecture, but they come with their own challenges, especially when debugging and observing distributed transactions across complex networks, simply because there are no in-memory calls or stack traces to rely on. This is where distributed tracing comes in. Distributed tracing provides a way to describe and analyze cross-process transactions. Some of the use cases for distributed tracing described in Google’s Dapper paper include anomaly detection, diagnosing steady-state problems, distributed profiling, resource attribution, and workload modeling of microservices.

Distributed tracing

  1. Trace: The description of a transaction as it moves through a distributed system.
  2. Span: A named, timed operation that represents part of the workflow. Spans accept key:value tags as well as fine-grained, timestamped, structured logs attached to the particular span instance.
  3. Span context: Trace information that accompanies the distributed transaction, including when it passes from service to service over the network or through a message bus. The span context contains the trace identifier, the span identifier, and any other data that the tracing system needs to propagate to the downstream service.

The four big pieces

From the point of view of an application-layer distributed tracing system, a modern software system looks like the following diagram:

Components in modern software systems can be classified into three categories:

  • Application and business logic: Your code.
  • Widely shared libraries: Other people’s code.
  • Widely shared services: Other people’s infrastructure.

These three kinds of components have different requirements, and they drive the design of the distributed tracing system that is tasked with monitoring the application. The resulting design has four important pieces:

  • Tracing instrumentation API: What decorates the application code.
  • Wire protocol: What is sent alongside application data in an RPC request.
  • Data protocol: What is sent asynchronously (out of band) to the analysis system.
  • Analysis system: A database and interactive UI for working with the trace data.

Where does OpenTracing fit in?

The OpenTracing API provides a standard, vendor-neutral framework for instrumentation. This means that if a developer wants to try a different distributed tracing system, then instead of repeating the entire instrumentation process for the new system, the developer can simply change the configuration of the tracer.

What is distributed tracing?

Distributed tracing, also known as distributed request tracing, is a method for profiling and monitoring applications, particularly those built using microservice architectures. Distributed tracing helps pinpoint where failures occur and what causes poor performance.

Who uses distributed tracing?

  • IT and DevOps teams can use distributed tracing to monitor applications. Distributed tracing is particularly well suited to debugging and monitoring modern distributed software architectures, such as microservices.
  • Developers can use distributed tracing to help debug and optimize their code.

What is OpenTracing?

It might be easier to start with what OpenTracing isn’t.

  • OpenTracing is not a download or program. Distributed tracing requires software developers to add instrumentation to the code of the application or to the framework used in the application.
  • OpenTracing is not a standard. The Cloud Native Computing Foundation (CNCF) is not an official standards body. The OpenTracing API project aims to create more standardized APIs and instrumentation for distributed tracing.

OpenTracing consists of an API specification, frameworks and libraries that implement the specification, and project documentation. OpenTracing allows developers to add instrumentation to their application code using APIs that do not lock them into any particular product or vendor. For more information about where OpenTracing is already implemented, see the list of languages and the list of tracers that support the OpenTracing specification.

Concepts and Terms

All the language-specific OpenTracing APIs share some core concepts and terminology. These concepts are so central and important to the project that they have their own repository (github.com/opentracing/specification) and semver scheme.

  1. The OpenTracing Semantic Specification is a versioned description of the current pan-language OpenTracing standard.
  2. The Semantic Conventions specification describes conventional span tags and log keys for common semantic scenarios.

Both files are versioned, and the GitHub repository is tagged according to the rules described in the versioning policy.

Spans

A “span” is the primary building block of a distributed trace, representing a single unit of work done in a distributed system. Each component of a distributed system contributes a span: a named, timed operation that represents part of the workflow. Spans can (and usually do) contain “references” to other spans, which allows multiple spans to be assembled into one complete trace, a visualization of the life of a request as it moves through a distributed system. According to the OpenTracing specification, each span encapsulates the following state:

  • An operation name
  • Start timestamp and end timestamp
  • A set of key:value span tags
  • A set of key:value span logs
  • A SpanContext
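
The state listed above can be pictured as a small data structure. The following is a minimal sketch in plain Python, not the OpenTracing API itself: the class layout and the method names set_tag, log_kv, and finish are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class SpanContext:
    """Identifiers (plus optional baggage) that travel with the span."""
    trace_id: str
    span_id: str
    baggage: Dict[str, str] = field(default_factory=dict)


@dataclass
class Span:
    """Toy span holding the five pieces of state listed above."""
    operation_name: str
    context: SpanContext
    start_time: float = field(default_factory=time.time)
    finish_time: Optional[float] = None
    tags: Dict[str, Any] = field(default_factory=dict)
    logs: List[Tuple[float, Dict[str, Any]]] = field(default_factory=list)

    def set_tag(self, key, value):
        # A tag describes the span as a whole.
        self.tags[key] = value
        return self

    def log_kv(self, fields):
        # A log records one timestamped event inside the span.
        self.logs.append((time.time(), dict(fields)))
        return self

    def finish(self):
        self.finish_time = time.time()


span = Span("db_query", SpanContext(trace_id="abc123", span_id="xyz789"))
span.set_tag("db.instance", "customers")
span.log_kv({"message": "Can't connect to MySQL server"})
span.finish()
```

Keeping tags in a dictionary and logs in a timestamped list mirrors the distinction drawn in the following sections: tags apply to the whole span, logs mark moments within it.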

Tags

Tags are key-value pairs that enable user-defined annotation of spans in order to query, filter, and understand trace data. Span tags should apply to the entire span, and semantic_conventions.md lists conventional span tags for common scenarios. Examples include tag keys such as db.instance to identify the database instance, http.status_code for the HTTP response code, or error, which can be set to true if the operation represented by the span fails.

Logs

Logs are key-value pairs that are useful for capturing span-specific log messages and other debugging or informational output from the application itself. Logs can be useful for recording specific moments or events within a span (as opposed to tags, which should apply to the entire span).

SpanContext

SpanContext carries data across process boundaries. Specifically, it has two main components:

  • Implementation-dependent state used to refer to the distinct span within a trace, that is, the implementing tracer’s definition of spanID and traceID
  • Any baggage items

    • These are key-value pairs that cross process boundaries.
    • They can be useful for making some data available throughout the entire trace.

An example of a span:

    t=0            operation name: db_query               t=x

     +-----------------------------------------------------+
     | · · · · · · · · · ·    Span     · · · · · · · · · · |
     +-----------------------------------------------------+

    Tags:
    - db.instance:"customers"
    - db.statement:"SELECT * FROM mytable WHERE foo='bar'"
    - peer.address:"mysql://127.0.0.1:3306/customers"

    Logs:
    - message:"Can't connect to MySQL server on '127.0.0.1'(10061)"

    SpanContext:
    - trace_id:"abc123"
    - span_id:"xyz789"
    - Baggage Items:
      - special_id:"vsid1738"

Scopes and Threading

Introduction

In any given thread, there is an “active” span that is primarily responsible for the work done by the surrounding application code, called the ActiveSpan. The OpenTracing API allows only one span in a thread to be active at any point in time. This is managed through a Scope, which formalizes the activation and deactivation of a span. Other spans involved with the same thread will satisfy all of the following conditions:

  • Started
  • Not finished
  • Not “active”

For example, there can be more than one span on the same thread if the spans are:

  • Waiting for I/O
  • Blocked on a child span
  • Off the critical path

Note that if a Scope exists when a developer creates a new span, the span it holds acts as the new span’s parent, unless the programmer calls ignoreActiveSpan() at buildSpan() time or explicitly specifies a parent context.
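
The activation rules above can be illustrated with a toy scope manager. This is a sketch in plain Python under stated assumptions, not the OpenTracing API: ToyScopeManager, activate, start_span, and the ignore_active_span flag are hypothetical names that loosely mirror the behaviour described in the text.

```python
import threading
from contextlib import contextmanager


class Span:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class ToyScopeManager:
    """Tracks at most one active span per thread."""

    def __init__(self):
        self._local = threading.local()

    @property
    def active_span(self):
        return getattr(self._local, "span", None)

    @contextmanager
    def activate(self, span):
        previous = self.active_span      # remember what was active before
        self._local.span = span
        try:
            yield span
        finally:
            self._local.span = previous  # closing the scope restores it


scopes = ToyScopeManager()


def start_span(name, ignore_active_span=False):
    """The active span becomes the parent unless the caller opts out."""
    parent = None if ignore_active_span else scopes.active_span
    return Span(name, parent)


with scopes.activate(start_span("handle_request")) as request_span:
    child = start_span("db_query")                      # parent: request_span
    detached = start_span("audit", ignore_active_span=True)
    with scopes.activate(child):
        inner_active = scopes.active_span               # child is active here
    outer_active = scopes.active_span                   # request_span again
```

While the inner scope is open, request_span is started, not finished, and not active, which matches the three conditions listed above.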

Access the currently active span

Because it is inconvenient to pass the active span manually from one function to another, OpenTracing requires that every tracer include a ScopeManager. The ScopeManager API grants access to the active span through a Scope. This means that developers can always reach the currently active span through the Scope.

Moving a span between threads

Using the ScopeManager API, developers can transfer a span between different threads. A span’s life cycle may begin in one thread and end in another, and the ScopeManager API allows a span to be transferred to another thread or to a callback. Passing a Scope itself to another thread or callback is not supported. Refer to the language-specific documentation for more details.
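
The following sketch shows the idea of handing a span, rather than a scope, to another thread. It is plain Python with hypothetical helper names (activate, active_span) and a thread-local slot standing in for the scope manager; real tracers expose their own equivalents.

```python
import threading


class Span:
    def __init__(self, name):
        self.name = name
        self.finished_on = None

    def finish(self):
        # Record which thread ended the span, for illustration only.
        self.finished_on = threading.current_thread().name


_active = threading.local()   # per-thread "active span" slot


def activate(span):
    _active.span = span


def active_span():
    return getattr(_active, "span", None)


seen_before_activation = []


def worker(span):
    seen_before_activation.append(active_span())  # nothing is active here yet
    activate(span)            # re-activate the same span in this thread
    active_span().finish()    # the span ends on the worker thread


span = Span("async_job")
activate(span)                # started and active on the main thread
t = threading.Thread(target=worker, args=(span,), name="worker-1")
t.start()
t.join()
```

The active-span slot is per thread, so the worker sees nothing active until it activates the span it was handed. That is why the span object is what travels between threads, not the scope.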

Tracers

Introduction

OpenTracing provides an open, vendor-neutral, standard API to describe distributed transactions, specifically causality, semantics, and timing. It provides a general distributed context propagation framework consisting of the following API primitives:

  • Passing the metadata context within a process
  • Encoding and decoding the metadata context so that it can be transmitted over the network for inter-process communication
  • Causality tracking: parent-child, forks, and joins

OpenTracing abstracts away the differences between the many tracer implementations. This means that the instrumentation remains the same regardless of which tracer system the developer uses. In order to instrument an application using the OpenTracing specification, a compatible OpenTracing tracer must be deployed. A list of supported tracers appears at the end of this article.

Tracer Interface

The tracer interface creates spans and understands how to inject (serialize) and extract (deserialize) their metadata across process boundaries. It has the following capabilities:

  • Start a new Span
  • Inject a SpanContext into a carrier
  • Extract a SpanContext from a carrier

Each of these is discussed in more detail below. For implementation specifics, check out the language-specific guide.

Set up the tracer

A tracer is the actual implementation that records spans and publishes them somewhere. How the application handles the actual tracer is up to the developer: either use it directly throughout the application, or store it in GlobalTracer for easier use with instrumented frameworks. Tracer implementations differ in how they are initialized and in which parameters they take, for example:

  • The component name for the application’s traces.
  • The tracing endpoint.
  • The tracing credentials.
  • The sampling strategy.

Once you have a tracer instance, you can use it to create spans manually or pass it to existing instrumentation for frameworks and libraries. To avoid forcing the user to pass a Tracer around, the io.opentracing.util artifact includes a GlobalTracer class that implements the io.opentracing.Tracer interface and, as the name implies, acts as a global instance that can be used anywhere. It works by forwarding all operations to another underlying tracer that will be registered at some point in the future. By default, the underlying tracer is a no-op implementation.
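
The forwarding behaviour described above can be sketched in a few lines. This is an illustrative toy in plain Python, not the io.opentracing.util implementation: ToyGlobalTracer, NoopTracer, RecordingTracer, and the service_name parameter are all hypothetical.

```python
class NoopTracer:
    """Default delegate: accepts calls and records nothing."""

    def start_span(self, operation_name):
        return None


class RecordingTracer:
    """Hypothetical stand-in for a real tracer implementation."""

    def __init__(self, service_name):
        self.service_name = service_name
        self.started = []

    def start_span(self, operation_name):
        self.started.append(operation_name)
        return operation_name


class ToyGlobalTracer:
    """Forwards every call to whichever tracer is currently registered."""

    def __init__(self):
        self._delegate = NoopTracer()

    def register(self, tracer):
        self._delegate = tracer

    def start_span(self, operation_name):
        return self._delegate.start_span(operation_name)


GLOBAL_TRACER = ToyGlobalTracer()

before = GLOBAL_TRACER.start_span("early_call")  # no-op: nothing is recorded
real = RecordingTracer(service_name="checkout")
GLOBAL_TRACER.register(real)
after = GLOBAL_TRACER.start_span("charge_card")  # forwarded to `real`
```

Instrumented code only ever refers to the global instance, so swapping the tracing system comes down to registering a different delegate at startup.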

Start a new trace

A new trace is started whenever a new span is created without a reference to a parent span. When creating a new span, you need to specify an “operation name,” which is a free-form string that helps you identify the code associated with the span. The next span in the new trace will probably be a child span, which can be thought of as a representation of a subroutine executed within the main span. Therefore, there is a ChildOf relationship between the child span and the parent span. Another type of relationship, FollowsFrom, is used in special cases where the new span is independent of the parent span, such as in an asynchronous process.
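
The two reference types can be sketched as follows. This is a toy model in plain Python rather than the OpenTracing API: the Span layout, the start_span signature, and the use of uuid for identifiers are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Tuple

CHILD_OF = "child_of"
FOLLOWS_FROM = "follows_from"


@dataclass
class Span:
    operation_name: str
    trace_id: str
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    # Each reference is a (reference_type, referenced_span_id) pair.
    references: List[Tuple[str, str]] = field(default_factory=list)


def start_span(operation_name, reference_type=None, referenced=None):
    """No reference starts a new trace; otherwise join the referenced trace."""
    if referenced is None:
        return Span(operation_name, trace_id=uuid.uuid4().hex)
    return Span(
        operation_name,
        trace_id=referenced.trace_id,
        references=[(reference_type, referenced.span_id)],
    )


root = start_span("checkout")                           # starts a new trace
child = start_span("charge_card", CHILD_OF, root)       # work done for root
later = start_span("send_receipt", FOLLOWS_FROM, root)  # independent follow-up
```

All three spans share one trace identifier; only the reference type records whether the new span is part of the parent’s work or merely caused by it.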

Access the active span

You can use the tracer to access the ActiveSpan. In some languages, the ActiveSpan can also be accessed through the ScopeManager. See the language-specific guide for more implementation details.

Use Inject/Extract to propagate the trace

In order to trace across process boundaries in a distributed system, services need to be able to continue the trace injected by the client that sent each request. OpenTracing achieves this by providing inject and extract methods that encode a span’s context into a carrier. The inject method passes the SpanContext into the carrier. For example, trace information is added to a client request so that the server receiving the request can continue the trace. The extract method does the opposite: it extracts the SpanContext from the carrier. For example, when a request arrives from a client, the developer on the receiving side extracts the SpanContext using the io.opentracing.Tracer.extract method.
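
A minimal sketch of the inject and extract round trip, using a plain dictionary of HTTP-style headers as the carrier. This is illustrative Python, not a real tracer: the header names and the standalone inject and extract functions are invented for the example, and real tracers define their own wire formats.

```python
from dataclasses import dataclass, field
from typing import Dict

# Header names are invented for this sketch; real tracers define their own.
TRACE_HEADER = "x-toy-trace-id"
SPAN_HEADER = "x-toy-span-id"
BAGGAGE_PREFIX = "x-toy-baggage-"


@dataclass
class SpanContext:
    trace_id: str
    span_id: str
    baggage: Dict[str, str] = field(default_factory=dict)


def inject(context, carrier):
    """Client side: write the span context into the outgoing headers."""
    carrier[TRACE_HEADER] = context.trace_id
    carrier[SPAN_HEADER] = context.span_id
    for key, value in context.baggage.items():
        carrier[BAGGAGE_PREFIX + key] = value


def extract(carrier):
    """Server side: rebuild the span context, or return None if absent."""
    if TRACE_HEADER not in carrier or SPAN_HEADER not in carrier:
        return None
    baggage = {
        key[len(BAGGAGE_PREFIX):]: value
        for key, value in carrier.items()
        if key.startswith(BAGGAGE_PREFIX)
    }
    return SpanContext(carrier[TRACE_HEADER], carrier[SPAN_HEADER], baggage)


headers = {"content-type": "application/json"}
inject(SpanContext("abc123", "xyz789", {"special_id": "vsid1738"}), headers)
received = extract(headers)   # what the receiving service would reconstruct
```

The receiving service would then start its own span with a reference to the extracted context, so that both sides end up in the same trace.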

Tracing systems that implement OpenTracing

Tracing system    Supported languages
CNCF Jaeger       Java, Go, Python, Node.js, C++, C#
Datadog           Go
inspectIT         Java
Instana           Crystal, Go, Java, Node.js, Python, Ruby
LightStep         Go, Python, JavaScript, Objective-C, Java, PHP, Ruby, C++
stagemonitor      Java