Background

While it brings many benefits, microservice architecture also increases the complexity of a system. A traditional monolithic application is split into distributed microservices along different dimensions, and the resulting microservices may even be written in different languages. In addition, deployment tends to be distributed, with perhaps thousands of servers spread across data centers in multiple cities. The figure below shows a typical microservice architecture, and the number of nodes in it is relatively small; in Alipay, a complete offline payment transaction link involves hundreds of nodes.

Photo credit: www.splunk.com/en_us/data-…

Moving to microservices introduces the following typical problems:

  1. Fault localization is difficult: a single request often spans multiple services, and troubleshooting may require the cooperation of multiple teams

  2. It is difficult to sort out the whole call chain and analyze the call relationships between nodes

  3. Performance analysis is difficult; it is hard to identify the bottleneck nodes

In fact, the above are all application observability problems, commonly divided into:

  1. Logs

  2. Traces

  3. Metrics

This article will focus on the Trace pillar, whose full name is Distributed Tracing. In 2010, Google published the Dapper paper, sharing the design of one of the earliest distributed tracing systems in the industry. Since then, many Internet companies have followed Dapper's ideas and launched their own tracing systems, including Twitter's Zipkin, Alibaba's EagleEye, Pinpoint, Apache HTrace, and Uber's Jaeger. And, of course, there is our protagonist: SOFATracer. The many implementations of distributed tracing gave rise to a specification, OpenTracing, which was merged with OpenCensus in 2019 to become OpenTelemetry.

OpenTracing

Before diving into SOFATracer, a quick explanation of OpenTracing, because SOFATracer is built on the OpenTracing specification (based on OpenTracing 0.22.0; newer versions of the specification have a different API). A Trace consists of the Spans generated by service invocations and the references between them. A Span represents a span of time: each service invocation creates a new Span, divided into the calling (client) Span and the called (server) Span. A Span typically contains:

  1. TraceId and SpanId

  2. The operation name

  3. Duration (time consumed)

  4. Service invocation result

There is usually more than one service invocation in a Trace, so there will be more than one Span. The relationship between Spans is declared by a reference from the caller to the service provider. OpenTracing specifies two reference types (a short code sketch follows the list):

  1. ChildOf: a synchronous service invocation; the client needs the result returned by the server to proceed with subsequent processing.

  2. FollowsFrom: an asynchronous service invocation; the client does not wait for the result from the server.
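As a minimal sketch of these two reference types, the following uses the io.opentracing Java API (0.33-style calls, not SOFATracer's 0.22.0 API); the Tracer instance and the operation names are placeholders:

import io.opentracing.References;
import io.opentracing.Span;
import io.opentracing.Tracer;

/** Sketch only: declaring ChildOf and FollowsFrom references with io.opentracing. */
public class ReferenceTypesSketch {
    static void demo(Tracer tracer) {
        Span parent = tracer.buildSpan("parent-operation").start();

        // ChildOf: a synchronous call; the parent waits for the result.
        Span syncChild = tracer.buildSpan("sync-call")
                .asChildOf(parent.context())
                .start();
        syncChild.finish();

        // FollowsFrom: an asynchronous call; the parent does not wait for the result.
        Span asyncChild = tracer.buildSpan("async-call")
                .addReference(References.FOLLOWS_FROM, parent.context())
                .start();
        asyncChild.finish();

        parent.finish();
    }
}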

A Trace is a directed acyclic graph. The topology of a call can be shown as follows:

The SpanContext in the diagram is the data shared across a request, hence the name Span context. Data that a service node puts into the context is visible to all subsequent nodes and can therefore be used to pass information along the call chain.
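In the OpenTracing API, this kind of information transmission is done with baggage items. A small sketch (again io.opentracing with placeholder names, and assuming a tracer implementation that propagates baggage to child spans):

import io.opentracing.Span;
import io.opentracing.Tracer;

/** Sketch only: passing data through the SpanContext as baggage. */
public class BaggageSketch {
    static void demo(Tracer tracer) {
        Span entrySpan = tracer.buildSpan("entry").start();
        // Written once at the entry node...
        entrySpan.setBaggageItem("user-id", "12345");

        // ...and carried in the SpanContext, so downstream spans in the same trace can read it.
        Span downstream = tracer.buildSpan("downstream")
                .asChildOf(entrySpan.context())
                .start();
        String userId = downstream.getBaggageItem("user-id"); // expected "12345" when baggage is propagated
        System.out.println(userId);

        downstream.finish();
        entrySpan.finish();
    }
}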

SOFATracer

TraceId generation

A TraceId ties together all the service nodes involved in a single request. Its generation rule needs to avoid collisions between different TraceIds, and the overhead should not be too high, since generating trace data is extra work on top of the business logic. The TraceId generation rule in SOFATracer is: server IP address + timestamp when the ID was generated + increasing sequence + current process ID. For example:

0ad1348f1403169275002100356696

The first 8 characters, 0ad1348f, are the IP of the machine that generated the TraceId. They are hexadecimal, with every two characters representing one segment of the IP; converting them to decimal gives the familiar dotted form 10.209.52.143. You can use this rule to find the first server a request passed through. The next 13 digits, 1403169275002, are the time at which the TraceId was generated. The following four digits, 1003, are an increasing sequence that rises from 1000 to 9000 and then wraps back to 1000. The last five digits, 56696, are the current process ID; it is appended to the TraceId to prevent collisions between multiple processes on the same machine.

The pseudocode is as follows:

TraceIdStr.append(ip).append(System.currentTimeMillis())
          .append(getNextId()).append(getPID());
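For illustration, a self-contained Java sketch of this rule might look like the following; this is not SOFATracer's actual class, the helper names are made up, and the real implementation handles more edge cases:

import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of the TraceId rule described above: hex IP + timestamp + sequence + PID. */
public class TraceIdSketch {
    private static final AtomicInteger SEQ = new AtomicInteger(1000);

    /** "10.209.52.143" -> "0ad1348f": each IP segment becomes two hex characters. */
    static String hexIp(String ip) {
        StringBuilder sb = new StringBuilder();
        for (String segment : ip.split("\\.")) {
            String hex = Integer.toHexString(Integer.parseInt(segment));
            sb.append(hex.length() == 1 ? "0" + hex : hex);
        }
        return sb.toString();
    }

    /** Increasing sequence that rises to 9000 and then wraps back to 1000. */
    static int nextSeq() {
        return SEQ.updateAndGet(n -> n >= 9000 ? 1000 : n + 1);
    }

    /** Current process ID; RuntimeMXBean#getName is usually "pid@hostname". */
    static String pid() {
        return ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
    }

    static String generate(String ip) {
        return hexIp(ip) + System.currentTimeMillis() + nextSeq() + pid();
    }

    public static void main(String[] args) {
        System.out.println(generate("10.209.52.143")); // e.g. 0ad1348f<timestamp><seq><pid>
    }
}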

SpanId generation

SpanId records the service invocation topology. In SOFATracer:

  1. The dot represents the call depth

  2. Numbers represent the order of invocation

  3. SpanIds are created by the client (caller)

The generation rules for TraceId and SpanId in SOFATracer follow Alibaba's EagleEye component; a small illustrative sketch of the SpanId rule is shown below.
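To make the rule concrete, here is a purely illustrative sketch of how a child SpanId could be derived from its parent, assuming the entry SpanId is "0"; this is not SOFATracer's implementation:

import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: "dots represent depth, numbers represent call order". */
public class SpanIdSketch {
    private final String spanId;                       // e.g. "0", "0.1", "0.2.1"
    private final AtomicInteger childIndex = new AtomicInteger(0);

    SpanIdSketch(String spanId) { this.spanId = spanId; }

    /** Each outgoing call from this span appends the next index after a dot. */
    SpanIdSketch nextChild() {
        return new SpanIdSketch(spanId + "." + childIndex.incrementAndGet());
    }

    @Override public String toString() { return spanId; }

    public static void main(String[] args) {
        SpanIdSketch root = new SpanIdSketch("0");     // entry request
        SpanIdSketch first = root.nextChild();         // "0.1": first downstream call
        SpanIdSketch second = root.nextChild();        // "0.2": second downstream call
        SpanIdSketch nested = second.nextChild();      // "0.2.1": call made inside the second call
        System.out.println(first + " " + second + " " + nested);
    }
}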

Combining the calling Span and the called Span, together with the TraceId and SpanId, we can build a complete service invocation topology:

Trace instrumentation

But how is Trace data generated and collected? This is where the instrumentation framework comes in. It is responsible for:

  1. Generation, transmission, and reporting of Trace data

  2. Extraction and injection of the Trace context

In addition, instrumentation should be automatic, minimally intrusive, and low-overhead. A typical instrumentation point is structured as follows, wrapping around the business logic:

  1. Server Receive (SR): create the server-side (parent) Span, either as a child of the context extracted from the request or as a new root Span if no context is present

  2. Call the business code

  3. The business code initiates a further remote service call

  4. Client Send (CS): create a child Span and pass on the TraceId, SpanId, and baggage (transparently transmitted data)

  5. Client Receive (CR): finish the current child Span and record/report it

  6. Server Send (SS): finish the parent Span and record/report it

Steps 3-5 may not occur, or may be repeated multiple times. A sketch of the client side (CS/CR) follows:
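This is a rough sketch of steps 4-5 (Client Send / Client Receive), again with the io.opentracing 0.33-style API; the tracer, the parent Span, and callRemoteService are placeholders rather than SOFATracer code:

import java.util.HashMap;
import java.util.Map;

import io.opentracing.Span;
import io.opentracing.Tracer;
import io.opentracing.propagation.Format;
import io.opentracing.propagation.TextMapAdapter;

/** Sketch only: the client-side instrumentation around an outgoing call. */
public class ClientBurialPointSketch {
    static String call(Tracer tracer, Span parent, String request) {
        // CS: create the child Span for this outgoing call.
        Span clientSpan = tracer.buildSpan("remote-call")
                .asChildOf(parent.context())
                .start();

        // Inject TraceId / SpanId / baggage into the carrier (e.g. HTTP headers).
        Map<String, String> headers = new HashMap<>();
        tracer.inject(clientSpan.context(), Format.Builtin.HTTP_HEADERS, new TextMapAdapter(headers));

        try {
            return callRemoteService(request, headers);   // the actual RPC / HTTP call
        } finally {
            // CR: the response (or error) has arrived; finish and record/report the Span.
            clientSpan.finish();
        }
    }

    static String callRemoteService(String request, Map<String, String> headers) {
        return "ok";   // stand-in for the real transport
    }
}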

There are a variety of ways to implement this instrumentation logic; the current mainstream approaches are:

  1. Filter/interceptor extension points provided by the framework (Dubbo, SOFARPC, Spring MVC)

  2. AOP aspects (DataSource, Redis, MongoDB)

     a. Proxy

     b. Bytecode generation

  3. Hook mechanisms (Spring Message, RocketMQ)

For the Java language, SkyWalking and Pinpoint use a Java agent to achieve automatic, non-invasive instrumentation. A typical SOFATracer instrumentation of Spring MVC looks as follows:
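To give a feel for the shape of such a server-side instrumentation point, here is a hypothetical Spring MVC HandlerInterceptor doing SR/SS with the io.opentracing API. It is not SOFATracer's actual interceptor, and it assumes Spring 5+ (default interceptor methods) and the javax servlet API:

import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import io.opentracing.Span;
import io.opentracing.SpanContext;
import io.opentracing.Tracer;
import io.opentracing.propagation.Format;
import io.opentracing.propagation.TextMapAdapter;
import org.springframework.web.servlet.HandlerInterceptor;

/** Sketch only: server-side SR/SS instrumentation as a Spring MVC interceptor. */
public class TracingInterceptorSketch implements HandlerInterceptor {
    private static final String SPAN_ATTRIBUTE = "sketch.server.span";
    private final Tracer tracer;

    public TracingInterceptorSketch(Tracer tracer) {
        this.tracer = tracer;
    }

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // SR: extract the upstream SpanContext from the HTTP headers (may be null) and start the server Span.
        Map<String, String> headers = new HashMap<>();
        Enumeration<String> names = request.getHeaderNames();
        while (names.hasMoreElements()) {
            String name = names.nextElement();
            headers.put(name, request.getHeader(name));
        }
        SpanContext upstream = tracer.extract(Format.Builtin.HTTP_HEADERS, new TextMapAdapter(headers));
        Span serverSpan = tracer.buildSpan(request.getRequestURI())
                .asChildOf(upstream)          // a null parent simply starts a new trace
                .start();
        request.setAttribute(SPAN_ATTRIBUTE, serverSpan);
        return true;
    }

    @Override
    public void afterCompletion(HttpServletRequest request, HttpServletResponse response, Object handler, Exception ex) {
        // SS: the response has been written; finish and record/report the Span.
        Span serverSpan = (Span) request.getAttribute(SPAN_ATTRIBUTE);
        if (serverSpan != null) {
            serverSpan.setTag("http.status_code", response.getStatus());
            serverSpan.finish();
        }
    }
}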

In SOFATracer, Spans are created 100% of the time; only logging/reporting supports sampling. Logging/reporting is the relatively expensive part, making it more likely to become a performance bottleneck under heavy traffic or load. Some other Trace systems sample Spans at creation time, but in order to still have 100% of Traces when a call fails, they use a backward sampling strategy.

By default, SOFATracer prints Trace information to log files:

  1. Client-digest: the calling (client) Span

  2. Server-digest: the called (server) Span

  3. Client-stat: one-minute aggregation of calling-Span data

  4. Server-stat: one-minute aggregation of called-Span data

The default log format is JSON, but you can customize it.

APM

In addition to Trace collection and reporting, a typical Trace system also includes collectors, storage, and presentation (API & UI). Together these make up Application Performance Management (APM), as shown in the figure below:

Image source: pinpoint-apm.github.io/pinpoint/ov…

General requirements for Trace data reporting are timeliness and consistency. SOFATracer supports reporting to Zipkin by default. Before storage, stream processing combines the calling Span and the called Span, usually with Alibaba JStorm or Apache Flink; once processed, the Trace data is stored in Apache HBase. Since Trace data is only useful for a short period, the system automatically deletes expired data, usually after 7 to 10 days. For the presentation layer, querying and analyzing the data in HBase requires support for:

  1. Graphical representation of directed acyclic graphs

  2. Query based on TraceId

  3. Query by caller

  4. Query by callee

  5. Query by IP address

Image source: pinpoint-apm.github.io/pinpoint/im…
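As for the stream-processing step mentioned above that pairs the calling Span with the called Span, here is a conceptual in-memory sketch of the pairing logic (Java 16+ records are used for brevity). The record types and field names are made up for illustration; a real job in JStorm or Flink would also handle timeouts, late data, and state management:

import java.util.HashMap;
import java.util.Map;

/** Sketch only: pairing the client-side and server-side halves of the same call. */
public class SpanMergeSketch {
    record SpanRecord(String traceId, String spanId, String side, long durationMs) {}
    record CallRecord(String traceId, String spanId, long clientMs, long serverMs) {}

    private final Map<String, SpanRecord> pending = new HashMap<>();

    /** Returns a merged record once both halves of the same (TraceId, SpanId) have arrived. */
    CallRecord accept(SpanRecord span) {
        String key = span.traceId() + "|" + span.spanId();
        SpanRecord other = pending.remove(key);
        if (other == null) {
            pending.put(key, span);                 // wait for the matching half
            return null;
        }
        SpanRecord client = "client".equals(span.side()) ? span : other;
        SpanRecord server = client == span ? other : span;
        return new CallRecord(span.traceId(), span.spanId(), client.durationMs(), server.durationMs());
    }
}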

At Ant Group, we do not report Spans directly; instead, Spans are printed to logs and the logs are collected on demand. The structure is as follows:

(Relic and Antique are not real system names.)

On each host, a DaemonSet agent collects the Trace logs: digest logs are used for troubleshooting, and stat logs are used for service monitoring. After the log data is collected, it is processed by the Relic system, which cleans and aggregates the per-machine log data; the Antique system then further aggregates the Trace data by application and service dimensions using Spark. Finally, the processed Trace data is stored in CeresDB for query and analysis from the web console. The system can also be configured with monitoring and alerting to give early warning of application anomalies. At present, this monitoring and alerting is close to real time, with a delay of about one minute.

Distributed tracing has been continuously improved and its features keep growing. Today, the Application Performance Management built around it not only provides full tracing capabilities, but also includes:

  1. Storage & analysis, rich terminal features

  • Full-link load testing

  3. Performance analysis

  4. Monitor & alarm: CPU, memory, JVM information, etc

At Ant Group, we have a dedicated load-testing platform. When the platform generates load-test traffic, it carries its own specially constructed TraceIds, SpanIds, and transparently transmitted data (load-test markers), so that the resulting logs can be printed separately. You are welcome to use SOFATracer as your distributed tracing tool; see the SOFATracer Quick Start Guide linked below.

Looking forward

SOFATracer's future development plan is as follows; contributions are welcome! The project's GitHub link is below.

Related links

SOFATracer quick start: www.sofastack.tech/projects/so…

SOFATracer Github project: github.com/sofastack/s…

OpenTracing: opentracing.io

OpenTelemetry: opentelemetry.io
