Background


In a distributed online service, a single request passes through multiple modules in multiple systems and may touch hundreds of machines before it completes. At that scale, no single person can keep track of the performance cost of every stage of a request or quickly locate the bottleneck in the system. When a failure occurs, confirming the cause often means reading through large volumes of logs across several teams.


An example

Xiao Liang is a senior engineer who has worked in the field for many years. The system he has to deal with may look like the one shown below.


(Figure: system architecture diagram. Image from the Internet.)


Even with good system design and coding standards (and assuming the log output is standardized as well), Xiao Liang needs two hours to roughly locate where a problem request was processed, relying on his knowledge of multiple systems and browsing a large number of log files. As the number of users grows, the complexity of the system grows exponentially, and Xiao Liang spends most of his time on tasks such as cross-team communication. His happiness drops just as fast as the complexity rises.

If only there were something that could record every system a request passes through and capture the time spent at each node, the classes that handled it, and so on, the world would be a wonderful place.

By chance, Xiao Liang learned about a feature of UAVStack called the call chain, which solved his problem without any intrusion into the business code. Let's take a closer look at UAVStack.

UAVStack call chain technology stack support

| Technology stack | Detailed technology stack name | Support state | Note |
|---|---|---|---|
| JDK | 6, 7, 8, 9, Java custom methods | | Java native Queue is not currently supported |
| Application framework | Dubbo, CXF, AXIS2, XFIRE, SUN JAXWS, Jersey, SpringMVC, SpringRESTService, Servlet (2.5/3.x), Struts 2.x, Wink, Apache HttpClient (synchronous/asynchronous), Log4j, LogBack | Full support | |
| Application server | Tomcat (6+), SpringBoot (also classified as an application server), Jetty (7+) | Full support | |
| Data source | MySQL, Oracle and other JDBC data sources, MongoDB (MongoClient), Redis (JEDIS, LETTUCE, ARedis) | Full support | ARedis is not supported yet |
| Message middleware | RabbitMQ (consumption/production) | Full support | |
| Message middleware | RocketMQ (consumption/production) | To be supported | |
| Message middleware | Kafka (consumption/production) | To be supported | |
| Database connection pool | DBCP/2, c3p0, Druid, Proxool, Hikari, MyBatis CP, Tomcat DBCP/2 | Full support | |

Results

Lightweight call chain details:

Captured view after the heavyweight call chain is enabled:

For more usage tips and instructions, see the call chain section of the user guide on the website: https://uavorg.github.io/documents/uavdoc_useroperation/91.html

Implementation

The implementation of the UAVStack call chain covers model design, server-side information collection (light/heavy), method-level information collection (light/heavy), client-side information collection (light/heavy), call chain protocol design (light/heavy), call chain context propagation, recording and transmission of call information, statistical processing of call data, and more. Due to space limitations, this article covers only the model design and the call chain sequence diagram.

Model design

The following call chain model was abstracted from earlier experience and the requirements of specific business scenarios:

Call chain metadata:

1) SpanEndpointType: the call type (Root("E"), Service("S"), Client("C"), Method("M"));

Root is the first node in a call chain, i.e. where the chain begins. It can be a service request, an HttpClient call, and so on.

Service is a node that is not the first in the current call chain and is a service the system exposes externally, for example a user login service.

Client is a node that is not the first in the current call chain and is a means by which the current system communicates with the outside, such as HttpClient or MongoClient.

Method is a node that is not the first in the current call chain and is a function inside the system, such as a log counting function.

2) traceId: the unique identifier of a call chain;

3) spanId: the call order of the current node within a call chain (unique in combination with SpanEndpointType). spanId uses a layered design of the form 1.2.1, which expresses both the order of calls and the hierarchy of the call chain;

4) parentId: the spanId of the node that called the current node within the call chain.
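
The metadata above could be modeled roughly as follows. This is a minimal sketch, not UAVStack's actual classes: the names `EndpointType` and `SpanMeta` and the helpers `depth()` and `key()` are assumptions made for illustration. Only the four type codes and the four fields come from the description above.

```java
import java.util.Objects;

/** Call types and their one-letter codes, as listed in the metadata above. */
enum EndpointType {
    ROOT("E"), SERVICE("S"), CLIENT("C"), METHOD("M");

    private final String code;

    EndpointType(String code) { this.code = code; }

    String code() { return code; }
}

/** Hypothetical immutable holder for the metadata of one call chain node. */
final class SpanMeta {
    final String traceId;      // identifies the whole call chain
    final String spanId;       // layered id such as "1.2.1"
    final String parentId;     // spanId of the calling node
    final EndpointType type;

    SpanMeta(String traceId, String spanId, String parentId, EndpointType type) {
        this.traceId = Objects.requireNonNull(traceId);
        this.spanId = Objects.requireNonNull(spanId);
        this.parentId = parentId;
        this.type = Objects.requireNonNull(type);
    }

    /** Nesting level encoded by the layered spanId: "1" is level 1, "1.2.1" is level 3. */
    int depth() { return spanId.split("\\.").length; }

    /** A spanId is only unique together with the endpoint type, so the key includes both. */
    String key() { return traceId + ":" + spanId + ":" + type.code(); }
}
```

Including the type code in `key()` reflects the note that a spanId is unique only in combination with SpanEndpointType: both ends of a call between two systems can share the same spanId.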

Call chain drawing rules:

1) The initial call from the caller (a service or the web), i.e. a call with no parent, is recorded as the start node E, and the unique call chain ID, traceId, is generated;

2) For calls to application components inside a system (such as HttpClient calls and method calls), the last segment of the spanId is incremented by 1 (for the first such call, .1 is appended to the parent's spanId);

3) For calls between systems (for example, service A calls service B), the span information of service A and service B differs only in SpanEndpointType (the two values correspond to the two ends of the span).
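
The numbering rule for spanId can be illustrated with a small allocator. This is a sketch under assumptions: the class `SpanIdAllocator` and its methods are hypothetical and do not describe how UAVStack stores its counters. It only shows the rule that the first child of a node gets .1 appended and each later child increments the last segment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical allocator that hands out layered spanIds within one call chain. */
final class SpanIdAllocator {
    // Number of child spanIds already issued under each parent spanId.
    private final Map<String, Integer> childCount = new HashMap<>();

    /** The start node of a call chain gets spanId "1". */
    String root() { return "1"; }

    /**
     * Returns the next child spanId under the given parent:
     * the first child appends ".1", each later child increments the last segment.
     */
    String nextChild(String parentSpanId) {
        // merge() stores and returns the updated count for this parent.
        int n = childCount.merge(parentSpanId, 1, Integer::sum);
        return parentSpanId + "." + n;
    }
}
```

With this sketch, calling `nextChild("1")` once and then `nextChild("1.1")` twice yields 1.1, 1.1.1 and 1.1.2 in that order.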

An example

Background: Xiao Ming wants to look something up online and reaches system O over the network. Services A and B are deployed in system O. Service A communicates with service B through HttpClient. Service B first interacts with Redis and then with MySQL.

To complete this request, UAV abstracts the following call chain model:

1) Xiao Ming (the caller in the figure below) accesses service A through the portal. The call chain generates a unique traceId and sets the SpanEndpointType of the current node to E (the first node), spanId to 1 (the first node at the current call level), and parentId to N (no parent node);

2) Service A sends an HTTP request to service B through HttpClient. The call chain metadata is: traceId (the ID generated by the initial call); spanId 1.1 (.1 is appended because this is the first call); parentId 1 (the spanId of the parent node); type C (a client call);

3) Service B receives the call that service A made through HttpClient. The call chain metadata is: traceId (the ID generated by the initial call); spanId 1.1 (the spanId that was passed in); parentId 1 (the parentId that was passed in); type S (a service handling the request);

4) Service B first queries Redis. The call chain metadata is: traceId (the ID generated by the initial call); spanId 1.1.1 (.1 is appended because this is the first call); parentId 1.1 (the spanId of the parent node); type C (a client call);

5) Service B then queries MySQL. The call chain metadata is: traceId (the ID generated by the initial call); spanId 1.1.2 (the last segment is incremented by 1 because this is not the first call); parentId 1.1 (the spanId of the parent node); type C (a client call);

6) When processing ends, the call chain outputs the information it has recorded.
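
Steps 2 and 3 only work if service A hands its call chain context over to service B. The following sketch assumes the context travels in HTTP request headers, modeled here as a plain `Map`. The class `TraceContext` and the header names `X-Trace-Id`, `X-Span-Id` and `X-Parent-Id` are invented for illustration and are not UAVStack's call chain protocol.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical carrier for call chain context passed between two systems. */
final class TraceContext {
    // Header names are assumptions made for this sketch.
    static final String TRACE_ID = "X-Trace-Id";
    static final String SPAN_ID = "X-Span-Id";
    static final String PARENT_ID = "X-Parent-Id";

    final String traceId;
    final String spanId;
    final String parentId;

    TraceContext(String traceId, String spanId, String parentId) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
    }

    /** Client side (type C): copy the context into the outgoing request headers. */
    void inject(Map<String, String> headers) {
        headers.put(TRACE_ID, traceId);
        headers.put(SPAN_ID, spanId);
        headers.put(PARENT_ID, parentId);
    }

    /** Service side (type S): reuse the ids that were passed in, or return null if there are none. */
    static TraceContext extract(Map<String, String> headers) {
        String traceId = headers.get(TRACE_ID);
        if (traceId == null) {
            return null; // no context received: this request would start a new call chain
        }
        return new TraceContext(traceId, headers.get(SPAN_ID), headers.get(PARENT_ID));
    }
}
```

In the example above, service A would inject traceId, spanId 1.1 and parentId 1, and service B would extract the same three values and record them with type S instead of C.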


Call chain sequence diagram


UAVServer: the middleware enhancement framework. It provides the ability to hook into different stages of the middleware life cycle (for example, when the Tomcat WebContainer starts), a technique known as middleware hijacking.

JEEServiceRunGlobalFilterHandler: uses middleware hijacking to add a global filter that intercepts every request passing through the middleware (such as Tomcat);

ServiceSpanInvokeChainHandler: the handler in the call chain that processes Service-type nodes;

ClientSpanInvokeChainHandler: the handler in the call chain that processes Client-type nodes;

XXAdapter: stands for any of the adapters in the call chain. An adapter lets a handler (Service, Client or Method; the Method type is omitted in the figure) operate on data before and after the action.

To generate the call chain without any intrusion into user code, processing is roughly divided into the following steps:

1) doRequest in JEEServiceRunGlobalFilterHandler wraps and parses the request;

2) before in xxAdapter adapts the data;

3) xxHandler processes the request data within its own scope (Service, Client or Method);

4) after in xxAdapter collates or records the data;

5) doResponse in JEEServiceRunGlobalFilterHandler returns once the request has been processed.
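
The ordering of the five steps can be sketched as a small pipeline. Everything here is hypothetical: `SpanAdapter`, `SpanHandler` and `GlobalFilterSketch` are stand-ins rather than UAVStack's real interfaces, and the point at which the business logic runs relative to the steps is an assumption. The sketch only shows how interception can wrap user code without changing it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical adapter: works on the data before and after the intercepted action. */
interface SpanAdapter {
    void before(List<String> steps);
    void after(List<String> steps);
}

/** Hypothetical handler for one node type (Service, Client or Method). */
interface SpanHandler {
    void handle(List<String> steps);
}

/** Stand-in for the global filter that wraps each request passing through the middleware. */
final class GlobalFilterSketch {
    private final SpanAdapter adapter;
    private final SpanHandler handler;

    GlobalFilterSketch(SpanAdapter adapter, SpanHandler handler) {
        this.adapter = adapter;
        this.handler = handler;
    }

    /** Runs the five steps around the unmodified business logic and returns the steps taken. */
    List<String> process(Runnable businessLogic) {
        List<String> steps = new ArrayList<>();
        steps.add("doRequest");        // 1) wrap and parse the request
        adapter.before(steps);         // 2) adapt the data before the action
        handler.handle(steps);         // 3) handle the data for this node type
        try {
            businessLogic.run();       // user code runs untouched (placement assumed)
        } finally {
            adapter.after(steps);      // 4) collate or record the data
            steps.add("doResponse");   // 5) finish once the request has been processed
        }
        return steps;
    }
}
```

Because the adapter and handler are passed in, the same filter skeleton could serve Service, Client and Method nodes, while the business logic stays a plain `Runnable` that knows nothing about tracing.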

Conclusion

This article aims to give readers an overall picture of the UAVStack call chain and a first understanding of the general life cycle of drawing a call chain. The implementation details will be covered in future articles.


Source: Creditease Institute of Technology