preface

Recently a friend of mine kept complaining that his company’s system logs were so poor: little useful information, a lot of noise, and even the useful parts were painful to trace across all those node logs.

I asked him whether the company collected its logs centrally, since centralized collection beats viewing logs node by node. He said the company had bought a paid third-party log product that handles collection, but it only provides unified viewing and search; the system’s own logs are missing key elements, so the call logs between the many microservices are messy and hard to pin down.

The problems are summarized as follows:

  • Not all logs carry the critical information, such as order numbers or SKU details. Some logs have it and some do not, so a log may record a process step without telling you which object is being processed.
  • There is no uniform logging specification, so the logs look chaotic.
  • Tracing the call chain of a single request across microservices is painful: you can only search by timestamp. With low concurrency that still works, but with high concurrency the cost of troubleshooting becomes far too high.

I recommended some distributed tracing frameworks to him, such as SkyWalking and Pinpoint. He looked at them and said they were great, but a bit expensive to build and operate, with storage costs on top. Since the company had already purchased a third-party logging service, hooking these frameworks up would be tricky, and he was afraid management would not approve.

Recently I saw an open-source framework in the community that bills itself as a lightweight log-tracing tool you can integrate in 10 minutes, so I recommended it to my friend. A few days later he told me it fit his current pain points very well. He has since rolled it out to production, is very satisfied with the preliminary results, and says it has cut the time cost of locating logs.

Project features

Since I was the one who recommended it, I decided to take a closer look at this framework myself, just to give you an intuition for it.

This framework is called TLog, and the project is hosted on Gitee:

Gitee.com/dromara/TLo…

The home page looks like this.

To put it bluntly, TLog automatically prefixes each log line with what it calls a label. Labels are divided into system-level labels and business labels, and business labels can be customized by developers. I drew a picture to make this easier to understand:

TLog ultimately renders each log line as shown above. The system label can carry five pieces of information. The upstream information tells you who called your system, and the link TraceId is the unique ID of the global link across microservices: search one ID and you get the logs for the same request in every system. That alone is pretty sweet!
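To make the label idea concrete, here is a minimal sketch of how a tracing tool can prefix every log line from per-request context. This is not TLog's actual implementation, and the label layout shown (upstream app, trace id, span id) is only an assumed example; in a real framework the context would live in SLF4J's MDC or a ThreadLocal populated by a request interceptor.

```java
import java.util.UUID;

// Illustrative only: builds a labeled log line the way a tracing tool might.
public class LabelDemo {
    // Hypothetical label layout: <upstream app><global trace id><span id>.
    public static String render(String preApp, String traceId, String spanId, String message) {
        String label = "<" + preApp + "><" + traceId + "><" + spanId + ">";
        return label + " " + message;
    }

    public static void main(String[] args) {
        // A random hex trace id stands in for the framework-generated one.
        String traceId = UUID.randomUUID().toString().replace("-", "");
        System.out.println(render("order-service", traceId, "0.1", "stock deducted"));
    }
}
```

Because the label is a plain prefix on each line, searching one TraceId across all systems needs nothing more than a text search in whatever log viewer you already have.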

As for SpanId, the official site explains that it represents the position of a call within the whole call-link tree. For a concrete demonstration, the figure on the official site explains it fairly clearly:

My personal understanding of SpanId is that it gives you a sense of the hierarchy of systems within one call chain; if collected, the SpanIds could be used to generate a tree of the call links. I wish TLog implemented this tree display, but it is not supported yet.
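The hierarchy idea can be sketched with a tiny generator: the entry call is "0", its outgoing calls are "0.1", "0.2", and their children "0.1.1" and so on, so the id alone encodes the position in the call tree. This numbering scheme is my illustration of the concept; TLog's exact scheme may differ.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative hierarchical span ids: each downstream call gets the next
// child id under the current span, so "0.1.2" means: second outgoing call
// made by the first outgoing call of the entry node.
public class SpanIdDemo {
    private final String spanId;
    private final AtomicInteger childSeq = new AtomicInteger(0);

    public SpanIdDemo(String spanId) { this.spanId = spanId; }

    // Allocate the id for the next outgoing call under this span.
    public SpanIdDemo nextChild() {
        return new SpanIdDemo(spanId + "." + childSeq.incrementAndGet());
    }

    public String id() { return spanId; }

    public static void main(String[] args) {
        SpanIdDemo root = new SpanIdDemo("0");
        SpanIdDemo first = root.nextChild();               // 0.1
        SpanIdDemo second = root.nextChild();              // 0.2
        System.out.println(first.id() + " " + second.id() + " " + first.nextChild().id());
    }
}
```

Collecting these ids and sorting them lexicographically by segment is exactly what would let a backend rebuild the call tree, which is the display I wish TLog offered.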

“TLog is positioned as the simplest way to solve log-tracing problems. It does not collect logs and requires no extra storage space; it simply tags your business logs automatically, generates a TraceId across your whole microservice link, and provides information about upstream and downstream nodes. It suits small and medium-sized businesses and projects that want to solve log tracing quickly.”

That is the positioning statement from the official site. When I actually tested TLog, the logs it produced were still the application’s own logs, scattered across the microservice nodes; the framework merely makes it logical to stitch them together. The TLog documentation itself suggests pairing it with other solutions for log collection, such as ELK, Alibaba Cloud Log Service, or other paid log products. TLog only modifies log output and generates no new logs, so it connects to other log-collection solutions without problems, and companies that already have collection in place need not change anything.

Supported logging frameworks

Every company uses a different logging framework. TLog claims to support the three major ones: Log4j, Log4j2, and Logback.

In my tests, TLog printed labels correctly under all three frameworks. For integration, there are three official access methods.

In testing, the JavaAgent approach was indeed unstable for some projects and ineffective for some complex ones. The log-framework adaptation, which is claimed to be the most stable, worked when we tested it on our company’s project.

For access, just follow the documentation step by step.

Supported RPC framework

Since log tracing crosses microservices, the framework also has to support the common RPC frameworks. Happily, TLog supports three of them: Dubbo, Spring Cloud, and Dubbox.

In actual testing, TLog also supports Spring Cloud Gateway.

During integration, whichever RPC framework you use, TLog adapts automatically in a Spring Boot environment: add the dependency and it is auto-assembled, with no extra configuration required. Very convenient.

In a native Spring environment (non-Spring Boot), however, some additional XML configuration is required, but it is fairly simple and well documented.

Business tag

Besides the system-provided labels, another feature I found is that TLog lets developers customize labels. Access is also simple, for example:

@TLogAspect({"str", "user.name", "user.userDetail.company", "user.dddd"})
public void test(String str, User user) {
    log.info("This is a custom expression tag.");
    log.info("This is business log 1.");
    log.info("This is business log 2.");
    log.info("This is business log 3.");
    log.info("This is business log 4.");
    log.info("This is business log 5.");
}

Add the annotation to a method, and every log written below that method, including the next N levels of nested calls, is automatically tagged with the label you defined.

This feature makes it possible to add many tags for laying out and searching logs. Awesome!

Custom tags even support logical processing: the tag information can be produced by a custom converter class:

@TLogAspect(convert = CustomAspectLogConvert.class)
public void demo(Person person){
  log.info("Custom Convert Example");
}

You can try this out: one annotation covers all your custom tags.
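To show what the converter pattern buys you, here is a generic sketch. Note the interface name, method signature, and classes below are all illustrative assumptions, not TLog's real API (check the TLog docs for the actual converter contract); the point is only that a class can turn a method argument into a single computed tag instead of listing field expressions.

```java
// Illustrative converter pattern: derive one tag string from an argument.
public class ConvertDemo {
    // Hypothetical converter contract; TLog's real interface may differ.
    interface AspectLogConvert { String convert(Object arg); }

    static class Person {
        final String name; final String city;
        Person(String name, String city) { this.name = name; this.city = city; }
    }

    // Custom logic: normalize and combine fields into a single label value.
    static class PersonConvert implements AspectLogConvert {
        public String convert(Object arg) {
            Person p = (Person) arg;
            return p.name.toUpperCase() + "@" + p.city;
        }
    }

    public static void main(String[] args) {
        System.out.println(new PersonConvert().convert(new Person("bob", "tokyo")));
    }
}
```

The advantage over field expressions like "user.name" is that the converter can apply arbitrary logic: masking sensitive data, joining several fields, or handling nulls.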

Support for other scenarios

Parameter & elapsed-time printing support:

After integrating TLog, I found that it can also print each method’s parameters and elapsed time, regardless of the logging framework. This is disabled by default; you only need one configuration entry:

tlog.enableInvokeTimePrint=true

The resulting effect is as follows:

Asynchronous thread & thread pool support

If your project uses asynchronous threads, label passing stays consistent automatically, with no problems.

However, TLog has no transparent support for thread-pool scenarios; instead it provides a helper class that requires a small amount of intrusive code. This could be improved.
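Why thread pools need special handling: a pooled thread is reused across requests, so a plain ThreadLocal label set by the submitting thread never reaches it. The usual pattern, which TLog's helper class follows in spirit (this sketch is not TLog's API), is to capture the context at submit time, restore it inside the task, and clean up afterwards:

```java
// General context-propagation pattern for thread pools (not TLog-specific).
public class ContextWrapDemo {
    // Stand-in for the tracing label; a real framework would use MDC.
    static final ThreadLocal<String> TRACE = new ThreadLocal<>();

    static Runnable wrap(Runnable task) {
        String captured = TRACE.get();          // read on the submitting thread
        return () -> {
            TRACE.set(captured);                // restore on the pool thread
            try {
                task.run();
            } finally {
                TRACE.remove();                 // avoid leaking into reused threads
            }
        };
    }

    public static void main(String[] args) throws Exception {
        TRACE.set("trace-42");
        StringBuilder seen = new StringBuilder();
        Thread worker = new Thread(wrap(() -> seen.append(TRACE.get())));
        worker.start();
        worker.join();
        System.out.println("worker saw: " + seen);  // worker saw: trace-42
    }
}
```

The `finally` cleanup is the important part: without it, a reused pool thread would log the previous request's label, which is exactly the kind of subtle bug that makes this "small amount of intrusive code" worth centralizing in a helper.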

MQ support

Similarly, TLog can pass labels in MQ scenarios, which also requires a small amount of intrusive code. If you change nothing, MQ scenarios are not supported.
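The shape of that intrusive code is the standard MQ propagation pattern: the producer puts the trace label into the message headers, and the consumer restores it before logging. The header name and helper methods below are hypothetical, map-based stand-ins for whatever your broker's header API and TLog's helpers actually look like:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative MQ label propagation via message headers (not TLog's API).
public class MqLabelDemo {
    static final String TRACE_KEY = "tlogTraceId"; // hypothetical header name

    // Producer side: attach the current trace label before sending.
    static Map<String, String> send(String traceId, Map<String, String> headers) {
        headers.put(TRACE_KEY, traceId);
        return headers;
    }

    // Consumer side: restore the label before any business logging runs.
    static String receive(Map<String, String> headers) {
        return headers.get(TRACE_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> headers = send("trace-7", new HashMap<>());
        System.out.println("consumer sees traceId = " + receive(headers));
    }
}
```

Because the label travels inside the message rather than the thread, the same TraceId survives the asynchronous hop between producer and consumer processes.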

performance

To gauge performance, I ran a simple test on TLog, in a Log4j2 environment only. The test printed tens of thousands of log lines synchronously, comparing the elapsed time in a plain environment against the same run with the TLog framework added.

The test code is very simple:

StopWatch stopWatch = new StopWatch();  // org.springframework.util.StopWatch
stopWatch.start();
for (int i = 0; i < 50_000; i++) {      // 50,000 (or 100,000) lines per run
    log.info("test log {}", i + 1);
}
stopWatch.stop();
log.info("Time consuming: {}", stopWatch.getTotalTimeSeconds());

Without TLog

| Run | 50,000 logs (seconds) | 100,000 logs (seconds) |
| --- | --- | --- |
| 1 | 6.496249974 | 15.595447718 |
| 2 | 6.185712521 | 14.295489162 |
| 3 | 6.116123718 | 13.559289437 |
| 4 | 6.205771261 | 12.782565374 |
| 5 | 6.727208117 | 12.884360048 |
| 6 | 5.908489157 | 14.604699842 |
| 7 | 6.153151066 | 13.700609245 |
| 8 | 6.603960836 | 13.048889457 |
| 9 | 6.139718196 | 12.584335736 |
| 10 | 6.365920588 | 12.85222910 |
| Average | 6.29 | 13.60 |

With TLog

| Run | 50,000 logs (seconds) | 100,000 logs (seconds) |
| --- | --- | --- |
| 1 | 5.997981416 | 12.878389572 |
| 2 | 6.154590093 | 14.268328625 |
| 3 | 6.228010581 | 12.385200456 |
| 4 | 6.452949788 | 15.542794904 |
| 5 | 6.156225995 | 12.350440713 |
| 6 | 6.121611887 | 12.543883453 |
| 7 | 6.18131273 | 12.192140225 |
| 8 | 6.238254682 | 12.159684042 |
| 9 | 6.403632588 | 12.308115127 |
| 10 | 5.954781153 | 12.321667925 |
| Average | 6.19 | 12.89 |

The results were a little surprising: the average of the 10 runs with TLog was actually faster than without it. I suspect this is down to the test environment and the small sample size, not that adding TLog is genuinely faster; with 100 or 1,000 runs the two averages should converge.

Still, this performance test does show that TLog imposes no measurable overhead on the system. That’s nice too.

conclusion

Broadly speaking, the open-source framework TLog is a log-enhancement framework with distributed tracing, log tagging, and other small features. It is a distinctive Java open-source project, very small in footprint, with good performance. From a practical standpoint, it solves the pain point of quickly locating logs for small and medium-sized companies. The downside is that without log collection you cannot do deeper log mining, but that is also why TLog can boast 10-minute integration. Objectively, this cuts both ways; hopefully TLog will improve this part of the functionality in the future.