Project introduction

The watchman routing gateway receives front-end HTTP requests and forwards them to back-end service systems. Its functions are security verification, rate limiting, and forwarding.

Technology used: Spring Boot + Netflix Zuul. Logging initially used SLF4J + Log4j. The service is deployed as a JAR package that runs on an embedded Tomcat container, with the number of threads set to 600.

Working principle

First, let's look at how Zuul works. Zuul defines four types of filter:

1. Pre: executed before the request is routed to the origin. In this step you can perform authentication, select a forwarding address, and record logs.

2. Routing: builds and sends the HTTP request to the origin, using HttpClient or Netflix Ribbon.

3. Post: executed after the request has been routed to the origin. In this step you can collect statistics, set HTTP headers on the response, and send the response back to the client.

4. Error: executed when an error occurs in any of the preceding steps.
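
The order in which the four filter types run can be sketched in plain Java. This is a simplified model, not Zuul's own code: the class FilterLifecycleSketch, the Stage interface, and the helper names are made up for illustration, and the error handling (error stage, then post) follows my reading of how Zuul 1's servlet drives the stages.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterLifecycleSketch {

    /** Hypothetical stand-in for all filters of one type; may throw to signal failure. */
    interface Stage {
        void run(List<String> trace);
    }

    /** Runs the stages in order and returns the names of the stages that ran. */
    static List<String> handle(Stage pre, Stage route, Stage post, Stage error) {
        List<String> trace = new ArrayList<>();
        try {
            pre.run(trace);
            route.run(trace);
        } catch (RuntimeException e) {
            // A failure in pre or route triggers the error stage; post still runs.
            error.run(trace);
            post.run(trace);
            return trace;
        }
        try {
            post.run(trace);
        } catch (RuntimeException e) {
            // A failure in post only triggers the error stage.
            error.run(trace);
        }
        return trace;
    }

    /** A stage that records its name and succeeds. */
    static Stage named(String name) {
        return trace -> trace.add(name);
    }

    /** A stage that records its name and then fails. */
    static Stage failing(String name) {
        return trace -> {
            trace.add(name);
            throw new IllegalStateException(name + " failed");
        };
    }

    public static void main(String[] args) {
        // Normal request: [pre, route, post]
        System.out.println(handle(named("pre"), named("route"), named("post"), named("error")));
        // Routing fails: [pre, route, error, post]
        System.out.println(handle(named("pre"), failing("route"), named("post"), named("error")));
    }
}
```

The point to take away is that every request passes through all pre filters before it is routed, so a slow pre filter slows down every request.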


The problem

In the performance stress test, TPS was only a little over 400, and the result was the same with 10 concurrent threads as with 100.

Analysis

Watchman has no business logic, no database connections, and no frequent I/O operations, only security-check classes and rate-limiting classes.



Using JProfiler, first look at the Call Tree under CPU views, which shows the method call stacks.


By share of call time, the top entry is WrappingRunnable, and beneath it are 10 HTTP entries that also take a high share. These 10 entries are the interfaces exercised in this stress test, and each takes a similar share. Expand the first one, at 9%.

Why does a request spend so long in preRoute?

An incoming request is processed by three types of filters: pre, route, and post. This project has only two pre filters in its own pre package: AppUrsTokenPreFilter, which validates app requests, and another that does rate limiting. I checked their code carefully and found no problem. Spring's own package contains five more pre filter classes, and their code looked fine as well. With nothing found here, I tried another direction.


CPU load hypothesis

The CPU load stays very high throughout, which suggests the bottleneck is computation. AppUrsTokenPreFilter performs a dynamic MD5 calculation, so it seemed a likely cause. To verify, I excluded the class from the container and reran the stress test. The result was the same as before and the CPU load did not drop, so this guess was wrong.


Heap and GC analysis

Everything looked normal here. On to another direction.


Thread dump analysis

Thread monitoring shows that with 50 concurrent threads, only 5 are in the RUNNABLE state and the rest are BLOCKED or WAITING. When the thread count is raised to 200, the number of RUNNABLE threads is still in single digits and 194 threads are BLOCKED. This suggests heavy lock contention in the code.
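
The same kind of per-state tally can be produced from inside a JVM with the standard Thread API. This is a minimal sketch, assuming you can run code inside the process under test (for example from a diagnostic endpoint); the class name ThreadStateSummary is made up for illustration. The profiler and jstack inspect the process from outside and need no code change.

```java
import java.util.EnumMap;
import java.util.Map;

public class ThreadStateSummary {

    /** Counts how many of the given threads are in each Thread.State. */
    static Map<Thread.State, Integer> countStates(Iterable<Thread> threads) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread thread : threads) {
            counts.merge(thread.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Snapshot of all live threads in the current JVM.
        System.out.println(countStates(Thread.getAllStackTraces().keySet()));
    }
}
```

A large BLOCKED count next to a single-digit RUNNABLE count, as seen here, is the pattern that points at lock contention rather than CPU-bound work.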

Use jstack to dump the thread stacks, then search the dump globally for BLOCKED and look at the blocked entry at line 12.

loadClass, loadClass, loadClass: the blocked threads are all stuck in the same call. The problem has been found.

Now look at the caller, line 137 of FormBodyWrapperFilter. This is a pre filter built into Spring that converts request parameters for form requests. It uses MappingJackson2HttpMessageConverter, and the trail continues in that class's constructor.

The constructor in turn calls Jackson2ObjectMapperBuilder. Keep reading the code.

Continue into the Jackson2ObjectMapperBuilder class.

Keep going, still inside Jackson2ObjectMapperBuilder.

We're almost at the source. Continue, this time into the ClassUtils class.

The forName method is too long, so I kept only the key part.

loadClass takes a synchronized lock, so every request that reaches this code path queues up on the same lock.
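
One general way to keep a class-presence check like this off the hot path is to look the class up once and cache the answer. The sketch below illustrates that idea only; it is not what Spring does and not the fix used in this post, and the class ClassPresenceCache and the class name "example.missing.OptionalModule" are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ClassPresenceCache {

    // One cached answer per class name, shared by all request threads.
    private static final ConcurrentMap<String, Boolean> CACHE = new ConcurrentHashMap<>();

    /** Uncached check: every call goes through the class loader. */
    static boolean isPresentUncached(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClassPresenceCache.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Cached check: the class loader is consulted only on the first call per name. */
    static boolean isPresent(String className) {
        return CACHE.computeIfAbsent(className, ClassPresenceCache::isPresentUncached);
    }

    public static void main(String[] args) {
        System.out.println(isPresent("java.lang.String"));               // true
        System.out.println(isPresent("example.missing.OptionalModule")); // false
    }
}
```

With a cache like this, repeated requests read from a concurrent map instead of contending for the class loader's lock.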

The fix is to disable this pre filter that ships with Spring by setting zuul.FormBodyWrapperFilter.pre.disable=true. After making the change, redeploying, and rerunning the stress test, the effect showed at once.



It’s not over yet

After re-enabling the two pre filters I had written myself, which had been commented out, TPS was only half as high (the red line in the chart marks the kill).

Back to the thread dump. Yes, you read that right: Log4j has become the bottleneck.

Look at the Log4j source.

The source shows that LoggingEvents are dispatched to each appender inside a synchronized block, so even AsyncAppender does not help. The solution is to switch to Logback. In the rerun stress test, the average TPS is 4k+, which is acceptable.
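
The locking pattern described above can be modeled in a few lines of plain Java. This is a simplified stand-in, not Log4j code: MiniLogger and CountingAppender are made-up classes that only mimic a logger locking itself while it hands each event to a synchronized appender.

```java
public class SynchronizedAppenderSketch {

    /** Stand-in for an appender whose append method is synchronized. */
    static class CountingAppender {
        private long events;

        synchronized void doAppend(String message) {
            events++; // a real appender would format and write the message here
        }

        synchronized long events() {
            return events;
        }
    }

    /** Stand-in for a logger that locks itself while dispatching to its appender. */
    static class MiniLogger {
        private final CountingAppender appender;

        MiniLogger(CountingAppender appender) {
            this.appender = appender;
        }

        void info(String message) {
            synchronized (this) { // every calling thread queues up on this one lock
                appender.doAppend(message);
            }
        }
    }

    /** Logs from several threads at once and returns the number of events appended. */
    static long logFromThreads(int threads, int eventsPerThread) {
        CountingAppender appender = new CountingAppender();
        MiniLogger logger = new MiniLogger(appender);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < eventsPerThread; j++) {
                    logger.info("request handled");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return appender.events();
    }

    public static void main(String[] args) {
        System.out.println(logFromThreads(8, 10_000));
    }
}
```

No event is lost, but only one thread at a time can be inside info, so with hundreds of request threads all logging, most of them show up as BLOCKED in a thread dump.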
