Preface

I’ve written about memory models and object creation before, but many people don’t really know what those topics are actually for.

Until one day you run into a strange production problem like:

  • A thread running a task never returns and the application appears to hang.
  • An interface responds slowly or requests time out.
  • The CPU runs under high load.

This type of problem is not as obvious as a null pointer or an array index out of bounds, so you have to combine knowledge of the memory model, object creation, threading, and so on to track it down.

This time, let’s use a previous production problem as an example of how to locate and solve this kind of issue.

Production symptoms

First, some context for the problem:

It is a scheduled task: at a fixed time it starts N threads that fetch data from Redis and run calculations on it.
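Roughly, the structure looks like the sketch below. The thread count, the key names, and the fetchFromRedis/calculate helpers are hypothetical placeholders, not the project’s actual code:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ScheduledCalcJob {

    private static final int N = 8; // hypothetical number of worker threads

    // Invoked by the scheduler at the fixed time.
    public static void runJob() {
        ExecutorService workers = Executors.newFixedThreadPool(N);
        for (int i = 0; i < N; i++) {
            final int shard = i;
            workers.submit(() -> {
                String data = fetchFromRedis("data:" + shard); // hypothetical Redis read
                calculate(data);                               // hypothetical business calculation
            });
        }
        workers.shutdown(); // let the workers finish and then release the pool
    }

    private static String fetchFromRedis(String key) { return ""; } // placeholder
    private static void calculate(String data) { }                  // placeholder
}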

The business logic is very simple, but once multi-threading is involved, even the simplest things have to be treated with care.

Sure enough, it went wrong this time.

Symptom: a task that should take only a few minutes ran for several hours without exiting. I went through all the logs and found nothing unusual.

So the road to locating the problem began.

Locating the problem

Since nothing abnormal could be found in the logs, the only option was to look at what the application was actually doing.

The most common tools are the ones that ship with the JDK.

This time I used jstack to see what the threads were doing; it essentially dumps the current thread stacks.

Of course, I need to know the PID of my application before dumping, so I can use jps -v to list all Java processes.

Of course, if you know a keyword to search for, ps aux | grep java works too.

Then dump the stacks (here pid=1523):

jstack 1523 > 1523.log

If your application is simple and only has a few threads, you can actually open the dump file and read it directly.

However, dump files exported from complex applications can be large, so a dedicated analysis tool is recommended.

My dump here is small, so I can just open it directly.

Because I know the names of the threads my application starts, I can find the relevant stacks in the dump by thread name:

This is why it is usually recommended to give threads meaningful names; it is essential when troubleshooting.
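For example, names can be set when threads are created, or through a ThreadFactory for a pool; the names below are only illustrative:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreads {

    public static void main(String[] args) {
        // A plain thread with a meaningful name.
        Thread worker = new Thread(() -> System.out.println("working"), "redis-calc-worker-1");
        worker.start();

        // A thread pool whose threads all carry a recognizable prefix.
        ThreadFactory factory = new ThreadFactory() {
            private final AtomicInteger counter = new AtomicInteger();

            @Override
            public Thread newThread(Runnable r) {
                return new Thread(r, "redis-calc-pool-" + counter.incrementAndGet());
            }
        };
        ExecutorService pool = Executors.newFixedThreadPool(4, factory);
        pool.submit(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
    }
}

With a prefix like this, grepping the jstack output for the thread name immediately isolates the relevant stacks.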

In fact, several other threads have stacks similar to this one; they are all clearly working on Redis connections.

So I logged on to Redis to check the current connection count and found it was already very high; Redis was responding slowly as a result.

Running jps -v to list all the Java processes currently running showed that, sure enough, several applications were querying Redis, all with concurrent connections, so the cause surfaced naturally.

The solution

So the main cause of the problem was: a large number of applications querying Redis concurrently, causing Redis performance to degrade.

Now that we have identified the problem, how can we solve it?

  • Reduce the number of applications querying Redis at the same time by staggering them into separate time windows, spreading out the pressure on Redis.
  • Replicate Redis into several clusters and have each application query its own cluster. However, this involves operational work such as data synchronization, or synchronizing in the application code, which adds complexity.

For now we chose the first option, and the effect was very noticeable.
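A minimal sketch of the first option, assuming each application instance can be started with its own offset (the property name and the offsets here are hypothetical):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class StaggeredJob {

    public static void main(String[] args) {
        // Each application instance is started with a different offset
        // (e.g. 0, 10, 20 minutes) so they no longer hit Redis at the same moment.
        long offsetMinutes = Long.parseLong(System.getProperty("job.offset.minutes", "0"));

        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                StaggeredJob::runJob,
                offsetMinutes,   // initial delay staggers the start time per application
                24 * 60,         // then repeat once a day
                TimeUnit.MINUTES);
    }

    private static void runJob() {
        // fetch from Redis and calculate, as before
    }
}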

Local simulation

Next, let’s simulate a memory problem locally.

Take this class as an example:

github.com/crossoverJi…

import java.util.ArrayList;
import java.util.List;

public class HeapOOM {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(10);
        while (true) {
            list.add("1");
        }
    }
}

The startup parameters are as follows:

-Xms20m
-Xmx20m
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/Users/xx/Documents

This fixes the heap size at 20M to surface the memory problem quickly, and automatically dumps the heap to /Users/xx/Documents (or the current directory if the path is not available) when the JVM throws an OutOfMemoryError.

After execution, an exception occurred:

The corresponding memory dump file is also generated.

Memory analysis

At this point a dedicated tool is needed for the analysis, and the most commonly used one is naturally MAT.

I also tried an online tool that works well (though it is not suitable for very large dump files):

heaphero.io/index.jsp

After uploading the memory file you just generated:

Since this is a memory overflow, we mainly look at the large objects:

This object is most likely the one that exhausted the memory.

It’s pretty obvious when you look at the stack:

Writing data into the ArrayList endlessly triggers repeated expansions, each of which copies the backing array, and the 20M heap is eventually exhausted.
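For reference, this is roughly what ArrayList does internally when it runs out of capacity (simplified from the JDK implementation, which also has overflow checks and other details): each expansion allocates a new array about 1.5x the old size and copies every existing element into it.

import java.util.Arrays;

public class GrowthSketch {

    public static void main(String[] args) {
        String[] elementData = new String[10];
        // Simplified version of ArrayList's growth step.
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1); // grow by ~50%
        elementData = Arrays.copyOf(elementData, newCapacity);
        System.out.println("grew from " + oldCapacity + " to " + elementData.length); // 10 -> 15
    }
}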

More advice

As mentioned above, be careful when using multiple threads.

Here are some daily tips:

  • Try not to do a lot of time-consuming network operations in threads, such as querying a database (if possible, prepare the data from the DB up front).
  • Minimize contention between threads for locks. Data can be segmented so that each thread reads its own portion.
  • Prefer CAS + spin to update data and reduce the use of locks (see the first sketch after this list).
  • Add the -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp parameters to the application so that you at least get a heap dump if it overflows.
  • Monitor thread pools. Data such as pool size, queue size, and maximum number of threads can be estimated in advance (see the second sketch after this list).
  • Monitor the JVM, where you can watch heap growth trends, GC curves, and so on, and also prepare in advance.
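A minimal CAS + spin sketch using AtomicInteger, as mentioned in the list above:

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {

    private final AtomicInteger count = new AtomicInteger();

    // Spin until compareAndSet succeeds instead of taking a lock.
    public int increment() {
        for (;;) {
            int current = count.get();
            int next = current + 1;
            if (count.compareAndSet(current, next)) {
                return next;
            }
        }
    }
}

And a sketch of the thread pool metrics worth watching; the pool and queue sizes here are only illustrative:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolMonitor {

    public static void log(ThreadPoolExecutor pool) {
        System.out.printf("active=%d, poolSize=%d, largest=%d, queued=%d, completed=%d%n",
                pool.getActiveCount(),          // threads currently running tasks
                pool.getPoolSize(),             // threads currently in the pool
                pool.getLargestPoolSize(),      // high-water mark of the pool size
                pool.getQueue().size(),         // tasks waiting in the queue
                pool.getCompletedTaskCount());  // tasks finished so far
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        log(pool);
        pool.shutdown();
    }
}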

Conclusion

Locating problems online requires comprehensive skills, so solid fundamentals are needed: threads, the memory model, Linux, and so on.

Of course, until you have put these into practice it is all on paper; and if you run into an online problem for the first time, don’t panic. Instead, be glad: solving it means you have learned another skill.

Extra

I have recently been summarizing some Java-related knowledge points; interested friends are welcome to maintain it together.

Address: github.com/crossoverJi…

You are also welcome to follow my public account to get in touch: