“I don’t have to worry about garbage collection.” – Is that true?

I’ve heard some of my developer friends say, “Garbage collection is automatic, so I don’t need to worry about it.” The first part is true: garbage collection is automated on all modern platforms (Java, .NET, Golang, Python, and so on). But the second part, “I don’t need to worry about it,” is debatable at best. Here is my case for why garbage collection deserves your attention.

1. Unpleasant customer experience

When the garbage collector runs, it pauses the entire application to mark the objects that are still in use and sweep away the objects that have no active references. During this pause, all in-flight customer transactions are frozen. Depending on the GC algorithm you configure and your memory settings, pause times can range from a few milliseconds to minutes. Frequent pauses make the application stutter, jitter, or stall, which leaves your customers with an unpleasant experience.
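To make the mark and sweep steps concrete, here is a toy sketch in Python. It is not how any production collector is implemented (real collectors are generational, parallel, or concurrent); the `heap` dictionary, the object names, and the `mark` and `sweep` helpers are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Set


def mark(heap: Dict[str, List[str]], roots: List[str]) -> Set[str]:
    """Return the names of all objects reachable from the roots."""
    marked: Set[str] = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        # Follow every reference held by this object.
        stack.extend(heap[name])
    return marked


def sweep(heap: Dict[str, List[str]], marked: Set[str]) -> List[str]:
    """Delete unmarked objects from the heap and return their names."""
    garbage = sorted(name for name in heap if name not in marked)
    for name in garbage:
        del heap[name]
    return garbage


# Hypothetical heap: each object name maps to the names it references.
heap = {
    "session": ["cart"],
    "cart": ["item"],
    "item": [],
    "old_report": ["old_rows"],  # nothing reachable refers to old_report
    "old_rows": [],
}
roots = ["session"]

collected = sweep(heap, mark(heap, roots))
print(collected)  # ['old_report', 'old_rows']
```

In a stop-the-world collector, application threads cannot make progress while loops like these run, so the pause grows with the amount of work the collector has to do.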

2. Wasting millions of dollars

Here’s a white paper we published that explains how businesses waste millions of dollars on garbage collection. In a nutshell, modern applications create thousands or millions of objects. These objects must be scanned continuously to determine whether they still have active references or are ready to be garbage collected. Once objects are collected, memory becomes fragmented, and fragmented memory must be compacted. All of these activities consume *huge amounts of compute cycles*. Those cycles translate into millions of dollars in computing costs. If garbage collection performance is optimized, the savings can run to millions of dollars.

3. Low-risk, high-impact performance improvements

Optimizing garbage collection performance improves not only GC pause times but also the response time of the entire application. We recently helped one of the world’s largest car companies tune garbage collection performance. By simply changing the garbage collection settings, **without refactoring a single line of code**, we significantly improved their overall application response time. The following table summarizes the overall response time observed with each change we made to the garbage collection settings.

| GC settings | Average response time (seconds) | Transactions over 25 seconds (%) |
|---|---|---|
| Baseline | 1.88 | 0.7 |
| GC settings iteration #2 | 1.36 | 0.12 |
| GC settings iteration #3 | 1.7 | 0.11 |
| GC settings iteration #4 | 1.48 | 0.08 |
| GC settings iteration #5 | 2.045 | 0.14 |
| GC settings iteration #6 | 1.087 | 0.24 |
| GC settings iteration #7 | 1.03 | 0.14 |
| GC settings iteration #8 | 0.95 | 0.31 |

When we started GC tuning, the overall response time of the car company’s application was 1.88 seconds. As we tried different garbage collection settings, the overall response time came down to 0.95 seconds in iteration #8, an improvement of about 49.5%. Similarly, the percentage of transactions that took more than 25 seconds dropped from 0.7% to 0.31%, a reduction of about 55.7%. That is a significant improvement without changing a single line of code.
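The improvement figures can be rechecked from the baseline and iteration #8 rows of the table. The short Python sketch below does that arithmetic; the helper name `percent_reduction` is ours, and the numbers are copied from the table.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100.0


# (average response time in seconds, % of transactions over 25 seconds)
baseline = (1.88, 0.7)
iteration_8 = (0.95, 0.31)

response_time_gain = percent_reduction(baseline[0], iteration_8[0])
slow_txn_gain = percent_reduction(baseline[1], iteration_8[1])

print(f"Average response time reduced by {response_time_gain:.1f}%")  # 49.5%
print(f"Transactions over 25 s reduced by {slow_txn_gain:.1f}%")      # 55.7%
```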

Every other way of improving response time requires an infrastructure change, an architecture change, or a code-level change. All of these are expensive, and even after you make them there is no guarantee that your application’s response time will improve.

4. Predictive monitoring

Garbage collection logs expose vital predictive micro-metrics. These metrics can be used to forecast an application’s availability and performance characteristics. One micro-metric exposed in garbage collection logs is “GC throughput” (see this article for more information on other micro-metrics). What is GC throughput? If your application’s GC throughput is 98%, the application spends 98% of its time processing customer activity and the remaining 2% on GC activity. When a memory problem is developing in an application, GC throughput typically starts to degrade several minutes before the problem becomes visible. Troubleshooting tools like yCrash monitor “GC throughput” to forecast memory problems before they surface in the production environment.
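As a rough illustration of the definition, GC throughput can be computed from the length of an observation window and the GC pause durations recorded in it. The sketch below assumes the pause durations have already been extracted from a GC log; the function name `gc_throughput` and the sample values are hypothetical, and this is not how yCrash or any other tool computes the metric.

```python
from typing import Sequence


def gc_throughput(elapsed_seconds: float, gc_pause_seconds: Sequence[float]) -> float:
    """Percentage of wall-clock time that was not spent in GC pauses."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_pause = sum(gc_pause_seconds)
    return (elapsed_seconds - total_pause) / elapsed_seconds * 100.0


# Hypothetical sample: a 60-second window with four pauses totalling 1.2 s.
pauses = [0.25, 0.40, 0.15, 0.40]
throughput = gc_throughput(60.0, pauses)
print(f"GC throughput: {throughput:.1f}%")  # GC throughput: 98.0%
```

Tracking this value over consecutive windows is what makes it useful as an early warning: a steady decline means the application is spending a growing share of its time in GC.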

5. Capacity planning

When planning capacity for your application, you need to understand its memory, CPU, network, and storage requirements. One of the best ways to study memory requirements is to analyze garbage collection behavior. From it you can determine the average object creation rate (for example, 150 MB/sec) and the average object reclamation rate. Using these micro-metrics, you can do effective capacity planning for your application.
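One way to estimate these two rates, sketched below in Python, is to compare heap occupancy before and after successive collections. The `GcEvent` record, its field names, and the sample events are assumptions made for illustration; real GC logs need to be parsed into this shape first, and tools may compute the rates differently.

```python
from typing import NamedTuple, Sequence, Tuple


class GcEvent(NamedTuple):
    timestamp_s: float     # seconds since application start
    heap_before_mb: float  # heap occupancy just before the collection
    heap_after_mb: float   # heap occupancy just after the collection


def creation_and_reclamation_rates(events: Sequence[GcEvent]) -> Tuple[float, float]:
    """Return (creation rate, reclamation rate) in MB/sec between the first and last event."""
    if len(events) < 2:
        raise ValueError("need at least two GC events")
    elapsed = events[-1].timestamp_s - events[0].timestamp_s
    if elapsed <= 0:
        raise ValueError("events must span a positive time interval")
    # Memory allocated between two collections: occupancy before this
    # collection minus occupancy left after the previous one.
    created = sum(
        cur.heap_before_mb - prev.heap_after_mb
        for prev, cur in zip(events, events[1:])
    )
    # Memory freed by the collections inside the same window.
    reclaimed = sum(e.heap_before_mb - e.heap_after_mb for e in events[1:])
    return created / elapsed, reclaimed / elapsed


# Hypothetical events, one collection every two seconds.
events = [
    GcEvent(10.0, 900.0, 300.0),
    GcEvent(12.0, 600.0, 310.0),
    GcEvent(14.0, 610.0, 300.0),
    GcEvent(16.0, 600.0, 320.0),
]
creation_rate, reclamation_rate = creation_and_reclamation_rates(events)
print(f"Object creation rate:    {creation_rate:.1f} MB/sec")
print(f"Object reclamation rate: {reclamation_rate:.1f} MB/sec")
```

If the creation rate stays above the reclamation rate for long, heap occupancy keeps climbing, which is the kind of signal that feeds into sizing decisions.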

Conclusion

Friends, in this article I have done my best to make the case for garbage collection analysis. I hope you and your team benefit from the insights that garbage collection metrics provide.