A major insurance company improved its overall application throughput by tuning its Garbage Collection (GC) behavior. In this post, let's study the bottleneck the team faced, the GC tuning methodology they employed, and the settings that enhanced their application throughput.
Identifying the GC Performance Bottleneck
The insurance application handled incoming transactions properly most of the time; however, from 10 AM to 4 PM it would intermittently become unresponsive to requests. The SRE team suspected this unresponsiveness was caused by Garbage Collection pauses, so they captured a GC log from the application and uploaded it to GCeasy, a GC log analysis tool. The tool analyzed the GC log and instantly generated a GC analysis report.
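If you need to capture such a GC log yourself, it can be enabled with standard JVM arguments; the file path below is illustrative:

-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/opt/app/gc.log   (Java 8 and earlier)

-Xlog:gc*:file=/opt/app/gc.log:time,uptime   (Java 9 and later, unified logging)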
Back-to-Back Full GCs
The GCeasy log analysis report indicated that the application was suffering from Consecutive Full GCs (i.e., Full GC events running back to back).
When a GC event occurs, it stops the application, preventing it from processing customer transactions. In this case, multiple GC events were running one after another. Each Full GC event took about 2 seconds to complete. The GC Duration graph below shows the time taken for each GC event. As a result, the application intermittently failed to respond to customer transactions whenever these GC events were running.

Why Were Full GCs Consecutively Running?
The application creates objects to process incoming requests. The more requests it processes, the more objects it creates; the more objects it creates, the more frequently the garbage collector runs; and the more frequently the garbage collector runs, the more often the application is paused.
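Here is a minimal sketch of that chain in Java. It is not the insurance application's code; the handleRequest method and its byte[] buffers are stand-ins for the transient objects a real transaction would create. Run it with a small heap (e.g. -Xmx64m) and GC logging enabled to watch collection frequency climb with request volume:

import java.util.ArrayList;
import java.util.List;

public class AllocationPressureDemo {

    // Hypothetical request handler: each call allocates roughly 1 MB of
    // short-lived objects, mimicking per-transaction allocations.
    static List<byte[]> handleRequest() {
        List<byte[]> transientObjects = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            transientObjects.add(new byte[1024]);
        }
        return transientObjects;
    }

    public static void main(String[] args) {
        // Simulate a burst of traffic; the returned lists become garbage
        // immediately, so higher request rates mean more frequent GC runs.
        for (int i = 0; i < 100_000; i++) {
            handleRequest();
        }
    }
}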
Solutions to address Consecutive Full GC
There are 3 potential solutions to address the consecutive Full GC problem:
1. Profile and Optimize Code to Reduce Memory Consumption
You can use a memory profiler such as HeapHero to profile your application's memory and identify the bottlenecks to fix. Although profiling and refactoring can reduce memory consumption, it is a tedious, time-consuming process. It yields good rewards in the long term, but it won't help put out an immediate burning fire.
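For example, a heap dump can be captured from a running JVM with the JDK's jmap tool and then loaded into a profiler such as HeapHero; replace <pid> with the target JVM's process ID, and note the output path is illustrative:

jmap -dump:live,format=b,file=/tmp/heap.hprof <pid>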
2. Add More JVMs
Adding more JVMs can distribute and lower the load on each JVM, easing memory pressure and reducing GC frequency. However, since this application used session clustering, there was a limit on the number of JVMs that could be added, and the team was already at that limit, so this option was ruled out.
3. Increase Memory Size
Increasing the heap size gives the application more memory space to store objects. This means the JVM can accommodate a larger number of objects before it needs to perform garbage collection, so GC events run less frequently. This was the easiest choice for the team, given that the other two options weren't feasible. However, be advised that this approach has two side effects:
a. A larger heap translates to higher computing costs.
b. While the frequency of GC events may decrease, the time each individual GC event takes can increase with a larger heap. Careful tuning is necessary to keep individual GC pause times in check (an illustrative tuning flag follows this list).
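As a hedged illustration of side effect (b): with the Parallel GC algorithm, a pause-time goal can be hinted to the collector, though it is a goal rather than a guarantee. The value below is illustrative, not a setting this team is reported to have used:

-XX:MaxGCPauseMillis=200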
New GC Settings
The application was configured with the following GC settings:
-Xmx8g -Xms8g -XX:+UseParallelGC
Notice that the application was configured with an 8GB heap size and the Parallel GC algorithm. Since the resolution was to increase the heap size, they raised it from 8GB to 12GB and passed the following new settings:
-Xmx12g -Xms12g -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy
Notice that the new ‘-XX:-UseAdaptiveSizePolicy’ argument has been added. This argument prevents the JVM from automatically adjusting the sizes of the heap regions at runtime to optimize performance. The team opted for ‘-XX:-UseAdaptiveSizePolicy’ because fixed region sizes tend to give more predictable GC behavior and tighter tuning control.
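With adaptive sizing disabled, the generation sizes stay where they are set at startup. One way to pin them explicitly is shown below; these values are illustrative assumptions for a 12GB heap, not the settings the team used:

-XX:NewSize=4g -XX:MaxNewSize=4g -XX:SurvivorRatio=8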
Following GC tuning best practices, they rolled out the new settings on just one of their production JVMs. GC behavior is heavily influenced by real-world traffic patterns, and since production traffic is hard to replicate in testing environments, it's recommended to apply JVM setting changes to a single production JVM first to accurately observe and measure their impact. This approach ensures that the tuning adjustments are effective under actual operating conditions.
Remarkable Performance Gains
The increase in heap size provided significant relief to the application. With the larger heap, the application could hold more objects before triggering garbage collection, which reduced the frequency of GC events. Here is the GC log analysis report of the JVM running with the new GC settings.
GC throughput is one of the primary KPIs in GC tuning studies. It refers to the percentage of time the JVM spends executing application code versus the time spent on garbage collection. Higher GC throughput means the JVM spends more time processing application requests and less time performing garbage collection. Thus, most applications prefer to have high GC throughput. In the original settings, this application’s GC throughput was 96.14%, whereas the revised settings improved the GC throughput to 99.115%.
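To put those percentages in concrete terms, here is a minimal sketch of what they imply over a one-hour window, using the formula the report is based on (throughput = application time / total elapsed time):

public class GcThroughput {
    public static void main(String[] args) {
        double windowSeconds = 3600.0; // one hour of wall-clock time

        // GC pause totals implied by the reported throughput figures.
        double before = windowSeconds * (1 - 0.9614);   // ~139 s paused per hour
        double after  = windowSeconds * (1 - 0.99115);  // ~32 s paused per hour

        System.out.printf("GC pause per hour: before=%.1fs, after=%.1fs%n",
                before, after);
    }
}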
This improvement in GC throughput led to the following remarkable gains in the application:
a. Overall application throughput improvement of 23%. By reducing the time spent on garbage collection, the application could process more transactions in a given period, thus boosting its efficiency.
b. Overall application’s response time improved by 15%, meaning customers experienced faster and more reliable interactions with the application.
c. Besides that, CPU consumption also dropped by a phenomenal 50%.
Conclusion
In summary, increasing the heap size not only resolved the immediate GC bottleneck, but also improved the overall performance of the application.


