When you are tuning your application’s memory and garbage collection settings, you should make well-informed decisions based on key performance indicators. But an overwhelming number of metrics are reported; which ones should you focus on, and which can you ignore? This article explains the right KPIs and the right tools to source them.
What are the right KPIs?
Throughput is the amount of productive work done by your application in a given time period. This raises the question: what is productive work, and what is non-productive work?
Productive Work: This is the amount of time your application spends processing your customers’ transactions.
Non-Productive Work: This is the amount of time your application spends on housekeeping work, primarily garbage collection.
Let’s say your application runs for 60 minutes, and 2 of those 60 minutes are spent on GC activities.
It means the application has spent 3.33% of its time on GC activities (i.e. 2 / 60 * 100).
It means the application throughput is 96.67% (i.e. 100 - 3.33).
Now the question is: what is an acceptable throughput percentage? It depends on the application and on business demands. Typically, one should target more than 95% throughput.
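The arithmetic above can be sketched in a few lines of Java. This is only an illustration: the class and method names (`GcThroughput`, `gcPercent`, `throughputPercent`) are made up for this example, and the inputs are the 60-minute and 2-minute figures from the text.

```java
import java.util.Locale;

// Minimal sketch: GC throughput from total run time and total GC time.
// Class and method names are hypothetical, chosen for this example only.
final class GcThroughput {

    /** Percentage of the run time that was spent in GC. */
    static double gcPercent(double totalMinutes, double gcMinutes) {
        if (totalMinutes <= 0 || gcMinutes < 0 || gcMinutes > totalMinutes) {
            throw new IllegalArgumentException(
                    "expected totalMinutes > 0 and 0 <= gcMinutes <= totalMinutes");
        }
        return gcMinutes / totalMinutes * 100.0;
    }

    /** Throughput is the share of the run time that was not spent in GC. */
    static double throughputPercent(double totalMinutes, double gcMinutes) {
        return 100.0 - gcPercent(totalMinutes, gcMinutes);
    }

    public static void main(String[] args) {
        // Figures from the example: a 60-minute run with 2 minutes of GC.
        System.out.printf(Locale.ROOT, "GC time:    %.2f%%%n", gcPercent(60, 2));
        System.out.printf(Locale.ROOT, "Throughput: %.2f%%%n", throughputPercent(60, 2));
    }
}
```

Any time unit works as long as both arguments use the same one.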
Latency is the amount of time taken by a single garbage collection event to run. This indicator should be studied from three angles.
- Average GC time: What is the average amount of time spent on GC?
- Maximum GC time: What is the maximum amount of time spent on a single GC event? Your application may have service level agreements such as “no transaction can run beyond 10 seconds”. In such cases, a single GC pause cannot be allowed to approach 10 seconds, because during a GC pause the entire JVM freezes and no customer transactions are processed. So it’s important to understand the maximum GC pause time.
- GC Time Distribution: You should also understand how many GC events complete within each time range (e.g. 200 GC events complete within 0 - 1 second, 10 GC events complete within 1 - 2 seconds, …)
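The three views of latency listed above can be computed from a list of pause durations, as in the following rough sketch. The class `GcPauseStats`, its methods, and the sample pause values are all hypothetical; in practice the durations would come from your GC log.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch: average, maximum, and distribution of GC pause times.
// All names and sample values are hypothetical.
final class GcPauseStats {

    /** Average pause in seconds (0 if there are no events). */
    static double average(double[] pausesSec) {
        return Arrays.stream(pausesSec).average().orElse(0.0);
    }

    /** Longest single pause in seconds (0 if there are no events). */
    static double max(double[] pausesSec) {
        return Arrays.stream(pausesSec).max().orElse(0.0);
    }

    /**
     * Counts events per time range. Key n covers the range
     * [n * bucketSec, (n + 1) * bucketSec).
     */
    static Map<Integer, Integer> distribution(double[] pausesSec, double bucketSec) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (double pause : pausesSec) {
            int bucket = (int) Math.floor(pause / bucketSec);
            counts.merge(bucket, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up pause durations in seconds, for illustration only.
        double[] pauses = {0.12, 0.35, 0.08, 1.40, 0.51, 2.75};
        double slaSeconds = 10.0; // the SLA figure used in the text

        System.out.println("Average pause (s): " + average(pauses));
        System.out.println("Maximum pause (s): " + max(pauses));
        System.out.println("Events per 1-second range: " + distribution(pauses, 1.0));
        System.out.println("Max pause within SLA: " + (max(pauses) < slaSeconds));
    }
}
```

Comparing the maximum pause against the SLA is only a first check; a transaction also needs time for its own work, so the pause budget is normally well below the SLA itself.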
Footprint is basically the amount of CPU consumed. CPU consumption varies with your GC algorithm and your memory settings. Some GC algorithms (such as Parallel and CMS) consume more CPU, whereas others, such as Serial, consume less.
According to memory tuning gurus, you can optimize for only two of these three KPIs (throughput, latency, footprint) at a time.
- If you want good throughput and latency, then footprint will degrade.
- If you want good throughput and footprint, then latency will degrade.
- If you want good latency and footprint, then throughput will degrade.
Throughput and latency can be obtained by analyzing garbage collection logs. Upload your application’s garbage collection log file to the http://gceasy.io/ tool. The tool parses the log and generates throughput and latency indicators for you. Below is a screenshot from the http://gceasy.io/ tool showing throughput and latency:
Fig: KPI section from GCEasy.io report
Footprint (i.e. CPU consumption) can be obtained from monitoring tools such as Nagios, NewRelic, AppDynamics, …
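As a rough in-process cross-check of the log-based throughput figure, the JVM's standard management beans expose accumulated GC time and JVM uptime. The sketch below is not how GCeasy computes its numbers; the class name `GcJmxSnapshot` is hypothetical, and `getCollectionTime()` reports approximate elapsed collection time, which for concurrent collectors may not equal pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Minimal sketch: approximate GC throughput from the JVM's own MXBeans.
// The class name is hypothetical; the java.lang.management calls are standard.
final class GcJmxSnapshot {

    /** Sum of accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Share of uptime not spent in GC, as a percentage. */
    static double throughputPercent(long uptimeMillis, long gcMillis) {
        if (uptimeMillis <= 0) {
            return 100.0;
        }
        return 100.0 - (gcMillis * 100.0 / uptimeMillis);
    }

    public static void main(String[] args) {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcMillis = totalGcMillis();
        System.out.println("Uptime (ms):  " + uptimeMillis);
        System.out.println("GC time (ms): " + gcMillis);
        System.out.println("Approx. throughput (%): " + throughputPercent(uptimeMillis, gcMillis));
    }
}
```

Because these counters are cumulative since JVM start, sampling them periodically and comparing the differences gives a per-interval view rather than a lifetime average.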
January 29, 2021 at 5:50 pm
In the graph, which time zone is reported? Is it read from our logs, or from the time zone of the server where the application is hosted? Our logs are in EST, but that is not reflected correctly in the interactive GC graph.
sorry if this is a dumb question..
February 8, 2021 at 10:05 pm
Hello Vikram! Indeed this is a good question. In GCeasy, we print the timezone that is present in the *GC log* file.
April 13, 2017 at 8:41 am
This is Mariya,
I just read about the “Key Performance Indicators”. Good explanation; I now have a clear picture of this.
It would also be helpful if you could explain the remaining parameters:
Heap After GC
Heap Before GC
For the above
(What is the acceptable range, and what should the minimum and maximum difference between Heap After GC and Heap Before GC be?)
February 28, 2017 at 9:37 am
How do I find the run time of the application?
March 24, 2017 at 7:17 pm
Hello Dream.Xie! The run time of the application is reported at the very top left corner of the GCeasy report.