When you study an application’s Garbage Collection performance, you need to base your study on the ‘GC Pause Duration’ rather than the ‘GC Duration’. This raises a few questions:

  • Why should I base my study on the ‘GC Pause Duration’?
  • What is the difference between ‘GC Pause Duration’ & ‘GC Duration’?
  • Where can I get the ‘GC Duration’ and ‘GC Pause Duration’ values?

Let’s answer these questions in this article.

In Garbage Collection, there are two types of events:

  1. An event that fully pauses the application. These events are also called “stop-the-world” events. No customer transactions are processed during such an event. Contrary to common belief, “Young GC” falls into this category: when a “Young GC” runs, the entire application is paused until it completes.
  2. An event that partially pauses the application. For example, if you are using the G1 GC algorithm, an old generation GC event has 5 phases. Of those 5 phases, only 3 pause the application threads completely; the other 2 run concurrently with the application threads. Refer to the table below:

| G1 GC Phase          | Type                 |
|----------------------|----------------------|
| Initial Mark         | Stop-the-world phase |
| Root Region Scanning | Concurrent phase     |
| Concurrent Marking   | Concurrent phase     |
| Remark               | Stop-the-world phase |
| Cleanup              | Stop-the-world phase |

Similarly, a CMS old generation GC event has 7 phases. Of those 7 phases, only 2 pause the application threads completely; the other 5 run concurrently with the application threads. Refer to the table below:

| CMS GC Phase                   | Type                 |
|--------------------------------|----------------------|
| Initial Mark                   | Stop-the-world phase |
| Concurrent Mark                | Concurrent phase     |
| Concurrent Preclean            | Concurrent phase     |
| Concurrent Abortable Preclean  | Concurrent phase     |
| Final Remark                   | Stop-the-world phase |
| Concurrent Sweep               | Concurrent phase     |
| Concurrent Reset               | Concurrent phase     |

Equipped with this knowledge, let’s study the difference between GC Duration, GC Pause Duration, and Concurrent GC Duration.

Let’s say, in an old generation G1 GC event, each phase took 0.5 seconds.

GC Duration

The total duration of this event is 2.5 seconds, because there are 5 phases and each phase took 0.5 seconds: 5 x 0.5 seconds = 2.5 seconds.

GC Pause Duration

The pause duration of this event is 1.5 seconds, because there are 3 stop-the-world phases (Initial Mark, Remark, and Cleanup) and each took 0.5 seconds: 3 x 0.5 seconds = 1.5 seconds.

GC Concurrent Duration

The concurrent duration of this event is 1 second, because there are 2 concurrent phases (Root Region Scanning and Concurrent Marking) and each took 0.5 seconds: 2 x 0.5 seconds = 1 second.
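The arithmetic above can be reproduced with a small sketch. The phase names and types come from the G1 table earlier in this article, and the 0.5 seconds per phase is the same illustrative assumption; the names `G1_OLD_GEN_PHASES` and `summarize` are made up for this example.

```python
# Phase names and types follow the G1 table in this article.
# "stw" marks a stop-the-world phase, "concurrent" a concurrent phase.
G1_OLD_GEN_PHASES = {
    "Initial Mark": "stw",
    "Root Region Scanning": "concurrent",
    "Concurrent Marking": "concurrent",
    "Remark": "stw",
    "Cleanup": "stw",
}

# Illustrative assumption from the example: every phase took 0.5 seconds.
PHASE_SECONDS = 0.5


def summarize(phases, seconds_per_phase):
    """Return (total, pause, concurrent) durations in seconds."""
    pause = sum(seconds_per_phase for kind in phases.values() if kind == "stw")
    concurrent = sum(
        seconds_per_phase for kind in phases.values() if kind == "concurrent"
    )
    return pause + concurrent, pause, concurrent


total, pause, concurrent = summarize(G1_OLD_GEN_PHASES, PHASE_SECONDS)
print(f"GC Duration:            {total} s")       # 2.5 s
print(f"GC Pause Duration:      {pause} s")       # 1.5 s
print(f"GC Concurrent Duration: {concurrent} s")  # 1.0 s
```

Only the 1.5 seconds of pause time is time during which the application threads were actually stopped.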

Thus, when you are tuning your application, you need to focus primarily on the ‘GC Pause Duration’ and not on the ‘GC Duration’ or the ‘GC Concurrent Duration’. When you calculate ‘Throughput’ or ‘Latency’ in Garbage Collection analysis, base the calculation on the ‘GC Pause Duration’. Unfortunately, many Garbage Collection log analysis tools don’t make this distinction; they publish only the ‘Total Duration’. If you base your analysis on the ‘Total Duration’ metric, it can lead to erroneous results.
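To see how much the choice of metric can matter, here is a minimal sketch. It assumes throughput is defined as the percentage of elapsed time in which the application was not stopped by GC, and latency as the longest single pause. The 60-second window and the event list are hypothetical; only the first event (2.5 seconds total, 1.5 seconds paused) comes from the example above.

```python
# Hypothetical GC events as (total_duration_s, pause_duration_s).
# The first tuple is the G1 old generation example from this article;
# (0.2, 0.2) stands for a fully stop-the-world event such as a Young GC,
# where the total duration equals the pause duration.
EVENTS = [(2.5, 1.5), (0.2, 0.2), (3.0, 1.0)]

# Hypothetical observation window in seconds.
ELAPSED_SECONDS = 60.0


def throughput_percent(elapsed, gc_seconds):
    """Percentage of elapsed time not attributed to GC."""
    return 100.0 * (elapsed - gc_seconds) / elapsed


total_gc = sum(total for total, _ in EVENTS)
pause_gc = sum(pause for _, pause in EVENTS)

throughput_from_pause = throughput_percent(ELAPSED_SECONDS, pause_gc)
throughput_from_total = throughput_percent(ELAPSED_SECONDS, total_gc)
max_pause = max(pause for _, pause in EVENTS)
max_total = max(total for total, _ in EVENTS)

print(f"Throughput from pause duration: {throughput_from_pause:.1f}%")  # 95.5%
print(f"Throughput from total duration: {throughput_from_total:.1f}%")  # 90.5%
print(f"Worst-case latency from pause duration: {max_pause:.1f} s")     # 1.5 s
print(f"Worst-case latency from total duration: {max_total:.1f} s")     # 3.0 s
```

With these made-up numbers, using the total duration understates throughput by five percentage points and doubles the apparent worst-case latency, even though the application was never stopped for more than 1.5 seconds.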

However, tools like GCeasy, a universal Garbage Collection log analysis tool, give you separate graphs and metrics for ‘Total Duration’, ‘Pause Duration’ and ‘Concurrent Duration’. Let’s review the graphs and metrics generated by this tool.

Fig 1: GC Duration Time Graph

‘Fig 1: GC Duration Time Graph’ shows the total duration of every GC event that ran in the application. Notice that many Full GC events lasted several seconds (4 – 18 seconds). ‘Fig 2: GC Pause Duration Time Graph’ shows the pause duration of the same GC events. Notice that the pause duration of the Full GC events (the red triangles) is significantly lower than their ‘GC Duration’.

Fig 2: GC Pause Duration Time Graph

The table below shows a comparative summary of the Pause Duration and the Concurrent Duration.

Fig 3: Comparative summary

I hope this article clarified the difference between ‘GC Duration’ & ‘GC Pause Duration’ and why you should focus on the ‘GC Pause Duration’ in your Garbage Collection study.