When you are studying an application’s Garbage Collection performance, you need to base your study on the ‘GC Pause Duration’ rather than the ‘GC Duration’. This raises a few questions:
- Why should I base my study on the ‘GC Pause Duration’?
- What is the difference between ‘GC Pause Duration’ & ‘GC Duration’?
- Where can I get the ‘GC Duration’ and ‘GC Pause Duration’ values?
Let’s answer these questions in this article.
In Garbage Collection, there are two types of events:
- An event that fully pauses the application. These events are also called “Stop the world events”. No customer transactions are processed during such an event. Contrary to common belief, “Young GC” falls under this category. When “Young GC” runs, the entire application is paused until it completes.
- An event that partially pauses the application. For example, if you are using the G1 GC algorithm, an old generation GC event has 5 phases. Only 3 of those 5 phases pause the application threads completely. The other 2 phases run concurrently with the application threads without pausing them. Refer to the table below:
|G1 GC Phase|Type|
|---|---|
|Initial Mark|Stop the world Phase|
|Root Region Scanning|Concurrent Phase|
|Concurrent Marking|Concurrent Phase|
|Remark|Stop the world Phase|
|Cleanup|Stop the world Phase|
Similarly, a CMS old generation GC event has 7 phases. Only 2 of those 7 phases pause the application threads completely. The other 5 phases run concurrently with the application threads. Refer to the table below:
|CMS GC Phase|Type|
|---|---|
|Initial Mark|Stop the world Phase|
|Concurrent Mark|Concurrent Phase|
|Concurrent Preclean|Concurrent Phase|
|Concurrent Abortable Preclean|Concurrent Phase|
|Final Remark|Stop the world Phase|
|Concurrent Sweep|Concurrent Phase|
|Concurrent Reset|Concurrent Phase|
Equipped with this knowledge, let’s study the difference between GC Duration, GC Pause Duration, and Concurrent GC Duration.
Let’s say that in an old generation G1 GC event, each phase took 0.5 seconds.
GC Duration
The total duration of this event is 2.5 seconds, because there are 5 phases and each phase took 0.5 seconds: 5 x 0.5 seconds = 2.5 seconds.
GC Pause Duration
The pause duration of this event is 1.5 seconds, because there are 3 Stop the world phases (Initial Mark, Remark, Cleanup) and each took 0.5 seconds: 3 x 0.5 seconds = 1.5 seconds.
GC Concurrent Duration
The concurrent duration of this event is 1 second, because there are 2 concurrent phases (Root Region Scanning, Concurrent Marking) and each took 0.5 seconds: 2 x 0.5 seconds = 1 second.
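The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the example in this article: the `G1_PHASES` list and the `summarize` helper are hypothetical names, and the 0.5 second value per phase is the assumed figure from the example, not a measurement.

```python
# Hypothetical model of the G1 old generation example:
# (phase name, pauses the application?, duration in seconds)
G1_PHASES = [
    ("Initial Mark", True, 0.5),
    ("Root Region Scanning", False, 0.5),
    ("Concurrent Marking", False, 0.5),
    ("Remark", True, 0.5),
    ("Cleanup", True, 0.5),
]


def summarize(phases):
    """Split one GC event into total, pause and concurrent durations."""
    pause = sum(seconds for _, stop_the_world, seconds in phases if stop_the_world)
    concurrent = sum(seconds for _, stop_the_world, seconds in phases if not stop_the_world)
    return {"total": pause + concurrent, "pause": pause, "concurrent": concurrent}


summary = summarize(G1_PHASES)
print(summary)  # {'total': 2.5, 'pause': 1.5, 'concurrent': 1.0}
```

Only the `pause` value represents time during which the application threads were stopped.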
Thus, when you are tuning your application, you need to focus primarily on the ‘GC Pause Duration’ and not on the ‘GC Duration’ or the ‘GC Concurrent Duration’. When you calculate ‘Throughput’ or ‘Latency’ in Garbage Collection analysis, the calculation should be based on the ‘GC Pause Duration’, because only the pause phases stop the application from processing customer transactions. Unfortunately, a lot of Garbage Collection log analysis tools don’t make this distinction and publish only the ‘Total Duration’. If you base your analysis on the ‘Total Duration’ metric, it can lead to erroneous results.
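To see how the choice of metric changes the result, here is a minimal sketch that computes throughput both ways. It assumes throughput is defined as the percentage of the observation window in which the application was not paused by GC, and it assumes a 100 second observation window containing a single GC event with 1.5 seconds of pause and 2.5 seconds of total duration. The `throughput_percent` function and the constants are hypothetical, chosen only for illustration.

```python
def throughput_percent(observed_seconds, gc_seconds):
    """Percentage of the observation window not attributed to GC."""
    return round((observed_seconds - gc_seconds) / observed_seconds * 100, 2)


OBSERVED_SECONDS = 100.0  # assumption: length of the observation window
PAUSE_SECONDS = 1.5       # stop-the-world phases only
TOTAL_GC_SECONDS = 2.5    # stop-the-world phases plus concurrent phases

based_on_pause = throughput_percent(OBSERVED_SECONDS, PAUSE_SECONDS)
based_on_total = throughput_percent(OBSERVED_SECONDS, TOTAL_GC_SECONDS)

print("Throughput based on pause duration:", based_on_pause)
print("Throughput based on total duration:", based_on_total)
```

Under these assumptions, the figure based on total duration is lower, because it counts concurrent phases as lost time even though the application kept running during them.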
However, tools like GCeasy, a universal Garbage Collection log analysis tool, give you separate graphs and metrics for ‘Total Duration’, ‘Pause Duration’ and ‘Concurrent Duration’. Let’s review the graphs and metrics generated by this tool.
Fig 1: GC Duration Time Graph
‘Fig 1: GC Duration Time Graph’ shows the total duration of every GC event that ran in the application. Notice that there are a lot of Full GC events whose duration spans several seconds (4 to 18 seconds). ‘Fig 2: GC Pause Duration Time Graph’ shows the pause duration of the same GC events. Notice that the GC Pause Duration of the Full GC events (i.e. the red triangles) is significantly lower than their ‘GC Duration’.
Fig 2: GC Pause Duration Time Graph
The table below shows a comparative summary of the Pause Duration and the Concurrent Duration.
Fig 3: Comparative summary
I hope this article clarified the difference between ‘GC Duration’ & ‘GC Pause Duration’ and why one should focus on the ‘GC Pause Duration’ for their Garbage collection study.