GCeasy’s JSON APIs are used for application monitoring, for analyzing code quality in CI/CD pipelines, and for several other purposes. The API response contains a rich set of information (i.e. a lot of elements). In this article, we highlight a few key elements in the API response. If the value of any of these elements rises above or drops below a certain threshold, you might consider raising alerts/warnings.
1. PROBLEM
The GCeasy API has the intelligence to detect problems in GC logs automatically. Whenever GCeasy detects a problem, it sends back the ‘isProblem’ element with the value ‘true’. Details about the detected problems are reported in the ‘problem’ array element.
‘problem’ is THE PRIMARY ELEMENT in the response. If you encounter the ‘problem’ element in the response, you are highly encouraged to generate an alert/warning.
How to read ‘problem’ element?
The ‘isProblem’ and ‘problem’ elements are found at the root level of the JSON API response. Example:
{
  "isProblem": true,
  "problem": [
    "Our analysis tells that Full GCs are consecutively running in your application. It might cause intermittent OutOfMemoryErrors or degradation in response time or high CPU consumption or even make application unresponsive.",
    "Our analysis tells that your application is suffering from long running GC events. 4 GC events took more than 10 seconds. Long running GCs are unfavourable for application's performance.",
    "342 times application threads were stopped for more than 10 seconds."
  ],
  :
  :
}
The JSON expressions to read the ‘isProblem’ and ‘problem’ elements from the JSON response are:
$.isProblem
$.problem
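As a minimal sketch of how an alerting step might consume these two elements, the Python snippet below parses a response body and returns any reported problems. The function name `extract_problems` and the shortened sample response are illustrative assumptions, not part of the GCeasy API; the sketch also assumes that a missing ‘isProblem’ element means no problem was detected.

```python
import json

def extract_problems(response_text):
    """Return the reported problem descriptions, or an empty list if none."""
    response = json.loads(response_text)
    # Assumption: an absent 'isProblem' element is treated as "no problem".
    if response.get("isProblem", False):
        return list(response.get("problem", []))
    return []

# Shortened, hypothetical response body for illustration only.
sample = (
    '{"isProblem": true, "problem": '
    '["342 times application threads were stopped for more than 10 seconds."]}'
)

for message in extract_problems(sample):
    print("ALERT:", message)
```

A CI job could fail or flag the build whenever the returned list is non-empty.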
2. VISUAL REPORT
You might want to integrate the graphical visualization of the garbage collection log analysis into a Jenkins report or any other dashboard. The API response contains a ‘graphURL’ element, which holds a hyperlink to the visual report. This is the same visual report that you get when you upload garbage collection logs to the web portal. The visual report contains graphs, detailed metrics, help texts, and more. As the old saying goes, “one picture is worth 1000 words” :-), so we highly encourage you to integrate this visualization into your reports.
How to read ‘graphURL’ element?
The ‘graphURL’ element is found at the root level of the JSON API response. Example:
{
  "graphURL": "http://gceasy.io/my-gc-report.jsp?p=YXJjaGl2ZWQvMjAxOC8wNy8xMC8tLWFwaS1lMDk0YTM0ZS1jM2ViLTRjOWEtODI1NC1mMGRkMTA3MjQ1Y2NmMTI5ODQxOC04YWFhLTQyYzYtODA5OS04NWUyOGFkYjc2MDQudHh0LS0=&channel=API",
  :
  :
}
The JSON expression to read the ‘graphURL’ element from the JSON response is:
$.graphURL
3. GC THROUGHPUT
Garbage collection throughput is the percentage of time your application spends processing customer transactions, as opposed to the time it spends doing garbage collection.
Let’s say your application has been running for 60 minutes. Of these 60 minutes, 2 minutes were spent on GC activities.
That means the application spent 3.33% of its time on GC activities (i.e. 2 / 60 * 100).
It also means the garbage collection throughput is 96.67% (i.e. 100 - 3.33).
A degradation in GC throughput is an indication of some sort of memory problem.
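The arithmetic above can be written as a tiny Python function. This is only an illustration of the definition; the function name is made up for this example, and the API already reports the computed value for you.

```python
def gc_throughput_percentage(total_runtime_minutes, gc_minutes):
    """Percentage of run time NOT spent in garbage collection."""
    gc_share = gc_minutes / total_runtime_minutes * 100  # e.g. 2 / 60 * 100
    return 100 - gc_share

# The article's example: 60 minutes of run time, 2 minutes of GC.
print(round(gc_throughput_percentage(60, 2), 2))  # prints 96.67
```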
How to read GC throughput?
GC throughput is reported in the ‘throughputPercentage’ element. Example:
{
  :
  :
  "gcKPI": {
    "throughputPercentage": 99.952,
    "averagePauseTime": 2.2834644,
    "maxPauseTime": 30
  },
  :
  :
}
The JSON expression to read the ‘throughputPercentage’ element from the JSON response is:
$.gcKPI.throughputPercentage
What is the acceptable GC Throughput?
It depends on the application and business demands. There could be low-profile applications where a very low GC throughput is acceptable; on the other hand, there could be Tier 1 applications in your organization for which you want to target high GC throughput. We recommend raising a warning if throughput drops below 99% and failing the build if it drops below 95%.
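A build step could apply these two thresholds roughly as follows. This is a sketch: the function name, the returned labels, and the way the result is acted on are assumptions for illustration, and the thresholds are simply the 99% and 95% values suggested above.

```python
import json

WARN_BELOW = 99.0  # suggested warning threshold (percent)
FAIL_BELOW = 95.0  # suggested build-failure threshold (percent)

def classify_throughput(response_text, warn_below=WARN_BELOW, fail_below=FAIL_BELOW):
    """Return 'ok', 'warning' or 'fail' based on $.gcKPI.throughputPercentage."""
    throughput = json.loads(response_text)["gcKPI"]["throughputPercentage"]
    if throughput < fail_below:
        return "fail"
    if throughput < warn_below:
        return "warning"
    return "ok"

# Sample values taken from the article's example response.
sample = '{"gcKPI": {"throughputPercentage": 99.952, "averagePauseTime": 2.2834644, "maxPauseTime": 30}}'
print(classify_throughput(sample))  # prints ok
```

Both thresholds are parameters, so a low-profile application can pass looser values than a Tier 1 application.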
4. AVERAGE GC PAUSE TIME
When a garbage collection event runs, the entire application pauses. This is because garbage collection has to mark every object in the application and check whether it is still referenced; objects that nothing references are evicted from memory. The fragmented memory is then compacted. The application is paused while all of these operations are carried out, so whenever garbage collection runs, customers experience pauses/delays. You should therefore always aim for a low average GC pause time.
How to read average GC pause time?
Average GC pause time is reported in the ‘averagePauseTime’ element. Example:
{
  :
  :
  "gcKPI": {
    "throughputPercentage": 99.952,
    "averagePauseTime": 2.2834644,
    "maxPauseTime": 30
  },
  :
  :
}
The JSON expression to read the ‘averagePauseTime’ element from the JSON response is:
$.gcKPI.averagePauseTime
What is the acceptable average GC pause time?
It depends on the application. If you are running a batch application, GC pause time may not matter, because no customer is actively waiting at a device for a response. On the other hand, if you are building rockets at NASA, even a few milliseconds of pause matter. One good way to arrive at an acceptable threshold is to look at the average GC pause time your application currently experiences and derive a value from it. If you put me at gunpoint and ask for an acceptable value, I will say 2 seconds 🙂
5. MAX GC PAUSE TIME
Some garbage collection events take a few milliseconds, whereas others take several seconds or even minutes. You should measure the maximum garbage collection pause time to understand the worst possible impact on the customer.
How to read maximum GC pause time?
Maximum GC pause time is reported in the ‘maxPauseTime’ element. Example:
{
  :
  :
  "gcKPI": {
    "throughputPercentage": 99.952,
    "averagePauseTime": 2.2834644,
    "maxPauseTime": 30
  },
  :
  :
}
The JSON expression to read the ‘maxPauseTime’ element from the JSON response is:
$.gcKPI.maxPauseTime
What is the acceptable Maximum GC pause time?
Please refer to the description given in the ‘What is the acceptable average GC pause time?’ section of this article.
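Both pause-time elements can be checked in one pass. The sketch below is illustrative only: the function name and the limits passed to it are hypothetical, and the limits must be expressed in whatever unit the API reports for these elements, which this article does not state.

```python
import json

def check_pause_times(response_text, max_avg_pause, max_peak_pause):
    """Return a list of violation messages (empty if both values are within limits).

    The limits are assumed to be in the same unit the API uses for
    'averagePauseTime' and 'maxPauseTime'.
    """
    kpi = json.loads(response_text)["gcKPI"]
    avg_pause = kpi["averagePauseTime"]
    peak_pause = kpi["maxPauseTime"]
    violations = []
    if avg_pause > max_avg_pause:
        violations.append(f"averagePauseTime {avg_pause} exceeds limit {max_avg_pause}")
    if peak_pause > max_peak_pause:
        violations.append(f"maxPauseTime {peak_pause} exceeds limit {max_peak_pause}")
    return violations

# Sample values from the article's example; the limits 5 and 20 are made up.
sample = '{"gcKPI": {"throughputPercentage": 99.952, "averagePauseTime": 2.2834644, "maxPauseTime": 30}}'
for violation in check_pause_times(sample, max_avg_pause=5, max_peak_pause=20):
    print("WARNING:", violation)
```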
6. OBJECT CREATION RATE
Object creation rate is the average rate at which your application creates objects. Maybe as of your previous code commit the application was creating 100 mb/sec, and starting from a recent code commit it creates 150 mb/sec. This additional object creation can trigger a lot more GC activity, CPU spikes, potential OutOfMemoryErrors, and memory leaks when the application runs for a longer period.
How to read object creation rate?
Object creation rate is reported in the ‘avgAllocationRate’ element in the JSON response. Example:
{
  :
  :
  "gcStatistics": {
    "avgAllocationRate": "30.83 mb/sec",
    :
    :
  },
  :
  :
}
The JSON expression to read the ‘avgAllocationRate’ element from the JSON response is:
$.gcStatistics.avgAllocationRate
What is the acceptable Object creation rate?
It depends on the application. One way to arrive at an acceptable object creation rate is to study your application’s current object creation rate. Suppose it is currently 100 mb/sec. Then you might want to raise WARNINGs if the object creation rate goes beyond 105 mb/sec (i.e. a 5% increase), and fail the build if it goes beyond 110 mb/sec (i.e. a 10% increase).
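Because ‘avgAllocationRate’ is a string such as "30.83 mb/sec", it has to be parsed before it can be compared with a baseline. The sketch below assumes the value always has the form "<number> <unit>/sec"; only "mb/sec" appears in the sample response, so the "kb/sec" and "gb/sec" entries and the 1024 conversion factor are assumptions. The function names are hypothetical.

```python
import json

# Conversion factors to mb/sec. Only "mb/sec" is confirmed by the sample
# response; the other units and the factor of 1024 are assumptions.
_UNIT_TO_MB_PER_SEC = {"kb/sec": 1 / 1024, "mb/sec": 1.0, "gb/sec": 1024.0}

def parse_rate_mb_per_sec(text):
    """Convert a string like '30.83 mb/sec' to a float in mb/sec."""
    value, unit = text.split()
    return float(value) * _UNIT_TO_MB_PER_SEC[unit.lower()]

def check_allocation_rate(response_text, baseline_mb_per_sec):
    """Return 'fail' above baseline + 10%, 'warning' above baseline + 5%, else 'ok'."""
    raw = json.loads(response_text)["gcStatistics"]["avgAllocationRate"]
    rate = parse_rate_mb_per_sec(raw)
    if rate > baseline_mb_per_sec * 1.10:
        return "fail"
    if rate > baseline_mb_per_sec * 1.05:
        return "warning"
    return "ok"

# Hypothetical response: 107 mb/sec measured against a 100 mb/sec baseline.
sample = '{"gcStatistics": {"avgAllocationRate": "107 mb/sec"}}'
print(check_allocation_rate(sample, baseline_mb_per_sec=100))  # prints warning
```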
7. PEAK HEAP SIZE
Peak heap size is the maximum amount of memory consumed by your application. If peak heap size goes beyond a limit, you must investigate it. Maybe there is a memory leak in the application, or newly introduced code (or 3rd-party libraries/frameworks) is consuming a lot of memory. Or maybe the usage is legitimate, in which case you will have to change your JVM arguments to allocate more memory.
How to read peak heap size?
Peak heap size is reported in the ‘peakSize’ element in the JSON response. Example:
{
  "jvmHeapSize": {
    "total": {
      "allocatedSize": "3.93 gb",
      "peakSize": "154 mb"
    },
    :
    :
  },
  :
  :
}
The JSON expression to read the ‘peakSize’ element from the JSON response is:
$.jvmHeapSize.total.peakSize
What is the acceptable peak heap size?
It depends on the application. One way to arrive at an acceptable peak heap size is to study your application’s current peak heap size. Suppose it is currently 1 gb. Then you might want to raise WARNINGs if the peak heap size goes beyond 1.05 gb (i.e. a 5% increase), and fail the build if it goes beyond 1.10 gb (i.e. a 10% increase).
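To compare against a baseline, the ‘peakSize’ string first has to be converted to a number. The following sketch reports the growth over a baseline as a percentage, which can then be tested against the 5% and 10% limits. It assumes values of the form "<number> <unit>" with the units "mb" and "gb" seen in the sample response, plus "kb" and a factor of 1024 between units as further assumptions; the function names and the 140 mb baseline are made up for illustration.

```python
import json

# "mb" and "gb" appear in the sample response; "kb" and the factor of
# 1024 between units are assumptions.
_UNIT_TO_MB = {"kb": 1 / 1024, "mb": 1.0, "gb": 1024.0}

def parse_size_mb(text):
    """Convert a string like '154 mb' or '3.93 gb' to a float in mb."""
    value, unit = text.split()
    return float(value) * _UNIT_TO_MB[unit.lower()]

def peak_heap_growth_percent(response_text, baseline_mb):
    """Percentage by which $.jvmHeapSize.total.peakSize exceeds the baseline."""
    raw = json.loads(response_text)["jvmHeapSize"]["total"]["peakSize"]
    return (parse_size_mb(raw) - baseline_mb) / baseline_mb * 100

# Sample values from the article's example; the baseline is hypothetical.
sample = '{"jvmHeapSize": {"total": {"allocatedSize": "3.93 gb", "peakSize": "154 mb"}}}'
growth = peak_heap_growth_percent(sample, baseline_mb=140)
print(f"Peak heap grew by {growth:.1f}% over the baseline")
```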