In today’s software development landscape, Continuous Integration/Continuous Deployment (CI/CD) pipelines have become essential for maintaining high-quality releases. As part of the CI/CD process, organizations run smoke tests, regression tests, performance tests, static code analysis, and security scans. Despite these rigorous quality checks, performance issues like OutOfMemoryError, CPU spikes, application unresponsiveness, and response time degradations are still surfacing in production environments.

These issues arise because traditional CI/CD pipelines monitor only high-level metrics such as static code quality, test coverage, CPU utilization, and memory consumption. While these metrics are valuable, they miss critical performance issues rooted in memory management inefficiencies. Incorporating Garbage Collection (GC) performance micro-metrics into your CI/CD pipeline fills this gap.

Analyzing GC metrics as part of CI/CD offers tangible benefits across several areas:

  1. Forecast Memory Problems – GC behavior can reveal memory leaks, help diagnose OutOfMemoryError early, and even forecast memory issues before they impact production.
  2. Improve Application Performance – Tuning GC reduces response times, increases throughput, and prevents long pauses, ensuring a smoother user experience.
  3. Lower Computing Costs – GC tuning reduces CPU usage, optimizes memory allocation, and can help cut software licensing costs.

You can learn more about the benefits of GC monitoring and tuning here.

Thus, monitoring GC performance micro-metrics in your CI/CD pipeline enables you to catch potential performance bottlenecks early, bringing a true shift-left approach to performance management. By proactively addressing memory issues and optimizing GC behavior during development, you can prevent costly production issues and ensure smoother releases. In this post, we’ll explore how to integrate GC metrics into your CI/CD.

Steps to Integrate GC Metrics into CI/CD Pipeline

Below are the steps to integrate Garbage Collection performance metrics into your CI/CD pipeline:

1. Garbage Collection Logging: The first step is to enable GC logging by adding the following JVM arguments to your application:

For Java 8 or below

-XX:+PrintGCDetails -Xloggc:<gc-log-file-path>

Example: -XX:+PrintGCDetails -Xloggc:/opt/tmp/myapp-gc.log

For Java 9 or above

-Xlog:gc*:file=<gc-log-file-path>

Example: -Xlog:gc*:file=/opt/tmp/myapp-gc.log

2. Run a Performance Test in CI/CD: Run a performance test as part of your CI/CD pipeline, executing all key transactions of your application for at least 15 minutes. If possible, extend the test duration to capture more GC events, which provides a richer dataset for analysis.

3. Submit GC Logs to GCeasy’s REST API: After running the performance test, submit the GC log file to GCeasy’s REST API for analysis. The API processes the GC log file and returns a detailed JSON response with insightful GC metrics. It also employs machine learning algorithms to detect memory anomalies, providing early alerts on potential issues.

Refer to this documentation for more information on using GCeasy’s REST API and understanding the response payload.

In summary, GCeasy’s REST API can be easily called with a simple HTTP POST request, as shown below:

curl -X POST --data-binary @{GC_LOG_FILE_PATH} "https://api.gceasy.io/analyzeGC?apiKey={YOUR_API_KEY}" --header "Content-Type:text"

4. Extract Key GC Metrics: GCeasy’s REST API returns a JSON response containing several GC metrics and performance insights. Extract the key GC performance metrics from this response and populate them in your CI/CD report. The key metrics to report are listed in the following section, Essential GC Micro-Metrics to Monitor.

5. Triggering Build Failures on GC Issues: GCeasy’s REST API uses machine learning algorithms to automatically detect memory or GC-related issues. If any problems are identified, they are flagged in the ‘problem’ element of the JSON response. You can use this element to trigger build failures whenever GC anomalies are detected. 
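Steps 3 and 5 can be sketched in Python as follows. This is a minimal illustration rather than an official GCeasy client: the endpoint, the apiKey parameter, the Content-Type header and the ‘problem’ element come from the text above, while the function names, the timeout value, and the handling of a ‘problem’ value that is a single string are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://api.gceasy.io/analyzeGC"  # endpoint from the curl example


def build_request(gc_log: bytes, api_key: str) -> urllib.request.Request:
    """Build the same HTTP POST request that the curl example sends."""
    url = API_ENDPOINT + "?" + urllib.parse.urlencode({"apiKey": api_key})
    return urllib.request.Request(
        url, data=gc_log, headers={"Content-Type": "text"}, method="POST"
    )


def analyze(gc_log_path: str, api_key: str) -> dict:
    """Upload the GC log file and return the parsed JSON response."""
    with open(gc_log_path, "rb") as log_file:
        request = build_request(log_file.read(), api_key)
    # The 120-second timeout is an arbitrary choice for this sketch.
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)


def detected_problems(report: dict) -> list:
    """Return the entries of the 'problem' element, or [] when it is absent."""
    problems = report.get("problem") or []
    # Assumption: tolerate a single string as well as an array of strings.
    return [problems] if isinstance(problems, str) else list(problems)


def exit_code(report: dict) -> int:
    """Map the report to a process exit code: 1 if any problem was flagged."""
    return 1 if detected_problems(report) else 0
```

A CI step could call analyze() with the GC log path and an API key read from a secret, print the result of detected_problems(), and pass exit_code() to sys.exit(), since most CI systems treat a non-zero exit code as a failed build.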

Essential GC Micro-Metrics to Monitor

To effectively monitor Garbage Collection (GC) performance in your CI/CD pipeline, focus on the following key performance indicators (KPIs) that provide insights into memory usage, application responsiveness, and resource efficiency. Below is a table detailing each KPI, its significance, and where to find it in the JSON response.

Metric Name | Description

Problem | The ‘problem’ element flags any issues detected by GCeasy’s ML algorithms. The ‘problem’ element is an array that contains descriptions of detected GC-related issues, such as frequent full GCs or prolonged pauses. This information can help trigger alerts or fail builds if problems are detected. Retrieve this metric from the JSON response using the path: $.problem

Avg GC Pause Time | This metric represents the average time taken for each GC event to complete. High average pause times can delay application response times, as all active transactions are paused during GC. Tracking this metric helps ensure application responsiveness by keeping GC pauses low. Retrieve this metric from the JSON response using the path: $.gcKPI.averagePauseTime

Max GC Pause Time | Max GC Pause Time provides the longest pause recorded during a GC event. Prolonged pauses can cause significant application slowdowns and indicate potential performance bottlenecks. Monitoring this metric helps identify memory inefficiencies or configuration issues impacting performance. Retrieve this metric from the JSON response using the path: $.gcKPI.maxPauseTime

GC Throughput | GC Throughput reflects the percentage of time spent on application processing versus GC activities. A high throughput percentage means more time spent on processing customer transactions rather than GC activities, indicating efficient memory handling. Retrieve this metric from the JSON response using the path: $.gcKPI.throughputPercentage

Object Creation Rate | This metric reports the average rate of object creation, typically in MB/sec. High object creation rates can lead to frequent GC events, potentially causing CPU spikes and memory issues. Tracking this metric helps identify sudden increases in memory allocation and possible inefficiencies. Retrieve this metric from the JSON response using the path: $.gcStatistics.avgAllocationRate

CPU Consumption | This metric reports the total CPU time consumed by Garbage Collection activities, displayed in a format like 3 min 52 sec 180 ms. Since GC is a CPU-intensive process, monitoring this metric is crucial, as changes to GC algorithms or settings can significantly impact CPU usage. By optimizing GC behavior, you can reduce CPU load, improving resource efficiency and potentially lowering costs. Retrieve this metric from the JSON response using the path: $.gcKPI.cpuTime

To learn more about these GC KPIs, see the post Key Metrics of Java Garbage Collection.
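As a rough sketch of the extraction step, the JSON paths listed above can be mapped to report labels and looked up in the parsed response. The function names and the "n/a" placeholder are choices made for this example, and the code assumes the response has already been parsed into a Python dict.

```python
# Report label -> key path inside the JSON response (paths as listed above).
KPI_PATHS = {
    "Avg GC Pause Time": ("gcKPI", "averagePauseTime"),
    "Max GC Pause Time": ("gcKPI", "maxPauseTime"),
    "GC Throughput": ("gcKPI", "throughputPercentage"),
    "Object Creation Rate": ("gcStatistics", "avgAllocationRate"),
    "CPU Consumption": ("gcKPI", "cpuTime"),
}


def lookup(report, path):
    """Walk nested dicts along path; return None if any key is missing."""
    node = report
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_kpis(report):
    """Collect the KPIs into a flat {label: value} dict."""
    return {label: lookup(report, path) for label, path in KPI_PATHS.items()}


def format_report(kpis):
    """Render one 'label: value' line per KPI for the CI/CD report."""
    return "\n".join(
        f"{label}: {'n/a' if value is None else value}"
        for label, value in kpis.items()
    )
```

Returning None for a missing element, instead of raising, keeps the report step from breaking the build when a particular log does not produce every metric.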

Setting Application Specific GC Thresholds

You can set custom thresholds to control build outcomes based on GC performance. For example, if the GC pause time exceeds a specified duration or if GC throughput falls below a certain percentage, you may want the build to fail. Different applications often require unique thresholds: for example, a 30-second pause might be acceptable for batch applications, while in trading applications, even a 200-millisecond pause could be too long. Custom thresholds enable you to fine-tune performance monitoring based on each application’s specific requirements.

To configure these custom thresholds, refer to the documentation here. When thresholds are breached, issues will be flagged in the problem element of the JSON response, providing early alerts on potential performance impacts.
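If you would rather keep a threshold in the pipeline itself, the same idea can be approximated on the client side. The sketch below is not GCeasy’s threshold mechanism; it simply compares two KPIs from the response against limits you choose. The Thresholds class and its field names are hypothetical, and the code assumes that maxPauseTime and throughputPercentage are returned as numbers, so any limit must be expressed in the same unit the API uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits; use the same units as the values the API returns.
    max_pause_time: Optional[float] = None
    min_throughput_percentage: Optional[float] = None


def threshold_violations(report: dict, limits: Thresholds) -> list:
    """Return a human-readable message for every breached limit."""
    kpi = report.get("gcKPI") or {}
    violations = []

    max_pause = kpi.get("maxPauseTime")
    if limits.max_pause_time is not None and max_pause is not None:
        if float(max_pause) > limits.max_pause_time:
            violations.append(
                f"max GC pause {max_pause} exceeds limit {limits.max_pause_time}"
            )

    throughput = kpi.get("throughputPercentage")
    if limits.min_throughput_percentage is not None and throughput is not None:
        if float(throughput) < limits.min_throughput_percentage:
            violations.append(
                f"GC throughput {throughput}% is below "
                f"{limits.min_throughput_percentage}%"
            )

    return violations
```

A batch job and a trading service would each get their own Thresholds instance, and the build fails when threshold_violations() returns a non-empty list.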

Embedding GC Graphs in CI/CD Reports

Embedding GC performance graphs in your CI/CD reports provides a visual representation of trends and anomalies, making it easier for your team to identify potential issues. GCeasy’s REST API can generate these graphical images of GC behavior when you include the ‘graphs=true’ parameter in your API request, as shown below:

curl -X POST --data-binary @{GC_LOG_FILE_PATH} "https://api.gceasy.io/analyzeGC?apiKey={YOUR_API_KEY}&graphs=true"

This setting returns graph image URLs for various GC behavior graphs in the JSON response, which you can embed directly into your CI/CD report. Below is an excerpt from the JSON response containing these graph image URLs:

"graphs": { 
    "heapAfterGCGraph": "https://graphs.gceasy.io/archived/2019/04/14/--heap_usage_after_gc.png", 
    "heapBeforeGCGraph": "https://graphs.gceasy.io/archived/2019/04/14/--heap_usage_before_gc.png", 
    "gcDurationGraph": "https://graphs.gceasy.io/archived/2019/04/14/--gc_duration.png", 
    "gcPauseDurationGCGraph": "https://graphs.gceasy.io/archived/2019/04/14/--gc_pause_duration.png" 
}

Each graph URL provides valuable insights into different aspects of GC performance. Here’s a breakdown of each graph, its purpose, and where to locate it in the JSON response:

Graph Image | Description

Heap After GC Graph | Shows the heap memory usage immediately after each GC event, helping to track memory release patterns. Retrieve this URL from the JSON response using the path: $.graphs.heapAfterGCGraph

Heap Before GC Graph | Displays heap memory usage just before each GC event, useful for identifying peak memory usage before collection. Retrieve this URL from the JSON response using the path: $.graphs.heapBeforeGCGraph

GC Duration Graph | Visualizes the total duration of each GC event, including both pausing (stop-the-world) and non-pausing phases. This graph helps to analyze the complete GC cycle duration, offering insights into potential areas of optimization beyond just pause times. Retrieve this URL from the JSON response using the path: $.graphs.gcDurationGraph

GC Pause Duration Graph | Reports only the duration of the GC event’s pause (stop-the-world) phases, excluding any non-pausing activity. This graph is useful for identifying and minimizing application disruptions caused by GC pauses. Retrieve this URL from the JSON response using the path: $.graphs.gcPauseDurationGCGraph
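One way to embed these images is to turn the ‘graphs’ element into an HTML fragment that your CI/CD tool can publish. The sketch below assumes an HTML report; the function name, the headings, and the markup are illustrative choices, and only the four JSON keys come from the response excerpt above.

```python
import html

# JSON key in $.graphs -> heading shown in the report.
GRAPH_TITLES = {
    "heapAfterGCGraph": "Heap After GC",
    "heapBeforeGCGraph": "Heap Before GC",
    "gcDurationGraph": "GC Duration",
    "gcPauseDurationGCGraph": "GC Pause Duration",
}


def graphs_html(report: dict) -> str:
    """Build an HTML fragment with one heading and image per available graph."""
    graphs = report.get("graphs") or {}
    parts = []
    for key, title in GRAPH_TITLES.items():
        url = graphs.get(key)
        if not url:
            continue  # skip graphs missing from this response
        safe_title = html.escape(title)
        safe_url = html.escape(url)  # also escapes quotes for the attribute
        parts.append(
            f'<h3>{safe_title}</h3>\n<img src="{safe_url}" alt="{safe_title}">'
        )
    return "\n".join(parts)
```

Escaping the URL before placing it in the src attribute guards against characters such as & or quotes breaking the generated markup.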

Linking to Detailed Report

To enhance analysis, add a link to GCeasy’s full GC analysis report in your CI/CD output, next to the GC metrics and graph images. This link is available in the ‘webReport’ element ($.webReport) of the JSON response. It takes the user to the GCeasy dashboard, where they can access a complete report with visualizations, metrics, and insights that provide an in-depth view of GC behavior.

Embedding this link in your CI/CD report enables team members to explore GC performance data in greater detail, supporting faster diagnosis and resolution of memory or GC-related issues in the event of a build failure. 

Calendar Dashboard Reporting with Metadata

Whenever you make a GCeasy API call, it will be automatically reported in the Calendar Dashboard. To further enrich your Calendar Dashboard experience and gain more contextual insights, you can include metadata parameters in your API requests. These parameters allow you to specify the application name, host name, and even custom tags, providing a comprehensive view of your GC log analysis in the context of your applications and infrastructure.

This metadata reporting can be achieved by passing the following parameters in the REST API endpoint:

appName (string, optional): The name of the application from which the GC log was collected.

host (string, optional): The host where the specified application is running.

tags (string, optional): Tags provide additional information or categorization for the analysis.

Syntax:

https://api.gceasy.io/analyzeGC?apiKey=<API-KEY>&appName=<APPLICATION_NAME>&host=<HOST_NAME>&tags=<TAGS>

Example:

https://api.gceasy.io/analyzeGC?apiKey=fb8101b4-c47a-4a6d-bd09-aa1406976b1c&appName=MyApp&host=ProductionServer1&tags=PerformanceProblems,Release-23_09_01
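In a pipeline script, these metadata parameters are usually assembled from build variables. The helper below is a small sketch of that: the function name and its arguments are hypothetical, the parameter names match the list above, and commas are left unescaped so the tags value looks like the example URL.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.gceasy.io/analyzeGC"


def analyze_url(api_key, app_name=None, host=None, tags=()):
    """Build the endpoint URL, adding only the metadata that was supplied."""
    params = {"apiKey": api_key}
    if app_name:
        params["appName"] = app_name
    if host:
        params["host"] = host
    if tags:
        # Multiple tags are joined with commas, as in the example above.
        params["tags"] = ",".join(tags)
    # safe="," keeps the commas readable; other special characters are escaped.
    return API_ENDPOINT + "?" + urlencode(params, safe=",")
```

Using urlencode rather than string concatenation means application or host names containing spaces or ampersands do not corrupt the query string.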

GC Report File Name

When analyzing GC logs across hundreds of JVMs, it’s helpful to name each report. This can be achieved by passing the ‘fileName’ parameter in the GCeasy API endpoint, allowing you to easily determine which report belongs to which host.

For example, you can tag a report as follows:

https://api.gceasy.io/analyzeGC?apiKey={YOUR_API_KEY}&fileName={YOUR_TAG}

In the JSON response, the webReport element provides a URL. Opening this URL in a browser displays the visual GC analysis report, with the file name set in the API call displayed at the top left corner of the report.

Fig: GCeasy report with the tag passed in API

This feature helps streamline tracking and organization, especially when managing numerous GC logs across multiple hosts or environments.

Conclusion

Incorporating Garbage Collection micro-metrics into your CI/CD pipeline is a valuable step toward maintaining application performance and reliability. These metrics offer a deeper insight into memory management, helping you catch issues early and avoid costly production downtimes.