The Future of GC: ZGC vs. Shenandoah – Which Low-Pause Collector is Right for Your App?

Annya Arun

4 months ago

In older enterprise systems, performance was measured mainly by throughput: the amount of work that could be done in a given time period. The Parallel collector was designed with this in mind, and even CMS and G1, which work to reduce pause times, can still fall back to long stop-the-world pauses on large heaps.

In modern systems, however, latency is often more important than throughput. For garbage collection, latency means the pause: the brief period during which the application is unavailable while the collector performs critical work. In interactive and real-time systems, even a small pause can have business implications.

For example, consider a foreign exchange market. Even a 100-millisecond pause can mean that the user misses the best exchange rate. In robotics or real-time automation, a one-second pause can mean that the task fails.

The problem gets worse as the size of the heaps increases. Especially in large systems, the heap sizes can run into tens or even hundreds of gigabytes of memory. In traditional garbage collectors, the pause times can increase as the heap size increases, making it difficult to predict the latency.

To overcome this problem, two new low-pause garbage collectors were developed with large heaps and latency-sensitive workloads in mind: Z Garbage Collector (ZGC) and Shenandoah GC.

Which one is right for your application? If you are still unsure, no worries. This article compares the architecture, trade-offs, and ideal use cases of ZGC and Shenandoah to help you make an informed decision.

How Modern Concurrent Garbage Collectors Work

Modern garbage collectors are designed to not block the application for a long time. They work on one idea: garbage collection should not take too long no matter how big the heap is. Here is how they do it:

1. Concurrent Instead of Stop-the-World: Most of the work, like marking and moving objects, happens while the application is running. The only pauses are short ones. This keeps things predictable, even when a lot of objects are being allocated.

2. Region-Based Heap Design: The heap is split into regions that can be cleaned up independently of each other. This makes it possible to move objects bit by bit, so the whole heap does not need to be cleaned up at once.

3. Snapshot Safety (SATB): While marking concurrently, the collector works from a logical snapshot of what was alive when marking started. It keeps track of references the application overwrites in the meantime. This keeps marking correct even while the application is changing memory.

4. Concurrent Relocation & Compaction: Objects are moved while the application is still running. Techniques such as forwarding pointers and barriers ensure that references stay valid, so no long pause is needed.

5. Heap-Size–Independent Pause Times: Because most of the work is done concurrently, pause times don’t depend on how big the heap is. This is the key innovation behind modern low-latency garbage collectors.
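To make the snapshot idea from point 3 a little more concrete, here is a toy Java model of a SATB-style pre-write barrier. It is only a sketch: the `SatbSketch` class, its `Node` type, and the way the "application" runs exactly once in the middle of marking are all made up for illustration and are not how any JVM is implemented. It shows why remembering the overwritten reference keeps an object alive that marking would otherwise miss.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model of snapshot-at-the-beginning (SATB) marking. Not real JVM code.
final class SatbSketch {
    static final class Node {
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    static boolean markingActive = false;
    static final Deque<Node> satbQueue = new ArrayDeque<>();

    // Pre-write barrier: while marking is active, remember the reference
    // that is about to be overwritten so the collector can still visit it.
    static void writeNext(Node holder, Node newValue) {
        if (markingActive && holder.next != null) {
            satbQueue.push(holder.next);
        }
        holder.next = newValue;
    }

    // Marks everything reachable from root. The "application" runs once,
    // right after the first object has been scanned, to imitate concurrency.
    static Set<Node> mark(Node root, Runnable application) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>();
        satbQueue.clear();
        markingActive = true;
        work.push(root);
        boolean applicationRan = false;
        while (!work.isEmpty() || !satbQueue.isEmpty()) {
            Node n = work.isEmpty() ? satbQueue.pop() : work.pop();
            if (!marked.add(n)) continue;           // already visited
            if (n.next != null) work.push(n.next);
            if (!applicationRan) {
                applicationRan = true;
                application.run();
            }
        }
        markingActive = false;
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.next = b;
        b.next = c;                                 // A -> B -> C
        Set<Node> marked = mark(a, () -> {
            writeNext(a, c);                        // A now points at C
            writeNext(b, null);                     // B no longer points at C
        });
        System.out.println("C marked: " + marked.contains(c));
    }
}
```

If the two writes bypass `writeNext` and assign the fields directly, the collector in this toy model never sees C, even though the application can still reach it through A.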

We have briefly talked about the alternatives in one of our blogs, which describes modern Java Garbage Collection.

Note: If you’d like a refresher on the fundamentals before diving into low-pause collectors, our must-read guide on Java Garbage Collection basics explains how the JVM manages memory and why pause behavior matters.

ZGC Architecture Explained

Fig: ZGC GC Cycle Timeline

ZGC was introduced in JDK 11 as an experimental feature and became production-ready in JDK 15. Later releases brought further important improvements, including a generational mode in JDK 21. The main goal of ZGC is to deliver ultra-low pause times even when the heap is very large. Let us look at the design principles of ZGC:

Heap Scalability: The ZGC is designed to handle large heaps and it does this very well. Unlike earlier collectors, the pause time does not get longer when the heap gets bigger.
Sub-10ms Pause Targets: The ZGC keeps pause times short, usually under 10ms, by doing most of the heavy work concurrently with the application threads.
Colored Pointers: The ZGC uses a technique called colored pointers, which means it stores metadata about an object's GC state in otherwise unused bits of the object reference itself. This helps the collector keep track of things without having to stop everything.
Load Barriers: The ZGC also uses something called load barriers to intercept when objects are accessed and to update references when objects move. This means it does not need to stop everything for a time to move objects around.
Fully Concurrent Compaction: The ZGC can do several things at the same time, like marking, relocating and updating references. This means there is no pause for these tasks like there is with other collectors.
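The interplay of colored pointers and load barriers is easier to picture with a small model. The sketch below treats a reference as a plain `long`, uses one made-up metadata bit (`REMAPPED_BIT`) and a made-up `forwardingTable`. The bit positions, names, and table are assumptions for illustration only and do not reflect ZGC's real pointer layout or data structures.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a colored pointer plus load barrier. Not ZGC's real layout.
final class ColoredPointerSketch {
    // Hypothetical layout: low 44 bits hold the address, bit 62 is the color.
    static final long ADDRESS_MASK = (1L << 44) - 1;
    static final long REMAPPED_BIT = 1L << 62;

    // Old address -> new address for objects the collector has relocated.
    static final Map<Long, Long> forwardingTable = new HashMap<>();

    static long address(long ref) {
        return ref & ADDRESS_MASK;
    }

    static boolean isRemapped(long ref) {
        return (ref & REMAPPED_BIT) != 0;
    }

    // Load barrier: runs whenever the application loads a reference.
    static long loadBarrier(long ref) {
        if (isRemapped(ref)) {
            return ref;                              // fast path: color is good
        }
        long oldAddress = address(ref);
        long newAddress = forwardingTable.getOrDefault(oldAddress, oldAddress);
        return newAddress | REMAPPED_BIT;            // slow path: fix up and recolor
    }

    public static void main(String[] args) {
        forwardingTable.put(0x1000L, 0x2000L);       // object moved by the collector
        long stale = 0x1000L;                        // reference with a "bad" color
        long healed = loadBarrier(stale);
        System.out.println(Long.toHexString(address(healed)));
        System.out.println(loadBarrier(healed) == healed);
    }
}
```

The point of the design is that the common case costs only a bit test. The slow path runs only for references that still carry an outdated color, which is why the application can keep running while objects are being moved.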

Though there are some strengths to ZGC architecture, there are limitations too. Let’s quickly check what they are:

Strengths	Limitations
Extremely predictable pause times, even under high allocation pressure	Initially Linux-only (now expanded to broader OS support in newer JDKs)
Excellent scalability for large heaps (32GB, 64GB, multi-terabyte)	Slight CPU overhead due to continuous load barrier execution
Minimal tuning requirements compared to older collectors	Smaller operational ecosystem compared to long-established collectors like G1
Handles large allocations without “humongous object” techniques	–

The ZGC is especially good in systems where latency is critical and where consistency and predictability matter more than maximum throughput. Its ability to deliver ultra-low pause times even on very large heaps makes it a good choice for these types of systems.

For a deeper dive into ZGC internals and optimization, see our detailed ZGC tuning guide.

Shenandoah Architecture Explained

Fig: Shenandoah GC Cycle (Concurrent Evacuation Model)

The Shenandoah GC was originally developed by Red Hat. It was created to keep pauses very short by doing two things at the same time as the application runs: evacuation and compaction. Both happen while the application threads are still running. The main idea of Shenandoah GC is simple: keep the application running even when there’s not much memory left. Now let’s look at how Shenandoah GC works:

Concurrent Evacuation: Shenandoah GC moves objects around without pausing application threads. This is different from earlier collectors. It helps reduce pauses when memory is being cleaned up. This makes things run smoothly and consistently.
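Brooks Forwarding Pointers: Each object gets a forwarding pointer that initially points to the object itself. When an object is moved, the forwarding pointer is redirected to the new copy, so every access through an old reference still reaches the right object. This means the application doesn’t have to stop. (Newer Shenandoah versions keep this forwarding information in the object header instead of a separate word.)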
Region-Based Heap: Memory is split into parts called regions. These can be cleaned up one by one. This way not the whole memory has to be cleaned at once.
Heap-Size Independent Pauses: The time it takes to pause doesn’t change much even if the heap size increases. This is good for systems that need to be fast.
Responsiveness First: Shenandoah GC prioritizes keeping the application responsive over maximizing raw throughput.
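The forwarding-pointer idea can be sketched in a few lines of Java. This is a toy model only: the `Obj` class, its `forwardee` field, and the `evacuate`, `read`, and `write` helpers are hypothetical names invented here, and the real collector does this inside the JVM with barriers, not in application code.

```java
// Toy model of a Brooks-style forwarding pointer. Not real Shenandoah code.
final class BrooksPointerSketch {
    static final class Obj {
        int value;
        Obj forwardee = this;        // points to itself until the object is moved
        Obj(int value) { this.value = value; }
    }

    // Every access goes through the forwarding pointer first.
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    // "Evacuation": copy the object and redirect the old copy to the new one.
    static Obj evacuate(Obj from) {
        Obj to = new Obj(from.value);
        from.forwardee = to;
        return to;
    }

    static int read(Obj ref) {
        return resolve(ref).value;
    }

    static void write(Obj ref, int newValue) {
        resolve(ref).value = newValue;
    }

    public static void main(String[] args) {
        Obj original = new Obj(1);
        Obj copy = evacuate(original);   // collector moves the object
        write(original, 42);             // application still holds the old reference
        System.out.println(read(original));
        System.out.println(copy.value);
    }
}
```

Both lines print 42, because the write through the stale reference lands in the new copy. The price of this scheme is the extra indirection on each access, plus any space needed to store the forwarding information.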

Here’s a snapshot of Shenandoah GC’s strengths, along with the limitations that can show up under certain workloads.

Strengths	Limitations
Mature and well-supported in the Red Hat ecosystem	Slightly higher memory overhead due to forwarding pointers
Performs well on mid-to-large heaps (8GB–64GB range)	Can exhibit higher CPU usage under heavy allocation pressure
Strong choice for interactive and latency-sensitive systems	Requires more careful tuning compared to ZGC in some workloads

Shenandoah GC shines in environments that already use Red Hat-supported OpenJDK distributions and where responsiveness and predictable latency are especially important. It is a good fit for these applications because it keeps the application running while memory is reclaimed.

For Shenandoah-specific insights, see our Shenandoah GC log analysis guide.

ZGC vs Shenandoah: Side-by-Side Comparison

When evaluating ZGC vs Shenandoah Java GC, the question is not whether both are low-pause collectors. They are. The real distinction lies in how they balance latency, scalability, CPU usage, and operational complexity.

Criteria	ZGC	Shenandoah
Pause time consistency	Extremely consistent, typically sub-10ms and largely heap-size independent	Low pauses (often 1–10ms), slightly more variance under allocation pressure
Heap scalability	Designed for very large heaps (32GB to multi-terabyte)	Strong for mid-to-large heaps (8GB–64GB), scales well but optimized slightly differently
CPU overhead	Moderate due to continuous load barriers	Can be higher during heavy concurrent evacuation cycles
Throughput impact	Slight throughput trade-off for latency predictability (≈85–95%)	Generally competitive throughput (≈90–95%)
Memory overhead	Low additional structural overhead	Slightly higher due to Brooks forwarding pointers
Tuning complexity	Minimal tuning required in most cases	More sensitive to pacing and allocation patterns
Ecosystem maturity	Rapidly maturing; strong adoption in modern JDKs	Mature in Red Hat ecosystems; widely supported in OpenJDK builds
Cloud suitability	Excellent for large containerized deployments with high memory	Very suitable for responsive microservices and container workloads
Best-fit workload	Large heaps, strict latency SLOs, predictable behavior required	Interactive systems, mid-to-large heaps, responsiveness prioritized

In short, the ZGC vs Shenandoah Java GC decision is not about which is better universally. It’s about which collector aligns more precisely with your latency objectives, heap profile, and operational constraints. ZGC emphasizes predictability and massive heap scalability. Shenandoah emphasizes responsiveness and strong concurrent evacuation behavior.

Choosing the right collector depends heavily on workload patterns, heap size, and latency goals, as discussed in our GC choosing and tuning guide.

Possible Real-World Use Case Scenarios

Let’s explore this through a hypothetical cloud-native scenario to see how workload characteristics can lead to different GC decisions.

A Cloud-Native Story: Same Environment, Different Outcome

Imagine you are handling a group of microservices that reside on Kubernetes. The memory is limited, throttling is apparent in heavy loads, traffic is random, and horizontal scaling hides the performance issue underneath. One afternoon, all hell breaks loose, when a P99 latency just suddenly shoots up.

There are alarms set off for a mission-critical service. Autoscaling has already begun spinning up new pods, but the latency only keeps going higher and higher. You finally start digging into GC logs.

The first problem service runs with a 48GB heap. Allocation is high and constant. GC pauses stay short, but the concurrent cycles are consuming a lot of CPU time, pushing the container close to its throttling threshold. ZGC is the preferred choice here. With huge heaps and strict latency constraints, ZGC’s steady pause times remain a good fit, provided the process is given sufficient CPU resources. The issue isn’t variability in pause times, it’s resource allocation.

In a different service in the same environment, a 16GB heap sees intense allocation at peak times. The GC logs show heavy allocation, and pacing and degenerated cycles appear whenever allocation briefly outruns the evacuation rate. Here Shenandoah GC is preferable. Its concurrent evacuation model copes well with allocation bursts on mid-sized heaps and stays responsive under dynamic load.

Did you notice the difference? Same constraints, same infrastructure. But different workload characteristics, and therefore a different GC decision. The choice between ZGC and Shenandoah can only be made after studying heap size, allocation behavior, CPU usage, and latency under real production traffic.

If you’re interested in how these collectors perform in large-scale production environments, here are a few detailed real-world case studies worth exploring:

For ZGC:
-> Lower Java Tail Latencies with ZGC
-> Netflix: Generational ZGC in Production

For Shenandoah:
-> Shenandoah in Production – Clojure Goes Fast
-> Shenandoah Deep Dive Talk

Tuning Considerations

While ZGC requires minimal manual tuning and Shenandoah a little more, a few configuration and monitoring best practices help both collectors deliver consistently low pause times.

For ZGC

The ZGC is built to work well out of the box, but a few simple checks help make sure it keeps meeting your latency goals. Here are some things to check:

Heap Sizing: Make sure the heap has enough headroom so that the ZGC can finish its concurrent work while the application keeps allocating.
Allocation Rate Monitoring: Keep an eye on how fast the application allocates memory. If allocation outpaces what the ZGC can reclaim, application threads can stall while they wait for free memory.
Container Alignment: Make sure the JVM heap settings match the amount of memory available in the container so the ZGC does not get slowed down accidentally. Remember when sizing the container that it must have enough memory not only for the heap, but also for Java native memory, the operating system, and any other services running alongside.
Minimal Flag Usage: Do not adjust many settings for the ZGC. It usually works best when the configuration is kept simple.
GC Log Validation: Regularly check the ZGC logs to make sure it is working smoothly and consistently, especially under heavy traffic.

The thing to remember about the ZGC is that it is not about adjusting a lot of settings. It is about planning capacity carefully and watching how the collector behaves so you can confirm it is meeting your latency goals. If you’re running ZGC, refer to our step-by-step ZGC log analysis guide.
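For the container alignment check, it can help to ask the running JVM what it actually sees. The sketch below uses the standard `java.lang.management` API to list the active collector beans and the maximum heap. The `GcEnvironmentCheck` class and the `heapFraction` helper are hypothetical names, the 8 GiB container limit is an assumed example value, and the bean names printed vary by collector and JDK version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Small runtime check of which collector is active and how big the heap may get.
final class GcEnvironmentCheck {
    // Names of the GC beans the JVM exposes (these depend on the collector in use).
    static List<String> activeCollectors() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    // Share of the container memory limit that the maximum heap would occupy.
    static double heapFraction(long maxHeapBytes, long containerLimitBytes) {
        if (containerLimitBytes <= 0) {
            throw new IllegalArgumentException("container limit must be positive");
        }
        return (double) maxHeapBytes / containerLimitBytes;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long containerLimit = 8L * 1024 * 1024 * 1024;   // assumed limit: 8 GiB
        System.out.println("Collectors: " + activeCollectors());
        System.out.println("Max heap (MiB): " + maxHeap / (1024 * 1024));
        System.out.printf("Heap share of container: %.0f%%%n",
                100 * heapFraction(maxHeap, containerLimit));
    }
}
```

Run it with the same options as the service, for example with `-XX:+UseZGC` or `-XX:+UseShenandoahGC`, to confirm that the intended collector is really active. What counts as a safe heap share depends on how much native memory the application needs, so treat the printed percentage as a prompt for a closer look and not as a pass or fail result.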

For Shenandoah

Shenandoah works well out of the box, but it can slow the application down if pacing and allocation pressure are not managed properly. The Shenandoah garbage collector uses pacing to slow down application threads when the heap is in danger of running out of space. This allows the garbage collector to catch up. Keep an eye on this so that your application does not spend its time waiting for memory. Here are things to think about when you use Shenandoah.

GC Pacing: The Shenandoah garbage collector slows down allocating application threads when free memory is low. Frequent pacing is a sign that the collector is struggling to keep up.
Evacuation Thresholds: The thresholds that decide when a cycle starts and which regions are evacuated should leave the collector enough time and free space to finish its work concurrently.
Allocation Failure Scenarios: Watch out for times when the Shenandoah garbage collector cannot keep up with the demand for memory and has to fall back to a degenerated or full collection.
Region Sizing Impact: How the Shenandoah garbage collector divides the heap into regions can affect how well it works, especially for large objects.

To make sure Shenandoah is working well, look at how long it pauses, how long its concurrent cycles take, and how often it has to pace application threads. The main goal of tuning Shenandoah is to make sure it can clean up memory without having to stop everything.

In other words, the Shenandoah garbage collector should be able to do its job in the background without being forced into a stop-the-world collection because it is running out of memory. If you’re running Shenandoah, refer to our step-by-step Shenandoah log analysis guide.
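A quick back-of-the-envelope calculation shows why sustained allocation pressure ends in pacing or a fallback collection. The sketch below is a deliberately simplified model with assumed, constant rates. The `AllocationHeadroom` class and its parameters are hypothetical, and real collectors do not reclaim memory at a steady rate.

```java
// Simplified model: how long until free heap runs out if allocation outpaces reclamation.
final class AllocationHeadroom {
    // Returns seconds until the free space is exhausted,
    // or positive infinity when the collector keeps up.
    static double secondsUntilExhaustion(double freeMb,
                                         double allocMbPerSec,
                                         double reclaimMbPerSec) {
        double netGrowth = allocMbPerSec - reclaimMbPerSec;
        if (netGrowth <= 0) {
            return Double.POSITIVE_INFINITY;
        }
        return freeMb / netGrowth;
    }

    public static void main(String[] args) {
        // Assumed example numbers: 4 GB free, 1200 MB/s allocated, 800 MB/s reclaimed.
        double seconds = secondsUntilExhaustion(4096, 1200, 800);
        System.out.printf("Free space lasts about %.2f seconds%n", seconds);
    }
}
```

With these example numbers the free space lasts roughly ten seconds. A burst shorter than that is absorbed by the headroom, while a longer one forces the collector to slow the application down or stop it, which is why heap headroom and allocation rate need to be looked at together.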

For a broader view of GC tuning strategies across collectors, refer to our comprehensive GC tuning guide.

Observability & GC Log Analysis

Step one is to pick a low-pause collector. Step two is to validate it with GC logs.

When you’re operating Z Garbage Collector or Shenandoah GC, look at what’s impacting your latency. Here are a few things you’ll want to look at:

Pause distribution: Check P95 and P99 pause times, not just the average.
Allocation rate: Ensure the application is not allocating faster than the collector can mark and reclaim memory.
Concurrent cycle duration: Keep an eye on marking and evacuation times to ensure they aren’t increasing.
Fallback events: Identify allocation stalls or degenerated cycles.
CPU usage: Verify that GC threads aren’t being throttled.
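To see why percentiles matter more than the average, here is a minimal Java sketch that computes both from a list of pause durations. It assumes the pause times have already been extracted from the GC log into an array of milliseconds. The `PausePercentiles` class is a hypothetical helper, and it uses the simple nearest-rank percentile definition.

```java
import java.util.Arrays;

// Nearest-rank percentiles over GC pause samples (values in milliseconds).
final class PausePercentiles {
    static double percentile(double[] pausesMs, double p) {
        if (pausesMs.length == 0) {
            throw new IllegalArgumentException("no pause samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = pausesMs.clone();
        Arrays.sort(sorted);
        // Nearest-rank: smallest sample with at least p percent of samples at or below it.
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    static double mean(double[] pausesMs) {
        return Arrays.stream(pausesMs).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Made-up sample: nine 1 ms pauses and a single 50 ms outlier.
        double[] pauses = {1, 1, 1, 1, 1, 1, 1, 1, 1, 50};
        System.out.println("mean = " + mean(pauses));
        System.out.println("P50  = " + percentile(pauses, 50));
        System.out.println("P99  = " + percentile(pauses, 99));
    }
}
```

In this made-up sample the average is 5.9 ms and the median is 1 ms, yet the P99 is 50 ms. An average alone would hide the one pause that a latency-sensitive user actually notices.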

Low pause times don’t necessarily mean much if they aren’t stable under traffic. GC logs show whether you’re consistently meeting your latency goals.

Manually parsing GC logs is time-consuming and error-prone, especially in production-scale systems. Automated GC log analysis tools can visualize pause distributions, detect allocation pressure trends, and highlight fallback events quickly. Whether you’re validating ZGC or Shenandoah behavior, tooling transforms GC tuning from reactive troubleshooting into proactive performance engineering. My personal favorite for automatically analysing these GC logs would be GCeasy.

Common Misconceptions About Low-Pause Collectors

Low-pause collectors are really powerful, but they are also widely misunderstood.

“Low pause means high throughput loss.”

That is not true. Low latency does not necessarily mean poor throughput. The Z Garbage Collector and the Shenandoah GC are designed to balance the two. They can sustain roughly 85 to 95 percent throughput while still keeping pauses short.

“ZGC is only for huge heaps.”

This is not true either. Although the ZGC was made to work with huge amounts of memory, it works really well, even with moderate heap sizes. It is good for systems that need to work constantly and cannot afford to wait, no matter how big or small they are.

“Shenandoah is experimental.”

Not true. Although it started out as an experimental feature, Shenandoah has been a production feature since JDK 15. A lot of people are using it, especially in OpenJDK distributions for businesses. It has been proven to be stable in real-world situations.

“Concurrent GC means zero pauses.”

Wrong again. With concurrent collectors, there are still times when everything has to stop for a brief moment. The goal is to make these pauses as short and predictable as possible, not to get rid of them completely.

Understanding these nuances prevents incorrect tuning decisions and unrealistic expectations. Some of these misconceptions originate from outdated assumptions about garbage collection behavior.

Final Decision Framework: Which One Should You Choose?

The choice between ZGC and Shenandoah Java GC is not about picking the superior one. It’s about matching GC behavior with your needs.

Choose Z Garbage Collector if you want a garbage collector that has short and predictable pauses, can handle large heaps and you don’t want to spend much time tuning. It works well in systems with a lot of memory, where consistency is key.

Choose Shenandoah GC if you want your system to stay responsive when it’s allocating memory a lot and you want balanced performance, especially with medium to large heaps.

The best collector is the one that meets your latency goals consistently under heavy production load. Use GC logs to study memory usage and pause times before making your decision.

Short pauses are good. Consistent performance when it’s busy is what matters.