Understanding Generational GC: Young, Old & Promotion

In this post, let’s discuss why the JVM’s heap memory is partitioned into two regions: Young Gen and Old Gen. Are there any benefits to this partition? When does an object go to the Young Gen, and when does it get promoted to the Old Gen? Let’s see.

How Does Java’s Automatic Garbage Collection Work?

Fig: Object References in Memory

To understand why the JVM’s heap memory is partitioned into generations, we first need to understand how Java’s automatic Garbage Collection works. Let’s assume there is an Account Balance object in memory. Say this Account Balance object is referenced by an Account object, and this Account object is referenced by a Customer object. And there is no reference to the Customer object.

Now let’s see the steps the Garbage Collector needs to take to reclaim the Account Balance object from memory.

Step 1: The Garbage Collector has to find all the active references to the Account Balance object. It will discover the Account object referencing it.

Step 2: The Garbage Collector has to find all the objects that reference this Account object. It will discover the Customer object referencing it.

Step 3: The Garbage Collector has to find all the active references to the Customer object. It will not find any.

Step 4: Now the Garbage Collector will reclaim the Account Balance, Account & Customer objects from memory.

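Even in this simple hypothetical scenario, the Garbage Collector has to do 3 levels of scans (i.e. steps 1 – 3) to reclaim the Account Balance object from memory. (Real JVM collectors walk the graph in the opposite direction: they start from GC roots such as thread stacks and static fields, mark everything reachable from them, and reclaim whatever is left unmarked. The cost is the same kind of reference-by-reference traversal.) However, modern applications are quite complex, with millions of objects and billions of references among them. Thus, walking through all the objects in memory and identifying their references is a tedious, time-consuming and computationally intensive job. Hence, automatic garbage collection adds considerable performance overhead to your application.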
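To make the traversal concrete, here is a minimal sketch of a mark-and-sweep pass over a toy object graph. It is not how any particular JVM collector is implemented; `Node`, `ReachabilityDemo` and the `Cache` root are hypothetical names made up for illustration. The sketch starts from a set of roots, marks everything reachable, and reports the rest as garbage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a heap object: a name plus its outgoing references.
// Node and ReachabilityDemo are illustrative names, not JVM APIs.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

class ReachabilityDemo {
    // Mark phase: collect every object reachable from the given roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {
                // First visit: follow this object's references too.
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Sweep phase: anything on the heap that was not marked is garbage.
    static List<String> garbage(List<Node> heap, List<Node> roots) {
        Set<Node> live = mark(roots);
        List<String> dead = new ArrayList<>();
        for (Node n : heap) {
            if (!live.contains(n)) {
                dead.add(n.name);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Node customer = new Node("Customer");
        Node account = new Node("Account");
        Node balance = new Node("AccountBalance");
        Node cache = new Node("Cache"); // assumed long-lived root object
        customer.refs.add(account);
        account.refs.add(balance);

        List<Node> heap = List.of(customer, account, balance, cache);
        List<Node> roots = List.of(cache); // nothing references Customer
        System.out.println(garbage(heap, roots));
    }
}
```

Because nothing reachable from the root points at Customer, all three objects from the example are reported as garbage. The work done is proportional to the number of live objects and references, which is why shrinking the region being scanned matters so much.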

Short Lived & Long Lived Objects: 80 – 20 Rule

Studies of real applications consistently find that most objects die young; as a rule of thumb, roughly 80% of the objects created by our applications are short-lived. For example, when a new customer request enters our application, we create several objects like: Servlet, ServletFilter, HttpSession, HttpRequest, HttpResponse, DAO, DB Connection, ResultSet … Once the request is serviced, most of these newly created objects are no longer needed and should be evicted from memory (i.e. they are short-lived). On the other hand, there are certain long-lived objects among them. Example: HttpSession, DB Connection, Cache … They span beyond a single request and persist in memory for a longer duration. These typically account for only about 20% of the objects.

Depending on the application and traffic volume, Garbage Collection may have to run thousands of times a day. If we already know that certain objects are long-lived, why should the garbage collector scan them during every collection cycle? Doing so would unnecessarily increase GC pause times and CPU consumption.

Why Is the JVM Heap Partitioned into Young & Old Gen?

To minimize the garbage collection overheads, JVM Heap memory has been partitioned into 2 regions (i.e. generations):

a. Young Gen (aka Young Generation)

b. Old Gen (aka Old Generation)

The JVM stores all newly created objects in the Young Gen, and objects which live for a longer duration are promoted to the Old Gen. Besides these 2 regions, there is also a Native Memory region, which holds what the JVM itself needs to execute our application code (such as threads, memory for GC, class definitions, file descriptors…). If time permits, learn more from this video about JVM Memory Regions.

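With the default settings of the Serial and Parallel collectors (-XX:NewRatio=2), the Young Gen occupies 1/3rd of your heap memory size (i.e. -Xmx) and the remaining 2/3rd is occupied by the Old Gen. G1 instead resizes its Young Gen adaptively at run time, so the split is not fixed there.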
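The 1/3rd and 2/3rd split follows from HotSpot’s -XX:NewRatio flag, which expresses the Old Gen to Young Gen size ratio and defaults to 2. A small sketch of the arithmetic is shown below; `GenerationSizing` is a hypothetical helper, and the real JVM also aligns and rounds these sizes, so treat the numbers as approximations.

```java
// Hypothetical helper illustrating the NewRatio arithmetic.
// The real JVM applies alignment and rounding on top of this.
class GenerationSizing {
    // NewRatio = old / young, so young = heap / (newRatio + 1).
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // Whatever is not Young Gen is Old Gen.
    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 3 * 1024L * mb; // as if started with -Xmx3g
        int newRatio = 2;           // HotSpot default for -XX:NewRatio

        System.out.println("Young Gen MB: " + youngBytes(heap, newRatio) / mb);
        System.out.println("Old Gen MB:   " + oldBytes(heap, newRatio) / mb);
    }
}
```

For a 3 GB heap this gives 1024 MB of Young Gen and 2048 MB of Old Gen. Raising NewRatio shrinks the Young Gen, which means more frequent but smaller young collections.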

Thus, the Garbage Collector runs frequently on the much smaller Young Gen to clean up the short-lived objects, and runs only occasionally on the Old Gen to clean up the long-lived objects. Because of this brilliant strategy, the Garbage Collector has to scan only about 1/3rd of the memory to reclaim roughly 80% of the objects. Otherwise, it would have to scan the entire memory to reclaim the same short-lived objects, which would add significant overhead to the JVM.
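You can observe this pattern on a running JVM through the standard `java.lang.management` API. The sketch below allocates a lot of short-lived arrays and then prints each collector’s collection count and accumulated time. `GcCountDemo` is a hypothetical class name, and the collector names printed depend on which GC algorithm the JVM was started with, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the MXBean calls are standard JDK APIs.
class GcCountDemo {
    // One line per collector: name, number of collections, total time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate many short-lived arrays so young collections have work to do.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] temp = new byte[10 * 1024];
            temp[0] = (byte) i;
            checksum += temp[0]; // keep the allocation from being optimized away
        }
        System.out.println("checksum=" + checksum);
        describeCollectors().forEach(System.out::println);
    }
}
```

On a generational collector you would typically expect the young collector’s count to grow while the old collector’s count stays at or near zero, since none of these arrays live long enough to be promoted.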

When does an object get promoted from Young Gen to Old Gen?

An object will be promoted from Young Gen to Old Gen under the following circumstances:

a. Object has aged :-): When an object survives a certain number of GC cycles in the Young Gen, it gets promoted to the Old Gen. The JVM argument ‘-XX:MaxTenuringThreshold’ sets the maximum number of GC cycles an object may survive in the Young Gen before it is promoted. Example: if we specify ‘-XX:MaxTenuringThreshold=15’, then any object that survives 15 GC cycles in the Young Gen will be promoted to the Old Gen.

b. Memory Pressure in Young Gen: When the Young Gen runs short of memory (for example, when the survivor space cannot hold all the surviving objects), objects are promoted from the Young Gen to the Old Gen early, regardless of their age.

c. Direct Allocation of Large Objects: If an object is too large to fit within the spaces of the Young Gen, the JVM might bypass those areas entirely and allocate the object directly in the Old Gen.

d. Ergonomics: The JVM employs a set of adaptive strategies to optimize memory management. It continuously monitors the allocation patterns and dynamically adjusts the sizes of the Eden and Survivor spaces. When these areas become constrained, the garbage collector may opt for early promotion of objects to reduce the overhead of copying them between spaces.
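The first two promotion rules can be sketched with a toy simulation. This is a deliberately simplified model, not HotSpot’s actual algorithm: `TenuringSim`, its `survivorCapacity` parameter (counted in objects rather than bytes) and the object names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model of age-based and pressure-based promotion (not HotSpot's real logic).
class TenuringSim {
    static final class Obj {
        final String name;
        int age; // number of young collections survived so far

        Obj(String name) {
            this.name = name;
        }
    }

    final int maxTenuringThreshold; // plays the role of -XX:MaxTenuringThreshold
    final int survivorCapacity;     // assumed capacity, in objects, of the survivor space
    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    TenuringSim(int maxTenuringThreshold, int survivorCapacity) {
        this.maxTenuringThreshold = maxTenuringThreshold;
        this.survivorCapacity = survivorCapacity;
    }

    void allocate(String name) {
        young.add(new Obj(name));
    }

    // One young collection; `live` holds the names that are still referenced.
    void minorGc(Set<String> live) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!live.contains(o.name)) {
                continue; // unreachable: reclaimed
            }
            o.age++;
            boolean aged = o.age >= maxTenuringThreshold;           // rule (a)
            boolean noRoom = survivors.size() >= survivorCapacity;  // rule (b)
            if (aged || noRoom) {
                old.add(o);
            } else {
                survivors.add(o);
            }
        }
        young.clear();
        young.addAll(survivors);
    }

    public static void main(String[] args) {
        TenuringSim sim = new TenuringSim(3, 2);
        sim.allocate("session");
        sim.allocate("temp");
        for (int cycle = 1; cycle <= 3; cycle++) {
            sim.minorGc(Set.of("session")); // "temp" dies in the first cycle
            System.out.println("cycle " + cycle
                    + ": young=" + sim.young.size() + ", old=" + sim.old.size());
        }
    }
}
```

With a threshold of 3, the "session" object stays in the Young Gen for two cycles and is promoted on the third, while "temp" is reclaimed immediately. Lowering `survivorCapacity` in the model shows the memory pressure case: survivors that do not fit are promoted on their first collection.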

Do all GC algorithms have Young Gen & Old Gen?

OpenJDK has shipped 7 GC algorithms over the years. CMS GC was removed in JDK 14, so OpenJDK 22 ships the other six:

1. Serial GC

2. Parallel GC

3. CMS GC

4. G1 GC

5. Shenandoah GC

6. ZGC

7. Epsilon GC (can be ignored because it’s a no-op garbage collector)

Of these, Shenandoah GC and ZGC were designed as single-generation collectors, i.e. they don’t split the heap into two generations (Young and Old). All the other GC algorithms (Epsilon aside) are generational, i.e. they do have a Young Gen and an Old Gen. However, starting from JDK 21, ZGC also supports generations (enabled with ‘-XX:+ZGenerational’) for better performance.
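One way to check whether the collector you are running is generational is to list the JVM’s heap memory pools. The sketch below uses the standard `MemoryPoolMXBean` API; `HeapPoolsDemo` is a hypothetical class name. Pool names are collector-specific: G1, for instance, reports "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen", while a single-generation collector reports one heap pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the MXBean calls are standard JDK APIs.
class HeapPoolsDemo {
    // Names of all memory pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Separate Eden/Survivor/Old pools indicate a generational collector.
        heapPoolNames().forEach(System.out::println);
    }
}
```

Running this with different collector flags (for example ‘-XX:+UseSerialGC’ versus ‘-XX:+UseZGC’) is a quick way to see how each algorithm lays out the heap.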

Conclusion

I hope this post clarifies why the Java Heap Memory has two generations and what performance benefits our applications inherit from this brilliant architecture. If you would like to learn more about GC Tuning, please check out my online ‘JVM Performance Engineering & Troubleshooting Master Class’.
