Time taken by every single GC event is reported in the GC log. In every GC event, there are ‘user’, ‘sys’, and ‘real’ time. What does these time mean? What is the difference between them?
- ‘real’ time is the total elapsed time of the GC event. This is basically the time that you see on the clock.
- ‘user’ time is the CPU time spent in user-mode code (outside the kernel)
- ‘Sys’ time is the amount of CPU time spent in the kernel. This means executing CPU time spent in system calls within the kernel, as opposed to library code, which is still running in user-space.
To learn more about these times, please refer to this article.
In normal (all most all) GC events, user time will be greater than sys time. It’s because, in a GC event, most of the time is spent within the JVM code and only very less time is spent in the kernel. But in certain circumstances, you might see sys time to be greater than user time.
[Times: user=0.04 sys=0.35, real=0.42 secs]
Here you can notice the sys time to be 0.35 seconds which is considerably higher than user time 0.04 seconds.
If you observe multiple occurrences of this scenario in your GC log then it might be indicative of one of the following problems:
1. Operating System problem
Operating System exceptions such as page faults, misaligned memory references, and floating-point exceptions consume large amount of system time. Make sure your OS is running with proper patches, upgrades, and sufficient CPU/memory/disk space.
2. VM related Problem
If your application is running in a virtualized environment, may be because of nature of the emulation sys time can be higher than user time. Make sure virtualized environment is NOT OVERLOADED with too many environments and also ensure that there are adequate resources available in the Virtual Machine for your application to run
3. Memory constraint
In Linux operating system, JVM and OS will request large pages (like in the magnitued of 2mb size/page). If Operatin system can not find contiguous free space to allocate 2mb page, then operating system will stop all stop all the process that is running and starts moving around the data to find contiguous free space. If OS does this then ‘sys’ will be typically higher than real time.
If this turns out to be case either allocate sufficient memory on the machine or reduce number of processes running on the machine.
4. Disk I/O Pressure
JVM creates and writes statistics about safepoints and garbage collection events in the /tmp/hsperfdata_(username) file. This file gets updated when GC events run. Occasionally linux kernel threads blocks the GC threads from updating this file when there is heavy disk I/O. Thus very heavy disk I/O activity can increase GC pause time. When this problem happen either real time will be much greater than user + Sys time or sys time will be higher than user time.
If this turns out to be case, try to reduce disk I/O activity. You can measure disk I/O activity using lsof.