The last part of this series is about garbage collection logging and associated flags. The GC log is a highly important tool for revealing potential improvements to the heap and GC configuration or the object allocation pattern of the application. For each GC happening, the GC log provides exact data about its results and duration.
The flag -XX:+PrintGC (or the alias -verbose:gc) activates the “simple” GC logging mode, which prints a line for every young generation GC and every full GC. Here is an example:
[GC 246656K->243120K(376320K), 0,0929090 secs]
[Full GC 243120K->241951K(629760K), 1,5589690 secs]
A line begins (in red) with the GC type, either “GC” or “Full GC”. Then follows (in blue) the occupied heap memory before and after the GC, respectively (separated by an arrow), and the current capacity of the heap (in parentheses). The line concludes with the duration of the GC (real time in seconds).
Thus, in the first line, 246656K->243120K(376320K) means that the GC reduced the occupied heap memory from 246656K to 243120K. The heap capacity at the time of GC was 376320K, and the GC took 0.0929090 seconds.
The simple GC logging format is independent of the GC algorithm used and thus does not provide any more details. In the above example, we cannot even tell from the log if the GC moved any objects from the young to the old generation. For that reason, detailed GC logging is more useful than the simple one.
If we use -XX:+PrintGCDetails instead of -XX:+PrintGC, we activate the “detailed” GC logging mode which differs depending on the GC algorithm used. We start by taking a look at the output produced by a young generation GC using the Throughput Collector. For better readability, I split the output in several lines and indented some of them. In the actual log, this is just a single line and less readable for humans.
[PSYoungGen: 142816K->10752K(142848K)] 246648K->243136K(375296K),
[Times: user=0,55 sys=0,10, real=0,09 secs]
We can recognize a couple of elements from the simple GC log: We have a young generation GC (red) which reduced the occupied heap memory from 246648K to 243136K (blue) and took 0.0935090 seconds. In addition to that, we obtain information about the young generation itself: the collector used (orange) as well as its capacity and occupancy (green). In our example, the “PSYoungGen” collector was able to reduce the occupied young generation heap memory from 142816K to 10752K.
Since we know the young generation capacity, we can easily tell that the GC was triggered because otherwise the young generation would not have been able to accommodate another object allocation: 142816K of the available 142848K were already used. Furthermore, we can conclude that most of the objects removed from the young generation are still alive and must have been moved to the old generation: Comparing the green and blue output shows that even though the young generation was almost completely emptied, the total heap occupancy remained roughly the same.
The “Times” section of the detailed log contains information about the CPU time used by the GC, separated into user space (“user”) and kernel space (“sys”) of the operating system. Also, it shows the real time (“real”) that passed while the GC was running (which, however, with 0.09 is just a rounded value of the 0.0935090 seconds also shown in the log). If, like in our example, the CPU time is considerably higher than the real time passed, we can conclude that the GC was run using multiple threads. In that case, the CPU time logged is the sum of the CPU times of all GC threads. And indeed, I can reveal that the collector used 8 threads in our example.
Now consider the output of a full GC.
[ParOldGen: 232384K->232244K(485888K)] 243136K->241951K(628736K)
[Times: user=10,96 sys=0,06, real=1,53 secs]
In addition to details about the young generation, the log also provides us with details about the old and permanent generations. For all three generations, we can see the collector used, the occupancy before and after GC, and the capacity at the time of GC. Note that each number shown for the total heap (blue) is equal to the sum of the respective numbers of the young and old generations. In our example, 241951K of the total heap are occupied, 9707K of which are in the young generation and 232244K of which belong to the old generation. The full GC took 1.53 seconds, and the CPU time of 10.96 seconds in user space shows that the GC used multiple threads (like above, 8 threads).
The detailed output for the different generations enables us to reason about the GC cause. If, for any generation, the log states that its occupancy before GC was almost equal to its current capacity, it is likely that this generation triggered the GC. However, in the above example, this does not hold for any of the three generations, so what caused GC in this case? With the Throughput Collector, this can actually happen if GC ergonomics (see part 6 of this series) decides that a GC should be run already before one of the generations gets exhausted.
A full GC may also happen when it is explicitly requested, either by the application or via one of the external JVM interfaces. Such a “system GC” can be identified easily in the GC log because in that case the line starts with “Full GC (System)” instead of “Full GC”.
For the Serial Collector, the detailed GC log is very similar to that of the Throughput Collector. The only real difference is that the various sections have different names because other GC algorithms are being used (for example, the old generation section is called “Tenured” instead of “ParOldGen”). It is good that the exact names of the collectors are used because it enables us to conclude just from the log some of the garbage collection settings used by the JVM.
For the CMS Collector, the detailed log for young generation GCs is very similar to that of the Throughput Collector as well, but the same cannot be said for old generation GCs. With the CMS Collector, old generation GCs are run concurrently to the application using different phases. As such, the output itself is different from the output for full GCs. Additionally, the lines for the different phases are usually separated in the log by lines for young generation GCs that happen while the concurrent collection is running. Yet, being familiar with all the elements of GC logging that we have already seen for the other collectors, it is not difficult to understand the logs for the different phases. Only when interpreting durations we should be particularly careful and keep in mind that most of the phases run concurrently to the application. Thus, as opposed to stop-the-world collections, long durations for individual phases (or for a complete GC cycle) do not necessarily indicate a problem.
Ad we know from part 7 of this series, full GCs can still happen when the CMS Collector does not complete a CMS cycle in time. If that happens, the GC log additionally contains a hint as to what caused the full GC, e.g., the well-known “concurrent mode failure”.
In order to keep this article reasonably short, I will refrain from giving a detailed description of the CMS Collector GC log. Also, one of the actual authors of the collector has already published a great explanation here, which I highly recommend for reading.
-XX:+PrintGCTimeStamps and -XX:+PrintGCDateStamps
It is possible to add time and date information to the (simple or detailed) GC log. With -XX:+PrintGCTimeStamps a timestamp reflecting the real time passed in seconds since JVM start is added to every line. An example:
0,185: [GC 66048K->53077K(251392K), 0,0977580 secs]
0,323: [GC 119125K->114661K(317440K), 0,1448850 secs]
0,603: [GC 246757K->243133K(375296K), 0,2860800 secs]
And if we specify -XX:+PrintGCDateStamps each line starts with the absolute date and time when it was written:
2014-01-03T12:08:38.102-0100: [GC 66048K->53077K(251392K), 0,0959470 secs]
2014-01-03T12:08:38.239-0100: [GC 119125K->114661K(317440K), 0,1421720 secs]
2014-01-03T12:08:38.513-0100: [GC 246757K->243133K(375296K), 0,2761000 secs]
It is possible to combine the two flags if both outputs are desired. I would recommend to always specify both flags because the information is highly useful in order to correlate GC log data with data from other sources.
By default the GC log is written to stdout. With -Xloggc:<file> we may instead specify an output file. Note that this flag implicitly sets -XX:+PrintGC and -XX:+PrintGCTimeStamps as well. Still, I would recommend to set these flags explicitly if desired, in order to safeguard yourself against unexpected changes in new JVM versions.
A frequently discussed question is whether GC logging should be activated for production system JVMs. The overhead of GC logging is usually rather small, so I have a clear tendency towards “yes”. However, it is good to know that we do not have to decide in favor of (or against) GC logging when starting the JVM.
The HotSpot JVM has a special (but very small) category of flags called “manageable”. For manageable flags, it is possible to change their values at run time. All the flags that we have discussed here and that start with “PrintGC” belong to the “manageable” category. Thus, we can activate or deactivate GC logging for a running JVM whenever and as often as we want. In order to set manageable flags we can, for example, use the jinfo tool shipped with the JDK or use a JMX client and call the setVMOption operation of the HotSpotDiagnostic MXBean.
The entire series of the JVM Flags (Part 1 thru Part 8) is the creation of the maestro Patrick Peschlow. Special thanks to him for compiling these facts in such a fluent manner.