RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results

Hohensee, Paul hohensee at amazon.com
Thu Jan 25 21:04:26 UTC 2018

Hi Erik & co, thanks for looking at this.

Would you be ok with pushing this fix (it really is a bug!) and then me doing a followup RFE? That way, I can backport the fix to 8u and eventually remove the patch I’ve already pushed to our OpenJDK8 internal release.

Some background. We used to measure heap occupancy using MemoryPoolMXBean.Usage and alarm on it when it got ‘too high’. The problem is that it’s an instantaneous measure and therefore includes an unknown amount of garbage, so you can’t determine where to set an alarm, because you don’t know how much of it will be collected in the near future. We want to detect steadily increasing memory use such as you’d see with a leak, so we’re switching to MemoryPoolMXBean.CollectionUsage, which is the usage immediately after the last GC that affected the memory pool. We have a synthetic metric that’s defined for all collectors as (sum of CollectionUsage.used for non-eden pools) / (sum of CollectionUsage.max-if-defined for all non-eden pools), and alarm on that. ‘max-if-defined’ means zero if JMX returns -1, which is the ‘undefined’ value.

The JMX API spec doesn’t specify what the memory pool or garbage collector names are, but the current names are de-facto part of the API, so if we change the existing ones, imo a CSR should be filed. To avoid that, how about we keep the existing memory pool names the same and, in the spirit of compatibility, make the new ones similar to the existing ones. I.e.,

“G1 Eden Space”
“G1 Survivor Space”
“G1 Old Gen”
“G1 Humongous Space”
“G1 Archive Space”

Alternatively, we can make the existing names aliases for the new ones, though “Regions” seems a bit G1-specific to me and doesn’t convey the relationship to the equivalent spaces in the other collectors. Especially if there’s an archive space for the other collectors (see below).

There’s no specification requirement that memory pools be disjoint. What do you think of defining the archive space as a subset of the old gen? That way existing code can stay the same (it typically just iterates over a collector’s pools as above), and new code can decide exactly what it wants to report. Do/should the non-G1 collectors have an archive space too? If so, we should just call it “Archive Space” and not make it G1 specific. I’d guess for non-G1 collectors it’s just the initial live prefix of the old gen and therefore ignored during a collection. In this scheme, the initial value of ‘used’ for all collectors’ oldgens would be the size of the archive space. All the archive space’s MemoryUsage attributes would have the same value.

Not attaching the archive space to any collector seems correct, because a collector’s memory pools are defined to be the ones that it affects.

Currently, Usage.max and CollectionUsage.max for all G1 memory pools other than the oldgen is -1, for “undefined”. For the oldgen it’s –Xmx. This makes it easy to generate aggregate metrics for all the collectors by just summing used values and dividing by the sum of max values as described above. It would be nice to keep this characteristic. Otherwise we’d have to write special-case code for G1, and change existing code to check for which JDK we’re running on.

I’m uncertain whether your definition of the memory pool usage fields is for MemoryPool.Usage or MemoryPool.CollectionUsage. Seems like the former, which is fine and matches the current definition, except for max.

All the existing collector names have an implicit “ Collector” suffix, e.g., “ConcurrentMarkSweep” means “ConcurrentMarkSweep Collector”. So, I’d use

“G1 Young”
“G1 Mixed”
“G1 Full”
“G1 Concurrent Cycle”

keep the existing “G1 Young Collection” as an alias for both “G1 Young” and “G1 Mixed”, and keep the existing “G1 Old Collection” as an alias for “G1 Full”. If you accept this and the memory pool name suggestions, then strictly speaking you don’t need a CSR.

The definition of the “G1 Concurrent Cycle” elapsed time makes sense to me. The young collection would still be reported separately, right?


On 1/25/18, 5:27 AM, "Erik Helin" <erik.helin at oracle.com> wrote:

    Hi Paul,
    thanks for your interest in this area and for your patch! The 
    GarbageCollectorMXBean and MemoryPoolMXBean support for G1 is in need of 
    some updates, so thanks for working on this.
    Looking at your patch, I'm not sure that this is the direction we want 
    to go in. I discussed this a bit with Thomas and Stefan J, and our 
    current line of thinking is the following:
    - Memory pools (MemoryMXBean):
       - "G1 Eden Regions"
       - "G1 Survivor Regions"
       - "G1 Old Regions"
       - "G1 Humongous Regions"
       - "G1 Archive Regions" (if CDS and/or AppCDS is used)
    `init` for these pools would be 0, `used` would be total size of the 
    "live" objects in the used regions of that type, `committed` the total 
    size of the used regions of that that type and `max` would be 
    MaxHeapSize. Note that "live" here means live from the GCs point of 
    view, i.e. an object might be dead in an old region but the GC will 
    consider that object live until a concurrent cycle has marked through 
    the heap and deemed it dead.
    - Collectors (GarbageCollectorMXBean):
       - "G1 Young Collector" with the pools
         - "G1 Eden Regions"
         - "G1 Survivor Regions"
         - "G1 Humongous Regions" (due to early reclamation)
       - "G1 Mixed Collector" with the pools
         - "G1 Eden Regions"
         - "G1 Survivor Regions"
         - "G1 Old Regions"
         - "G1 Humongous Regions" (due to early reclamation)
       - "G1 Full Collector" with the pools
         - "G1 Eden Regions"
         - "G1 Survivor Regions"
         - "G1 Old Regions"
         - "G1 Humongous Regions" (can collect empty humongous regions)
       - "G1 Concurrent Cycle" with the pools
         - "G1 Old Regions" (can collect empty old regions)
         - "G1 Humongous Regions" (can collect empty humongous regions)
    Note that with this design, the 
    GarbageCollectorMXBean::getCollectionTime() method for "G1 Concurrent 
    Cycle" would be the wall clock time from start of the first initial mark 
    to end of the last cleanup (also including the time of any eventual 
    young collection during the concurrent cycle). So 
    GarbageCollectorMXBean::getCollectionTime() would be a mix of concurrent 
    and STW time for the GarbageCollectorMXBean with name "G1 Concurrent Cycle".
    Also note that the MemoryPoolMXBean with name "G1 Archive Regions" will 
    not be attached to any GarbageCollectorMXBean, since those regions will 
    never be collected.
    What do you think about this design, would it work for your use case?
    If we want to go ahead with this design, then I think we might have to 
    file a CSR. David (who is the HotSpot CSR representative), do we have to 
    file a CSR for changing the names of MemoryPoolMXBeans and 
    On 01/20/2018 12:40 AM, Hohensee, Paul wrote:
    > I’d appreciate a review please.
    > Bug: https://bugs.openjdk.java.net/browse/JDK-8195115
    > Webrev: http://cr.openjdk.java.net/~phh/8195115/webrev.00/
    > The bug is that from the JMX point of view, G1’s incremental collector 
    > (misnamed as the “G1 Young Generation” collector) only affects G1’s 
    > survivor and eden spaces. In fact, mixed collections run by this 
    > collector also affect the G1 old generation.
    > This proposed fix is to record, for each of a JMX garbage collector's 
    > memory pools, whether that memory pool is affected by all collections 
    > using that collector. And, for each collection, record whether or not 
    > all the collector's memory pools are affected. After each collection, 
    > for each memory pool, if either all the collector's memory pools were 
    > affected or the memory pool is affected for all collections, record 
    > CollectionUsage for that pool.
    > For collectors other than G1 Young Generation, all pools are recorded as 
    > affected by all collections and every collection is recorded as 
    > affecting all the collector’s memory pools. For the G1 Young Generation 
    > collector, the G1 Old Gen pool is recorded as not being affected by all 
    > collections, and non-mixed collections are recorded as not affecting all 
    > memory pools. The result is that for non-mixed collections, 
    > CollectionUsage is recorded after a collection only the G1 Eden Space 
    > and G1 Survivor Space pools, while for mixed collections CollectionUsage 
    > is recorded for G1 Old Gen as well.
    > Other than the effect of the fix on G1 Old Gen MemoryPool. 
    > CollectionUsage, the only external behavior change is that 
    > GarbageCollectorMXBean.getMemoryPoolNames will now return 3 pool names 
    > rather than 2.
    > With this fix, a collector’s memory pools can be divided into two 
    > disjoint subsets, one that participates in all collections and one that 
    > doesn’t. This is a bit more general than the minimum necessary to fix 
    > G1, but not by much. Because I expect it to apply to other incremental 
    > region-based collectors, I went with the more general solution. I 
    > minimized the amount of code I had to touch by using default parameters 
    > for GCMemoryManager::add_pool and the TraceMemoryManagerStats constructors.
    > Tested by running the new jtreg test included in the webrev. I tried to 
    > use the submit repo, but it was out of order earlier today, so I’d be 
    > much obliged if someone could run it through mach5 and sponsor an 
    > eventual push. I successfully ran a JDK8 version of the patch through 
    > all the JDK8 jtreg tests as well as the JDK8 TCK.
    > Thanks,
    > Paul

More information about the hotspot-gc-dev mailing list