RFR: 8034852: Shrinking of Metaspace high-water-mark causes incorrect OutOfMemoryErrors or back-to-back GCs
erik.helin at oracle.com
Wed Apr 30 11:18:09 UTC 2014
This patch solves a rather tricky problem with the sizing of Metaspace.
The issue happens when the GC threshold for Metaspace (called
"capacity_until_GC" in the code) becomes less than the committed memory
for Metaspace. Any call to Metaspace::allocate that requires committing
more memory will then fail in MetaspaceGC::allowed_expansion, because
capacity_until_GC() < MetaspaceAux::committed_memory(). The effect will be a
full GC and after the GC we try to expand and allocate. After the
expansion and before the allocation, one of two things can happen:
1. capacity_until_GC is larger than the committed memory after the
   expansion. The allocation will now succeed, but the next allocation
   requiring a new chunk will *again* trigger a full GC. This pattern
   will repeat itself for each new allocation request requiring a new
   chunk.
2. capacity_until_GC is still less than the committed memory even
   after the expansion. We throw a Java OOME (incorrectly).
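
To make the two outcomes concrete, here is a minimal toy model of the check. It is a sketch and not the HotSpot code: MetaspaceModel, Outcome, allocate_new_chunk and the post_gc_increment parameter are hypothetical names made up for illustration, and the full GC is assumed to free nothing.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; this is NOT the HotSpot implementation.
struct MetaspaceModel {
    std::size_t capacity_until_gc;  // GC threshold ("capacity_until_GC")
    std::size_t committed;          // committed memory
};

// Bytes that may still be committed before reaching the GC threshold.
// Zero whenever the threshold is at or below the committed memory.
std::size_t allowed_expansion(const MetaspaceModel& m) {
    return m.capacity_until_gc > m.committed
               ? m.capacity_until_gc - m.committed
               : 0;
}

enum class Outcome { Allocated, AllocatedAfterFullGC, OutOfMemory };

// Serve one request that needs `chunk` bytes of newly committed memory.
// `post_gc_increment` (hypothetical) is how far the threshold is raised
// when we "expand" after the full GC.
Outcome allocate_new_chunk(MetaspaceModel& m, std::size_t chunk,
                           std::size_t post_gc_increment) {
    assert(chunk > 0);
    if (allowed_expansion(m) >= chunk) {
        m.committed += chunk;
        return Outcome::Allocated;
    }
    // A full GC would run here; the model assumes it frees nothing.
    m.capacity_until_gc += post_gc_increment;
    if (allowed_expansion(m) >= chunk) {
        m.committed += chunk;
        return Outcome::AllocatedAfterFullGC;  // outcome 1
    }
    return Outcome::OutOfMemory;               // outcome 2
}
```

In this model, starting from a threshold of 100 and 120 committed, two consecutive requests of 10 with an increment of 30 each need a full GC first (outcome 1). Starting from 100 and 200 committed, the same request ends in OutOfMemory (outcome 2), even though nothing is actually exhausted.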
How can the GC threshold for Metaspace be less than the committed
memory? The problem is that MetaspaceGC::compute_new_size uses the field
_allocated_capacity for describing the amount of memory in Metaspace
that is "in use". _allocated_capacity does not consider the memory in
the chunk free lists to be "in use", since memory in the chunk free
lists is supposed to be available for new allocations. The problem is
that the chunk free lists can become fragmented, and then the memory is
not available for all kinds of allocations.
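
The fragmentation effect can be illustrated with a small sketch. It rests on two assumptions that are mine and not taken from the code: committed memory is modelled as just the handed-out chunks plus the free lists, and free chunks are never coalesced. ChunkModel and the helper functions are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Toy model only; this is NOT the HotSpot implementation.
struct ChunkModel {
    std::size_t allocated_capacity;        // handed-out chunks, cf. _allocated_capacity
    std::vector<std::size_t> free_chunks;  // sizes of the chunks on the free lists
};

// Total bytes sitting on the free lists.
std::size_t free_list_bytes(const ChunkModel& m) {
    return std::accumulate(m.free_chunks.begin(), m.free_chunks.end(),
                           std::size_t{0});
}

// Simplification: committed = handed-out chunks + free-list chunks.
std::size_t committed_memory(const ChunkModel& m) {
    return m.allocated_capacity + free_list_bytes(m);
}

// Without coalescing, a request can reuse free-list memory only if a
// single free chunk is at least as large as the request.
bool can_reuse_free_chunk(const ChunkModel& m, std::size_t request) {
    return std::any_of(m.free_chunks.begin(), m.free_chunks.end(),
                       [request](std::size_t c) { return c >= request; });
}
```

With 64 bytes handed out and four free chunks of 8 bytes each, 32 bytes are nominally free, yet a request for a 16-byte chunk cannot be served from the free lists and has to commit new memory. The "in use" figure (64) then sits well below the committed figure (96).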
This patch changes MetaspaceGC::compute_new_size to use
MetaspaceAux::committed_memory for describing how much memory is
"in use". The effect will be that memory in the chunk free lists will
now be considered "in use" (but will of course be used for future
allocations where possible). This will prevent capacity_until_GC from
shrinking below the committed memory "by definition", since
capacity_until_GC can't be lower than the memory that is "in use".
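
The "by definition" argument can be sketched as follows. The sizing rule here (threshold = in-use bytes plus a fixed headroom) is a deliberately simplified, hypothetical stand-in for what compute_new_size does; only the choice of the "in use" input mirrors the change described above.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; this is NOT the HotSpot implementation.
struct Usage {
    std::size_t allocated_capacity;  // excludes the chunk free lists
    std::size_t committed;           // includes the chunk free lists
};

// Hypothetical sizing rule: threshold = "in use" bytes + fixed headroom.
std::size_t new_capacity_until_gc(std::size_t in_use, std::size_t headroom) {
    return in_use + headroom;
}

// Before the patch: "in use" is the allocated capacity, so the result
// can end up below the committed memory.
std::size_t threshold_before_patch(const Usage& u, std::size_t headroom) {
    return new_capacity_until_gc(u.allocated_capacity, headroom);
}

// After the patch: "in use" is the committed memory, so the result can
// never be below the committed memory.
std::size_t threshold_after_patch(const Usage& u, std::size_t headroom) {
    std::size_t threshold = new_capacity_until_gc(u.committed, headroom);
    assert(threshold >= u.committed);
    return threshold;
}
```

For 64 bytes of allocated capacity, 96 bytes committed and a headroom of 16, the old input gives a threshold of 80 (below committed), while the new input gives 112.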
Based on the results from the perf testing (see below), this change has
no performance impact.
- Ad-hoc testing:
  - Parallel Class Loading testlist
  - Metaspace testlist
  - GC nightly testlist
- Perf testing:
  - Derby regression tests