CRR (M): 7132029: G1: mixed GC phase lasts for longer than it should
tony.printezis at oracle.com
Tue Feb 14 15:18:29 UTC 2012
I'd like a couple of (quick please!) code reviews for this change:
The policy that chose old regions for collection during mixed GCs was
buggy and could exhibit many strange behaviors. One of them was to go
into mixed GC mode, do "mixed GCs" (which did not actually collect any
old regions), and stay in that mode for a while, preventing subsequent
marking cycles from starting. This frequently caused evac failures and
Full GCs.
The logic for deciding which old regions to add to the CSet was spread
across many places. I simplified that and put it mainly in the loop in
the finalize_cset() method (renamed from choose_collection_set();
"finalize" is more descriptive of what the method does). I think the new
version is easier to follow.
Description of the changes:
* I have introduced a min and max number of old regions to be added to
the CSet of each mixed GC. The max number is calculated as a percentage
of the heap size (default: 10% - thanks to Jesper for suggesting a heap
percentage for this) and ensures that collections will not get super
long. The min number is calculated based on a desired number of mixed
GCs after a marking cycle (default: 4) and ensures that each mixed GC
will make some progress in collecting old regions (so that the candidate
old regions are collected in a timely fashion).
* We no longer add any regions with live percentage over a threshold
(default: 95%) to the CSet chooser and we do not consider them for
* I stopped using the cache in the CSet chooser class (it was used to
re-sort regions according to their latest GC efficiency), since it's not
necessary any more: we'll now go through all the candidate old regions
in the CSet chooser class and we no longer have a heuristic for when to
stop mixed GCs based on GC efficiency.
* I introduced the notion of "reclaimable bytes" on HeapRegions, which
not only includes the predicted garbage bytes on each region, but also
the unused space in the area [top,end) which will also be reclaimed. The
CSet chooser class now keeps track of the total reclaimable bytes of all
the regions that it tracks. If that total falls under a certain threshold
(default: 1% of the heap) we stop doing mixed GCs as we'll reclaim very
little out of the remaining candidate old regions.
* I eliminated the case where a mixed GC starts and picks no old regions
to collect (I hope!). Now the information on the CSet chooser (remaining
region num / remaining reclaimable bytes) can tell us whether we want to
do more mixed GCs or not. If we do, it's guaranteed that there will be
old regions to collect. Because of that, I removed the
_should_revert_to_young_gcs flag as it's not needed any more. It was
used so that the CSet choice code could flag that we should stop doing
mixed GCs at the end of the GC. Now, we can decide that by just looking
at the information on the CSet chooser (the heuristic is encapsulated in
* I also changed the policy in the case where a fixed young gen is used.
Before, we were a bit arbitrary: during mixed GCs we'd cut the young gen
size in half and fill up the rest with old regions. I thought that
trying to re-use the "desired mixed GC number" heuristic for this made
sense. So, I now add the min number of old regions to each mixed GC (as
long as we don't go over the max). I think this is a more reasonable and
less arbitrary heuristic than what we had before and it's more
consistent with the non-fixed young gen policy. I didn't know whether I
should decrease the young gen size for each mixed GC, as we did before,
and by how much. So, I now leave it unchanged (the user decided they
want a fixed young gen, they will now always get it!).
* Updated all related ergo output to print the new information.
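
To make the heuristics in the list above concrete, here is a rough
sketch in Python. It is not the HotSpot code. Every name in it (Region,
add_candidates, should_continue_mixed_gcs, min_old_regions,
max_old_regions, choose_old_regions, fits_pause_budget) is hypothetical,
and several details are assumptions rather than facts from the change:
that live percentage is measured against region capacity, that the min
is the candidate count divided by the desired mixed GC number rounded
up, that the max is a percentage of the heap's region count, and that
regions between min and max are added only while some pause budget
check allows it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Region:
    capacity: int  # bytes in [bottom, end)
    used: int      # bytes in [bottom, top)
    live: int      # predicted live bytes from the last marking cycle

    def reclaimable_bytes(self) -> int:
        # Predicted garbage below top plus the unused space in [top, end).
        garbage = self.used - self.live
        unused = self.capacity - self.used
        return garbage + unused

    def live_percent(self) -> float:
        # Assumption: live percentage is relative to region capacity.
        return 100.0 * self.live / self.capacity


def add_candidates(regions: List[Region],
                   live_threshold_percent: float = 95.0) -> List[Region]:
    # Regions over the live threshold never become candidates.
    return [r for r in regions if r.live_percent() <= live_threshold_percent]


def should_continue_mixed_gcs(candidates: List[Region], heap_bytes: int,
                              threshold_percent: float = 1.0) -> bool:
    # Stop when no candidates remain or when the total reclaimable bytes
    # fall under the given percentage of the heap.
    if not candidates:
        return False
    total = sum(r.reclaimable_bytes() for r in candidates)
    return total * 100 >= heap_bytes * threshold_percent


def min_old_regions(candidates_after_marking: int,
                    desired_mixed_gc_num: int = 4) -> int:
    # Assumed formula: spread the candidates over the desired number of
    # mixed GCs, rounding up.
    d = desired_mixed_gc_num
    return (candidates_after_marking + d - 1) // d


def max_old_regions(heap_region_count: int, max_percent: int = 10) -> int:
    # Assumed formula: a percentage of the heap's region count, rounded
    # up, and at least one region.
    return max(1, (heap_region_count * max_percent + 99) // 100)


def choose_old_regions(
        candidates: List[Region], min_num: int, max_num: int,
        fits_pause_budget: Callable[[List[Region], Region], bool],
) -> List[Region]:
    # Walk the candidates in order: always take up to min_num, keep going
    # while the (assumed) pause budget check passes, never exceed max_num.
    chosen: List[Region] = []
    for region in candidates:
        if len(chosen) >= max_num:
            break
        if len(chosen) >= min_num and not fits_pause_budget(chosen, region):
            break
        chosen.append(region)
    return chosen
```

With this shape, the decision to do another mixed GC depends only on
the chooser's remaining candidates, and any mixed GC that does start
has at least one old region available to pick.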
Like the marking changes, this change leaves a fair amount of unused
code that we can remove. I had to draw the line somewhere on how much I
should remove now and how much I should leave for later. I opened a new
CR for the remaining cleanup:
7145441: G1: collection set chooser-related cleanup
Many thanks to Charlie for doing some last-minute performance testing
with my workspace. He was the one who discovered the problem with the
never-ending mixed GCs and this fix eliminated the problem.
Correctness-testing-wise, I've run overnight with the GC test suite.