RFR (L): 8244603 and 8238858: Improve young gen sizing
thomas.schatzl at oracle.com
Tue May 19 13:37:56 UTC 2020
can I have reviews for this change that improves young gen sizing a
lot to prepare for heap shrinking during young gc (JDK-8238687) ;)
In particular, with this change the following issues two related issues
* 8238858: G1 Mixed gc young gen sizing might cause the first mixed gc
to immediately follow the prepare mixed gc
* 8244603: G1 incorrectly limiting young gen size when using the reserve
can result in repeated full gcs
These have been grouped together because it is too hard to separate them
out as the bugs required significant rewrite in the young gen sizing.
This results in G1 following GCTimeRatio much better than before,
leading to less erratic heap expansion. That is, constant loads do not
result in that much overcommit any more.
At end of gc and when the remembered set sampling thread (re-)evaluates
the young gen G1 calculates two values:
- the *desired* (unconstrained) size of the young gen;
desired/unconstrained meaning that these values are not limited by
actually existing free regions. This value is interesting for adaptive
IHOP so that it (better) converges to a fixed value. (There is some
bugfix, JDK-8238163, coming for this to fix a few issues with that)
- the actual *target* size for the young gen, i.e. after taking
constraints in available free regions into account, and whether we are
currently already needing to use parts of the reserve or not.
Some problems that were fixed with the current code:
- during calculation of the *desired* young gen length G1 basically
sizes the young gen during mixed gc to the minimum allowed size always.
This causes unnecessary spikes in short/long term ratios, causing lots
of heap increases even with a constant load.
Since we do not shrink the heap yet during regular gcs, this typically
ended up in fully expanding the heap (unless the reclamations during
Remark actually reclaimed something, but the equilibrium of committed
heap between these two mechanisms is much higher).
E.g. on specjbb2015 fixed IR (constant load) with "small" and "medium"
load G1 will use half the heap now while staying < GCTimeRatio.
- at end of gc g1 calculates the young gen for the *next* gc, i.e.
during the prepare mixed gc g1 should already use the "reduced" amount
of regions (that's JDK-8238858); similarly the *last* mixed gc in the
mixed gc phase should already use the calculation for the young phase.
The current code does not. This partially fixes some "at end of mixed gc
it takes a while for g1 to achieve previous young gen size again" issues.
(There is a CR for that, but as mentioned, this change does not
completely solve it).
- there were some calculations to ensure that "at least one region will
be allocated" every time g1 recalculates young gen but that really
resulted in g1 increasing the young gen by at least one. You do not
always want that, particularly since we regularly revise the young gen.
What you want is a minimum desired *eden* size.
- the desired and target young gen size calculation was done in a single
huge method. This change splits up the code a bit.
- the code that calculated the actual *target* young length has been
very inconsistent in handling the reserve. I.e. the limit to the
actually available free regions only applied in some cases, and notably
not if you were already using the reserve, causing strange behavior
where at least the calculated young gen target length was higher than
available free regions. This could cause full gcs.
- I added some detailed trace-level logging for the ergonomic decisions
which really helps when debugging issues, but might be too
large/intrusive for the code - I am not sure whether to leave it in.
Reviewing: I think the best entry point for this change is
G1Policy::update_young_list_target_length(size_t) where the new code
first calculates a desired young gen length and then target young length.
Some additional notes:
- eden before a mixed gc is calculated by predicting the time for the
minimum amount of old gen regions we will definitely take
(min_old_cset_length) and then letting eden fill up the remainder.
Often this results in significantly larger young gens than before this
change, at worst young gen will be limited to minimum young gen size (as
before). Overall it works fairly well, i.e. gives much smoother cpu
usage. There is a caveat to that in that it depends on accuracy of
predictions. Since G1 predictions are often too high, we might want to
take more a lot more optional regions in the future to not be required
to early terminate the mixed gc
I.e. I have often seen that we do not use up the 200ms pause time goal.
- currently, and like before, MMU desired length overrides the pause
time desired length. I.e. *if* a GCPauseTimeIntervalMillis is set, the
spacing between gcs is more important than actual pause times. The
difference is that now that there is an explicit comment about this
behavior there :)
- when eating into the reserve (last 10% of heap), we at most allow use
of the sizer's min young length or half of the reserve regions (rounded
up!), whichever is higher. This is an arbitrary decision, but since
supposedly at that time we are already close to the next mixed gc due to
adaptive ihop, so we can take more.
- note that all this is about calculating a *target* length, not the
actual length. Actual length may be higher e.g. due to gclocker.
- there may be some out-of-box performance regressions since G1 does not
expand the heap that much more. Performance can be restored by either
decreasing GCTimeRatio, or better setting minimum heap sizes.
Actually, in the future, when shrinking is implemented (JDK-8238687),
these may be more severe (in some benchmarks, actual gc usage is still
<2%). I will likely try to balance that with decreasing default
GCTimeRatio value in the future.
mach5 tier1-5, perf testing
More information about the hotspot-gc-dev