backport JEP 344-Abortable Mixed Collections for G1 to jdk11u
zanglin5 at jd.com
Fri Mar 29 11:02:07 UTC 2019
Thanks for your suggestion!
I will try to get those logs but that may take several days.
And I will try ZGC and Shenandoah too.
I may ask for help with more data in the future :)
Another possibility I can see is to raise the 32MB limit on region size. After searching the code, I think it may also be possible to support larger regions (maybe 64MB) for large heaps with some code changes, so that cross-region references become less common. It also seems that enlarging regions to 64MB may add a little memory overhead in the sparse PRT. What do you think?
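For scale, a rough back-of-the-envelope sketch of the region arithmetic behind this (the ~400GB heap size is from this thread; the ~2048-region ergonomic target and 32MB cap are G1's documented defaults):

```java
// Rough arithmetic behind the region-size discussion: G1 sizes regions so the
// heap splits into roughly 2048 of them, but caps the region size at 32 MB,
// so a ~400 GB heap ends up with far more regions than that target.
public class RegionSizing {
    static long regionCount(long heapBytes, long regionBytes) {
        return heapBytes / regionBytes;
    }

    public static void main(String[] args) {
        long heap = 400L << 30; // ~400 GB, the heap size discussed in the thread
        System.out.println(regionCount(heap, 32L << 20)); // 12800 regions at 32 MB
        System.out.println(regionCount(heap, 64L << 20)); // 6400 regions at 64 MB
    }
}
```

With 12800 regions instead of ~2048, per-region bookkeeping such as remembered sets grows correspondingly, which is why doubling the region size looks attractive here.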
On Mar 29, 2019, at 6:19 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
On Fri, 2019-03-29 at 06:38 +0000, 臧琳 wrote:
Dear Thomas and Charlie,
Thanks for your suggestions.
Let me describe my experiment in more detail. I am trying JDK11 on
a server node with a huge heap of ~400GB, and it needs to stay
responsive, so long GC pause times are not acceptable.
For some reason, if the server node pauses for a long time
(say 60s), the whole process is killed.
With JDK11 using G1, after some measurement I believe that keeping
MaxGCPauseMillis at 200ms is reasonable and works well when there are
no mixed GCs. I also want to mention that the server node is
allocating very heavily, so there are usually 2~3 young GCs per minute.
It would be really interesting to get a log file with a few of both the
"good" and the "bad" cases, with trace-level logging output (and showing
the other VM options).
This would help us to gauge live set size and allocation rates.
Depending on these, both Shenandoah and ZGC might be an alternative,
and/or other tunings.
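As a sketch of the kind of logging being asked for, JDK 11's unified logging could be enabled along these lines (the file name and exact tag selectors below are placeholders, not taken from the thread):

```shell
# Hypothetical sketch: collect detailed GC logs with JDK 11 unified logging.
# Selectors and the log file name are illustrative placeholders.
java -Xlog:gc*=debug:file=gc.log:time,uptime,level,tags \
     -Xlog:gc+heap=trace \
     -jar server.jar
```

The decorators (`time,uptime,level,tags`) make it possible to correlate pauses with allocation activity when reading the log afterwards.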
I suspect that, for example, with such a large heap you will get
remembered set coarsenings as described in the "High Update RS and Scan
RS Times" section of the tuning guide (see also there for how to diagnose them,
but please read the gotcha for diagnosing this in production - via jcmd
you could just add that logging temporarily though). If you can see
those, I would try "-XX:G1RSetRegionEntries=30000" to remove the
coarsenings and "-XX:G1RSetSparseRegionEntries=256" to fix remembered
set memory consumption; I kind of recommend doing the latter anyway.
There are also ways to decrease minimum pause time by bounding young
generation size a bit (-XX:MaxNewSize), but without logs that's just
too much guessing.
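Put together, the suggested tuning might look like the following command line (a sketch: `server.jar`, the pid `1234`, and the 20g young-generation cap are placeholders; the remembered-set entry values are the ones suggested above):

```shell
# Hypothetical sketch of the suggested remembered-set tuning: raise the
# fine-grained PRT entry limit to avoid coarsening, fix sparse-table memory
# consumption, and bound the young generation.
java -Xmx400g -XX:MaxGCPauseMillis=200 \
     -XX:G1RSetRegionEntries=30000 \
     -XX:G1RSetSparseRegionEntries=256 \
     -XX:MaxNewSize=20g \
     -jar server.jar

# Remembered-set logging can also be switched on temporarily in a running
# VM via jcmd, as mentioned above (1234 is a placeholder pid):
jcmd 1234 VM.log what=gc+remset=trace output=gc-remset.log
```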
The problems come with mixed GCs; I observed mainly two issues:
a. There are super long pause times for mixed GCs. Sometimes
I found the whole process killed because the mixed GC paused too long.
b. There are back-to-back long mixed GCs; for example, there
can be 2~3 mixed GCs within one minute, and each of them takes
~30s, so the process gets killed.
For issue a, it seems that the collection set may be too
large to be collected within a low pause time. So I have tried enlarging
-XX:G1MixedGCCountTarget to reduce the CSet for every mixed GC. But it
seems this option can introduce more mixed GCs overall, which
That's natural. During the mixed phase G1 minimizes the young gen size
(down to the minimum allowed, of course), which determines the frequency
of collections. If you have a huge allocation rate, you burn through the
available memory until the next GC very quickly.
They shouldn't take 30s though :) I suspect the remembered set
coarsening mentioned above to be the main cause here.
affected latency more or less when normal mixed GCs happened.
(Usually there is a batch of mixed GCs after concurrent marking, and
the number of mixed GCs in a batch seems to grow when enlarging
-XX:G1MixedGCCountTarget), and this may make issue b more severe.
Do not set G1MixedGCCountTarget too high.
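The trade-off discussed here can be sketched numerically. The model below is a simplification of G1's minimum-collection-set ergonomics (loosely modeled on HotSpot's `calc_min_old_cset_length`, not a verbatim copy; the candidate count is hypothetical): the old-region candidates left after marking are spread across at most `G1MixedGCCountTarget` mixed collections, so a larger target means smaller but more numerous mixed GCs.

```java
// Simplified sketch of how G1MixedGCCountTarget shapes each mixed collection
// set: candidates are divided over at most countTarget mixed collections.
public class MixedTarget {
    static long minOldRegionsPerMixedGc(long candidateRegions, int countTarget) {
        return Math.max(1, candidateRegions / countTarget);
    }

    public static void main(String[] args) {
        long candidates = 4000; // hypothetical candidate old regions after marking
        System.out.println(minOldRegionsPerMixedGc(candidates, 8));  // 500 regions per mixed GC
        System.out.println(minOldRegionsPerMixedGc(candidates, 16)); // 250 regions per mixed GC
    }
}
```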
I also tried the -XX:G1HeapWastePercent option, and it could
more or less help reduce the mixed GC pause, but it shows that if I
enlarge it too much, more mixed GCs happen.
That's natural too.
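For reference, a simplified sketch of the G1HeapWastePercent cutoff (the reclaimable and heap sizes below are illustrative): G1 keeps scheduling mixed collections only while the reclaimable bytes in the candidate regions still exceed that percentage of the total heap, so the flag decides how much garbage G1 is willing to leave behind when it ends the mixed phase.

```java
// Simplified sketch of the G1HeapWastePercent stopping condition for the
// mixed phase: continue while reclaimable space exceeds the waste threshold.
public class HeapWaste {
    static boolean continueMixedGcs(long reclaimableBytes, long heapBytes, int wastePercent) {
        return reclaimableBytes * 100 > heapBytes * wastePercent;
    }

    public static void main(String[] args) {
        long heap = 400L << 30;          // ~400 GB heap, as in the thread
        long reclaimable = 30L << 30;    // hypothetical: ~30 GB (7.5%) reclaimable
        System.out.println(continueMixedGcs(reclaimable, heap, 5));  // true: 7.5% > 5%
        System.out.println(continueMixedGcs(reclaimable, heap, 10)); // false: 7.5% < 10%
    }
}
```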
The default settings for many of these options are geared for somewhat
smaller applications as you may have noticed. We do not have many
"real-world" applications of that size for developing better auto-
tuning, apart from being a bit short in available time. Help in that
area is always appreciated though :)
I have come to think that those options are good, but they take
effect for all mixed GCs, even in cases where the mixed GC pause times
are acceptable. I tried to find a way to control the CSet by pause
time, and I found JEP 344. After trying it in my case, the long-pause
mixed GCs are reduced, at the cost of only introducing more low-pause
mixed GCs in that specific batch.
Within a given pause time, only so much object copying can fit.
We are constantly trying to push these boundaries; e.g.,
from internal testing of a JDK-8213108 prototype you will likely be
very positively surprised sometime in the future. But if your design is
based on evacuation in distinct pauses, there is a point where the
amount of data to be copied is just too large to fit into a reasonable
time; I do not know whether this is the case here.
PS: For issue b mentioned above, I think JEP 344 may not help a
lot. My data shows it comes from the update RS & scan RS times; the
tuning guide mentions that, so I will try it.
And thanks for guiding me. I have also cc'd this thread to jdk-updates-dev.