RFR (M): 8212206: Refactor AdaptiveSizePolicy to separate out code related to GC overhead
per.liden at oracle.com
Wed Dec 19 14:06:27 UTC 2018
On 12/19/18 11:07 AM, Thomas Schatzl wrote:
> On Wed, 2018-12-12 at 18:30 +0800, Man Cao wrote:
>> Addressed several comments. New webrev:
>> Diff from webrev.00:
> Looks good, but see the comments below too.
>>>> Assuming that all collectors want to implement this, and actually
>>>> need to I am leaning towards doing so. However the ZGC/Shenandoah
>>>> people might object to this.
>> I haven't moved the GCOverheadChecker instance to CollectedHeap yet.
>> Should I wait for ZGC/Shenandoah people to give some green light?
I don't foresee that this will be implemented, or even makes sense, for
ZGC. As I see it, this is only a thing STW collectors. For that reason,
I don't think it belongs in CollectedHeap. Keeping it as a separate
utility class for collectors that want to use it sounds better.
I haven't looked very closely at the patch, but couldn't help to notice
that the option is called "GCOverheapLimitThreshold" (and
"AdaptiveSizePolicyGCTimeLimitThreshold" before that), which is a
tautology and a not very good description of what it is.
How about we take the opportunity to clean this up and completely ditch
the "gc_overhead_limit_count" thing and get rid of this option? It's a
"develop" option, so it's not available to normal users anyway. Has
anyone of you ever used this option and actually find it valuable?
> I cc'ed zgc-dev and shenandoah-dev.
>>> Creating a CSR and getting it approved is not a big deal; it may
>>> even be useful as it clearly communicates the change to the users.
>>> Additionally I think due to that translation table I mentioned, the
>>> old name can still be used I think.
>> As for the hsperfdata counter sun.gc.policy.gcTimeLimitExceeded, I
>> found two issues:
>> (a) The translation table in aliasmap seems to mainly target JDK-
>> internal usage of the counters.
>> Only the PerfDataBufferImpl.findByName() method makes use of the
>> aliasmap. There are use cases that doesn't work with the aliasmap.
>> $ jstat -J-Djstat.showUnsupported=true -name java.ci.totalTime <pid>
>> // This works
>> $ jstat -J-Djstat.showUnsupported=true -name hotspot.ci.total.time
>> <pid> // This doesn't work
>> This is because the "jstat -name" would invoke the
>> PerfDataBufferImpl.findByPattern() method,
>> which does not take the aliasmap into account.
>> In addition, there are independent implementations that read
>> file directly, e.g.:
>> And Google internally has a Java implementation that does the job
>> (but uses Guava library).
>> These tools do not support aliasmap.
> Okay, I admit I do not know much about these counters...
>> As for this counter, fortunately I found it hardly used anywhere in
>> OpenJDK or across Google's depot.
>> And its current implementation is not that useful, as described
>> (b) This counter contains a boolean value that is set at every GC.
>> This makes its usefulness limited, as it is very hard to catch the
>> moment when it is set to 1. When a full GC sets it to 1 and throws an
>> OOM exception due to GC overhead exceeded, the JVM could subsequently
>> trigger another full GC and reset the counter to 0, then terminates
>> due to the OOM exception.
>> If -XX:PerfDataSaveFile=foo.hsperf is set, foo.hsperf would contain 0
>> for this counter in this case,
>> which is quite unexpected.
>> I'd propose to change this counter to a cumulative counter, i.e, the
>> total number of GCs that trigger GC-overhead-limit-exceeded event,
>> and rename this counter as the same time.
>> I think it is cleaner to do this change in a separate RFE and CSR.
>> What do you think?
> I agree to separate this change out.
More information about the hotspot-dev