RFC (M) 8058968: Compiler time traces should be improved
vladimir.x.ivanov at oracle.com
Tue Sep 23 16:57:56 UTC 2014
Looks good to me. Thanks for taking care of this long-standing cleanup.
As a request for further improvements, I'd like to see the following:
* More fine-grained details about "Incremental Inline"
We do additional optimization passes during incremental inlining, so
it'd be great to see where we actually spend time.
* The following information segregated by compiler type (C1/C2):
    Total compiled methods    :     19359 methods
      Standard compilation    :     19298 methods
      On stack replacement    :        61 methods
    Total compiled bytecodes  :   4777417 bytes
      Standard compilation    :   4717650 bytes
      On stack replacement    :     59767 bytes
    Average compilation speed :     20330 bytes/s
    nmethod code size         : 107922240 bytes
    nmethod total size        : 201508520 bytes
* Additional statistics about Tiered compilation (compilation task
counts per level) would be useful here as well.
I'm fine if these enhancements go in as separate changes though.
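
To make the per-compiler and per-tier request a bit more concrete, here is a rough sketch of the kind of bookkeeping I have in mind. It is only an illustration: all names (CompilerKind, CompileStats, StatsTable, record) are made up and are not existing HotSpot identifiers, and it assumes the usual tiered levels 0..4. The same counters that feed the summary above are simply kept once per compiler type, plus a task count per level.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; none of these are HotSpot identifiers.
enum class CompilerKind : std::size_t { C1 = 0, C2 = 1, Count = 2 };

// Assumption: tiered compilation levels are numbered 0..4.
constexpr std::size_t kTierCount = 5;

// Counters for one compiler type, mirroring the summary fields.
struct CompileStats {
    std::uint64_t standard_methods = 0;
    std::uint64_t osr_methods = 0;
    std::uint64_t standard_bytes = 0;
    std::uint64_t osr_bytes = 0;
    double seconds = 0.0;
    std::array<std::uint64_t, kTierCount> tasks_per_tier{};

    std::uint64_t total_methods() const { return standard_methods + osr_methods; }
    std::uint64_t total_bytes() const { return standard_bytes + osr_bytes; }

    // Average compilation speed in bytecode bytes per second.
    double bytes_per_second() const {
        return seconds > 0.0 ? static_cast<double>(total_bytes()) / seconds : 0.0;
    }
};

// One CompileStats slot per compiler type.
class StatsTable {
public:
    // Account for one finished compilation task.
    void record(CompilerKind kind, std::size_t tier, bool is_osr,
                std::uint64_t bytecode_bytes, double elapsed_seconds) {
        CompileStats& s = stats_[static_cast<std::size_t>(kind)];
        if (is_osr) {
            s.osr_methods += 1;
            s.osr_bytes += bytecode_bytes;
        } else {
            s.standard_methods += 1;
            s.standard_bytes += bytecode_bytes;
        }
        s.seconds += elapsed_seconds;
        if (tier < kTierCount) {
            s.tasks_per_tier[tier] += 1;  // out-of-range tiers are ignored
        }
    }

    const CompileStats& of(CompilerKind kind) const {
        return stats_[static_cast<std::size_t>(kind)];
    }

private:
    std::array<CompileStats, static_cast<std::size_t>(CompilerKind::Count)> stats_{};
};
```

With something like this, the printing code would walk the table once per compiler type and emit the same block of lines for C1 and for C2, followed by the per-level task counts.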
On 9/23/14, 7:56 PM, Aleksey Shipilev wrote:
> Current C1/C2 compiler time tracing is old and rusty: we have separate
> VM options for C1 and C2 compilers; C2 tracking ignores CompilerBroker
> (mostly) and does the tracking on its own; bailouts and invalidations
> are ignored in time tracks; and significant parts of compiler code are
> not timed accurately.
> We need to fix that for any future compiler performance work to succeed.
> Our next stop in the Nashorn/warmup performance endeavor would probably
> be the native compilers' performance.
> I submitted this RFE to track:
> Current patch is here, and comments are welcome:
> This code was only compiled on my dev Linux x86_64, and hasn't yet been
> through JPRT.
> Summary of changes:
> * Removes -XX:-TimeCompiler and -XX:-TimeCompiler2, and uses
> -XX:+CITime and -XX:+CITimeVerbose consistently.
> * CompilerBroker is now responsible for printing the stats for both C1
> and C2.
> * CompilerBroker now counts the time spent in bailed out compilations
> as well. We are spending significant time there in Nashorn (bug to
> follow), and omitting these compilations from the total times ruins the
> summaries (i.e. "C1+C2 > Total" looks really weird).
> * C2 probes are turned into product probes: there is little reason to
> have them in fastdebug only, because our performance workloads are
> executed with product bits, and we need statistics there. C1 already
> does it in product mode.
> * C2 probes are collapsed into an array, and enumerated event names are
> used, like in C1.
> * Additional C1 and C2 probes are added to cover "blind spots".
> Compare the output: