RFR (S): 8067014: LinearScan::is_sorted significantly slows down fastdebug builds' performance

Filipp Zhinkin filipp.zhinkin at gmail.com
Thu Feb 18 15:31:45 UTC 2016


I've looked at how frequently misses actually occur and
how far the false positives are from the interval we're looking for.

I've also tried implementing interval_cmp so that it returns 0
if the difference between the intervals' "from" values is below some threshold:

All those misses with distance greater than 64 came from
javax.swing.plaf.synth.SynthStyle::populateDefaultValues [1].
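
To illustrate the idea of a tolerant comparison followed by a scan of the
neighbours, here is a minimal standalone sketch. It is not the code from the
webrev: the Interval struct, find_interval and the threshold parameter are
hypothetical names used only for this example, and the real C1 Interval class
is considerably richer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for C1's Interval; only the start position matters here.
struct Interval {
    int from;     // start position of the interval
    int reg_num;  // register number, unused by the search itself
};

// Returns the index of `key` in `sorted` (ordered by `from`), or -1 if absent.
// Elements whose `from` differs from the key's by less than `threshold`
// compare as equal, so the binary search may stop on a false positive.
int find_interval(const std::vector<const Interval*>& sorted,
                  const Interval* key, int threshold) {
    assert(threshold >= 1);
    std::size_t lo = 0, hi = sorted.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int diff = sorted[mid]->from - key->from;
        if (diff >= threshold) {
            hi = mid;          // key can only be to the left
        } else if (diff <= -threshold) {
            lo = mid + 1;      // key can only be to the right
        } else {
            // Hit within the tolerance: scan leftwards (including mid) ...
            for (std::size_t i = mid; ; --i) {
                if (sorted[i]->from < key->from) break;
                if (sorted[i] == key) return static_cast<int>(i);
                if (i == 0) break;
            }
            // ... and then rightwards for the exact interval.
            for (std::size_t i = mid + 1; i < sorted.size(); ++i) {
                if (sorted[i]->from > key->from) break;
                if (sorted[i] == key) return static_cast<int>(i);
            }
            return -1;
        }
    }
    return -1;
}
```

With threshold == 1 this degenerates to an exact comparison on "from", so the
scan only walks over intervals sharing the same start position. A larger
threshold makes the search stop earlier but lengthens the scan.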

I've also looked at other possible sources of slowness:
we spend about 10% of the time in LinearScan's verify_intervals method,
which checks that no two intervals both intersect
and share the same register [2].
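
For readers unfamiliar with that check, the following is a simplified sketch of
a pairwise verification of this kind. It is an illustration, not the HotSpot
code: the Interval struct and the function names are hypothetical, and each
interval is modelled as a single half-open range [from, to), whereas real C1
intervals consist of several ranges.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified interval: one half-open range [from, to) and a register.
struct Interval {
    int from;
    int to;
    int reg;
};

// Two half-open ranges overlap if each one starts before the other ends.
bool intersects(const Interval& a, const Interval& b) {
    return a.from < b.to && b.from < a.to;
}

// Returns true if no pair of intervals overlaps while sharing a register.
// The loop is quadratic in the number of intervals, which is why such a
// check is expensive on large methods.
bool verify_no_conflicts(const std::vector<Interval>& intervals) {
    for (std::size_t i = 0; i < intervals.size(); ++i) {
        for (std::size_t j = i + 1; j < intervals.size(); ++j) {
            // Compare registers first: it is the cheaper test and
            // rejects most pairs before the range check runs.
            if (intervals[i].reg == intervals[j].reg &&
                intersects(intervals[i], intervals[j])) {
                return false;
            }
        }
    }
    return true;
}
```

Ordering the cheap register comparison before the range test is one example of
how rearranging expressions can shave time off such a loop without changing
its quadratic shape. Whether the webrev does exactly this is an assumption.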

I don't see a way to significantly speed up such verification,
but I've slightly improved performance by rearranging some expressions.

Here is an updated webrev:

Also, unless the CommentedAssembly flag is explicitly turned off,
we generate comments for stubs even if we're not going to print them out.
Avoiding comment generation in that case would speed up compilation a bit more,
but I think it would be better to deal with it in a separate RFE.
The difference in code emission time is about 30% when CommentedAssembly is off
(~ 40s w/ CommentedAssembly, ~ 25s w/o CommentedAssembly).

[1] http://hg.openjdk.java.net/jdk9/hs-comp/jdk/file/6c649a7ac744/src/java.desktop/share/classes/javax/swing/plaf/synth/SynthStyle.java#l68
[2] http://hg.openjdk.java.net/jdk9/hs-comp/hotspot/file/cffca6de2c45/src/share/vm/c1/c1_LinearScan.cpp#l3226

On Fri, Feb 12, 2016 at 7:08 PM, Filipp Zhinkin
<filipp.zhinkin at gmail.com> wrote:
> Hi Aleksey,
> On Fri, Feb 12, 2016 at 3:24 PM, Aleksey Shipilev
> <aleksey.shipilev at oracle.com> wrote:
>> Hi Filipp,
>> On 02/12/2016 02:47 PM, Filipp Zhinkin wrote:
>>> here is a new webrev: http://cr.openjdk.java.net/~fzhinkin/8067014/webrev.01/
>> The webrev seems incomplete: it has only hotspot.patch in it, but no
>> other views?
> It seems like only wdiff's are empty for some reason.
> What else is missed out there?
>> I wonder how many intervals have the same "from", prompting you to
>> wiggle around looking for the exact interval?
> Well, there should be (relatively) many intervals with "from" == 0
> created for physical registers.
> For virtual registers there could be few intervals that share the same
> "from" value:
> it depends on amount of temporary registers required by an operation
> and amount of outputs it produces.
> So we may simply scan intervals from beginning if key's from value is 0.
>> Can we define
>> "interval_cmp" so that "(interval_cmp(i1, i2) == 0) iff (i1 == i2)",
> No, unfortunately we can't, because intervals are ordered only by "from" value.
>> or at least make the false positives less frequent with more extensive
>> interval key (assuming collisions are indeed problematic)?
> Not sure that I've got you.
> Nevertheless, I'll run CTW and check how many false positives are
> actually found.
>>> I've hacked VM sources a bit to run CTW with product bits and C1
>>> compilation time on my x86_64 linux laptop
>>> slowed down by 0.4% (from 51029 ± 306 ms to 51230 ± 293 ms). Please
>>> let me know if it is too slow.
>> I think this is within the error margin, and therefore statistically
>> insignificant. Even if it was significant, 0.4% is okay for compilation
>> time regression in C1.
>>> With fastdebug bits, the provided patch roughly halves C1 compilation time.
>> This is a very good improvement, but we need to see if that's the end of
>> it, or we can squeeze even more with a few changes. I would suggest
>> running the CTW scenario under Solaris Studio Performance Analyzer (see
>> e.g.
>> http://shipilev.net/blog/2016/arrays-wisdom-ancients/#_meet_solaris_studio_performance_analyzer).
> Thank you for the suggestion, I'll check it.
> Regards,
> Filipp.
>> Thanks,
>> -Aleksey
