RFR(L): 8186027: C2: loop strip mining
nils.eliasson at oracle.com
Thu Nov 23 21:59:48 UTC 2017
On 2017-11-23 15:18, Roland Westrelin wrote:
> Hi Vladimir,
>> I am running testing again. But if this repeats, the presence of this
>> Sparse.small regression suggests to me that maybe we should keep this
>> optimization off by default - keep UseCountedLoopSafepoints false.
>> We may switch it on later with additional changes which address regressions.
>> What do you think?
> If the inner loop runs for a small number of iterations and the compiler
> can't statically prove it, I don't see a way to remove the overhead of
> loop strip mining entirely. So I'm not optimistic the regression can be
Agreed. In other words: loop strip mining adds a guarantee that
time-to-safepoint won't be too long, and that guarantee has a small cost.
The current situation is that we get some extra performance with
UseCountedLoopSafepoints off by default, but we let some users have a bad
experience when they encounter long time-to-safepoint pauses or failures
(https://bugs.openjdk.java.net/browse/JDK-5014723). I would rather turn the
tables and have loop strip mining on, and let the power users experiment
with turning it off for an uncertain performance boost.
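To make the trade-off concrete, here is a rough source-level sketch of what
strip mining does to a counted loop. It is only an illustration: C2 performs
the transformation on its intermediate representation, not on Java source,
and the safepoint poll is emitted by the JIT rather than written by hand.
The class name, method names and the STRIP constant are hypothetical; STRIP
stands in for -XX:LoopStripMiningIter, set here to the 1000 used in one of
the runs quoted in this thread.

```java
final class StripMiningSketch {
    // Hypothetical stand-in for -XX:LoopStripMiningIter (assumed value).
    static final int STRIP = 1000;

    // Original counted loop: with no safepoint poll inside, a thread can
    // stay in this loop for its whole trip count before reaching a safepoint.
    static long plainSum(int[] a) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Strip-mined shape: the inner loop runs at most STRIP iterations
    // without a poll, and the outer loop is where a poll would sit.
    static long stripMinedSum(int[] a) {
        long sum = 0;
        int start = 0;
        while (start < a.length) {
            // Overflow-safe end of the current strip.
            int end = start + Math.min(STRIP, a.length - start);
            for (int i = start; i < end; i++) {
                sum += a[i];
            }
            // Conceptually, the JIT-emitted safepoint poll goes here,
            // once per strip instead of once per iteration.
            start = end;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[2500];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Both shapes compute the same result.
        System.out.println(plainSum(data) == stripMinedSum(data));
    }
}
```

The outer loop is the part that cannot be optimized away when the trip
count is small but not statically known, which is the overhead Roland
describes above.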
> If loop strip mining defaults to false, would there be any regular
> testing on your side?
We would have to add some.
> It seems to me that it would make sense to enable loop strip mining
> depending on what GC is used: it makes little sense for parallel gc but
> we'll want it enabled for Shenandoah for instance. Where does G1 fit? I
> can't really say and I don't have a strong opinion. But as I understand,
> G1 was made default under the assumption that users would be ok trading
> throughput for better latency. Maybe, that same reasoning applies to
> loop strip mining?
Scimark.sparse.small shows a regression, but a long time-to-safepoint
has a throughput cost in some settings, as in the companion benchmark
scimark.sparse.large. Numbers using G1:
-XX:-UseCountedLoopSafepoints (default) ~86 ops/m
-XX:+UseCountedLoopSafepoints ~106 ops/m
-XX:+UseCountedLoopSafepoints -XX:LoopStripMiningIter=1000 ~111 ops/m
I would prefer having it on by default, at least in G1. Let's ask the G1
GC team for their opinion.