RFR(L): 8203197: C2: consider all paths in loop body for loop predication
vladimir.kozlov at oracle.com
Thu May 31 18:46:10 UTC 2018
I think you should make the new MethodData layout a separate change. I want
it to be pushed and tested separately to make sure it does not cause any
I am fine with your approach. A few comments:
Why do you need it?
In c1_LIRAssembler_aarch64.cpp you replaced LogBytesPerWord with 0
except in the last case. Why keep it in the last case?
No changes for SPARC, PPC64, etc?
About the profiling predicate changes: I have concerns.
Did you take into account the dominating (loop exit) check when you move
a following check as a profiling predicate? There could be dependencies
between them (a klass check, for example, before a field load and then the
NULL check you want to move).
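To illustrate the kind of dependency meant here, a minimal Java sketch follows. The class, field and method names are hypothetical and not taken from the patch. The null check on `data` is loop invariant, but the field load it guards is only valid after the type check that also exits the loop, so the null check cannot simply be hoisted above that check.

```java
public class PredicateDependency {
    // Hypothetical holder type, used for illustration only.
    static final class Box {
        final int[] data;
        Box(int[] data) { this.data = data; }
    }

    static int count(Object holder, int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            // Dominating loop-exit check (a klass check at the compiler level).
            if (!(holder instanceof Box)) {
                break;
            }
            // The field load is only valid once the type check has passed.
            int[] data = ((Box) holder).data;
            // Implicit null check on data: loop invariant, yet dependent on
            // the load above and therefore on the type check.
            total += data.length;
        }
        return total;
    }
}
```

With a holder that is not a Box, `count` returns 0 and never touches `data`. A null check moved above the type check would have to load the field from an object of the wrong type.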
Why are Shenandoah's barrier checks not converted into implicit NULL checks?
On 5/16/18 1:21 AM, Roland Westrelin wrote:
> Loop predication is only applied to always executed predicates in a loop
> body. However, hoisting predicates in frequently executed branches can
> also help performance. That patch extends loop predication so it
> considers predicates in all paths in a loop.
> Applying loop predication to a very rarely executed predicate could be
> detrimental as it could increase the execution frequency of the
> predicate. To mitigate that problem, profile data is used to evaluate
> the frequency of the predicates in the loop and the number of iterations
> of the loop. The predicate is then hoisted only if moving it out of
> the loop doesn't increase its execution frequency.
> A pathological case is a predicate that is never executed but would
> always fail in some method and is moved out of loop anyway due to
> profile pollution. That predicate would cause a recompilation with all
> predicates disabled. To make sure this never happens, traps for
> predicates that are not always executed are recorded under a new reason:
> Reason_profile_predicate. We're short of per bci traps. Changing the
> MethodData layout so it can accommodate more per bci traps was discussed
> in the past and I used this opportunity to implement it.
> There are now 2 blocks of predicates: Reason_predicate ones first and
> then Reason_profile_predicate. Many C2 changes are there to correctly
> recognize that new layout.
> The extended loop predication is only applied to the 2 innermost loops
> of a loop nest. That's where I found it to pay off in practice; the
> cutoff is a bit arbitrary.
> I see a ~20 % improvement on ScimarkSparse.small with this.
> This change comes from the Shenandoah project: Shenandoah's barriers
> read a field from the object's header and so depend on null checks.
> Failure to hoist a frequently executed null check hurts performance.
More information about the hotspot-compiler-dev