RFR(L): 8203197: C2: consider all paths in loop body for loop predication
nils.eliasson at oracle.com
Wed May 30 12:37:26 UTC 2018
Thanks for yet another valuable contribution!
Something to start with:
Tests open/test/hotspot/jtreg/serviceability/sa/TestPrintMdo.java and ./ClhsdbCDSCore.java fail. You have added one more trap reason, but it is missing a name.
vmStructs_jvmci.cpp is missing _header._struct._trap. It doesn't break
anything since the bits aren't used by any jvmci consumer, but they
should be there for completeness.
I'll continue reviewing and testing your code.
On 2018-05-16 10:21, Roland Westrelin wrote:
> Loop predication is only applied to always executed predicates in a loop
> body. However, hoisting predicates in frequently executed branches can
> also help performance. This patch extends loop predication so it
> considers predicates in all paths in a loop.
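For readers of the archive, here is a minimal sketch of the case described above, written in Java source terms. The class and method names are hypothetical and not part of the patch. The range check on data[i] sits on only one path of the loop body, so it is not an always-executed predicate. The second method models hoisting by hand: C2 performs this on its IR and takes an uncommon trap (deoptimization) when the hoisted check fails; the fallback call below is only a stand-in for that.

```java
public class BranchPredicateSketch {
    // Original shape: data[i] is range-checked only when flags[i] is true,
    // so the check is on one path of the loop body, not on all of them.
    static int sumSelected(int[] data, boolean[] flags, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if (flags[i]) {
                sum += data[i]; // implicit range check on this path only
            }
        }
        return sum;
    }

    // Hand-hoisted model (assumption: illustrative only, C2 works on its IR).
    // The check for the whole range [0, n) is done once before the loop.
    static int sumSelectedHoisted(int[] data, boolean[] flags, int n) {
        if (n > data.length) {
            // Stand-in for the uncommon trap: the hoisted check failed, which
            // does not prove the original loop would have failed, so fall
            // back to the unoptimized loop.
            return sumSelected(data, flags, n);
        }
        int sum = 0;
        for (int i = 0; i < n; i++) {
            if (flags[i]) {
                sum += data[i]; // in range by the check above
            }
        }
        return sum;
    }
}
```

The fallback matters because the hoisted check is stricter than the original one: with n larger than data.length but flags false for the out-of-range indices, the original loop completes normally.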
> Applying loop predication to a very rarely executed predicate could be
> detrimental as it could increase the execution frequency of the
> predicate. To mitigate that problem, profile data is used to evaluate
> the frequency of the predicates in the loop and the number of iterations
> of the loop. A predicate is then hoisted only if moving it out of
> the loop doesn't increase its execution frequency.
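One way to read the criterion above, as a sketch only: if profile data gives the predicate's execution frequency per iteration and the loop's iteration count per entry, their product is how often the predicate runs per loop entry while it stays in the loop. Once hoisted it runs once per entry, so hoisting does not increase its frequency when that product is at least 1. The names and the exact comparison below are assumptions for illustration; the message does not state the threshold the patch actually uses.

```java
public class HoistHeuristicSketch {
    // Hypothetical model of the profile-based decision, not the patch's code.
    // predicateFreqPerIteration: fraction of iterations that reach the predicate.
    // iterationsPerLoopEntry: profiled trip count of the loop.
    static boolean worthHoisting(double predicateFreqPerIteration,
                                 double iterationsPerLoopEntry) {
        // Executions of the predicate per loop entry if it stays in the loop.
        double inLoopExecutionsPerEntry =
                predicateFreqPerIteration * iterationsPerLoopEntry;
        // A hoisted predicate executes exactly once per loop entry.
        return inLoopExecutionsPerEntry >= 1.0;
    }
}
```

For example, a predicate reached in half of 100 iterations runs about 50 times per entry in the loop, so hoisting reduces its frequency; one reached in 0.1% of 100 iterations runs about 0.1 times per entry, so hoisting would increase it.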
> A pathological case is a predicate that is never executed but would
> always fail in some method and is moved out of the loop anyway due to
> profile pollution. That predicate would cause a recompilation with all
> predicates disabled. To make sure this never happens, traps for
> predicates that are not always executed are recorded under a new reason:
> Reason_profile_predicate. We're short of per bci traps. Changing the
> MethodData layout so it can accommodate more per bci traps was discussed
> in the past and I used this opportunity to implement it.
> There are now 2 blocks of predicates: Reason_predicate ones first and
> then Reason_profile_predicate. Many C2 changes are there to correctly
> recognize that new layout.
> The extended loop predication is only applied to the 2 innermost loops
> of a loop nest. That's where I found it to pay off in practice, though the
> cutoff is a bit arbitrary.
> I see a ~20% improvement on ScimarkSparse.small with this.
> This change comes from the Shenandoah project: Shenandoah's barriers
> read a field from the object's header so depend on null checks. Failure
> to hoist a frequently executed null check hurts performance.