RFR(M): 8193130: Bad graph when unrolled loop bounds conflicts with range checks
tobias.hartmann at oracle.com
Tue Jan 9 16:45:40 UTC 2018
sorry for the delay. Very nice that you were able to find a regression test!
Here are some comments:
- Why do you only remove the Opaque4Nodes after the macro expansion phase? Couldn't we avoid the additional
igvn.optimize() by merging it into line 2346?
- line 3363: Please add the comments from loopPredicate.cpp
- line 780: Why is that check necessary? And wouldn't it make more sense to check this in the caller?
- looks reasonable but this is very hard to review :)
- Please add a comment to CountedLoopNode::skip_predicates()
- Please use the place for "*" consistently (I prefer it at the type instead of the name). For example, in line 1069, 1076.
- Please also fix the whitespaces in the constructor argument list in lines 38, 45
- Maybe we should fix the comment for Opaque4Nodes because they are now also used in a different context
Please update the copyright dates to 2018. Did you run any performance testing?
I've started correctness testing.
On 18.12.2017 15:40, Roland Westrelin wrote:
> In the test case:
> The loop in test1() (inlined from test1_helper()) runs for a small
> number of iterations but the init value is not constant and profiling
> tells the compiler it runs for a large number of iterations. First, loop
> predication moves the range checks out of loop, then pre/main/post loops
> are created and the main loop is unrolled 16 times. After unrolling the
> init value of the main loop is known to be <= -11 and the limit is -10
> (the main loop is never run). The type of the main loop iv is set to <=
> -10. The main loop no longer has range checks but it still has the range
> check CastII and the ConvI2L nodes that enforce that the iv should be >=
> 0. The CastII/ConvI2L nodes become top and some data paths in the main
> loop body die. Because there's no test that compare the iv to 0 in the
> loop body (they are predicates above the pre loop), no control
> paths die. That leads to a broken graph.
> The only way to fix this that I found, is to have predicates between the
> main and pre loops that verify that the init value of the main loop
> falls into the range allowed by range checks. This is tricky because
> when predication runs there's no pre/main/post loops yet. The change I
> propose adds skeleton predicates during loop predicates. They are a copy
> of predicates on the init value of the loop but use a place holder for
> the init value (Opaque1 node). They are built using an If->Opaque4->Bool
> pattern that will make them go away after loop optimizations (the
> Opaque4 node has a default value it takes after loop opts). When the
> pre/main/post loops are created, the skeleton predicates are copied
> above the main loop, the Opaque1 node is replaced by the actual init
> value for the main loop, the uncommon traps of the predicate are
> replaced by a Halt nodes. The copies also use Opaque4 nodes that will
> cause the predicates of the main loop to be removed from the final
> With this change, when the type of the init value of the main loop is
> known to be -11, the main loop becomes unreachable and is optimized out.
> An extra complication is that the same problem exists with iteration
> splitting: after iteration splitting + unrolling the main loop could
> become unreachable at runtime and CastII/ConvI2L nodes in the main loop
> body die. This is fixed here by adding predicates between the main and
> pre loop that enforce range check constraints similar to what is
> decribed above for loop predication.
More information about the hotspot-compiler-dev