[9] RFR(S): 8159016: Over-unrolled loop is partially removed

Tobias Hartmann tobias.hartmann at oracle.com
Tue Jun 21 15:19:42 UTC 2016


please review the following fix:


I was able to reproduce the problem with a simple test:

 Object arr[] = new Object[3];
 int lim = (arg & 3);
 for (int i = 0; i < lim; ++i) {
   arr[i] = new Object();

The loop is split into pre, main and post loops. The pre loop is executed for at least one iteration, initializing arr[0]. The main loop is unrolled twice, initializing at least arr[1], arr[2], arr[3] and arr[4]. The arr[3] and arr[4] stores are always out of bounds and removed. However, C2 is unable to remove the "over-unrolled", dead main loop. As a result, there is a control path from the main loop to the PostScalarRce loop (see JDK-8151573) without a memory path since the last store was replaced by TOP. We crash because we use a memory edge from a non-dominating region.

The out of bound stores in the over-unrolled loop are removed because their range check dependent CastII nodes are removed by the first optimization in CastIINode::Ideal():
 CastII (AddI Phi 2) -> AddI (CastII Phi) 2 -> AddI TOP 2

The CastIINodes are replaced by TOP because the type of the loop Phi is >= 1 and the CastII is [1..2], i.e. there is no value >= 1 that is in the [1..2] range if +2 is added.

The underlying problem is the over-unrolling of the loop. Since lim < 3, we always only execute at least 3 loop iterations. With the pre loop executing at least one iteration, the main loop should not be unrolled more than once. This problem is similar to JDK-4834191 where we over-unrolled a loop with a constant limit.

I changed the implementation of IdealLoopTree::compute_exact_trip_count() to not only compute the exact trip count if the loop limit is constant but to also set the _trip_count field if we are able to determine an upper bound. Like this, the checks in IdealLoopTree::policy_unroll() prevent additional loop unrolling if the trip_count became too small and we keep the maximally unrolled version.

An alternative fix would be to disable the CastII optimization for range check dependent CastII nodes but this does not fix the underlying problem of over-unrolling.

Tested with regression test, JPRT and RBT (running).


More information about the hotspot-compiler-dev mailing list