RFR(M): 8054478 C2: Incorrectly compiled char[] array access crashes JVM

Roland Westrelin roland.westrelin at oracle.com
Wed Sep 17 17:03:42 UTC 2014


The IR after parsing is this in pseudo code:

i_main = 0;
while(true) {
 if (i_main >= 1000000) return;
 char[] array = new char[1];
 if (pattern1 != null) {
   i = 0;
   while(true) {
     if (i >= 0) {
       do {
         if (pattern0 == null) uncommon_trap;
         if (pattern0.length u< i) uncommon_trap;
         if (pattern0[i] != 0) {
           if (pattern1.length u< i) uncommon_trap;
           if (pattern1[i] != 0) {
             goto out1;
         if (pos != -1) {
           goto out2;
       } while (i >= 0);
       goto out1;
     c = array[pos];

The do {} while loop is a CountedLoop. The null check & range check are moved out of the loop. Then a pre and a post loops are created and the body is unrolled. The pattern0[i] LoadNode nodes in the unrolled body have a control that is set to the range check before the pre loop. In the unrolled loop, because of the if (pos != -1) test in the first copy of the loop, the compiler finds that in the second copy of the loop body pos is != -1 and so the loop is exited before reaching the end of the unrolled loop. The back branch of the unrolled loop is dead. The compiler optimizes the CountedLoopNode of the unrolled loop out because it doesn’t have a back branch anymore, PhiNodes are removed and the LoadNode for pattern0[i] in the unroll loop body is now independent of the input control of the CountedLoop. Its control is still set to the range check before the pre loop. In the generated code, the pre loop is followed by the pattern1[i] access which is incorrect because it happens before the if (i >= 0) that dominated the unrolled loop before it was removed.

The graph is good as long as the CountedLoop is not removed because the LoadNode depends on a PhiNode of the CountedLoop. But as soon as the CountedLoop is removed and the PhiNode is disconnected, the LoadNode no longer depends on the check that guarded the CountedLoop. 

Vladimir suggested (in a private conversation) that I use a CastII (control dependent on the if that guards the main loop with as input the input value to the induction variable in the loop) as a way to guarantee correct dependencies. There are 2 problems with this:

1) the CastII nodes are removed after CCP. I added a flag to the CastII nodes so we can mark CastII nodes that can’t be removed and should be left as they are post-CCP.
2) we don’t know the range of values that are possible on the if branch that enters the main loop right away (it can be adjusted as loop opts are applied). We may actually never know it if it depends on non constants. If we use a type for a range that is too wide in the CastII the CastII may be optimized out incorrectly. I added another flag to the CastII to mark nodes that cannot be optimized. In CastII:Ideal I added code that tries to figure out a better type for the CastII so we may be able to optimize the CastII as the compiler optimizes the IR.

Another independent issue here is that the main loop is actually never executed because the loop of the test is a single iteration loop so the compiler should be able to optimize out the main loop entirely but it doesn’t. PhiNode::Value() has this code:

 CountedLoopNode *l = r->is_CountedLoop() ? r->as_CountedLoop() : NULL;
 if( l && l->can_be_counted_loop(phase) &&
     ((const Node*)l->phi() == this) ) { // Trip counted loop!

       if( lo->_hi < hi->_lo )     // Reversed endpoints are well defined :-(
         return TypeInt::make(lo->_lo,hi->_hi,3);

that returns an accurate value for the Phi node of a CountedLoop. In this case we want the pre loop’s Phi to return an accurate type that would then make the guard of the main loop optimize. The pre loop is guarded by a test on an opaque node which is later removed. If PhiNode::Value() is called before the Opaque1 node is removed then, the limit type is int and it cannot return anything accurate. If it’s called after, then limit type is a constant and PhiNode::Value() also returns a constant for the pre-loop. Then the main loop is optimized out. To fix this: I added code to Opaque1Node::Ideal so it reenqueues the Phi when the opaque node is removed.


More information about the hotspot-compiler-dev mailing list