CR for RFR 8151573
Berg, Michael C
michael.c.berg at intel.com
Tue Mar 15 21:04:21 UTC 2016
I would like to contribute multi-versioning of post loops for range check elimination (rce). Previously, range checks in post loops were optimized only by the cfg optimizations that run after register allocation. I have added code that produces the desired effect much earlier by introducing a safe transformation: a range check free version of the final post loop, with its limit re-ranged, executes up to the point where a range check exception would actually have to be taken. It then exits, and the range check exception, if required, is taken in the legacy post loop.

If during optimization we discover that we know enough to remove the range check version of the post loop, mostly by exposing the load range values to the limit logic of the rce'd post loop, we eliminate the range check post loop altogether, much as the cfg optimizations did, but much earlier. This gives optimizations like programmable SIMD (via SuperWord) the opportunity to vectorize the rce'd post loop into a single iteration based on mask vectors that map to the residual iterations. Programmable SIMD will be a follow-on change set that uses this code to stage its work. This optimization also exposes the rce'd post loop, without flow, to other optimizations.

Currently I have enabled this optimization for x86 only. We base this loop on successfully rce'd main loops, and if multiversioning fails for whatever reason, we eliminate the loop we added.
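As a rough source-level analogy of the transformation described above (the actual change operates on C2's ideal graph, not on Java source), the two post loop versions can be sketched as follows. The class name, method name, and the `safeLimit` variable are hypothetical and chosen only for illustration; the sketch also assumes `start >= 0`, as it would be after a main loop.

```java
// Hypothetical illustration only: names are not taken from the change set.
final class PostLoopSketch {

    // Sums a[start..limit) the way the residual iterations after a main loop would.
    // Assumes start >= 0.
    static int sumPost(int[] a, int start, int limit) {
        int sum = 0;
        int i = start;

        // Re-ranged limit: the rce'd loop may only run while i is provably in bounds.
        int safeLimit = Math.min(limit, a.length);

        // rce'd post loop: every index in [start, safeLimit) is below a.length,
        // so a compiler needs no per-iteration range check here.
        for (; i < safeLimit; i++) {
            sum += a[i];
        }

        // Legacy post loop: runs only if iterations remain, which means the next
        // access is out of bounds and throws at the same index as the original loop.
        for (; i < limit; i++) {
            sum += a[i];
        }
        return sum;
    }
}
```

When `limit <= a.length` can be proven, `safeLimit` equals `limit` and the second loop never runs, which corresponds to the case where the range check version of the post loop is eliminated altogether.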
This code was tested as follows:
More information about the hotspot-compiler-dev