CR for RFR 8149421
Berg, Michael C
michael.c.berg at intel.com
Tue Feb 9 23:16:47 UTC 2016
I would like to contribute vectorized post loops. This patch initially targets x86, but the design is versatile enough to be ported to other targets as well. The code adds atomic unrolled drain loops, which precede the fix-up segments and are significantly faster than scalar code. The requirement is that the main loop is super-unrolled after vectorization.

I see up to 54% uplift on micro benchmarks on x86 targets for loops which pass superword vectorization and meet the above criteria. The scimark workloads in SPECjvm2008, such as lu.small and fft.small, also benefit from this design on x86.
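To illustrate the loop shape described above, here is a minimal source-level sketch. It is not the patch itself: the real transformation happens inside the compiler on its intermediate representation, and the names VLEN, UNROLL, vectorAdd and add are hypothetical, chosen only for this example. The sketch assumes 8 lanes per vector and a main loop unrolled 4 times, and it leaves out any alignment pre-loop.

```java
public class PostLoopSketch {
    // Hypothetical values for illustration: 8 lanes per vector,
    // main loop unrolled 4 times after vectorization.
    static final int VLEN = 8;
    static final int UNROLL = 4;

    // Stand-in for a single vector instruction: adds VLEN consecutive elements.
    static void vectorAdd(int[] a, int[] b, int[] c, int i) {
        for (int lane = 0; lane < VLEN; lane++) {
            c[i + lane] = a[i + lane] + b[i + lane];
        }
    }

    // Computes c[i] = a[i] + b[i] and returns the trip counts
    // {main iterations, drain iterations, scalar fix-up iterations}.
    static int[] add(int[] a, int[] b, int[] c) {
        int n = c.length;
        int i = 0;
        int mainIters = 0, drainIters = 0, scalarIters = 0;

        // Main loop: vectorized and unrolled, VLEN * UNROLL elements per trip.
        int step = VLEN * UNROLL;
        for (; i + step <= n; i += step) {
            for (int u = 0; u < UNROLL; u++) {
                vectorAdd(a, b, c, i + u * VLEN);
            }
            mainIters++;
        }

        // Vectorized post (drain) loop: one vector of VLEN elements per trip.
        for (; i + VLEN <= n; i += VLEN) {
            vectorAdd(a, b, c, i);
            drainIters++;
        }

        // Scalar fix-up: the remaining elements, fewer than VLEN of them.
        for (; i < n; i++) {
            c[i] = a[i] + b[i];
            scalarIters++;
        }
        return new int[] { mainIters, drainIters, scalarIters };
    }

    public static void main(String[] args) {
        int n = 61;
        int[] a = new int[n], b = new int[n], c = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        int[] counts = add(a, b, c);
        System.out.println("main=" + counts[0] + " drain=" + counts[1]
                + " scalar=" + counts[2]);
    }
}
```

With 61 elements under these assumed parameters, the main loop runs once (32 elements), the drain loop runs 3 times (24 elements), and only 5 elements are left for the scalar fix-up. Without the drain loop, all 29 leftover elements would be processed one at a time, which is where the gain over scalar post loops would come from.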