[10] RFR (S): 8189177 - AARCH64: Improve _updateBytesCRC32C intrinsic

Dmitry Chuyko dmitry.chuyko at bell-sw.com
Fri Oct 20 17:45:47 UTC 2017


Please review an improvement to the CRC32C calculation on AArch64. It is 
done in much the same way as the change for JDK-8189176 described in [1].

MacroAssembler::kernel_crc32c receives table registers that it does not 
use. They can serve as extra temporaries so that neighboring loads and 
CRC calculations become independent of each other. Adding a prologue and 
an epilogue to the main by-64 loop makes that loop applicable only from 
len=128, so an additional by-32 loop is added for smaller lengths.

rfe: https://bugs.openjdk.java.net/browse/JDK-8189177
webrev: http://cr.openjdk.java.net/~dchuyko/8189177/webrev.00/

Results for T88 and A53 [2] are similar to those for the CRC32 change 
(good), but again, splitting pair loads may slow down other CPUs, so 
measurements on different HW are welcome.
