[aarch64-port-dev ] [10] RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays

Andrew Haley aph at redhat.com
Mon Oct 30 17:30:36 UTC 2017

On 30/10/17 16:43, Dmitrij Pochepko wrote:
> I've tried SIMD loads (even aligned ones, to be sure that alignment is
> not an issue). The SIMD versions were attached to JDK-8187472 as
>   - v5.0 (SIMD loads, 16-byte address alignment, 64 bytes per loop
>     iteration)
>   - v7.0 (SIMD loads, 16-byte alignment, 64 bytes per loop iteration)
>   - v9.0 (SIMD loads, 64-byte alignment, 128 bytes per loop iteration).
> I measured them on ThunderX and found that, while the best non-SIMD
> version handles 1000000-byte arrays in ~295 microseconds, the SIMD
> versions took ~355 microseconds.

I'm rather reluctant to accept non-SIMD intrinsics because I expect
SIMD performance to improve, and I expect SIMD to be the future.  The
same is true of implementations which avoid the use of ldp.

Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
