[aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
Pengfei Li (Arm Technology China)
Pengfei.Li at arm.com
Thu Feb 28 06:45:09 UTC 2019
Hi Vladimir, Jatin and All,
> So I have question for aarch64 developers. Are aarch64 fmin/fmax
> instructions are always faster than code generated by default? If this is true
> new conditions should be x86 specific. To have a separate function to do
> these checks. We have precedent - clear_upper_avx(). May be later we have
> to add other conditions for other platforms too.
I am the author of the original AArch64 fmin/fmax intrinsics patch, but not a reviewer.
Both Andrew Haley and I have tested the performance of the AArch64 fmin/fmax instructions before. As far as I remember, the results were similar to what we have seen here on x86. When selecting the min/max values from an array of random numbers, the fmin/fmax instructions perform better. For an already (almost) sorted array, they make performance worse, though not by much. So personally I think adding a heuristic in shared code would benefit AArch64 as well.
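To make the two input shapes compared above concrete, here is a minimal Java sketch. The class and method names are hypothetical and are not taken from the patch or from our earlier tests; it only builds a random array and a sorted copy and runs the same max reduction over both. It makes no timing claims, and a real measurement would need a proper benchmark harness.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper, for illustration only.
class MinMaxInputs {
    // Max reduction: the running maximum is compared against every element.
    static float maxOf(float[] a) {
        float m = a[0];
        for (int i = 1; i < a.length; i++) {
            m = Math.max(m, a[i]);
        }
        return m;
    }

    // Random input: which operand wins each comparison is hard to predict.
    static float[] randomInput(int n, long seed) {
        Random r = new Random(seed);
        float[] a = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextFloat();
        }
        return a;
    }

    // Sorted input: in an ascending array the new element always wins,
    // so a compare-and-branch sequence is easy to predict.
    static float[] sortedInput(int n, long seed) {
        float[] a = randomInput(n, seed);
        Arrays.sort(a);
        return a;
    }

    public static void main(String[] args) {
        float[] random = randomInput(1 << 20, 42L);
        float[] sorted = sortedInput(1 << 20, 42L);
        System.out.println("max(random) = " + maxOf(random));
        System.out.println("max(sorted) = " + maxOf(sorted));
    }
}
```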
I didn't quite understand Jatin's additional code below.
+ // Being conservative since all the phi edges may not be set
+ // by now. This is done to skip over reduction scenarios.
+ if (a->is_Phi() || b->is_Phi())
+ return false;
Is it going to rule out *all* reduction scenarios? I have seen the intrinsics benefit reductions in some cases. Also, in my opinion, adding this kind of platform-dependent macro to HotSpot shared code is not a good idea.
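To illustrate what I mean by a reduction, here is a small Java sketch with hypothetical names. It assumes that in the reduction loop the loop-carried accumulator shows up as a Phi input to the min node in the compiler's graph, which is the case the quoted check would skip, while the element-wise loop has no loop-carried input to Math.min.

```java
// Hypothetical examples, for illustration only.
class MinMaxShapes {
    // Reduction: `m` is carried around the loop, so one input of Math.min
    // is expected to be a Phi (assumption about the compiler's graph shape).
    static float minReduce(float[] a) {
        float m = Float.POSITIVE_INFINITY;
        for (float v : a) {
            m = Math.min(m, v);
        }
        return m;
    }

    // Element-wise: both inputs of Math.min are plain array loads,
    // neither of them is loop-carried.
    static void minElementwise(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.min(a[i], b[i]);
        }
    }
}
```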