[aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request
vladimir.kozlov at oracle.com
Thu Feb 28 19:26:34 UTC 2019
Thank you, Pengfei
Then let's keep the branch prediction heuristic shared. I take back my previous suggestion to have a separate function for it.
Jatin, can you answer Pengfei's question about your change?
On 2/27/19 10:45 PM, Pengfei Li (Arm Technology China) wrote:
> Hi Vladimir, Jatin and All,
>> So I have a question for aarch64 developers. Are aarch64 fmin/fmax
>> instructions always faster than the code generated by default? If so, the
>> new conditions should be x86 specific, with a separate function to do
>> these checks. We have a precedent: clear_upper_avx(). Maybe later we will
>> have to add other conditions for other platforms too.
> I am the author of the original AArch64 fmin/fmax intrinsics patch, but not a reviewer.
> Both Andrew Haley and I have tested the performance of the AArch64 fmin/fmax instructions before. As far as I remember, the results were similar to what we have seen here on x86. When selecting the min/max values from an array of random numbers, the fmin/fmax instructions perform better. For an already (or almost) sorted array, the fmin/fmax instructions make performance worse, but not by much. So personally I think adding a heuristic in shared code would benefit AArch64 as well.
> I didn't quite understand Jatin's additional code below.
> +#ifdef X86
> + // Being conservative since all the phi edges may not be set
> + // by now. This is done to skip over reduction scenarios.
> + if (a->is_Phi() || b->is_Phi())
> + return false;
> Is it going to block *all* reduction scenarios? I have seen the intrinsics benefit reductions in some cases. And in my opinion, adding this kind of platform-dependent macro to HotSpot shared code is not a good idea.
>  http://hg.openjdk.java.net/jdk/jdk/rev/f15af1e2c683
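For readers following the reduction question above, here is a minimal Java sketch of the two loop shapes under discussion. It is only an illustration: the class and method names (MinMaxReduction, maxOf, pairwiseMax, randomFloats) are hypothetical and do not come from the patch, and whether C2 actually selects the intrinsic for either loop depends on the heuristic being reviewed. The assumption is that in maxOf the accumulator is loop-carried, so C2 would normally see it as a Phi input to the max node, which is the case the quoted is_Phi() check skips.

```java
import java.util.Arrays;
import java.util.Random;

final class MinMaxReduction {
    // Reduction: 'max' is carried from one iteration to the next, so one
    // input of Math.max is the loop-carried value (assumed to be a Phi in C2's IR).
    static float maxOf(float[] values) {
        float max = Float.NEGATIVE_INFINITY;
        for (float v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Non-reduction: each result depends only on the two array elements
    // loaded in that iteration; nothing is carried across iterations.
    static void pairwiseMax(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(a[i], b[i]);
        }
    }

    // Helper producing reproducible pseudo-random input.
    static float[] randomFloats(int n, long seed) {
        Random random = new Random(seed);
        float[] result = new float[n];
        for (int i = 0; i < n; i++) {
            result[i] = random.nextFloat();
        }
        return result;
    }

    public static void main(String[] args) {
        // Random input: the outcome of each comparison is hard to predict.
        float[] randomInput = randomFloats(1 << 16, 42L);
        // Sorted input: in maxOf every comparison goes the same way, which is
        // the case where branchy code was reported to do slightly better.
        float[] sortedInput = randomInput.clone();
        Arrays.sort(sortedInput);
        System.out.println(maxOf(randomInput) == maxOf(sortedInput));
    }
}
```

This sketch only shows the shapes of the loops; it is not a benchmark. Measuring the random versus sorted difference would need a proper harness such as JMH.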