[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics
sandhya.viswanathan at intel.com
Fri Feb 1 21:58:29 UTC 2019
Please find below the updated patch from Jatin:
This patch fixes the issue reported by Nils. The compiler jtreg tests pass on both Haswell and SKX.
It shows about a 25% gain on a Haswell desktop and about a 75% gain on an SKX server.
Changing maxps to maxss and cmpps to cmpss didn’t show any performance difference.
From: B. Blaser [mailto:bsrbnd at gmail.com]
Sent: Thursday, January 31, 2019 3:46 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Cc: Bhateja, Jatin <jatin.bhateja at intel.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics
On Wed, 30 Jan 2019 at 21:26, Viswanathan, Sandhya
<sandhya.viswanathan at intel.com> wrote:
> Hi Bernard,
> Thanks a lot for your feedback. Let me try to answer your questions below.
> We also started with the same assumption: that we might not be able to easily improve the current implementation on x86, because the MIN/MAX instructions don't conform to the Javadoc-specified behavior of Math.min/max for 0.0 and NaN.
> Jatin took this as a challenge and came up with a sequence that does show benefit.
> Our performance run shows about a 30% gain with this patch versus the ucomisd sequence in the currently jitted code.
> As you suggest, we could use scalar instructions for max, min and cmp instead of using the packed variant.
> But the blend has to be the packed variant, as there is no scalar flavor of it, so changing the others to their scalar flavors may not show much perf change. We will confirm both ways.
> The path that won't show much benefit, or may show some regression, is when both operands are NaN, which I would think is not a frequently occurring case.
> Jatin is going to send an updated patch fixing the issue reported by Nils. We will include performance numbers along with the updated patch.
> Best Regards,
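
For reference, the 0.0 and NaN mismatch mentioned in the quoted message can be illustrated with a small Java sketch. It is not taken from the patch. The class name MaxSemantics and the helper x86Max are hypothetical, and x86Max only models the documented result of a scalar MAXSS with the first argument as destination and the second as source (the second operand is returned when either input is NaN or when both are zeros). It says nothing about the instruction sequence Jatin's patch actually emits.

```java
// Illustrative sketch only; names are hypothetical and not from the patch.
final class MaxSemantics {

    // Models the documented result of x86 MAXSS (a = first operand, b = second):
    // the comparison is false for NaN inputs and for +0.0 vs -0.0,
    // so in those cases the second operand is returned unchanged.
    static float x86Max(float a, float b) {
        return (a > b) ? a : b;
    }

    public static void main(String[] args) {
        // Signed zeros: Java requires +0.0 to be treated as greater than -0.0.
        System.out.println(Math.max(0.0f, -0.0f)); // prints 0.0
        System.out.println(x86Max(0.0f, -0.0f));   // prints -0.0

        // NaN: Java requires NaN if either argument is NaN.
        System.out.println(Math.max(Float.NaN, 1.0f)); // prints NaN
        System.out.println(x86Max(Float.NaN, 1.0f));   // prints 1.0
    }
}
```

Because the raw instruction result depends on operand order in exactly these cases, an intrinsic needs extra handling around it to match Math.max and Math.min.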
Thanks for your answers, we'll wait for Jatin's fixes and measurements.
I'll check the updated patch once more but I like this idea and I hope
most paths will be faster.
Please let me know if you need help to push the patch, provided that
you get a Reviewer approval.