[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics
sandhya.viswanathan at intel.com
Sat Feb 2 16:56:58 UTC 2019
There were two problems in the patch:
1) The instruction encoding of 'minpd' and 'maxpd' (it was being encoded as minps, maxps).
2) The registers for operands 'a' and 'b' were being swapped but the 'effect' in the instruct only showed it as USE.
Two alternative here:
a) Mark the register as USE_KILL but that can only be used on fixed registers (a restriction on register allocator).
b) Use another temporary register and don’t overwrite 'a' and 'b'.
So we went with b).
We will do the test-tier1 for 32 bit and get back. We will also update the copyright year and maxss, cmpss change in the next update.
Regarding the test case, it is from Pengfei.Li at arm.com. http://cr.openjdk.java.net/~pli/rfr/8212043/webrev.04 . Andrew Dinn missed pushing it as part of the commit for 8212043.
It would be better if Pengfei/Andrew push it. I will request them and remove from this patch.
From: B. Blaser [mailto:bsrbnd at gmail.com]
Sent: Saturday, February 02, 2019 8:38 AM
To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov <vladimir.kozlov at oracle.com>; Bhateja, Jatin <jatin.bhateja at intel.com>; Deshpande, Vivek R <vivek.r.deshpande at intel.com>
Subject: Re: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics
On Fri, 1 Feb 2019 at 22:58, Viswanathan, Sandhya <sandhya.viswanathan at intel.com> wrote:
> Hi All,
> Please find below the updated patch from Jatin:
> This patch fixes the issue reported by Nils.
Could you clarify the problem and its solution?
I don't see any USE_KILL you mentioned earlier, or maybe the test case wasn't correct as Jatin said previously?
> The compiler jtreg tests on Haswell and SKX pass successfully.
The fix also affects x86_32.ad, could you confirm that at least
test-tier1 is successful with these older families too?
> It shows about 25% gain on Haswell desktop and about 75% on SKX server.
> Changing the maxps to maxss and cmpps to cmpss didn’t show any performance difference.
Conceptually, I dislike useless computation even if well parallelized which might increase the processor's temperature and reduce performance on long runs (but this is a personal opinion).
Regarding copyright headers, I note that the year hasn't been updated to 2019 on some of them and on the test file:
3 * Copyright (c) 2018, Arm Limited. All rights reserved.
Is 'Arm Limitied' intended?
> Best Regards,
More information about the hotspot-compiler-dev