[aarch64-port-dev ] RFR(M): 8212043: Add floating-point Math.min/max intrinsics
aph at redhat.com
Fri Oct 26 13:16:53 UTC 2018
On 10/26/2018 11:36 AM, Pengfei Li (Arm Technology China) wrote:
> I got a reason why consecutive fmins are slower. The fmin sequence generated by the nested min() calls has RaW data dependencies. One fmin writes an fp register and the next fmin reads the same one. It leads the instruction pipeline to stall frequently.
Wouldn't that be true for a non-intrinsic fmin too? Each fp register
output would be the input for a following comparison and conditional branch.
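
To make the question concrete, here is a minimal sketch of the two shapes being compared. The class and method names are hypothetical, and branchyMin is only a hand-written stand-in that follows the documented Math.min semantics (NaN and signed zero); it is not claimed to be what C2 emits for the non-intrinsic path. In both chains each call consumes the result of the previous one.

```java
public class MinChain {
    // Nested Math.min calls: each call reads the result written by the previous
    // one, which is the RaW dependency described in the quoted message.
    static float chainedMin(float a, float b, float c, float d) {
        return Math.min(Math.min(Math.min(a, b), c), d);
    }

    // Hypothetical compare-and-branch min, standing in for the non-intrinsic path.
    static float branchyMin(float a, float b) {
        if (a != a) {
            return a; // a is NaN
        }
        if (a == 0.0f && b == 0.0f
                && Float.floatToRawIntBits(b) == Float.floatToRawIntBits(-0.0f)) {
            return b; // prefer -0.0f over +0.0f
        }
        return (a <= b) ? a : b;
    }

    // Same serial dependency, but every step is a comparison plus a branch
    // whose input is the previous step's output.
    static float chainedBranchyMin(float a, float b, float c, float d) {
        return branchyMin(branchyMin(branchyMin(a, b), c), d);
    }

    public static void main(String[] args) {
        System.out.println(chainedMin(3f, 1f, 4f, 2f));
        System.out.println(chainedBranchyMin(3f, 1f, 4f, 2f));
    }
}
```

Both methods return the same value for the same inputs; the sketch shows only the shape of the dependency and says nothing about which form is faster on a given core.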
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671