[aarch64-port-dev ] RFR(M): 8212043: Add floating-point Math.min/max intrinsics

Andrew Haley aph at redhat.com
Fri Oct 26 13:16:53 UTC 2018

On 10/26/2018 11:36 AM, Pengfei Li (Arm Technology China) wrote:
> I found a reason why consecutive fmins are slower. The fmin sequence generated by the nested min() calls has RAW (read-after-write) data dependencies: one fmin writes an fp register and the next fmin reads the same one. This causes the instruction pipeline to stall frequently.

Wouldn't that also be true for a non-intrinsic fmin? Each fp register
output would be the input for a following comparison and conditional branch.
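
To make the comparison concrete, here is a minimal Java sketch of the two shapes being discussed. The class and method names are hypothetical, and branchMin is only a simplified stand-in for a non-intrinsic min: it ignores the NaN and -0.0 handling that Math.min performs. The sketch shows the source-level dependency only; what the JIT actually emits for either form is not something it can demonstrate.

```java
public class MinChainSketch {
    // Nested form from the quoted message: each min() consumes the
    // result of the previous one, so the calls form a serial chain.
    static float nestedMin(float a, float b, float c, float d) {
        return Math.min(a, Math.min(b, Math.min(c, d)));
    }

    // Hypothetical stand-in for a non-intrinsic min: a comparison and a
    // conditional. Simplified: no special handling of NaN or -0.0.
    static float branchMin(float x, float y) {
        return (x <= y) ? x : y;
    }

    // Same nesting with the compare-based version: every comparison
    // still needs the result of the inner call before it can proceed.
    static float nestedBranchMin(float a, float b, float c, float d) {
        return branchMin(a, branchMin(b, branchMin(c, d)));
    }

    public static void main(String[] args) {
        System.out.println(nestedMin(3.0f, 1.0f, 4.0f, 2.0f));
        System.out.println(nestedBranchMin(3.0f, 1.0f, 4.0f, 2.0f));
    }
}
```

In both forms the outer call cannot start until the inner result is available, which is the point of the question: the data dependency is in the source, not specific to the intrinsic.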

Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

More information about the hotspot-compiler-dev mailing list