[aarch64-port-dev ]  RFR(S): 8251525: AARCH64: Faster Math.signum(fp)
dmitry.chuyko at bell-sw.com
Mon Aug 31 14:28:46 UTC 2020
Here is another version of the intrinsics. It is an extension of webrev.03.
In addition, the constants 0 and 1 that are used internally by the
intrinsics are now constructed as nodes. This is somewhat similar to what is
done for passing pointers to tables.
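For reference, the semantics that any signum intrinsic has to preserve can
be sketched in plain Java. This is only an illustration of where the
constants 0 and 1 come into play, not the code from the webrev; the class
name SignumSketch is made up for this example.

```java
// Illustrative sketch of Math.signum(double) semantics; this is NOT the
// AArch64 intrinsic, and the class name is hypothetical.
final class SignumSketch {
    // Zero (of either sign) and NaN are returned unchanged; every other
    // input yields the constant 1.0 carrying the sign of the argument.
    static double signum(double d) {
        return (d == 0.0 || Double.isNaN(d)) ? d : Math.copySign(1.0, d);
    }

    public static void main(String[] args) {
        double[] inputs = {-2.5, -0.0, 0.0, 3.0, Double.NaN, Double.NEGATIVE_INFINITY};
        for (double x : inputs) {
            // Print the sketch next to the library result for comparison.
            System.out.println(x + " -> " + signum(x) + " (Math.signum: " + Math.signum(x) + ")");
        }
    }
}
```

The comparison with 0 and the copy of the sign onto 1 are the two places
where the constants are needed.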
As you can see, the intrinsic for the entire signum is now up to
29.2% better for "random" data. NaN is also 30% better. The only
case that suffers is 0, which is just 1 number (in two representations) out of
the whole range, and the regression there is ~7%/10%. Performance for 0
becomes the same as for all other numbers (and NaN). I don't think
that 0 is that special: if the input data is all zeroes and the program
produces only zeroes during the computation, the case is trivial. If zeroes
make up half of the data, there will still be a win.
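A rough way to check the mixed-data argument is a weighted average of the
two per-case changes. The sketch below assumes the effects combine linearly
with the share of zeroes in the input, which ignores branch prediction and
other interactions, so it is an estimate and not a measurement. The class
and method names are hypothetical; the figures are the 29.2% gain and the
worse of the two regression numbers (10%) quoted above.

```java
// Back-of-the-envelope estimate under a linear-mix assumption; names are
// hypothetical and the result is not a benchmark measurement.
final class MixedInputEstimate {
    // Weighted average of relative changes in percent:
    // positive values are gains, negative values are regressions.
    static double blendedChangePercent(double zeroFraction,
                                       double zeroChangePct,
                                       double otherChangePct) {
        return zeroFraction * zeroChangePct + (1.0 - zeroFraction) * otherChangePct;
    }

    public static void main(String[] args) {
        double zeroChange = -10.0;  // worst-case regression for 0, from the text
        double otherChange = 29.2;  // gain for "random" data, from the text
        for (double f : new double[] {0.0, 0.25, 0.5, 0.75, 1.0}) {
            System.out.printf("zero share %.2f -> estimated change %.1f%%%n",
                    f, blendedChangePercent(f, zeroChange, otherChange));
        }
    }
}
```

Under this assumption, an input that is half zeroes still comes out at
roughly +9.6%.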
For the case of copySign(double), making a constant in IR amplifies
the regression in the Blackhole benchmark, but still may be interesting to
Just in case, it would be interesting to remeasure the Blackhole variants if
compiler support is implemented.
Here is also a benchmark variant where we consume different data,
and it shows the same effects as Blackhole.consume(signum).