[aarch64-port-dev ] [16] RFR(S): 8251525: AARCH64: Faster Math.signum(fp)
Dmitry Chuyko
dmitry.chuyko at bell-sw.com
Mon Aug 31 14:28:46 UTC 2020
Hi Andrew,
Here is another version of the intrinsics. It is an extension of webrev.03.
The addition is that the constants 0 and 1, which are used internally by
the intrinsics, are now constructed as nodes. This is somewhat similar to
what is done for passing pointers to tables.
webrev: http://cr.openjdk.java.net/~dchuyko/8251525/webrev.04/
results:
http://cr.openjdk.java.net/~dchuyko/8251525/webrev.04/benchmarks/signum-facgt_ir-copysign.ods
As you can see, the intrinsic for the entire signum is now up to
29.2% better for "random" data, and NaN is also 30% better. The only
case that suffers is 0, which is just one number (in two representations,
+0.0 and -0.0) out of the whole range, and the regression there is
~7%/10%. Performance for 0 becomes the same as for all other numbers
(and NaN). I don't think 0 is that special: if the input data is all
zeroes and the program produces only zeroes during the computation, the
workload is trivial. If zeroes make up half of the data, there will
still be a win.
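
To illustrate why 0 and NaN can cost the same as any other input, here is
a minimal Java sketch. signumReference follows the documented behaviour of
Math.signum(double). signumSelect emulates a compare-then-bit-select
sequence that uses the constants 0 and 1. The class name, the method names
and the exact shape of the select are assumptions made for illustration;
they are not taken from the webrev.

```java
public class SignumSketch {
    // Reference semantics as documented for Math.signum(double):
    // zeroes and NaN are returned unchanged, everything else maps to +/-1.0.
    static double signumReference(double d) {
        return (d == 0.0 || Double.isNaN(d)) ? d : Math.copySign(1.0, d);
    }

    // Hypothetical emulation of a compare + bit-select sequence.
    // The ternary below stands in for a hardware compare that yields an
    // all-ones or all-zeroes mask; it is not branch-free in Java itself.
    static double signumSelect(double d) {
        long src = Double.doubleToRawLongBits(d);
        long one = Double.doubleToRawLongBits(1.0);
        // All ones when |d| > 0; zero for +0.0, -0.0 and NaN
        // (any ordered comparison with NaN is false).
        long cmp = (Math.abs(d) > 0.0) ? -1L : 0L;
        // Drop the sign bit from the mask so the sign always comes from src.
        long mask = cmp >>> 1;
        // Take exponent and mantissa from 1.0 where the mask is set,
        // and every other bit from the input.
        long bits = (one & mask) | (src & ~mask);
        return Double.longBitsToDouble(bits);
    }

    public static void main(String[] args) {
        double[] inputs = {-2.5, -0.0, 0.0, 3.0, Double.NaN,
                           Double.NEGATIVE_INFINITY};
        for (double d : inputs) {
            System.out.println(d + " -> " + signumSelect(d));
        }
    }
}
```

In this form every input goes through the same operations, which matches
the observation that 0 loses its fast path while all other inputs improve.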
For copySign(double), making the constant in IR amplifies the
regression in the Blackhole benchmark, but it may still be interesting
to experiment with.
Just in case, it would be interesting to remeasure the Blackhole variants
if compiler support [1] is implemented.
Here is also a benchmark variant [2] where we consume different data;
it shows the same effects as Blackhole.consume(signum).
-Dmitry
[1] https://bugs.openjdk.java.net/browse/JDK-8252505
[2]
http://cr.openjdk.java.net/~dchuyko/8251525/webrev.04/benchmarks/DoubleSideSinkBench.java