RFR: 8046943: RSA Acceleration
aph at redhat.com
Fri Jun 26 16:25:38 UTC 2015
On 06/19/2015 09:34 AM, Andrew Haley wrote:
> On 18/06/15 20:28, Vladimir Kozlov wrote:
>> Yes, it is a lot of handwriting but we need it to work on all OSs.
> Sure, I get that. I knew there would be a few goes around with this,
> but it's worth the pain for the performance improvement.
I made some changes, as requested.
Everything is now private static final.
The libcall now only calls the runtime code: all allocation is done
in Java code.
I tested on Solaris using Solaris Studio 12.3 tools, and it's fine.
There's one thing I'm not sure about. I no longer allocate scratch
memory on the heap. That was only needed for extremely large
integers, larger than anyone needs for crypto. Now, if the size of an
integer exceeds 16384 bits, I do not use the intrinsic, which lets the
intrinsic use stack-allocated memory for its scratch space.
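A minimal sketch of what such a size check might look like on the Java
side follows. The class, constant, and method names are hypothetical and
are not taken from the patch; only the 16384-bit limit comes from the
text above.

```java
import java.math.BigInteger;

// Illustrative only: hypothetical names, not the code from the patch.
class IntrinsicGuardSketch {
    // Limit from the text: 16384 bits, i.e. 512 32-bit words.
    private static final int MAX_INTRINSIC_BITS = 16384;
    private static final int MAX_INTRINSIC_INTS = MAX_INTRINSIC_BITS / 32;

    // Number of 32-bit words needed to hold a non-negative value.
    static int lengthInInts(BigInteger x) {
        return (x.bitLength() + 31) / 32;
    }

    // True if the operand is small enough for a fixed-size scratch buffer.
    static boolean fitsIntrinsic(BigInteger modulus) {
        return lengthInInts(modulus) <= MAX_INTRINSIC_INTS;
    }

    // Pick the code path: bounded-size operands may use the intrinsic,
    // anything larger stays in plain Java.
    static String choosePath(BigInteger modulus) {
        return fitsIntrinsic(modulus) ? "intrinsic" : "java-fallback";
    }

    public static void main(String[] args) {
        BigInteger atLimit = BigInteger.ONE.shiftLeft(16383);   // exactly 16384 bits
        BigInteger overLimit = BigInteger.ONE.shiftLeft(16384); // 16385 bits
        System.out.println(choosePath(atLimit));
        System.out.println(choosePath(overLimit));
    }
}
```

Because the operand length is capped before the intrinsic is entered, the
scratch space has a known maximum size, which is what makes a stack
buffer safe in this sketch.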
The main thing I was worried about is the time spent in
Montgomery multiplication. The runtime of the algorithm is O(N^2); if
you don't limit the size, the time is unbounded, and the code never
reaches a safepoint while it runs. This would mean that anyone who passed an absurdly large
integer to BigInteger.modPow() would see the virtual machine
apparently lock up and garbage collection would not run. I note that
the multiplyToLen() intrinsic has the same problem.
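To make the O(N^2) cost concrete, here is a minimal sketch of word-serial
Montgomery multiplication in plain Java. It is not the intrinsic or the
code under review: it uses the textbook CIOS (coarsely integrated operand
scanning) layout, and every name in it is hypothetical. The counter shows
that an s-word operand costs 2*s*s word multiplications in the two inner
loops.

```java
import java.math.BigInteger;

// Illustrative only: not the HotSpot intrinsic. All names are hypothetical.
class MontgomerySketch {
    private static final long MASK = 0xFFFFFFFFL;
    static long wordMultiplies;   // inner-loop 32x32-bit multiplies in the last call

    // -n0^-1 mod 2^32 for odd n0, by Newton iteration (correct bits double each step).
    static int negInverse(int n0) {
        int inv = n0;             // n0*n0 == 1 mod 8, so 3 bits are already correct
        for (int i = 0; i < 5; i++) inv *= 2 - n0 * inv;
        return -inv;
    }

    // a, b, n: little-endian arrays of s 32-bit words, n odd, a and b < n.
    // Returns a*b*R^-1 mod n with R = 2^(32*s).
    static int[] montMul(int[] a, int[] b, int[] n) {
        int s = n.length;
        int nPrime = negInverse(n[0]);
        long[] t = new long[s + 2];                  // one 32-bit word per entry
        wordMultiplies = 0;
        for (int i = 0; i < s; i++) {
            long bi = b[i] & MASK, c = 0, sum;
            for (int j = 0; j < s; j++) {            // t += a * b[i]
                sum = t[j] + (a[j] & MASK) * bi + c;
                t[j] = sum & MASK;
                c = sum >>> 32;
                wordMultiplies++;
            }
            sum = t[s] + c;
            t[s] = sum & MASK;
            t[s + 1] = sum >>> 32;
            long m = ((int) t[0] * nPrime) & MASK;   // makes the low word of t + m*n zero
            sum = t[0] + m * (n[0] & MASK);
            c = sum >>> 32;
            wordMultiplies++;
            for (int j = 1; j < s; j++) {            // t = (t + m*n) / 2^32
                sum = t[j] + m * (n[j] & MASK) + c;
                t[j - 1] = sum & MASK;
                c = sum >>> 32;
                wordMultiplies++;
            }
            sum = t[s] + c;
            t[s - 1] = sum & MASK;
            t[s] = t[s + 1] + (sum >>> 32);
        }
        // t[0..s] is now < 2n: subtract n once if t >= n.
        boolean ge = true;
        if (t[s] == 0) {
            for (int j = s - 1; j >= 0; j--) {
                long nj = n[j] & MASK;
                if (t[j] != nj) { ge = t[j] > nj; break; }
            }
        }
        int[] r = new int[s];
        long borrow = 0;
        for (int j = 0; j < s; j++) {
            long d = t[j] - (ge ? (n[j] & MASK) : 0) - borrow;
            r[j] = (int) d;
            borrow = d >>> 63;                       // 1 if the subtraction went negative
        }
        return r;
    }

    // Helpers for converting between BigInteger and little-endian words.
    static int[] toWords(BigInteger x, int s) {
        int[] w = new int[s];
        for (int i = 0; i < s; i++) w[i] = x.shiftRight(32 * i).intValue();
        return w;
    }

    static BigInteger fromWords(int[] w) {
        BigInteger x = BigInteger.ZERO;
        for (int i = w.length - 1; i >= 0; i--) {
            x = x.shiftLeft(32).or(BigInteger.valueOf(w[i] & MASK));
        }
        return x;
    }

    public static void main(String[] args) {
        BigInteger n = BigInteger.ONE.shiftLeft(127).subtract(BigInteger.ONE);
        int[] r = montMul(toWords(BigInteger.TEN, 4), toWords(BigInteger.TEN, 4), toWords(n, 4));
        System.out.println(fromWords(r) + " after " + wordMultiplies + " word multiplies");
    }
}
```

At the 16384-bit limit mentioned above, s is 512, so one Montgomery
multiplication in this sketch performs 2 * 512 * 512 = 524288 word
multiplications. Doubling the operand size quadruples that count, which
is why an uncapped operand can keep a thread inside the loop nest for an
arbitrarily long time.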