com.sun.crypto.provider.GHASH performance fix
fweimer at redhat.com
Mon Aug 18 12:32:50 UTC 2014
This change addresses a severe performance regression, first introduced
in JDK 8, that is triggered when the TLS implementation negotiates a GCM
cipher suite. The regression is caused by the slow implementation of the
GHASH function.
I first tried to eliminate just the allocations in blockMult while still
retaining the byte arrays. This did not substantially increase
performance in my micro-benchmark. I then replaced the 16-byte arrays
with longs, replaced the inner loops with direct bit fiddling on the
longs, eliminated data-dependent conditionals (which are generally
frowned upon in cryptographic algorithms due to the risk of timing
attacks), and split the main loop in two, one for each half of the hash
state. This is the result:
With this change, my test download over HTTPS is no longer CPU-bound,
and GHASH hardly shows up in profiles anymore. (That's why I didn't
consider further changes, lookup tables in particular.)
Micro-benchmarking shows a roughly ten-fold increase in throughput, but
this probably underestimates the improvement because of the high
allocation rate of the old code.
The performance improvement on 32-bit architectures is probably somewhat
smaller, but I suspect that using four ints instead of two longs would
penalize 64-bit architectures.
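
For readers who want to see the shape of the technique, the two-long,
branch-free multiplication described above can be sketched roughly as
follows. This is an illustrative sketch, not the code from the change:
the class name GhashSketch, the method name multiply, and the constant
name R_HIGH are made up here. It assumes the bit ordering of NIST
SP 800-38D, with the first eight bytes of each block held big-endian in
the first long. Conditionals on data bits are replaced by all-ones or
all-zeros masks, and the main loop is split in two, one per half of X.

```java
// Illustrative sketch only; names are hypothetical, not taken from the patch.
final class GhashSketch {
    // Upper 64 bits of the reduction constant R = 11100001 || 0^120.
    private static final long R_HIGH = 0xe100000000000000L;

    /**
     * Multiplies X = (x0, x1) by Y = (y0, y1) in GF(2^128) as used by GHASH.
     * x0/y0 hold the first 8 bytes of the block, big-endian.
     * Returns the product as { z0, z1 }.
     */
    static long[] multiply(long x0, long x1, long y0, long y1) {
        long z0 = 0, z1 = 0;
        long v0 = y0, v1 = y1;

        // First loop: the 64 bits of x0, most significant bit first.
        long x = x0;
        for (int i = 0; i < 64; i++) {
            long mask = x >> 63;           // all ones if the top bit is set, else 0
            z0 ^= v0 & mask;               // Z ^= V only when the bit is 1
            z1 ^= v1 & mask;
            long reduce = -(v1 & 1L);      // all ones if V's lowest bit is set
            v1 = (v1 >>> 1) | (v0 << 63);  // shift V right by one across both longs
            v0 = (v0 >>> 1) ^ (R_HIGH & reduce); // conditional reduction, no branch
            x <<= 1;
        }

        // Second loop: the 64 bits of x1, same steps.
        x = x1;
        for (int i = 0; i < 64; i++) {
            long mask = x >> 63;
            z0 ^= v0 & mask;
            z1 ^= v1 & mask;
            long reduce = -(v1 & 1L);
            v1 = (v1 >>> 1) | (v0 << 63);
            v0 = (v0 >>> 1) ^ (R_HIGH & reduce);
            x <<= 1;
        }
        return new long[] { z0, z1 };
    }

    public static void main(String[] args) {
        // The field's "1" is the block with only its very first bit set.
        long[] z = multiply(0x8000000000000000L, 0L,
                            0x0123456789abcdefL, 0xfedcba9876543210L);
        System.out.printf("%016x%016x%n", z[0], z[1]);
    }
}
```

The loop trip count and the sequence of operations do not depend on the
values of X or Y, which is the property the data-independent rewrite is
after. The sketch still allocates a two-element result array per call;
code on a hot path would presumably write into existing state instead.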
Florian Weimer / Red Hat Product Security