RFR (S) 8136500: Integer/Long getChars and stringSize should be more idiomatic

Aleksey Shipilev aleksey.shipilev at oracle.com
Fri Nov 20 22:56:56 UTC 2015


We have discovered this both in Compact Strings and Indy String Concat
work, but it deserves to be treated separately.

Integer/Long getChars code seems to be very old (Josh Bloch estimated
circa 1994) and was written under the assumption that no compiler is
there to help us. Fast-forward 20 years, and I would like to suggest a
cleanup in Integer/Long getChars and stringSize:

This cleanup *also* improves performance:

While cleaning up the code, the patch does the following, for these reasons:

 * Rewrites Integer.stringSize to use a loop similar to the one in
Long.stringSize. The compiled code is more efficient: it does not access
memory anymore and, more importantly, after loop unrolling we compare
against precomputed constants.

 * Removes the manual strength-reduction of multiplications/divisions to
bit-twiddling: the generated code suggests the compiler does it for us.
In fact, manual shifting in the current getChars code is a pessimisation!

 * Specializes DigitOnes/Tens for bytes for the Latin1 String case. This
avoids narrowing conversions in the code: surprisingly, this does affect
performance.

 * Since the weird "65536"-sized bit-twiddling is gone, we can now make
the first loops cover all values greater than or equal to 100. This opens
up the way to carefully spell out the code that processes the remaining
two digits -- this does help performance a lot.

 * Avoids Integer.digits lookups, and does the computations in place.
This saves a bounds check and a memory access. (Note that the lookup is
the lesser evil for the DigitOnes/Tens case, where the alternative
computation would involve integer division.)
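
To make the points above concrete, here is a rough sketch of the shape
the code takes after such a cleanup. This is not the patch itself: the
class name DigitsSketch, the TENS/ONES tables (standing in for the
byte-typed DigitTens/DigitOnes), and the toDecimal helper are made up
for illustration, and working in negative values to sidestep
Integer.MIN_VALUE overflow is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;

class DigitsSketch {
    // Hypothetical byte-typed tables: for 0 <= n < 100, TENS[n] and ONES[n]
    // hold the ASCII tens and ones digits of n. Keeping them as byte[]
    // avoids a narrowing conversion when storing into a byte[] buffer.
    private static final byte[] TENS = new byte[100];
    private static final byte[] ONES = new byte[100];
    static {
        for (int n = 0; n < 100; n++) {
            TENS[n] = (byte) ('0' + n / 10);
            ONES[n] = (byte) ('0' + n % 10);
        }
    }

    // Number of characters in the decimal form of x, including a '-' sign.
    // A plain loop over powers of ten, with no lookup table in memory.
    static int stringSize(int x) {
        int sign = 1;
        if (x >= 0) {
            sign = 0;
            x = -x; // assumption: work in negatives so MIN_VALUE does not overflow
        }
        int p = -10;
        for (int i = 1; i < 10; i++) {
            if (x > p) {
                return i + sign;
            }
            p = 10 * p;
        }
        return 10 + sign;
    }

    // Writes the decimal digits of i backwards into buf, ending just before
    // index, and returns the position of the first character written.
    static int getChars(int i, int index, byte[] buf) {
        int charPos = index;
        boolean negative = i < 0;
        if (!negative) {
            i = -i;
        }
        // Two digits per iteration for all magnitudes >= 100, using plain
        // division and multiplication rather than hand-written shifts.
        while (i <= -100) {
            int q = i / 100;
            int r = (q * 100) - i; // 0..99
            i = q;
            buf[--charPos] = ONES[r];
            buf[--charPos] = TENS[r];
        }
        // Remaining one or two digits, computed in place without a digits table.
        int q = i / 10;
        int r = (q * 10) - i; // 0..9
        buf[--charPos] = (byte) ('0' + r);
        if (q < 0) {
            buf[--charPos] = (byte) ('0' - q);
        }
        if (negative) {
            buf[--charPos] = (byte) '-';
        }
        return charPos;
    }

    // Hypothetical helper tying the two together for a Latin1 buffer.
    static String toDecimal(int i) {
        int size = stringSize(i);
        byte[] buf = new byte[size];
        getChars(i, size, buf);
        return new String(buf, StandardCharsets.ISO_8859_1);
    }
}
```

With this sketch, DigitsSketch.toDecimal(v) should agree with
Integer.toString(v) for any int v, which makes it easy to check against
the existing implementation.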

  - java/lang, java/util, plus new tests

