[intrinsics]: performance before after (String::format)

Alex Buckley alex.buckley at oracle.com
Fri Feb 22 21:46:20 UTC 2019

Hi Vicente,

Thanks for this nice testing. I am interested in the following results:

1. Variable strings

                                 Intrinsified  Vanilla  Speedup
testStringFormat1VariableStr            10443     2394      44x
testStringFormat2VariableStrs            9497       88     109x
testStringFormat100VariableStrs           148        3      46x

With no type conversions in the mix, any speedup comes from evaluating 
the format string at compile time and emitting a series of string 
concatenations for run time. Each and every concat emitted under this 
intrinsification scheme is a win relative to vanilla invocation. 
1VariableStr does 1 concat (arg + space) while 2VariableStrs does 3 
concats (arg1 + text + arg2 + space), so 2VariableStrs should win 3x as 
much due to intrinsification; indeed its speedup is about 2.5x that of 
1VariableStr (109x/44x). Is that the right way to think about what's 
going on?
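
To make the "format string becomes concats" idea concrete, here is a minimal sketch. The format strings `"%s "` and `"%s: %s "` and all class and method names are assumptions chosen to match the "arg + space" and "arg1 + text + arg2 + space" shapes described above; they are not taken from the benchmark source, and the concat methods only approximate what the intrinsified call site might reduce to.

```java
public class FormatConcatSketch {
    // Vanilla path: the format string is parsed at run time on every call.
    static String viaFormat1(String arg) {
        return String.format("%s ", arg);
    }

    // Hypothetical intrinsified equivalent: 1 concat (arg + space).
    static String viaConcat1(String arg) {
        return arg + " ";
    }

    // Vanilla path with two string arguments and literal text between them.
    static String viaFormat2(String a, String b) {
        return String.format("%s: %s ", a, b);
    }

    // Hypothetical intrinsified equivalent: 3 concats (a + text + b + space).
    static String viaConcat2(String a, String b) {
        return a + ": " + b + " ";
    }

    public static void main(String[] args) {
        // Both pairs should produce identical strings.
        System.out.println(viaFormat1("Bob").equals(viaConcat1("Bob")));
        System.out.println(viaFormat2("Bob", "Alice").equals(viaConcat2("Bob", "Alice")));
    }
}
```

Both lines print `true`: the results are the same, and only the amount of work left for run time differs.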

2. Constant ints

                                 Intrinsified  Vanilla  Speedup
testStringFormat1ConstantInt            10023      225      45x
testStringFormat2ConstantInt             9636      126      77x
testStringFormat100ConstantInts            10        3       4x

You're no longer taking advantage of all-constant arguments to perform 
the invocation wholly at compile time, so the low speedup for 100 
arguments is to be expected. It's nice to see the speedup column 
following the same pattern for constant versus non-constant arguments:

                                 Intrinsified  Vanilla  Speedup
testStringFormat1VariableInt            10037      194      52x
testStringFormat2VariableInts            9665      114      85x
testStringFormat100VariableInts            10        3       3x

Presumably, type conversions at run time are making the 2*Int cases give 
less of a speedup over 1*Int (85x/52x above) than 2*Str gives over 1*Str 
(109x/44x).
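
For reference, the two ratios being compared can be worked out from the Speedup columns in the tables above. This is only a small arithmetic sketch; the class and method names are made up for illustration.

```java
import java.util.Locale;

public class SpeedupRatios {
    // How many times larger one speedup factor is than another.
    static double ratio(double speedupTwoArgs, double speedupOneArg) {
        return speedupTwoArgs / speedupOneArg;
    }

    public static void main(String[] args) {
        // Speedup figures copied from the tables in this message.
        double strRatio = ratio(109, 44); // 2*Str over 1*Str
        double intRatio = ratio(85, 52);  // 2*Int over 1*Int
        System.out.printf(Locale.ROOT, "Str: %.2f, Int: %.2f%n", strRatio, intRatio);
    }
}
```

This prints `Str: 2.48, Int: 1.63`, so the second argument buys noticeably less in the int cases than in the string cases.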


On 2/22/2019 9:33 AM, Vicente Romero wrote:
> Hi,
> I have executed some performance tests on the intrinsics code to compare
> the before and after. Please find the benchmark results and the JMH
> based benchmark attached. This benchmark is based on a previous one
> written by Hannes. The benchmark compares the execution between the JDK
> built from [1], referred here as JDK13, and [2] which is the amber repo,
> branch `intrinsics-project`.
> Some conclusions from the benchmark results:
>   * the intrinsified code is faster than the vanilla JDK13 code in all
>     cases for which intrinsified code is produced
>   * there are wide variations though
> For example for the test: `testStringFormatBoxedArray` which is
> basically benchmarking the performance of: `String.format("%s: %d ",
> args);` where args is: `static final Object[] args = { "Bob", i23 };`,
> there is basically no visible gain as in this case the intrinsification
> is bailing out and producing the same code as vanilla JDK13. This result is
> expected. The next test with not so much gain is:
> `testStringFormat1ConstantFloat` which is testing:
>      `String.format("%g", 1.0)`
> the execution is ~2.5 times faster in the intrinsified version but
> nothing compared to: `testStringFormat1ConstantStr` which is ~40 times
> faster. Another interesting conclusion is that the improvement fades out
> with the number of parameters for some cases but stays constant for
> others. For example, the speedup is about the same when concatenating 1
> or 100 strings, but formatting one primitive int is ~45 times faster vs.
> a ~3.5x improvement when formatting a hundred.
> I have also attached the table I used to play with the numbers.
> Thanks,
> Vicente
> [1] http://hg.openjdk.java.net/jdk/jdk
> [2] http://hg.openjdk.java.net/amber/amber
