RFR: 8265237: String.join and StringJoiner can be improved further [v3]

Claes Redestad redestad at openjdk.java.net
Mon Apr 19 10:49:36 UTC 2021

On Sun, 18 Apr 2021 19:55:20 GMT, Peter Levart <plevart at openjdk.org> wrote:

>> While JDK-8148937 improved StringJoiner class by replacing internal use of getChars that copies out characters from String elements into a char[] array with StringBuilder which is somehow more optimal, the improvement was marginal in speed (0% ... 10%) and mainly for smaller strings, while GC was reduced by about 50% in average per operation.
>> Initial attempt to tackle that issue was more involved, but was later discarded because it was apparently using too much internal String details in code that lives outside String and outside java.lang package.
>> But there is another way to package such "intimate" code - we can put it into String itself and just call it from StringJoiner.
>> This PR is an attempt at doing just that. It introduces new package-private method in `java.lang.String` which is then used from both pubic static `String.join` methods as well as from `java.util.StringJoiner` (via SharedSecrets). The improvements can be seen by running the following JMH benchmark:
>> https://gist.github.com/plevart/86ac7fc6d4541dbc08256cde544019ce
>> The comparative results are here:
>> https://jmh.morethan.io/?gist=7eb421cf7982456a2962269137f71c15
>> The jmh-result.json files are here:
>> https://gist.github.com/plevart/7eb421cf7982456a2962269137f71c15
>> Improvement in speed ranges from 8% (for small strings) to 200% (for long strings), while creation of garbage has been further reduced to an almost garbage-free operation.
>> So WDYT?
> Peter Levart has updated the pull request incrementally with one additional commit since the last revision:
>   Fix overflow checking logic, add test for it, avoid branch in loop, add 1-char strings to JHM test

Looks good!

test/micro/org/openjdk/bench/java/util/StringJoinerBenchmark.java line 48:

> 46: @Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> 47: @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
> 48: @Fork(value = 1, jvmArgsAppend = {"-Xms1g", "-Xmx1g"})

In general I think it's prudent to default to at least 3 forks, to catch issues with run-to-run variance.


Marked as reviewed by redestad (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3501

More information about the core-libs-dev mailing list