RFR(10): 8181147: JNU_GetStringPlatformChars should have a fast path for UTF-8

Claes Redestad claes.redestad at oracle.com
Mon Jun 12 17:22:22 UTC 2017

Hi again,

after an embarrassing attempt at using HotSpot's modified UTF-8
utilities as a drop-in implementation for real UTF-8 a few weeks
ago, I've explored various better (read: actually working) approaches.

While I've experimented with a few different implementations[1],
my favored approach is to add a fast path in the JNI code if the
String is Latin1 coded, but defer to Java code for UTF16 Strings.
This keeps the amount of JNI code we have to maintain in tandem
with the Java implementation from blowing out of proportion.
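
To make the shape of that fast path concrete, here is a minimal sketch
in plain C. It is not the code from the webrev: the function names, the
CODER_* constants and the "return NULL to request the fallback"
convention are all hypothetical, and a real implementation would obtain
the String's coder and byte array through JNI rather than take them as
arguments. It only illustrates the idea that Latin1 bytes can be
encoded to UTF-8 in native code with a single allocation, while
anything else is left to the Java-side encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed to mirror the Latin1/UTF16 coder values of a compact String;
 * the names are hypothetical. */
enum { CODER_LATIN1 = 0, CODER_UTF16 = 1 };

/* Encode len Latin1 bytes as a NUL-terminated UTF-8 string.
 * Returns a malloc'ed buffer the caller must free, or NULL on failure. */
static char *latin1_to_utf8(const unsigned char *src, size_t len)
{
    size_t i, out_len = len;
    char *dst, *p;

    assert(src != NULL || len == 0);

    /* Every byte >= 0x80 needs two bytes in UTF-8, so count the extras. */
    for (i = 0; i < len; i++) {
        if (src[i] & 0x80) {
            out_len++;
        }
    }

    dst = malloc(out_len + 1);
    if (dst == NULL) {
        return NULL;
    }

    p = dst;
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            *p++ = (char)c;                     /* ASCII: copied as-is */
        } else {
            *p++ = (char)(0xC0 | (c >> 6));     /* lead byte */
            *p++ = (char)(0x80 | (c & 0x3F));   /* continuation byte */
        }
    }
    *p = '\0';
    return dst;
}

/* Fast path dispatch: handle Latin1 natively. A NULL result means
 * "not handled here" (UTF16 coder or allocation failure), in which case
 * the caller is expected to take the slower Java-side route. */
static char *get_utf8_fast_path(int coder, const unsigned char *value,
                                size_t len)
{
    if (coder != CODER_LATIN1) {
        return NULL;
    }
    return latin1_to_utf8(value, len);
}
```

In this sketch the only per-call allocation is the output buffer, and
the UTF16 case never touches the native encoder at all.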

Overall this gives us a speedup of around 40% for ASCII/Latin1
Strings, while not regressing noticeably for UTF16 encoded Strings.

JDK webrev: http://cr.openjdk.java.net/~redestad/8181147/jdk.04/
Top webrev: http://cr.openjdk.java.net/~redestad/8181147/top.04/
Bug: https://bugs.openjdk.java.net/browse/JDK-8181147

Special thanks to Erik Joelsson and Chris Hegarty for helping me
piece together the changes necessary to add a sanity test for this.

If there's a preference, I don't mind splitting that part off as a
separate RFE, as I think sanity testing should be added in this
area independently of the actual code changes here.



[1] Some attempts used GetStringChars (or GetStringCritical),
but the issue with these is that they add a number of unavoidable
mallocs for Latin1 Strings - since the jbyte array is inflated to a
jchar array - which actually slows things down (and might even be
slower than the baseline in some cases when NMT is enabled, since
JNI code "cheats" and doesn't use NMT to track mallocs):


Not wanting to move forward with a solution that actually regresses
performance in certain cases, I explored ways to access the byte
array directly to avoid extra mallocs. Thinking that using
GetByteField and friends would be prohibitively expensive, I first
implemented a version using special purpose JNI methods on the
HotSpot side.

This narrowly beats the approach in the proposed version in terms
of raw throughput. For a small (<10%) gain though, it doesn't seem
worthwhile to go through the process of adding such special
purpose JNI methods - but it was a fun experiment:

