RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

David Holmes david.holmes at oracle.com
Fri Apr 21 07:07:53 UTC 2017

On 21/04/2017 12:25 AM, George Triantafillou wrote:
> Hi David,
> Thanks for the review.  I wasn't aware that the CSR isn't yet available
> for 10.  Do you know when it will be available?

"real soon now"


> On 4/19/2017 8:48 PM, David Holmes wrote:
>> Hi George,
>> Apologies in advance - this is not "xs" in the discussion. But this
>> needs to go through CSR process (which is not yet available for 10) so
>> it can't be pushed yet anyway (sorry). But consider this my CSR
>> review. :)
>> On 20/04/2017 3:43 AM, George Triantafillou wrote:
>>> Please review this very small fix to the spec for JDK-8163861:
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8163861
>>> webrev: http://cr.openjdk.java.net/~gtriantafill/8163861-webrev/webrev
>>> <http://cr.openjdk.java.net/%7Egtriantafill/8163861-webrev/webrev>
>>> The change provides a more precise definition for NewString() and
>>> GetStringLength().  Thanks.
>> As the bug report states 'most of the Java documention was updated to
>> deal with this', so I think it important to ensure any changes are
>> consistent with that updated documentation and terminology.
>> Specifically we need to refer to the String class API specification
>> and the Character class specification which defines Unicode Character
>> Representations for the Java platform.
>> Looking at the JDK 8 JNI spec as referenced in the bug report we first
>> see:
>> jsize GetStringLength(JNIEnv *env, jstring string);
>> Returns the length (the count of Unicode characters) of a Java string.
>> This should be functionally equivalent to java.lang.String.length().
>> If anything the "count of Unicode characters" should just be deleted
>> as this method is intended to return the length of the Java string,
>> just as String.length does. The phrase "count of Unicode characters"
>> is at best "loose" as the reporter notes (and this terminology is
>> mis-used in a few of the JNI string functions!). The String.length()
>> method states:
>> "The length is equal to the number of Unicode code units in the string."
>> where "Unicode code units" is defined in the Character class.
>> So the correct fix would be to change "characters" to "code units".
>> However that there are two descriptions of how the function behaves -
>> in javadoc terminology we have one description in the main body:
>> 4168 <p>Returns the length (the count of Unicode
>> 4169 characters) of a Java string.</p>
>> and one equivalent to @returns:
>> 4180 <h4>RETURNS:</h4>
>> 4181
>> 4182 <p>Returns the length of the Java string.</p>
>> The latter is perfectly and accurately correct! If you want to know
>> how the length of a Java String is defined then you need to see the
>> String class.
>> So I would say that the correct fix here is to either delete "(the
>> count of Unicode characters)", or else change "characters" to "code
>> units" as previously discussed.
> That sounds reasonable!
>> ---
>> Turning to NewString .... the JNI spec states:
>> "Constructs a new java.lang.String object from an array of Unicode
>> characters."
>> and the @param equivalent for 'len' states:
>> "len: length of the Unicode string. May be 0."
>> So initially we have the problematic use of "Unicode characters"
>> again; then we have the problem as to what the definition of the
>> length of a "Unicode string" is.
>> I find no help elsewhere in the JNI spec to clarify this (despite a
>> very lengthy discussion on its use of UTF-8 encodings). I can only
>> assume, based on the implementation, that the expectation is that the
>> array of "Unicode characters" is in UTF-16 format. In which case a
>> more technically accurate fix would be to say:
>> "len: length of the Unicode string in UTF-16 code units. May be 0."
>> Thanks,
>> David
>> PS. I will also add the above to the bug report.
> Thanks, I appreciate it.
> -George
>>> -George

More information about the hotspot-dev mailing list