<i18n dev> RFR: 8248655: Support supplementary characters in String case insensitive operations

naoto.sato at oracle.com
Wed Jul 15 18:32:40 UTC 2020

Hi Joe,

Thank you for your review.

On 7/15/20 10:57 AM, Joe Wang wrote:
> Hi Naoto,
> In StringUTF16.java, if one is isHighSurrogate and the other not, you 
> may quickly return without going through the rest of the process, 
> probably not significant as cp1 and cp2 and/or u1 and u2 won't be equal 
> anyways. But it could skip a couple of 
> toCodePoint/toUpperCase/toLowerCase calls.

Yes, that is correct as of now. The shortcut relies on the assumption 
that case mappings never cross the boundary between the BMP and the 
supplementary planes. I could not find any description that states 
whether that is guaranteed, so I kept the full comparison to be safe.
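
For illustration, the assumption above could be checked empirically against the Unicode data of a given runtime. The following is only a minimal sketch: the class and method names (CasePlaneCheck, crossesPlaneBoundary, countCrossings) are hypothetical and not part of the webrev, and the result depends on the Unicode version the JDK in use supports, so it does not prove anything about future versions.

```java
public final class CasePlaneCheck {
    private CasePlaneCheck() {}

    // Hypothetical helper: true if the simple upper- or lower-case mapping
    // of cp lands on the other side of the BMP/supplementary boundary.
    static boolean crossesPlaneBoundary(int cp) {
        boolean supplementary = Character.isSupplementaryCodePoint(cp);
        return Character.isSupplementaryCodePoint(Character.toUpperCase(cp)) != supplementary
            || Character.isSupplementaryCodePoint(Character.toLowerCase(cp)) != supplementary;
    }

    // Scans every code point and counts those whose case mapping crosses
    // the boundary, according to the running JDK's Character data.
    static int countCrossings() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (crossesPlaneBoundary(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Case mappings crossing the BMP boundary: " + countCrossings());
    }
}
```

A count of zero on the current runtime would support the early return Joe suggests, but only for that Unicode version.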


> -Joe
> On 7/15/20 9:00 AM, naoto.sato at oracle.com wrote:
>> Hello,
>> Please review the fix to the following issues:
>> https://bugs.openjdk.java.net/browse/JDK-8248655
>> https://bugs.openjdk.java.net/browse/JDK-8248434
>> The proposed changeset and its CSR are located at:
>> https://cr.openjdk.java.net/~naoto/8248655.8248434/webrev.00/
>> https://bugs.openjdk.java.net/browse/JDK-8248664
>> A bug was filed against SimpleDateFormat (8248434) where 
>> case-insensitive date format/parse failed in some of the new locales 
>> in JDK15. The root cause was that case-insensitive 
>> String.regionMatches() method did not work with supplementary 
>> characters. The problem is that the method's spec does not account 
>> for case mappings of supplementary characters, possibly because they 
>> were overlooked in the first place in JSR 204, "Unicode Supplementary 
>> Character support". Similar behavior is observed in the other two 
>> case-insensitive methods, i.e., compareToIgnoreCase() and 
>> equalsIgnoreCase().
>> The fix is straightforward: compare strings on a code point basis 
>> instead of a code unit (16-bit "char") basis. Technically this change 
>> introduces a backward incompatibility, but I believe it is 
>> incompatible only with the previous wrong behavior, which was not 
>> true to what those methods are expected to do.
>> Naoto
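
The code-point-based comparison described in the message above could look roughly like the following. This is a minimal sketch and not the code in the webrev: the class CodePointCaseCompare and its method are hypothetical names, and the upper-case-then-lower-case check is an assumption modeled on the call sequence mentioned in the review (toCodePoint/toUpperCase/toLowerCase).

```java
public final class CodePointCaseCompare {
    private CodePointCaseCompare() {}

    // Hypothetical helper: case-insensitive equality that walks both
    // strings by code point rather than by 16-bit char.
    public static boolean equalsIgnoreCase(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            // codePointAt combines a surrogate pair into one code point.
            int cp1 = a.codePointAt(i);
            int cp2 = b.codePointAt(j);
            i += Character.charCount(cp1);
            j += Character.charCount(cp2);
            if (cp1 == cp2) {
                continue;
            }
            int u1 = Character.toUpperCase(cp1);
            int u2 = Character.toUpperCase(cp2);
            if (u1 == u2) {
                continue;
            }
            // Some characters only compare equal after lower-casing
            // their upper-case forms.
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        // Equal only if both strings were consumed completely.
        return i == a.length() && j == b.length();
    }
}
```

With this sketch, U+10400 (DESERET CAPITAL LETTER LONG I) and U+10428 (DESERET SMALL LETTER LONG I) compare as equal, which a char-by-char comparison of their surrogate pairs cannot detect.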
