Unicode script support in Regex and Character class

Xueming Shen xueming.shen at oracle.com
Tue Apr 27 04:32:26 UTC 2010

Ulf Zibis wrote:
>> I would like to have the 3 special cases INHERITED, COMMON and 
>> UNKNOWN together at the beginning or end of the enum list.
>> Why? The current list is generated by the script from Scripts.txt,
>> so it follows the order used in Scripts.txt. Is there any particular
>> reason they should be listed differently? We already have the links
>> at the beginning. I don't see any advantage in putting them
>> physically together.
> Someone might find it useful to code for example
>     if (script.compareTo(UnicodeScript.LATIN) < 0)
> to easily filter the special cases.
> Same might be considered for SURROGATE, PRIVATE_USE, UNASSIGNED.
I don't think Java should DEFINE and FORCE a logical order of Unicode
script names on behalf of the Unicode Consortium. It would be better
to leave that to the appropriate party.
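
For what it's worth, the special cases can be filtered without relying on
any declaration order. The following is only a minimal sketch: it assumes the
enum is exposed as the nested type Character.UnicodeScript with a static
of(int codePoint) lookup, which is how it looks in JDK 7 but may differ from
the webrev under review. ScriptFilter and isSpecial are hypothetical names
used for illustration.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of the proposed API.
final class ScriptFilter {
    // Explicit membership set: independent of the order in which the
    // constants are generated from Scripts.txt.
    private static final Set<UnicodeScript> SPECIAL = EnumSet.of(
            UnicodeScript.COMMON,
            UnicodeScript.INHERITED,
            UnicodeScript.UNKNOWN);

    // Returns true if the code point's script is one of the special cases.
    static boolean isSpecial(int codePoint) {
        return SPECIAL.contains(UnicodeScript.of(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(isSpecial('A'));     // Latin letter
        System.out.println(isSpecial(' '));     // space, script COMMON
        System.out.println(isSpecial(0x0301));  // combining acute, script INHERITED
    }
}
```

An EnumSet keeps the check cheap (a bit-mask test) and keeps working if the
generated order of the constants changes with a new Scripts.txt.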

