JDK 9 Build 111 seems to miss some locale data, Lucene tests fail with Farsi and Thai language

Alan Bateman Alan.Bateman at oracle.com
Sat Mar 26 14:10:18 UTC 2016

On 26/03/2016 11:56, Uwe Schindler wrote:
> Hi,
> after also testing the separate "Jigsaw" build on jdk9.java.net I see the same problems. So both builds 111 are wrong.
> To me it looks like the Unicode data files are missing some information - which could again be a packaging bug. As said before, build 110 does not have this problem, so it seems to be a side-effect of Jigsaw merging.
> The following stuff does not work:
> (1) The Thai locale does not have a working dictionary-based BreakIterator available. The following "check" in Lucene fails, because it cannot detect a boundary correctly:
>    /**
>     * True if the JRE supports a working dictionary-based breakiterator for Thai.
>     * If this is false, this tokenizer will not work at all!
>     */
>    public static final boolean DBBI_AVAILABLE;
>    private static final BreakIterator proto = BreakIterator.getWordInstance(new Locale("th"));
>    static {
>      // check that we have a working dictionary-based break iterator for thai
>      proto.setText("ภาษาไทย");
>      DBBI_AVAILABLE = proto.isBoundary(4);
>    }
> After this static initializer, DBBI_AVAILABLE is false. This causes some tests to be ignored, but 2 fail because of it (which might be an oversight on our side). Nevertheless, this is a bug in build 111.
I just tried to duplicate this on OSX and Linux without success. The log 
you linked to suggests this is Linux, is that right? Is this the JDK 
bundle? I haven't checked the JRE bundle, but I would be surprised if 
anything is missing. The JDK has several tests for Thai, so if it were 
completely broken I would have expected it to have been seen. I've no 
doubt that it is not working in your environment; we just need to figure 
out what is different.
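
To help compare environments, a standalone check along the following lines 
could be run against each build. This is only a sketch: the class name 
ThaiBreakCheck and the helper boundaries() are made up for illustration and 
are not taken from Lucene or from the JDK tests. The test string is the same 
"ภาษาไทย" as in the Lucene check, written with Unicode escapes so that source 
file encoding cannot influence the result.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic class, not part of Lucene or the JDK.
class ThaiBreakCheck {

    // Same text as the Lucene check ("ภาษาไทย"), as escapes: 7 chars, words are 4 + 3.
    static final String TEXT = "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";

    /** Collects every boundary offset the locale's word BreakIterator reports for text. */
    static List<Integer> boundaries(Locale locale, String text) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            offsets.add(b);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Equivalent to new Locale("th") in the original check.
        Locale thai = Locale.forLanguageTag("th");

        System.out.println("java.version  = " + System.getProperty("java.version"));
        System.out.println("java.home     = " + System.getProperty("java.home"));

        // Does BreakIterator list "th" among the locales it has data for?
        boolean listed = Arrays.asList(BreakIterator.getAvailableLocales()).contains(thai);
        System.out.println("th listed     = " + listed);

        BreakIterator it = BreakIterator.getWordInstance(thai);
        it.setText(TEXT);
        System.out.println("impl class    = " + it.getClass().getName());
        // The Lucene check expects a boundary between the two words, at offset 4.
        System.out.println("isBoundary(4) = " + it.isBoundary(4));
        System.out.println("boundaries    = " + boundaries(thai, TEXT));
    }
}
```

Comparing the output between build 110 and build 111 (and between the JDK 
and JRE bundles) should show whether the Thai locale is missing from the 
available locales or whether it is listed but segmenting differently.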

> (2) The collator for the Arabic (Farsi) language fails to work correctly. This also looks like missing data.
> Collator collator = Collator.getInstance(new Locale("ar"));
Are there any exceptions or anything here? Or maybe it tests the 
collator with compare?
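
If it is the compare results that differ, something like the following 
might make that visible. Again this is just a sketch under assumptions: the 
class name ArabicCollatorCheck and the sample strings are invented for 
illustration, and I don't know which strings the Lucene tests actually 
compare.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical diagnostic class, not part of Lucene or the JDK.
class ArabicCollatorCheck {

    /** Compares a and b with a freshly obtained Collator for the given locale. */
    static int compare(Locale locale, String a, String b) {
        Collator collator = Collator.getInstance(locale);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        // Equivalent to new Locale("ar") in the original report.
        Locale arabic = Locale.forLanguageTag("ar");

        // Does Collator list "ar" among the locales it has data for?
        boolean listed = Arrays.asList(Collator.getAvailableLocales()).contains(arabic);
        System.out.println("ar listed  = " + listed);

        Collator collator = Collator.getInstance(arabic);
        System.out.println("impl class = " + collator.getClass().getName());
        System.out.println("strength   = " + collator.getStrength());

        // Arbitrary sample input: ALEF (U+0627) and BEH (U+0628), written as escapes.
        String alef = "\u0627";
        String beh = "\u0628";
        System.out.println("alef vs beh  = " + compare(arabic, alef, beh));
        System.out.println("beh vs alef  = " + compare(arabic, beh, alef));
        System.out.println("alef vs alef = " + compare(arabic, alef, alef));
    }
}
```

Running this on both builds and diffing the output would tell us whether 
the collator for "ar" is obtained at all, and whether its ordering changed.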

