JDK 1.8.0 33/40, diacritics and file problems
Fabrizio.Giudici at tidalwave.it
Fri Apr 24 23:39:15 UTC 2015
OK, I've run into many problems with diacritics in the past, caused by
some JDK bugs, but I assumed they had all been fixed by now. But perhaps
there's something I'm not understanding.
I have several files with diacritics in their names, e.g. "La
Cathédrale Engloutie.m4a". A catalog contains their names; it was
prepared on Mac OS X with JDK 1.8.0_40 and saved with UTF-8 encoding. The
catalog is read, of course specifying UTF-8 as the encoding, on a Raspberry
Pi running Raspbian with JDK 1.8.0_33. Everything looks correct, as I see
the proper characters in the UI and in the log files.
The problem arises when I try to open a file with diacritics in its name
(this happens only with some of these files, not all of them): I get an
exception because the file is not found (with both java.io and java.nio).
Thanks to some suggestions, I made it work by passing the file name
through Paths.get(Normalizer.normalize(path.toString(), NFD)). This
transforms the initial encoding for the é from c3 a9 (doesn't work) to 65
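
For illustration, here is a minimal sketch of the workaround described above. The class and helper names (NfdPathDemo, nfd, toNfd, utf8Hex) are made up for this example; only Normalizer.normalize and Paths.get come from the JDK. It shows how the composed é (U+00E9) differs at the byte level from its decomposed NFD form (e followed by the combining acute accent U+0301) once encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.text.Normalizer;

// Hypothetical demo class; the names below are illustrative only.
final class NfdPathDemo {

    /** Converts a string to Unicode Normalization Form D (canonical decomposition). */
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** The workaround from the message: rebuild the path from its NFD string form. */
    static Path toNfd(Path path) {
        return Paths.get(nfd(path.toString()));
    }

    /** Space-separated hex dump of the UTF-8 bytes of a string. */
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "\u00e9"; // é as a single code point
        System.out.println(utf8Hex(composed));      // c3 a9
        System.out.println(utf8Hex(nfd(composed))); // 65 cc 81
    }
}
```

In application code the helper would be applied just before opening the file, e.g. Files.newInputStream(NfdPathDemo.toNfd(path)). Whether that is the right form depends on how the names are actually stored on disk, which this sketch does not check.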
Now, first of all, I don't understand why I have to take care of this. I'm
aware that different file systems use different encodings, but I assumed
that all the conversions were done by the JVM. BTW, both systems are configured
The Java system properties are:
sun.io.unicode.encoding: UnicodeLittle (ARM) sun.io.unicode.encoding:
The files on the ARM were rsynced from the Mac. I'm not sure whether
LC_ALL/LANG/whatever had already been set when the rsync was performed.
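
Since it is not certain in which form the rsynced names ended up on disk, a small diagnostic can report it for every entry of a directory. This is only a sketch: the class name NameFormReport and the classify helper are invented for the example, and it relies on Normalizer.isNormalized from the JDK. Run on the Raspberry Pi, it would show whether the failing files are the ones stored in decomposed form.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.text.Normalizer;

// Hypothetical diagnostic; the names below are illustrative only.
final class NameFormReport {

    /**
     * Classifies a file name by normalization form:
     * "both"  - already in NFC and NFD (e.g. plain ASCII),
     * "NFC"   - composed only,
     * "NFD"   - decomposed only,
     * "mixed" - in neither form.
     */
    static String classify(String name) {
        boolean nfc = Normalizer.isNormalized(name, Normalizer.Form.NFC);
        boolean nfd = Normalizer.isNormalized(name, Normalizer.Form.NFD);
        if (nfc && nfd) {
            return "both";
        }
        if (nfc) {
            return "NFC";
        }
        if (nfd) {
            return "NFD";
        }
        return "mixed";
    }

    public static void main(String[] args) throws IOException {
        // Directory to inspect; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                System.out.println(classify(name) + "  " + name);
            }
        }
    }
}
```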
If it's correct that I have to deal with it, is there any official
documentation I can reference? BTW, I don't understand why NFD
normalisation is the one that works, and not one of the other three.
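
To make the comparison between the four forms concrete, the following sketch prints the UTF-8 bytes of é under each of them. The class name NormalizationForms and the hexOf helper are invented for this example. One plausible explanation, which is an assumption and not something established in this message, is that HFS+ on Mac OS X stores file names in a decomposed form, so rsync carried decomposed names to the Raspberry Pi, where the file system treats names as plain byte sequences and does not normalise them on lookup.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical demo class; the names below are illustrative only.
final class NormalizationForms {

    /** Normalizes s to the given form and returns its UTF-8 bytes as hex. */
    static String hexOf(String s, Normalizer.Form form) {
        String normalized = Normalizer.normalize(s, form);
        StringBuilder sb = new StringBuilder();
        for (byte b : normalized.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "\u00e9"; // é as a single code point
        for (Normalizer.Form form : Normalizer.Form.values()) {
            System.out.println(form + ": " + hexOf(composed, form));
        }
        // For this character:
        //   NFC and NFKC give c3 a9    (composed)
        //   NFD and NFKD give 65 cc 81 (decomposed)
    }
}
```

For é, NFKD yields the same bytes as NFD, because the character has only a canonical decomposition. The compatibility forms differ from the canonical ones only for characters such as ligatures or full-width letters.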
Fabrizio Giudici - Java Architect @ Tidalwave s.a.s.
"We make Java work. Everywhere."
http://tidalwave.it/fabrizio/blog - fabrizio.giudici at tidalwave.it
More information about the openjfx-dev