JDK 1.8.0 33/40, diacritics and file problems

Fabrizio Giudici Fabrizio.Giudici at tidalwave.it
Tue Apr 28 13:06:41 UTC 2015

On Mon, 27 Apr 2015 15:13:46 +0200, Mike Hearn <mike at plan99.net> wrote:

> Thus this may not be a bug in Java so much as a design problem/oversight
> with the operating systems themselves.
> Note that the issue you're running in to is *not* to do with encodings.
> It's not a UTF-8 vs UTF-16 type issue. Rather, the issue is that Unicode
> allows visually identical strings to be represented differently at the
> logical layer, using different sequences of code points.

Yes, I understand.

> You didn't say what app originally saved the files. However, what exact

They were rsynced from Mac OS X. At first I thought the problem could be
related to the piece of software that brought the files onto the RPi, but
thinking about it more generally, a user could transfer the files in any
number of ways, and I must be able to deal with all of them.

> sequence of code points you get on disk for a given piece of human
> readable text can depend on things as varying as what input method
> editor the user typed the file name with, precisely what combination of
> keys they pressed and when, what libraries the app used, and so on.
> Yes it's a mess.
> If you encounter such situations frequently then your best bet may be to
> simply write a little wrapper that tries different normalisations until
> it finds one that works.
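
A minimal sketch of such a wrapper, for illustration only. The class and
method names (NormalizedPathResolver, candidates, resolve) are hypothetical
and not part of any library; only java.text.Normalizer and java.nio.file
are real JDK APIs. It assumes that trying the two canonical forms, NFC and
NFD, is enough; the compatibility forms (NFKC, NFKD) are left out because
they can change the characters themselves.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper: tries a file name as given, then in NFC, then in NFD.
class NormalizedPathResolver {

    /** Distinct spellings of the name: as given, then NFC, then NFD. */
    static List<String> candidates(String name) {
        Set<String> seen = new LinkedHashSet<>(); // keeps order, drops duplicates
        seen.add(name);
        seen.add(Normalizer.normalize(name, Normalizer.Form.NFC));
        seen.add(Normalizer.normalize(name, Normalizer.Form.NFD));
        return new ArrayList<>(seen);
    }

    /** Returns the first candidate that exists in dir, if any. */
    static Optional<Path> resolve(Path dir, String name) {
        for (String candidate : candidates(name)) {
            try {
                Path path = dir.resolve(candidate);
                if (Files.exists(path)) {
                    return Optional.of(path);
                }
            } catch (InvalidPathException e) {
                // This spelling cannot be represented as a path on this
                // platform; move on to the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

A caller would use NormalizedPathResolver.resolve(dir, name) wherever it
previously called dir.resolve(name) directly, and treat an empty result as
"file not found under any spelling".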

I feared that. In the end it might even be reasonably doable, if I can
take advantage of some preconditions. For instance: is it safe to assume
that, on a given filesystem instance, everything is encoded/normalised in
the same way? In that case I could run a quick test when the application
starts, find the correct normalisation once and for all, and then always
apply it. Otherwise, I have to try every combination for each file that I
open...
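
A rough sketch of what such a start-up test could look like. The names
(NormalisationProbe, detectFromNames, detectInDirectory) are hypothetical.
It only reports what it observes in one directory: filesystems that store
names as uninterpreted bytes, as is usual on Linux, can hold a mix of
forms, so the probe returns an empty result both when there is no evidence
and when the evidence is contradictory. In those cases the per-file
fallback is still needed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical probe: guesses which canonical form a set of names uses.
class NormalisationProbe {

    /** NFC or NFD if every informative name agrees; empty otherwise. */
    static Optional<Normalizer.Form> detectFromNames(Iterable<String> names) {
        boolean sawNfc = false;
        boolean sawNfd = false;
        for (String name : names) {
            boolean nfc = Normalizer.isNormalized(name, Normalizer.Form.NFC);
            boolean nfd = Normalizer.isNormalized(name, Normalizer.Form.NFD);
            if (nfc == nfd) {
                continue; // e.g. plain ASCII: valid in both forms, no evidence
            }
            if (nfc) {
                sawNfc = true;
            } else {
                sawNfd = true;
            }
        }
        if (sawNfc == sawNfd) {
            return Optional.empty(); // nothing informative, or mixed forms
        }
        return Optional.of(sawNfc ? Normalizer.Form.NFC : Normalizer.Form.NFD);
    }

    /** Applies the probe to the entries of a single directory. */
    static Optional<Normalizer.Form> detectInDirectory(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.forEach(p -> names.add(p.getFileName().toString()));
        }
        return detectFromNames(names);
    }
}
```

If the probe returns a form, names can be normalised with
Normalizer.normalize(name, form) before opening; if it returns empty, the
application has to fall back to trying each form per file.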

Fabrizio Giudici - Java Architect @ Tidalwave s.a.s.
"We make Java work. Everywhere."
http://tidalwave.it/fabrizio/blog - fabrizio.giudici at tidalwave.it

More information about the openjfx-dev mailing list