A bug in filesystem bootstrap (unix/ linux) prevents
Ulf.Zibis at CoSoCo.de
Mon Aug 27 08:47:11 UTC 2012
What does "jnu" stand for?
You have touched on 3 "classes" of system encodings:
- default encoding of the file content
- the encoding of the file path
- the "text" encoding used when calling the platform APIs
Are there more?
> so in theory file.encoding should be used only for the encoding of "file content", and
> the sun.jnu.encoding should be used when you need the encoding to talk to those platform APIs
Which property is used for the encoding of the file path?
The documentation of Charset.defaultCharset() does not specify which of those 3 (or more?) "classes" the method refers to.
IMHO this should be done!
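To see what a given JVM actually reports for these settings, something like the following minimal sketch may help. The class name ShowEncodings and the describe() helper are made up for illustration, and sun.jnu.encoding is an implementation-specific property, so it may well be null on other JVMs.

```java
import java.nio.charset.Charset;

public class ShowEncodings {
    /** Collects the encoding-related settings discussed in this thread into one line. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                // implementation-specific property; may be null outside Sun/Oracle/OpenJDK JVMs
                + ", sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it under different locale settings shows which of the values follow the platform. It does not tell which of the "classes" above each value is meant for.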
On 05.07.2012 09:52, Xueming Shen wrote:
> The code cited is a bit of a shortcut. It would break if some locale out there
> really used UTF-16, or any encoding that needs to switch/shift into ASCII (or its
> single-byte charset area) with a shift-in/shift-out character. So far I'm not aware
> of any such locale on any of our supported platforms. Historically, this kind of
> assumption might run into trouble when ported to another platform, such as an
> EBCDIC-based system (but I don't think that's a problem in this case). Ideally, the
> code should probably be able to deal with a multi-byte "/", but obviously it was
> decided to take the shortcut for better performance here.
>
> "We" have long taken the stand that file.encoding is an informative/read-only
> system property, mainly for two reasons. First, this property is really
> defined/implemented/used as the default encoding that the JVM uses to communicate
> with the underlying platform for locale/encoding-sensitive stuff: the default
> encoding of the file content, the encoding of the file path and the "text"
> encoding when using the platform APIs, for example. It's like a "contract"
> between the JVM and the underlying platform; it needs to be understood and agreed
> on by both. So it needs to be set based on what your underlying system is using,
> not to something you choose via either -D or System.setProperty. If your
> underlying locale is not UTF-16, I don't think you should expect the JVM to work
> correctly if it keeps "talking" UTF-16 to the underlying system, for example by
> passing in a file name in UTF-16 when you are running in a UTF-8 locale (it is
> more complicated on Windows, where you have a system locale and a user locale, and
> historically file.encoding was used for both; consider what happens if your system
> locale and user locale are set differently...).
>
> The property sun.jnu.encoding, introduced in JDK 6 (mainly to address the issue
> we have with file.encoding on Windows, though), helps remove some "pressure" from
> file.encoding. So in theory file.encoding should be used only for the encoding of
> "file content", and sun.jnu.encoding should be used when you need the encoding to
> talk to those platform APIs, so something might be done here (currently
> file.encoding and sun.jnu.encoding are set to the same thing on non-Windows
> platforms).
>
> The other reason is the timing of how file.encoding is initialized and how it is
> used during the "complicated" system initialization stage; almost everyone who has
> touched System.initializeSystemClass() got burned here and there in the past :-)
> So sometimes you have to ask whether it is worth the risk to change something just
> to make it work for a use scenario that is not "supported". That said, as I said
> above, something might be done to address this issue, but it is obviously not a
> priority for now.
>
> If you want to do -Dfile.encoding=xyz, you are on your own: it might work, it
> might not.
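As a small illustration of the "read-only" stand, the sketch below changes file.encoding through System.setProperty at runtime and checks whether Charset.defaultCharset() follows. The class and method names are hypothetical. On the JDKs I know of, the default charset is fixed at startup or on first use, so the method should return true, but that is an observation about current implementations, not a documented guarantee.

```java
import java.nio.charset.Charset;

public class RuntimeOverride {
    /** Returns true if changing file.encoding at runtime leaves the default charset untouched. */
    static boolean defaultCharsetUnchangedAfterOverride() {
        Charset before = Charset.defaultCharset();
        String saved = System.getProperty("file.encoding");
        // Pick a value that differs from the current default.
        String other = before.name().equals("UTF-16") ? "ISO-8859-1" : "UTF-16";
        System.setProperty("file.encoding", other);
        try {
            return Charset.defaultCharset().equals(before);
        } finally {
            // Restore the property so the rest of the program is not affected.
            if (saved != null) {
                System.setProperty("file.encoding", saved);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultCharsetUnchangedAfterOverride());
    }
}
```

Passing -Dfile.encoding=xyz on the command line is a different case, since the value is then visible during startup; that is the "on your own" scenario described above.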