RFR 8124977 cmdline encoding challenges on Windows

Xueming Shen xueming.shen at oracle.com
Mon Jul 20 18:50:21 UTC 2015

On 07/20/2015 10:22 AM, Kirk Shoop wrote:
> So when the default system locale differs from the active one, we have different behavior on Linux and Windows. The new options allow a Windows user to select the same behavior that one would expect on Unix. The switches can certainly be removed, if the compatibility impact is acceptable.

Kirk, on Windows file.encoding comes from the user locale and sun.jnu.encoding comes from
the system locale setting. sun.jnu.encoding exists purely so that the text-encoding-sensitive
jnu functions can communicate with the underlying Windows system API when the system
locale and the user locale are set to different values. On Unix/Linux/OS X, these two are
always set to the same value. Yes, there might be input/output issues if the encoding used
by the console (OEM codepage) is not compatible with the encoding used by the "user locale"
and you are trying to use System.in/out/err for input/output to the console.
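
To see the two values side by side on a given machine, a small probe like the following can help. It is only a sketch: the class and method names (EncodingProbe, probe) are made up for illustration, and sun.jnu.encoding is an internal property that may be absent (null) on some runtimes.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of the JDK or of the webrev under review.
public class EncodingProbe {

    // Collects the encoding-related settings discussed above.
    // Values may be null if a runtime does not define the property.
    static Map<String, String> probe() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("file.encoding", System.getProperty("file.encoding"));
        m.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        m.put("defaultCharset", Charset.defaultCharset().name());
        return m;
    }

    public static void main(String[] args) {
        // On Windows with different user and system locales, the first two
        // lines may show different code pages.
        probe().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Running this once under each locale configuration shows whether the two properties have diverged.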

Here is the original CCC request regarding the sun.jnu.encoding, which might provide
some background info.


If you/we are NOT going to change the encoding used by the underlying console, I don't
think we need to, or should, change the "encoding" used by java.io.Console. As I suggested
in my previous email, the Java_java_io_Console_encoding() implementation probably
needs to be updated to return utf8 if cp == 65001 (that code was written 10 years ago, and
I'm not sure 65001 was really in use back then).  My understanding of
the issue here is that if you continue to use the "A" version of the API to parse/get the
arguments, and try to solve the possible issue triggered by the "incompatibility" between the
OEM encoding used by the console and the user locale encoding used by
System.in/out/err, it's fine to define a new system property to specify a preferred encoding for the
launcher to use, but this "preferred" encoding should not be used by java.io.Console.
But isn't it more reasonable to simply always use the "W" version for this purpose in
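
The cp == 65001 suggestion above can be sketched in Java as follows. This is only an illustration of the mapping, not the native Java_java_io_Console_encoding() code: the class and method names are hypothetical, and the "cp" + number fallback is an assumed naming convention for the non-UTF-8 case.

```java
import java.nio.charset.Charset;

// Hypothetical sketch; the real logic lives in native code and is not shown here.
public class ConsoleEncodingSketch {

    // Maps a Windows console code page number to a charset name.
    // 65001 is the Windows code page identifier for UTF-8.
    static String encodingNameFor(int cp) {
        if (cp == 65001) {
            return "UTF-8";
        }
        // Assumption: other code pages are named "cp<number>".
        return "cp" + cp;
    }

    // Resolves the name to a Charset, falling back to the default
    // charset when this runtime does not support the code page.
    static Charset charsetFor(int cp) {
        String name = encodingNameFor(cp);
        return Charset.isSupported(name) ? Charset.forName(name)
                                         : Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(encodingNameFor(65001));
        System.out.println(charsetFor(65001));
    }
}
```

The point of the sketch is that the console encoding follows whatever code page the terminal reports, independent of any launcher-level "preferred" encoding.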


> (2)
> The defaultUnicodeCharset() method is in Charset because it is called from the java.io.Console() constructor as well as from LauncherHelper, so its scope should be wider than just LauncherHelper.java.
> Kirk and Valery
>> -----Original Message-----
>> From: Xueming Shen [mailto:xueming.shen at oracle.com]
>> Sent: Saturday, July 11, 2015 11:51 AM
>> Hi Kirk,
>> Two questions
>> (1) Why do you need to change the "encoding" used by the java.io.Console class? My understanding
>>        is that the console encoding is specifically used to "talk" to the underlying terminal, so it should just
>>        be the one used by the underlying terminal/console. I don't think the proposed change updates
>>        the underlying console encoding (something like chcp) when -Dwindows.UnicodeConsole=true
>>        is specified, if I read the webrev correctly. Instead, Java_java_io_Console_encoding() probably
>>        needs to be updated to return utf8 if cp == 65001, so that if the underlying terminal/console is
>>        using cp65001, java.io.Console encodes/decodes in utf8.
>>        I would assume the encoding of java.io.Console should have nothing to do with using
>>        GetCommandLineW() to parse the arguments in Unicode in the launcher?
>> (2) Why do you need a defaultUnicodeCharset() in Charset class? Seems to me the scope should/could
>>        be limited inside LauncherHelper.java?
>> Thanks,
>> -Sherman

More information about the core-libs-dev mailing list