<Sound Dev> 8177951: Charset problem when the name of the sound device contains Chinese character.

Sergey Bylokhov sergey.bylokhov at oracle.com
Tue Jul 18 00:38:15 UTC 2017

I uploaded the current patch to cr.openjdk: 

I have tested the patch and here are my observations:
- The patch works for direct devices, but it looks like the same bug exists in Ports (it is also reproduced by your test case). Did you have a chance to look into this issue as well?
- The JDK build treats warnings as errors, so the patch currently fails to build because of this warning:
PLATFORM_API_WinOS_DirectSound.cpp(93) : warning C4267: 'initializing' : conversion from 'size_t' to 'DWORD', possible loss of data
- Note that memory allocated with "new[]" must be deallocated with "delete[]", but the current fix uses plain "delete".
- Can you please sign and submit the OCA[1], which will allow you to contribute to OpenJDK?

[1] http://www.oracle.com/technetwork/community/oca-486395.html 
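
The two code-level points in the list above (the C4267 narrowing warning and pairing new[] with delete[]) can be illustrated in portable C++. This is only a minimal sketch, not the patch itself: Dword is a stand-in for the Windows DWORD type, and checkedLength and makeZeroedBuffer are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <stdexcept>

// Stand-in for the Windows DWORD type (a 32-bit unsigned integer); an
// assumption so that the sketch builds without <windows.h>.
typedef std::uint32_t Dword;

// Hypothetical helper: narrow a size_t to Dword explicitly, so the
// conversion is visible and range-checked instead of implicit (the
// implicit form is what triggers warning C4267 on 64-bit builds).
inline Dword checkedLength(std::size_t n) {
    if (n > std::numeric_limits<Dword>::max()) {
        throw std::length_error("length does not fit in a DWORD");
    }
    return static_cast<Dword>(n);
}

// Hypothetical helper: a zero-filled char array owned by
// std::unique_ptr<char[]>, which releases it with delete[] (the form
// that matches new[]), so no manual delete is needed at the call site.
inline std::unique_ptr<char[]> makeZeroedBuffer(Dword length) {
    // The trailing () value-initializes every element to 0.
    std::unique_ptr<char[]> buffer(new char[length]());
    return buffer;
}
```

With raw pointers the equivalent fix is simply to write "delete[] ptr" wherever the memory came from "new T[n]".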

----- cqjjjzr at 126.com wrote: 
Please review this bug report: https://bugs.openjdk.java.net/browse/JDK-8177951 

A brief description of the issue: 
On non-English Windows, the DirectAudioDeviceProvider doesn't work properly: AudioSystem.getMixerInfo()[0].getName() (or any other index, as long as the mixer name contains non-ASCII characters) returns a corrupted string (all non-ASCII characters are garbled).
The root cause is in the native code: we get the string in the ANSI (platform-dependent) charset, but the code treats it as a UTF-8 string, so the JVM decodes the ANSI bytes as if they were UTF-8.
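
To make the mismatch concrete, here is a minimal portable sketch of a structural UTF-8 check. It assumes, as an example, that the ANSI code page is GBK (the default on Simplified Chinese Windows); looksLikeUtf8 is a hypothetical helper and deliberately simplified (it does not reject overlong forms or surrogates).

```cpp
#include <cstddef>
#include <string>

// Returns true if `bytes` is structurally valid UTF-8: every lead byte is
// followed by the right number of 10xxxxxx continuation bytes.
inline bool looksLikeUtf8(const std::string& bytes) {
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;
        if (lead < 0x80) {
            extra = 0;                       // ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1;                       // 2-byte sequence
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2;                       // 3-byte sequence
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3;                       // 4-byte sequence
        } else {
            return false;                    // stray continuation or invalid lead
        }
        if (extra > bytes.size() - i - 1) {
            return false;                    // sequence is cut off
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false;                // not a continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

For example, the GBK bytes D6 D0 CE C4 (the two characters of "Chinese" as in "Chinese language") fail this check, while the UTF-8 bytes E4 B8 AD E6 96 87 for the same text pass. Pure ASCII names pass either way, which is why the bug only shows up with non-ASCII device names.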

Detailed description: 
The symptoms of the bug are described in the link above, so I'll talk about the cause of the issue. All the research below is based on OpenJDK 9, but I think it also applies to OpenJDK 8.

In jdk/src/java.desktop/windows/native/libjsound/PLATFORM_API_WinOS_DirectSound.cpp, function DS_GetDesc_Enum, line 236, the name of the device is obtained from the OS (the function is called by DirectSoundDeviceEnumerate) in the ANSI charset, as an LPCSTR, and the ANSI-encoded string is simply copied into the DirectAudioDeviceDescription struct. Now look at jdk/src/java.desktop/share/native/libjsound/DirectAudioDeviceProvider.c, functions getDirectAudioDeviceDescription and Java_com_sun_media_sound_DirectAudioDeviceProvider_nNewDirectAudioDeviceInfo, lines 48 and 98: NewStringUTF is called with an ANSI-encoded string. So the ANSI bytes get interpreted as UTF-8, when what we obviously need to pass is a UTF-8 encoded Unicode string.

I wrote to Oracle, but they couldn't reproduce the issue, so I went on to fix the bug myself. I wrote a function to convert an ANSI string to a UTF-8 encoded Unicode string.
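
The second half of such a conversion (wide characters to UTF-8) can be sketched without the Windows API. This is an illustrative, portable stand-in rather than the code in the patch: utf16ToUtf8 is a hypothetical name, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch. Note also that NewStringUTF expects JNI "modified UTF-8", which matches standard UTF-8 for characters in the Basic Multilingual Plane (other than NUL) but encodes supplementary characters differently.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode UTF-16 code units as standard UTF-8 bytes.
// Unpaired surrogates are replaced with U+FFFD.
inline std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {            // high surrogate
            if (i + 1 < in.size() &&
                in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Combine the surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;                           // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                               // unpaired low surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The first half (ANSI bytes to wide characters) depends on the active code page, which is why the patch relies on MultiByteToWideChar with CP_ACP for that step.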

Then I found a problem: in Multi-Byte compiling mode, DirectSoundDeviceEnumerate resolves to DirectSoundDeviceEnumerateA, which passes an ANSI string as the argument, but in Unicode mode it resolves to DirectSoundDeviceEnumerateW, which passes a wide-character (UTF-16) Unicode string instead of an ANSI one. So I think it's necessary to check whether the compiler is in Unicode mode (by checking the UNICODE macro), and to do the ANSI conversion only in Multi-Byte mode.
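
The compile-time switch can be sketched as follows. This is a portable illustration only: DeviceNameChar and nameNeedsAnsiConversion are hypothetical names that mimic how the Windows headers select the A or W variant of an API depending on whether UNICODE is defined.

```cpp
#include <cstddef>

// With UNICODE defined, the Windows headers map generic API names to the
// ...W functions (wide strings); without it, to the ...A functions (ANSI
// strings). The same switch is mimicked here with a hypothetical typedef.
#ifdef UNICODE
typedef wchar_t DeviceNameChar;   // wide characters from the W variant
#else
typedef char DeviceNameChar;      // ANSI bytes from the A variant
#endif

// Hypothetical helper: true when device names arrive as single-byte ANSI
// characters and therefore still need the ANSI -> UTF-8 conversion.
inline bool nameNeedsAnsiConversion() {
    return sizeof(DeviceNameChar) == 1;
}
```

In a build without UNICODE defined, nameNeedsAnsiConversion() returns true, which corresponds to the branch that calls the conversion function in the patch.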

However, I don't have a debugging environment: I have a problem configuring the compiler for OpenJDK. The ./configure script fails with LINK : error LNK2001: unresolved external symbol _mainCRTStartup, so I can't verify that the patch is valid. I'd be grateful if someone could help solve the configuration problem or test the patch for me. Even just confirming that the JDK compiles with the patch would help.
If you'd like to test the patch, you can test it with the first device from DirectSoundDeviceEnumerate, 'Primary Sound Driver'. In case you don't have a Chinese font, I'll attach a picture of the correct output.

The patch is below and also attached to this e-mail. It applies to OpenJDK 9, and maybe to 8 with some changes.
The code in the picture is only for generating the output in Unicode mode, so it's not applicable to the JDK.


*** old/jdk/src/java.desktop/windows/native/libjsound/PLATFORM_API_WinOS_DirectSound.cpp 2017-06-21 03:57:42.000000000 +0800 
--- new/jdk/src/java.desktop/windows/native/libjsound/PLATFORM_API_WinOS_DirectSound.cpp 2017-06-24 16:26:57.232247800 +0800 
*** 86,91 **** 
--- 86,113 ---- 
static UINT64 g_lastCacheRefreshTime = 0; 
static INT32 g_mixerCount = 0; 

+ /// FIX BUG JDK-8177951: Convert ANSI encoded string to UTF-8 encoded string 
+ LPCSTR ANSIToUTF8(const LPCSTR& lpAnsiStr) 
+ { 
+ // ANSI -> Unicode 
+ DWORD dwAnsiLen = strlen(lpAnsiStr); 
+ DWORD dwUnicodeLen = ::MultiByteToWideChar(CP_ACP, 0, lpAnsiStr, -1, NULL, 0); 
+ LPWSTR lpUnicodeStr; 
+ lpUnicodeStr = new WCHAR[dwUnicodeLen]; 
+ memset(lpUnicodeStr, 0, (dwUnicodeLen) * sizeof(WCHAR)); 
+ MultiByteToWideChar(CP_ACP, 0, lpAnsiStr, -1, lpUnicodeStr, dwUnicodeLen); 
+ // Unicode -> UTF8 
+ LPSTR lpUTF8Str; 
+ DWORD dwUTF8Len; 
+ dwUTF8Len = WideCharToMultiByte(CP_UTF8, 0, lpUnicodeStr, -1, NULL, 0, NULL, NULL); 
+ lpUTF8Str = new CHAR[dwUTF8Len]; 
+ memset(lpUTF8Str, 0, sizeof(CHAR) * (dwUTF8Len)); 
+ WideCharToMultiByte(CP_UTF8, 0, lpUnicodeStr, -1, lpUTF8Str, dwUTF8Len, NULL, NULL); 
+ delete lpUnicodeStr; 
+ return lpUTF8Str; 
+ } 
BOOL DS_lockCache() { 
/* dummy implementation for now, Java does locking */ 
return TRUE; 
*** 233,239 **** 
--- 255,267 ---- 

INT32 cacheIndex = findCacheItemByGUID(lpGuid, g_audioDeviceCache[desc->deviceID].isSource); 
if (cacheIndex == desc->deviceID) { 
+ #ifndef UNICODE 
+ LPCSTR utf8EncodedName = ANSIToUTF8(lpstrDescription); 
+ strncpy(desc->name, utf8EncodedName, DAUDIO_STRING_LENGTH); 
+ delete utf8EncodedName; 
+ #else 
strncpy(desc->name, lpstrDescription, DAUDIO_STRING_LENGTH); 
+ #endif 
//strncpy(desc->description, lpstrModule, DAUDIO_STRING_LENGTH); 
desc->maxSimulLines = -1; 
/* do not continue enumeration */ 


Charlie Jiang 
