RFR: JDK-8073152: Update Standard/ExtendedCharsets to work with module system

Xueming Shen xueming.shen at oracle.com
Mon Feb 16 18:46:09 UTC 2015

On 2/16/15 4:11 AM, Alan Bateman wrote:
> On 14/02/2015 20:30, Xueming Shen wrote:
>> Hi,
>> Please help review the changes for  JDK-8073152
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8073152
>> Webrev: http://cr.openjdk.java.net/~sherman/8073152/webrev/
>> This change is to re-organize how the "standard" and "extended" 
>> charsets repository are
>> built for the jdk/jre images. Before module system, the "standard" 
>> charset provider is built
>> into rt.jar and the "extended" charsets provider is in charsets.jar, 
>> which is located in the
>> "lib/ext" directory. both are available during the system boot time. 
>> In jigsaw, the extended
>> charsets provider is in jdk.charset (vs the "standard" is in 
>> java.base) module and might not
>> be "visible" during the early phase of system classes loading, which 
>> might cause problem
>> as some of the charsets now in "extended" charsets may needed during 
>> the boot stage. 
> Right, the overall goal is to ensure that we only execute in code 
> java.base during startup (essentially until VM.isBooted returns true) 
> so this means fixing a number of issues.
> Another thing to say is that the compatibility impact will be that 
> attempting to start with file.encoding set to a charset that remains 
> in jdk.charset will not work but then again, startup up with this 
> property set has never been supported anyway. The other impact for 
> embedded environments is increased static footprint of java.base but I 
> think that can be dealt with via filtering when creating the runtime 
> image for the target platform.
> So overall then I think these changes look good. At some point I think 
> that the private Charset.lookupExtendedCharset should fail if 
> VM.isBooted() is false. Also I think ExtendedProviderHolder should go 
> away and the extended charsets loaded via ServiceLoader.

Yes, we have a separate P2 bug for that, will address that separately later.

> Another thing that comes to mind is the Charset.isSupported and 
> forName methods, we might need to think about adding a note to their 
> javadoc to say that some charsets might not be available during 
> startup. Chris is penning similar text for URL protocol handlers.
> A few small comments about the changes in the webrev:
> 1. Is list_old needed, I can't tell if this is checked in for use by 
> future archaeologists.
No, it's not used by any code. I just added it the last minute to be a 
reference for the future, as
I'm deleting the original list sbcs, dbcs and extsbcs.

> 2. Hasher.java is showing up in the webrev as a new file, was this 
> build.tools.hasher.Hasher and so we know have two copies?
"This" Hasher.java has a different use interface and works on a pair of 
key/value lists directly (instead
of parsing the key/value from a std in). Given the implementation is 
simple/small enough, I just copied/
pasted the parts I need from Mark's code, left the original one 
untouched (I kinda remember it is used
by other as well)

> 3. In Typo in SPI.java line 136 ("Unknow type"). As this is a 
> built-time tool then it's not a big deal but it is possible to use the 
> streams API in more succulent ways, for example List<String> 
> aliasValues = cs.aliases().stream().map(a -> 
> a.toLowerCase(Locale.ENGLISH)).collect(Collectors::toList);
Typo fixed.

The "aliasValues" and "aliasKeys" pair is a little tricky.  Two are 
supposed to collect the alias->csname pair
for all charsets, with a mapping relation but in two separate lists (to 
pass to the Hasher). While it's possible
to collect them separately (or them into a "entry" as a pair) with the 
.collect(), but it appears to me more
nature to do just in once pass.

> 4. Are there any tests that need to be updated?
Only NIOCharsetAvailabilityTest.java. But I believe that test should not 
stop working for a while as
it tried to "parse" the rt.jar and charsets.jar to collect the charset 
class names, which obviously
does not work in module. It need to be rewritten with the new iterating 
mechanism. (it is supposed
to be the last file in webrev. But it appears it was not there in first 

> Finally, I see the sun.awt.motif.** classes are changed to importing 
> using wildcard, a subtle trick as the package name gets decided at 
> build time. Hopefully someone in the client area can sort this 
> dependency issue soon.

Phil is moving away from using those sun.nio.cs.ext directly. My change 
in motif should have no conflict
with his changes, in fact they probably no longer needed. Just be there 
in case I will push first :-)



> -Alan

More information about the build-dev mailing list