Proposal: #ModuleNameCharacters

David M. Lloyd david.lloyd at
Thu Dec 8 16:26:40 UTC 2016

On 12/08/2016 04:26 AM, forax at wrote:
> ----- Original Message -----
>> From: "mark reinhold" <mark.reinhold at>
>> To: forax at
>> Cc: jpms-spec-experts at
>> Sent: Thursday, December 8, 2016 00:40:17
>> Subject: Re: Proposal: #ModuleNameCharacters
>> 2016/12/6 0:08:58 -0800, forax at
>>> 2016/11/29 16:11:02 -0800, mark.reinhold at
>>>> ...
>>>> As I wrote in my reply to David, I'm open to lifting the traditional
>>>> restrictions on the class-file representation of qualified names in the
>>>> case of module names.
>>> Ok, cool.
>>>>                         Given the weight of tradition and the past value
>>>> of the existing constraints, however, I'd like to have a more compelling
>>>> reason than "some future hypothetical module system might need this
>>>> flexibility".
>>> Existing constraints exist because a package name is a part of a
>>> qualified class name. There is no tradition for module names. Module
>>> names in the class file are not mixed with other constrained names, so
>>> i see no compelling reason to add arbitrary rules to try to restrict
>>> module names.
>> Okay, okay ... taken together with David's examples, I get the point.
>> (Personally I've always considered the whole `.`-to-`/` mapping kind
>> of archaic anyway.)
>>> Note that JLS module names have to be parsed by the compiler, so for
>>> JLS module names, having the same constraints as any other qualified
>>> identifier makes sense, but here we're talking about module names in
>>> the JVM spec, not in the JLS.
>> Correct.
>>> Now, the constant pool is typed and structured, if we want to have
>>> constraints on module names, in my opinion, we should introduce a new
>>> constant pool item to make it clear that module names are not plain
>>> names but specific names exactly like there is a Class constant pool
>>> item.
>> Agreed.  This is, in fact, an inconsistency in the present proposal,
>> since it imposes constraints on otherwise untagged CONSTANT_Utf8
>> structures.  If we're going to impose constraints on free-standing
>> module and package names then we should introduce the obvious new
>> `CONSTANT_Module_info` and `CONSTANT_Package_info` structures.
>>> And with my ASM hat on, having to add replace('.', '/') and replace('/',
>>> '.') at the right places is error-prone; if we can avoid that, it is a
>>> big win in terms of usability.
>> Yep.
>>>> In trying to think about the future I do wonder if, today, we should
>>>> reserve a character or two just in case we discover five or ten years
>>>> from now that we need to add more structure to module names.  Should
>>>> we set aside `:`, or perhaps some other character, just in case?
>>> if we want structure, we will add another constant pool item. It's
>>> what valhalla does for parameterized types.
>> So the question is, then: Which, if any, characters should we reserve?
>> Peering into the myriad alternate visions swirling around in my cloudy
>> crystal ball, I can see:
>>  - A structured namespace of modules.  `:` is a logical separator here,
>>    even in the source language if need be, so let's reserve it now in
>>    class files.
>>  - Module names encoded in class files together with specific version
>>    strings, to form compound module identifiers.  We already use `@` to
>>    separate module names from version strings in the module-system API
>>    (e.g., the result of `ModuleDescriptor::toString`), so let's reserve
>>    that in class files now.
>> (This is just my imagination, not specific suggestions for the future!)
>> Additionally we should reserve the universal escape character (`\`) and
>> for sanity also forbid any character whose code point is less than 0x20
>> (` `).  (Ideally we'd forbid all Unicode non-printing characters, but
>> it's best not to have the JVMS depend upon the Unicode specification.)
>> To sum up: Reserve `:`, `@`, and `\` for future use, and forbid the ASCII
>> non-printing characters (< 0x20).
> You also need to reserve '/' because the java launcher (-m) uses '/' to separate the module name from the main class.
> Rémi
>> David -- Are these restrictions acceptable in your use cases, or if not
>> then at least tolerable?  I'm pretty sure I've never seen any of these
>> characters in Java EE module names, JAR file base names, Maven group or
>> artifact names, or the other examples you mentioned.

Breaking it down one by one...

':' is going to break two things that I know of: modules generated from 
Maven coordinates, which have the syntax 
<groupId>:<artifactId>:<classifier>, and modules in the JBoss Modules 
static loader, which have a slot component separated from the module 
name by ':'.

'@' might be OK to reserve; I can't think of any specific conflicts, 
though we have allowed this character in the past.

'\' is a problem because JBoss Modules uses that character to escape 
':' (particularly in the Maven coordinates case) to avoid mixing up the 
slot name with the module name.  For a module named `foo\:bar:5`, the 
static module loader would treat the name component as "foo:bar" and the 
slot as "5", and locate the module accordingly.  However, the core system 
does not treat '\' specially: the proper name of this module would be 
`foo\:bar:5` according to the system, and that's the string you would 
have to use to load the module by name.
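
To make the escaping concrete, here is a rough sketch of how a loader 
could split such a string.  This is not the actual JBoss Modules code: 
the class and method names are hypothetical, and splitting at the first 
unescaped ':' (with a null slot when there is none) is an assumption 
made for illustration.

```java
// Hypothetical helper, NOT the actual JBoss Modules implementation.
final class SlotNameSketch {

    /**
     * Splits a raw identifier at the first unescaped ':' into {name, slot}.
     * A backslash escapes the character that follows it.  The slot is null
     * when no unescaped ':' is present (an assumption of this sketch).
     */
    static String[] split(String raw) {
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                // Keep the escaped character literally and drop the backslash.
                name.append(raw.charAt(++i));
            } else if (c == ':') {
                // First unescaped ':' separates the name from the slot.
                return new String[] { name.toString(), raw.substring(i + 1) };
            } else {
                name.append(c);
            }
        }
        return new String[] { name.toString(), null };
    }

    public static void main(String[] args) {
        String raw = "foo\\:bar:5"; // the ten characters foo\:bar:5
        String[] parts = split(raw);
        // What the static loader would use to locate the module:
        System.out.println("name=" + parts[0] + " slot=" + parts[1]);
        // What the core system sees: the raw string, backslash included.
        System.out.println("core name=" + raw);
    }
}
```

The raw string is still the only name the core system knows, which is 
why reserving '\' or ':' at the class-file level would break this scheme.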

'/' also may be a problem because within our container, we use file 
names from the file system as the name of modules that come from the 
file system.  This also causes a problem for '\' on Windows.  We could 
possibly work out some kind of alternative in this case, with some 
creative thinking.

Definitely in favor of forbidding non-printing code points 0x00 through 
0x1F, and probably also 0x80 through 0x9F (we probably don't want to go 
any further down the Unicode rabbit hole than that though - at least, 
not in the JVM - if we want to get out of here this side of 2020).
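
For illustration only, a check for just those two ranges could look 
like the following.  The class and method names are hypothetical and 
nothing here is taken from the JVMS or the proposal text; the point is 
that two fixed numeric ranges need no Unicode tables at all.

```java
// Hypothetical check, NOT taken from the JVMS or any proposal text.
final class ModuleNameControlCheck {

    /** True if the code point lies in 0x00-0x1F or 0x80-0x9F. */
    static boolean isForbidden(int codePoint) {
        return (codePoint >= 0x00 && codePoint <= 0x1F)
            || (codePoint >= 0x80 && codePoint <= 0x9F);
    }

    /** Returns the index of the first forbidden character, or -1 if none. */
    static int firstForbidden(String name) {
        for (int i = 0; i < name.length(); ) {
            int cp = name.codePointAt(i);
            if (isForbidden(cp)) {
                return i;
            }
            i += Character.charCount(cp); // step over surrogate pairs correctly
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstForbidden("foo\\:bar:5")); // no control characters
        System.out.println(firstForbidden("foo\tbar"));    // tab is 0x09
    }
}
```

Note that this sketch deliberately leaves ':', '@', '\' and '/' alone, 
since those are the characters still under discussion above.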

