RFC: draft API for JEP 269 Convenience Collection Factories

Paul Benedict pbenedict at apache.org
Fri Oct 9 02:39:33 UTC 2015

I don't think the statement "Creates an unmodifiable set containing X
elements" is always true. Since sets cannot have duplicates, passing in
X elements may give you fewer than that, depending on equality. I think
the Set docs should say "...X possible elements if unique". Wordsmith
something better if you can, of course.
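
To make the point concrete, here is a minimal sketch. It uses the existing
HashSet constructor rather than the draft Set.of() factory, whose duplicate
behavior is still unspecified; the helper name setOf is only an
illustration, not part of the proposed API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetSizeDemo {
    // Hypothetical helper: builds a set from its arguments with the
    // existing HashSet constructor. This is NOT the draft Set.of().
    @SafeVarargs
    static <E> Set<E> setOf(E... elements) {
        return new HashSet<>(Arrays.asList(elements));
    }

    public static void main(String[] args) {
        // Three arguments are passed, but two of them are equal.
        Set<String> s = setOf("a", "b", "a");
        System.out.println("arguments: 3, set size: " + s.size());
        // prints "arguments: 3, set size: 2"
    }
}
```

So a doc sentence that promises "X elements" only holds when the arguments
are pairwise unequal.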


On Thu, Oct 8, 2015 at 6:39 PM, Stuart Marks <stuart.marks at oracle.com>
wrote:

> Hi all,
> Please review and comment on this draft API for JEP 269, Convenience
> Collection Factories. For this review I'd like to focus on the API, and set
> aside implementation issues and discussion for later.
> JEP:
>         http://openjdk.java.net/jeps/269
> javadoc:
> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.mod/
> specdiff:
> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.specdiff/overview-summary.html
> Most of the API is pretty straightforward, with fixed-arg and varargs
> "of()" factories for List, Set, ArrayList, and HashSet; and with fixed-arg
> "of()" factories and varargs "ofEntries()" factories for Map and HashMap.
> There are a few issues on which I'd like to solicit discussion.
>
> 1. Number of fixed arg overloads.
>
> I've somewhat arbitrarily provided up to 5 fixed-arg overloads for the
> lists and sets, and up to 8 pairs for the fixed-arg map factories. The
> rationale for 8 pairs is that there are 8 primitives, and various language
> processing tools often have maps for the primitive types. (But such tools
> also often need to handle the Void type, which exceeds the limit of 8. So
> this might need to change if we want to follow this rationale.)
> I also note that Guava's immutable factories provide 11 fixed-arg
> overloads for list, 5 for set, and 5 pairs for map. I'd be curious as to
> the rationale for this, and whether it also would apply to the JDK.
>
> 2. Other concrete collection factories.
>
> I've chosen to provide factories for the concrete collections ArrayList,
> HashSet, and HashMap, since those seem to be the most commonly used. Is
> there a need to provide factories for other concrete collections, such as
> LinkedHashMap?
>
> 3. Duplicate handling.
>
> My current thinking is for the Set and Map factories to throw
> IllegalArgumentException if a duplicate element or key is detected. The
> current draft specification is silent on this point. It needs to be
> specified, one way or another.
> The rationale for throwing an exception is that if these factories are
> used in a "literal like" fashion, then having a duplicate is almost
> certainly a programming error. Consider this example:
>     Map<String,TypeUse> m = Map.ofEntries(
>         entry("CDATA",       CBuiltinLeafInfo.NORMALIZED_STRING),
>         entry("ENTITY",      CBuiltinLeafInfo.TOKEN),
>         entry("ENTITIES",    CBuiltinLeafInfo.STRING.makeCollection()),
>         entry("ENUMERATION", CBuiltinLeafInfo.STRING.makeCollection()),
>         entry("NMTOKEN",     CBuiltinLeafInfo.TOKEN),
>         entry("NMTOKENS",    CBuiltinLeafInfo.STRING.makeCollection()),
>         entry("ID",          CBuiltinLeafInfo.ID),
>         entry("IDREF",       CBuiltinLeafInfo.IDREF),
>         entry("IDREFS",
>                   TypeUseFactory.makeCollection(CBuiltinLeafInfo.IDREF)),
>         entry("ENUMERATION", CBuiltinLeafInfo.TOKEN));
> (derived from [1])
> If duplicates were silently ignored, this might result in hard-to-spot
> errors.
> There's also the matter of which value ends up being used in the case of
> duplicate map keys, and whether this should be specified. A fairly obvious
> policy would be "last one wins" but I'm reluctant to specify that, as it
> starts to place unnecessary constraints on implementations. However, the
> alternative of leaving it unspecified is also unpalatable.
> I'm aware that very few programming systems with similar constructs will
> signal an error on duplicate elements. Python, Ruby, Groovy, Scala, and
> Perl all seem to allow duplicates in maps or equivalent, apparently with a
> last-wins policy. (Though sometimes it's hard to tell if the policy is
> specified.)
> The only system I've been able to find that explicitly rejects duplicates
> is Clojure, and this policy isn't without controversy. [2] The main
> rationale is to prevent programming errors.
> There is a Python bug [3] where it was proposed that duplicates in a dict
> should raise an error or warning, also in order to catch programming
> errors. The request was rejected, not necessarily because it was a bad
> idea, but primarily because it would be a backward incompatible change.
> The easiest thing to do would simply be to require last-wins, since
> "everybody else is doing it" ... but that doesn't mean it's right. Since
> we're introducing a new API here, there is no compatibility issue. Throwing
> an exception for duplicates seems like a good way to prevent a certain
> class of programming errors.
> What do people think?
> s'marks
> [1]
> http://hg.openjdk.java.net/jdk8/jdk8/jaxws/file/d03dd22762db/src/share/jaxws_classes/com/sun/tools/internal/xjc/reader/dtd/TDTDReader.java#l420
> [2]
> http://dev.clojure.org/display/design/Allow+duplicate+map+keys+and+set+elements
> [3] https://bugs.python.org/issue16385
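
For comparison, the two duplicate-key policies discussed above could be
sketched roughly as follows. This is only an illustration built on existing
JDK classes. The names entry, strictMapOfEntries and lastWinsMapOfEntries
are hypothetical stand-ins, not the draft API, and an actual implementation
may look quite different.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class StrictMapDemo {
    // Hypothetical stand-in for the draft's entry() helper.
    static <K, V> Map.Entry<K, V> entry(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // Policy A: reject duplicate keys with IllegalArgumentException.
    @SafeVarargs
    static <K, V> Map<K, V> strictMapOfEntries(Map.Entry<K, V>... entries) {
        Map<K, V> m = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (m.containsKey(e.getKey())) {
                throw new IllegalArgumentException(
                        "duplicate key: " + e.getKey());
            }
            m.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(m);
    }

    // Policy B: last one wins; later entries silently replace earlier ones.
    @SafeVarargs
    static <K, V> Map<K, V> lastWinsMapOfEntries(Map.Entry<K, V>... entries) {
        Map<K, V> m = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            m.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(m);
    }

    public static void main(String[] args) {
        // Same shape as the ENUMERATION duplicate in the quoted example.
        Map<String, String> lw = lastWinsMapOfEntries(
                entry("ENUMERATION", "STRING"),
                entry("ENUMERATION", "TOKEN"));
        System.out.println(lw); // prints {ENUMERATION=TOKEN}

        try {
            strictMapOfEntries(
                    entry("ENUMERATION", "STRING"),
                    entry("ENUMERATION", "TOKEN"));
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
            // prints "duplicate key: ENUMERATION"
        }
    }
}
```

With last-wins the first ENUMERATION mapping disappears without any signal,
which is the hard-to-spot error described in the quoted message; the strict
variant reports it as soon as the map is constructed.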

More information about the core-libs-dev mailing list