RFC: draft API for JEP 269 Convenience Collection Factories

Peter Levart peter.levart at gmail.com
Fri Oct 9 12:08:24 UTC 2015


On 10/09/2015 04:39 AM, Paul Benedict wrote:
> I don't think the statement "Creates an unmodifiable set containing X
> elements" is always true. Since sets cannot have duplicates, it's possible
> that passing in X elements gives you fewer than that, based on equality. I
> think the Set docs should say "...X possible elements if unique". Wordsmith
> something better if you can, of course.

The same goes for Map.of(....).

The question is whether the factories should deduplicate the element(s) /
key(s) or throw IllegalArgumentException.

If they deduplicate, which element / entry should they keep: the one
appearing first or last in the source?

For example:

Map<String, Integer> map = Map.of("a", 1, "a", 2);

What should the result be:

1. {"a", 1}
2. {"a", 2}
3. IllegalArgumentException

I don't have a preference, but I think it should be specified.
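
To make the three options concrete, here is a minimal sketch of what each
policy would mean for the example above. It does not use the draft API: the
class and method names (DuplicatePolicies, firstWins, lastWins,
rejectDuplicates) are hypothetical, and the entries are built with the
existing AbstractMap.SimpleImmutableEntry rather than the proposed factories.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of the JEP 269 draft.
class DuplicatePolicies {

    // Option 1: the first value seen for a key is kept.
    @SafeVarargs
    static <K, V> Map<K, V> firstWins(Map.Entry<K, V>... entries) {
        Map<K, V> m = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (!m.containsKey(e.getKey())) {
                m.put(e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(m);
    }

    // Option 2: a later value overwrites an earlier one.
    @SafeVarargs
    static <K, V> Map<K, V> lastWins(Map.Entry<K, V>... entries) {
        Map<K, V> m = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            m.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(m);
    }

    // Option 3: a duplicate key is treated as a programming error.
    @SafeVarargs
    static <K, V> Map<K, V> rejectDuplicates(Map.Entry<K, V>... entries) {
        Map<K, V> m = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : entries) {
            if (m.containsKey(e.getKey())) {
                throw new IllegalArgumentException("duplicate key: " + e.getKey());
            }
            m.put(e.getKey(), e.getValue());
        }
        return Collections.unmodifiableMap(m);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> a1 = new SimpleImmutableEntry<>("a", 1);
        Map.Entry<String, Integer> a2 = new SimpleImmutableEntry<>("a", 2);

        System.out.println(firstWins(a1, a2)); // prints {a=1}
        System.out.println(lastWins(a1, a2));  // prints {a=2}
        try {
            rejectDuplicates(a1, a2);
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage()); // rejected: duplicate key: a
        }
    }
}
```

Whichever option is chosen, the difference is only visible to callers who
pass a duplicate, which is why the specification needs to say so explicitly.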

Regards, Peter

> Cheers,
> Paul
> On Thu, Oct 8, 2015 at 6:39 PM, Stuart Marks <stuart.marks at oracle.com>
> wrote:
>> Hi all,
>> Please review and comment on this draft API for JEP 269, Convenience
>> Collection Factories. For this review I'd like to focus on the API, and set
>> aside implementation issues and discussion for later.
>> JEP:
>>          http://openjdk.java.net/jeps/269
>> javadoc:
>> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.mod/
>> specdiff:
>> http://cr.openjdk.java.net/~smarks/reviews/jep269/api.20151008.specdiff/overview-summary.html
>> Most of the API is pretty straightforward, with fixed-arg and varargs
>> "of()" factories for List, Set, ArrayList, and HashSet; and with fixed-arg
>> "of()" factories and varargs "ofEntries()" factories for Map and HashMap.
>> There are a few issues on which I'd like to solicit discussion.
>> 1. Number of fixed arg overloads.
>> I've somewhat arbitrarily provided up to 5 fixed-arg overloads for the
>> lists and sets, and up to 8 pairs for the fixed-arg map factories. The
>> rationale for 8 pairs is that there are 8 primitives, and various language
>> processing tools often have maps for the primitive types. (But such tools
>> also often need to handle the Void type, which exceeds the limit of 8. So
>> this might need to change if we want to follow this rationale.)
>> I also note that Guava's immutable factories provide 11 fixed-arg
>> overloads for list, 5 for set, and 5 pairs for map. I'd be curious as to
>> the rationale for this, and whether it also would apply to the JDK.
>> 2. Other concrete collection factories.
>> I've chosen to provide factories for the concrete collections ArrayList,
>> HashSet, and HashMap, since those seem to be the most commonly used. Is
>> there a need to provide factories for other concrete collections, such as
>> LinkedHashMap?
>> 3. Duplicate handling.
>> My current thinking is for the Set and Map factories to throw
>> IllegalArgumentException if a duplicate element or key is detected. The
>> current draft specification is silent on this point. It needs to be
>> specified, one way or another.
>> The rationale for throwing an exception is that if these factories are
>> used in a "literal like" fashion, then having a duplicate is almost
>> certainly a programming error. Consider this example:
>>      Map<String,TypeUse> m = Map.ofEntries(
>>          entry("CDATA",       CBuiltinLeafInfo.NORMALIZED_STRING),
>>          entry("ENTITY",      CBuiltinLeafInfo.TOKEN),
>>          entry("ENTITIES",    CBuiltinLeafInfo.STRING.makeCollection()),
>>          entry("ENUMERATION", CBuiltinLeafInfo.STRING.makeCollection()),
>>          entry("NMTOKEN",     CBuiltinLeafInfo.TOKEN),
>>          entry("NMTOKENS",    CBuiltinLeafInfo.STRING.makeCollection()),
>>          entry("ID",          CBuiltinLeafInfo.ID),
>>          entry("IDREF",       CBuiltinLeafInfo.IDREF),
>>          entry("IDREFS",
>>                    TypeUseFactory.makeCollection(CBuiltinLeafInfo.IDREF)),
>>          entry("ENUMERATION", CBuiltinLeafInfo.TOKEN));
>> (derived from [1])
>> If duplicates were silently ignored, this might result in hard-to-spot
>> errors.
>> There's also the matter of which value ends up being used in the case of
>> duplicate map keys, and whether this should be specified. A fairly obvious
>> policy would be "last one wins" but I'm reluctant to specify that, as it
>> starts to place unnecessary constraints on implementations. However, the
>> alternative of leaving it unspecified is also unpalatable.
>> I'm aware that very few programming systems with similar constructs will
>> signal an error on duplicate elements. Python, Ruby, Groovy, Scala, and
>> Perl all seem to allow duplicates in maps or equivalent, apparently with a
>> last-wins policy. (Though sometimes it's hard to tell if the policy is
>> specified.)
>> The only system I've been able to find that explicitly rejects duplicates
>> is Clojure, and this policy isn't without controversy. [2] The main
>> rationale is to prevent programming errors.
>> There is a python bug [3] where it was proposed that duplicates in a dict
>> should raise an error or warning, also in order to catch programming
>> errors. The request was rejected, not necessarily because it was a bad
>> idea, but primarily because it would be a backward incompatible change.
>> The easiest thing to do would simply be to require last-wins, since
>> "everybody else is doing it" ... but that doesn't mean it's right. Since
>> we're introducing a new API here, there is no compatibility issue. Throwing
>> an exception for duplicates seems like a good way to prevent a certain
>> class of programming errors.
>> What do people think?
>> s'marks
>> [1]
>> http://hg.openjdk.java.net/jdk8/jdk8/jaxws/file/d03dd22762db/src/share/jaxws_classes/com/sun/tools/internal/xjc/reader/dtd/TDTDReader.java#l420
>> [2]
>> http://dev.clojure.org/display/design/Allow+duplicate+map+keys+and+set+elements
>> [3] https://bugs.python.org/issue16385
