RFC: draft API for JEP 269 Convenience Collection Factories

Timo Kinnunen timo.kinnunen at gmail.com
Wed Oct 14 21:48:54 UTC 2015


That’s intriguing, since I wrote a collections library too, covering just Map/Set/List/Stream, with immutable/mutable versions and lots of convenience methods included, but I haven’t noticed such issues. My scope is a lot smaller, of course. It’s also not beholden to the way the Collections Framework does things, so I can decide that List is immutable and ArrayList is mutable, that both use the same API, and that there is no subtyping relation between them. Maybe it’s this freedom that makes the difference.

I actually have 4 uses of ArrayList.of in a smallish project where they are used for reading and generating configuration files. I get to do things like this in a few places:

ArrayList<String> lines = ArrayList.of(
  "<?xml version='1.0' encoding='UTF-8'?>",
  "<?profile version='1.0.0'?>",
  "<profile id='" + ECLIPSE_PROFILE_ID + "' timestamp='" + timestamp + "'>",
  "  <properties size='7'>", 
  // more lines of prefix . . .

And also this sort of thing:

ArrayList<String> list = ArrayList.of(FILES.replaceAll("^\"|\"$", "").split("\" \"")).removeIf(String::isEmpty);
ArrayList<Path> list2 = list.map(Paths::get);

If that looks somewhat like a Stream then just imagine what the user experience is like when stepping into it in a debugger.

More anecdotes: 
Having methods taking a Collection overloaded with versions taking varargs makes the API a lot more flexible.

I chose ArrayList.of(), ArrayList.of(T t) and ArrayList.of(T...ts), and List.of(T...ts) and the [0. . .10] argument variants for List.of(T t0, ... , T tn) completely unscientifically.  


From: Kevin Bourrillion
Sent: Wednesday, October 14, 2015 19:56
To: Stuart Marks
Cc: core-libs-dev
Subject: Re: RFC: draft API for JEP 269 Convenience Collection Factories

(Sorry that Guava questions were asked and I didn't notice this thread

Note that we have empirically learned through our Lists/Sets/Maps factory
classes that varargs factory methods for *mutable* collections are almost
entirely useless.  For one thing, it's simply not common to have a
hardcoded set of initial values yet still actually need to modify the
contents later.  When that does come up, the existing workarounds just
aren't bad at all:


(a)

    Set<Foo> foos = new HashSet<>(asList(foo1, foo2, foo3)); // static import, of course

(b)

    static final Set<Foo> INITIAL_VALUES = Set.of(foo1, foo2, foo3);
     . . .

    Set<Foo> foos = new HashSet<>(INITIAL_VALUES);

(c)

    Set<Foo> foos = new HashSet<>();
    Collections.addAll(foos, foo1, foo2, foo3);

Note that (c) is a two-liner. But a two-liner is really only bad in the
*immutable* case (because you might be initializing a static final). It's
of little harm in the mutable case.

Anyway, since we created these methods, they became an attractive nuisance,
and thousands of users reach for them who would have been better off in
every way using an immutable collection. Our fondest desire is to one day
be able to delete them. So, obviously, my strong recommendation is not to
add these to ArrayList, etc.

On Fri, Oct 9, 2015 at 4:11 PM, Stuart Marks <stuart.marks at oracle.com> wrote:

> Now, Guava handles this use case by providing a family of copying factories
> that can accept an array, a Collection, an Iterator, or an Iterable. These
> are all useful, but for JEP 269, we wanted to focus on the "collection
> literal like" APIs and not expand the proposal to include a bunch of
> additional factory methods. Since we need to have a varargs method anyway,
> it seemed reasonable to arrange it so that it could easily accept an array
> as well.

A decision to support only varargs and arrays is reasonable. However, I
don't see the advantage in using the same method name for both. In Guava,
it's clear what the difference between ImmutableList.of(aStringArray) and
ImmutableList.copyOf(aStringArray) is.
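
To illustrate the ambiguity with one shared name, here is a minimal sketch that assumes the `List.of` overloads as they eventually shipped in Java 9, not the draft under discussion. The class and method names are made up for the example. The same call shape either spreads the array into elements or wraps it as a single element, depending only on a type witness:

```java
import java.util.List;

public class ArrayVersusVarargs {
    // A String[] passed to the varargs factory is spread into its elements.
    static List<String> spread(String[] words) {
        return List.of(words);
    }

    // An explicit type witness selects the single-element overload instead,
    // so the array itself becomes the only element.
    static List<String[]> wrapped(String[] words) {
        return List.<String[]>of(words);
    }

    public static void main(String[] args) {
        String[] words = {"a", "b", "c"};
        System.out.println(spread(words).size());  // prints 3
        System.out.println(wrapped(words).size()); // prints 1
    }
}
```

With separate names in the style of of/copyOf, the reader would not need to know the overload resolution rules to tell these two apart.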

> Does anybody care about LinkedHashSet?

Assuming you go ahead with this for mutable collection types despite the
above, then YES, absolutely. Accidental dependence on hash order has always
been a runaway problem in our codebase that has made every single major JDK
upgrade difficult. And the memory cost of LHS over HS isn't nearly as great
as HS is already paying over a lean immutable set. The use of HashMap and
HashSet themselves should be discouraged.
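
A small sketch of the ordering guarantee in question, using only java.util classes; the class and method names are hypothetical. LinkedHashSet iterates in insertion order, so code that (accidentally or not) depends on iteration order keeps working across JDK upgrades, whereas HashSet makes no such promise:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class InsertionOrderDemo {
    // Collects the names into a LinkedHashSet and returns them in the
    // order the set iterates, which is the order of first insertion.
    static List<String> iterationOrder(String... names) {
        Set<String> set = new LinkedHashSet<>();
        Collections.addAll(set, names);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        // The repeated "alpha" is ignored and keeps its original position.
        System.out.println(iterationOrder("zeta", "alpha", "mid", "alpha"));
        // prints [zeta, alpha, mid]
    }
}
```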

(Even I had to fight the temptation to add "except when memory is at a
premium" to that! But it makes no sense. That's like "if you want to lose
weight, then accompany your giant pasta dinner and chocolate cake with a
*Diet* Coke.")

>>> 3. Duplicate handling.
>>> My current thinking is for the Set and Map factories to throw
>>> IllegalArgumentException if a duplicate element or key is detected.
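
For reference, a minimal sketch of what the quoted proposal looks like in practice. It assumes the `Set.of` and `Map.of` factories as they later shipped in Java 9, which do throw IllegalArgumentException on duplicates; the class and method names are invented for the example:

```java
import java.util.Map;
import java.util.Set;

public class DuplicateHandlingDemo {
    // Returns true if the Set factory rejects a duplicate element.
    static boolean rejectsDuplicateElement() {
        try {
            Set.of("a", "b", "a");
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    // Returns true if the Map factory rejects a duplicate key.
    static boolean rejectsDuplicateKey() {
        try {
            Map.of("k", 1, "k", 2);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsDuplicateElement()); // prints true
        System.out.println(rejectsDuplicateKey());     // prints true
    }
}
```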

To the other question: the reason we chose 11 as the cutoff is that we
determined that there would be no logical basis for exactly where to do it,
so we looked for an illogical basis. Sometimes you'll be at 10, all the way
up, you're at 10 and where can you go from there? Where? Nowhere. So this
way, if we need that extra push over the cliff, we can go up to 11.

Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
