Collectors update redux
brian.goetz at oracle.com
Thu Feb 7 11:12:51 PST 2013
> Is three-arg collect really the target "on ramp"?
Sorry, I was probably not clear. It is the onramp to the mutable part
of the reduce functionality, but it builds on the more functional
flavors, as outlined in the "digression" section.
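For readers without the earlier examples at hand, here is a minimal sketch of the three-arg form (supplier, accumulator, combiner), along the lines of the bitset example mentioned below. It assumes the java.util.stream API as later released in Java 8, which may differ from the draft under discussion here; the class and method names are made up for illustration.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

public class ThreeArgCollectSketch {
    // Mutable reduction into a BitSet:
    //   supplier    -> BitSet::new  (creates an empty result container)
    //   accumulator -> BitSet::set  (folds one int into a container)
    //   combiner    -> BitSet::or   (merges two partial containers)
    static BitSet toBitSet(IntStream ints) {
        return ints.collect(BitSet::new, BitSet::set, BitSet::or);
    }

    public static void main(String[] args) {
        BitSet bits = toBitSet(IntStream.of(1, 3, 5));
        System.out.println(bits); // prints {1, 3, 5}
    }
}
```

The combiner only comes into play when the stream is split, e.g. for parallel execution; the sequential case uses just the first two arguments.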
> IF you've been successfully spoon-fed the excellent examples (bitset
> etc.) then you can see it as reasonably simple. Otherwise you're pretty
> lost in the woods.
I think that's fair. Which points, as we've already agreed, to the fact
that this is mostly a pedagogical problem.
> I would have thought the first stop would be the combinators. OTOH
> ... there's a lot of stuff in there.
> I think there is *way* too much stuff in there, and I don't have enough
> time to even review it all before it gets set in stone. I strongly
> believe we would be smarter to keep the set of prepackaged Collectors
> much smaller and let third-party libraries experiment with which
> Collectors to provide.
Conceptually, the set is pretty simple:
base collectors == toCollection, toStatistics, toStringBuilder,
joinedWith (takes Stream<T> plus T->U, produces Map<T,U>)
combinator for map+collector
combinator for groupBy+collector
combinator for groupBy+reduce
combinator for partition+collector
combinator for partition+reduce
plus defaults for above where if you don't have a downstream collector,
it assumes "toCollection" (e.g., the no-arg groupBy).
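To make the list above concrete, here is a rough sketch of a few of those combinations. It uses the names from the API as eventually released in Java 8 (groupingBy, partitioningBy, mapping, reducing), not the draft names used in this thread, so treat the correspondence as approximate. The class name and sample data are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class CombinatorSketch {
    static final List<String> WORDS =
        Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");

    // groupBy + collector: count the words under each initial letter.
    static Map<Character, Long> countByInitial(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(w -> w.charAt(0), Collectors.counting()));
    }

    // groupBy + reduce: sum the word lengths under each initial letter.
    static Map<Character, Integer> totalLengthByInitial(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(w -> w.charAt(0),
                Collectors.reducing(0, String::length, Integer::sum)));
    }

    // partition + (map + collector): distinct lengths of long vs. short words.
    static Map<Boolean, Set<Integer>> lengthsByIsLong(List<String> words) {
        return words.stream().collect(
            Collectors.partitioningBy(w -> w.length() > 6,
                Collectors.mapping(String::length, Collectors.toSet())));
    }

    public static void main(String[] args) {
        System.out.println(countByInitial(WORDS));
        System.out.println(totalLengthByInitial(WORDS));
        System.out.println(lengthsByIsLong(WORDS));
    }
}
```

Each method is one combinator wrapped around one downstream collector or reduction, which is the sense in which the individual pieces stay small.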
Individually, each of these is dead-simple both in concept and
implementation (once you understand Collector) -- even the most complex
are only 20 LoC, and many are 1-2 LoC. I think what creates the
perception of complexity is the number of forms that jumps out at you on
the Javadoc page?
The one place where we might consider reducing scope is by eliminating
the forms that take an explicit Supplier<Map>. In other words, you
always get a HashMap / ConcurrentHashMap. This cuts the number of
groupBy/join forms in half. But it leaves those who want, say, to group
to a TreeMap out in the cold.
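For illustration, this is roughly what the explicit-Supplier<Map> form buys the TreeMap user. The sketch uses the three-arg groupingBy from the API as later released in Java 8; the exact form in the draft discussed here may differ, and the class name and data are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MapSupplierSketch {
    // The second argument is the map factory; without it the caller
    // gets whatever default map the library picks.
    static TreeMap<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "plum", "kiwi", "apple");
        // Keys come out sorted because the result is a TreeMap.
        System.out.println(byLength(words)); // prints {3=[fig], 4=[pear, plum, kiwi], 5=[apple]}
    }
}
```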
Do we feel that would be an improvement?
Alternatively, we can refactor the Map-driven collectors so that the
Supplier<Map> is no longer an argument to each factory method but is
instead set through a method on the Collector: a ToMapCollector (extends
Collector) with a usingMap() method.
This again gets us a nearly 2x reduction in number of methods in
Collectors, at the cost of moving the "pick your own map" functionality
to somewhere else.
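A very rough sketch of how that second option might look follows. Everything here is hypothetical: the shape of ToMapCollector, the signature of usingMap(), and the groupBy factory are guesses for illustration only, written against the Collector interface as later released in Java 8, and grouping only into lists to keep it short.

```java
import java.util.*;
import java.util.function.*;
import java.util.stream.Collector;

public class UsingMapSketch {
    // Hypothetical ToMapCollector: a grouping collector that carries its map factory.
    static final class ToMapCollector<T, K>
            implements Collector<T, Map<K, List<T>>, Map<K, List<T>>> {
        private final Function<? super T, ? extends K> classifier;
        private final Supplier<Map<K, List<T>>> mapFactory;

        ToMapCollector(Function<? super T, ? extends K> classifier,
                       Supplier<Map<K, List<T>>> mapFactory) {
            this.classifier = classifier;
            this.mapFactory = mapFactory;
        }

        // Hypothetical usingMap(): same grouping logic, caller-chosen map type.
        ToMapCollector<T, K> usingMap(Supplier<Map<K, List<T>>> factory) {
            return new ToMapCollector<>(classifier, factory);
        }

        @Override public Supplier<Map<K, List<T>>> supplier() { return mapFactory; }

        @Override public BiConsumer<Map<K, List<T>>, T> accumulator() {
            return (m, t) ->
                m.computeIfAbsent(classifier.apply(t), k -> new ArrayList<>()).add(t);
        }

        @Override public BinaryOperator<Map<K, List<T>>> combiner() {
            // Merge the right-hand partial result into the left-hand one.
            return (a, b) -> {
                b.forEach((k, v) -> a.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
                return a;
            };
        }

        @Override public Function<Map<K, List<T>>, Map<K, List<T>>> finisher() {
            return Function.identity();
        }

        @Override public Set<Collector.Characteristics> characteristics() {
            return EnumSet.of(Collector.Characteristics.IDENTITY_FINISH);
        }
    }

    // Single factory method; defaults to HashMap unless usingMap() is called.
    static <T, K> ToMapCollector<T, K> groupBy(Function<? super T, ? extends K> classifier) {
        return new ToMapCollector<>(classifier, HashMap::new);
    }

    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(
            UsingMapSketch.<String, Integer>groupBy(String::length).usingMap(TreeMap::new));
    }

    public static void main(String[] args) {
        System.out.println(byLength(Arrays.asList("pear", "fig", "apple")));
    }
}
```

The point of the sketch is that Collectors would need only the one-arg groupBy; the map choice moves onto the returned object rather than doubling the number of factory methods.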