brian.goetz at oracle.com
Mon Feb 18 12:20:23 PST 2013
Circling back to this (i.e., "reducing reduce", redux):
There are a lot of considerations here, many mostly accidental (e.g.,
consequences of erasure and the primitive/reference divide).
The three-arg functional reduce form is functionally equivalent to the
two-arg form, except that there are some constructions that are more
efficient to handle in the three arg form. However, the best example we
came up with, Joe's string compare, suffers because he had to use
boxing. So we're currently in a place where the best example to support
this form has other defects that make the form hard to support. And,
any form of functional reduce on a reference would likely result in a
lot of object creation, so the optimization of eliding some of the
mapping would have to overcome that.
Further, one can still handle this without boxing using collect() and an
explicit mutable result holder. On the other hand, if/when the language
acquires tuples, it will be a very different story, and this form would
become infinitely more useful.
So I think the evidence weighs slightly in favor of ditching this form
for now (though I'd feel better if people didn't have to use either an
ad-hoc class or a single-element array as the data box when using the
Secondarily, ditching the three-arg form from Stream would remove one
element of support for naming reduce and collect differently; part of
the motivation for a different name was that the three-arg collect and
three-arg reduce overloaded very poorly if they had the same name.
However, I think we should resist the temptation to act on this. I
think (a) there is pedagogical value in separating function and mutable
reduce forms, and (b) if we do this, we slam the door on the more
flexible version, which will badly bite us in a tupled future.
We might still consider the three-arg version for IntStream. That's the
case where Joe's example works.
On 2/11/2013 11:41 AM, Brian Goetz wrote:
> Now that we've added all the shapes of map() to Stream (map to
> ref/int/long/double), and we've separated functional reduce (currently
> called reduce) from mutable reduce (currently called collect), I think
> that leaves room for taking out one of the reduce methods from Stream:
> <U> U reduce(U identity,
> BiFunction<U, ? super T, U> accumulator,
> BinaryOperator<U> reducer);
> This is the one that confuses everyone anyway, and I don't think we need
> it any more.
> The argument for having this form instead of discrete map+reduce are:
> - fused map+reduce reduces boxing
> - this three-arg form can also fold filtering into the accumulation
> However, since we now have primitive-bearing map methods, and we can do
> filtering before and after the map, is this form really carrying its
> weight? Specifically because people find it counterintuitive, we should
> consider dropping it and guiding people towards map+reduce.
> For example, "sum of pages" over a stream of Documents is better written
> rather than
> docs.reduce(0, (count, d) -> count + d.getPageCount(), Integer::sum)
> The big place where we need three-arg reduce is when we're folding into
> a mutable store. But that's now handled by collect().
> Have I missed any use cases that would justify keeping this form?
More information about the lambda-libs-spec-observers