Extending Collector to handle a post-transform

Howard Lovatt howard.lovatt at gmail.com
Thu Jun 20 00:47:21 PDT 2013

I think the concepts in Ypnos are a very good balance between power and API
surface area, e.g. they have reduce and reduceR (the former is a simple
reducer/collector and the latter the complicated one with intermediate

Ypnos also has other useful features, such as grids of data as well as
streams, and the ability to map iteratively via their iterate and iterateT
operators. Are there plans for these features, grids and iterative mapping,
in future versions of the Stream library?

On 15 June 2013 07:01, Brian Goetz <brian.goetz at oracle.com> wrote:

> BTW, this notion of a parallel reduction as a quad of functions:
> (initial-result, accumulate-element, merge-result, final-transform) shows
> up in a lot of places.  Here are just two that were pointed out to us as we
> explored this feature:
> User defined aggregates in MS SQL Server:
> http://technet.microsoft.com/en-us/library/ms131051(v=sql.90).aspx
> (Thanks Erik for this pointer.)
> Ypnos: declarative, parallel structured grid programming
> (http://doi.acm.org/10.1145/1708046.1708053),
> which describes a Haskell-hosted EDSL for parallel stencil computations:
>    Some reductions generate values of a different type to the element type
>    of a grid. A structure called a Reducer packs together a number of
>    functions for parallel reduction under reduction operators of this type.
>    The mkReducer constructor builds a Reducer, taking four parameters:
>         • A function reducing an element and partially-reduced value to
>           another partially-reduced value: (a → b → b)
>         • A function combining two partially-reduced values, possibly from
>           two reduction processes on subgrids: (b → b → b)
>         • An initial partial result: b
>         • A final conversion function that converts the partial-result to
>           a final value: (b → c).
> (Thanks Guy for this pointer.)
> In addition, we got requests for this feature from the Oracle "Sumatra"
> team, which is exploring the practicality of transparently translating Java
> bulk operations to run on GPUs.  The notions from the "Ypnos" paper above
> show up all over the GPGPU literature.
> On 6/12/2013 11:39 PM, Mike Duigou wrote:
>> On Jun 11 2013, at 10:04 , Brian Goetz wrote:
>>  What's bad?
>>>>   - More generics in Collector signatures.  For Collectors that don't
>>>> want to export their intermediate type, they are declared as Collector<T,
>>>> ?, R>, which users may find disturbing. (The obvious attempts to make the
>>>> extra type arg go away don't work.)
>> For me this extra type parameter for the intermediary on Collector is no
>> different than the extra type param on BaseStream. Any time you have a type
>> variable that is not part of the user's generification it's going to feel
>> uncomfortable. For Collector the extra param goes largely un-noticed though
>> Collector is rarely assigned. Collector is mostly used as an argument and
>> in this case the wildcard is invisible. The types (and wildcards) just flow
>> through unobserved. This seems fine and overall it's a huge benefit to
>> handle the post-transform in the Collector.
>> Mike
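
For concreteness, here is a rough sketch of how the quad of functions quoted
above (initial-result, accumulate-element, merge-result, final-transform)
could be written against the Collector.of factory as it ended up in the
released Java 8 API; the API under discussion in this thread may have looked
different at the time. The names AverageSketch, Acc and averaging are
hypothetical, made up for this illustration only. The return type also shows
the Collector<T, ?, R> wildcard form, which keeps the intermediate type
hidden from callers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class AverageSketch {
    // Hypothetical mutable intermediate result (the "b" in the Ypnos signatures).
    static final class Acc {
        long sum;
        long count;
    }

    // The wildcard keeps Acc out of the signature that callers see.
    static Collector<Integer, ?, Double> averaging() {
        return Collector.<Integer, Acc, Double>of(
            Acc::new,                                     // initial-result
            (acc, x) -> { acc.sum += x; acc.count++; },   // accumulate-element
            (left, right) -> {                            // merge-result
                left.sum += right.sum;
                left.count += right.count;
                return left;
            },
            // final-transform; returning 0.0 for empty input is an arbitrary choice
            acc -> acc.count == 0 ? 0.0 : (double) acc.sum / acc.count);
    }

    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3, 4);
        // The collector is passed straight to collect(), so the wildcard never
        // has to be written out at the call site.
        double avg = xs.parallelStream().collect(averaging());
        System.out.println(avg);
    }
}
```

Because the collector is only ever passed as an argument here, the
intermediate type just flows through, which matches Mike's point above.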

  -- Howard.
