Design for collections upgrades

Sam Pullara sam at
Sun Mar 13 12:01:44 PDT 2011

For the last year or so I have been using the Google Collections library, now distributed as part of Guava. It has most of the APIs that we are talking about and I think that regular Java programmers are using them. They separate out Lists, Iterables, Sets, Maps, Collections and Iterators. Most of the code that I have written would work great with the methods in these classes added as extension methods to the normal APIs. Function literals would of course slot right in. Here are some links to their APIs:

My understanding is that they have an advantage over extension methods in that the caller determines how to treat the target collection. For example, if you use:

public class Iterables {
public static <F,T> Iterable<T> transform(Iterable <F> fromIterable, Function<? super F,? extends T> function)...

You get streaming behavior whereas the equivalent in Lists is evaluated eagerly:

public class Lists {
public static <F,T> List<T> transform(List<F> fromList, Function<? super F,? extends T> function)
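
To make the call-site point concrete, here is a minimal sketch in released Java 8 syntax. LazyOps and EagerOps are hypothetical stand-ins written for this illustration; they are not Guava's classes and say nothing about how Guava itself is implemented. The only thing shown is that with static helpers the caller picks lazy or eager evaluation by naming the helper, regardless of the runtime type of the collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class CallSiteDispatch {

    // Hypothetical lazy helper: the function runs only while iterating.
    static final class LazyOps {
        static <F, T> Iterable<T> transform(Iterable<F> from, Function<? super F, ? extends T> fn) {
            return () -> {
                Iterator<F> it = from.iterator();
                return new Iterator<T>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public T next() { return fn.apply(it.next()); }
                };
            };
        }
    }

    // Hypothetical eager helper: the function runs on every element before returning.
    static final class EagerOps {
        static <F, T> List<T> transform(List<F> from, Function<? super F, ? extends T> fn) {
            List<T> out = new ArrayList<>(from.size());
            for (F f : from) {
                out.add(fn.apply(f));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "ccc");
        AtomicInteger calls = new AtomicInteger();
        Function<String, Integer> length = s -> { calls.incrementAndGet(); return s.length(); };

        // Same list, two helpers: the caller decides which behavior it gets.
        Iterable<Integer> lazy = LazyOps.transform(words, length);
        System.out.println("calls after lazy transform:  " + calls.get()); // 0

        List<Integer> eager = EagerOps.transform(words, length);
        System.out.println("calls after eager transform: " + calls.get()); // 3

        for (Integer n : lazy) {
            // iterating is what finally triggers the lazy function calls
        }
        System.out.println("calls after iterating lazy:  " + calls.get()); // 6
    }
}
```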

Apparently with extension methods we don't have this luxury: the type at the call site does not pick the extension method default; the underlying implementation does. Because of this, it appears that the current proposals for things like .asStream() will actually end up being more verbose than their Guava counterparts once function literals are available. As for the naming conventions, Guava went with filter/transform. Personally, I renamed them f/t, since chaining several together in Guava gets a little silly with such long names showing up next to one another. Here is some current code of mine:

                return l(t(f(s.bagService.featured(0, 30L, category), new P<Row<Bag, byte[]>>() {
                  public boolean apply(@Nullable Row<Bag, byte[]> input) {
                    return input != null && ids.add(input.value.owner());
                  }
                }), new F<Row<Bag, byte[]>, BagCodeFactory.BagCode>() {
                  public BagCodeFactory.BagCode apply(Row<Bag, byte[]> input) {
                    return new BagCodeFactory(s).getBag(page, input);
                  }
                }), count);

l = limit, t = transform, f = filter

In JDK8 with function literals it would look like (featured() returns an Iterable):

                return l(t(f(s.bagService.featured(0, 30L, category), #{ input -> input != null && ids.add(input.value.owner())}), 
			#{ input -> new BagCodeFactory(s).getBag(page, input)}), count);

Now, using some of the suggestions on the list so far, it could look like:

                return s.bagService.featured(0, 30L, category).asStream().filter(#{ input -> input != null && ids.add(input.value.owner())})
			.transform(#{ input -> new BagCodeFactory(s).getBag(page, input)}).limit(count);

Which isn't too bad, though since featured() already returns an Iterable, it feels redundant to call asStream() on it, especially since it really is only an Iterable and doesn't use an underlying Java collection at all. It concerns me that the call site can't control behavior through typing. If we only put the extension methods on Iterable/Iterator, it would be pretty easy for the developer to see that the operations are lazy, and there would never be any confusion as to the behavior you would get at the call site.
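
As a rough sketch of what lazy-only methods on an Iterable-like type could look like, here is a version using default methods as they later shipped in Java 8. The Lazy interface and its of/filter/transform/limit methods are hypothetical names chosen for this illustration, not a proposed or existing JDK API, and the lambda syntax is the released one rather than the #{ ... } form used above. The recorded elements show that nothing runs until iteration and that limit stops pulling from the source early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;
import java.util.function.Predicate;

public class LazyIterableSketch {

    // Hypothetical interface standing in for "Iterable with lazy extension methods".
    interface Lazy<T> extends Iterable<T> {

        // Wraps any Iterable without copying it.
        static <E> Lazy<E> of(Iterable<E> source) {
            return source::iterator;
        }

        default Lazy<T> filter(Predicate<? super T> keep) {
            return () -> {
                Iterator<T> it = iterator();
                return new Iterator<T>() {
                    private T pending;
                    private boolean ready;

                    @Override public boolean hasNext() {
                        // Pull from the source only until the next match is found.
                        while (!ready && it.hasNext()) {
                            T candidate = it.next();
                            if (keep.test(candidate)) {
                                pending = candidate;
                                ready = true;
                            }
                        }
                        return ready;
                    }

                    @Override public T next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        ready = false;
                        return pending;
                    }
                };
            };
        }

        default <R> Lazy<R> transform(Function<? super T, ? extends R> fn) {
            return () -> {
                Iterator<T> it = iterator();
                return new Iterator<R>() {
                    @Override public boolean hasNext() { return it.hasNext(); }
                    @Override public R next() { return fn.apply(it.next()); }
                };
            };
        }

        default Lazy<T> limit(int max) {
            return () -> {
                Iterator<T> it = iterator();
                return new Iterator<T>() {
                    private int returned;

                    @Override public boolean hasNext() { return returned < max && it.hasNext(); }

                    @Override public T next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        returned++;
                        return it.next();
                    }
                };
            };
        }
    }

    public static void main(String[] args) {
        List<Integer> examined = new ArrayList<>();

        Lazy<String> result = Lazy.of(Arrays.asList(1, 2, 3, 4, 5, 6))
                .filter(n -> { examined.add(n); return n % 2 == 0; })
                .transform(n -> "item-" + n)
                .limit(2);

        System.out.println(examined); // [] because nothing has been iterated yet

        for (String s : result) {
            System.out.println(s);    // item-2, then item-4
        }
        System.out.println(examined); // [1, 2, 3, 4] because limit(2) stopped the pull
    }
}
```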


On Mar 13, 2011, at 4:12 AM, Steven Simpson wrote:

> On 10/03/11 13:57, Rémi Forax wrote:
>>  Yes, we need to provide methods that filter/map directly
>> the content of a collection.
>> But I don't think it's a good idea to name them filter or map.
>> Why not filterAll and mapAll ?
> I'm starting to lose track of this thread, but I recall the following
> points:
>    * Lazy operations are desirable, as are eager ones, and in-place ones.
>    * There could be a type or family thereof for lazy collections, e.g.
>      Stream or Iterable.
>    * Collections currently don't have methods for generating new
>      collections, so adding them is a considerable shift.  They do have
>      in-place methods, and new methods like 'filter' share the same
>      tense, suggesting that they should be in-place methods too.
> In case it hasn't yet been mentioned, I'd like to add that collections
> also support /views/ in several places, e.g. subList, entrySet, etc.  Do
> we need a lazy type when we can have views?  I haven't pinned it down in
> my mind yet, but I suspect that views are slightly different to streams,
> in that they also allow us to modify the original collection.  Or, to
> put it another way, they differ from mutating methods, in that we can
> choose not to perform any mutation.
> When we want to delete a portion of a list, we write:
>  list.subList(a, b).clear()
> ...because the sublist view allows us to modify the original list.  But
> if we want to extract a sublist, and leave the original untouched, we write:
>  new ArrayList(list.subList(a, b))
> List.subList could be regarded as lazy, until we apply a mutating method
> to it like clear().  And if we never do that, it stays lazy even while
> we do an eager copy.
> Can we do the same with filter (using a noun like "selection" to be
> consistent with subList)?
>  list.selection(pred).clear(); // mutating original; removeAll(pred)?
>  new ArrayList(list.selection(pred)); // preserving original
> Views, having ordinary collection types, remain Iterable too:
>  for (Element elem : list.selection(pred)) { ... }
> I suppose my point is that there is already a precedent for (mutable)
> laziness in the collection framework, in the form of views.
> Cheers,
> Steven
> -- 
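
For reference, the selection(pred) idea in the quoted message can be sketched as a filtered view. This is only an illustration under assumptions: selection is the name suggested in the quote, not an existing method, and it is written here as a hypothetical static helper over java.util.AbstractCollection using Java 8 syntax. Because the view's iterator forwards remove() to the backing collection, the inherited clear() deletes the matching elements from the original, while copying the view leaves the original untouched.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class SelectionView {

    // Hypothetical helper: a live, filtered view over 'source'.
    static <T> Collection<T> selection(Collection<T> source, Predicate<? super T> pred) {
        return new AbstractCollection<T>() {
            @Override public Iterator<T> iterator() {
                Iterator<T> it = source.iterator();
                return new Iterator<T>() {
                    private T pending;
                    private boolean ready;
                    private boolean canRemove;

                    @Override public boolean hasNext() {
                        while (!ready && it.hasNext()) {
                            canRemove = false; // the source cursor is moving on
                            T candidate = it.next();
                            if (pred.test(candidate)) {
                                pending = candidate;
                                ready = true;
                            }
                        }
                        return ready;
                    }

                    @Override public T next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        ready = false;
                        canRemove = true;
                        return pending;
                    }

                    @Override public void remove() {
                        // Only safe directly after next(), while the source cursor
                        // still points at the element just returned.
                        if (!canRemove) throw new IllegalStateException("call remove() directly after next()");
                        it.remove();
                        canRemove = false;
                    }
                };
            }

            @Override public int size() {
                int n = 0;
                for (T ignored : this) n++;
                return n;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        Predicate<Integer> even = n -> n % 2 == 0;

        // Preserving the original: copy the view eagerly.
        List<Integer> copy = new ArrayList<>(selection(list, even));
        System.out.println(copy); // [2, 4, 6]
        System.out.println(list); // [1, 2, 3, 4, 5, 6]

        // Mutating the original through the view.
        selection(list, even).clear();
        System.out.println(list); // [1, 3, 5]
    }
}
```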
