Streams and Spliterator characteristics confusion

Paul Sandoz paul.sandoz at
Mon Jun 30 09:34:41 UTC 2014

On Jun 28, 2014, at 5:40 PM, Kasper Nielsen <kasperni at> wrote:

> > s.distinct().spliterator() -> Spliterator.DISTINCT = true
> > but limiting the number of distinct elements makes the stream non distinct
> > s.distinct().limit(10).spliterator() -> Spliterator.DISTINCT = false
> I don't observe that (see program below).
> Right, that was an error on my part.
> But still, I think there are some cases where the flag should be maintained.
> For example, I think the following program should print 4 'true' values but it only prints 1.
> Especially the second one puzzles me, invoking distinct() makes it non-distinct?
> static IntStream s() {
>   return StreamSupport.intStream(Spliterators.spliterator(new int[] { 12, 34 }, Spliterator.DISTINCT), false);
> }
> public static void main(String[] args) {
>    System.out.println(s().spliterator().hasCharacteristics(Spliterator.DISTINCT));
>    System.out.println(s().distinct().spliterator().hasCharacteristics(Spliterator.DISTINCT));
>    System.out.println(s().boxed().spliterator().hasCharacteristics(Spliterator.DISTINCT));
>    System.out.println(s().asDoubleStream().spliterator().hasCharacteristics(Spliterator.DISTINCT));
> }

The second is a good example of why this is an implementation detail. Here is the implementation (some may want to close their eyes!):

    public final IntStream distinct() {
        // While functional and quick to implement, this approach is not very efficient.
        // An efficient version requires an int-specific map/set implementation.
        return boxed().distinct().mapToInt(i -> i);
    }

We could work out how to inject DISTINCT back in, but since the spliterator is intended as an escape hatch I did not think it worth the effort.
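
For anyone who wants to experiment, here is a self-contained version of the quoted test program. It is only a sketch: the class name and the isDistinct helper are made up for illustration, and because this behaviour is unspecified the printed values may differ between JDK releases and other stream implementations.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

// Hypothetical class name; the body follows the quoted program.
class DistinctCharacteristicDemo {

    // Source spliterator that explicitly reports DISTINCT, as in the quoted program.
    static IntStream s() {
        return StreamSupport.intStream(
                Spliterators.spliterator(new int[] { 12, 34 }, Spliterator.DISTINCT), false);
    }

    // Hypothetical helper: does the escape-hatch spliterator report DISTINCT?
    static boolean isDistinct(Spliterator<?> spliterator) {
        return spliterator.hasCharacteristics(Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        // Only the first line is derived directly from the source spliterator;
        // the other three depend on how each operation is implemented.
        System.out.println("source:           " + isDistinct(s().spliterator()));
        System.out.println("distinct():       " + isDistinct(s().distinct().spliterator()));
        System.out.println("boxed():          " + isDistinct(s().boxed().spliterator()));
        System.out.println("asDoubleStream(): " + isDistinct(s().asDoubleStream().spliterator()));
    }
}
```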

Note that if the source in the last case of the quoted program (asDoubleStream()) were a long stream, it would not be possible to inject DISTINCT, because not all long values can be represented precisely as double values.
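
A small sketch of the long-to-double problem, assuming nothing beyond the standard library (the class and method names are made up for illustration): two different long values above 2^53 widen to the same double, so a stream that was distinct as longs need not be distinct after asDoubleStream().

```java
import java.util.stream.LongStream;

// Hypothetical class name, for illustration only.
class LongToDoublePrecisionDemo {

    // Counts the distinct values left after widening each long to a double.
    static long distinctAfterWidening(long... values) {
        return LongStream.of(values).asDoubleStream().distinct().count();
    }

    public static void main(String[] args) {
        long a = 1L << 53;   // 2^53, exactly representable as a double
        long b = a + 1;      // 2^53 + 1, not representable; rounds to 2^53

        System.out.println(a != b);                       // true: distinct as longs
        System.out.println((double) a == (double) b);     // true: equal as doubles
        System.out.println(distinctAfterWidening(a, b));  // 1
    }
}
```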

> I am trying to implement the stream interfaces and I want to make sure that my implementation has similar behaviour to the default implementation. The interoperability between streams and Spliterator.characteristics is the only thing I'm having serious issues with. I feel the current state is more a result of how streams are implemented at the moment than part of a public API.
> I think something like a table with non-terminal stream operations as rows and characteristics as columns, where each cell is either "cleared", "set" or "maintained", would make sense.

We deliberately did not specify this aspect. The implementation could change, and we don't want to unduly constrain it based on an escape hatch (it's not the common case). Implementations can decide on the quality of that escape-hatch spliterator. For your implementation you are free to provide better-quality escape-hatch spliterators.
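
As one possible illustration of a "better quality" escape hatch, a stream implementation that knows its elements are distinct could wrap the spliterator it hands out so that it reports DISTINCT. This is a minimal sketch, not what the JDK does: DistinctReportingSpliterator is a hypothetical name, and the wrapper does not verify the claim, so the caller must already know the elements are distinct.

```java
import java.util.Comparator;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical helper: reports DISTINCT in addition to whatever the wrapped
// spliterator reports. It does not check that the elements really are distinct.
final class DistinctReportingSpliterator<T> implements Spliterator<T> {
    private final Spliterator<T> delegate;

    DistinctReportingSpliterator(Spliterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        return delegate.tryAdvance(action);
    }

    @Override
    public void forEachRemaining(Consumer<? super T> action) {
        delegate.forEachRemaining(action);
    }

    @Override
    public Spliterator<T> trySplit() {
        // A split of a distinct sequence is itself distinct, so wrap it as well.
        Spliterator<T> prefix = delegate.trySplit();
        return prefix == null ? null : new DistinctReportingSpliterator<>(prefix);
    }

    @Override
    public long estimateSize() {
        return delegate.estimateSize();
    }

    @Override
    public int characteristics() {
        return delegate.characteristics() | DISTINCT;
    }

    @Override
    public Comparator<? super T> getComparator() {
        // Delegates; throws IllegalStateException if the delegate is not SORTED.
        return delegate.getComparator();
    }

    public static void main(String[] args) {
        Spliterator<Integer> raw = Stream.of(12, 34).map(i -> i).spliterator();
        Spliterator<Integer> wrapped = new DistinctReportingSpliterator<>(raw);
        System.out.println(wrapped.hasCharacteristics(Spliterator.DISTINCT)); // true
    }
}
```

Whether this is worth doing depends on how often callers of spliterator() actually consult the characteristics.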

I think we should clarify the documentation on BaseStream.spliterator() to say something like:

  The characteristics of the returned spliterator need not correlate with characteristics of the stream source
  and those inferred from intermediate operations preceding this terminal operation.

I have also logged the following issues:




More information about the core-libs-dev mailing list