Streams and Spliterator characteristics confusion

Kasper Nielsen kasperni at
Sat Jun 28 15:40:13 UTC 2014


Follow-up questions inlined.

On Fri, Jun 27, 2014 at 11:43 AM, Paul Sandoz <paul.sandoz at> wrote:

> Internally in the stream pipeline we keep track of certain characteristics
> for optimization purposes and those are conveniently used to determine the
> characteristics of the Spliterator, so there are some idiosyncrasies poking
> through.
> > s.sorted().spliterator() -> Spliterator.SORTED = true
> > But if I specify a comparator the stream is not sorted
> > s.sorted((a,b) -> 1).spliterator() -> Spliterator.SORTED = false
> >
> Right, there is an optimization internally that ensures if the upstream
> stream is already sorted then the sort operation becomes a nop e.g.
>   s.sorted().sorted();
> This optimization cannot apply when a comparator is passed in since we
> don't know if two comparators are identical in their behaviour e.g:
>   s.sorted((a, b) -> a.compareTo(b)).sorted(Comparator.naturalOrder()).sorted()
What initially made me wonder was the javadoc of
which states "If this Spliterator's source is SORTED by a Comparator returns
that Comparator."

So I assumed s.sorted((a,b) -> 1).spliterator().getComparator() would
return said comparator.

It just feels a bit inconsistent compared to, for example, new
which returns Spliterator.SORTED = true and a comparator.
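
The behaviour discussed above can be checked with a small program. The following is a minimal sketch, not code from this thread: the class SortedFlagDemo and its helpers reportsSorted and comparatorOutcome are hypothetical names, and Comparator.reverseOrder() merely stands in for an arbitrary comparator. It queries Spliterator.SORTED and getComparator() for a naturally sorted pipeline and for one sorted with an explicit comparator.

```java
import java.util.Comparator;
import java.util.Spliterator;
import java.util.stream.Stream;

class SortedFlagDemo {
    // Hypothetical helper: does the pipeline's spliterator report SORTED?
    static boolean reportsSorted(Stream<Integer> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.SORTED);
    }

    // Hypothetical helper: describes what getComparator() does for a pipeline.
    static String comparatorOutcome(Stream<Integer> stream) {
        Spliterator<Integer> sp = stream.spliterator();
        try {
            Comparator<? super Integer> c = sp.getComparator();
            return c == null ? "null (natural order)" : "comparator";
        } catch (IllegalStateException e) {
            // Thrown when the spliterator does not report SORTED.
            return "IllegalStateException";
        }
    }

    public static void main(String[] args) {
        // Assumption: any explicit comparator would do; reverseOrder() is just an example.
        Comparator<Integer> reverse = Comparator.reverseOrder();

        System.out.println("sorted():    SORTED = "
                + reportsSorted(Stream.of(3, 1, 2).sorted()));
        System.out.println("sorted(cmp): SORTED = "
                + reportsSorted(Stream.of(3, 1, 2).sorted(reverse)));
        System.out.println("sorted():    getComparator() -> "
                + comparatorOutcome(Stream.of(3, 1, 2).sorted()));
        System.out.println("sorted(cmp): getComparator() -> "
                + comparatorOutcome(Stream.of(3, 1, 2).sorted(reverse)));
    }
}
```

The helper catches IllegalStateException because the Spliterator javadoc specifies that getComparator() throws it when the spliterator does not report SORTED.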

> > s.distinct().spliterator() -> Spliterator.DISTINCT = true
> > but limiting the number of distinct elements makes the stream non distinct
> > s.distinct().limit(10).spliterator() -> Spliterator.DISTINCT = false
> I don't observe that (see program below).

Right, that was an error on my part.
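
For reference, a minimal sketch of how the distinct()/limit() case can be checked. This is not the program Paul refers to; the class DistinctLimitDemo and the helper reportsDistinct are hypothetical names chosen for illustration.

```java
import java.util.Spliterator;
import java.util.stream.Stream;

class DistinctLimitDemo {
    // Hypothetical helper: does the pipeline's spliterator report DISTINCT?
    static boolean reportsDistinct(Stream<?> stream) {
        return stream.spliterator().hasCharacteristics(Spliterator.DISTINCT);
    }

    public static void main(String[] args) {
        // Source stream built from an array; it contains a duplicate on purpose.
        System.out.println("source:               "
                + reportsDistinct(Stream.of(1, 2, 2, 3)));
        System.out.println("distinct():           "
                + reportsDistinct(Stream.of(1, 2, 2, 3).distinct()));
        System.out.println("distinct().limit(10): "
                + reportsDistinct(Stream.of(1, 2, 2, 3).distinct().limit(10)));
    }
}
```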

But still, I think there are some cases where the flag should be
For example, I think the following program should print 4 'true'
values but it only prints 1.
Especially the second one puzzles me, invoking distinct() makes it

static IntStream s() {
  return StreamSupport.intStream(
      Spliterators.spliterator(new int[] { 12, 34 }, Spliterator.DISTINCT), false);
}

public static void main(String[] args) {





> > On the other hand something like Spliterator.SORTED is maintained when I
> > invoke limit
> > s.sorted().limit(10).spliterator() -> Spliterator.SORTED = true
> >
> >
> > A flag such as Spliterator.NONNULL is also cleared in situations where it
> > should not be.
> That is because it is not tracked in the pipeline as there is no gain
> optimisation-wise (if it was it would be cleared for map/flatMap operations
> and preserved for other ops like filter as you say below).
> It's not that difficult to support this and should add no measurable
> performance cost, we deliberately left space in the bit fields, however
> since spliterator() is an escape-hatch for doing stuff that cannot be done
> by other operations I think the value of supporting NONNULL is marginal.

I am trying to implement the stream interfaces and I want to make sure that
my implementation has similar behaviour to the default implementation in
The interoperability between streams and
Spliterator.characteristics is the only thing I'm having serious issues
with. I feel the current state is more a result of how streams are
implemented at the moment than part of a public API.

I think something like a table with non-terminal stream operations as rows
and characteristics as columns, where each cell is either "cleared",
"set" or "maintained", would make sense.

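A table of this kind could be derived empirically. The following is a rough sketch under stated assumptions, not a specification: the class CharacteristicsTable and its methods operations, has and classify are hypothetical names, only a handful of intermediate operations are probed, and each cell is classified by running the operation once on a source without the flag and once on a source with it. The result describes whatever stream implementation the program runs on.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.UnaryOperator;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class CharacteristicsTable {
    static final String[] NAMES = {"DISTINCT", "SORTED", "ORDERED", "NONNULL"};
    static final int[] FLAGS = {Spliterator.DISTINCT, Spliterator.SORTED,
                                Spliterator.ORDERED, Spliterator.NONNULL};

    // A small, illustrative selection of intermediate operations (rows).
    static Map<String, UnaryOperator<Stream<Integer>>> operations() {
        Map<String, UnaryOperator<Stream<Integer>>> ops = new LinkedHashMap<>();
        ops.put("filter", s -> s.filter(x -> true));
        ops.put("map", s -> s.map(x -> x));
        ops.put("distinct", s -> s.distinct());
        ops.put("sorted", s -> s.sorted());
        ops.put("limit", s -> s.limit(10));
        return ops;
    }

    // Applies op to a source reporting sourceFlags and checks one flag on the result.
    static boolean has(UnaryOperator<Stream<Integer>> op, int sourceFlags, int flag) {
        Spliterator<Integer> source =
                Spliterators.spliterator(new Object[] {1, 2, 3}, sourceFlags);
        return op.apply(StreamSupport.stream(source, false))
                 .spliterator()
                 .hasCharacteristics(flag);
    }

    // "set": always reported; "cleared": never reported; "maintained": follows the source.
    static String classify(UnaryOperator<Stream<Integer>> op, int flag) {
        boolean whenUnset = has(op, 0, flag);
        boolean whenSet = has(op, flag, flag);
        if (whenUnset && whenSet) return "set";
        if (!whenUnset && !whenSet) return "cleared";
        return whenSet ? "maintained" : "inverted";
    }

    public static void main(String[] args) {
        System.out.printf("%-10s", "operation");
        for (String name : NAMES) System.out.printf("%-12s", name);
        System.out.println();
        for (Map.Entry<String, UnaryOperator<Stream<Integer>>> e : operations().entrySet()) {
            System.out.printf("%-10s", e.getKey());
            for (int flag : FLAGS) System.out.printf("%-12s", classify(e.getValue(), flag));
            System.out.println();
        }
    }
}
```

Probing each flag in isolation keeps the sketch simple, but it would not reveal interactions between characteristics, so a complete table would need more source configurations.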

More information about the core-libs-dev mailing list