Speed optimization of Spliterators.spliteratorUnknownSize for parallel case

Paul Sandoz paul.sandoz at oracle.com
Wed Jul 15 12:20:45 UTC 2015

Hi Tagir,

On Jul 14, 2015, at 4:43 AM, Tagir F. Valeev <amaembo at gmail.com> wrote:

> Hello!
> Thank you for the detailed answer.
> PS> Thanks for looking at this. Judging by the results i am guessing
> PS> your measurements were performed on a 4-core system.
> Yes, quad-core, I mentioned it before.
> PS> My initial inclination for these scenarios is it is likely better
> PS> for the developer to split the execution in two parts. The first
> PS> part, low N and low Q, is to collect sequentially into an array or
> PS> list. The second part (total size is known), low N and high Q,
> PS> would then use that array/list as the stream source and operate in parallel.
> Probably it would be fine to add such recommendations to the
> corresponding methods javaDoc (for example, BufferedReader.lines).

I think it is more suitable for:


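The two-phase approach described above (collect sequentially while N and Q are low, then run the parallel, high-Q part over the known-size list) can be sketched as follows. This is a minimal illustration, not code from the thread; the StringReader source and the per-line work are placeholders:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) throws IOException {
        // Stand-in for a poorly splitting, unknown-size source such as
        // BufferedReader.lines over a file or socket.
        BufferedReader reader = new BufferedReader(new StringReader("a\nbb\nccc"));

        // Phase 1 (low N, low Q): collect sequentially into a list,
        // so the total size becomes known.
        List<String> lines = reader.lines().collect(Collectors.toList());

        // Phase 2 (high Q): the list is a SIZED, well-splitting source,
        // so parallel execution can balance work properly.
        int total = lines.parallelStream().mapToInt(String::length).sum();
        System.out.println(total); // prints 6
    }
}
```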
> PS> That, in a sense, is almost what your spliterator does for n <
> PS> initial batch size -:) but because the target threshold size is
> PS> derived from the unknown root size (Long.MAX_VALUE) the
> PS> sub-spliterators are not themselves further split, so the
> PS> parallelism is better but still limited. It's tricky to adjust the
> PS> target size dynamically, we could do it for the first split (e.g.
> PS> unknown size -> two known size), but tracking further splits is
> PS> harder and may not reliably perform (due to concurrent
> PS> sub-splitting). I generally don't want to complicate the
> PS> computation mechanism for poor splitting use-cases.
> I see. That actually was also the thing which I wanted to suggest.
> Still even improving the first split would be great, I'd vote for it.

I am still hesitating. It's not general enough: it optimizes for poor splitting in fragile scenarios where N is just under the batch size. (FWIW, the general computation already alternates left/right forking to stop such spliterators from causing out-of-memory issues where the computation runs away creating tasks.)

I also pondered exposing the batch size as an argument, but I don't think developers will generally know what to do with it, whereas I suspect a size estimate is much easier to understand (even if described in guarded language in terms of proportions, e.g. at least N).

And of course this does not stop us from internally using an alternative spliterator implementation for BufferedReader.lines, such as the one suggested by Marko, which I think could be improved if a size estimate is given, as this could also influence the batch size and common difference.

> This way AbstractSpliterator user implementations may also benefit.
> My optimization cannot be applied to AbstractSpliterator as we cannot
> control the tryAdvance method.
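Tagir's point about AbstractSpliterator can be seen in a typical subclass: the user supplies tryAdvance, so the library has no hook through which to intercept or batch element delivery. A minimal, hypothetical subclass (the counting logic is only for illustration):

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

public class Main {
    // The USER implements tryAdvance; the framework only calls it, so an
    // optimization that changes how elements are pulled cannot be applied
    // inside AbstractSpliterator itself.
    static class CountingSpliterator extends Spliterators.AbstractSpliterator<Integer> {
        private int next = 0;
        private final int limit;

        CountingSpliterator(int limit) {
            super(Long.MAX_VALUE, Spliterator.ORDERED); // unknown size
            this.limit = limit;
        }

        @Override
        public boolean tryAdvance(Consumer<? super Integer> action) {
            if (next >= limit) return false;
            action.accept(next++);
            return true;
        }
    }

    public static void main(String[] args) {
        long sum = StreamSupport.stream(new CountingSpliterator(10), false)
                                .mapToLong(Integer::longValue)
                                .sum();
        System.out.println(sum); // prints 45
    }
}
```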
> PS> Your approach also degenerates when N is just over the current batch size.
> Yes, it becomes bad, but my measurements show that it's still no worse
> than the current implementation:

Yes, no worse.

> Benchmark                (parallel)  (size)  Mode  Cnt    Score    Error  Units
> IteratorTest.base             false    1028  avgt   30  659.308 ± 12.618  us/op
> IteratorTest.base             false    1100  avgt   30  746.619 ± 21.260  us/op
> IteratorTest.base              true    1028  avgt   30  194.800 ±  1.988  us/op
> IteratorTest.base              true    1100  avgt   30  228.926 ±  2.923  us/op
> IteratorTest.jdkSpliterator   false    1028  avgt   30  648.622 ±  3.857  us/op
> IteratorTest.jdkSpliterator   false    1100  avgt   30  750.741 ± 10.279  us/op
> IteratorTest.jdkSpliterator    true    1028  avgt   30  673.574 ±  6.469  us/op
> IteratorTest.jdkSpliterator    true    1100  avgt   30  679.499 ±  4.310  us/op
> IteratorTest.optimized        false    1028  avgt   30  643.209 ±  1.686  us/op
> IteratorTest.optimized        false    1100  avgt   30  718.077 ±  8.123  us/op
> IteratorTest.optimized         true    1028  avgt   30  674.447 ±  5.153  us/op
> IteratorTest.optimized         true    1100  avgt   30  674.622 ±  4.252  us/op
> PS> It does not make as much sense to include such an estimate
> PS> argument for Files.lines for two reasons:
> PS> 1) I recently improved Files.lines for common character sets
> PS> where it is efficient to identify encoded line feeds (such as UTF-8); and
> PS> 2) for other character sets we could derive an estimate from the
> PS> file size (it could actually be the file size, which is the case for 1).
> I see. FileChannelLinesSpliterator is a nice improvement. Have you
> tested it on device files, named pipes or cases when the file size
> is changed in-between by some other process?

No. There is a general clause saying "all bets are off" if the file contents are modified during stream execution. The size is currently snapshotted in the Files.lines call. If the file size increases in the interim I suspect it will still work; if it shrinks, I would expect an I/O exception.

> Also probably it should
> be mentioned in BufferedReader.lines documentation, that using
> Files.lines for file source is preferred.

Good point; a "@see Files.lines ..." would be appropriate.
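To make the recommendation concrete, here is a small sketch of preferring Files.lines over wrapping the file in a BufferedReader; the temp-file setup is only scaffolding for the example:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("lines-demo", ".txt");
        Files.write(p, Arrays.asList("alpha", "beta", "gamma"));

        // Prefer Files.lines for file sources: as discussed above, for
        // common charsets such as UTF-8 it can derive a size estimate from
        // the file size, which BufferedReader.lines cannot do.
        long count;
        try (Stream<String> lines = Files.lines(p)) {
            count = lines.count();
        }
        System.out.println(count); // prints 3

        Files.delete(p);
    }
}
```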

> PS> So i am inclined to do the following:
> PS> 1) consider adding a size estimate argument for all stream
> PS> factories that are of unknown size and where no reasonable estimate can be derived.
> PS> 2) modify the spliterator factories in Spliterators  to accept an estimate e.g.:
> PS> public static <T> Spliterator<T>
> PS> spliteratorUnknownSize(Iterator<? extends T> iterator,
> PS>                                                         long sizeEstimate,
> PS>                                                         int characteristics)
> By the way currently it's possible to create an IteratorSpliterator
> with estimated size:
> Spliterators.spliterator(iterator, estimatedSize, Spliterator.CONCURRENT);
> Of course it's a misuse of the CONCURRENT flag, but it's not otherwise
> used by the Stream API, so currently it has no unwanted
> side-effects.

Very sneaky :-) but not something we can encourage.
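For completeness, the workaround Tagir describes looks like this with today's API (shown only as an illustration of the trick, which, as noted, is discouraged; the data set is a placeholder):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.StreamSupport;

public class Main {
    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) data.add(i);
        Iterator<Integer> it = data.iterator();

        // Passing CONCURRENT (instead of SIZED) makes the size argument act
        // as an estimate rather than an exact count, improving splitting
        // without claiming a known size. A misuse of the flag, as noted.
        Spliterator<Integer> sp =
                Spliterators.spliterator(it, 1000L, Spliterator.CONCURRENT);

        long sum = StreamSupport.stream(sp, true)
                                .mapToLong(Integer::longValue)
                                .sum();
        System.out.println(sum); // prints 499500
    }
}
```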
