Performance model of sorted() ?
brian.goetz at oracle.com
Fri Nov 1 11:18:19 PDT 2013
The "slicing" part is right, but the single-threaded assumption is wrong.
What happens is that the upstream stateless ops will be done in
parallel, with the results being collected at the leaves of the
computation tree, into a bunch of mini-buffers. Now, each will be
sorted (in parallel), and then we begin a parallel merge, collecting the
results into an array. Then the downstream stateless ops and the
terminal op are run, in parallel.
So, no truly sequential steps imposed by the framework (though binary
parallel merges are generally dominated by the last merge step.)
On 11/1/2013 2:01 PM, Millies, Sebastian wrote:
> Hello there,
> in a talk at JAX 2013 the processing model for sorting a stream with sorted() was described as follows.
> I’d like to know if that presentation is really correct:
> .parallelStream().statelessOps() // upstream slice
> .sorted() // sorted slice
> .statelessOps().terminal() // downstream slice
> • stream is sliced
> • resulting stream is buffered at the end of upstream slice (barrier)
> • downstream slice is started after the upstream slice is finished
> • sorted-slice is run by a single thread only
> I wonder particularly about the statement saying that the sorting is done by a single thread, after all
> there is parallel array sort etc. Can anyone clarify the performance model of sorted() ?
> -- Sebastian
> Software AG – Sitz/Registered office: Uhlandstraße 12, 64297 Darmstadt, Germany – Registergericht/Commercial register: Darmstadt HRB 1562 - Vorstand/Management Board: Karl-Heinz Streibich (Vorsitzender/Chairman), Dr. Wolfram Jost, Arnd Zinnhardt; - Aufsichtsratsvorsitzender/Chairman of the Supervisory Board: Dr. Andreas Bereczky - http://www.softwareag.com
More information about the lambda-dev