Exploiting concurrency with IteratorSpliterator of unknown size
marko.topolnik at gmail.com
Sat Mar 22 21:31:47 UTC 2014
I have a use case where I process the stream returned by BufferedReader#lines() and each line takes a substantial amount of time to process (say 20 ms). The processing is easily parallelizable; however, for smaller input sizes little or no parallelization is attempted, because IteratorSpliterator hardcodes a batch size step of 1024 when there is no size estimate.
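To make the batching effect concrete, here is a minimal sketch that measures how many elements the first trySplit() takes from a spliterator of unknown size. The class name BatchDemo and the method firstBatchSize are hypothetical names used only for illustration, and the 1024 step is an implementation detail of the JDK rather than a documented guarantee, so the numbers may differ on other implementations.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;

// Hypothetical helper, for illustration only.
final class BatchDemo {

    /**
     * Wraps an iterator over `elements` integers in a spliterator of
     * unknown size and returns the size of the first chunk split off.
     */
    static long firstBatchSize(int elements) {
        Iterator<Integer> it = IntStream.range(0, elements).iterator();
        Spliterator<Integer> s =
                Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED);
        // The prefix is what a parallel stream would hand to another task.
        Spliterator<Integer> prefix = s.trySplit();
        return prefix == null ? 0 : prefix.estimateSize();
    }
}
```

If the first batch swallows the whole input, as it should for any input shorter than the initial batch step, nothing is left for a second task and the pipeline runs effectively single-threaded.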
As a workaround I have coded a modified IteratorSpliterator which takes the batch size as a parameter and keeps it fixed (no arithmetic progression of the batch size). With a batch size of 100 I achieve full load on all four cores of my laptop.
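For readers who want to see the shape of such a workaround, the following is a minimal sketch of a fixed-batch spliterator. It is not the code described above: instead of copying IteratorSpliterator and ArraySpliterator, it wraps an existing Spliterator and uses the public Spliterators.spliterator(Object[], int, int, int) factory for the split-off chunks. The names FixedBatchSpliterator, Holder and withBatchSize are hypothetical.

```java
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical wrapper: splits off fixed-size array-backed chunks.
final class FixedBatchSpliterator<T> implements Spliterator<T> {
    private final Spliterator<T> source;
    private final int batchSize;

    FixedBatchSpliterator(Spliterator<T> source, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        this.source = source;
        this.batchSize = batchSize;
    }

    /** Returns a parallel stream over the same elements, split in fixed batches. */
    static <T> Stream<T> withBatchSize(Stream<T> in, int batchSize) {
        return StreamSupport
                .stream(new FixedBatchSpliterator<>(in.spliterator(), batchSize), true)
                .onClose(in::close);
    }

    @Override public Spliterator<T> trySplit() {
        Object[] batch = new Object[batchSize];
        Holder<T> holder = new Holder<>();
        int filled = 0;
        // Pull up to batchSize elements from the source, in encounter order.
        while (filled < batchSize && source.tryAdvance(holder)) {
            batch[filled++] = holder.value;
        }
        if (filled == 0) return null; // source exhausted, nothing to split off
        // The array-backed spliterator reports SIZED and SUBSIZED by itself.
        return Spliterators.spliterator(batch, 0, filled, characteristics());
    }

    @Override public boolean tryAdvance(Consumer<? super T> action) {
        return source.tryAdvance(action);
    }

    @Override public void forEachRemaining(Consumer<? super T> action) {
        source.forEachRemaining(action);
    }

    @Override public long estimateSize() {
        return Long.MAX_VALUE; // size is unknown
    }

    @Override public int characteristics() {
        // Drop size and sort claims that this wrapper does not maintain.
        return source.characteristics() & ~(SIZED | SUBSIZED | SORTED);
    }

    // Captures the single element delivered by tryAdvance.
    private static final class Holder<T> implements Consumer<T> {
        T value;
        @Override public void accept(T t) { value = t; }
    }
}
```

Assuming a BufferedReader named reader, usage would look like FixedBatchSpliterator.withBatchSize(reader.lines(), 100).map(...), where 100 is the batch size quoted above. Whether that value suits another workload depends on the per-element cost and would need measuring.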
Since such an approach is far from elegant (taking more than 100 lines of code, which include a copy-paste of the private ArraySpliterator and the anonymous Iterator over BufferedReader's lines), I was motivated to address this mailing list in search of a better, more idiomatic way of achieving good parallelism for my scenario. What could I do instead of reimplementing a Spliterator from scratch?
More information about the lambda-dev