C/P/N/Q par vs. seq break-even analysis with 10ms think time
aleksey.shipilev at oracle.com
Wed Oct 17 07:31:26 PDT 2012
FYI, I have updated fjp-trace to track UNPARK->UNPARKED edges. This
makes it possible to infer the cost of unparking a thread; please see the
updated chart at . Notice the red edges there: each one starts at the
UNPARK request for a thread and ends at UNPARKED, when the thread has
actually woken up.
It looks very much like FJP rampup lag: we try to wake up many
threads exponentially, but that is still too slow.
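
For anyone who wants to get a feel for the UNPARK->UNPARKED cost outside of
fjp-trace, here is a minimal standalone sketch. It is not how fjp-trace
records these edges; the class name UnparkLatency and the 10ms settle delay
are assumptions made for illustration. It takes a timestamp right before
LockSupport.unpark() and another one as soon as the parked thread resumes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Hypothetical helper, not part of fjp-trace.
class UnparkLatency {
    static volatile boolean wakeRequested;

    /** Returns one observed UNPARK -> UNPARKED delay, in nanoseconds. */
    static long measureOnce() {
        wakeRequested = false;
        AtomicLong wokenAt = new AtomicLong();
        CountDownLatch started = new CountDownLatch(1);
        Thread sleeper = new Thread(() -> {
            started.countDown();
            // park() may return spuriously, so re-check the flag in a loop.
            while (!wakeRequested) {
                LockSupport.park();
            }
            wokenAt.set(System.nanoTime()); // the "UNPARKED" end of the edge
        });
        sleeper.setDaemon(true);
        sleeper.start();
        try {
            started.await();
            Thread.sleep(10); // assumption: enough time for the sleeper to really park
            long requestedAt = System.nanoTime(); // the "UNPARK" end of the edge
            wakeRequested = true;
            LockSupport.unpark(sleeper);
            sleeper.join();
            return wokenAt.get() - requestedAt;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            System.out.printf("unpark -> unparked: %d ns%n", measureOnce());
        }
    }
}
```

The numbers will vary a lot with the OS, the CPU power state, and how long
the thread has been parked, so treat a single run as a rough indication only.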
On 10/16/2012 07:27 PM, Aleksey Shipilev wrote:
> This is a more thorough analysis of what's going on at the break-even
> point in the C/P/N/Q experiment . I've taken an fjp-trace  profile
> at the break-even point, and the results are here . The new feature
> in fjp-trace can reconstruct the entire decomposition tree, which you
> might want to peek at here .
> - notice that the handoff from the submitter to FJP takes quite a bit
> of time, about 70us in this case;
> - the entire task finishes in ~500us, but the trace shows execution for
> only ~310us. This is due to the fjp-trace architecture, which cannot
> record the JOIN in the external submitters (yet). This might very well
> mean the handoff back to the blocked submitter takes another 100us;
> - threads are waking up rather slowly (on this timescale), and full-blown
> parallelism lasts for only about 50us.
> So, here's what we got on the table. If I understand this data
> correctly, then the 500us execution divides as:
> ~70us: handoff to FJP
> ~200us: FJP rampup
> ~50us: FJP steady (even though lots of balancing)
> ~100us: result handoff
> That means if we want to pursue parallel decompositions on a smaller
> scale, we need to figure out the rampup effects first. I have yet to
> figure out whether the rampup effects are due to sequential decomposition
> in the lambda code, or are genuine threading lags.
> Another thing is the interface between the submitter and the FJP. I vaguely
> recall the infrastructure for allowing submitters to run the tasks
> themselves in place, but how much effort would it take to get that to at
> least experimental readiness? (Also, I don't see how/if
> CountedCompleters could interoperate with submitters in this case; is
> there any option to make the submitter the last completer?)
>  https://github.com/shipilev/fjp-trace
>  http://shipilev.net/pub/jdk/lambda/20121003-fjpwakeup/
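
The phase breakdown in the quoted message can be tallied with a few lines of
Java. The sketch below only re-adds the four estimates and the ~500us total
given there; the class name PhaseBreakdown is made up for illustration. It
shows that the four phases add up to roughly 420us, so about 80us of the
end-to-end time is not attributed to any phase, and that the steady parallel
phase is about 10% of the total.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the figures are the estimates from the quoted message.
class PhaseBreakdown {
    static final int WALL_CLOCK_US = 500; // end-to-end task time

    /** Phase estimates in microseconds, in the order they were listed. */
    static Map<String, Integer> phases() {
        Map<String, Integer> p = new LinkedHashMap<>();
        p.put("handoff to FJP", 70);
        p.put("FJP rampup", 200);
        p.put("FJP steady", 50);
        p.put("result handoff", 100);
        return p;
    }

    /** Sums all phase estimates. */
    static int sum(Map<String, Integer> p) {
        int total = 0;
        for (int us : p.values()) {
            total += us;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> p = phases();
        for (Map.Entry<String, Integer> e : p.entrySet()) {
            double share = 100.0 * e.getValue() / WALL_CLOCK_US;
            System.out.printf("%-16s %4d us  %5.1f%%%n", e.getKey(), e.getValue(), share);
        }
        int accounted = sum(p);
        System.out.printf("accounted: %d us, unaccounted: %d us%n",
                accounted, WALL_CLOCK_US - accounted);
    }
}
```

All inputs are approximate ("~") figures, so the 80us gap may just be
rounding in the estimates rather than a missing phase.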