run-to-run variance on C/P/N/Q experiments
forax at univ-mlv.fr
Tue Oct 9 03:34:57 PDT 2012
Is it with tiered compilation enabled or not?
I've found that tiered compilation introduces more jitter than when the
VM is configured to use only C2.
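
To make the comparison concrete, here is a minimal sketch of the two
command lines I have in mind, built in Java so they can be handed to a
ProcessBuilder. The -XX:+TieredCompilation / -XX:-TieredCompilation
switches are the standard HotSpot flags; the class name and
"benchmark.jar" are hypothetical placeholders, not the actual harness
from this thread, and C2-only assumes a server VM.

```java
import java.util.ArrayList;
import java.util.List;

public class JitModeCommands {
    /**
     * Builds a java command line for one benchmark invocation.
     * benchmarkJar is a hypothetical placeholder, not a file from this thread.
     */
    static List<String> command(boolean tiered, String benchmarkJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        // +TieredCompilation uses the C1 and C2 tiers together;
        // -TieredCompilation leaves only C2 on a server VM (assumption).
        cmd.add(tiered ? "-XX:+TieredCompilation" : "-XX:-TieredCompilation");
        cmd.add("-jar");
        cmd.add(benchmarkJar);
        return cmd;
    }

    public static void main(String[] args) {
        // Print both variants so they can be compared or pasted into a shell.
        System.out.println(String.join(" ", command(true, "benchmark.jar")));
        System.out.println(String.join(" ", command(false, "benchmark.jar")));
    }
}
```

Running the same set of invocations once with each command line should
show whether the tiers account for part of the jitter.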
On 10/09/2012 11:18 AM, Aleksey Shipilev wrote:
> I'm following up on the decomposition experiments, this time focusing
> on run-to-run variance. I took one of the break-even points
> from the previous experiment, on the same machine, and executed it
> multiple times.
> For C=1, P=32, N=3000, Q=20 in the parallel case, we ran the tests in two modes:
> a. 10 iterations per JVM invocation, 1000 JVM runs
> b. 100 iterations per JVM invocation, 10 JVM runs
> The bottom line for this experiment is that we experience a huge
> run-to-run variance, which we have triaged to JIT compilation jitter:
> - scores drift from run to run, while staying within bounds inside a single run
> - -Xint mitigates the variance (with a huge penalty in scores)
> - -Xcomp -Xbatch mitigates the variance (but drops the scores)
> That also means that our break-even experiments are roughly 30-50% off
> the true value. We found no reasonable way to lower the run-to-run
> variance without a performance penalty, so the only option left at this
> point is to run with multiple invocations.
> The disassembly dumps caught for low-score and high-score runs are at the
> links below. The integer there is the throughput we have on that code. If
> someone could make sense of those logs alone, you are welcome to do so. The
> entry point for the microbenchmark is the "testParallel" method. The inline
> trees are somewhat different, but not different enough to readily explain
> the performance difference.
>  http://shipilev.net/pub/jdk/lambda/runtorun-variance/i10-f1000/
>  http://shipilev.net/pub/jdk/lambda/runtorun-variance/i100-f10/
>  http://shipilev.net/pub/jdk/lambda/runtorun-variance/i10-f1000/asms/
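
The observation that scores drift from run to run while staying within
bounds inside a run can be checked numerically by separating the two
sources of variance. This is a minimal sketch, not the analysis used in
the experiment above: the class name is hypothetical and the scores in
main are made-up numbers for illustration only.

```java
import java.util.Arrays;

public class RunVariance {
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    /** Sample variance (n - 1 denominator); needs at least two values. */
    static double variance(double[] xs) {
        double m = mean(xs);
        double sum = 0.0;
        for (double x : xs) {
            sum += (x - m) * (x - m);
        }
        return sum / (xs.length - 1);
    }

    /** Average of the per-run variances: how much scores move inside one JVM. */
    static double withinRun(double[][] runs) {
        return Arrays.stream(runs).mapToDouble(RunVariance::variance).average().orElse(Double.NaN);
    }

    /** Variance of the per-run means: how much the level drifts between JVMs. */
    static double betweenRuns(double[][] runs) {
        return variance(Arrays.stream(runs).mapToDouble(RunVariance::mean).toArray());
    }

    public static void main(String[] args) {
        // One row per JVM invocation, one column per iteration (illustrative data).
        double[][] runs = {
            {100, 101, 99},
            {140, 141, 139},
            {120, 121, 119},
        };
        System.out.printf("within-run: %.2f, between-run: %.2f%n",
                withinRun(runs), betweenRuns(runs));
    }
}
```

A between-run figure much larger than the within-run one is the
signature described above, and it is why more iterations inside a single
JVM do not help while more JVM invocations do.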
More information about the lambda-dev mailing list