Performance of locally copied members ?

Osvaldo Doederlein opinali at
Mon May 3 13:13:44 PDT 2010

2010/5/3 Martin Buchholz <martinrb at>

> It's a coding style made popular by Doug Lea.
> It's an extreme optimization that probably isn't necessary;
> you can expect the JIT to make the same optimizations.

It certainly is necessary, unfortunately. I tested my particle/octree-based
3D renderer without this manual optimization (logging FPS every 100 frames,
starting from the 10th reading after startup):

JDK 6u21-b03, Hotspot Client:

JDK 6u21-b03, Hotspot Server:

Now let's cache 8 instance variables into local variables (most of them
final, plus a couple of non-final ones):

JDK 6u21-b03, Hotspot Client:

JDK 6u21-b03, Hotspot Server:

So, the manual optimization makes no difference for HotSpot Server; but hell,
it does for Client: 6% better performance in this test. And the test is not
only the complex, deeply nested rendering loops that use those cached
variables to read the input data and update the output pixel and Z buffers;
there's also other code that burns significant CPU and doesn't use these
variables, notably the buffer filling and copying steps. This means the
speedup within the optimized code itself should be much higher than 6%; I
only measured and reported the application's overall performance.
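
For readers unfamiliar with the idiom, here is a minimal sketch of what
"caching instance variables into local variables" looks like. The class and
method names (DepthBuffer, nearestViaFields, nearestViaLocals) are made up
for illustration and are not taken from the renderer measured above; the
real code caches 8 fields across much deeper loops.

```java
// Hypothetical example class; not part of the renderer discussed in this thread.
final class DepthBuffer {
    private final float[] depths; // final field read inside the hot loop
    private int count;            // non-final field, also read inside the loop

    DepthBuffer(float[] depths, int count) {
        this.depths = depths;
        this.count = count;
    }

    // Plain version: the loop names the fields directly, so javac emits a
    // getfield instruction for every access and leaves hoisting to the JIT.
    float nearestViaFields() {
        float nearest = Float.POSITIVE_INFINITY;
        for (int i = 0; i < count; i++) {
            if (depths[i] < nearest) {
                nearest = depths[i];
            }
        }
        return nearest;
    }

    // Manually optimized version: each field is read once into a local
    // before the loop, so the loop body only touches local variables.
    float nearestViaLocals() {
        final float[] depths = this.depths; // local copy shadows the field
        final int count = this.count;
        float nearest = Float.POSITIVE_INFINITY;
        for (int i = 0; i < count; i++) {
            final float d = depths[i];
            if (d < nearest) {
                nearest = d;
            }
        }
        return nearest;
    }
}
```

Both methods return the same value; the only difference is where the field
reads happen. Note that the local copies are snapshots: if another thread
changed count during the loop, the second version would not see it, which
is fine here but worth keeping in mind for non-final fields.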

We'll need to deal with HotSpot Client for years to come, not to mention
smaller platforms (JavaME, JavaFX Mobile & TV) whose JIT compilers are even
weaker than JavaSE's C1. Tuned bytecode is also faster to interpret, which
benefits warm-up time too. Please keep your dirty purist hands off the API
code that Doug and others micro-optimized; it is necessary. :)

And my +1 to adding the same optimizations to other perf-critical APIs. This
is even more important for java.nio because, under C1, it doesn't currently
benefit from intrinsic compilation of critical DirectBuffer methods.


> (you can try to check the machine code yourself!)
> Nevertheless, copying to locals produces the smallest
> bytecode, and for low-level code it's nice to write code
> that's a little closer to the machine.
> Also, optimizations of finals (can cache even across volatile
> reads) could be better.  John Rose is working on that.
> For some algorithms in j.u.c,
> copying to a local is necessary for correctness.
> Martin
> On Mon, May 3, 2010 at 04:40, Ulf Zibis <Ulf.Zibis at> wrote:
> > Hi,
> >
> > in class String I often see member variables copied to local variables.
> > In java.nio.Buffer I don't see that (e.g. for "position" in
> > nextPutIndex(int nb)).
> > Now I'm wondering.
> >
> > From the JMM (Java Memory Model) I learned that the JVM can hold
> > non-volatile variables in a per-thread cache, so a few of them may even
> > live in CPU registers.
> > Knowing this, I don't understand why the local caching is done manually
> > in String (and many other classes) instead of trusting the JVM.
> >
> > Can anybody help me understand this?
> >
> > -Ulf
> >
> >
> >

More information about the core-libs-dev mailing list