Scoped variables

Andrew Haley aph at
Tue Dec 4 17:12:30 UTC 2018

On 12/4/18 4:02 PM, Ron Pressler wrote:
> I am, therefore, not aware of an implementation of the concept that
> does not involve some kind of hash-map lookup (or worse). The
> question is how fast that map lookup can be made to be.

Exactly, yes. The problem is that the current ThreadLocal code is very
complex, and if we restrict ourselves to a simple get() we can do
better.  I don't know if you saw my analysis of ThreadLocal
performance? It's at
The fast path is 12 field loads, 5 conditional branches, and these are
dependent loads, so have a lot of latency. We also suffer a fair bit
from mispredicted branches, from the look of the profile.
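
To make that cost concrete, here is a minimal sketch. The class and the
per-thread scratch buffer are hypothetical, made up for illustration. The
first method goes through ThreadLocal.get() on every iteration, so it pays
for the lookup each time; the second pays for it once and then works on an
ordinary local. Whether a JIT can remove the repeated lookups in the first
form is not something I would rely on.

```java
// Hypothetical example: a per-thread scratch buffer reached through a ThreadLocal.
class ScratchSum {
    private static final ThreadLocal<long[]> SCRATCH =
            ThreadLocal.withInitial(() -> new long[1]);

    // Calls ThreadLocal.get() on every loop iteration.
    static long sumLookupEachTime(int[] values) {
        SCRATCH.get()[0] = 0;
        for (int v : values) {
            SCRATCH.get()[0] += v;
        }
        return SCRATCH.get()[0];
    }

    // Calls ThreadLocal.get() once, then uses a plain local reference.
    static long sumLookupOnce(int[] values) {
        long[] scratch = SCRATCH.get();
        scratch[0] = 0;
        for (int v : values) {
            scratch[0] += v;
        }
        return scratch[0];
    }
}
```

Both methods compute the same sum; they differ only in how often the
thread-local lookup runs.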

Josh Bloch said in that discussion that he didn't intend people to use
ThreadLocal.get() with high frequency, but it's clear that Java
programmers find them so attractive they'll use them all over the place.

For example, in Chapter 7 of The Art Of Multiprocessor Programming a
ThreadLocal is used to create a queue node for a CLH lock.  It's not
sensible to use a ThreadLocal for this, but it makes the example code
shorter. Of course, every call to acquire() then takes 12 field loads
just to get to the local queue node.
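
For readers who haven't seen it, this is roughly the shape of such a
lock. It is a sketch in the style of textbook CLH locks, not the book's
code verbatim: the names CLHLock, QNode, myNode and myPred are my own
choices, the methods are called lock()/unlock() here, and the
Thread.yield() in the spin loop is an assumption to keep the sketch
friendly to small machines.

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a CLH queue lock that keeps each thread's queue node in a ThreadLocal.
class CLHLock {
    // One node per waiter; 'locked' is true while its owner holds or waits for the lock.
    static final class QNode {
        volatile boolean locked;
    }

    // The tail starts as a dummy node that is already released.
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();          // thread-local lookup on every acquire
        node.locked = true;
        QNode pred = tail.getAndSet(node);  // enqueue behind the previous tail
        myPred.set(pred);
        while (pred.locked) {               // spin on the predecessor's flag
            Thread.yield();
        }
    }

    void unlock() {
        QNode node = myNode.get();          // another thread-local lookup
        node.locked = false;                // lets the successor proceed
        myNode.set(myPred.get());           // reuse the predecessor's node next time
    }
}
```

Usage is the usual lock(); try { ... } finally { unlock(); } pattern.
Note that each acquire and release goes through the thread-local
lookups before it touches the queue at all.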

Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

More information about the loom-dev mailing list