john.r.rose at oracle.com
Mon Dec 3 19:55:37 UTC 2018
In part we are drawing on the very old idea of a Lisp "SPECIAL" variable.
Such a variable is bound by a "LET" construct, which after compilation will
be assigned to a stack frame, and managed within the lifetime of that frame.
The thing that is special about such a variable is that, given the name of the
variable, it can be accessed like a global, from blocks of source code not
directly nested inside the "LET" that binds it. A non-special variable, aka
a lexically scoped variable, is accessible only within the "LET" that binds it.
Lexical scoping is the rule we are all accustomed to in C, Java, etc.
The scoping of "SPECIAL" variables is sometimes called dynamic scoping.
The variables in the Unix shells behave as if dynamically scoped, in that
you can say "$x" inside a block of code that didn't bind "x", and still see it.
In Lisp, modularity or privacy of a binding can be obtained by using a
namespaced symbol, or an uninterned symbol. Java would manage
such concerns using object APIs.
Since the JVM (if not the Java language) has a clear idea of stack frame,
it makes sense to adapt the Lisp concepts to be directly linked to frames.
Thus, a Lisp SPECIAL is the older brother of a Java frame local.
Anyway, that's the background, as I see it.
(Caveat: None of these observations are intended to suggest that the Java
source language should introduce dynamically scoped variables, any more
than today's ThreadLocal should introduce a special kind of Java variable.
Java's scoping rules are complicated enough as they stand, and Java's object
model is sufficient for expressing dynamically bound constructs. Lisp
"SPECIALs" were originally the only kind of variable available, back in the
days when scoping was a research topic. Java can learn from history
and choose the better alternative where Lisp was saddled with both.)
(And why not "just do" fiber local variables like today's thread locals?
That's an understandable starting point, but it would cause fibers with
locals to acquire a permanent footprint cost, since there's no way for
the JVM to prove that a fiber local goes out of scope and can be
deleted. Java threads swim around with an accumulation of thread
local footprint, like a whale shark dragging a colony of lampreys.
Those same lampreys would overwhelm a goldfish. It's a problem
of scaling. So old-fashioned Lisp "SPECIALs", or some other kind
of frame-based local variable, are where you get when you finish
properly adapting the idea of thread local to Java fibers.)
At the implementation level, such as compiled code, the three basic operations
for dynamically scoped variables are creating a binding, destroying a binding,
and looking up the current binding. Lookups are inherently harder to optimize
because the stack frame that does the lookup is not always the stack
frame that holds the binding. So there's a search involved.
(Lookups are done for both reads and writes of a SPECIAL variable.
For Java it's an open question whether frame locals should be writable
at all, since existing Java uplevel variables are final, and that works just
fine for us. You can always add an indirection to a mutable box.)
Two more basic operations usually pop up when you add threads: When you
park a thread your implementation might choose to somehow move the bindings
aside out of the way, and when you unpark a thread you would then want
to move the bindings of that thread back into the foreground, where they
can be more easily accessed.
There's an old literature of how to implement the search for a dynamically
scoped variable binding. Much of it precedes our modern notions of threading.
But the basic trade-offs are still relevant, between what has been called
"deep binding" and "shallow binding". These are not semantic features
but implementation tactics. Shallow binding places the binding in a well
known location (such as a global or static variable), which compiled code can
easily access. Deep binding places the binding near the stack frame that created
the binding, requiring a search among the candidate frames starting with
the most recently created frames. Shallow binding is not thread safe unless the
well known location is somehow localized or specialized to the thread.
Although shallow binding seems like an easy win over deep binding,
shallow binding can slow down thread parking and unparking, since
all the global variables used by the relevant thread need to be edited.
So the tradeoff is the cost of ownership of a binding, versus the cost
of lookup. Clever implementations will try to balance both costs, and
pay them only when they are necessary. A search for deep bindings
can sometimes be made faster by indexing or hinting, or clever
tabular organizations which approximate the characteristics of
shallow bindings, as hash tables approximate arrays.
Deep binding is a win when binding lookup is relatively infrequent
compared to binding creation and thread switching. Deep binding
can be viewed as a lazy evaluation tactic, where the costs of finding
a binding are deferred until the value is actually looked up; conversely
shallow binding is an eager technique for lookup. It's easy to create
hybrid techniques, such as memoization of lookups (making the
second one look more "shallow"), or storing hints in thread local
and/or global tables.
For fibers, since park and unpark are high frequency operations, pure
shallow binding might well be a lose, at least for workloads with large
numbers of dynamically scoped variables. But fibers naturally have
an analog of thread park/unpark operations, and these events in
the life cycle of the fiber can be tasked with moving around table
pointers and/or hints for optimizing lookup of bindings.
Also, fibers have a small amount of control state to which might be
added additional tables or hints which would survive threading events,
and allow a fiber to quickly locate frame-local bindings regardless
of its recent history of threading. Using fiber-local control state has
two challenges: First, it should be kept compact in footprint, like the
fiber itself. Second, the cost of editing the state for bindings (which
happens when bindings are created and destroyed) must not be
excessive, since that would make method calls expensive, when
methods bind frame locals. That said, initial implementations can
be simple and a little wasteful, as long as there is headroom for
later optimization, as workloads may require.
If we do this right, a robust and performant mechanism for frame
locals could let us refactor existing JVM features which today require
stack walking. I'm thinking of doPrivileged, of course, and agreeing
with Andrew's observation about "better thread locals". The techniques
could co-exist: A frame local could define a cache point which would
be initially empty but later filled (on demand) by a stack walk.
A final observation: The JVM probably needs to be in the loop with
frame locals; dynamic scoping is probably not something that can be
implemented on top of the JVM in pure standard Java. When the
JVM is asked to perform a lookup of a frame local, it should immediately
know which method (out of all the methods which might possibly be
on stack) is the method which can define that same frame local.
This implies a registration or numbering of each frame local in the
metadata for a method. It also suggests (to me) that each frame
local should be defined in tandem with a statically defined annotation
on each method that binds it.
On Dec 3, 2018, at 10:24 AM, Ron Pressler <ron.pressler at oracle.com> wrote:
> Hi Andrew.
> Yes, scope variables are indeed orthogonal to fibers (although we would like to
> combine them with structured concurrency).
> Dean Long has started working on an early prototype, which he’ll present hopefully soon.
> While scope locals are certainly superior to TLs from a programming perspective,
> I’m not sure beating them on sheer performance would be so easy. But If you
> have some ideas on how to do that, they would be most welcome.
> On December 3, 2018 at 4:49:47 PM, Andrew Haley (aph at redhat.com) wrote:
> We discussed "scoped variables" as a possible fiber-oriented
> replacement for thread locals. I have some interest in this area, so I
> was wondering if anyone had started to work on this. If not, I'll
> volunteer to propose a design.
> These could also be useful even when fibers aren't being used: we'd
> want something more efficient than ThreadLocals, but that's a very low
> bar. So, please let me know.
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
More information about the loom-dev