lazy statics design notes
john.r.rose at oracle.com
Wed Feb 27 20:33:30 UTC 2019
On Feb 27, 2019, at 7:30 AM, Karen Kinnear <karen.kinnear at oracle.com> wrote:
> Subject: Valhalla EG notes Feb 13, 2019
> To: valhalla-spec-experts <valhalla-spec-experts at openjdk.java.net>
> III. [Remi Forax] DynamicValue attribute
> Another project Remi will lead and create JEP
> language level: static lazy final
> improve startup by allowing init with Condy at first access of individual static
> Drawbacks: opt-in at source
> change in semantics
> in static block - there is a lock
> condy BSM can execute multiple times
I was just talking with Vladimir Ivanov about lazy
statics. He is working on yet another performance
pothole with <clinit>, generated by Clojure this time.
(It's not their fault; the system had to clean up a problem
with correct initialization order, and <clinit> execution
is over-constrained already, so the JIT has to generate
more conservative code now.)
I believe lazy statics would allow programmers
(and even more, language implementors) to
use much smaller <clinit>s, or none at all,
in favor of granular lazy statics.
So, here's a brain dump, fresh from my recent
lunch with Vladimir:
Big problem #1: If you touch one static, you buy
them all.
Big problem #2: If any one static misbehaves
(blocking, bad bootstrap), all statics misbehave.
Big problem #3: If <clinit> hasn't run yet, you need
initialization barriers on all use points of statics;
result is that <clinit> itself, and anything it calls,
is uniquely non-optimizable.
Big problem #4: After touching one static, the
program cannot make progress until the mutex
on the whole Class object is released.
Big problem #5: Setting up multiple statics is not
transactional; you can observe erroneous intermediate
states during the run of the <clinit>.
Big problem #6: Statics are really, really hard to
process in an AOT engine, because nearly every
pre-compiled code path must assume that the static
might not be booted up yet, and if boot-up happens
(just once per execution) it invalidates many of the
assumptions the AOT engine wants to make about
nearby code.
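
As a small illustration of problem #1 in today's Java, here is a sketch in which reading one static forces every static initializer in the class to run. The class and field names (InitLog, EagerStatics, CHEAP, COSTLY) are made up for this example and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records which initializers have run.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

// Hypothetical class with two eagerly initialized statics.
final class EagerStatics {
    // Both initializers are compiled into the single <clinit> of this class.
    static final String CHEAP = init("CHEAP");
    static final String COSTLY = init("COSTLY");

    private static String init(String name) {
        InitLog.EVENTS.add(name);   // record the initialization event
        return "value of " + name;
    }
}

final class EagerStaticsDemo {
    public static void main(String[] args) {
        String cheap = EagerStatics.CHEAP;   // touches only one static...
        System.out.println(cheap);
        System.out.println(InitLog.EVENTS);  // ...but both initializers have run
    }
}
```

Neither field is a compile-time constant (each is initialized by a method call), so the read of CHEAP triggers class initialization rather than being inlined by javac.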
Solutions from lazy statics:
Solution #1: If you touch one, that's the one you buy
(plus what's in the vestigial <clinit>, if there is one
at all).
Solution #2: Misbehaving statics don't misbehave until
they are used (yes, bug masking, boo hoo).
Solution #3: Initialization barriers are trivial: Just
detect the T.default value of the variable.
Solution #4: There is no mutex, just a CAS at the end
of the BSM for the lazy static; no critical section.
Solution #5: The CAS at the end of the BSM is inherently
transactional.
Solution #6: AOT engines can generate somewhat simpler
fast-path code by just testing for T.default; the
slow-path code is still hard to optimize, but the limits
are from the complexity of the BSM that initializes the
lazy static, not the total complexity of the <clinit> code.
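
The T.default test plus CAS can be approximated in plain Java today. The following is only a library-level sketch, not the proposed language or VM feature: an AtomicReference stands in for the lazy static, null stands in for T.default, and an ordinary method stands in for the BSM. All names (LazyStaticSketch, bootstrap, BSM_RUNS) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical emulation of one lazy static; not an actual JVM mechanism.
final class LazyStaticSketch {
    // null plays the role of T.default, meaning "not yet initialized".
    private static final AtomicReference<String> GREETING = new AtomicReference<>();

    // Counts runs of the stand-in bootstrap method, for demonstration only.
    static final AtomicInteger BSM_RUNS = new AtomicInteger();

    // Stand-in for the BSM. Under a race it may run more than once,
    // so it should be idempotent.
    private static String bootstrap() {
        BSM_RUNS.incrementAndGet();
        return "hello, lazy world";
    }

    static String greeting() {
        String v = GREETING.get();
        if (v != null) {
            return v;                       // fast path: a single null test
        }
        String computed = bootstrap();      // slow path: no lock is held
        // The CAS publishes exactly one winner; losers adopt the winner's value.
        return GREETING.compareAndSet(null, computed) ? computed : GREETING.get();
    }

    public static void main(String[] args) {
        System.out.println(greeting());
        System.out.println(greeting());
    }
}
```

Note that the slow path has no critical section: racing threads may each run the bootstrap logic, but only one result is ever observed.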
Objection: What if you *want* a mutex? I didn't like
the JVM blocking everything in <clinit>, but I don't
want a million racing threads computing the same
BSM value either. Ans: Fine, but make that an opt-in
mechanism, by folding some kind of flow control
into the relevant BSM, for your particular use case.
The JVM doesn't have to know about it.
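
One way to picture the opt-in flow control is a lock taken inside the initializing code itself. This sketch uses double-checked locking on a volatile field as a stand-in; the names (OptInMutexSketch, expensiveBootstrap, LOCK) are hypothetical, and a real BSM could choose any other flow-control scheme.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the class and method names are illustrative only.
final class OptInMutexSketch {
    // null stands in for T.default ("not yet initialized").
    private static volatile String value;
    private static final Object LOCK = new Object();

    // Counts runs of the stand-in bootstrap logic, for demonstration only.
    static final AtomicInteger BSM_RUNS = new AtomicInteger();

    // Stand-in for an expensive bootstrap method body.
    private static String expensiveBootstrap() {
        BSM_RUNS.incrementAndGet();
        return "expensive result";
    }

    static String value() {
        String v = value;              // fast path: one volatile read
        if (v == null) {
            synchronized (LOCK) {      // opt-in mutex, private to this one static
                v = value;
                if (v == null) {       // re-check under the lock
                    v = expensiveBootstrap();
                    value = v;
                }
            }
        }
        return v;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(OptInMutexSketch::value);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("bootstrap runs: " + BSM_RUNS.get());
    }
}
```

The lock covers only this one variable, so a slow or blocked initializer here does not stall access to any other static.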
Objection: What if I want several statics to initialize in
one event, with or without mutex or transactions?
Ans: Easy, just have the BSM for each touch the others,
or run a common BSM that sets everything up (and then
returns the same value). (Note: At the cost of an
idempotency requirement during lazy init.) In the
most demanding cases, define a private static nested
class to serialize everything, which is today's workaround.
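
For reference, the nested-class workaround looks roughly like the sketch below. The names (GroupedStatics, Holder, HOST, PORT) are invented for illustration. The JVM initializes Holder on first use, under its usual class-initialization lock, so the grouped statics are set up together in one event.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example of the initialization-on-demand holder idiom.
final class GroupedStatics {
    // Counts runs of Holder's static initializer, for demonstration only.
    static final AtomicInteger HOLDER_INITS = new AtomicInteger();

    // Nothing in Holder runs until one of its fields is first touched.
    private static final class Holder {
        static final String HOST;
        static final int PORT;

        static {
            // Assigning in a static block keeps these from being compile-time
            // constants, so reading them really does trigger initialization.
            HOLDER_INITS.incrementAndGet();
            HOST = "localhost";
            PORT = 8080;
        }
    }

    static String host() { return Holder.HOST; }

    static int port() { return Holder.PORT; }

    public static void main(String[] args) {
        System.out.println(host() + ":" + port());
        System.out.println("holder inits: " + HOLDER_INITS.get());
    }
}
```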
Objection: Those aren't real statics, because you can't
set them to their T.default values! Ans: They are as
real as you are going to get without creating lots of
side metadata to track the N+1st variable state, which
is a cost nobody wants to pay.
Objection: But I do want to opt into the overhead and
you aren't giving me my T.default; I need the full range
of values for my special use case. Ans: Then add an
indirection for your use case, to a wrapped copy of your
desired value; the null wrapper value is the T.default in
this case. It's at least as cheap as anything the JVM would
have done intrinsically.
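
A sketch of the indirection, assuming a hypothetical one-field wrapper class (Box) and an int whose legitimate value happens to be 0. The null wrapper reference marks "uninitialized", so the wrapped value keeps its full range, including its own default.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; Box and the other names are illustrative only.
final class WrappedLazySketch {
    // Immutable wrapper: a null Box means "uninitialized", while the
    // wrapped int may take any value, including 0.
    private static final class Box {
        final int value;
        Box(int value) { this.value = value; }
    }

    private static final AtomicReference<Box> RETRY_COUNT = new AtomicReference<>();

    // Counts runs of the stand-in bootstrap logic, for demonstration only.
    static final AtomicInteger BSM_RUNS = new AtomicInteger();

    // Stand-in for the BSM; deliberately returns int's default value.
    private static int bootstrapRetryCount() {
        BSM_RUNS.incrementAndGet();
        return 0;
    }

    static int retryCount() {
        Box b = RETRY_COUNT.get();
        if (b == null) {
            Box fresh = new Box(bootstrapRetryCount());
            // CAS publishes one winner; a losing thread reads the winner's box.
            b = RETRY_COUNT.compareAndSet(null, fresh) ? fresh : RETRY_COUNT.get();
        }
        return b.value;
    }

    public static void main(String[] args) {
        System.out.println(retryCount());
        System.out.println("bootstrap runs: " + BSM_RUNS.get());
    }
}
```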
Objection: You disrespect 'boolean'. It only has one
state left after you filch 'false' to denote non-initialization.
My VM hack can do much better than that. Ans: Let me
introduce you to java.lang.Boolean. It has three states.
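
In code, the three states are null, Boolean.TRUE, and Boolean.FALSE. A minimal sketch, with hypothetical names (LazyBooleanSketch, computeFlag) and a racy but idempotent write in place of a CAS:

```java
// Hypothetical sketch of a lazy boolean whose computed value is false.
final class LazyBooleanSketch {
    // null = uninitialized, Boolean.TRUE / Boolean.FALSE = initialized.
    private static volatile Boolean verbose;

    // Stand-in for the BSM; returns false to show that false is representable.
    private static boolean computeFlag() {
        return false;
    }

    static boolean verbose() {
        Boolean v = verbose;
        if (v == null) {
            v = computeFlag();   // autoboxes to Boolean.FALSE here
            verbose = v;         // racing threads would all store the same value
        }
        return v;
    }

    static boolean isInitialized() {
        return verbose != null;
    }

    public static void main(String[] args) {
        System.out.println(verbose());
        System.out.println(isInitialized());
    }
}
```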
Objection: What if someone uses bytecode to monkey
with the state of my lazy static? Your design is broken!
Ans: This is the sort of corner case that needs extra
VM support. In this case, it is sufficient to privatize
write access to a variable, even though it may be public,
to its declaring class. You can trust the declaring class
not to compile subverting assignments into itself,
because javac won't let it.
Objection: I can't imagine the language design for this;
surely there are difficulties you haven't foreseen. Ans:
Neither can I, and there certainly are. The sooner we
start trying out prototypes the sooner we'll shake out
the issues. There are several things to try:
Bonus: The T.default hack scales to non-static
fields as well. So laziness is a separable tool
from the decision to make things static or not;
it survives more refactorings. The technique
is abundantly optimizable (both static and
non-static versions) as proven by the good
track record of @Stable inside the JDK. We
should share this gem outside the JDK,
which requires language and (more) VM
support. Language design issue: It's easier
to do the lazy static with an attribute than
the lazy non-static; the latter needs an
instance-specific callback. TBD.
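
A hand-written version of the non-static case, as a rough sketch only. The class name (LazyLabelPoint) and its fields are hypothetical, and the instance method computeLabel plays the role of the instance-specific callback. The JDK-internal @Stable annotation is not available to code outside the JDK, so this sketch relies on a benign race over an immutable String instead, and gets none of the JIT constant-folding benefit.

```java
// Hypothetical example of a lazy non-static field using null as T.default.
final class LazyLabelPoint {
    private final int x;
    private final int y;

    // Lazily computed; null means "not yet computed". The field is neither
    // final nor volatile: racing threads may each compute the label, but
    // every result is an equal, immutable String, so the race is benign.
    private String label;

    LazyLabelPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Instance-specific stand-in for a BSM: it depends on this object's state.
    private String computeLabel() {
        return "(" + x + ", " + y + ")";
    }

    String label() {
        String s = label;          // read the field once into a local
        if (s == null) {
            s = computeLabel();
            label = s;
        }
        return s;
    }

    public static void main(String[] args) {
        LazyLabelPoint p = new LazyLabelPoint(1, 2);
        System.out.println(p.label());
    }
}
```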
The nice thing about this is that the OpenJDK JITs
have been making good use of @Stable annotations
for a long time. So the main problem here is finding
a language and VM framework that legitimizes this
sort of pattern (including safety checks and rule
enforcement on state changes). When that is done,
the JITs should make use of it with little extra effort.