values vs. statics

John Rose john.r.rose at
Sun Apr 22 00:17:18 UTC 2018

In L-world (which is working out very well, I think) the same L-type
descriptors are used to denote both classic object types and new-fangled
value types.

When stored in an instance field (or an array element), a value type
behaves slightly but importantly differently from a regular object
type, even though the field descriptor (or array element descriptor)
is an L-type in both cases.  There are three differences:

1. A value type field (or element) is usually "flattened" by the JVM.
2. A value type is preloaded before any of its containing types is preloaded.
3. A value type field (or element) is never null.

Flattening means the value can be more efficiently accessed as
a part of the enclosing object (or array).  It's as vague as that,
but important; it is the difference between having a field of type
int and a field of type java.lang.Integer.  (A value "…works like
an int".)
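
To make the int-versus-Integer analogy concrete, here is a small
sketch in today's Java.  No value types are involved, and the class
and method names are made up for illustration.  An int[] typically
holds its payload inline and its elements default to 0, while an
Integer[] holds references that default to null.

```java
// Illustration only: today's Java, no value types. Names are hypothetical.
final class FlatteningAnalogy {
    // Elements of an int[] are primitives; the default element is 0, never null.
    static int defaultInlineElement() {
        int[] flat = new int[1];
        return flat[0];
    }

    // Elements of an Integer[] are references; the default element is null.
    static Integer defaultReferenceElement() {
        Integer[] boxed = new Integer[1];
        return boxed[0];
    }

    public static void main(String[] args) {
        System.out.println(defaultInlineElement());
        System.out.println(defaultReferenceElement());
    }
}
```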

So the important property is 1., flattenability.  We don't just say
"flattening" because of certain second-order details, which may
lead the JVM to use a pointer to a buffer, in some cases, instead
of a truly inlined and flattened representation for a variable
of a value type.  These cases include:

 A. Jumbo value types are buffered, not inlined, for a
    platform-dependent definition of "jumbo".
 B. Volatile variables of value type might be buffered as a way
    to control races, if no alternative (such as TSX or STM) is
    available.
 C. Values in containers which are inconvenient to flatten
    may be buffered, so that the container does not need to
    deal with the complexities of inlined values.
 D. A classfile may have been compiled against an old
    version of an L-type when it was an object, but now
    it has migrated to be a value.  In that case, fields in
    the old class require nullability, breaking condition
    #3 above.  The JVM can detect this on a case-by-case
    basis.  This means a type may be flattenable, but in
    these corner cases, a variable of that type might *not*
    be flattened.

Still, even if an internal buffer pointer is used to represent
a value, conditions #2 and #3 apply to value types.  If a
class makes a field which is a value type, that value type
must be loaded first.  (If an array is created, its element
type is already loaded, so condition #2 is already true,
even for object types.)  Also, the JVM must never, ever
allow a null value to appear, even if a value is buffered.

There is a user-visible exception to non-nullability of
instance variables of value type, and that is when the
instance variable is defined by a deficient container,
which we are choosing to support compatibly; that is
case D above.  In the other cases (A,B,C), nulls
are not allowed.  This non-nullability has three
consequences:

 a. A getfield on the variable is never observed to produce
    a null.
 b. A putfield of a null, if the verifier allows it, is rejected
    by the putfield instruction itself (with NPE).
 c. The default initial value of such a field is not null,
    but rather the default of the value type.
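
The three points a, b, c can be modeled in ordinary Java.  This is
only a sketch: NonNullSlot is a hypothetical stand-in for a field of
value type, not anything the JVM actually exposes.

```java
// Hypothetical model of a non-nullable field of value type; not JVM code.
final class NonNullSlot<T> {
    private T value;

    // (c) The slot starts out holding the type's default value, never null.
    NonNullSlot(T defaultValue) {
        this.value = java.util.Objects.requireNonNull(defaultValue, "default value");
    }

    // (b) A store of null is rejected by the store itself, with NPE.
    void put(T newValue) {
        this.value = java.util.Objects.requireNonNull(newValue, "null store");
    }

    // (a) A load never observes null, because no null can get in.
    T get() {
        return value;
    }

    public static void main(String[] args) {
        NonNullSlot<String> slot = new NonNullSlot<>("default");
        System.out.println(slot.get());
    }
}
```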

Condition c. is the reason for condition #2 above.
Before the first instance of a class is created, each value
type used by any of that class's fields must be loaded, and
its layout (including size, alignment, and oop content) must
be determined.  The default value of this value type must
also be determined, so that freshly created instances
of the class can be pre-packed with that default value.

Note that if, for a particular field of value type, one of
the cases A-D applies, then the default value must
be "plugged in" to the instance in the form of an invisible
buffer pointer, not copied bitwise.  This implies that when
a value type is preloaded, an important early activity is
to register a pointer to (a bitwise image of) the default
value of that value type, so that all clients of it can use
the pointer, either as the base of a copy operation or
as-is, in one of the cases A-D.

So, what about static fields of value type?  Condition
#1 above does not really apply to static fields, and
since condition #1 started this whole mess, one might
think that none of the remaining considerations apply
to static fields of value types.

(I'm leaving aside the argument of whether, in some
cases, there might be a large enough number of static
variables of value type that, in bulk, cache-coherence
effects would allow flattening to make a discernable
difference.  I'm assuming that there are few enough
statics in the world that they all fit in cache comfortably.
This assumption is true today but might change after
template classes are introduced, if a template species
could have a bunch of statics—in that case, a template
species would have many of the same scaling properties
of instances, if the template were specialized many times.)

Even if we rule out #1 for statics, I claim that conditions
#2 and #3 should be applied to static fields just as much
as instance fields.  In particular, if a field (static or instance)
is marked ACC_FLATTENABLE, the type must be preloaded
before the containing class is "prepared".  The default initial
value of the static must be the default value of the preloaded
class (as determined during the "preparation" phase of *that*
class).  The value of the static must never assume the value
"null", whether as a default initial value, or in response to
a "putstatic" instruction.  Statics defined in case D would
be exempt from these restrictions, just like non-statics.
In all other cases, a static variable of value type would
never be null.

It appears that statics of value type might qualify for
the exception in case C.  After all, who wants to do the
extra work of wedging inlined values into the static
holder of a class that defines statics of value type?

(For the record, the current static holder, in HotSpot,
is the java.lang.Class mirror of the class holding the
static.  I think this is a slick move.  If we were to inline
static value fields into the java.lang.Class, we'd have
to add more "stuff" to the already magic logic that
lays out each java.lang.Class in a customized way.)

So, using case C above, I'm happy to imagine that
value type static fields are implemented as hidden
L-type buffer pointers stored, alongside the other
statics, inside their class mirror.  I think I want to
re-examine this eventually when we do templates.

What's left to implement conditions #1-#3?  Well,
#1 is not applicable, so that's easy.  #2 is simple
also:  We simply preload a value type class if
we see the ACC_FLATTENABLE bit, and don't
inquire (at that moment) whether the ACC_STATIC
bit is set at the same time.  Condition #3 breaks into
the sub-conditions a, b, c, which adapt to statics
as follows:

 a. A getstatic on the variable is never observed to produce
    a null.
 b. A putstatic of a null, if the verifier allows it, is rejected
    by the putstatic instruction itself (with NPE).
 c. The default initial value of such a field is not null,
    but rather the default of the value type.

How do we implement this?  Well, c. is handled by
ensuring that the Class mirror is populated with a
pointer to the default value pointer created when
the value type was preloaded.  And b. is handled
by asking the link resolver to mark static fields which
are of value type, so that the interpreter will not
accidentally store a null there.  For instance fields
we do this by consulting the ACC_FLATTENABLE
bit, and throwing NPE if the putfield detects a null.
We can do the same for putstatic (at the cost of
some copied assembly code).  Sub-condition
a. falls out naturally from b. and c.
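
As a rough model of the scheme just described, the following Java
sketch treats the class mirror as a map of static slots.  Everything
here (MirrorModel, prepareFlattenable, prepareNullable) is a
hypothetical name chosen for illustration; it is not HotSpot code and
says nothing about how the interpreter is really structured.

```java
// Hypothetical model of a class mirror's static storage; not HotSpot code.
final class MirrorModel {
    private final java.util.Map<String, Object> statics = new java.util.HashMap<>();
    private final java.util.Set<String> flattenable = new java.util.HashSet<>();

    // Preparation time: a static marked ACC_FLATTENABLE starts out holding
    // the (already preloaded) value type's default value.
    void prepareFlattenable(String field, Object defaultValue) {
        flattenable.add(field);
        statics.put(field, java.util.Objects.requireNonNull(defaultValue));
    }

    // Preparation time for an ordinary (or case D) static: it starts out null.
    void prepareNullable(String field) {
        statics.put(field, null);
    }

    // putstatic with the extra null barrier for flattenable fields.
    void putstatic(String field, Object value) {
        if (value == null && flattenable.contains(field)) {
            throw new NullPointerException("null store to flattenable static " + field);
        }
        statics.put(field, value);
    }

    // getstatic needs no check of its own: the default plugged in at
    // preparation and the barrier on putstatic keep nulls out.
    Object getstatic(String field) {
        return statics.get(field);
    }
}
```

The case D slot keeps its nullability in this model, which mirrors
the exemption for down-rev. containers.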

Note that down-rev. classes (which don't know about
value types, in case D) can't get away with a putfield or
putstatic of null, since it is the container, not the client,
that gets to say when a field is not nullable.  The down-rev.
classes of course can define fields (static or not) which
are of value types, and they have a "special" ability
to store nulls in those fields too.  This special ability
goes away when their code is recompiled.

(I'm not arguing at this point for *any* other support for
nullability control, other than recompiling in the presence
of value types.  I don't want emotional types like
String! and String?, at least not any time soon.  We
have enough to do as it is.  Case D is not there because
we want expressiveness for down-rev. classes, but
because we must have compatibility.)

Why am I worried about statics?  Why not just let them
be nullable whenever they are L-types, even if they are
value types?  Because if we allow statics to be nullable,
they will be a significant source of null-pollution, even in
new code.  It's not enough for javac to insert "vdefault"
and "putstatic" into the "<clinit>" of all classes with value
statics.  The "putstatic" is too late in too many edge cases.
The nulls will be observed in the wild, when programmers
write code that reads statics inside of explicit "<clinit>"
code.  The default value of the static must be plugged in
before "<clinit>" runs, at preparation time.
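
The "too late" problem is easy to reproduce in today's Java, with an
ordinary reference static standing in for a value-type static.  The
class below is a made-up illustration; the assignment to ORIGIN plays
the role of a javac-emitted "vdefault; putstatic", which also runs
inside "<clinit>".

```java
// Illustration in today's Java; a reference static stands in for a value static.
final class EarlyReadDemo {
    // Static initializers run in textual order inside <clinit>, so this
    // read happens before ORIGIN has been assigned and sees null.
    static final String SEEN_DURING_CLINIT = String.valueOf(readOrigin());

    // Stand-in for fix-up code emitted into <clinit>: it arrives too late
    // for the read above.
    static Object ORIGIN = new Object();

    static Object readOrigin() {
        return ORIGIN;
    }

    public static void main(String[] args) {
        System.out.println(SEEN_DURING_CLINIT);
    }
}
```

If the default were plugged in at preparation time instead, the early
read would see the default value rather than null.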

More vaguely but still significantly, my personal "symmetry
design heuristic" barfs on a language design which would
say that values are not nullable, except when defined as
static variables, or on a JVM design which supports this
language and then requires javac to emit fix-up code
("vdefault;putstatic") to implement the correct semantics.
This heuristic also makes me queasy when I think about
refactoring values between static variable containers and
instance variable containers.  I want to refactor freely,
and not worry about nulls popping up like little goblins.

So I think what I'm asking for, in terms of JVM coding,
is an extra null barrier on "putstatic", and some review
of the sequence of preparation time vs. class preloading,
as it interacts with the logic for ACC_FLATTENABLE.

I am *not* asking for ACC_FLATTENED (the internal flag,
I mean) to be set on any static field.  Just ACC_FLATTENABLE,
with the corresponding null checks.

If the preparation-time patching of the default value into
the java.lang.Class mirror is inconvenient, then the alternate
technique of putting a null barrier on "getstatic" is acceptable.
In fact, that points to a refactoring of the interpreter assembly
code to cover the ACC_F* logic for all four cases of put/get
times field/static.
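
One possible reading of the "null barrier on getstatic" alternative,
sketched in Java: the slot is left unpatched (null) at preparation,
and the load substitutes the value type's default whenever it finds
null.  This is an assumption about what such a barrier would do, and
LazyDefaultStatic is a hypothetical name, not part of any JVM.

```java
// Hypothetical model of a getstatic-side null barrier; not JVM code.
final class LazyDefaultStatic<T> {
    private T raw; // null models a mirror slot not patched at preparation time
    private final java.util.function.Supplier<T> defaultValue;

    LazyDefaultStatic(java.util.function.Supplier<T> defaultValue) {
        this.defaultValue = java.util.Objects.requireNonNull(defaultValue);
    }

    // Barrier on the load: a null in the slot is never exposed to the reader.
    T getstatic() {
        T v = raw;
        return (v != null) ? v : defaultValue.get();
    }

    // Barrier on the store: nulls are still rejected with NPE.
    void putstatic(T v) {
        raw = java.util.Objects.requireNonNull(v, "null store");
    }
}
```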


— John

More information about the valhalla-dev mailing list