null checks vs. class resolution, and translation strategy for casts

John Rose john.r.rose at
Wed Apr 8 18:43:20 UTC 2020

The latest translation strategies for inline classes involve
two classfiles, one for the actual inline class C and one for
its reference projection N.  The reference projection N exists
to provide a name for the type “C or null”.  As we all know
on this list, this is a surprisingly pleasant way to handle the
problem of representing the two types.

(This was a surprise to me; I had assumed from the beginning
of the project that our build-out of new descriptors, including
Q-types, would inevitably provide the natural way for the JVM
to express null vs. non-null versions of the same nominal class.
But this failed to correspond to a language-level type system
that was workable, and also broke binary compatibility in some
cases where we wished to migrate old L-types to new Q-types.
Having names for both C and N fixes both problems, with
surprisingly little cost to the JVM’s model of types.)

But there’s a problem in this translation strategy with null values
which needs resolution.  To be quite precise, this problem requires
careful non-resolution.  The issue is the exact sequencing of the
JVM’s intrinsic null-checking operations as applied to types
which may or may not be inline classes.

(One of the delights of working on a compatible language a
quarter century old is that there’s always more to the story,
because however simple your model of things might seem,
there’s always some 25-year-old constraint you have to cope
with, that adds surprising complexity to your simple mental
model.  Today’s topic is null checking of the instanceof and
checkcast instructions, which we just discussed in a Zoom
meeting—special thanks to Dan H. and Remi and Fred P. for
guiding me in this topic.)

The static operand of an instanceof or checkcast instruction
indexes a constant pool entry of type CONSTANT_Class_info
(defined in JVMS §4.4.1).  Such a C_Class entry is resolvable.
Indeed, bytecodes that use such an entry are specified to
resolve it first, which may cause a cascade of side effects
including loading a classfile, if it has not already been loaded.
§6.5 says this about instanceof (I have added numbers but
nothing else):

> 1. The run-time constant pool item at the index must be a symbolic
> reference to a class, array, or interface type.
> 2. If objectref is null, the instanceof instruction pushes an int
> result of 0 as an int onto the operand stack.
> 3. Otherwise, the named class, array, or interface type is resolved
> (§

The corresponding documentation for checkcast is identical
except (as you might expect) for this step:

> 2. If objectref is null, then the operand stack is unchanged.

Step 1 says, “you must point to a C_Class”.  This is checked
when the class file containing the instruction is loaded.  This
step does *not* call for any classfiles to be loaded.

Step 2 handles the null case.

Step 3 requires that the C_Class reference be resolved, so that
the resolved class can be used to finish the instruction.  The
next step (4) is not so important, but I’ll include it here for
completeness, for both instanceof and checkcast:

> 4. If objectref is an instance of the resolved class or array type, or
> implements the resolved interface, the instanceof instruction pushes
> an int result of 1 as an int onto the operand stack; otherwise, it
> pushes an int result of 0.

> 4. If objectref can be cast to the resolved class, array, or interface
> type, the operand stack is unchanged; otherwise, the checkcast
> instruction throws a ClassCastException.

Notice that if the object reference on the stack is null then step
2 finishes the instruction, and step 3 is not executed to load the
referenced class (nor is step 4 executed).

This is a little bit inconvenient in the case of a checkcast to an
inline class type.  The Java language requires that a cast to an
inline class must always fail on null, while a cast to a regular
identity class must always succeed on null.  (If we ever add
other null-rejecting types to the language, similar points will
hold for their casts.)  This means that checkcast is not exactly
right as a translation for a source-level cast to an inline type.

You might think the ordering of steps 2 and 3 is an unimportant
optimization:  Why bother to do the work of loading the class if
you know the outcome of the instruction (because the operand
happens to be null)?  It’s a little more than an optimization, though.
What would happen if we were to switch the order of steps 2 and
3, so that the class is always loaded?  Could we switch the order
of checks in the JVM, moving forward from here, so that the
Java language compiler can use checkcast to translate inline
type casts?  Or, does it even matter; why not just translate with
the existing instruction even if it does let nulls through?

First, the existing behavior is important, to some extent.
If we were to switch steps 2 and 3, existing programs would
change their behavior during bootstrapping (class loading).
Suppose some class X is referred to in a checkcast instruction,
and the early behavior of some program executes this instruction
before X is loaded.  At that point, the only possible operand of
the instruction is null (since there are no instances of X around
yet).  The checkcast instruction will leave X on disk, and the
JVM will wait for some other event to trigger X’s loading.
In fact, X might not even exist at all; perhaps it’s an optional
component that is never dynamically loaded.  Java’s dynamic
linking model encourages programs to be structured this way
(whether or not it’s a good idea in any particular case).  Yes,
we have static frameworks like the module system, but they
co-exist with the original model of Java, which allows loading
decisions to be deferred until resolution of a symbolic reference.
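
To make the ordering question concrete, here is a minimal sketch in plain Java that models steps 2 through 4 of checkcast.  It is not how any JVM implements the instruction; the class CheckcastOrderModel, its methods, and the Supplier standing in for constant pool resolution are all hypothetical names invented for this illustration.  The resolver that throws NoClassDefFoundError models an optional class X that is never present.

```java
import java.util.function.Supplier;

class CheckcastOrderModel {
    /** Steps 2-4 in the order the JVMS gives: null check first, then resolution. */
    static Object checkcastAsSpecified(Object ref, Supplier<Class<?>> resolve) {
        if (ref == null) {
            return null;                    // step 2: null passes, nothing is resolved
        }
        Class<?> target = resolve.get();    // step 3: may load a classfile, or fail
        return target.cast(ref);            // step 4: ClassCastException on mismatch
    }

    /** Hypothetical variant with steps 2 and 3 swapped. */
    static Object checkcastEager(Object ref, Supplier<Class<?>> resolve) {
        Class<?> target = resolve.get();    // step 3 now runs even for a null operand
        if (ref == null) {
            return null;
        }
        return target.cast(ref);
    }

    /** Models resolution of an optional class X that does not exist. */
    static Class<?> missingX() {
        throw new NoClassDefFoundError("X");
    }

    public static void main(String[] args) {
        // Specified order: the null operand never triggers resolution of X.
        System.out.println(checkcastAsSpecified(null, CheckcastOrderModel::missingX));
        // Swapped order: the same null operand now fails during resolution.
        try {
            checkcastEager(null, CheckcastOrderModel::missingX);
        } catch (NoClassDefFoundError e) {
            System.out.println("eager order failed: " + e.getMessage());
        }
    }
}
```

In this model, a program that only ever casts null to X keeps working under the specified order, and breaks under the swapped order as soon as X cannot be resolved.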

Even if X exists in the application performing an early checkcast
to X on a null, it may be incorrect to load X the first time a
checkcast instruction refers to it.  The side effects that
arise from the resolution of X do not involve running X’s
static initializers (<clinit>) but they can involve running various
methods on class loaders, including those that may be defined
by the application.  If X has not been loaded yet, perhaps its
eventual class loader is not ready to run, and so a checkcast
may cause bootstrap errors in the application, if the semantics
of the checkcast are changed by switching steps 2 and 3 above.
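
The distinction between loading and initialization can be observed from ordinary Java code.  The sketch below uses Class.forName with initialize=false as a rough stand-in for resolution (an assumption of this example; real resolution is triggered by bytecodes, not by this call).  The names ResolutionSideEffects, Lazy, and RecordingLoader are hypothetical.  The loader's loadClass method is application code that runs during loading, while the static initializer of Lazy does not run.

```java
import java.util.ArrayList;
import java.util.List;

class ResolutionSideEffects {
    static boolean lazyInitialized = false;

    static class Lazy {
        static { lazyInitialized = true; }   // runs at initialization, not at loading
    }

    /** A loader that records every request before delegating to its parent. */
    static class RecordingLoader extends ClassLoader {
        final List<String> requests = new ArrayList<>();

        RecordingLoader(ClassLoader parent) { super(parent); }

        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
            requests.add(name);              // application code running during loading
            return super.loadClass(name);
        }
    }

    /** Loads Lazy through a RecordingLoader without initializing it. */
    static List<String> loadWithoutInitializing() {
        RecordingLoader loader =
            new RecordingLoader(ResolutionSideEffects.class.getClassLoader());
        try {
            // initialize=false: load (and possibly link) only, no <clinit>.
            Class.forName(Lazy.class.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return loader.requests;
    }

    public static void main(String[] args) {
        System.out.println("loader saw: " + loadWithoutInitializing());
        System.out.println("Lazy initialized: " + lazyInitialized);
    }
}
```

If such a loader depended on application state that is not yet set up, an earlier load request would reach it before it is ready, which is the kind of bootstrap hazard described above.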

(BTW, people who work on ahead-of-time compilers for Java
sometimes wish they could switch steps 2 and 3, so that they
can assume that, by the time a cast on X executes, there is
already a predictably stable definition of X in the application.
This is one of the many points where Java’s dynamic linking
model makes AOT hard.)

OK, so applications may fail to configure themselves correctly
if we switch steps 2 and 3 above.  (Who knew??)  What happens
if we leave the instruction as it is and also continue to use it
to translate Java language casts of inline types?

Consider this code:

interface Pointable { … }
inline class Point implements Pointable { … }
Point getThePoint(Pointable ref) {
  return (Point) ref;
}

The method is never allowed to return null, because null is not
part of the value set of Point.  But if the method were implemented
using a single checkcast instruction, then nulls would be allowed
to leak out, since checkcast must pass nulls without complaint.
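
The null-permissive behavior of checkcast is easy to see on a current JDK.  In the sketch below an ordinary final identity class stands in for the inline class Point (an assumption, since inline classes cannot be compiled with a standard javac); the wrapper class NullLeak and the fields of Point are hypothetical.

```java
class NullLeak {
    interface Pointable { }

    // Stand-in for "inline class Point": an ordinary identity class.
    static final class Point implements Pointable {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static Point getThePoint(Pointable ref) {
        return (Point) ref;   // javac emits a single checkcast here
    }

    public static void main(String[] args) {
        // The cast succeeds on null, so null flows out typed as Point.
        System.out.println(getThePoint(null) == null);
    }
}
```

For an identity class this is exactly what the language requires.  For an inline class, the same bytecode would let a null escape from a method whose return type has no null in its value set.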

This would be a simple case of a more general problem we might
call “null pollution”, when the JVM is presented with the possibility
of a null value where it is expecting an instance of an inline class.
Using null-permissive checkcast instructions to implement cast
expressions which are null-rejecting would allow polluting nulls
to travel to places where they should not be allowed.  Even if the
language allows null pollution in some places (which is not the
case here), the polluting nulls are likely to have a performance
cost, since the JVM must somehow track the null-ness of a value
when it would prefer just to break it up into its fields.

I think this example (and also more subtle ones) proves that the
static compiler needs to translate casts to inline classes differently
from casts to regular identity classes.

(As soon as we admit that we gate translation on static properties
of classes, the problem of binary compatibility arises.  If classfile
W was compiled in 1998 with a checkcast to some X which was an
identity class, and in 2031 X migrates to an inline class, then W
must execute, in some sense, without recompiling.  If W were
recompiled then the checkcast to X would be null-rejecting,
but as originally compiled in 1998 it is null-permissive.  Both
behaviors must be somehow compatible with the overall
contracts of binary compatibility.  I think that our recent
revisions of translation strategies make this work out pretty
well, since X is likely to be migrated to the reference projection
of its inline class, so null pollution will not be a problem.)

I have a proposal for a translation strategy:

1. Translate casts to inline classes differently from “classic”
casts.  Add an extra step of null hostility.  For very low-level
reasons, I suggest using “ldc X” followed by Class::cast.

Generally speaking, it’s a reasonable move to use reflective
API points (like Class::cast) on constant metadata (like X.class)
to implement language semantics.

The following alternatives are also possible; I present them
in decreasing order of preference:

2. Use invokedynamic to roll our own instruction.  It will
be a trivial BSM since we are really just doing an asType
operation.  But I think this is probably overkill, despite
my fondness for indy.

3. Translate to Object::getClass followed by pop.  JVMs are
likely to optimize this even better than Class::cast, since
getClass is final and also probably an intrinsic.  But the
first proposal is cleaner, and also has better binary
compatibility properties, because it works for both kinds
of classes.

4. Use Objects::requireNonNull instead of getClass.
That’s what users are supposed to say, after all.
But JVMs are slightly more likely to optimize getClass.

5. Use test-and-branch bytecodes instead of methods.
Please, no; control flow is harder to optimize than simple
method calls.  In general, every additional bytecode you add
to the idioms of a translation strategy slightly reduces the
probability that it will be properly optimized by the JIT.
All the other options are compact, using 1 or 2 instructions.

6. Consider adding an eager-resolution option in some form
to good old checkcast.  Basically, allow an annotated instruction
which swaps steps 2 and 3 above.  We had something like this
at one point when we allowed CONSTANT_Class names of the
form “QPoint;”; the special semicolon could be taken to trigger
eager loading, or at least prove that nulls are to be rejected.
I don’t think this is the right place to put the primitive.
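
For comparison, here is roughly what options 1, 3, and 4 look like at the source level, as a sketch on a standard JDK.  String stands in for the inline class X, and the class NullHostileCastSketch with its method names is hypothetical.  One caveat: on a standard JDK, Class::cast returns null for a null argument, so the option 1 sketch adds an explicit null check to model the null hostility that Class::cast is assumed to provide when X is an inline class.

```java
import java.util.Objects;

class NullHostileCastSketch {
    // Option 1: constant class mirror (ldc X) followed by Class::cast.
    // The requireNonNull call models the assumed null rejection for inline X.
    static <T> T viaClassCast(Class<T> type, Object ref) {
        return type.cast(Objects.requireNonNull(ref));
    }

    // Option 3: Object::getClass followed by pop, then the classic checkcast.
    static String viaGetClass(Object ref) {
        ref.getClass();          // result discarded; throws NullPointerException on null
        return (String) ref;
    }

    // Option 4: Objects::requireNonNull, then the classic checkcast.
    static String viaRequireNonNull(Object ref) {
        return (String) Objects.requireNonNull(ref);
    }

    /** Helper: reports whether the given cast throws NullPointerException. */
    static boolean rejectsNull(Runnable cast) {
        try {
            cast.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsNull(() -> viaClassCast(String.class, null)));
        System.out.println(rejectsNull(() -> viaGetClass(null)));
        System.out.println(rejectsNull(() -> viaRequireNonNull(null)));
        System.out.println(viaClassCast(String.class, "ok"));
    }
}
```

All three variants reject null before the value can escape, and none of them needs explicit control flow in the translated bytecode.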

— John
