Valhalla basic concepts / terminology

Brian Goetz brian.goetz at
Fri May 22 19:36:42 UTC 2020

Hi Kevin!

>   * There are two kinds of objects/instances; the notions "object" and
>     "instance" apply equally to both kinds. These are "inline objects"
>     and "identity objects". Statements like "it's an instance, so that
>     means it's on the heap" and "you can lock on any object" become
>     invalid, but statements like "42 is an instance of `int`" /are/ valid.


From a pedagogical perspective, it's not clear whether we are better 
off framing it as a partitioning (there are two kinds, red and blue) or 
as saying that some objects have a special property (in addition to 
their state, some objects have a special hidden property: their identity.)

We have been going down the former path, but I am starting to think the 
latter path is more helpful; rather than cleaving the world of objects 
in two, instead highlight how some (many!) objects are "special".

>       o (do we intend to use the term "object", or use the term
>         "instance", or define the two differently somehow?)
To the extent we can avoid redefining these things, I think it is easier 
to just leave these terms in place.
>   * Identity objects get the benefits of identity, at the cost that
>     you may only store /references/ to them. They will ~always go onto
>     the heap (modulo invisible vm tricks).
Yes.  Again, pedagogically, I am not sure whether the heap association 
is helpful or hurtful; on the one hand, C programmers will understand 
the notion of "pointer to heap node", but on the other, this is focusing 
on implementation details, not concepts.

One of the oddest things here is that you can have references to all 
objects, but can only pass/store inline objects directly -- it's like a 
2x2 matrix with one corner blacked out.

>   * Inline objects forgo the benefits of identity to give you the
>     /option/ to store either a reference to a heap object or the data
>     itself inline.
>       o (Users choose by e.g. writing either `Foo.val` or `Foo.ref`,
>         though one would be the default)
Yes.  It is worth noting here that we would like for the actual 
incidence of `.ref` and `.val` in real code to be almost negligible.  
Maurizio likens them to "raw types", in the sense that we need them to 
complete the type system, and there are cases where they are 
unavoidable, but the other 99.9% of the time, you just say "Point".

>   * We can also sort concrete classes into just two groups: "inline
>     classes" and "identity classes", each of which begets only its own
>     kind of objects/instances.
Yes.  All the instances of a class C are either identity objects, or 
inline objects.
>     We don't say "value types" anymore because the term "value" (as in
>     "value set") applies to /all/ types. 
Yes.  The appeal of "value" comes from "pass by value", but there is too 
much baggage associated with the word value.

The choice of inline is not perfect; it's a strange word to most people, 
but it comes with the intuition that an inline object's layout will be 
"inlined" into containing objects/arrays.  But, it doesn't mean that its 
methods will always be inlined (though that is more likely as they are 
final and the VM will have sharp type information.)
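
As a rough intuition for what "inlined layout" means, here is a 
hand-written emulation in today's Java. This is only a sketch: `Coord`, 
`LayoutDemo`, and the packing scheme are hypothetical names and choices of 
mine, and the VM's actual flattening strategy may look quite different. 
The first method walks an array of references; the second walks the same 
data packed directly into the array.

```java
// Hypothetical illustration; the class and method names are assumptions.
final class Coord {
    final int x, y;

    Coord(int x, int y) { this.x = x; this.y = y; }
}

class LayoutDemo {
    // Reference layout: the array holds pointers, each to a separate Coord object.
    static long sumXByReference(Coord[] coords) {
        long sum = 0;
        for (Coord c : coords) {
            sum += c.x;  // follows one reference per element
        }
        return sum;
    }

    // Hand-flattened layout: element i stores x at flat[2*i] and y at flat[2*i + 1].
    static int[] flatten(Coord[] coords) {
        int[] flat = new int[coords.length * 2];
        for (int i = 0; i < coords.length; i++) {
            flat[2 * i] = coords[i].x;
            flat[2 * i + 1] = coords[i].y;
        }
        return flat;
    }

    // Same sum over the flattened data: no per-element reference to follow.
    static long sumXFlat(int[] flat) {
        long sum = 0;
        for (int i = 0; i < flat.length; i += 2) {
            sum += flat[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        Coord[] coords = { new Coord(1, 10), new Coord(2, 20), new Coord(3, 30) };
        System.out.println(sumXByReference(coords));    // 6
        System.out.println(sumXFlat(flatten(coords)));  // 6
    }
}
```

The hope with inline classes is that you keep writing the first form and 
the VM is free to choose something like the second.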

>   * A concrete /class/ is either an "identity class" or an "inline
>     class". But a compile-time /type/ is distinguished not by "inline
>     vs identity" but by "inline vs /reference/".
Yeah, this is the other hard one.  In fact, it took us years to realize 
that the key distinction is not reference vs primitive/inline, but 
_identity_ vs inline.

Here's the scorecard:

Object is a reference type.
For an identity or abstract class C, C is a reference type.
For an interface I, I is a reference type.
For an inline class V, V is an inline type.
Primitives are inline types.

A reference type always holds a reference to an object (which might be 
inline or identity), or null.
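
The closest analogue in today's Java is the primitive/wrapper pair, so 
here is a small sketch using `int` and `Integer` as stand-ins for an 
inline type and a reference type. `DefaultsDemo` and its methods are 
hypothetical names for illustration. An `int` slot always holds an `int`, 
while an `Integer` or `Object` slot holds a reference to some object, or 
null.

```java
// Hypothetical demo class; uses only standard Java as it exists today.
class DefaultsDemo {
    // A fresh int[] element is 0: an int-typed slot always holds an int value.
    static int defaultInt() {
        int[] slots = new int[1];
        return slots[0];
    }

    // A fresh Integer[] element is null: a reference-typed slot may hold no object.
    static Integer defaultInteger() {
        Integer[] slots = new Integer[1];
        return slots[0];
    }

    // A reference-typed parameter accepts a reference to any kind of object, or null.
    static String describe(Object ref) {
        return ref == null ? "null" : ref.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(defaultInt());      // 0
        System.out.println(defaultInteger());  // null
        System.out.println(describe(42));      // Integer (the int is boxed first)
        System.out.println(describe("hi"));    // String
        System.out.println(describe(null));    // null
    }
}
```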

>       o must hold a "reference" (or null)
>           + Condition: the type (or, for a type variable, its bound):
>             is neither an interface nor "almost-interface"; or is a
>             subtype of IdentityObject; or is an inline class that
>             specifies ref-default; or bears an explicit `.ref`.
>           + this is probably what the term "reference type" needs to
>             apply to now. For example it is currently "reference
>             types" that my nullness analysis project is concerned with
>             and I think it would remain that way.
>           + key: it's always a reference to an instance (well, unless
>             it's null), but that might be either kind of instance.
>       o must hold an inline object
>           + Condition: it's a subtype of InlineObject (perhaps by
>             being an `inline class` itself that is a val-default....
>             or by being primitive?); or bears an explicit `.val`.
>           + this is probably what the term "inline /type/" should
>             refer to.
>       o might hold either?
>           + or can this not happen because you would be forced to
>             write `.ref/.val`?

I'm not sure I follow what this section is asking?

>   * Primitive types /are/ inline classes, full stop.
>       o It's just that for compatibility reasons they get to have
>         custom-built reference projections instead of only the
>         general-purpose `Foo.val` treatment.

That's where we hope to get, but we will have to break a few eggs to get 
there:

Egg #1: synchronization on wrappers.  Today, you can (but should not) 
synchronize on a j.l.Integer; to reach this goal, doing so would have 
to throw.
Egg #1a: possibly, depending on where we land for weak refs, a similar 
thing will happen for WeakReference<Integer>.
Egg #2: equality.  Today, equality on wrappers is identity based, and 
the primitive cache makes some small wrappers == to each other; to get 
to this goal, == would actually be equality on the contained number.  
This is arguably better, but different from how it works now.
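
For anyone who hasn't been bitten by Egg #2, here is a small sketch of 
the current behavior. `WrapperEqualityDemo` and its method names are made 
up for illustration. `Integer.valueOf` is specified to cache at least the 
range -128 to 127, so `==` on small boxes happens to agree with numeric 
equality; outside that range the result depends on the JVM and its 
settings, so the sketch makes no promise about it.

```java
// Hypothetical demo class; shows wrapper equality as it works in today's Java.
class WrapperEqualityDemo {
    // Identity comparison: are the two boxes the very same object?
    static boolean sameBox(int a, int b) {
        Integer x = Integer.valueOf(a);
        Integer y = Integer.valueOf(b);
        return x == y;
    }

    // State comparison: do the two boxes contain the same number?
    static boolean sameNumber(int a, int b) {
        return Integer.valueOf(a).equals(Integer.valueOf(b));
    }

    public static void main(String[] args) {
        // 127 is inside the guaranteed cache range, so both boxes are one object.
        System.out.println(sameBox(127, 127));       // true
        // 1000 is outside the guaranteed range; typically false, but JVM-dependent.
        System.out.println(sameBox(1000, 1000));
        // equals() compares the contained number regardless of caching.
        System.out.println(sameNumber(1000, 1000));  // true
    }
}
```

Under the goal described above, `sameBox` and `sameNumber` would always 
give the same answer.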

More information about the valhalla-spec-observers mailing list