Valhalla EG notes December 5, 2018

Karen Kinnear karen.kinnear at
Mon Dec 17 22:38:08 UTC 2018

Attendees: Frederic, Tobi, Simms, John, Karen
corrections welcome

Tobi question: JVMS for LW2 is Oct 9 - have there been updates?
Frederic: known to be incomplete, could post an update, but this is not the final one for LW2/LW2+ - which
is a bit of a moving target

1. Nestmates part 2: Lookup.defineClass()
Karen summary:
  This will not be in JDK12
  Struggling with restrictions of non-findable classes. For example, the verifier and bytecodes cannot
  “find” the class in the loaded class cache. One obvious example here is that we need special treatment
  in the verifier for “this_class”. Exploring.
  At some point we will be asking for additional examples of expected Unsafe.defineAnonymousClass behaviors,
  both without and with constant-pool patching, so we can clarify the new behavior, which we expect to be more restrictive.

Some of the behavior for UdAC is described in the unsafe.cpp source. 
Appended please find an extract from an earlier (2013) discussion that did not make it into the source - feedback welcome

2. BootstrapMethod evolution
To support JDK-8211334 ("ConstantDesc types should be Constable"), we need to enhance JVMS BSM handling.
In the past we have had some exploration of potential new bootstrap invocation patterns such as
   - lazy resolution of arguments
   - potentially changing the requirements for the initial static arguments

We have not restarted the JVMS draft process.
Will need to ensure any new proposals are not hotspot/implementation specific, rather define the expected
semantics including timing behaviors.

John: new concept of “expression mode” bootstrap methods
If you were to pass ConstantDescs to a bootstrap method, the goal is to lazily resolve arguments, i.e. rather than
evaluating subexpressions, treat as if you are seeing an AST.
Initial proposal: take a bsm form that is currently illegal and throws an exception today,
such as:
  a condy (not indy) with no Lookup, and use it to pass unresolved arguments

3. JVMTI add/delete private methods progress
Exploring treating deletion “as if” replacing the method body with a body that throws NoSuchMethodError.
Working on a prototype as well as potential rewording of JVMTI spec.
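
As a rough illustration of the "as if" semantics being explored, the sketch below shows what an already-linked caller would observe if a deleted private method behaved like a method whose body throws NoSuchMethodError. The class and method names are hypothetical, and this is ordinary Java standing in for the effect, not the JVMTI redefinition mechanism itself.

```java
// Hypothetical model of a class after a redefinition that "deleted" helper().
final class DeletedMethodSketch {
    // Before the redefinition this private method had a real body.
    // The explored semantics treat deletion as if the body were replaced by this throw.
    private static int helper() {
        throw new NoSuchMethodError("DeletedMethodSketch.helper()I");
    }

    // A caller that was linked to helper() before the deletion.
    static String callHelper() {
        try {
            return "result=" + helper();
        } catch (NoSuchMethodError e) {
            return "NoSuchMethodError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(callHelper());
    }
}
```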

4. Value Types
Karen: open design issues:
- uninitialized values
   - nullability
- substitutability

a. Uninitialized value handling:
John: updated one proposal for handling uninitialized values with clarifications: <>

ed. note: we need a new term here: In LW2 we have a Q-Point which is null-free and an L-Point which is nullable.

John is proposing an opt-in approach for value types which do NOT have a valid all zero/null default value.
For this email - I would like to call these “invalid default value types” so we do not confuse these with existing
null-free/nullable pairs. Feel free to find a better naming in future.

Karen: For invalid default value types - do we need to restrict withfield to constructors/factories? Today we allow author or nestmate, so there is no enforcement of validity - whether that is simply not the invalid default, or other type-specific invariants.
John: No change needed - author has final say which combinations are valid and who can be a nestmate

Karen: The proposal says this should be “rare”? Why do you expect that? Because there will be performance and density costs which will discourage overuse?
John: Use cases for opting in include: value-based classes and inner classes.
For regular value classes - we want “works like an int” - including not being nullable

Karen: I think there are three reasons we might want a proposal like this:
   a) validity
        - default value is not valid for these types
        - we do not want to risk using an invalid default value for
           - a receiver
           - unsafe access to fields and methods
       - acknowledge that in LW1/LW2 we ensure that we run the static initializer before handing out a default value, always - whether this is field or array initialization, defaultvalue, etc.
       - that is not the same as having run an instance initializer/constructor/factory with validity checking

      note: for value types with valid default values, we do not have the same risk

  b) language
       - there may be value to a user in being able to assign null to a value instance/check against null
       - not clear how important this is from a language perspective/traded off with the confusion of only available for
         some value types
       - note that this could be accomplished today with L-Val, which would be the starting point for value-based-classes

  c) performance
       - if a value-based-class were to migrate to a value type, I believe there are fewer optimizations available
       - still studying the details here for alternative approaches

We are exploring possible approaches similar to this, done entirely at the language level rather than in the vm as in John’s proposal
Karen: language level appears to be more fragile - potential for broken bytecodes and unexpected “3rd” states
           performance optimizations can count on two states - null or valid

Karen: if there is no pivot field - would the vm inject a field?
John: NO - all fields 0 is considered a vull
          - pivot field allows faster vull test
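
A minimal sketch of the two vull tests, modeled as a plain Java class. It assumes that a pivot field is one the author guarantees to be non-zero in every validly constructed instance; that reading, and all names here (VullSketch, id, of, defaultValue), are assumptions made for illustration and are not taken from John's proposal.

```java
// Hypothetical "invalid default value type" modeled as an ordinary class.
final class VullSketch {
    final long id;   // assumed pivot field: non-zero in every valid instance
    final int x;
    final int y;

    private VullSketch(long id, int x, int y) {
        this.id = id;
        this.x = x;
        this.y = y;
    }

    // Models the all-zero default value handed out by the vm.
    static VullSketch defaultValue() {
        return new VullSketch(0L, 0, 0);
    }

    // Models a factory that enforces the pivot invariant.
    static VullSketch of(long id, int x, int y) {
        if (id == 0L) throw new IllegalArgumentException("id must be non-zero");
        return new VullSketch(id, x, y);
    }

    // Without a pivot field: every field has to be compared against zero.
    boolean isVullByAllFields() {
        return id == 0L && x == 0 && y == 0;
    }

    // With a pivot field: a single comparison is enough.
    boolean isVullByPivot() {
        return id == 0L;
    }
}
```

The two tests agree only as long as the factory invariant holds, which is why the author keeps the final say over which field combinations are valid.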

5. LW2/LW2+ timing:
Proposal is no early access binaries for LW2 itself - sources will be ready soon
Want incremental prototypes - e.g. acmp/isSubstitutable based on a flag
… for other prototypes

LW10: no earlier than 13.

6. Tobi: aaload
in past: bytecode tells width of array element
For value types - compilers need to profile, so there is a cost to determine element width

propose: add a new bytecode with a constant pool entry of the component type

Karen: we had been considering a new bytecode or bytecode changes for arrays
e.g. anewarray - today has component type
multianewarray - has signature of created type
it would be helpful to have an array creation bytecode with the resultant signature in constant pool

Frederic: this is a different issue than aaload dynamically discovering size
Simms: vaload/vastore: vm decides flattening at runtime - so a runtime check is required
             could put the size in the array header - cheaper

John: discourage new bytecodes - not as cheap.
Karen: constant pool would not be sufficient information - the runtime needs to know if the array is actually flattened or not
Tobi: flattened or not is an implementation detail

John: prefer to remove bytecodes rather than add
    if we split aaload/vaload - aaload could be faster since no check required
    could have a header bit on the array
    header is already needed to do a range check
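
The header-bit idea can be sketched as a toy Java model of an array whose load and store paths first do the range check and then test a flag in the header. The layout, the FLATTENED_BIT constant, and the Point element type are all hypothetical; a real vm would make this decision internally and would not expose it.

```java
// Toy model: one array type whose elements are either references or flattened inline.
final class FlatArraySketch {
    static final int FLATTENED_BIT = 0x1;   // hypothetical header flag

    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    final int header;     // read together with length on every access
    final int length;
    final Point[] refs;   // used when the array is not flattened
    final int[] flat;     // used when flattened: x and y stored inline

    private FlatArraySketch(int header, int length, Point[] refs, int[] flat) {
        this.header = header;
        this.length = length;
        this.refs = refs;
        this.flat = flat;
    }

    // The "vm" decides flattening at array creation time.
    static FlatArraySketch newArray(int length, boolean flatten) {
        return flatten
            ? new FlatArraySketch(FLATTENED_BIT, length, null, new int[2 * length])
            : new FlatArraySketch(0, length, new Point[length], null);
    }

    void store(int i, Point p) {
        if (i < 0 || i >= length) throw new ArrayIndexOutOfBoundsException(i);
        if ((header & FLATTENED_BIT) != 0) {
            flat[2 * i] = p.x;        // copy the fields into the array body
            flat[2 * i + 1] = p.y;
        } else {
            refs[i] = p;              // store the reference
        }
    }

    Point load(int i) {
        if (i < 0 || i >= length) throw new ArrayIndexOutOfBoundsException(i);
        if ((header & FLATTENED_BIT) != 0) {
            return new Point(flat[2 * i], flat[2 * i + 1]);
        }
        return refs[i];
    }
}
```

In this model the flattening test is one extra branch after a header read that the range check already requires, which is the cost argument made above.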

Appendix: UdAC “anonymous” meaning: - feedback welcome

The feature being used is: 
// When you load an anonymous class U, it works as if you changed its name just before loading, 
// to a name that you will never use again. Since the name is lost, no other class can directly 
// link to any member of U. Just after U is loaded, the only way to use it is reflectively, 
// through java.lang.Class methods like Class.newInstance. 
(quoted from unsafe.cpp, which is where the documentation currently lives)

In the reduced test case, the original name U is "Anon". The anonymous class loader loads the bytes for "Anon" but renames it to something like "Anon/1234567" (note the illegal class name). In order for "Anon/1234567" to see itself, its own CONSTANT_Class constant pool entry is patched to point to the "live" class pointer for "Anon/1234567", so that CONSTANT_Class[CONSTANT_Utf8[Anon]] is never the subject of resolution.

The anonymous class loader does not attempt to find other occurrences of the name U, besides the CONSTANT_Class. For example, if there were a string literal, "my name is Anon!" it would not be transformed by the loader. Obviously. 

More subtly (and in a dark area of the specification) if the class name appears in a descriptor string, such as CONSTANT_Utf8["(LAnon;I)V"], the loader will not scan such strings to find the substring "Anon" and attempt to transform it. This leads to surprising behavior if the anonymous class (or the named template class used to set it up) contains a self-reference in a "mangled" form, as a substring like "LAnon;".

This means that anonymous classes cannot (without special, deep, and unrewarding transformations of UTF8 strings) be mentioned in field or method types, or in signatures, or (here's a corner case) as the subject of a multianewarray instruction.

The bottom line is that a CONSTANT_Class constant can refer to an anonymous class, but a signature or descriptor string cannot. 

The workaround in this case is to type the field ("arg") as an Object (or other named superclass of Anon/1234567). For type safety, a "checkcast" instruction should be issued after each "getfield Anon.arg:Anon". This will work because the "checkcast" instruction takes a CONSTANT_Class as its parameter, not a CONSTANT_Utf8.
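
At the Java source level, the workaround looks roughly like the sketch below. A named class stands in for the anonymous one (the name AnonWorkaroundSketch and the accessor names are hypothetical); the point is only that the field descriptor mentions Object, while the cast compiles to a checkcast whose operand is a CONSTANT_Class entry.

```java
// Stand-in for the template class "Anon"; the names here are illustrative only.
final class AnonWorkaroundSketch {
    // Declared as Object so the field descriptor is "Ljava/lang/Object;"
    // and never contains the class's own name.
    private Object arg;

    void setArg(Object value) {
        this.arg = value;
    }

    // The cast compiles to a checkcast instruction, which refers to the class
    // through a CONSTANT_Class entry rather than a descriptor string.
    AnonWorkaroundSketch getArg() {
        return (AnonWorkaroundSketch) arg;
    }

    public static void main(String[] args) {
        AnonWorkaroundSketch a = new AnonWorkaroundSketch();
        a.setArg(a);
        System.out.println(a.getArg() == a);
    }
}
```

The cost of the workaround is that type errors surface as a ClassCastException at the cast instead of being rejected earlier.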
