Moving from VVT to the L-world value types (LWVT)

Frederic Parain frederic.parain at
Tue Jan 16 20:56:11 UTC 2018

Here’s an attempt to bootstrap the L-world exploration, where java.lang.Object
is the top type of all value classes (as discussed during the November meetings
in Burlington).

This proposal tries to evolve the JVMS with a small set of changes to have an
implementable specification of the L-world. Instead of trying to add Q/R/U-types to
the JVMS, the approach is to extend the JVMS notion of “reference” to cover
both regular classes and value classes. The notion of “class” has also been extended
to cover both, but when needed, it is possible to specify an “object class” or a
“value class”, or respectively, “an instance of an object class” vs “an instance of
a value class”. The “Q…;” format is still used for value class types, but the “;Q”
trick is gone.

The attached document contains the sections of the JVMS that have been modified
to implement the L-world. The text doesn’t have change bars, so people are
encouraged to read each modified section entirely to check that it is consistent
and covers all cases of the L-world.

Here’s a quick summary of the changes with some consequences on the HotSpot code:
  - all v-bytecodes are removed except vdefault and vwithfield
  - all bytecodes operating on an object receiver are updated to support values as well,
    except putfield and new
  - single carrier type for both instances of object classes and instances of value classes
  - this carrier type maps to the T_OBJECT BasicType
  - T_VALUETYPE still exists but its usage is limited (same purpose as T_ARRAY)
  - qtos TosState is removed
  - JNI: the jobject type can be used to carry either a reference to an object or an
    array or a value. The type jvaluetype, sub-type of jobject, is used when only
    a value class instance is expected
  - Q…; remains the way to encode value classes in signatures (fields and methods)
  - in the constant pool, the CONSTANT_Class_info entry type is used to store a
    symbolic reference to either an object class or a value class
  - the ;Q escape sequence is not used anymore in value class names

One important point of this exercise is to ensure that the migration of Value Based Classes
into Value Classes is possible, and doable with reasonable complexity and cost. In addition
to the JVMS update (and consistent with the JVMS modifications), here’s a set of proposals
on how to deal with the VBC migration.

Migration of Value Based Classes into Value Classes:
  - challenges:
      - signature mismatch
      - null
      - change in behavior

  - proposal for signature mismatch:
      - with LWVT, value class types in signatures use the Q…; format
      - legacy code uses signatures with the L…; format (because VBC are object classes)
      - methods will have two signatures:
          - the true signature, which could include Q…; elements
          - an L-ified signature where all Q…; elements are re-written with the L…; format
      - method lookup still works by signature string comparisons
      - the signature of the method being looked up will be compared against both the
        true and the L-ified signatures. If the looked-up signature matches the L-ified
        signature but not the true signature, it means that legacy code is trying to
        invoke migrated code, and additional work might be required for the invocation
        (actions to be taken have to be defined)
      - signature mismatch can also occur for fields. This is still being investigated; the
        proposal will be updated as soon as we have a solution ready to be published
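
To make the double-signature lookup above more concrete, here is a minimal Java
sketch. It is only an illustration and not HotSpot code: the class name
SignatureLifier, the Match enum and both method names are hypothetical, and the
descriptor handling assumes well-formed descriptors. It rewrites the Q…; elements
into the L…; format and classifies a looked-up signature against the true and the
L-ified signatures of a method.

```java
final class SignatureLifier {
    // Hypothetical outcome of comparing a looked-up signature with a method.
    enum Match { TRUE_SIGNATURE, LIFIED_ONLY, NONE }

    // Rewrites every Q...; element of a descriptor into the L...; format.
    // Class names are copied as a whole, so a 'Q' inside a name is untouched.
    static String lify(String descriptor) {
        StringBuilder out = new StringBuilder(descriptor.length());
        int i = 0;
        while (i < descriptor.length()) {
            char c = descriptor.charAt(i);
            if (c == 'L' || c == 'Q') {
                int end = descriptor.indexOf(';', i);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class type: " + descriptor);
                }
                // Emit 'L', then the class name and its terminating ';'.
                out.append('L').append(descriptor, i + 1, end + 1);
                i = end + 1;
            } else {
                // '(' , ')' , '[' and primitive type characters are copied as-is.
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Plain string comparisons, first against the true signature,
    // then against the L-ified one.
    static Match match(String lookedUp, String trueSignature) {
        if (lookedUp.equals(trueSignature)) {
            return Match.TRUE_SIGNATURE;
        }
        if (lookedUp.equals(lify(trueSignature))) {
            return Match.LIFIED_ONLY; // legacy code invoking migrated code
        }
        return Match.NONE;
    }

    public static void main(String[] args) {
        String trueSig = "(QPoint;I)V";
        System.out.println(lify(trueSig));
        System.out.println(match("(LPoint;I)V", trueSig));
    }
}
```

In this sketch, a LIFIED_ONLY result is the point where the still undefined
"additional work" for the invocation would be triggered.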

  - proposal for null references leaking to migrated code:
      - having a null reference for a Value Based Class variable or field is valid in legacy code,
        but it becomes invalid when the Value Based Class has been migrated to a Value Class
      - trying to prevent all references with a value class type from getting a null value would be
        very expensive (it would require looking at the stackmap for each assignment to a local variable)
      - the proposed solution is to allow null references for local variables and expression stack slots,
        but forbid them for fields and array elements (bytecodes operating on fields and arrays have to
        be updated to throw an NPE whenever a null reference is provided instead of a value class
        instance)
      - null references are likely to be an issue for JIT optimizations like passing values in registers
        when a method is invoked. The proposed solution is to only allow null references for value classes
        in legacy code, by detecting them and blocking them when they leak to migrated code. The
        detection can be done at invocation time, when a mismatch between the signature expected
        by the caller and the real signature of the callee is detected (see signature mismatch proposal above)
      - the null reference should also be detected and blocked when it is used as a return value and the
        type of the value to be returned is a value class type
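
The null rules above can be modelled with a small Java sketch. This is an
illustration under stated assumptions, not JVM code: the class NullLeakSketch and
its methods are hypothetical, types are represented by their descriptor strings,
and "blocking" a leaking null is assumed here to mean throwing an NPE, which the
proposal does not specify for the invocation and return cases.

```java
final class NullLeakSketch {
    // A type is a value class type when its descriptor uses the Q...; format.
    static boolean isValueClassType(String descriptor) {
        return descriptor.startsWith("Q") && descriptor.endsWith(";");
    }

    // Models a store to a local variable or expression stack slot:
    // null is tolerated, no check is performed.
    static Object storeLocal(Object ref) {
        return ref;
    }

    // Models a store to a field or an array element: a null reference is
    // rejected when the declared type is a value class type.
    static Object storeFieldOrElement(String declaredType, Object ref) {
        if (ref == null && isValueClassType(declaredType)) {
            throw new NullPointerException("null stored into " + declaredType);
        }
        return ref;
    }

    // Models the invocation-time check performed once a signature mismatch
    // (legacy caller, migrated callee) has been detected: each argument is
    // checked against the parameter types of the callee's true signature.
    // Assumption: the null is blocked by throwing an NPE.
    static void checkArguments(String[] trueParameterTypes, Object[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null && isValueClassType(trueParameterTypes[i])) {
                throw new NullPointerException("null argument " + i + " for " + trueParameterTypes[i]);
            }
        }
    }

    // Models the check on a returned reference when the return type is a
    // value class type (same NPE assumption as above).
    static Object checkReturn(String trueReturnType, Object ref) {
        if (ref == null && isValueClassType(trueReturnType)) {
            throw new NullPointerException("null returned as " + trueReturnType);
        }
        return ref;
    }
}
```

The point of the split is that storeLocal stays check-free, so the cost is paid
only on field and array stores, and on the invocations and returns where legacy
code meets migrated code.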

In addition to the JVMS update, here’s a chart trying to summarize the new checks that will have to
be added to existing bytecodes when moving the v-bytecode semantics into the a* bytecodes. The categories
in the chart are not very precise, but we can use it as a starting point for our discussions. The chart
can also help define which experiments could be done to estimate the costs of the different additional
checks that need to be added to existing bytecodes.

All of this is preliminary work toward a proposal to implement the L-world value types, not a definitive
specification. It has to be analyzed and discussed before any attempt to implement it starts.
Feel free to send feedback, comments, other proposals, etc.

Thank you,


More information about the valhalla-dev mailing list