ClassValue performance model and Record::toString
john.r.rose at oracle.com
Wed Apr 8 22:46:08 UTC 2020
This note is prompted by work in a parallel project, Amber,
on the implementation of record types, but is properly a JVM
question about JSR 292 functionality. Since we’ve got a quorum
of experts here, and since we briefly raised the topic this morning
on a Zoom chat, I’ll raise the question here of ClassValue performance.
I’m BCC-ing amber-spec-experts so they know we are talking
about this. (In fact the EGs overlap.)
JSR 292 introduced ClassValue as a hook for libraries (especially
dynamic language implementations) to efficiently store library
specific metadata on JVM classes. A general use case envisioned
was to store method handles (or tuples of them) on classes, where
a lazy link step (tied to the semantics of ClassValue::get) would
materialize the required MHs as needed. A specific use case was
to be able to create extensible v-table-like structures, where a
CV would embody a v-table position, and each CV::get binding
would embody a filled slot at that v-table position, for a particular
class.
The assumption was that dynamic languages using CV would
continue to use the JVM’s built-in class mechanism for part or
all of their own types, and also that it would be helpful for a
dynamic language to adjoin metadata to system classes like
java.lang.String. Both tactics have been used in the field.
In the future, template classes may provide an even richer
substrate for the types of non-Java languages.
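To make the v-table analogy concrete, here is a minimal sketch of a
ClassValue used as one extensible slot. The "size" operation, the class
name SizeSlot, and the -1 fallback are assumptions made up for this
illustration; only ClassValue and the java.lang.invoke calls are real API.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical "size" operation, adjoined to existing classes without modifying them.
final class SizeSlot {
    static int linkCount;   // counts lazy link steps, for illustration only

    // The ClassValue plays the role of a v-table position;
    // each binding returned by get() is one filled slot.
    static final ClassValue<MethodHandle> SIZE = new ClassValue<MethodHandle>() {
        @Override
        protected MethodHandle computeValue(Class<?> type) {
            linkCount++;
            MethodHandle mh = find(type, "length");
            if (mh == null) mh = find(type, "size");
            if (mh == null) {
                // No suitable method: fill the slot with a handle that always returns -1.
                mh = MethodHandles.dropArguments(
                        MethodHandles.constant(int.class, -1), 0, Object.class);
            }
            // Give every slot the same shape so callers can use invokeExact.
            return mh.asType(MethodType.methodType(int.class, Object.class));
        }
    };

    // Looks for a public no-argument method returning int; null if there is none.
    private static MethodHandle find(Class<?> type, String name) {
        try {
            return MethodHandles.publicLookup()
                    .findVirtual(type, name, MethodType.methodType(int.class));
        } catch (NoSuchMethodException | IllegalAccessException e) {
            return null;
        }
    }

    static int sizeOf(Object o) {
        try {
            return (int) SIZE.get(o.getClass()).invokeExact(o);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

With this sketch, SizeSlot.sizeOf("hello") links a slot for
java.lang.String on first use and reuses it afterwards. Classes that are
not public fall through to the -1 slot here, because the public lookup
cannot see their methods.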
JSR 292 was envisioned for dynamic languages, but was built
according to the inherent capabilities of the JVM, and so
eventually (actually, in the next release!) it has been used
for Java language implementations as well (indy for lambda).
ClassValue has not yet been used to implement Java language
features, but I believe the time may have come to do so.
The general use case I have in mind is an efficient translation
strategy for generic algorithms, where the genericity is in the
receiver type. The specific use case is the default toString method
of records (and also the equals and hashCode methods).
The logic of this method is generic over the receiver type.
For each record type (unless that record type overrides its
toString method in source code), the toString method is
defined to iterate over the fields of the record type, and
produce a printed representation that mentions both the
names and values of the fields. The name of the record’s
class is also mentioned.
If you ask an intermediate Java coder for an implementation
of this spec., you will get something resembling an interpreter
which walks over the metadata of “this.getClass()” and collects
the necessary strings into a string builder.
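A minimal sketch of that interpreter-like version follows. It uses an
ordinary final class (the hypothetical Point2D) as a stand-in for a
record, and walks declared fields rather than record components, so it
does not depend on any particular JDK release. The names NaiveToString
and describe are likewise invented for the illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class NaiveToString {
    // Walks the metadata of self.getClass() on every single call.
    static String describe(Object self) {
        Class<?> type = self.getClass();
        StringBuilder sb = new StringBuilder(type.getSimpleName()).append('[');
        boolean first = true;
        for (Field f : type.getDeclaredFields()) {
            // Only instance state belongs in the printed form.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true);
            if (!first) sb.append(", ");
            first = false;
            try {
                sb.append(f.getName()).append('=').append(f.get(self));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return sb.append(']').toString();
    }
}

// Hypothetical stand-in for a record with two components.
final class Point2D {
    private final int x;
    private final int y;

    Point2D(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public String toString() {
        return NaiveToString.describe(this);
    }
}
```

Every call repeats the reflective walk, which is where the performance
complaints come from.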
If you then deliver this code to users, after about a microsecond
you will get complaints about its performance. We’re old hands
who don’t fall for such traps, so we asked an experienced coder
for better code. That code runs the interpreter-like logic once
per distinct record type, collecting the distinct field accesses
and folding up the string concatenations into a spongy mass
of method handles, depositing the result in a cache. That’s
(Programming with method handles is, alas, not an improvement
over source code. Java hasn’t found its best self yet for doing partial
evaluation algorithms, though there is good work out there, like
In order not to have bad performance numbers, we are also
preconditioning the v-table slot for each record’s toString
method, as follows:
0. If the record already has a source-code definition, do nothing.
1. Otherwise, synthesize an override of Object::toString
which contains a single indy instruction.
(There is also data movement via aload and return.)
2. Set up the indy to run the fancy partial MH-builder mentioned
above, the first time, and use the cached MH the second time.
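The steps above can be approximated in plain Java, with some loud
caveats. Java source cannot express an indy instruction, so this sketch
calls a bootstrap-shaped method by hand and keeps the linked handle in a
static final field, which means linking happens at class initialization
rather than on first call. ToStringBootstrap, format, and Celsius are
hypothetical names, and this is not the bootstrap the JDK actually uses
for records.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ToStringBootstrap {
    static int linkCount;   // how many times the expensive link step has run

    // Shaped like a bootstrap method: (Lookup, name, call-site type, static arguments...).
    static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type,
                              String... fieldNames) throws ReflectiveOperationException {
        linkCount++;
        Class<?> receiver = type.parameterType(0);
        MethodHandle[] getters = new MethodHandle[fieldNames.length];
        for (int i = 0; i < fieldNames.length; i++) {
            Class<?> fieldType = receiver.getDeclaredField(fieldNames[i]).getType();
            // The caller's lookup has access to the caller's private fields.
            getters[i] = lookup.findGetter(receiver, fieldNames[i], fieldType)
                    .asType(MethodType.methodType(Object.class, Object.class));
        }
        MethodHandle format = MethodHandles.lookup().findStatic(
                ToStringBootstrap.class, "format",
                MethodType.methodType(String.class, String.class, String[].class,
                        MethodHandle[].class, Object.class));
        // Bind everything that is known at link time; only the receiver stays open.
        MethodHandle target = MethodHandles.insertArguments(
                format, 0, receiver.getSimpleName(), fieldNames, getters);
        return new ConstantCallSite(target.asType(type));
    }

    static String format(String className, String[] names, MethodHandle[] getters,
                         Object receiver) throws Throwable {
        StringBuilder sb = new StringBuilder(className).append('[');
        for (int i = 0; i < names.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(names[i]).append('=').append((Object) getters[i].invokeExact(receiver));
        }
        return sb.append(']').toString();
    }
}

// Hypothetical stand-in for a record with one component.
final class Celsius {
    private final double degrees;

    // Stands in for the linked state of the single indy call site.
    private static final MethodHandle TO_STRING = link();

    Celsius(double degrees) {
        this.degrees = degrees;
    }

    private static MethodHandle link() {
        try {
            return ToStringBootstrap.bootstrap(MethodHandles.lookup(), "toString",
                    MethodType.methodType(String.class, Celsius.class), "degrees")
                    .dynamicInvoker();
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Plays the role of the synthetic override: load the receiver, call, return.
    @Override
    public String toString() {
        try {
            return (String) TO_STRING.invokeExact(this);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Note that every such class carries its own synthetic-style override and
its own call site; nothing is inherited.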
In essence, toString works like a generic algorithm, where the
generic type parameter is the receiver type. (If we had template
methods we’d have another route to take but not today…)
This works great. But there’s a flaw, because it doesn’t use ClassValue.
As far as I can tell, it would be better for the translation strategy to
*not* generate synthetic methods, but instead to put steps 1. and 2.
above into a plain old Java method called Record::toString. This
method would call x=this.getClass() and then y=R_TOSTRING.get(x)
and then y.invokeExact(this).
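As a rough sketch of that shape, assuming a hypothetical base class
RecordLike standing in for java.lang.Record and a hypothetical
ToStringLinker standing in for the MH-builder (with a much simpler
handle than the real one would produce):

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for java.lang.Record.
abstract class RecordLike {
    private static final ClassValue<MethodHandle> R_TOSTRING = new ClassValue<MethodHandle>() {
        @Override
        protected MethodHandle computeValue(Class<?> type) {
            return ToStringLinker.link(type);   // runs once per receiver class
        }
    };

    // A plain inherited method; subclasses may override it in the usual way.
    @Override
    public String toString() {
        Class<?> x = this.getClass();
        MethodHandle y = R_TOSTRING.get(x);
        try {
            return (String) y.invokeExact(this);   // call-site type: (RecordLike)String
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}

final class ToStringLinker {
    static int linkCount;   // counts link steps, for illustration only

    static MethodHandle link(Class<?> type) {
        linkCount++;
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            List<String> names = new ArrayList<>();
            List<MethodHandle> getters = new ArrayList<>();
            for (Field f : type.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
                f.setAccessible(true);   // lets unreflectGetter skip the access check
                names.add(f.getName());
                getters.add(lookup.unreflectGetter(f)
                        .asType(MethodType.methodType(Object.class, Object.class)));
            }
            MethodHandle format = lookup.findStatic(ToStringLinker.class, "format",
                    MethodType.methodType(String.class, String.class, String[].class,
                            MethodHandle[].class, Object.class));
            return MethodHandles.insertArguments(format, 0, type.getSimpleName(),
                            names.toArray(new String[0]), getters.toArray(new MethodHandle[0]))
                    .asType(MethodType.methodType(String.class, RecordLike.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static String format(String className, String[] names, MethodHandle[] getters,
                         Object receiver) throws Throwable {
        StringBuilder sb = new StringBuilder(className).append('[');
        for (int i = 0; i < names.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(names[i]).append('=').append((Object) getters[i].invokeExact(receiver));
        }
        return sb.append(']').toString();
    }
}

// Hypothetical record-like class; it declares no toString of its own.
final class PointLike extends RecordLike {
    private final int x;
    private final int y;

    PointLike(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

Here PointLike contains no generated code at all; it simply inherits
toString, and the per-class work is done the first time the ClassValue
is asked about PointLike.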
Non-use of CV is not the flaw, it’s the cause of the flaw. The
flaw is apparent if you read the javadoc for Record::toString.
It doesn’t say there’s a method there (because there isn’t) but
it says weaselly stuff about “the default method provided does
this and that”. In a purely dynamic OOL, the default method
is just a method bound to Record::toString, and it’s active as
long as nobody overrides it (or calls super.toString). People
spend years learning to reason about overrides in OOLs like
Java, and we should cater to that. We could in this case, but
we don’t, because we are pulling a non-OOL trick under the
covers, and we have to be honest about it in the Javadoc.
So there’s a concern with CV (though I don’t think an overriding one)
that we don’t get to step 3 and profit, because the lookups of x and
y appear to be interpreter-like overheads. Won’t record types suffer
in performance by having those extra indirections happen every time
toString (or equals or hashCode) is called?
(This problem isn’t unique to Records, but Records are an early
case of this sort of problem, of the need for link-time optimization
of inheritable OO methods. If you look around you might find
similar opportunities with interfaces and default methods.)
This is where CV has to get up out of its chair and make itself
useful. I think the JVM should take three steps, two sooner and
the other later, and all without changing any public API points.
1. Encourage the JIT to constant-fold through ClassValue::get.
This would fold up the proposed Record::toString method at
all points where the type of the receiver record is known to the
JIT. (That’s most places.)
2. Ensure that, if the operand to CV::get is not constant, we
get good code anyway. (This is already true, probably.) Look
for any small optimization cleanups getting through CV::get
and on into MH::invokeExact.
3. Later on, consider v-table slot splitting in response to
polymorphic methods which perform CV::get on their
receiver. In general, v-table slot splitting is the practice
of installing differently compiled code in different v-table
slots of the same method. It can make sense if the JIT
can do different jobs optimizing the same code on different
classes of receivers. It’s usually a heroic hand optimization,
but can also be done by the JVM.
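To picture what step 1 asks of the JIT, the following sketch writes the
folded form out by hand. It makes no claim about what any JIT does
today. The general path performs getClass and ClassValue::get before
the call; the hand-folded path models the case where the receiver class
is known by holding the same handle in a static final field.
FoldingSketch, describe, and the Integer example are assumptions made
up for the illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class FoldingSketch {
    // Generic slot: one handle per receiver class, created lazily.
    static final ClassValue<MethodHandle> DESCRIBE = new ClassValue<MethodHandle>() {
        @Override
        protected MethodHandle computeValue(Class<?> type) {
            try {
                MethodHandle describe = MethodHandles.lookup().findStatic(
                        FoldingSketch.class, "describe",
                        MethodType.methodType(String.class, String.class, Object.class));
                // Bind the class name at link time; the receiver stays open.
                return MethodHandles.insertArguments(describe, 0, type.getSimpleName());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
    };

    // Hand-folded form for a receiver known to be an Integer:
    // the two lookups have been replaced by a constant handle.
    static final MethodHandle DESCRIBE_INTEGER = DESCRIBE.get(Integer.class);

    static String describe(String className, Object receiver) {
        return className + "(" + receiver + ")";
    }

    // General path: getClass, then ClassValue::get, then the call.
    static String generic(Object o) {
        try {
            return (String) DESCRIBE.get(o.getClass()).invokeExact(o);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Folded path: only the call remains.
    static String foldedForInteger(Integer i) {
        try {
            return (String) DESCRIBE_INTEGER.invokeExact((Object) i);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Both paths return the same string for the same Integer; the difference
is only in how much lookup work precedes the call.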
One more item, not directly related to CVs but related
to the above optimizations:
4. We should invest in one or more auto-bridging features
in the JVM, where a call site (such as MyRecord::toString)
can be rerouted through an intermediate step before it gets
to the built-in target mandated by the JVMS (such as
Object::toString or Record::toString), and can also be
routed somewhere even if the supposed target method
doesn’t even exist. Perhaps the target method symbolic
reference is Foo::equals(int) and the statically matching method
is Foo::equals(Object); normally the static compiler puts
in an auto-boxing step to fix the descriptor but there are
reasons to consider a more dynamic bridging solution.
Such a rerouting decision would be very naturally cached
in v-table slots, obviating some or all of step 3 above.
In the presence of feature #4, we might rewrite
Record::toString to (somehow) advertise that it
had no regular method body, but that it would be very
happy to bridge any and all calls, using some advertised
BSM, and decoupling the implementation from ClassValue.
This implementation decision could be hidden from
the user (and the Javadoc), but only if we did the ClassValue
trick today, so we could advertise Record::toString as
a regular old object-oriented method (with clever
optimizations inside its implementation, natch).
So, let’s take ClassValue off the bench, and start warming