JEP draft: Better-defined JVM class file validation
daniel.smith at oracle.com
Tue Jun 8 21:10:39 UTC 2021
> On Jun 6, 2021, at 10:17 AM, Remi Forax <forax at univ-mlv.fr> wrote:
> ----- Mail original -----
>> De: "daniel smith" <daniel.smith at oracle.com>
>> À: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
>> Envoyé: Vendredi 4 Juin 2021 19:11:11
>> Objet: Re: JEP draft: Better-defined JVM class file validation
>>> On Jun 4, 2021, at 10:41 AM, Dan Smith <daniel.smith at oracle.com> wrote:
>>> Posted a new JEP draft, here:
> Hi Dan,
> that's a lot of nice cleanups.
> If i can emit a critic, i would have preferred to have the text of the JEP to clearly separate spec issues from Hotspot issues,
> to the point where i wonder if it's not better to have two JEPs.
It's not something that is cleanly separable—a lot of times JVMS says X, HotSpot does Y, and the best way to reconcile the difference is with Z. But I have tried to call out exactly what HotSpot behavioral changes can be expected in each section (look for the bulleted lists).
>> • New errors caused by improper appearances of the Module, ModulePackages,
>> ModuleMainClass, and ConstantValue attributes.
> yes, those attributes should be checked by the VM an not only in Java code.
It's tricky because, strictly speaking, the JVM spec never "loads" a module class file, so never performs format checking on it. But we assert plenty of rules about what a properly-formed module class file looks like (see JVMS 4.1), and that seems like a reasonable thing to do, given that modules are a fundamental part of the runtime system. Exactly when those rules are enforced is left out of JVMS, and is up to libraries (like ClassLoader, for example), which should have their own analog to format checking.
>> • New errors caused by pre-51 class files that declare a useless method with
>> name <clinit> and 1 or more parameters.
> I don't see the point of this change, unlike all the others, this one will reject classes that were valid before.
It comes from trying to sort out the definition of "class initialization method".
Often, the name '<clinit>' is interpreted as an unambiguous indicator of a class initialization method. If I recall correctly, HotSpot has certain behaviors that are unique to methods named '<clinit>', like verifying them as if they are static, whether ACC_STATIC is set or not. This is derived from some assertions in 4.6 about "class initialization methods".
But there's also a notion that a class only has one initialization method, and that method has the expected descriptor (see 2.9.2, 5.5 step 9).
How do we reconcile these? By not letting anything through that is named '<clinit>' but that has the wrong descriptor. Yes, it's an incompatible change, but it seems less disruptive than pulling on the thread of "what does it mean to have a method named '<clinit>' that isn't a class initialization method, and what bugs do we have to track down and fix for this model to make sense?"
If we have evidence of these methods in the wild, that might push me to find a different solution, but otherwise I expect it's just a theoretical incompatibility that nobody will notice.
>> • Accepting class files with malformed optional attributes, even though those
>> class files could fail to load on an older JVM.
> It's not clear to me if this change is only for classfile version >= V18 or not.
> I would vote for having that change applied on all classes given it's an Hotspot bug, not a spec bug.
Everything in the JEP applies to class files of all version numbers, unless explicitly stated otherwise.
This is a significant change to both the spec and implementation, in this case expanding the set of legal class files. It's a bug fix, yes, but one that is resolved with a very broad stroke. The rationale is that reconciling all the differences between JVMS and HotSpot in this area would be a bunch of work that adds a lot of complexity to both, and all to serve... no good purpose, really. We'll all be happier (we hope) if we draw a bright, hands-off line for format checking of optional attributes, and leave further validation as an implementation/API detail.
>> Besides the risk to JVM users, there is some risk that, by relaxing the
>> constraints on optional attributes, downstream tools will be surprised by
>> unvalidated attribute contents in class files that can be successfully loaded.
> Actually, the fact that the InnerClasses attribute is validated by the VM is confusing because it implies that the VM uses the values of that attributes to check access which it does not.
> For me, the spec says that those attributes are optionals so Hotspot should behave as the spec says.
The spec says some things that are unclear about how/whether InnerClass attributes should be validated, but yes, agree that there will be less ambiguity if we're totally hands-off.
> And i'm a little worry about rejecting the ConstantValue attribute if the field is not static because i think i've seen a code using ConstantValue on all fields but i'm not able to recall where so i've not idea if it was in library used or not. I suppose this behavior will be gated with a V18 checks so it should not be an issue anyway.
Again, no, not expecting to gate this with a version check. (If we did, it's more trouble than it's worth, because we've now got to specify and implement two different behaviors, when the goal is simplification.)
The particular problem that led me here is that HotSpot currently doesn't recognize a ConstantValue attribute *at all* if it is applied to a non-static field. It is treated like a nonstandard attribute, just as it (rightly) is if applied to a method. I'm okay with specifying that an attribute is recognized, or not, depending on where it appears, but I would really prefer not to bring the access flags of a field declaration into it! (JVMS backs up the HotSpot interpretation in 4.7.2, but if we're going to have a special rule, it really belongs in 4.7.1. This feels like the kind of thing where somebody noticed an anomaly in the implementation and tried to document it in the spec, but didn't follow through on the implications.)
Again, if there is evidence of this that we can find in the wild, that could shift the balance in the compatibility conversation, and going out of our way to support it may be justified.
More information about the valhalla-spec-observers