RFR (S): Class.getSimpleName() should work for non-JLS compliant class names
john.r.rose at oracle.com
Fri Apr 10 19:15:01 UTC 2015
On Apr 9, 2015, at 4:16 PM, David Holmes <david.holmes at oracle.com> wrote:
> Hi Vladimir,
> On 10/04/2015 12:51 AM, Vladimir Ivanov wrote:
>> David, thanks for the feedback!
>> Updated webrev:
>> I restored original logic and call into VM only if it fails.
> This change seems fine to me, but I think John may prefer to see getSimpleName implemented such that it always returns the name from the innerclass attribute. (Though perhaps with caching if performance is a concern?)
Yes, I do prefer the new logic instead of a combination of old and new.
Two concerns: Somebody somewhere might be relying on a bug in the old logic and be frustrated by seeing that bug fixed. This is a fine line to walk, but in this case I think the behavior change (if any) will show up in places where we already have outages. Perhaps someone is post-processing the result of getSimpleName to correct it. If so they may no longer need to do that.
Second, the new logic may itself have issues. A typical problem with the InnerClasses attribute is it is broken or stripped, or a related nestmate is missing. (See parallel thread in core-libs-dev on 8076264.) But the proposed change takes effect after a call to Class.getEnclosingClass, which uses InstanceKlass::compute_enclosing_class_impl and accesses the same InnerClasses record.
This leads me to a comment: The common code (two uses of InnerClassesIterator with klass_name_at_matches) should be factored out. The factored subroutine should search out the self-record in the InnerClasses attribute. The logic is tricky enough that I'm uncomfortable with it being duplicated. Key issue: We don't want to load any classes doing this lookup.
(For bonus points, factor the third use in that file, the one that looks not for self but for children-of-self. Add a boolean flag that varies the test.)
Bottom line: The new logic should replace the old.
>>>> The logic to compute simple name (Class.getSimpleName()) for
>>>> inner/nested/local classes is tightly coupled with Java naming scheme
>>>> and sometimes fails for classes generated from non-Java code.
>>> Meta-question: if this is non-Java code then what does/should
>>> "simpleName" even mean? "Returns the simple name of the underlying class
>>> as given in the source code." If there is no (java) source code does
>>> this have any meaning? Should it return "" ?
>> My reading of the JVMS is that InnerClasses attribute can be freely used
>> by non-Java compilers to store simple names for classes they generate.
>> The current wording for inner_name_index doesn't mention Java language.
> True, but it does refer to source code: "represents the original simple name of C, as given in the source code from which this class file was compiled. " which seems misplaced as we're discussing classes for which there is no source code as such.
It could be non-Java source code. In any case, the indicated component of the "binary name" of the class is well-defined, whether or not it occurs in source code. Note also that classes might be generated by ASM (no source code per se) but we still have a reasonable expectation of reflecting on them, even though reflection is (mostly) defined in terms of source code constructs.
> Anyway it just flags to me that perhaps these Class methods need to be generalized a bit given the support for non-Java languages on the JVM. But that's for core-libs folk to decide.
Yes, I think that's the issue. The multi-language folks (like me) would be disappointed to hear that we decided that non-Java languages don't have any needs here; they do.
>> Best regards,
>> Vladimir Ivanov
>>>> Instead of parsing class name and try to extract simple name based on
>>>> JLS rules, the fix I propose is to use InnerClasses attribute from the
>>>> class file. Simple name is already recorded there.
>>>> JVMS-4.7.6: The InnerClasses Attribute
>>>> "inner_name_index: If C is anonymous (JLS §15.9.5), the value of the
>>>> inner_name_index item must be zero. Otherwise, the value of the
>>>> inner_name_index item must be a valid index into the constant_pool
>>>> table, and the entry at that index must be a CONSTANT_Utf8_info
>>>> structure (§4.4.7) that represents the original simple name of C, as
>>>> given in the source code from which this class file was compiled."
>>>> Since I consider backporting the fix into 8u60, I'd like to hear
>>>> opinions about backward compatibility of such change.
>>>> As an alternative solution, I can restore original logic and consult
>>>> InnerClasses attribute when class name parsing logic fails.
>>> I think I'd prefer to see the VM call only used as a fallback if the
>>> regular parsing fails. That would prevent any compatibility issues for
>>> the 8u backport (ignoring tests that deliberately generate invalidly
>>> named classes).
>>>> Testing: regression test, jck-runtime/java_lang, jdk/test/java/lang/
>>>> Best regards,
>>>> Vladimir Ivanov
More information about the core-libs-dev