Request for review (#4) 7174978: NPG: Fix bactrace builder for class redefinition

Coleen Phillimore coleen.phillimore at
Tue Jan 15 13:12:58 PST 2013


Thank you for the prompt review!

On 01/14/2013 10:48 PM, David Holmes wrote:
> Hi Coleen,
> Thanks for the detailed explanation. Overall approach sounds good to 
> me. A few comments ...

> Copyrights to 2013 :)

It's unclear whether we have to do this.   I think there is going to be 
some RE process that fixes these files.

> Are you sure you can remove the old merlin API? :D But note that it is 
> still referenced in jdk/src/share/javavm/export/jvm.h so a follow up 
> cleanup on the JDK side is needed. (Hmmm does that also mean a CCC?)

I will file a bug against JDK to clean this up and see what hoops I have 
to go through to remove an unused entry.   This is the only file where 
it appears.   It must have had a CCC to remove it in the first place.

> ---
> src/share/vm/prims/jvmtiRedefineClasses.cpp
> 2411   constantPoolHandle smaller_cp(THREAD,
> 2412   ConstantPool::allocate(loader_data, scratch_cp_length, THREAD));
> Can you indent 2412 please so it is more obvious allocate is being 
> used to pass a parameter to the handle constructor. (The old code had 
> the same flaw).


> ---
> src/share/vm/oops/constantPool.hpp
> !   void add_version(int version)           {
> !     if (version == -1) {
> !       _saved._version = version;  // keep overflow
> !     } else {
> !       int new_version = version++;
> !       _saved._version = version == INT_MAX ? -1 : version;
> !     }
> !   }
> new_version is unused here but I think you meant to do
> _saved._version = new_version == INT_MAX ? -1 : version;

This was a pre-code-review cleanup...   I didn't mean to assign 
version++ into new_version.   The second line stays the same.  I 
retested this fix.

> ---
> src/share/vm/classfile/javaClasses.cpp
> 1350   // Allocate temporary buffer with extra space for formatting 
> and line number
> 1351   char* buf = NEW_RESOURCE_ARRAY(char, buf_len + 64);
> Can you elaborate on the value 64?

Not my fault?   Someone must have calculated that 64 is enough to hold 
the potential things printed to the buffer below.  I recalculated and 64 
is sufficient because source_file_name is added to the buf_len earlier 
so all that's left is the line numbers and

"(Unknown Source) (nmethod  0x1234567812345678)"

> 1370         // Neither soucename and linenumber
> Typo: soucename
> And it should be "Neither sourcename nor linenumber"


> 1425         // NULL mirror means end of stack trace
> 1426         if (mirror.is_null()) goto handle_cause;
> Don't like the goto :) But I'm having trouble following all the 
> looping in this section of code. Is a NULL mirror intentional or does 
> it indicate an error elsewhere? I'm trying to see why encountering a 
> NULL mirror in the inner for loop means the enclosing while loop is 
> also done - and if that loop would terminate "naturally" if you just 
> did a break from the for loop ?

I don't like the 'goto' either.   It's 2013 as you point out earlier ;)  
I had a version that extracted this code getting rid of the goto, but 
opted for the lesser change.   Yes, a NULL mirror means there are no 
more stack trace elements in that chunk.   The chunk (length 32) is 
preallocated, preinitialized to zero, and filled in as method/mirror/bci 
elements are found on the stack.  When you hit 0 for mirror, that's the 
end.  0 for method id doesn't mean it's the end.   There's an assert 
that we never add a NULL mirror in BacktraceBuilder::push.

This code is only called for exceptions thrown at initialization time.  
It's hard to test this code since most exceptions during initialization 
in universe_post_init() cause

#  Internal Error 
pid=10369, tid=47170653824768
#  assert(resolved_method->method_holder()->is_linked()) failed: must be 

Should be another CR, which I thought we'd already fixed!

> 1434   handle_cause:
> 1435     {
> 1436       EXCEPTION_MARK;
> 1437       JavaValue result(T_OBJECT);
> This seems to be hiding the "result" variable declared at 1409 at the 
> top of the while loop ??

Yes, it does.  I renamed this "result" to "cause".
> 1682     // Count element in remaining partial chunk.  Non-null mirror 
> means theres
> 1683     // the stack trace element is saved.
> This comment doesn't read correctly.


     // Count element in remaining partial chunk.  NULL value for mirror
     // marks the end of the stack trace elements that are saved.

> 1735   // SystemDictionary::stackTraceElement_klass() will be null for 
> pre-1.4 JDKs
> 1736   assert(JDK_Version::is_gte_jdk14x_version(), "should only be 
> called in >= 1.4");
> Seems obsolete now - chuck it out with the rest of the merlin code?

With pleasure.


> Cheers,
> David
> ------
> On 15/01/2013 12:36 PM, Coleen Phillimore wrote:
>> Summary: Remove Method* from backtrace but save version so redefine
>> classes doesn't give inaccurate line numbers. Removed old Merlin API
>> with duplicate code.
>> Sorry, this is so long but I want to explain this change properly.
>> When exceptions are thrown, the VM saves a backtrace for each which may
>> or may not be printed or saved for later. For speed, the information
>> that the VM saves are arrays 32 of Method* and bci. For permgen
>> elimination we added an array of mirrors (instances of java_lang_Class),
>> so that the class loader owning the Method* pointer doesn't get unloaded
>> and deallocated.
>> The problem with redefine classes is that the method can be redefined
>> and the Method* pointer can be deallocated even if the class loader
>> owning the Method* is kept alive. Printing or touching the backtrace
>> later will crash the VM.
>> This is my 4th attempt at fixing this problem. The first that was
>> reviewed, saved a side structure for each backtrace created so that we
>> can walk the Method* pointers and keep them from being deallocated. This
>> change had an unfortunate side structure to manage, and had also a 6.1%
>> slowdown in javac. The second attempt was to predigest the Method* into
>> the data needed for java_lang_StackTraceElement into method_name,
>> class_name, file_name, and line_number, so that the Method* doesn't need
>> to be saved. This had a 18% slowdown in javac.
>> The 3rd attempt is to look for a way to have GC handle backtraces and
>> mark methods as on_stack but adding a InstanceThrowableKlass and
>> determining which closures needed special actions for the array of
>> Method* pointers is a lot of additional special case code to the garbage
>> collectors. It's unclear how this would work actually, but we might have
>> to revisit this idea for java_lang_invoke_MemberName.
>> The 4th attempt is to save the method_idnum, bci, version_number, and
>> mirror. From the method_idnum, we can get the Method*. The version
>> number is updated for redefine classes. There was an unused field that I
>> repurposed. If the method's version doesn't match the backtrace, we do
>> not return the source_file and line_number from the Method* because it's
>> changed. You get a backtrace of the form:
>> java.lang.RuntimeException: Test exception
>> at RedefineMethodInBacktraceTarget.methodToRedefine(Unknown Source)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke( 
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke( 
>> at java.lang.reflect.Method.invoke(
>> at
>> RedefineMethodInBacktraceApp.getThrowableFromMethodToRedefine( 
>> at
>> RedefineMethodInBacktraceApp.doMethodInBacktraceTest( 
>> This fix for the Method* crash does not add to the footprint of the
>> backtraces (idnum is short and bci and version number is packed into an
>> int). It does not degrade performance of javac (or refworkload). The
>> cost is to not support getting a source file and line number from a
>> method that was redefined after being saved in a backtrace. Note that
>> this change does not return the wrong information. A CR can be filed for
>> somebody to someday lift this restriction if someday we don't care about
>> performance or a new solution is found.
>> This might be hard to review as a diff because I refactored duplicated
>> code.
>> The webrev:
>> Bug link:
>> Tested with NSK quick testlist, runThese jck tests, JPRT, refworkload.
>> Thanks,
>> Coleen

More information about the hotspot-runtime-dev mailing list