RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)

Coleen Phillimore coleen.phillimore at oracle.com
Tue Apr 1 03:30:11 UTC 2014


You are right.  The old code would use the Klass->mirror link to clear 
out the mirror->Klass link so that we didn't have a pointer to a 
deallocated Klass in a mirror that might still be live for CMS marking.
The new code doesn't set the Klass->mirror link unless the mirror fields 
are completely allocated, so we could crash if we deallocated a Klass 
before the mirror because the mirror would still be pointing to the class.

I've caught failed allocation in mirror fields so that the link is reset 
in this change.  The only change is to the function create_mirror calls 

open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/

Tested with vm.parallel_class_loading.testlist which do hit this case.


On 3/27/14 1:11 PM, Erik Helin wrote:
> Hi Coleen,
> thanks for taking on this tricky bug! However, I think there is an 
> issue with your fix :(
> This case is for "normal" (non-CDS) class loading:
> 1. We create the mirror
> 2. We set up the link from the mirror to the Klass
> 3. We fail somewhere before setting the link from the Klass to the
>    mirror, for example when allocating the Java object for the init
>    lock.
> 4. The destructor for ClassFileParser is run and we add the Klass to the
>    deallocate list.
> 5. A CMS concurrent cycle is started, it will treat all objects in eden
>    + the survivor area as roots (including the mirror).
> 6. The above concurrent cycle enters the "final remark" phase and does
>    class unloading which will call free_deallocate_list() which will
>    return the memory for the Klass added to the deallocate list in step
>    4 to Metaspace.
> 7. The concurrent CMS cycle ends (the mirror is still in eden or young)
> 8. A new concurrent CMS cycle is started, the mirror will once again be
>    treated as a root since it is still in eden or young (no YC has been
>    done).
> 9. During the initial marking, CMS will now follow the pointer to the
>    Klass that the mirror is mirroring, but we returned this memory to
>    Metaspace in step 6.
> 10. We now crash :(
> Unfortunately, I don't have any suggestion on how to solve the above 
> scenario at the moment.
> Thanks,
> Erik
> On 2014-03-27 16:04, Coleen Phillimore wrote:
>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>> I have a new version of this change based on code that Ioi noticed
>>>> that I was missing for making restoring the method's adapters
>>>> restartable if something fails during restore_unshareable_info, and
>>>> the constant pool resolved references array.
>>>> Please review this new code (only method.* and instanceKlass.cpp and
>>>> constantPool.cpp have differences).  Reran tests with CDS and CMS.
>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>> This seems good. I'd prefer if someone with more experience with CDS
>>> also Reviews the change.
>> Thanks, Ioi said he'd look at it.
>>> Small comment:
>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html 
>>> -    methodHandle m(THREAD, methods->at(index2));
>>> -    m()->link_method(m, CHECK);
>>> -    // restore method's vtable by calling a virtual function
>>> -    m->restore_vtable();
>>> +    Method *m = methods->at(index2);
>>> +    m->restore_unshareable_info(CHECK);
>>> Do we really want to use a raw Method * here. I know that we wrap the
>>> method in a handle inside  m->restore_unshareable_info, but it's not
>>> clear from the this context. I'm thinking about this issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>> On the other hand, I guess we can't redefine a class before it has
>>> been defined. But maybe it would be good to not promote the usage of
>>> Method* over calls that might hit a Safepoint.
>> I think you're right, we don't use 'm' after the potential safepoint and
>> we can't really redefine this class yet anyway but I'll restore the code
>> because we don't want to encourage uses of unhandled Methods unless
>> proven to be harmless.
>> Thanks!
>> Coleen
>>> thanks,
>>> StefanK
>>>> Thanks,
>>>> Coleen
>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>> Hi Jon,
>>>>> Thank you for looking at the code changes.
>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>> Coleen,
>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>> called in that one spot and  use obj_allocate() did most
>>>>>> of the work?
>>>>> I deleted it because GC doesn't need to know how these javaClasses
>>>>> are set up and I spent way to long looking for this code and didn't
>>>>> find it where it belongs, so I removed this. This Class_obj_allocate
>>>>> is an artifact of the permgen code.
>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html 
>>>>>>  373   // the size in the mirror size itself.
>>>>>> Drop the second "size".  You know how we GC guys are
>>>>>> getting with our comments :-).
>>>>> Yes, I noticed that.  Thanks - fixed.
>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html 
>>>>>>  498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>  499   // If an exception happened during CDS restore, some of
>>>>>> these fields may already be
>>>>>>  500   // set.  We leave the class on the CLD list, even if
>>>>>> incomplete so that we don't
>>>>>>  501   // modify the CLD list outside a safepoint.
>>>>>>  502   if (class_loader_data() == NULL) {
>>>>>>  503     ClassLoaderData* loader_data =
>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>> I can see that this works with the current CDS (sharing only
>>>>>> classes loader
>>>>>> with the NULL class loader).  Will it work if the shared classes
>>>>>> are loaded
>>>>>> with an application class loader?
>>>>> I hope Ioi can answer that.  I assume so because the classes don't
>>>>> change class loaders if restoring shareable info fails.
>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>> link_method() and creating the constant pool resolved references
>>>>> array restartable, so I'm testing a fix for that also and I'll post
>>>>> an update later.   But you've seen most of it.
>>>>> Thanks!!
>>>>> Coleen
>>>>>> Rest looks good.
>>>>>> Jon
>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created for
>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>> pointers consistent for CMS.
>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>> restartable if there is an OOM in the middle of
>>>>>>> restore_unshareable_info.   The mirror field in the classes and
>>>>>>> CLD link is not cleaned up anymore.  The class isn't marked loaded
>>>>>>> though so must be loaded again.
>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>> can't add the test.   Also ran all the tests against -client and
>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>> Thanks,
>>>>>>> Coleen

More information about the hotspot-dev mailing list