RFR: 8213486: SIGSEGV in CompiledMethod::cleanup_inline_caches_impl with AOT
erik.osterlund at oracle.com
Wed Nov 21 14:26:04 UTC 2018
Thank you for the explanation.
I inserted some debugging code to be able to draw better conclusions.
I put in a bool that tags CompiledMethods that are visited during code
When I crash, I find the AOTCompiledMethod in the following state:
* The CodeCache::_unloading_cycle is 2, meaning there has been at least
one unloading that occurred since the VM started.
* My new boolean saying if it was found during code cache walking said
false; it has never been observed by GC code cache unloading.
* The _is_unloading_state of the AOT method is 2, meaning it was from
CodeCache::_unloading_cycle number 1. So there has been at least one
code cache unloading cycle since the AOT method was created.
* It is in the CompiledMethod state in_use.
* The corresponding CodeToAMethod entry has _state equal to in_use, and
the _aot pointer points right back to the AOT method as expected.
So the conclusions I can draw from this is that we are looking at an
AOTCompiledMethod that is in_use, alive, and published, but was created
at least one unloading cycle earlier, yet was *not* found by the evident
code cache unloading that has occurred between creating the AOT method
and the crash. In fact it has not been found in any code cache unloading
event at all.
So I am certain about the problem being that the AOT method is not found
during our code cache walk, and that the solution to that problem is to
move the epoch business to nmethod instead. However, it would be
desirable to know exactly why the AOT method isn't showing up in the
code cache walk, but I'm afraid I can't see right now exactly why it has
been missed out, only that it definitely has been missed out. :/
On 2018-11-20 19:08, Vladimir Kozlov wrote:
> To clarify.
> 'invalid' state is set only for AOT methods which never was registers
> (their state is 'not_set' and corresponding class was not initialized
> yet). Such methods are never visible to iterators and IC caches.
> If aot method is registered its state become 'in_use' and in such case
> it is treated and processed as normal nmethod. And AOTCompiledMethod
> structure is created only for such case.
> On 11/20/18 3:12 AM, Erik Österlund wrote:
>> Hi Dean,
>> Yeah I think I misunderstood what I observed. So what I know is that
>> in the code cache unloading, I'm not getting all is_alive() AOT
>> methods into my iterator, which messes up the assumptions made by the
>> epoch based scheme for AOT methods.
>> I noticed that in AOTCodeHeap::sweep_dependent_methods(int* indexes,
>> int methods_cnt) we make AOTCompiledMethods "invalid" in the AOT
>> heap, making it no longer observable from the iterators. Then it
>> calls the VM_Deoptimize vm operation after. Throughout all this, the
>> AOTCompiledMethod is alive(), yet when the iterators ask for all
>> is_alive() AOTCompiledMethods, it won't be visible. But I suppose IC
>> caches may still reference these methods and check if it
>> is_unloading, and then we blow up. There may possibly be multiple
>> ways for is_alive() AOTCompiledMethods to not be visible from
>> iterators yet be visible through IC caches using the "invalid" state
>> in the .
>> Anyway, the fix is the same: stop doing the epoch state thingey for
>> is_unloading() on AOTCompiledMethod where it isn't needed, and
>> doesn't seem to play well with the rather different life cycle it
>> has, and just return false instead.
>> On 2018-11-20 00:00, dean.long at oracle.com wrote:
>>> Hi Erik,
>>> On 11/19/18 12:42 PM, Erik Österlund wrote:
>>>> ...except it looks like for AOTCompiledMethods when running with
>>>> tiered compilation, may first be is_alive(), then become
>>>> !is_alive() for a while, and then get resurrected to is_alive()
>>>> using make_entrant().
>>> this doesn't sounds quite right. AOTCompiledMethods aren't allowed
>>> to transition to zombie (!alive), only not_used or not_entrant,
>>> which should still have is_alive() returning true. Maybe some code
>>> is using is_not_entrant() instead of !is_alive()?
More information about the hotspot-dev