RFR(S): 8074553: Crash with assert(!is_unloaded()) failed: should not call follow on unloaded nmethod
mikael.gerdin at oracle.com
Thu Mar 31 16:13:01 UTC 2016
On 2016-03-31 17:49, Vladimir Kozlov wrote:
> On 3/31/16 6:22 AM, Tobias Hartmann wrote:
>> please review the following patch:
>> While the code cache sweeper processes a nmethod in
>> NMethodSweeper::process_nmethod(), safepoints may happen and the GC
>> may unload the currently processed nmethod. To prevent this, the
>> sweeper uses a NMethodMarker which saves the nmethod in
>> CodeCacheSweeperThread::_scanned_nmethod. The nmethod is then passed
>> to the GC through a CodeBlobClosure in
>> CodeCacheSweeperThread::oops_do() to keep it alive when the GC
>> iterates over all threads.
>> The problem is that G1 calls nmethods_do() on all threads in the
>> remark phase (see G1RemarkThreadsClosure::do_thread()) which is not
>> overwritten by the sweeper thread. Since the currently processed
>> nmethod is not passed through nmethods_do() by any thread, it is
>> unloaded and we later hit the assert when encountering the nmethod
>> through oops_do().
>> Mikael Gerdin and Stefan Karlsson (thanks again!) suggested to
>> overwrite nmethods_do() as well in CodeCacheSweeperThread and pass
>> _scanned_nmethod to the closure. I also modified
>> Threads::nmethods_do() to ignore the sweeper thread because we want to
>> avoid marking the _scanned_nmethod as seen on the stack when scanning
>> stacks from the sweeper (the nmethod may already be zombie and only
>> referenced by the sweeper).
> I did not get this. If you exclude CodeCacheSweeperThread in
> Threads::nmethods_do() then CodeCacheSweeperThread::nmethods_do() will
> not be called. What is the point?
The GC code in question iterates over the threads and calls nmethods_do
on each JavaThread and the VMThread after claiming them with atomic
operations to achieve parallelism.
There is still something a bit fishy here though.
Thread::nmethods_do is not virtual, so one must be careful to downcast
one's Thread* to JavaThread before calling it. And since Tobias' change
does not make that or JavaThread::nmethods_do (which actually shadows
and does not override the methods) virtual we can't reach the new code
unless the GC code is changed to downcast to CodeCacheSweeperThread
before calling nmethods_do.
I still believe that what Tobias is attempting to do is a necessary fix
but this only shows how hard this is to reproduce.
Perhaps Thread::nmethods_do should simply be removed (along with any
calls to it) and JavaThread::nmethods_do should then be made virtual.
>> Unfortunately, this bug is extremely hard to reproduce (it showed up
>> 18 times since early 2015). I was able to reproduce it only once after
>> thousands of runs and therefore not often enough to verify the fix.
>> However, I'm very confident that this solves the problem.
>> Tested with JPRT and RBT (running).
More information about the hotspot-dev