RFR(S): 8074553: Crash with assert(!is_unloaded()) failed: should not call follow on unloaded nmethod
tobias.hartmann at oracle.com
Thu Mar 31 13:22:43 UTC 2016
please review the following patch:
While the code cache sweeper processes a nmethod in NMethodSweeper::process_nmethod(), safepoints may happen and the GC may unload the currently processed nmethod. To prevent this, the sweeper uses a NMethodMarker which saves the nmethod in CodeCacheSweeperThread::_scanned_nmethod. The nmethod is then passed to the GC through a CodeBlobClosure in CodeCacheSweeperThread::oops_do() to keep it alive when the GC iterates over all threads.
The problem is that G1 calls nmethods_do() on all threads in the remark phase (see G1RemarkThreadsClosure::do_thread()) which is not overwritten by the sweeper thread. Since the currently processed nmethod is not passed through nmethods_do() by any thread, it is unloaded and we later hit the assert when encountering the nmethod through oops_do().
Mikael Gerdin and Stefan Karlsson (thanks again!) suggested to overwrite nmethods_do() as well in CodeCacheSweeperThread and pass _scanned_nmethod to the closure. I also modified Threads::nmethods_do() to ignore the sweeper thread because we want to avoid marking the _scanned_nmethod as seen on the stack when scanning stacks from the sweeper (the nmethod may already be zombie and only referenced by the sweeper).
Unfortunately, this bug is extremely hard to reproduce (it showed up 18 times since early 2015). I was able to reproduce it only once after thousands of runs and therefore not often enough to verify the fix. However, I'm very confident that this solves the problem.
Tested with JPRT and RBT (running).
More information about the hotspot-dev