Y. S. Ramakrishna
y.s.ramakrishna at oracle.com
Wed Mar 16 11:01:50 PDT 2011
Hi Tom --
A form of the strong-scanning problem for the nmethod scanning
code existed for a while, long preceding ScavengeRootsInCode, and
was fixed a while ago so that the code-cache scanning did not
inadvertently turn code-oops into strong refs. I can see that
a parameter do_strong_roots_only now distinguishes the two types of scans:
it is true for the scan of active nmethods on thread stacks
and false for the code-cache scan of all nmethods, where the
refs are treated weakly. I think that the scan of the list of
scavengeable nmethods should also be done weakly. I believe that would also take
care of the synchronization issue that you ran into below,
since there is in fact a synchronization barrier between the
strong roots scan and the weak scan, the former being semantically
required to strictly precede the latter. It would seem as though one
should do the oop relocations during the second, weak scan?
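
To make the ordering concrete, here is a toy, single-threaded replay of the
interleaving described in the quoted message below, next to the ordering in
which the relocation fix-up is deferred until after the barrier. This is only
a sketch, not HotSpot code: ToyNMethod, try_claim, do_oops and fix_relocations
are hypothetical stand-ins, and it assumes the list scanner patches the code
whether or not it won the claim.

```cpp
#include <cassert>

// Hypothetical stand-in for an nmethod: one oop slot plus the copy of that
// oop embedded in the generated code. Not HotSpot's real layout.
struct ToyNMethod {
    bool claimed = false;
    int oop = 100;       // slot visited by the oops_do-style scan
    int code_oop = 100;  // value baked into the instructions
};

// Claim step: returns true only for the first scanner to reach the nmethod.
inline bool try_claim(ToyNMethod& nm) {
    if (nm.claimed) return false;
    nm.claimed = true;
    return true;
}

// Models the scan updating the oop slot after the object has moved.
inline void do_oops(ToyNMethod& nm, int new_oop) { nm.oop = new_oop; }

// Models patching the generated code to match the current oop slot.
inline void fix_relocations(ToyNMethod& nm) { nm.code_oop = nm.oop; }

// Replay of the bad interleaving: the stack scanner claims first, but the
// list scanner patches the code before the stack scanner has scanned.
inline ToyNMethod racy_order(int new_oop) {
    ToyNMethod nm;
    bool stack_won = try_claim(nm);           // stack scanner claims
    if (try_claim(nm)) do_oops(nm, new_oop);  // list scanner loses the claim
    fix_relocations(nm);                      // ...and patches with the old oop
    if (stack_won) do_oops(nm, new_oop);      // stack scanner scans too late
    return nm;
}

// Same steps, but the fix-up runs only after all scanning has finished.
inline ToyNMethod deferred_order(int new_oop) {
    ToyNMethod nm;
    bool stack_won = try_claim(nm);
    if (try_claim(nm)) do_oops(nm, new_oop);
    if (stack_won) do_oops(nm, new_oop);
    // (barrier between the strong scan and the later phase would sit here)
    fix_relocations(nm);
    return nm;
}
```

In the racy replay the oop slot is updated while the embedded copy keeps the
old value; in the deferred replay the two agree.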
I don't know the answers to your other questions, but I am guessing
John Rose would.
On 03/16/11 10:35, Tom Rodriguez wrote:
> I was getting ready to finish my static fields in Class changes when I hit a failure with jbb and CMS. I've tracked it down to a race between the machinery for updating oop relocations and the logic for making sure that a scavengeable nmethod is only scanned once. During a scavenge an nmethod can be reached for scanning in two different ways: either as a live activation on some thread stack or during the scan of scavengeable nmethods. The scan of scavengeable nmethods does two things, though. It does the oops_do for the nmethod and then it calls fix_oop_relocations to update the generated code to match the new oop values. The problem is that the scan of the thread stacks and the scan of the scavengeable nmethods are performed concurrently, so the stack scanning thread might claim the nmethod first but actually scan the nmethod after the call to fix_oop_relocations in the other thread, leaving the oops valid but the code stale.
> I think the logical place to move the fix_oop_relocations call is into nmethod::oops_do_marking_epilogue. Does this seem reasonable to anyone who understands the new nmethod scavenge code better than I do? It seems to work fine.
> Actually, one thing I noticed is that the nmethod::oops_do_marking_prologue/epilogue logic is being called during full GCs, which seems somewhat pointless to me since it mostly creates redundant work. Actually, if it's really scanning the scavengeable nmethods there then it's turning them into strong roots, which is wrong. Only nmethods which are live on stack should be scanned as strong roots.
> Does anyone know why the test_set_oops_do_mark builds yet another linked list instead of just having a flag on the nmethod to indicate that it's claimed? It seems overly complicated. The contents of the list should be the same as the scavenge roots list and a simple flag would indicate whether it was marked or not.
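
For reference, the flag-based claim suggested in the last quoted paragraph
could look roughly like the sketch below. It is a sketch under assumptions,
not the HotSpot implementation: FlaggedNMethod, test_set_mark,
scan_if_unclaimed and marking_epilogue are hypothetical names, and it assumes
every nmethod that can be claimed is reachable from the list walked in the
epilogue so that its flag gets cleared. It does not answer why the existing
code keeps a separate linked list.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical nmethod stand-in with a per-nmethod claim flag.
struct FlaggedNMethod {
    std::atomic<bool> marked{false};
    int scans = 0;  // counts how many times the oops were actually visited
};

// Atomically set the flag; returns true if it was already set, i.e. some
// other scanner has claimed this nmethod during the current GC.
inline bool test_set_mark(FlaggedNMethod& nm) {
    return nm.marked.exchange(true, std::memory_order_acq_rel);
}

// Scan the nmethod only if this caller won the claim.
inline void scan_if_unclaimed(FlaggedNMethod& nm) {
    if (!test_set_mark(nm)) {
        ++nm.scans;  // stands in for the oops_do-style visit
    }
}

// Epilogue: clear every flag so that the next GC can claim again.
inline void marking_epilogue(std::vector<FlaggedNMethod>& roots) {
    for (auto& nm : roots) {
        nm.marked.store(false, std::memory_order_release);
    }
}
```

Two scanners reaching the same nmethod would both call scan_if_unclaimed, and
only the first would visit the oops; marking_epilogue resets the flags once
scanning is complete.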