RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues

Roman Kennke rkennke at redhat.com
Wed Mar 27 20:27:13 UTC 2019


>>>>> I've tested it against my stress test and Shenandoah test suite, and all seems fine.
>>>>> Is the extra locking around assignment of global state needed? I think the locking in threads_do() is sufficient?
>>>>
>>>> The locking over "_all_active = active;" looks sane to me. This patch has no locking in threads_do(), or am I missing something?
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>> Roman
>>> Right, there's no locking in (non_java_)threads_do, by design.
>>
>> I don't see how this is supposed to work. There's code that modifies the list, and there's code that iterates the list, and in the absense of other synchronization (e.g. participation in safepoints), they need to be protected by the same locks, no? What am I missing?
> 
> You are missing the use of SingleWriterSynchronizer, which is another
> RCU-inspired synchronization mechanism, similar to GlobalCounter. It is used
> in the implementation of NonJavaThread iteration because NJT iteration is
> used in the implementation of GlobalCounter, in such a way that the
> iteration can't itself use GlobalCounter or it would self-deadlock.


Aha, ok then. Thanks for the explanation.

>>> But looking at this again, the expanded exposure of the
>>> NonJavaThreadsList_lock is making me regret that it's also being used
>>> to protect the synchronize() in remove_from_the_list. If an iterator
>>> were to try to lock that mutex while some NonJavaThread was on its way
>>> out, we'd deadlock.
>>
>> Why?
> 
> NJT iteration is a synchronizer critical section. remove_from_the_list
> synchronizes on iteration critical sections while holding the NJTList_lock.
> So if an iteration is in progress when remove_from_the_list is called, the
> synchronization therein will block until the iteration completes. But if the
> iteration then attempts to lock the NJTList_lock, we have deadlock.

Ok.

>>> The one new use of the mutex being added by this
>>> change doesn't introduce such a problem, but still... I think it's
>>> easy to avoid that problem by adding a new mutex just to protect the
>>> synchronize call.
>>
>> Uh, no? Wouldn't that be missing the point above? We can't just 'avoid' the problem by protecting something by two mutexes that ought to be under one mutex.
>>
>> Or what am I missing?
> 
> Moving that synchronization outside the NJTList_lock solves that deadlock
> problem. But the synchronization itself still needs protection; this
> synchronizer only supports one synchronization at a time (one of its
> disadvantages compared to GlobalCounter, and the reason for "SingleWriter"
> in the name). So we need a new lock, specifically for protecting those
> synchronization calls.

Alright, ok then. Let's see that patch then. :-)

Thanks, Roman


More information about the hotspot-gc-dev mailing list