RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues
kim.barrett at oracle.com
Wed Mar 20 18:44:12 UTC 2019
> On Mar 20, 2019, at 4:09 AM, Roman Kennke <rkennke at redhat.com> wrote:
> Am 20.03.19 um 09:06 schrieb Roman Kennke:
>>> The test started failing somewhere between jdk-13+9 and jdk-13+11, and I
>>> bisected it down to NJT PtrQueues change. It also seemed like the most
>>> likely candidate in that frame. It only ever seems to crash with
>>> +UseStringDeduplication, and since the strdedup thread does SATB, it seems
>>> plausible that the change affects this.
I agree that change is the likely culprit, in one way or another.
>>> Any help would be greatly appreciated.
>> I have added asserts that verify that, after final flushing of thread-local SATB queues, *all* threads' SATB queues are empty. They do not trigger, and yet I see crashes.
>> This tells me that it is failing to enqueue some oops to begin with. Our ShBS::enqueue() not only checks the thread-local SATB-active flag, but also the global one. Do you think there might be a race accessing this? I.e. NJT possibly seeing a stale value because it does not synchronize on the same stuff as Java threads do when safepointing?
> E.g., PtrQueueSet::_all_active is not volatile and is not accessed using any OrderAccess either... ?
PQS::_all_active isn’t volatile because there aren’t supposed to be concurrent readers when it’s written.
The initialization race for NJTs is a counter-example, and it is a bug; that is what we’re discussing a fix for here.
Why does ShBS::enqueue look at the global SATB-active flag? That seems like a mistake. Though
I wouldn’t expect there to be any threads calling enqueue() while the global SATB state is being changed.
Doing so also seems like a mistake.
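
To make the race on PQS::_all_active concrete, here is a minimal standalone sketch of one way such an initialization race can be closed. It is not HotSpot code and not necessarily the fix under review: ToyQueue, ToyQueueSet, attach() and set_active_all() are hypothetical names invented for illustration. The assumption is that a newly created thread copies the global flag and publishes its queue inside the same critical section that the activator uses to flip the flag, so the two can never miss each other.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a per-thread SATB queue; not HotSpot's PtrQueue.
struct ToyQueue {
  // Written by the activator thread, read by the owning thread.
  std::atomic<bool> active{false};
  // Only ever touched by the owning thread.
  std::vector<const void*> buffer;

  // Record a value only while marking is active for this queue.
  void enqueue(const void* p) {
    if (active.load(std::memory_order_acquire)) {
      buffer.push_back(p);
    }
  }
};

// Hypothetical stand-in for the queue set that owns the global flag.
class ToyQueueSet {
  std::mutex _lock;               // guards _all_active and _queues
  bool _all_active = false;       // plain bool: every access is under _lock
  std::vector<ToyQueue*> _queues;

public:
  // Called by a newly created thread. Copying the flag and publishing the
  // queue happen in one critical section, so a concurrent set_active_all()
  // either already set the flag we copy, or will find the queue in _queues.
  void attach(ToyQueue* q) {
    std::lock_guard<std::mutex> guard(_lock);
    q->active.store(_all_active, std::memory_order_release);
    _queues.push_back(q);
  }

  // Called when marking starts or stops.
  void set_active_all(bool active) {
    std::lock_guard<std::mutex> guard(_lock);
    _all_active = active;
    for (ToyQueue* q : _queues) {
      q->active.store(active, std::memory_order_release);
    }
  }
};
```

Without the shared lock, a thread could read _all_active as false, then have the activator flip the flag and walk the list before the new queue is on it, leaving that queue inactive for the whole marking cycle. Note that with this arrangement the queue's own flag is sufficient, and enqueue() has no need to consult the global one.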