RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues

Roman Kennke rkennke at redhat.com
Sat Mar 16 21:18:50 UTC 2019

>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8220671
>> Webrev:
>> http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.03/
>> The problem seems specific to the StringDedupThread (for now), and so is
>> the solution: in the StringDedupThread's pre_run() and post_run(), take
>> the STSJoiner. This ensures that the thread doesn't accidentally cross
>> safepoints while initializing or exiting, and thus lose SATB updates.
>> I tried a couple of other approaches like:
>> http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.02/
>> But we also need to protect the addition/removal of the thread to the
>> global NonJavaThread list.
>> Testing: Running the offending test (TestStringDedupStress.java) 20x in
>> a row. It used to fail in ~1 of 5 runs before. Now all runs pass. Also,
>> hotspot_gc_shenandoah passes. tier1 is fine too. Will push it through
>> jdk-submit next.
>> Can I please get reviews?
>> Thanks,
>> Roman
> I think the proposed fix for _just_ StringDedupThread is insufficient.
> I think there are other threads that could have similar races between
> on_thread_attach and being added to the NJT thread list. And it
> doesn't matter whether those threads touch oops or not. Even if they
> don't, the state verification done when the SATB transition is made
> can fail. (See SATBMarkQueueSet::verify_active_states.)
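To make the window concrete, here is a minimal single-threaded toy model of the interleaving described above. It is only a sketch, not HotSpot code: ToyQueue, ToyQueueSet and both functions are hypothetical names, and the "safepoint" is simply a call placed between the attach step and the add-to-list step.

```cpp
#include <cassert>
#include <vector>

// Toy model only; none of these types are the real HotSpot classes.
struct ToyQueue {
    bool active = false;
};

struct ToyQueueSet {
    bool all_active = false;             // stands in for the global SATB state
    std::vector<ToyQueue*> thread_list;  // stands in for the NJT list

    // Stand-in for on_thread_attach: copy the global state into the queue.
    void attach(ToyQueue& q) const { q.active = all_active; }

    // Stand-in for adding the thread to the global list.
    void add_to_list(ToyQueue& q) { thread_list.push_back(&q); }

    // Stand-in for the SATB transition: flips the global state and every
    // queue that is reachable through the list at that moment.
    void set_active_all(bool active) {
        all_active = active;
        for (ToyQueue* q : thread_list) q->active = active;
    }
};

// Transition slips in between attach and add-to-list.
// Returns true if the queue state agrees with the global state afterwards.
bool racy_interleaving() {
    ToyQueueSet set;
    ToyQueue q;
    set.attach(q);             // queue copies "inactive"
    set.set_active_all(true);  // transition runs; q is not on the list yet
    set.add_to_list(q);        // too late, q was never updated
    return q.active == set.all_active;
}

// Attach and add-to-list complete before the transition runs.
bool atomic_interleaving() {
    ToyQueueSet set;
    ToyQueue q;
    set.attach(q);
    set.add_to_list(q);
    set.set_active_all(true);  // q is on the list and gets updated
    return q.active == set.all_active;
}
```

In this model racy_interleaving() leaves the queue disagreeing with the global state, which is the kind of mismatch a state verification would trip over regardless of whether the thread ever touches oops.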

Yeah I agree that this warrants a more generic solution. I am thinking 
about it. It is a tricky little bugger...

> It might be sufficient to add a STS joiner around the on_thread_attach
> / add_to_the_list sequence in NJT::pre_run, with the joiner only
> active if invoked when not at a safepoint. That way we can still
> create threads during a safepoint that are needed in that safepoint
> (e.g. lazy allocation of worker threads). There are some possibly
> complicated cases to think about though, or maybe to disallow somehow.
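The shape of that suggestion can be sketched as a scoped joiner that only joins when a flag says so. This is a toy under stated assumptions: ToySTS and ToyJoiner are hypothetical stand-ins that merely count joiners and never block, and how the at_safepoint flag would be computed safely is deliberately left open.

```cpp
#include <cassert>

// Toy stand-in for the suspendible thread set; it only counts joiners.
struct ToySTS {
    int joined = 0;
    void join()  { ++joined; }
    void leave() { --joined; }
};

// Scoped joiner that joins on construction and leaves on destruction,
// but only when 'active' is true.
class ToyJoiner {
    ToySTS& _sts;
    bool _active;
public:
    ToyJoiner(ToySTS& sts, bool active) : _sts(sts), _active(active) {
        if (_active) _sts.join();
    }
    ~ToyJoiner() {
        if (_active) _sts.leave();
    }
    ToyJoiner(const ToyJoiner&) = delete;
    ToyJoiner& operator=(const ToyJoiner&) = delete;
};

// Hypothetical pre_run shape: returns how many joiners the set sees
// while the attach / add-to-list sequence would be running.
int joined_during_pre_run(ToySTS& sts, bool at_safepoint) {
    ToyJoiner joiner(sts, !at_safepoint);  // inactive when already at a safepoint
    // on_thread_attach and the add-to-list step would go here.
    return sts.joined;
}
```

With at_safepoint == false the sequence runs as a joined thread; with at_safepoint == true the joiner is a no-op, so a thread created inside the safepoint is not held back by it.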

Checking for being at safepoint itself is racy if the thread is not 
participating in the safepointing protocol. On the other hand it might 
not be bad per se to block when at safepoint. As long as it doesn't hold 
back other threads that would not leave the safepoint as a result...

I'll look a bit deeper into this and come back... on Monday ;-)

