RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues
rkennke at redhat.com
Mon Mar 25 13:26:27 UTC 2019
This thread went a bit off. May I propose this for review:
it passes tier1 tests locally, and I submitted it to jdk/submit but that
seems to have other hiccups.
> In Shenandoah testing we discovered an initialization race: A non-Java
> GC thread (we have observed it on the StringDedupThread) may be
> initialized concurrently while Java and GC are already up and running,
> but does not (yet) participate in safepointing.
> BS::on_thread_attach() usually does propagate global GC state to
> thread-local GC state, in this case the SATB active flag.
> When doing this concurrently, while not participating in safepointing,
> this may propagate the wrong state, and subsequently lead to heap
> corruption (e.g. because we missed some SATB updates).
> This is related to JDK-8219613 because before that change,
> non-Java-threads would simply use a shared SATB queue instead.
> The bug appeared in Shenandoah testing, but I don't see why it wouldn't
> affect G1 too. It's probably not run with aggressive enough tests to
> make it happen. (We run Shenandoah in aggressive mode, which starts
> continuous GCing right at the start.)
> The problem seems specific to the StringDedupThread (for now), and so is
> the solution: in the StringDedupThread's pre_run() and post_run(), take
> the STSJoiner. This ensures that the thread doesn't accidentally cross
> safepoints while initializing or exiting, and thus lose SATB updates.
> I tried a couple of other approaches like:
> But we also need to protect the addition/removal of the thread to the
> global NonJavaThread list.
> Testing: Running the offending test (TestStringDedupStress.java) 20x in
> a row. It used to fail in roughly 1 of 5 runs; now all runs pass. Also,
> hotspot_gc_shenandoah passes. tier1 is fine too. Will push it through
> jdk-submit next.
> Can I please get reviews?
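
To make the race in the quoted description concrete, here is a minimal single-threaded sketch of the interleaving. It is not HotSpot code: ToyVM, ToyThread, sts_join(), sts_leave() and attach_is_consistent() are hypothetical names invented for this illustration, and the "safepoint" is reduced to flipping one global flag and copying it to all registered threads. The assumption being modelled is that a safepoint cannot proceed while a thread has joined the suspendible thread set, so it is deferred until that thread leaves.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a non-Java thread with a thread-local SATB flag.
struct ToyThread {
  bool satb_active = false;
};

// Hypothetical toy VM; models only what is needed to show the race.
struct ToyVM {
  bool global_satb_active = false;
  std::vector<ToyThread*> threads;   // stand-in for the global NonJavaThread list
  int sts_joined = 0;                // threads currently in the suspendible set
  bool safepoint_pending = false;

  // What the safepoint operation does: flip the global flag and
  // propagate it to every thread that is already registered.
  void run_safepoint_flip() {
    global_satb_active = !global_satb_active;
    for (ToyThread* t : threads) t->satb_active = global_satb_active;
  }

  // A safepoint request. If some thread has joined the set, the
  // safepoint has to wait; here that is modelled by deferring it.
  void request_safepoint_flip() {
    if (sts_joined > 0) { safepoint_pending = true; return; }
    run_safepoint_flip();
  }

  void sts_join() { ++sts_joined; }

  void sts_leave() {
    assert(sts_joined > 0);
    if (--sts_joined == 0 && safepoint_pending) {
      safepoint_pending = false;
      run_safepoint_flip();
    }
  }

  // Thread attach with a safepoint request arriving in the worst place:
  // after the global flag was read, before the thread is registered.
  void attach(ToyThread* t, bool use_joiner) {
    if (use_joiner) sts_join();
    bool snapshot = global_satb_active;  // propagate global -> local (read)
    request_safepoint_flip();            // GC wants to change state right now
    t->satb_active = snapshot;           // propagate global -> local (write)
    threads.push_back(t);                // add to the global thread list
    if (use_joiner) sts_leave();
  }
};

// Returns true if the new thread ends up agreeing with the global flag.
bool attach_is_consistent(bool use_joiner) {
  ToyVM vm;
  ToyThread t;
  vm.attach(&t, use_joiner);
  return t.satb_active == vm.global_satb_active;
}
```

In this model, attach_is_consistent(false) returns false: the flip happens while the thread is not yet in the list, so it keeps a stale flag. attach_is_consistent(true) returns true, because the deferred flip runs only after the thread has been registered. This is also why, in the sketch, the list insertion sits inside the joined region rather than after it.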