RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues
rkennke at redhat.com
Fri Mar 15 13:39:46 UTC 2019
In Shenandoah testing we discovered an initialization race: a non-Java
GC thread (we have observed it on the StringDedupThread) may be
initialized concurrently while Java and GC are already up and running,
but does not (yet) participate in safepointing.
BS::on_thread_attach() normally propagates global GC state to
thread-local GC state, in this case the SATB active flag.
When this happens concurrently, while the thread is not participating in
safepointing, it may propagate the wrong state and subsequently lead to
heap corruption (e.g. because we missed some SATB updates).
This is related to JDK-8219613 because before that change,
non-Java-threads would simply use a shared SATB queue instead.
The bug appeared in Shenandoah testing, but I don't see why it wouldn't
affect G1 too. G1 is probably not run with aggressive enough tests to
make it happen. (We run Shenandoah in aggressive mode, which starts
continuous GCing right at the start.)
The problem seems specific to the StringDedupThread (for now), and so is
the solution: in the StringDedupThread's pre_run() and post_run(), take
the STSJoiner. This ensures that the thread doesn't accidentally cross
safepoints while initializing or exiting, and thus lose SATB updates.
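To illustrate the idea, here is a minimal, self-contained C++ sketch. It
is not HotSpot code: ToySuspendibleSet, ToyJoiner, ToyThread and ToyVM are
hypothetical stand-ins for the suspendible thread set, the STSJoiner, a
non-Java thread and the global GC state. It also simplifies the protocol:
the toy safepoint waits until joined threads have left, whereas in HotSpot
joined threads can yield to a safepoint instead. The point it shows is only
that attaching under a joiner cannot overlap a safepoint that flips the
SATB active flag.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <vector>

// Toy stand-in for the suspendible thread set (hypothetical, simplified):
// a safepoint starts only while no thread is joined, and a thread can
// join only while no safepoint is in progress.
class ToySuspendibleSet {
  std::mutex _lock;
  std::condition_variable _cv;
  int _joined = 0;
  bool _safepoint = false;
public:
  void join() {
    std::unique_lock<std::mutex> l(_lock);
    _cv.wait(l, [this] { return !_safepoint; });
    ++_joined;
  }
  void leave() {
    std::lock_guard<std::mutex> l(_lock);
    --_joined;
    _cv.notify_all();
  }
  void begin_safepoint() {
    std::unique_lock<std::mutex> l(_lock);
    _cv.wait(l, [this] { return _joined == 0 && !_safepoint; });
    _safepoint = true;
  }
  void end_safepoint() {
    std::lock_guard<std::mutex> l(_lock);
    _safepoint = false;
    _cv.notify_all();
  }
};

// RAII helper in the spirit of the STSJoiner (hypothetical name).
class ToyJoiner {
  ToySuspendibleSet& _set;
public:
  explicit ToyJoiner(ToySuspendibleSet& s) : _set(s) { _set.join(); }
  ~ToyJoiner() { _set.leave(); }
};

// Thread-local GC state: here only the SATB queue's active flag.
struct ToyThread { bool satb_queue_active = false; };

struct ToyVM {
  ToySuspendibleSet sts;
  std::mutex list_lock;             // guards 'threads' among joined threads
  bool satb_active = false;         // global state, changed only in safepoints
  std::vector<ToyThread*> threads;  // stand-in for the NonJavaThread list

  // Safepoint operation: flip the global flag and update registered threads.
  void set_satb_active(bool active) {
    sts.begin_safepoint();
    satb_active = active;
    for (ToyThread* t : threads) t->satb_queue_active = active;
    sts.end_safepoint();
  }

  // Roughly what pre_run() does under the proposed fix: register the thread
  // and copy the global flag while joined, so no safepoint can intervene.
  void attach(ToyThread* t) {
    ToyJoiner joiner(sts);
    std::lock_guard<std::mutex> l(list_lock);
    threads.push_back(t);
    t->satb_queue_active = satb_active;
  }

  // Roughly what post_run() does: unregister while joined.
  void detach(ToyThread* t) {
    ToyJoiner joiner(sts);
    std::lock_guard<std::mutex> l(list_lock);
    threads.erase(std::remove(threads.begin(), threads.end(), t),
                  threads.end());
  }
};
```

Without the joiner in attach(), a safepoint could run between the list
insertion and the flag copy (or the other way around), leaving the new
thread with a stale flag. That is the kind of window the race above
describes.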
I tried a couple of other approaches like:
But we also need to protect the addition/removal of the thread to the
global NonJavaThread list.
Testing: Ran the offending test (TestStringDedupStress.java) 20x in
a row. It used to fail in ~1 of 5 runs before; now all runs pass. Also,
hotspot_gc_shenandoah passes. tier1 is fine too. Will push it through
Can I please get reviews?