RFR(M) JDK-8014555: G1: Memory ordering problem with Conc refinement and card marking

Mikael Gerdin mikael.gerdin at oracle.com
Tue Oct 8 01:11:50 PDT 2013


On 10/07/2013 02:29 PM, Mikael Gerdin wrote:
> All,
> I plan to push this change with the additional comment as suggested by
> Roland. Thanks for all the reviews.

I made a new webrev with the comment suggested by Roland and Martin's 
optimized invalidate().

Incremental webrev:
Full webrev:


> /Mikael
> On 10/02/2013 02:28 PM, Mikael Gerdin wrote:
>> Hi
>> Please review my fix for the issue discussed in the "G1 question:
>> concurrent cleaning of dirty cards" thread on the hotspot-gc-dev mailing
>> list.
>> I'd like someone from the compiler (and runtime? the interpreter uses
>> macroAssembler_*, right?) teams to at least look at the changes to:
>> macroAssembler_*.cpp
>> c1_Runtime_*.cpp
>> graphKit.cpp
>> Problem description:
>> G1 has a race where the concurrent refinement thread may miss object
>> references in a dirty card.
>> The problem arises if the CPU re-orders the load of the old card value
>> (which G1 checks to determine if it can skip the barrier)
>> before the store to the actual object.
>> If that occurs, the concurrent refinement thread may have set the card to
>> "clean" and proceeded to scan the card, but the Java thread may already
>> have seen the "dirty" value and skipped the post barrier.
>> Suggested fix:
>> * Add a memory fence between the store to a Java object and the read
>> of the previous card value.
>> * Modify the code for handling young regions so that all writes to young
>> regions can skip the fence (since it will never be needed for such
>> writes). This introduces a new value in the card table for G1 which
>> indicates a young region.
>> Performance impact:
>> * This fix has a negative throughput performance impact of 1-1.5%
>> (tested on x86-AMD, x86-Intel and SPARC).
>> * We may want to try to get rid of this race at some point by
>> redesigning G1's post barrier but there is not enough time to do that
>> for JDK8.
>> Performance numbers for x86 platforms can be seen here:
>> http://cr.openjdk.java.net/~mgerdin/8014555/perf.txt
>> Unfortunately, the JIRA issue is not externally visible, but the major
>> parts of the discussion about this are present in the mailing list
>> thread. The bug mostly contains my analysis of the crashes which seem
>> to have been caused by this bug.
>> Bug link: https://bugs.openjdk.java.net/browse/JDK-8014555
>> Webrev: http://cr.openjdk.java.net/~mgerdin/8014555/webrev.0
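
For readers without access to the webrev, the ordering described in the quoted message can be sketched in portable C++. This is only an illustration under stated assumptions, not the HotSpot code: ToyHeap, store_with_post_barrier, refine_card, the card encodings and the card size are all hypothetical names and values, and the matching fence on the refinement side is an assumption of the sketch rather than something the message states.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical card encodings; the real G1 values are not given in this thread.
constexpr std::uint8_t kDirty = 0;
constexpr std::uint8_t kClean = 1;
constexpr std::uint8_t kYoung = 2;

// Assumed toy geometry: 4 cards, each covering 64 reference slots.
constexpr std::size_t kSlotsPerCard = 64;
constexpr std::size_t kNumCards = 4;
constexpr std::size_t kNumSlots = kSlotsPerCard * kNumCards;

struct ToyHeap {
    std::atomic<std::uintptr_t> slots[kNumSlots];  // reference fields
    std::atomic<std::uint8_t> cards[kNumCards];    // one card per slot range

    ToyHeap() {
        for (auto& s : slots) s.store(0, std::memory_order_relaxed);
        for (auto& c : cards) c.store(kClean, std::memory_order_relaxed);
    }
};

// Mutator side: store a reference, then run the post barrier.
// Returns true if this call dirtied the card (a real barrier would also
// enqueue the card for refinement at that point).
bool store_with_post_barrier(ToyHeap& heap, std::size_t slot, std::uintptr_t ref) {
    assert(slot < kNumSlots);
    heap.slots[slot].store(ref, std::memory_order_relaxed);

    std::atomic<std::uint8_t>& card = heap.cards[slot / kSlotsPerCard];

    // Writes into young regions never need refinement, so they skip the fence.
    if (card.load(std::memory_order_relaxed) == kYoung) return false;

    // StoreLoad fence: the reference store above must be ordered before
    // the card load below.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    // Only now is it safe to skip the barrier on an already-dirty card.
    if (card.load(std::memory_order_relaxed) == kDirty) return false;

    card.store(kDirty, std::memory_order_relaxed);
    return true;
}

// Refinement side (assumed shape): clean the card, fence, then scan it.
// Returns the number of non-null references found on the card.
std::size_t refine_card(ToyHeap& heap, std::size_t card_index) {
    assert(card_index < kNumCards);
    heap.cards[card_index].store(kClean, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);

    std::size_t found = 0;
    for (std::size_t i = 0; i < kSlotsPerCard; ++i) {
        if (heap.slots[card_index * kSlotsPerCard + i].load(std::memory_order_relaxed) != 0) {
            ++found;
        }
    }
    return found;
}
```

With a sequentially consistent fence on both sides, at least one of two things holds: the mutator observes the "clean" card and re-dirties it, or the refinement scan observes the new reference. Without the mutator's fence, the card load may be satisfied before the reference store becomes visible, which is the missed-reference case described in the quoted message.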
