RFR: 8031703 - Missing post-barrier in ReferenceProcessor
tprintezis at twitter.com
Mon Feb 3 19:20:21 PST 2014
(apologies for not replying earlier; we had a small firedrill at the end
of last week)
I went over the code one more time and I think your assessment is
correct. But, could I recommend that you at least add a comment to that
effect where the DiscoveredListIterator iter(...) is declared? Also,
maybe, rename the discovered_list_needs_barrier parameter as
..._needs_post_barrier to make its role a bit more explicit?
On 1/30/14, 9:19 AM, Per Liden wrote:
> Hi Tony,
> On 01/29/2014 11:06 PM, Tony Printezis wrote:
>> Good catch. Quick question: Is there a reason why pre-barriers are not
>> needed here?
> Thanks for looking at it.
> My reason for not needing a pre-barrier there goes like this:
> The Reference was inserted into the discovered list (ref_list) because
> it was discovered during evacuation or marking and it's thus strongly
> reachable from some other place and is marked (or will be marked)
> live. If we later remove the Reference from the discovered list we
> don't need to enqueue it for SATB because we know it has already been
> found/marked, that's how it ended up in the discovered list in the
> first place.
> Makes sense?
> Also note that if we had a pre-barrier there it would cause
> verify_no_cset_oops() to fail if we get an young collection during
> concurrent mark.
>> On 1/28/14, 4:57 AM, Per Liden wrote:
>>> Summary: In ReferenceProcessor::process_discovered_references() the
>>> lists of discovered references are pruned (a reference who's referent
>>> is still alive is unlinked, etc). During this pruning the
>>> Reference.discovered field is rewritten without a post-barrier (in
>>> DiscoveredListIterator::remove()), which means we fail to dirty the
>>> appropriate card. This only affect G1's CM reference processor, which
>>> is the only case where the reference processor needs post-barriers for
>>> the discovered list (i.e. _discovered_list_needs_barrier is true). The
>>> window for this to happen was widened after fixing JDK-8029255. Before
>>> that fix the appropriate card was usually dirtied by pure luck because
>>> of a write (in a different context) to the Reference.next field in the
>>> same object. However, if the object was unlucky and the
>>> next/discovered fields spanned two different cards the problem would
>>> have appeared. The patch also corrects the post-barrier in
>>> ReferenceProcessor::set_discovered(), which used an incorrect field
>>> offset or 0 instead of discovered_offset.
>>> The problem could be easily reproduced by running Kitchensink on a
>>> 32-bit x86 JVM with G1. The bug is however not 32-bit related, it just
>>> happened to expose the problem more frequently.
>>> Kitchensink 32-bit x86 JVM with G1 (6 hours)
>>> Kitchensink 64-bit x86 JVM with G1 (30 minutes)
>>> Kitchensink 32-bit x86 JVM with CMS (30 minutes)
Tony Printezis | JVM/GC Engineer / VM Team | Twitter
tprintezis at twitter.com
More information about the hotspot-gc-dev