RFR (9) 8185133: Reference pending list root might not get marked

Kim Barrett kim.barrett at oracle.com
Fri Jul 28 19:53:43 UTC 2017

> On Jul 28, 2017, at 1:20 PM, Erik Osterlund <erik.osterlund at oracle.com> wrote:
> Hi Roman,
>> On 28 Jul 2017, at 16:53, Roman Kennke <rkennke at redhat.com> wrote:
>> Hi Mikael,
>> I don't really understand what the problem is. The WR ends up on the
>> RPL, with its referent cleared, i.e. no longer pointing to the SR? But
>> we want to keep the SR alive?
> No. The WR gets promoted to old during the initial mark evacuation as it was strongly reachable from an SR in young. The referent of the WR died, and therefore it gets discovered. The assumption is then that since it was strongly reachable from the SR in young, the WR will be found during concurrent marking due to SATB. This is normally a safe assumption.
> However, just before finishing the initial mark pause and letting concurrent marking start tracing through the heap, soft references may change strength to suddenly become weak. Therefore, the WR in old never gets marked during concurrent marking unless the GC is made aware of the existence of this new strong edge to the pending list head.
> This is a problem, because the pending list was in this scenario exposed to Java land through the pending list head, without the concurrent marking knowing about it, violating GC completeness.

I think SR also needs to be promoted by the initial-mark pause for the
problem to occur.  If SR is young and not promoted, then it will be a
survivor of the initial-mark pause, and so will be scanned by
scan_root_regions.
scan_root_regions doesn't do reference processing, so the scan of the
survivor SR will mark WR.

Here's my understanding of the problem scenario:

(1) initial state

SR => WR => O
SR, WR, and O are young
WR and O are unreachable except through the chain from SR
SR has not expired

(2) initial_mark

SR and WR are both promoted to oldgen.
SR is not discovered, because it has not expired.
WR is discovered and enqueued, because O is unreachable.
WR ends up at the head of the pending list.  This happens after the
initial root scan has examined the head of the pending list.

(3) SR expires

We now have an oldgen WR in the pending list, and no certain path by
which concurrent marking will reach it, even though it is accessible.
(The Java reference processing thread might process and discard it
before any damage is actually done, but that's far from certain.)

So it requires a fairly unlikely sequence of events.
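
To make the sequence concrete, here is a toy model of states (2) and (3)
in Python.  It is only a sketch, not HotSpot code: the function mark, the
dictionaries, and the simplification that SR is itself a root are all
assumptions made for illustration, and SATB is not modeled.

```python
def mark(roots, strong_edges, soft_edges, soft_refs_expired):
    """Trace from roots; follow soft referents only while they count as strong."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(strong_edges.get(obj, []))
        if not soft_refs_expired:
            # An unexpired soft reference keeps its referent alive.
            stack.extend(soft_edges.get(obj, []))
    return marked


# State after the initial-mark pause: O is dead, SR and WR are in oldgen.
strong_edges = {}                    # WR's referent O died, so WR has no strong edges
soft_edges = {"SR": ["WR"]}          # SR's referent is WR
roots_seen_by_initial_scan = ["SR"]  # pending list head was examined before WR was enqueued
pending_list_head = "WR"             # WR was enqueued after that examination

# SR still counts as strong: WR is found through SR.
not_expired = mark(roots_seen_by_initial_scan, strong_edges, soft_edges, False)

# SR has expired: nothing the marker knows about leads to WR.
missed = mark(roots_seen_by_initial_scan, strong_edges, soft_edges, True)

# Treating the current pending list head as a root closes the gap.
fixed = mark(roots_seen_by_initial_scan + [pending_list_head],
             strong_edges, soft_edges, True)

print("WR" in not_expired, "WR" in missed, "WR" in fixed)
```

In this model WR is only missed when SR has expired and the pending list
head is not treated as a root.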

Note: If WR ends up anywhere other than at the head of the pending
list, it will eventually be visited, either by scan_root_regions or
normal concurrent marking, depending on its predecessor in the list.
(Assuming its predecessor is not another similar case that *did* end
up at the head of the list.)
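
The note about list position can be sketched the same way.  Again this
is a toy model in Python with hypothetical names (reachable_via_list,
next_link, and a predecessor P that marking reaches by some other path),
not a description of the actual HotSpot data structures.

```python
def reachable_via_list(marked_entry_points, next_link):
    """Follow pending-list links from every entry that marking already reached."""
    marked = set()
    for ref in marked_entry_points:
        while ref is not None and ref not in marked:
            marked.add(ref)
            ref = next_link.get(ref)
    return marked


# Case A: the list is P -> WR.  P gets visited, so its link leads to WR.
case_a = reachable_via_list(["P"], {"P": "WR", "WR": None})

# Case B: the list is WR -> P.  WR is the head, and no visited entry links to it.
case_b = reachable_via_list(["P"], {"WR": "P", "P": None})

print("WR" in case_a, "WR" in case_b)
```

Only the head position leaves WR without a predecessor through which it
could be reached.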

More information about the hotspot-gc-dev mailing list