RFR(M/L): 6484982: G1: process references during evacuation pauses

John Cuthbertson john.cuthbertson at oracle.com
Mon Sep 12 23:42:21 UTC 2011

Hi Bengt,

Thanks for the review. Please see the replies inline....

On 09/12/11 06:22, Bengt Rutisson wrote:
> Hi John,
> Stefan and I have been looking at this webrev today. Overall it looks 
> good! This will be a great improvement for G1.
> We have not quite gotten through all the code yet, but I hope that 
> we've covered most of the important changes.
> Here are some comments:
> referenceProcessor.cpp
> - Not strictly related to your change, but it would be nice to change 
> the name of _discoveredSoftRefs to _discoveredRefs and introduce a 
> separate _discoveredSoftRefs. It is quite confusing that 
> _discoveredSoftRefs is being used for all reference types.

I like this suggestion but I would rather do the renaming as a separate 
RFE because the change is already ~1600 lines. We also would need to 
check every reference to _discoveredSoftRefs and determine if we mean 
the soft ref lists only or the handle to the entire array list. Most are 
obvious though.

> - Why does iter.make_active() need the pre-barrier? We do 
> iter.make_referent_alive() just after. That will mark and push the 
> referent object, right?

make_active() operates on the reference object - not the referent. 
Reference objects are linked through their _next field and I think we 
may have to keep the next pending reference object alive. The only way 
we may have to do this is through the SATB barrier.

> - ReferenceProcessor::balance_queues() got convoluted by the extra 
> levels of if statements. A suggestion is to make a  set_discovered 
> method like this:
>   set_discovered(oop ref, oop value) {
>     if (_discovered_list_needs_barrier) {
>       java_lang_ref_Reference::set_discovered(ref, value);
>     } else {
>       java_lang_ref_Reference::set_discovered_raw(ref, value);
>     }
>   }
>   and then us it as:
>   if (ref_lists[to_idx].head() == NULL) {
>     // to list is empty. Make a loop at the end.
>     set_discovered(move_tail, move_tail);
>   } else {
>     set_discovered(move_tail, ref_lists[to_idx].head());
>   }

Good suggestion. Done.

> g1CollectedHeap.cpp
> - What does this comment mean? 
> instanceRefKlass::process_discovered_references does not exist.
>     // We want to discover references, but not process them yet.
>     // This mode is disabled in
>     // instanceRefKlass::process_discovered_references if the
>     // generation does some collection work, or
>     // instanceRefKlass::enqueue_discovered_references if the
>     // generation returns without doing any work.

I think the comment is out of date. It has been in the G1 sources since 
I as long as I can remember. Removed.

> - The setup of the STW reference processor uses an is_alive_closure. 
> This seems to be correct and efficient, but why do the other full and 
> STW reference processors in HotSpot skip this?

The is_alive_non_header is an optimization for collectors that use an 
external data structure (bitmap) rather than the object's header to 
record liveness information. Only G1 and CMS have this external data 
structure. So when we try to 'discover' a reference we can see if its 
referent is already alive and, if so, skip the 'discovery' and treat the 
reference as a regular oop.What this buys us is less reference objects 
to process. The other collectors use an object's mark word to record 
liveness info during the GC. With these other GCs you can't normally 
tell if the referent is alive while walking the heap marking (and doing 
discovery) until the reachable object graph has been walked.

> - Why is it necessary to go through the references found during CM in 
> G1CollectedHeap::g1_process_strong_roots?
>     if 
> (!_process_strong_tasks->is_task_claimed(G1H_PS_refProcessor_oops_do)) {
>       // We need to treat the discovered reference lists of the
>       // concurrent mark ref processor as roots and keep entries
>       // (which are added by the marking threads) on them live
>       // until they can be processed at the end of marking.
>       ref_processor_cm()->weak_oops_do(&buf_scan_non_heap_roots);
>     }
Reference objects, currently, can only be discovered by one reference 
processor and there is currently no mechanism for moving a reference 
from one RP to another. References discovered by the CM reference 
processor are processed during remark. So if we have a reference object 
that has already been discovered by the CM reference processor, which is 
also in the collection set, then it will not be 'discovered' by the STW 
reference processor. But the routine 
ReferenceProcessor::discover_reference() will return false (meaning that 
the calling code does not need to treat the reference object as a 
regular oop) and the strongly reachable graph from the reference object 
will not be walked during the evacuation clause.  So we have to ensure 
that the reference and it's referent are preserved until marking 
completes; the CM ref processor will then process the reference. If we 
had some way of encoding the reference processor instance into the 
discovered reference object and then also check against that in 
ReferenceProcessor::discover_reference() then we could do without this code.
> - The G1_DEBUG define makes the code really difficult to read. Could 
> this be removed?

OK. It was really helpful while debugging but I suppose it's served it's 
> - Have not looked in detail at the large new block of code in 
> g1CollectedHeap.cpp. One question though, would it be possible to 
> somehow simplify the setup in G1ParTask, G1STWRefProcTaskProxy, 
> G1ParPreserveCMReferentsTask and process_discovered_references()?

The problem I see is that these are all stack allocated objects so I 
think the only way to simplify this would be to have a stack allocated 
wrapper object - G1ParCopyInitializer(.....). Can you think of an 
alternative way? I'm not sure a consensus could be reached about this.
> - Did you do any tests with -XX:RefDiscoveryPolicy=1?
No. Ramki has said that it has bit-rotted and he wishes to remove it.
> Some minor things:
> referenceProcessor.hpp
> - Would be nice if subclasses_of_ref() could be called something with 
> "number_of" or similar. That indicates better how it is being used.

OK. Done.

> thread.cpp
> - only bracket changes. Don't include in thins change.

I had debugging code in here and so left the 'correctly' formatted code 
after removing the debugging statements. Since I've already been thanked 
for doing this - I would leave it.

> referenceProcessor.cpp
> - The code that is commented out in ReferenceProcessor::weak_oops_do() 
> and ReferenceProcessor::clean_up_discovered_references() could 
> probably be removed by now.

Possibly - but again I would leave it in. I think it nicely explains the 
format of _discoveredSoftRefs and a possible alternative (and correct) 
implementation if a particular C++ compiler is able to unroll the nested 

> - You removed inline from void DiscoveredListIterator::remove(), will 
> that impact performance?

Possibly. The include dependencies are such that I've been unable to 
move these routines into referenceProcessor.hpp or a 
referenceProcessor.inline.hpp as compilation errors occur when I include 
javaClasses.hpp and systemDictionary.hpp.

> g1CollectedHeap.cpp
> - What does "ser" stand for in "stw_rp_disc_ser"?

Serial. We've made discovery for the STW reference processor non MT.

> - Would be nice to move these two lines into enable_discovery(). (May 
> need a parameter to disable the check for NoRefDiscovery to work 
> properly.)
>   assert(!ref_processor_stw()->discovery_enabled(), "Precondition");
>   ref_processor_stw()->verify_no_references_recorded();
OK. Again I would rather this was a separate change for precisely the 
reason you mention.

> - Are these asserts really necessary?
>   void G1ParScanThreadState::trim_queue() {
>     assert(_evac_cl != NULL, "not set");
>     assert(_evac_failure_cl != NULL, "not set");
>     assert(_partial_scan_cl != NULL, "not set");

Ensuring that the PSS was fully initialized before calling these 
closures was very helpful. Are they necessary? No. Are they defensive? 
Yes. I vote to leave them in.

> - Maybe G1KeepAliveClosure should be called something else. It does 
> not keep objects alive, it just adjusts the pointer to the forwarded 
> location.

I like the "keep alive" part of the name as it it ties in with being 
used as the "keep_alive" parameter. G1UpdatePtrKeepAliveClosure is 
perhaps a more accuarate, if verbose, name. Unless you have a strong 
objection (or a really good alternative name) I vote to leave it.
> g1CollectedHeap.hpp:
> - This comment is a bit confusing:
>   //   Will enqueue any non-live discovered references on the
>   //    STW ref processor's discovered lists.
>  Maybe "Will enqueue any non-live discovered reference that the STW 
> ref processor discovered.

OK. Changed:

  // At the end of a full GC we:
  //  * Enqueue any reference objects discovered by the STW ref processor
  //    that have non-live referents. This has the side-effect of
  //    making the STW ref processor inactive by disabling discovery.

> heapRegion.hpp
> - I guess #if WHASSUP should be removed...

Are you referring to line 593? I thought I had removed it.I can't see it 
in my workspace.

Again thanks for the very thorough code review.


More information about the hotspot-gc-dev mailing list