RFR (M): 8153503: Move remset scan iteration claim to remset local data structure

Derek White derek.white at oracle.com
Wed Apr 13 17:27:23 UTC 2016

Hi Thomas,

Small comment at the end...

On 4/12/16 5:41 AM, Thomas Schatzl wrote:
> Hi,
> On Fri, 2016-04-08 at 18:34 -0400, Derek White wrote:
>> Hi Thomas,
>> On 4/5/16 5:46 AM, Thomas Schatzl wrote:
>>> Hi all,
>>>     can I have reviews for this refactoring of the remset scan
>>> iteration
>>> claim procedure?
>>> First, this changeset tries to capture all remset scanning specific
>>> data that is located everywhere, and only used during remembered
>>> set
>>> scan, into a single new G1RemsetScanState class.
>>> This cleans up lots of code like members in various classes and
>>> additional maintenance code all over the place (and in fact this
>>> change
>>> deletes more code than it adds).
>>> Second, I took the liberty to change the weird scan algorithm that
>>> does
>>> two-passes over the collection set into a single-pass one.
>>> Instead of, by setting some internal flag whether we are in the
>>> second
>>> pass and not do code root scan (iirc), use the result of an atomic
>>> operation to make sure that only a single thread does the code root
>>> scan.
>>> Then the change removes the nasty _cset_rs_update_cl member of
>>> G1RemSet, which hid a nasty bug (JDK-8153360) which this change
>>> does
>>> not address (but make obvious).
>>> Also did the usual touch-up of method names that are affected.
>>> Webrev:
>>> https://bugs.openjdk.java.net/browse/JDK-8153503
>>> CR:
>>> http://cr.openjdk.java.net/~tschatzl/8153503/webrev/
>>> Testing:
>>> jprt, vm.gc multiple times, a few benchmarks
>>> Thanks,
>>>     Thomas
>> I've only started looking at the micro-level, not the whole picture.
>> But the new code is more easier to understand!
> That's the main point.
>> Here are some partial comments:
>> --- src/share/vm/gc/g1/g1RemSet.hpp
>> Line 132:
>>      "references that point into the heap". Don't all references point
>> into the heap?
>>       - Should say "into heap region", or "into collection set"?
> "Into the heap" is indeed kind of redundant.
>> --- src/share/vm/gc/g1/g1RemSet.cpp
>> Line 49:
>> - Is G1RemSetScanState really only used during evacuation?
> Yes. There is no state preserved across collections. You could recreate
> it from scratch at every collection. I wanted to change runtime
> behavior as little as possibly by recreating it from scratch every
> time, as the original call also preserved the per-region members now
> collected into arrays.
>> - Should mention it's the scan state for the whole heap (or all heap
>> regions)?
> I will try to clarify the comment.
>> Lines 62, 65: Wasn't clear by the field names that these are arrays,
>> not single structs. Maybe make the names plural? "_iter_states",
>> "_iter_claims"
> Okay. No preference.
>> Lines 62, 65: Should declare "_iter_states" and "_iter_claims"
>> volatile to go with the Atomic accessors?
> Can do. Since all accesses are before/after Atomic cmpxchg which do the
> necessary (compiler) barriers anyway, this seemed redundant.
>> Line 83: Save "max_regions" as a field here, and remove the
>> max_regions argument to reset(). We could then add bounds
>> checks when/where needed.
> Can do.
>> Line 86: Should G1RemSetScanState::initialize() call
>> G1RemSetScanState::reset()?
>> Perhaps the arrays are not used until after a call to reset(), but
>> it's a little weird that "initialize()"
>> doesn't actually initialize all of the memory.
> It's kind of redundant as you note, as reset() is called at the start
> of every collection anyway. Initialize is called during startup, I
> tried to avoid this useless overhead.
>> Line 89: Add an assert that _iter_state and _iter_claimed are not
>> NULL.
>> i.e. check that initialize() was called before reset().
> Okay.
>> Line 93: Add a CR after. And lines 102, 111, 115, 119.
> Okay.
>> Line 113: Can declare const also.
> Okay.
>> I'll look at this more on Monday.
>>    - Derek

Sorry for dripping this out. I spent a lot of time trying to understand 
what G1ScanRSClosure::doHeapRegion() was really doing - what it was 
iterating over, how it was filtering out the claimed cards, etc.  If 
that iteration were done anywhere else I'd suggest separating it out 
into a "claimed" iterator class (but it's not done anywhere else).

One suggestion to existing code:

g1RemSet.cpp, Line 235 (etc):

Rename "jump_to_card" => "claimed_cards". The "jump" term pre-dates the 
current parallelization scheme and now isn't very meaningful.

Still looking...

  - Derek

More information about the hotspot-gc-dev mailing list