RFR(S): G1: assert(_finger == _heap_end) failed, concurrentMark.cpp:809
john.cuthbertson at oracle.com
Wed Mar 13 12:18:22 PDT 2013
Thank you for looking at the code changes.
On 3/13/2013 11:28 AM, Jon Masamitsu wrote:
> No comments affecting the correctness of the
> change so looks good.
> 979 // If we're during the concurrent phase of marking, reset the
> during -> doing ?
OK. Whatever communicates that the overflow is seen while the concurrent
phase is active.
> From concurrentMark.hpp
> // It should be called to indicate which phase we're in (concurrent
> // mark or remark) and how many threads are currently active.
> void set_phase(uint active_tasks, bool concurrent);
> Is there a better name that we can give to set_phase() so
> that we know if should only be called once per phase (if that
> is in fact true?)? Maybe set_phase_on_entry() or
> initialize_phase() or set_phase_start().
> Also set_phase() seems to have much to do with the
> concurrency of the phase so maybe set_concurrency(),
> set_concurrency_on_entr(), initialize_concurrency(),
Actually I like set_concurrency_and_phase(...).
> 2403 // Not strictly necessary but...
> 2404 //
> 2405 // We need to reset the number of workers in the parallel
> 2406 // task terminator, before each proxy task execution, so
> 2407 // that the termination protocol in CMTask::do_marking_step()
> 2408 // knows how many workers to wait for.
> 2409 _cm->terminator()->reset_for_reuse(_active_workers);
> If you're reusing a ParallelTaskTerminator, I think it really is best
> to not assume that it can be reused.
OK. Kind of prompts the change I'll suggest below.
> On 3/12/2013 5:10 PM, John Cuthbertson wrote:
>> Hi Everyone,
>> Can I have a couple of volunteers look over these changes? The webrev
>> can be found at: http://cr.openjdk.java.net/~johnc/8009940/webrev.0/
>> These changes were, at Bengt's request, split out from the original
>> webrev for 8009536. In the webrev they have been applied on top of
>> 8009536 but they are independent. The issue has been in the G1 code
>> for a while (since I added the reference processing code a couple of
>> years ago) - we just never hit it.
>> Anyway the assertion is caused by calling set_phase() - which throws
>> the assertion failure - after resetting the marking state - which
>> resets the value of the global finger.
> Is set_phase() actually safe to call other than the assertion that fires
> if the global finger is reset?
set_phase() was actually safe to call (other than the assert). Even
resetting the global finger is, I believe, safe. Doing so means that we
would mark object but would no longer push them - as they are no longer
below the finger. Objects are pushed so that they can be traced. Since
marking is going to be restarted, the marked objects would be traced in
the normal manner when marking reaches them.
Since it's not safe to reuse a parallel task terminator (I didn't know
that) my suggestion is:
* break out the concurrency-related code from set_phase() into it's own
* call that routine within set_phase (now renamed set_concurrency_and_phase)
* call the set_concurrency routine within the parallel reference processing.
That should also address Bengt's concern about the inconsistency between
the parallel task terminator and the WorkGangBarrierSyncs.
More information about the hotspot-gc-dev