RFR: 8069367: assert(_nextMarkBitMap->isMarked((HeapWord*) obj)) failed

Bengt Rutisson bengt.rutisson at oracle.com
Mon Mar 30 13:44:37 UTC 2015

On 2015-03-30 14:42, Kim Barrett wrote:
> On Mar 30, 2015, at 5:15 AM, Bengt Rutisson <bengt.rutisson at oracle.com> wrote:
>> On 2015-03-30 03:28, Kim Barrett wrote:
>>> The cost of the pre-filtering of the mark stack is small, and some of
>>> it may be recovered by avoiding the cost of moving the filtered
>>> objects through the mark stack, and by doing less work on the filtered
>>> objects than would otherwise be needed. Aurora GC performance tests
>>> show mostly insignificant differences, with a few small improvements
>>> compared to the baseline that has bug 8069367.
>> I like this approach. I think it makes sense even without the
>> humongous reclamation issues to handle type arrays specially.
> I've been waffling over this. I would guess it would be a rare
> application that had a sufficiently low typeArray / object ratio that
> this would negatively impact performance (by a small amount?), and
> common for it to be a net benefit.  But I don't have any data to back
> up that guess.  And the effect is dynamic, since it doesn't apply in
> the non-push case - I don't think we should add such a check in
> scan_object.

Right. Agreed.

>>> [While I was at it, I simplified the conditionalization of
>>> deal_with_references for _CHECK_BOTH_FINGERS_, eliminating some code
>>> duplication.]
>> Personally I would prefer to do a small cleanup fix before your change
>> to simply remove _CHECK_BOTH_FINGERS_. It has been in the code since
>> 2011 (JDK-7046558: G1: concurrent marking optimizations) and as I
>> understand it it was mostly added because there was not time to do
>> enough performance testing at the time. I doubt that anyone will pick
>> up the performance work here, so I'd rather remove the code.
> I would be fine with killing off _CHECK_BOTH_FINGERS_.  And looking at
> that code again, it seems like it would be worth trying to eliminate
> one or both of the _finger or _curr_region NULL checks.
> I'll take a look at that.

Great! Thanks!

>> A couple of questions about this code in concurrentMark.inline.hpp:
>> 300           if (_CHECK_BOTH_FINGERS_ && _finger != NULL && objAddr < _finger) {
>> 301             if (obj->is_typeArray()) {
>> 302               // Immediately process arrays of binary data, rather
>> 303               // than pushing on the mark stack.  This keeps us from
>> 304               // adding humongous objects to the mark stack that might
>> 305               // be reclaimed before the entry is processed - see
>> 306               // G1EagerReclaimHumongousPreSnapshotTypeArrays. The
>> 307               // cost of the additional type test is mitigated by
>> 308               // avoiding a trip through the mark stack, and by only
>> 309               // doing bookkeeping update and avoiding the actual scan
>> 310               // of the object - a typeArray contains no references,
>> 311               // and the metadata is built-in.
>> 312               process_grey_object<false>(obj);
>> 313             } else {
>> 314               if (_cm->verbose_high()) {
>> 315                 gclog_or_tty->print_cr(
>> 316                   "[%u] below the local finger (" PTR_FORMAT "), pushing " PTR_FORMAT,
>> 317                   _worker_id, p2i(_finger), p2i(objAddr));
>> 318               }
>> 319               push(obj);
>> 320             }
>> It looks to me like the logging (lines 314-318) is relevant for type
>> arrays too. Can we move it up before line 301? Possibly changing the
>> wording from "pushing" to something more generic - maybe simply "obj".
> I limited the logging to the "push" branch intentionally.
> process_grey_objects similarly logs, and to me it didn't seem useful
> to have both in the non-push case and to lose the "pushing"
> distinction in that case. I would be more inclined to add an explicit
> immediate processing vs pushing distinction in the log output than to
> merge away that distinction.  (That distinction is presently implicit
> from deal_with_reference logging each object that needs marking,
> followed by the process_grey_objects message, the pushing message, or
> silence because par_mark lost the race with some other task.)

Ok. Thanks for explaining this. I'm fine with the logging as you 
suggested it.

>> I think the comment (lines 302-311) is good but it takes up a lot of
>> space inside the deal_with_reference() method. How about breaking
>> lines 301-320 out into its own method? (Keeping the logging that I
>> mentioned in the previous paragraph inside deal_with_reference().)
> I thought about that, but put it aside when I failed to come up with a
> decent name for that helper function, and didn't got back to it.  I'll
> think about it some more.

Good. Thanks!

>> I'm not sure I understand the changes in the g1CollectedHeap.*
>> files. In particular the changes to
>> RegisterHumongousWithInCSetFastTestClosure:
>> http://cr.openjdk.java.net/~kbarrett/8069367/webrev.03/src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp.frames.html
>> It looks like the G1EagerReclaimHumongousPreSnapshotTypeArrays flag
>> controls if you want to use the previous version (don't reclaim
>> objects that may be part of the SATB snapshot) and the new version
>> (don't push type arrays on the mark stack). Is this debugging code
>> that was left or is your intention that we should have both versions
>> in the code?
> We've encountered a number of bugs related to the eager reclaim of
> humongous objects feature, some of which might end up requiring the
> previous behavior.  (When I say "might", I mean I have little
> confidence either way right now.) And there might be more we haven't
> run into yet. This new flag is intended as a partial mitigation
> strategy for those, being weaker than simply turning off one or both
> of the other related flags.
> It also provides a searchable name to for the coupling between the
> this code and the is_typeArray check in deal_with_reference.  But
> maybe the comment in the latter is sufficient now?

I'm not too happy about adding code just in case we need it. I'm afraid 
that the code for the -XX:-G1EagerReclaimHumongousPreSnapshotTypeArrays 
case will be tested very rarely and when we suddenly need it it will 
contain some subtle bugs and just cause more problems for us.



More information about the hotspot-gc-dev mailing list