CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
tony.printezis at oracle.com
Tue Dec 27 18:05:12 UTC 2011
Here's an updated webrev for this change that takes into account the new
approach of chunking object arrays (see previous e-mails on 7121623):
If anything, the new approach simplified the code a bit, since now we
can always read an object's size from its from-image instead of having
to check one or the other depending on whether it's a chunked array or
not. I also moved the body of some methods from heapRegion.hpp to the
.inline.hpp and .cpp files (as they were getting a bit large to keep in
the .hpp file).
On 12/21/2011 05:37 PM, Tony Printezis wrote:
> Hi all,
> I'd like a couple of code reviews for the following non-trivial
> changes (large, not necessarily in lines of code modified, but more due
> to the fact that the evacuation pause / concurrent marking interaction
> is changed quite dramatically):
> Here's some background, motivation, and a summary of the changes (I
> felt that it was important to write a longer than usual explanation):
> * Background / Motivation
> Each G1 heap region has a field top-at-mark-start (aka TAMS) which
> denotes where the top of the region was when marking started. An
> object is considered implicitly live if it's over TAMS (i.e., it was
> allocated since marking started) or explicitly live if it's below TAMS
> (i.e., it was allocated before marking started) and marked on the
> bitmap. (It follows that it's unnecessary to explicitly mark objects
> over TAMS.)
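A minimal sketch of the liveness rule described above, assuming a simplified region model. RegionModel, its fields, and the word-index addressing are hypothetical names chosen for illustration; they are not HotSpot's HeapRegion API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of one heap region (not HotSpot's HeapRegion).
struct RegionModel {
    std::size_t bottom;        // first address of the region
    std::size_t top;           // current allocation pointer
    std::size_t tams;          // top-at-mark-start
    std::vector<bool> bitmap;  // one mark bit per address, indexed from bottom

    // Allocated since marking started: at or above TAMS, no mark bit needed.
    bool is_implicitly_live(std::size_t addr) const {
        assert(addr >= bottom && addr < top);
        return addr >= tams;
    }

    // Allocated before marking started: live only if marked on the bitmap.
    bool is_explicitly_live(std::size_t addr) const {
        assert(addr >= bottom && addr < top);
        return addr < tams && bitmap[addr - bottom];
    }

    bool is_live(std::size_t addr) const {
        return is_implicitly_live(addr) || is_explicitly_live(addr);
    }
};
```

In this model an object at or above tams is live regardless of its bitmap bit, which is why marking such objects explicitly would be redundant.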
> In fact, we have two copies of the above marking information: "Next
> TAMS / Next Bitmap" and "Prev TAMS / Prev Bitmap". Prev is the copy
> that was obtained by the last marking cycle that was successfully
> completed (so, it is consistent: all live objects should appear as
> live in the prev marking information). Next is the copy that will be
> obtained / is currently being obtained and it's not consistent because
> it's not guaranteed to be complete.
> G1 uses SATB marking, which has the advantage of not requiring objects
> allocated since the start of marking to be visited at all by the
> marking threads (they are implicitly live and they do not need to be
> scanned). So, the active marking cycle can totally ignore objects over
> NTAMS (since they have been allocated since marking started).
> The current interaction between evacuation pauses (let's call these
> "GCs" from now on) and concurrent marking is very tricky. Even though
> marking ignores all objects over NTAMS (currently: all objects in Eden
> regions) it still has to visit and mark objects in the Survivors
> regions. But those will be moved by subsequent GCs. So, a GC needs to
> be aware that it's moving objects that have been marked by the marking
> threads and not only propagate those marks but also notify the marking
> threads that said objects have been moved. For that we use several
> data structures: pushes to the global marking stack and also to what's
> referred to as the "region stack" which is only used by the GC to push
> a group of objects instead of pushing them individually ("region"
> here is a mem region and smaller than a G1 region).
> Additionally, because the marking threads could come across objects
> that could potentially move we have to make sure that we don't leave
> references to regions that have been evacuated on any marking data
> structure. To do that we treat as roots all entries on the taskqueues
> / global stack and drain all SATB buffers (both active buffers and
> also enqueued buffers).
> The first issue with the above interaction is performance. Draining
> all SATB buffers and scanning the mark stack and taskqueues has been
> shown to be very time-consuming in some cases.
> Also, having to check whether objects are marked and propagate the
> marks appropriately during GC is an extra overhead.
> The second issue is that it has been shown to be very fragile. We have
> discovered and fixed many issues over time which were subtle and hard
> to reproduce.
> We really need to simplify the GC/marking interaction to both improve
> performance of GCs during marking, as well as improve our reliability.
> This changeset does exactly that.
> * Explanation of the changes
> The goal is to ensure that all the objects that are copied by the GC
> do not need to be visited by the marking threads and as a result do
> not need to be explicitly marked, pushed, etc.
> The first observation is that most objects copied during a GC are
> allocated after marking starts and are therefore implicitly live. This
> is the case for all objects on Eden regions, as well as most objects
> on Survivor regions. The only exceptions are objects on the Survivor
> regions during the initial-mark pause. Unfortunately, it's not easy to
> track those separately as they will get mixed in with future
> Survivors. The first decision to deal with this is to turn off
> Survivors during the initial-mark pause. This ensures that each
> subsequent GC will only copy objects that have been allocated since
> marking started and are therefore implicitly live (i.e., over NTAMS).
> This allows us to totally eliminate the code that
> propagates marks during the GC. We just have to make sure that all
> copied objects are over NTAMS. Turning off Survivors during an
> initial-mark pause is a bit of a "big hammer" approach, but it will
> suffice for now. We have ideas on how to re-enable them in the future
> and we'll explore a couple of alternatives.
> Given that the GC only copies objects that are implicitly marked it
> follows that none of the objects that are copied during any GC should
> appear on either the taskqueues or the global marking stack. Also
> remember that we filter SATB buffers before enqueueing them which will
> filter out all implicitly marked objects. It follows that no enqueued
> SATB buffer should have references to objects that are being moved.
> This leaves the currently active SATB buffers given that the code that
> populates them is unconditional. But if we run the filtering on those
> during each GC such "offending" references are also quickly
> eliminated. So, instead of having to scan all stacks and all SATB
> buffers we only have to filter the active SATB buffers, which should
> be much, much faster.
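The filtering step above can be sketched roughly as follows. This is an illustrative model only: HeapModel, filter_satb_buffer, and the fixed-size region layout are assumptions made for the example, not the actual SATB queue code, and the sketch drops only implicitly marked entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: the heap is split into fixed-size regions and
// ntams[i] holds the next-TAMS address of region i.
struct HeapModel {
    std::size_t region_size;
    std::vector<std::size_t> ntams;

    // An object at or above its region's NTAMS was allocated since marking
    // started, so it is implicitly marked.
    bool is_implicitly_marked(std::size_t addr) const {
        const std::size_t region = addr / region_size;
        assert(region < ntams.size());  // addresses are assumed to be in-heap
        return addr >= ntams[region];
    }
};

// Drops every buffer entry that refers to an implicitly marked object and
// returns how many entries were removed.
inline std::size_t filter_satb_buffer(const HeapModel& heap,
                                      std::vector<std::size_t>& buffer) {
    const std::size_t before = buffer.size();
    buffer.erase(
        std::remove_if(buffer.begin(), buffer.end(),
                       [&heap](std::size_t addr) {
                           return heap.is_implicitly_marked(addr);
                       }),
        buffer.end());
    return before - buffer.size();
}
```

Since a GC only copies objects over NTAMS, any entry that survives this filter cannot refer to an object that is being moved, which is the property the paragraph above relies on.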
> * Implementation Notes
> The actual changes are not too extensive as they mostly
> disable functionality in the GC code. The tricky part was to get the
> TAMS fields correct at various phases (start of copying, start of
> marking, etc.) and especially when an evacuation failure occurs. I put
> all that functionality in methods on HeapRegion which do the right
> thing when a GC starts, a marking starts, etc.
> The most important changes are in the "main" GC code, i.e.
> G1ParCopyHelper::do_oop_work() and
> G1ParCopyHelper::copy_to_survivor_space(). Instead of having to
> propagate marks, we now only need to mark objects directly reachable
> from roots during the initial-mark pause. The resulting code is much
> simplified (and hopefully more performant!).
> I also added a method verify_no_cset_oops() which checks that indeed
> all the marking data structures do not point to regions that are being
> GCed at the start / end of each GC. (BTW, I'm considering adding a
> develop flag to enable this on demand.)
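As a rough illustration of the kind of check described above, here is a sketch under simplifying assumptions. CollectionSetModel and no_cset_oops are hypothetical names, and the marking data structures are modelled as plain vectors of addresses; the real verify_no_cset_oops() in the webrev may work quite differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical model of the collection set: the indices of the fixed-size
// regions that the current GC is evacuating.
struct CollectionSetModel {
    std::size_t region_size;
    std::unordered_set<std::size_t> cset_regions;

    bool contains(std::size_t addr) const {
        return cset_regions.count(addr / region_size) != 0;
    }
};

// Returns true if none of the given marking data structures (e.g. the global
// mark stack, the task queues, the SATB buffers) holds an address that lies
// inside a collection set region.
inline bool no_cset_oops(
        const CollectionSetModel& cset,
        const std::vector<std::vector<std::size_t>>& structures) {
    for (const auto& entries : structures) {
        const bool offending = std::any_of(
            entries.begin(), entries.end(),
            [&cset](std::size_t addr) { return cset.contains(addr); });
        if (offending) {
            return false;
        }
    }
    return true;
}
```

A caller would run such a check at the start and end of each GC and treat a false result as a fatal error.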
> I should point out that this changeset will leave a lot of dead code.
> However, I took the decision to keep the changes to a minimum in order
> not to overwhelm the code reviewers and make the important changes
> clearer. (I also discussed this with a couple of potential code
> reviewers and they agreed that this is a good approach.) I temporarily
> added guarantees to ensure that methods that should not be called are
> not called. I will remove all dead code with a future push.
> I also have to apologize to John Cuthbertson for removing a lot of
> code he's added to deal with various bugs we had in the GC/marking
> interaction. Hopefully the new code will be less fragile compared to
> what we've had so far and John will be able to concentrate on more
> interesting features than trying to track down hard-to-reproduce