RFR: 8220465: Use shadow regions for faster ParallelGC full GCs

Haoyu Li leihouyju at gmail.com
Wed Nov 27 02:15:32 UTC 2019

Hi Thomas,

Thanks for your constructive suggestions!  I've addressed the issues you
mentioned, and the updated patches are attached. Please find the details

> - static const ints can be initialized in the definition (UNUSED,
> SHADOW, ...); also they should be CamelCased; they are very unspecific
> too - I added some prefix to distinguish them a bit.
> Now these values of _shadow_state are initilized in the definition and
CamelCased. I have also changed their names to ba more specific: a region
is *UnusedRegion* when untouched, and will become *NormalRegion* if
processed with the original parallel full GC algorithm; if an idle GC
thread steals an unavailable region to process it with the help of a shadow
region, the thread will mark it to *ShadowRegion*, *FilledShadow*, and
*CopiedShadow* in sequence.

- the documentation about this change is imho lacking.
>   - It would be nice to explain the idea of shadow regions somewhere
> assuming that you know how parallel works. Including the reference to
> the paper. :)
>   - some of the comments just show what code (often a single statement)
> does, not the what and why or the reason why a particular method or
> member exists. Or explains one or the other.
> E.g.
> "The shadow region array, we use it in a LIFO fashion, so that we can
> reuse shadow regions for better data locality and utilization"
>    - at this point we have no idea what a "shadow region" is and we
> can't find out easily because it is called "shadow record" or "steal
> record" elsewhere.
> Something better could be:
> "Contains currently free shadow regions (assuming we converge on that
> name). We use it in a LIFO fashion for better data locality and
> utilization."
> Thanks for pointing out the lack of illustration. I've added several
paragraphs to demonstrate the main idea, the typical workflow, and the
source paper of shadow region optimization in the comments of
ParallelCompact::initialize_shadow_region(). Besides, I've also checked
other comments in the patch and made them more precise.

> - I think there is a missed optimization opportunity in (now)
> PSParallelCompact::initialize_shadow_regions(). There, the code
> initializes the "free" region ids to region_at_top+1 to end_region of a
> particular space.
> If the top for a given space is at a region boundary (e.g. if a space is
> empty, which is probably common for one of the survivor spaces), you
> loose a single region per space.
> One reason might that the code uses region "0" as sentinel to indicate
> "there is no shadow region available" in
> ParCompactionManager::acquire_shadow_region().
> This could be fixed by improving the code in
> PSParallelCompact::initialize_shadow_regions() and use a sentinel region
> value of (size_t)~0 (as an explicit constant).
> Even if you do not change this, please introduce an explicit constant
> for this sentinel value. This makes the code more self-explanatory.
> Sorry for the misleading +1 operation. The +1 can be safely removed. The
sentinel value 0 does not cause this design because the first region (in
old space) cannot be a shadow region.

> - at least in ParallelCompactData::RegionData::try_steal I would add a
> dirty read of the _shadow_state to avoid the overhead of obviously
> unsuccessful steal attempts (I do not know about frequencies of those,
> so ymmv, but probably it would be easiest to add it everywhere).
> Also all the cmpxchg can/should use memory_order_relaxed to avoid the
> two full fences every time accessed as far as I can tell.
> Excellent suggestions! I didn't consider the performance factors in these
atomic instructions before, and you're right that GC threads may suffer
many failures the first time getting shadow regions. Changing the memory
order to memory_order_relaxed is also helpful.

> - not sure about whether "acquire_shadow_region()/release_shadow_region"
> are good names for
> "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or
> something else).
> "Acquire"/"release" has a very specific semantic related to a completely
> different area (memory ordering in MP systems), so we should probably
> avoid using them. There are other well-used pairs of names to add and
> remove elements to a container too.
> Naming functions with "Acuire" and "Release" is indeed misleading. I've
changed these functions to push/pop_shadow_region,
push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in
PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you
think these names are appropriate now?

> - the changes in PSParallelCompact sometimes use the terms
> "steal_record", "shadow_record" and "shadow_region" (e.g.
> _shadow_region_array) interchangeably.
> Can you give a reason for this? I am good with any (with a preference
> for "shadow_region" since it gives an idea of the contents while
> "record" is quite generic), but it makes reading the code harder than
> necessary.
> Sorry for the inconsistency between variable names. I introduced
steal_record to record the index of the next shadow region, so that a GC
thread could seek shadow regions from the last point instead of the
beginning. To make the code more specific, I've changed the variable
_shadow_record to _next_shadow_region and moved the code in
PSParallelCompact::initialize_shadow_record to

> - the names of the new methods e.g. in PSParallelCompact::RegionData
> should be more precise; e.g. please add what does "try_push" wants to
> push? Or "try_steal" steal?
> Not even the comments for these contain that information, and I believe
> that by better naming of the methods, we can avoid the comments
> completely in most cases.
> Sorry for the vague code. These five atomic interfaces intend to change
the _shadow_state of the current region to reflect the collection process,
not to push or steal anything.  I've changed try_push to mark_normal and
try_steal to mark_shadow, respectively. The _shadow_state and the return
value of these functions can help the collector to determine 1) whether a
region should be collected by the shadow region optimization and 2) if the
data in a shadow region are ready to be copied back to the corresponding
heap region.

Thanks again for your valuable reviews. If there are any further problems,
please contact me at any time. I was also wondering could you please CC the
following mails to me? There seem some problems with my email, and I didn't
receive your last mail until I searched the mail lists in OpenJDK website.
Thanks very much!

Best Regards,
Haoyu Li,
Institute of Parallel and Distributed Systems(IPADS),
School of Software,
Shanghai Jiao Tong University
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shadow-region-v5.patch
Type: text/x-patch
Size: 29986 bytes
Desc: not available
URL: <https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20191127/c7ff205c/shadow-region-v5.patch>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shadow-region-incr.patch
Type: text/x-patch
Size: 23414 bytes
Desc: not available
URL: <https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20191127/c7ff205c/shadow-region-incr.patch>

More information about the hotspot-gc-dev mailing list