[PATCH] Exploit Empty Regions in Young Gen to Enhance PS Full GC Performance

Paul Su paul.su at oracle.com
Mon Jan 7 03:04:15 UTC 2019


Hi Haoyu,

Thanks for your contribution. The past two weeks were holidays for most of the US and European regions. We are also in the middle of a critical phase of our release process. We are aware of your proposal and will consider it and provide feedback as soon as possible.

Thanks,
Paul

> On Jan 6, 2019, at 5:41 PM, Haoyu Li <leihouyju at gmail.com> wrote:
> 
> Hi all,
> 
> I submitted a patch in my previous mail about two weeks ago, but I have not received any response so far. Did I miss something? I followed the instructions on the How to Contribute webpage. Could someone sponsor this patch? Any reviews are much appreciated!
> 
> Best Regards,
> Haoyu Li,
> Institute of Parallel and Distributed Systems(IPADS),
> School of Software,
> Shanghai Jiao Tong University
> 
> 
> On Mon, Dec 24, 2018 at 1:38 AM, Haoyu Li <leihouyju at gmail.com> wrote:
>> Hi all, 
>> I have developed a patch to enhance the full GC performance of Parallel Scavenge on OpenJDK 11. May I have some reviews? The patch is described below and attached to this mail.
>> 
>> Problem
>> Parallel Scavenge (PS) implements a compacting algorithm for full GC. We find that this algorithm leads to very poor GC thread utilization (for example, only 8% on the Derby benchmark in the SPECjvm2008 suite) because there are serious dependencies between heap regions: a region becomes available to receive live objects from its source regions only after it has been collected. Work stealing does not solve this problem, because idle GC threads cannot steal anything when most regions are unavailable for collection.
>> 
>> Optimization
>> We propose shadow regions to solve the above problem. The basic idea is to let GC threads collect unavailable regions in advance by copying their live data into newly allocated empty regions, i.e., shadow regions, which resolves the region dependencies. The contents of the shadow regions are copied back to the corresponding regions later. With our approach, GC threads can keep working most of the time without suffering from work-stealing failures (except those that happen at the end of a full GC). We also notice that the to space in the young gen is always empty, so we use the empty regions in to space as shadow regions (if the ScavengeBeforeFullGC option is on, regions in eden space may be used, too) and avoid allocating shadow regions from off-heap memory.
>> 
>> Evaluation
>> We evaluate the full GC performance with our patch on the DaCapo, SPECjvm2008, and JOlden benchmark suites. The results show that the shadow region optimization improves full GC throughput by 2.1X on average and by up to 3.2X.
>> 
>> The patch and evaluation results are attached.
>> 
>> Best Regards,
>> Haoyu Li,
>> Institute of Parallel and Distributed Systems(IPADS),
>> School of Software,
>> Shanghai Jiao Tong University
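
The region dependency and the shadow-region idea described in the quoted proposal can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the patch's HotSpot code: the function name compaction_rounds, the sources layout, and the round-based scheduling are all hypothetical, and the copy-back step is treated as free. The round counts it produces are properties of the toy only and say nothing about the measured speedups.

```python
def compaction_rounds(sources, workers, shadow_regions=0):
    """Count the parallel rounds a toy compaction needs (hypothetical model).

    sources[d] lists the regions whose live data must be moved into region d.
    A region may be overwritten in place only once every destination that
    reads from it has been filled. With shadow regions, a blocked destination
    can be filled into a spare region early and copied back later.
    """
    n = len(sources)

    # pending[s] = number of destinations still waiting to read from region s.
    pending = [0] * n
    for dest, srcs in enumerate(sources):
        for src in srcs:
            if src != dest:
                pending[src] += 1

    filled = [not srcs for srcs in sources]  # regions that receive nothing need no work
    in_shadow = set()                        # destinations currently held in a shadow region
    rounds = 0

    while not all(filled):
        # Copy-back (assumed free here): a shadow is released once its
        # target region no longer has unread live data.
        in_shadow = {d for d in in_shadow if pending[d] > 0}

        todo = [d for d in range(n) if not filled[d]]
        available = [d for d in todo if pending[d] == 0]
        blocked = [d for d in todo if pending[d] > 0]

        # Workers take available regions first, then blocked ones via shadows.
        direct = available[:workers]
        spare_workers = workers - len(direct)
        free_shadows = shadow_regions - len(in_shadow)
        shadowed = blocked[:max(0, min(spare_workers, free_shadows))]

        if not direct and not shadowed:
            raise ValueError("no region can make progress; invalid plan")

        for dest in direct + shadowed:
            filled[dest] = True
            for src in sources[dest]:
                if src != dest:
                    pending[src] -= 1
        in_shadow.update(shadowed)
        rounds += 1

    return rounds


# Worst case: a chain where region d receives its data from region d + 1.
chain = [[d + 1] for d in range(7)] + [[]]
print(compaction_rounds(chain, workers=4))                    # 7
print(compaction_rounds(chain, workers=4, shadow_regions=3))  # 2
```

In the chain, only one region is available per round without shadow regions, so three of the four workers stay idle. With three shadow regions the idle workers fill blocked regions ahead of time, which is the effect the proposal aims for.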

