G1GC Full GCs

Todd Lipcon todd at cloudera.com
Mon Jan 24 06:16:43 UTC 2011

Unfortunately my test is not easy to reproduce in its current form. But the
more I look into it, the more it looks like we're running into the same issue.

I added some code at the end of the mark phase that, after the regions are
sorted by efficiency, prints an object histogram for any region that is
>98% garbage but very inefficient (<100 KB/ms predicted collection rate).

Here's an example of an "uncollectable" region that is all garbage but for
one object:

Region 0x00002aaab0203e18 (  M1) [0x00002aaaf3800000, 0x00002aaaf3c00000]
Used: 4096K, garbage: 4095K. Eff: 6.448103 K/ms
  Very low-occupancy low-efficiency region. Histogram:

 num     #instances         #bytes  class name
   1:             1            280
Total             1            280

At ~6 K/ms it's predicted to take 600+ms to collect this region, so it will
never happen.

I can't think of any way that there would be a high mutation rate of
references to this Entry object.

So, my shot-in-the-dark theory is similar to what Peter was thinking: when a
region, over its lifetime, has a large number of other regions reference
it, even briefly, its sparse table will overflow. Then, later in life, when
it's down to even just one object with a very small number of inbound
references, it still has all of those coarse entries -- they don't get
scrubbed, because the referencing regions are suffering the same issue.



On Sun, Jan 23, 2011 at 12:42 AM, Peter Schuller <
peter.schuller at infidyne.com> wrote:

> > I still seem to be putting off GC of non-young regions too much though. I
> Part of the experiments I have been harping on was the change below,
> which cuts GC efficiency out of the decision to perform non-young
> collections. I'm not suggesting it actually be disabled, but perhaps
> it can be adjusted to fit your workload? If there is nothing outright
> wrong in terms of predictions, and the problem is due to cost estimates
> being too high, that may be a way to avoid full GCs at the expense of
> more expensive GC activity. This smells like something that should be
> a tweakable VM option. Just like GCTimeRatio affects heap expansion
> decisions, something to affect this (probably just a ratio applied to
> the test below?).
> Another thing: this is to a large part my confirmation-biased
> brain speaking, but I would be really interested to find out if the
> slow build-up you seem to be experiencing is indeed due to RS scan
> costs due to sparse table overflow (I've been harping about roughly
> the same thing several times, so maybe people are tired of it; most
> recently in the thread "g1: dealing with high rates of inter-region
> pointer writes").
> Is your test easily runnable so that one can reproduce? Preferably
> without lots of hbase/hadoop knowledge. I.e., is it something that can
> be run in a self-contained fashion fairly easily?
> Here's the patch indicating where to adjust the efficiency thresholding:
> --- a/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp   Fri
> Dec 17 23:32:58 2010 -0800
> +++ b/src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp   Sun
> Jan 23 09:21:54 2011 +0100
> @@ -1463,7 +1463,7 @@
>     if ( !_last_young_gc_full ) {
>       if ( _should_revert_to_full_young_gcs ||
>            _known_garbage_ratio < 0.05 ||
> -           (adaptive_young_list_length() &&
> +           (adaptive_young_list_length() && //false && // scodetodo
>             (get_gc_eff_factor() * cur_efficiency <
> predict_young_gc_eff())) ) {
>         set_full_young_gcs(true);
>       }
> --
> / Peter Schuller

Todd Lipcon
Software Engineer, Cloudera