G1GC Full GCs
Y. S. Ramakrishna
y.s.ramakrishna at oracle.com
Wed Jul 7 11:28:56 PDT 2010
On 07/07/10 08:45, Todd Lipcon wrote:
> Overnight I saw one "concurrent mode failure".
> 2010-07-07T07:56:27.786-0700: 28490.203: [GC 28490.203: [ParNew
> (promotion failed): 59008K->59008K(59008K), 0.0179250 secs]28490.221:
> [CMS2010-07-07T07:56:27.901-0700: 28490.317: [CMS-concurrent-preclean:
> 0.556/0.947 secs] [Times:
> user=5.76 sys=0.26, real=0.95 secs]
> (concurrent mode failure): 6359176K->4206871K(8323072K), 17.4366220
> secs] 6417373K->4206871K(8382080K), [CMS Perm : 18609K->18565K(31048K)],
> 17.4546890 secs] [Times: user=11.17 sys=0.09, real=17.45 secs]
> I've interpreted pauses like this as being caused by fragmentation,
> since the young gen is 64M, and the old gen here has about 2G free. If
> there's something I'm not understanding about CMS, and I can tune it
> more smartly to avoid these longer pauses, I'm happy to try.
Yes, the old gen must be fragmented. I'll look at the data you have
made available (for CMS). The CMS log you uploaded does not include the
tail leading up to the concurrent mode failure you display above
(it stops less than 2500 s into the run). If you could include
the entire log leading into the concurrent mode failures, it would
be a great help. Do you have large arrays in your
application? The shape of the promotion graph for CMS is somewhat
jagged, indicating _perhaps_ that. Yes, -XX:+PrintTenuringDistribution
would shed a bit more light. As regards fragmentation, it can be
tricky to tune against, but we can try once we understand a bit
more about the object sizes and demographics.
I assume you don't have an easily shared test case that would let us
reproduce both the CMS fragmentation and the G1 full GC issues
locally? That would be the quickest way to make progress on this.
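For reference, the "about 2G free" estimate in the quoted log can be checked
with a little arithmetic. The following is only a minimal Python sketch: the
regular expression and the function name old_gen_free_kb are illustrative
assumptions, not part of any HotSpot tool, and the log text is copied from the
record quoted above.

```python
import re

# Old-gen figures copied from the quoted "concurrent mode failure" record.
LOG = ("(concurrent mode failure): 6359176K->4206871K(8323072K), "
       "17.4366220 secs] 6417373K->4206871K(8382080K)")

# Hypothetical pattern for this one record shape: before->after(capacity).
PATTERN = re.compile(r"\(concurrent mode failure\): (\d+)K->(\d+)K\((\d+)K\)")

# Capacity reported by ParNew in the same record (59008K->59008K(59008K)).
PARNEW_CAPACITY_KB = 59008


def old_gen_free_kb(log_text):
    """Return (free_before_kb, free_after_kb) for the old generation."""
    match = PATTERN.search(log_text)
    if match is None:
        raise ValueError("no concurrent mode failure record found")
    before, after, capacity = (int(group) for group in match.groups())
    return capacity - before, capacity - after


free_before, free_after = old_gen_free_kb(LOG)
print("old gen free before the collection: %d K" % free_before)
print("old gen free after the collection:  %d K" % free_after)
print("free before / ParNew capacity: %.1f" % (free_before / PARNEW_CAPACITY_KB))
```

The free space before the collection works out to 1963896 K (roughly 1.87 GiB),
many times the ParNew capacity. That the promotion still failed is consistent
with fragmentation, though the numbers alone do not prove it.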