ParallelOldGC : Single threaded with one cpu 100% during full gc

Srinivas Ramakrishna ysr1729 at
Mon Nov 5 05:21:14 UTC 2012

CMS might help if you are able to tune it to avoid the full gc pauses
entirely by doing the collection work mostly concurrently. With CMS,
a few CPUs will be busy for a while as CMS runs concurrently with the
application in certain phases. If CMS "loses the race" (or there is
excessive fragmentation), the stop-world full gc will be pretty long and
will be

$ /usr/lib/jvm/jdk1.7.0_05/bin/java -XX:+PrintFlagsFinal -version | grep
     bool CMSCompactWhenClearAllSoftRefs            = true
    uintx CMSFullGCsBeforeCompaction                = 0
     bool CompactFields                             = true
    uintx HeapFirstMaximumCompactionCount           = 3
    uintx HeapMaximumCompactionInterval             = 20
     intx MarkSweepAlwaysCompactCount               = 4
     bool UseCMSCompactAtFullCollection             = true
     bool UseMaximumCompactionOnSystemGC            = true
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b06)
Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode)

The options we used were HeapFirstMaximumCompactionCount and
HeapMaximumCompactionInterval, both set to very large values.
For our use case, this worked reasonably well, but I have seen other
applications where this isn't effective because of the demographics of
large oop-rich objects (large hash maps, for example) as they "churn
through" the old generation rather than sedimenting in the dense prefix.
We didn't need to, and I think you won't need to either, worry about
space lost to deadwood in the dense prefix, because the dense prefix
computation doesn't waste much space. I don't recall the computation,
but John would be able to help.
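
As a rough sketch of the workaround described above: the two flag names come from the PrintFlagsFinal listing, but the value 1000000000 is only a placeholder for "very large" (the actual values used are not stated here), and `app.jar` is a hypothetical application.

```shell
# Placeholder for "very large"; an assumption, not the value actually used.
LARGE=1000000000

# Build the option string step by step so each flag is easy to audit.
GC_OPTS="-XX:+UseParallelOldGC"
GC_OPTS="$GC_OPTS -XX:HeapFirstMaximumCompactionCount=$LARGE"
GC_OPTS="$GC_OPTS -XX:HeapMaximumCompactionInterval=$LARGE"

echo "$GC_OPTS"

# Hypothetical launch line (app.jar is illustrative only):
#   java $GC_OPTS -jar app.jar
# One way to check that the JVM accepted the values:
#   java $GC_OPTS -XX:+PrintFlagsFinal -version | grep -i MaximumCompaction
```

With both thresholds this high, the intent is that the collector keeps relying on the dense prefix and effectively never schedules a maximum compaction.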

-- ramki

On Sun, Nov 4, 2012 at 11:34 AM, lohit <lohit.vijayarenu at> wrote:

> Thanks Srinivas and Krystal for the pointers and information.
> Would CMS help in this case? I see similar behavior with CMS as
> well, wherein one CPU gets busy for quite some time.
> I could try to grab a similar stack trace if that helps.
> From what I read, there is no way around it other than disabling max
> compaction (I could not find any switch to defer it forever without
> disabling it).
> The downside of that is that memory gets fragmented, and application
> performance degrades or the application hits an OOM.
> Any further thoughts on this?
> Thanks again!
> 2012/11/4 Srinivas Ramakrishna <ysr1729 at>
>> This is the deferred pointer updates phase, which is unfortunately
>> single-threaded and occasionally greatly dominates the pause times.
>> I called in a bug for this late last year and there was some discussion
>> on how this could be fixed. I was planning to work on a patch, but never
>> quite got around to it. We worked around it (although the workaround may
>> not always work, see a parallel (hmm) ongoing discussion) by deferring
>> the "maximum compaction" forever, always relying on the dense prefix to
>> demarcate the boundary below which we would never choose to compact. In
>> typical cases I have seen, large oop-rich objects (which in our case
>> sedimented to the bottom of the heap) were the cause of this slowdown.
>> I don't recall the bug id tracking this, but someone on the list may
>> know. I'll try and find the old discussion of this, and send a pointer.
>> -- ramki
>> On Sun, Nov 4, 2012 at 9:54 AM, lohit <lohit.vijayarenu at> wrote:
>>> Hello Devs,
>>> We are trying to profile Full GC performance on our Java server.
>>> Heap size is about 80G, running on CentOS 5.5. As of now we use
>>> ParallelOldGC for old generation compactions.
>>> When we trigger a full GC by hand, we see that it takes more than 2 minutes.
>>> Running mpstat showed that for about half of that time only one CPU was
>>> spinning at 100% and all others were idle. (The box has 24 CPUs.)
>>> While this was happening we took a stack trace and saw that all threads
>>> were waiting behind one thread, whose trace looks like the one below.
>>> The question is whether this is expected. Even though the documentation
>>> says ParallelOldGC uses multiple threads, are there cases where this kind
>>> of serialization happens, with all threads waiting behind a single thread?
>>> I may have given very little information about the problem, but let me
>>> know if I can add anything more.
>>> Thread 18 (Thread 0x419f2940 (LWP 55314)):
>>> #0  0x00007f051735e3e0 in BitMap::get_next_one_offset_inline_aligned_right(unsigned long, unsigned long) const ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #1  0x00007f051735e09e in ParMarkBitMap::live_words_in_range(HeapWord*, oopDesc*) const ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #2  0x00007f0517398848 in ParallelCompactData::calc_new_pointer(HeapWord*) ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #3  0x00007f05173413ec in objArrayKlass::oop_update_pointers(ParCompactionManager*, oopDesc*) ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #4  0x00007f051739d0a7 in PSParallelCompact::update_deferred_objects(ParCompactionManager*, PSParallelCompact::SpaceId) ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #5  0x00007f051739c6c7 in PSParallelCompact::compact() () from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #6  0x00007f051739aeca in PSParallelCompact::invoke_no_policy(bool) ()
>>>    from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #7  0x00007f051739a845 in PSParallelCompact::invoke(bool) () from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #8  0x00007f05174a0bb0 in VM_ParallelGCSystemGC::doit() () from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> #9  0x00007f05174ada5a in VM_Operation::evaluate() () from /usr/java/jdk1.6.0_24/jre/lib/amd64/server/
>>> (More stack frames follow...)
>>> --
>>> Have a Nice Day!
>>> Lohit
> --
> Have a Nice Day!
> Lohit

More information about the hotspot-gc-dev mailing list