Unexplained long stop the world pauses during concurrent marking step in G1 Collector

Krishnamurthy, Kannan Kannan.Krishnamurthy at contractor.cengage.com
Tue Aug 26 16:43:15 UTC 2014


Yes this happens at every single concurrent mark step and consistently reproducible. Still trying to get some native stack traces, initial tries with gstack was terminating the JVM. Thank you.
________________________________
From: Srinivas Ramakrishna [ysr1729 at gmail.com]
Sent: Monday, August 25, 2014 8:56 PM
To: Krishnamurthy, Kannan
Cc: hotspot-gc-dev at openjdk.java.net; kndkannan at gmail.com
Subject: Re: Unexplained long stop the world pauses during concurrent marking step in G1 Collector

Does this happen at each concurrent marking phase, or only at some?
If you can reproduce the problem under controlled conditions, you might consider starting by
collecting a few native stack traces (pstack) in quick succession during the marking phase
(if you react fast enough! -- I'd suggest running pstack in a loop next to the JVM when heap
occupancy approaches 65%, and you can look at the ones corresponding to the pause during
the marking phase.)

-- ramki


On Mon, Aug 25, 2014 at 8:11 AM, Krishnamurthy, Kannan <Kannan.Krishnamurthy at contractor.cengage.com<mailto:Kannan.Krishnamurthy at contractor.cengage.com>> wrote:
Greetings,

We are experiencing unexplained/unknown long pauses (8 seconds) during concurrent marking step of G1 collector.

    2014-08-07T13:42:30.552-0400: 92183.303: [GC concurrent-root-region-scan-start]
    2014-08-07T13:42:30.555-0400: 92183.305: [GC concurrent-root-region-scan-end, 0.0025230 secs]
    **2014-08-07T13:42:30.555-0400: 92183.305: [GC concurrent-mark-start]**
    **2014-08-07T13:42:39.598-0400: 92192.348: Application time: 9.0448670 seconds**
    2014-08-07T13:42:39.601-0400: 92192.351: Total time for which application threads were stopped: 0.0029740 seconds
    2014-08-07T13:42:39.603-0400: 92192.353: [GC pause (G1 Evacuation Pause) (young) 92192.354: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 7980, predicted base time: 28.19 ms, remaining time: 71.81 ms, target pause time: 100.00 ms


`2014-08-07T13:42:30.555-0400: 92183.305` is when the concurrent mark starts, approximately after 2 seconds of this step the application starts to pause. However the GC logs claims the application was not paused during this window.
Linux "top" shows single CPU at 100% and rest of the CPUs at 0% during the pause.

Any help in understanding the root cause of this issue is appreciated.

Our target JVMS:

    java version "1.7.0_04"
    Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
    Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)

    java version "1.8.0_11"
    Java(TM) SE Runtime Environment (build 1.8.0_11-b12)
    Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)


Our JVM options :

    -Xms20G -Xmx20G -Xss10M -XX:PermSize=128M -XX:MaxPermSize=128M -XX:MarkStackSize=16M
    -XX:+UnlockDiagnosticVMOptions -XX:+G1PrintRegionLivenessInfo -XX:+TraceGCTaskThread
    -XX:+G1SummarizeConcMark -XX:+G1SummarizeRSetStats -XX:+G1TraceConcRefinement
    -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=65
    -XX:ParallelGCThreads=24 -verbose:gc -XX:+PrintGC -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintTenuringDistribution
    -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime
    -Xloggc:/common/logs/ocean-partition-gc.log


Thanks and regards,
Kannan


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20140826/e4ba7a55/attachment.html>


More information about the hotspot-gc-dev mailing list