Abnormal high sys time in G1 GC

Roy Zhang roy.sunny.zhang007 at gmail.com
Mon Jan 20 10:54:37 UTC 2020


Sent to hotspot-gc-dev mail list as well :)

Thank you for ur help in advance!!!

Thanks,
Roy

On Mon, Jan 20, 2020 at 6:22 PM Roy Zhang <roy.sunny.zhang007 at gmail.com>
wrote:

> Dear JVM experts,
>
> Recently we found GC spike (long STW minor GC), and sys time is high when
> we GC time is high. Normally sys time is near 0 seconds and minor GC is
> less than 500ms.
>
> From
> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2017-October/020630.html
>  and https://blog.gceasy.io/2016/12/11/sys-time-greater-than-user-time/,
> high sys time could be caused by operation system problem/VM related
> problem/memory constraint/disk IO pressure/Transparent Huge Pages.
>
> I checked them one by one, don't find any clue, could u please kindly
> provide suggestion? Thanks in advance!
>
> 1.operation system problem
> --We have enough CPU/memory/disk (48 cpu cores + 373 RAM with 160G heap,
> disk is enough), and there is no error in /var/log/dmesg
> 2. memory constraint
> -- We have enough available memory. available memory (free -m) is 263G
> 3. disk IO pressure
> -- Not find issue from disk info from prometheus node exporter.
> Granularity is 15s, and I can't find counterpart of avgqu-sz & util metrics
> (disk IO util and saturation metrics)  which is part of iostat. It could be
> caused by big Granularity???
> 4. VM related problem
> -- We are using physical machine
> 5. Transparent Huge Pages.
> It is madvise. It could be a problem, but we don't have this issue
> previously. It has been running for nearly 20 weeks.
>
> *cat /sys/kernel/mm/transparent_hugepage/enabledalways [madvise] never*
>
> *JDK version:*
> OpenJDK Runtime Environment, 1.8.0_222-b10
>
> *Java Opts:*
> -javaagent:/server/jmx_prometheus_javaagent-0.12.0.jar=xxxx:/server/config.yaml
>
> -server
> -Dcom.sun.management.jmxremote
> -Dcom.sun.management.jmxremote.port=xxxx
> -Dcom.sun.management.jmxremote.rmi.port=xxxx
> -Dcom.sun.management.jmxremote.local.only=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dcom.sun.management.jmxremote.ssl=false
> -Xloggc:/server/xxxx.log
> -XX:+PrintGCDateStamps
> -XX:AutoBoxCacheMax=1000000
> -XX:+UseG1GC
> -XX:MaxGCPauseMillis=500
> -XX:+UnlockExperimentalVMOptions
> -XX:G1NewSizePercent=50
> -XX:InitiatingHeapOccupancyPercent=70
> -XX:+ParallelRefProcEnabled
> -XX:+ExplicitGCInvokesConcurrent
> -XX:+UseStringDeduplication
> -XX:+PrintGCDetails
> -XX:+PrintGCTimeStamps
> -Xms160g
> -Xmx160g
> -XX:+HeapDumpOnOutOfMemoryError
>
> *Snippet of GC log:*
>
> 2020-01-20T07:27:03.166+0000: 2756.665: [GC pause (G1 Evacuation Pause)
> (young), *6.2899024 secs*]
>    [Parallel Time: 6255.0 ms, GC Workers: 33]
>       [GC Worker Start (ms): Min: 2756664.9, Avg: 2756665.5, Max:
> 2756666.1, Diff: 1.2]
>       [Ext Root Scanning (ms): Min: 0.0, Avg: 0.5, Max: 5.3, Diff: 5.3,
> Sum: 16.8]
>       [Update RS (ms): Min: 0.0, Avg: 0.8, Max: 1.1, Diff: 1.1, Sum: 25.6]
>          [Processed Buffers: Min: 0, Avg: 1.6, Max: 4, Diff: 4, Sum: 53]
>       [Scan RS (ms): Min: 142.0, Avg: 145.3, Max: 146.4, Diff: 4.4, Sum:
> 4794.1]
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.3, Max: 3.5, Diff: 3.5,
> Sum: 8.8]
>      * [Object Copy (ms): Min: 6100.1, Avg: 6101.8, Max: 6106.5, Diff:
> 6.4, Sum: 201358.4]*
>       [Termination (ms): Min: 0.1, Avg: 5.2, Max: 6.7, Diff: 6.6, Sum:
> 172.9]
>          [Termination Attempts: Min: 1, Avg: 1353.0, Max: 1476, Diff:
> 1475, Sum: 44650]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum:
> 7.0]
>       [GC Worker Total (ms): Min: 6253.4, Avg: 6254.1, Max: 6254.7, Diff:
> 1.2, Sum: 206383.7]
>       [GC Worker End (ms): Min: 2762919.4, Avg: 2762919.6, Max: 2762919.8,
> Diff: 0.4]
>    [Code Root Fixup: 0.6 ms]
>    [Code Root Purge: 0.0 ms]
>    [String Dedup Fixup: 0.7 ms, GC Workers: 33]
>       [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
>       [Table Fixup (ms): Min: 0.0, Avg: 0.1, Max: 0.6, Diff: 0.6, Sum: 2.0]
>    [Clear CT: 4.0 ms]
>    [Other: 29.6 ms]
>       [Choose CSet: 0.1 ms]
>       [Ref Proc: 10.3 ms]
>       [Ref Enq: 0.6 ms]
>       [Redirty Cards: 11.3 ms]
>       [Humongous Register: 0.2 ms]
>       [Humongous Reclaim: 0.0 ms]
>       [Free CSet: 6.5 ms]
>    [Eden: 72576.0M(72576.0M)->0.0B(80896.0M) Survivors: 9344.0M->1024.0M
> Heap: 83520.0M(160.0G)->11046.9M(160.0G)]
> * [Times: user=27.19 sys=162.28, real=6.30 secs] *
>
> 2020-01-20T06:59:23.382+0000: 1096.881: [GC pause (G1 Evacuation Pause)
> (young) (initial-mark), *4.1248088 secs*]
>    [Parallel Time: 4098.0 ms, GC Workers: 33]
>       [GC Worker Start (ms): Min: 1096882.1, Avg: 1096882.8, Max:
> 1096883.2, Diff: 1.2]
>       [Ext Root Scanning (ms): Min: 4.0, Avg: 4.8, Max: 6.1, Diff: 2.0,
> Sum: 159.7]
>       [Update RS (ms): Min: 0.0, Avg: 0.3, Max: 1.1, Diff: 1.1, Sum: 9.5]
>          [Processed Buffers: Min: 0, Avg: 1.3, Max: 6, Diff: 6, Sum: 43]
>      * [Scan RS (ms): Min: 2001.2, Avg: 2012.2, Max: 2013.4, Diff: 12.2,
> Sum: 66401.0]*
>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.6, Max: 10.7, Diff: 10.7,
> Sum: 18.5]
>    *   [Object Copy (ms): Min: 2039.3, Avg: 2049.2, Max: 2079.5, Diff:
> 40.2, Sum: 67623.1]*
>       [Termination (ms): Min: 0.0, Avg: 29.6, Max: 39.7, Diff: 39.7, Sum:
> 978.0]
>          [Termination Attempts: Min: 1, Avg: 6587.0, Max: 8068, Diff:
> 8067, Sum: 217372]
>       [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum:
> 7.9]
>       [GC Worker Total (ms): Min: 4096.3, Avg: 4096.9, Max: 4097.7, Diff:
> 1.4, Sum: 135197.8]
>       [GC Worker End (ms): Min: 1100979.5, Avg: 1100979.7, Max: 1100979.9,
> Diff: 0.4]
>    [Code Root Fixup: 0.6 ms]
>    [Code Root Purge: 0.2 ms]
>    [String Dedup Fixup: 1.0 ms, GC Workers: 33]
>       [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
>       [Table Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.7, Diff: 0.7, Sum: 1.4]
>    [Clear CT: 3.4 ms]
>    [Other: 21.7 ms]
>       [Choose CSet: 0.0 ms]
>       [Ref Proc: 9.1 ms]
>       [Ref Enq: 0.9 ms]
>       [Redirty Cards: 4.3 ms]
>       [Humongous Register: 0.2 ms]
>       [Humongous Reclaim: 0.0 ms]
>       [Free CSet: 5.3 ms]
>    [Eden: 81184.0M(81184.0M)->0.0B(72576.0M) Survivors: 736.0M->9344.0M
> Heap: 83508.0M(160.0G)->10944.0M(160.0G)]
>
> * [Times: user=68.40 sys=9.11, real=4.13 secs] *
>
> Thanks,
> Roy
>


More information about the hotspot-gc-dev mailing list