Reducing CMS-remark times

Justin Ellison justin at techadvise.com
Thu Sep 11 09:51:02 PDT 2008


Hi all,

Running a Weblogic jsp application on 1.4.2_17 on 32bit Sparc under
Solaris 9.  Here are the args:
-Xms2304m -Xmx2304m -XX:NewSize=384m -XX:MaxNewSize=384m
-XX:CompileThreshold=3000 -Djava.net.setSoTimeout=20000
-XX:LargePageSizeInBytes=4m -XX:+UseMPSS -Xss128k
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
-Xnoclassgc -XX:ParallelGCThreads=4 -XX:MaxTenuringThreshold=8
-XX:SurvivorRatio=6 -XX:+UseCMSCompactAtFullCollection -Xloggc:gc.out
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
-XX:MaxPermSize=92m

I'm seeing CMS-remark pauses that get longer and longer as time goes
on, some getting to be 3 seconds or more.

It's apples to oranges, I know, but my Core 2 Duo laptop running
Ubuntu 8.04 and 1.4.2_18 with the same arguments under a load
generator performs consistently better than production.  The
production application is on a Sun Fire v490 - it should run circles
around my laptop.

Here's the output of PrintGCStats against my laptop:
what         count       total      mean      max   stddev
gen0(s)       8129    1337.382   0.16452    0.812   0.0663
gen0t(s)      8129    1341.128   0.16498    1.132   0.0676
cmsIM(s)      4504     896.872   0.19913    0.542   0.1039
cmsRM(s)      4485    2129.909   0.47490    2.121   0.2060
GC(s)        12633    4367.908   0.34575    2.121   0.1184
cmsCM(s)      4503   37508.595   8.32969   45.481   1.5792
cmsCP(s)      4490    1158.141   0.25794   26.034   0.6600
cmsCS(s)      4485    6408.843   1.42895   11.013   0.3527
cmsCR(s)      4480     162.617   0.03630    0.441   0.0128
alloc(MB)     8129  2340385.405  287.90570  288.000   1.6059
promo(MB)     8129   30006.814   3.69133   73.405   4.4811

and against the v490:
bmapp2d-gc.out

what         count       total      mean      max   stddev
gen0(s)       4739     917.710   0.19365    0.765   0.0483
gen0t(s)      4739     919.012   0.19393    0.765   0.0483
cmsIM(s)        34       4.421   0.13003    0.510   0.0728
cmsRM(s)        34      52.707   1.55019    2.327   0.4003
GC(s)         4773     976.140   0.20451    2.327   0.1273
cmsCM(s)        34     455.876  13.40812   26.797   2.9626
cmsCP(s)        34       5.279   0.15526    0.285   0.0572
cmsCS(s)        34     265.217   7.80050    8.741   0.6182
cmsCR(s)        34       8.585   0.25250    0.295   0.0137
alloc(MB)     4739  1364784.244  287.98992  288.000   0.0525
promo(MB)     4739   15492.045   3.26905   19.177   3.1992

alloc/elapsed_time    = 1364784.244 MB /  77242.704 s  =  17.669 MB/s
alloc/tot_cpu_time    = 1364784.244 MB / 617941.632 s  =   2.209 MB/s
alloc/mut_cpu_time    = 1364784.244 MB / 609397.559 s  =   2.240 MB/s
promo/elapsed_time    =  15492.045 MB /  77242.704 s  =   0.201 MB/s
promo/gc0_time        =  15492.045 MB /    919.012 s  =  16.857 MB/s
gc_seq_load           =   7809.116 s  / 617941.632 s  =   1.264%
gc_conc_load          =    734.957 s  / 617941.632 s  =   0.119%
gc_tot_load           =   8544.073 s  / 617941.632 s  =   1.383%

Especially of note it the difference in mean remark times between the
two above.  I've placed full log snippets at the bottom of this email.

Here's my current trains of thought:

a) My load generator is not very close to real-world load.
b) There are some OS level tunables that need set on the v490
c) There is a bug in 1.4.2_17 that's biting me.
d) I'm not getting concurrency in the remark phase (which would
explain my dual core laptop keeping up with my 8 core server)
d) I'm running into what Jon is describing here:
http://blogs.sun.com/jonthecollector/entry/did_you_know regarding
CMSMaxAbortablePrecleanTime.  I have no idea how I can resolve this on
1.4.2 if that's the case.  Perhaps shrink my New???

Bah!  I've been working on this for too long, and I'm going in
circles.  Can anyone offer up any insights?

Justin


Production server:
....
18959.886: [GC 18959.886: [ParNew: 322704K->29504K(344064K), 0.2497164
secs] 1912832K->1624273K(2310144K), 0.2498432 secs]
18968.737: [GC 18968.737: [ParNew: 324416K->22709K(344064K), 0.2113128
secs] 1919185K->1625475K(2310144K), 0.2114685 secs]
18981.220: [GC 18981.220: [ParNew: 317621K->29164K(344064K), 0.1744130
secs] 1920387K->1631931K(2310144K), 0.1746267 secs]
18987.149: [GC 18987.149: [ParNew: 324076K->24086K(344064K), 0.2326857
secs] 1926843K->1634489K(2310144K), 0.2328768 secs]
18987.382: [GC [1 CMS-initial-mark: 1610403K(1966080K)]
1634489K(2310144K), 0.1155761 secs]
18987.498: [CMS-concurrent-mark-start]
18999.599: [CMS-concurrent-mark: 12.101/12.101 secs]
18999.599: [CMS-concurrent-preclean-start]
18999.698: [CMS-concurrent-preclean: 0.094/0.099 secs]
18999.700: [GC18999.700: [Rescan (parallel) , 1.0845970
secs]19000.785: [weak refs processing, 1.0721075 secs] [1 CMS-remark:
1610403K(1966080K)] 1894660K(2310144K), 2.1578777 secs]
19001.858: [CMS-concurrent-sweep-start]
19002.043: [GC 19002.043: [ParNew: 318998K->28072K(344064K), 0.1886607
secs] 1926098K->1635173K(2310144K), 0.1888124 secs]
19009.800: [CMS-concurrent-sweep: 7.751/7.942 secs]
19009.800: [CMS-concurrent-reset-start]
19010.051: [CMS-concurrent-reset: 0.250/0.250 secs]
19015.384: [GC 19015.384: [ParNew: 322984K->30317K(344064K), 0.2178368
secs] 1496676K->1207838K(2310144K), 0.2179773 secs]
19025.841: [GC 19025.842: [ParNew: 325229K->27691K(344064K), 0.2313683
secs] 1502750K->1213013K(2310144K), 0.2315669 secs]
....

Laptop:
12436.567: [GC 12436.567: [ParNew: 322821K->28170K(344064K), 0.4482710
secs] 1820614K->1532726K(2310144K), 0.4483510 secs]
12438.456: [GC 12438.456: [ParNew: 323082K->27352K(344064K), 0.3058890
secs] 1827638K->1537746K(2310144K), 0.3059690 secs]
12440.072: [GC 12440.072: [ParNew: 322264K->22917K(344064K), 0.2582750
secs] 1832658K->1543612K(2310144K), 0.2583510 secs]
12441.594: [GC 12441.594: [ParNew: 317829K->31252K(344064K), 0.2623760
secs] 1838524K->1551947K(2310144K), 0.2624510 secs]
12443.136: [GC 12443.137: [ParNew: 326164K->49152K(344064K), 0.5364920
secs] 1846859K->1594211K(2310144K), 0.5365690 secs]
12443.694: [GC [1 CMS-initial-mark: 1545059K(1966080K)]
1597435K(2310144K), 0.2112670 secs]
12443.905: [CMS-concurrent-mark-start]
12445.287: [GC 12445.287: [ParNew: 344063K->12141K(344064K), 0.5005420
secs] 1889123K->1604975K(2310144K), 0.5006200 secs]
12447.348: [GC 12447.348: [ParNew: 307053K->21747K(344064K), 0.2396110
secs] 1899887K->1614581K(2310144K), 0.2396900 secs]
12449.061: [GC 12449.061: [ParNew: 316659K->29249K(344064K), 0.2359550
secs] 1909493K->1622083K(2310144K), 0.2360330 secs]
12450.740: [GC 12450.740: [ParNew: 324161K->30724K(344064K), 0.3210140
secs] 1916995K->1631546K(2310144K), 0.3210850 secs]
12452.539: [GC 12452.539: [ParNew: 325636K->28577K(344064K), 0.3849060
secs] 1926458K->1639437K(2310144K), 0.3849890 secs]
12454.548: [GC 12454.548: [ParNew: 323489K->30308K(344064K), 0.3750850
secs] 1934349K->1650841K(2310144K), 0.3751660 secs]
12455.880: [CMS-concurrent-mark: 9.839/11.975 secs]
12455.880: [CMS-concurrent-preclean-start]
12457.341: [CMS-concurrent-preclean: 1.200/1.461 secs]
12457.363: [GC12457.363: [Rescan (parallel) , 1.9972650
secs]12459.360: [weak refs processing, 0.1239550 secs] [1 CMS-remark:
1620532K(1966080K)] 1941855K(2310144K), 2.1214010 secs]
12459.499: [CMS-concurrent-sweep-start]
12459.509: [GC 12459.509: [ParNew: 325220K->44523K(344064K), 0.4837280
secs] 1945752K->1673386K(2310144K), 0.4838110 secs]
12461.460: [CMS-concurrent-sweep: 1.455/1.961 secs]
12461.461: [CMS-concurrent-reset-start]
12461.496: [CMS-concurrent-reset: 0.036/0.036 secs]
12462.802: [GC 12462.802: [ParNew: 339435K->49152K(344064K), 0.6503040
secs] 1936692K->1669076K(2310144K), 0.6503840 secs]
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use



More information about the hotspot-gc-dev mailing list