From bartosz.markocki at gmail.com Fri Apr 1 02:16:31 2011
From: bartosz.markocki at gmail.com (Bartek Markocki)
Date: Fri, 1 Apr 2011 11:16:31 +0200
Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%?
Message-ID:

Hi all,

Could any of you review the attached extracts from our production GC log and share your thoughts about them?

We have a router-type web application running under Tomcat 6.0.28 with Java 1.6.0_21 (64-bit) on RHEL 5.2 (2.6.18-92.1.22.el5). The GC settings are:
-Xmx2048m -Xms2048m -XX:NewSize=1024m
-XX:PermSize=64m -XX:MaxPermSize=128m
-XX:ThreadStackSize=128
-XX:+DisableExplicitGC
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:+PrintGCDetails

What we did:

We recently switched from ParallelOld to CMS because of unacceptably long Full GC pauses. In preparation for the change of collector we ran many GC tuning tests and found that the above (simple) set of settings fulfills our needs best.
So far we are happy with what we see (frequency of minor scans/CMS cycles, duration of STW pauses), with one exception.

What is the problem:

Some of our remark phases last much longer than others (up to 8 times longer on average). A normal remark phase lasts between 55 and 90 ms; the longest one lasted 538 ms.
At first we thought this was caused by the preceding abortable-preclean phase being aborted. After a closer look we found that, depending on the volume of traffic (i.e., time of day), some of our abortable-preclean phases are indeed aborted due to the time limit (5 sec). Even so, most of the subsequent remark phases still stay within an acceptable limit (up to 100 ms). So we kept digging, and found that the abnormally long remark phases are preceded by an aborted abortable-preclean phase. The phase was always aborted due to the time limit; however, the young generation occupancy reported right afterwards shows that in all of these cases the YG was far more than 50% occupied.
Per my (current :)) understanding, the abortable-preclean phase can be aborted either because of the time limit or because the YG has become about 50% full (so that the remark phase happens midway between two minor collections), whichever comes first. In our case the 'about 50%' condition does not trigger and the phase continues until it hits the time limit. The following remark phase then always lasts longer, i.e., 350-550 ms.

The big question:

What can we do to cut down the time of those long-lasting remark phases?

Below I enclose three samples from our GC log:
first one - a CMS cycle in which the abortable-preclean phase was aborted due to the time limit, but the following remark phase does not show the abnormal behavior.
second one - an "ideal" CMS cycle
third one - a CMS cycle in which the abortable-preclean phase was aborted (due to the time limit, even though YG occupancy is much greater than 50%) and the following remark phase lasts 0.5 second.

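To make the occupancy figures easier to compare, here is a minimal Python sketch (an editorial illustration, not part of the original message) that extracts the young generation occupancy and the parallel rescan time from the three remark entries in this message's samples. The regular expression and the name `summarize` are assumptions made for this sketch; the log text is copied from the samples and truncated after the rescan time.

```python
import re

# Remark entries from the three samples in this message,
# truncated after the "Rescan (parallel)" timing.
REMARK_LINES = [
    "1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: [Rescan (parallel) , 0.0389910 secs]",
    "1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: [Rescan (parallel) , 0.0746810 secs]",
    "1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: [Rescan (parallel) , 0.5387660 secs]",
]

# Hypothetical pattern for this sketch: used KB, capacity KB, rescan seconds.
PATTERN = re.compile(
    r"YG occupancy: (\d+) K \((\d+) K\)\].*?Rescan \(parallel\) , ([\d.]+) secs"
)


def summarize(line):
    """Return (YG occupancy in percent, rescan seconds), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    used_kb = int(match.group(1))
    capacity_kb = int(match.group(2))
    rescan_secs = float(match.group(3))
    return round(100.0 * used_kb / capacity_kb, 1), rescan_secs


for line in REMARK_LINES:
    print(summarize(line))
```

On these three entries the sketch reports roughly 43%, 51% and 84% occupancy, with the slow rescan (about 0.54 s) belonging to the 84% case.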
--
1142110.458: [GC 1142110.458: [ParNew: 888646K->45370K(943744K), 0.0728880 secs] 1852227K->1013124K(1992320K), 0.0739250 secs] [Times: user=0.33 sys=0.01, real=0.07 secs]
1142110.547: [GC [1 CMS-initial-mark: 967753K(1048576K)] 1013331K(1992320K), 0.0540170 secs] [Times: user=0.06 sys=0.00, real=0.05 secs]
1142110.602: [CMS-concurrent-mark-start]
1142111.010: [CMS-concurrent-mark: 0.408/0.408 secs] [Times: user=1.96 sys=0.07, real=0.41 secs]
1142111.011: [CMS-concurrent-preclean-start]
1142111.028: [CMS-concurrent-preclean: 0.016/0.017 secs] [Times: user=0.02 sys=0.00, real=0.02 secs]
1142111.028: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time 1142116.036: [CMS-concurrent-abortable-preclean: 4.858/5.007 secs] [Times: user=7.31 sys=0.57, real=5.00 secs]
1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: [Rescan (parallel) , 0.0389910 secs]1142116.090: [weak refs processing, 0.0156130 secs] [1 CMS-remark: 967753K(1048576K)] 1377393K(1992320K), 0.0554700 secs] [Times: user=0.50 sys=0.00, real=0.06 secs]
1142116.107: [CMS-concurrent-sweep-start]
1142117.721: [CMS-concurrent-sweep: 1.614/1.614 secs] [Times: user=2.41 sys=0.24, real=1.61 secs]
1142117.721: [CMS-concurrent-reset-start]
1142117.732: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1142121.278: [GC 1142121.279: [ParNew: 884282K->52652K(943744K), 0.0680850 secs] 1200273K->372087K(1992320K), 0.0690040 secs] [Times: user=0.29 sys=0.01, real=0.07 secs]
1142133.508: [GC 1142133.508: [ParNew: 891564K->47435K(943744K), 0.0682080 secs] 1210999K->370280K(1992320K), 0.0691030 secs] [Times: user=0.29 sys=0.01, real=0.07 secs]
--
1165584.305: [GC 1165584.305: [ParNew: 896212K->59055K(943744K), 0.0761290 secs] 1857148K->1023947K(1992320K), 0.0771330 secs] [Times: user=0.33 sys=0.00, real=0.08 secs]
1165584.398: [GC [1 CMS-initial-mark: 964891K(1048576K)] 1024053K(1992320K), 0.0631010 secs] [Times: user=0.06 sys=0.00, real=0.06 secs]
1165584.463: [CMS-concurrent-mark-start]
1165584.933: [CMS-concurrent-mark: 0.423/0.471 secs] [Times: user=2.40 sys=0.21, real=0.47 secs]
1165584.934: [CMS-concurrent-preclean-start]
1165584.954: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: user=0.05 sys=0.00, real=0.02 secs]
1165584.955: [CMS-concurrent-abortable-preclean-start]
1165587.876: [CMS-concurrent-abortable-preclean: 2.884/2.921 secs] [Times: user=5.51 sys=0.65, real=2.92 secs]
1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: [Rescan (parallel) , 0.0746810 secs]1165587.967: [weak refs processing, 0.0168870 secs] [1 CMS-remark: 964891K(1048576K)] 1443943K(1992320K), 0.0925600 secs] [Times: user=0.91 sys=0.01, real=0.09 secs]
1165587.986: [CMS-concurrent-sweep-start]
1165589.670: [CMS-concurrent-sweep: 1.684/1.684 secs] [Times: user=3.39 sys=0.46, real=1.69 secs]
1165589.671: [CMS-concurrent-reset-start]
1165589.679: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
1165591.354: [GC 1165591.354: [ParNew: 897967K->54984K(943744K), 0.0862910 secs] 1236513K->397404K(1992320K), 0.0872930 secs] [Times: user=0.34 sys=0.00, real=0.09 secs]
1165598.887: [GC 1165598.888: [ParNew: 893896K->52086K(943744K), 0.0885510 secs] 1236316K->398587K(1992320K), 0.0895820 secs] [Times: user=0.31 sys=0.01, real=0.09 secs]
--
1166753.770: [GC 1166753.770: [ParNew: 899148K->57315K(943744K), 0.0782510 secs] 1862058K->1024198K(1992320K), 0.0793040 secs] [Times: user=0.32 sys=0.01, real=0.08 secs]
1166753.867: [GC [1 CMS-initial-mark: 966883K(1048576K)] 1024305K(1992320K), 0.0642680 secs] [Times: user=0.07 sys=0.00, real=0.07 secs]
1166753.932: [CMS-concurrent-mark-start]
1166754.471: [CMS-concurrent-mark: 0.486/0.538 secs] [Times: user=2.76 sys=0.28, real=0.54 secs]
1166754.471: [CMS-concurrent-preclean-start]
1166754.488: [CMS-concurrent-preclean: 0.015/0.017 secs] [Times: user=0.04 sys=0.00, real=0.01 secs]
1166754.488: [CMS-concurrent-abortable-preclean-start]
CMS: abort preclean due to time 1166759.533: [CMS-concurrent-abortable-preclean: 4.895/5.044 secs] [Times: user=9.75 sys=1.21, real=5.05 secs]
1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: [Rescan (parallel) , 0.5387660 secs]1166760.088: [weak refs processing, 0.0139780 secs] [1 CMS-remark: 966883K(1048576K)] 1758080K(1992320K), 0.5537750 secs] [Times: user=5.58 sys=0.06, real=0.56 secs]
1166760.105: [CMS-concurrent-sweep-start]
1166760.688: [GC 1166760.689: [ParNew: 896188K->57161K(943744K), 0.0727850 secs] 1623884K->788963K(1992320K), 0.0737390 secs] [Times: user=0.31 sys=0.02, real=0.08 secs]
1166761.593: [CMS-concurrent-sweep: 1.363/1.488 secs] [Times: user=3.48 sys=0.49, real=1.49 secs]
1166761.593: [CMS-concurrent-reset-start]
1166761.602: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: user=0.02 sys=0.01, real=0.01 secs]
1166767.947: [GC 1166767.948: [ParNew: 896053K->58188K(943744K), 0.0817680 secs] 1238926K->404605K(1992320K), 0.0828270 secs] [Times: user=0.31 sys=0.01, real=0.08 secs]
--

Thank you in advance,
Bartek
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From jesper.wilhelmsson at oracle.com Fri Apr 1 04:44:03 2011
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Fri, 01 Apr 2011 13:44:03 +0200
Subject: CRR: 7027766: G1: introduce flag to dump the liveness information per region at the end of marking (S)
In-Reply-To: <4D8CCECA.80603@oracle.com>
References: <4D8CCECA.80603@oracle.com>
Message-ID: <4D95BA83.2070706@oracle.com>

Tony,

Why are the new formatting macros G1 specific? Wouldn't it be better to have a generic set of formatting macros that we can use all over the GC code?

I'm not saying we should change the rest of the GC code to use these macros right now, but since you are introducing new functionality I think it would be nice if that functionality could be used by all GCs.
/Jesper

On 03/25/2011 06:20 PM, Tony Printezis wrote:
> Hi,
>
> I'd like a couple of reviewers to have a look at this change:
>
> http://cr.openjdk.java.net/~tonyp/7027766/webrev.0/
>
> I ended up repurposing the existing develop flag G1PrintRegionLivenessInfo for
> this (I had actually forgotten it was there!) and I print the liveness
> information at the end of marking twice:
>
> - Info for all regions after we finalize the marking information, which means
> we have both the latest marking information as well as the previous marking
> information available.
> - The same info but for all sorted old regions.
>
> I attached example output.
>
> Tony

From tony.printezis at oracle.com Fri Apr 1 05:48:11 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Fri, 01 Apr 2011 08:48:11 -0400
Subject: CRR: 7027766: G1: introduce flag to dump the liveness information per region at the end of marking (S)
In-Reply-To: <4D95BA83.2070706@oracle.com>
References: <4D8CCECA.80603@oracle.com> <4D95BA83.2070706@oracle.com>
Message-ID: <4D95C98B.3030400@oracle.com>

Jesper,

Not quite sure what you mean by "introducing new functionality" here. I introduced the formatting macros to keep the output of this flag consistent and to be able to easily change it uniformly. The macros basically specify things like "a size column will be 10 characters long, it will have two spaces before it, and make sure you format its header the same way". They are prefixed with G1 to avoid name clashes. I don't think this is something applicable anywhere else.

Tony

From jesper.wilhelmsson at oracle.com Fri Apr 1 09:01:06 2011
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Fri, 1 Apr 2011 18:01:06 +0200
Subject: CRR: 7027766: G1: introduce flag to dump the liveness information per region at the end of marking (S)
In-Reply-To: <4D95C98B.3030400@oracle.com>
References: <4D8CCECA.80603@oracle.com> <4D95BA83.2070706@oracle.com> <4D95C98B.3030400@oracle.com>
Message-ID: <0DF04653-B0F1-4650-A4A7-430EE4E4D169@oracle.com>

Hmm, ok. If we find that we want to use it in some other collector we can always move it to a more generic package at that point.

Btw, if everyone has agreed on the format by now, the actual change looks fine. Ship it!
/Jesper

From y.s.ramakrishna at oracle.com Fri Apr 1 09:30:19 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Fri, 01 Apr 2011 09:30:19 -0700
Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%?
In-Reply-To:
References:
Message-ID: <4D95FD9B.9080909@oracle.com>

Hi Bartek --

Try -XX:+CMSScavengeBeforeRemark as a temporary workaround
for this, and let us know if the performance is reasonable
or not.

I'll look at your log (can you send me your whole GC log,
showing the problem, off-list?).

I think there's probably an open CR for this, which I'll
dig up for you.

-- ramki

On 4/1/2011 2:16 AM, Bartek Markocki wrote:
> Hi all,
>
> Can I ask any of you to review the attached extracts from our
> production GC log and share your thoughts about them?
_______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bartosz.markocki at gmail.com Fri Apr 1 10:40:26 2011 From: bartosz.markocki at gmail.com (Bartek Markocki) Date: Fri, 1 Apr 2011 19:40:26 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <4D95FD9B.9080909@oracle.com> References: <4D95FD9B.9080909@oracle.com> Message-ID: Hi Ramki, On Fri, Apr 1, 2011 at 6:30 PM, Y. Srinivas Ramakrishna wrote: > Try -XX:+CMSSCavengeBeforeRemark as a temporary workaround > for this, and let us know if the performance is reasonable > or not. We will try to push the +CMSScavengeBeforeRemark to our production but as we are talking about the production environment it might take some time to return to you with the results. > I'll look at your log (can you send me your whole GC log, > showing the problem, off-list?). Just did. > I think there's probably an open CR for this, which i'll > dig up for you. Thanks a lot! Bartek > On 4/1/2011 2:16 AM, Bartek Markocki wrote: >> >> Hi all, >> >> Can I ask any of you to review the attached extracts from our >> production GC log and share your thoughts about them? >> >> We have a router-type web application running under tomcat 6.0.28 with >> Java 1.6.0_21 (64bit) on RHEL 5.2 (2.6.18-92.1.22.el5). The GC >> settings are: >> -Xmx2048m -Xms2048m -XX:NewSize=1024m >> -XX:PermSize=64m -XX:MaxPermSize=128m >> -XX:ThreadStackSize=128 >> -XX:+DisableExplicitGC >> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC >> -XX:+PrintGCDetails >> >> What we did: >> >> Lately we have changed from ParallelOld to CMS due to unacceptable >> long Full GC pauses times. 
In preparation to the change of the >> collector we performed a lot of GC tuning related tests and found out >> that the above (simple) set of settings fulfill our needs in the best >> way. >> So far we are happy with what we see (frequency of minor scans/CMS >> cycles, times of STW pauses) with one exception. >> >> >> What is the problem: >> >> Some of our remark phases last much longer than others (up to 8 times >> on avg.). Normal remark phase lasts between 55 and 90ms, the longest >> one lasted for 538ms. >> At first we thought that this is due to aborting the preceding >> abortable-preclean phase. After a closer look we found out that >> depending on the volume of traffic (i.e., time of day) in fact some of >> our abortable-preclean phases are aborted due to time limit (5sec). >> Despite that most of the following remark phases times still are >> within acceptable limit (up to 100ms). So we kept digging. As a result >> of that we found out that the abnormal long remark phases are preceded >> by aborted abortable-preclean phase. The phase was always aborted due >> to the time limit however if we have a look at the following report >> for the young generation occupancy in all cases we were able to find >> that YG was occupied in far more than 50%. >> Per my (current :)) understanding the abortable-preclean phase can be >> aborted due of the time limit or because YG got full in about 50% (so >> remark phase will happen midway during two minor collections) - >> whatever comes first. In our case the 'about 50%' condition is not >> executed and the phase continues until it hits the time limit. The >> following remark phase always last longer, i.e., 350-550ms. >> >> >> The big question: >> >> What can we do to cut down the time of those long lasting remark phases? 
>> >> >> Below I enclose three samples from our GC log presenting: >> first one - a CMS cycle that aborted the abortable-preclean phase due >> time limit and the following remark phase does not show the abnormal >> behavior. >> second one - an "ideal" CMS cycle >> third one - a CMS cycle with aborted the abortable-preclean phase (due >> to time limit even though YG occupancy is much greater than 50%) and >> the following remark phase lasts for 0.5second. >> >> -- >> 1142110.458: [GC 1142110.458: [ParNew: 888646K->45370K(943744K), >> 0.0728880 secs] 1852227K->1013124K(1992320K), 0.0739250 secs] [Times: >> user=0.33 sys=0.01, real=0.07 secs] >> 1142110.547: [GC [1 CMS-initial-mark: 967753K(1048576K)] >> 1013331K(1992320K), 0.0540170 secs] [Times: user=0.06 sys=0.00, >> real=0.05 secs] >> 1142110.602: [CMS-concurrent-mark-start] >> 1142111.010: [CMS-concurrent-mark: 0.408/0.408 secs] [Times: user=1.96 >> sys=0.07, real=0.41 secs] >> 1142111.011: [CMS-concurrent-preclean-start] >> 1142111.028: [CMS-concurrent-preclean: 0.016/0.017 secs] [Times: >> user=0.02 sys=0.00, real=0.02 secs] >> 1142111.028: [CMS-concurrent-abortable-preclean-start] >> ?CMS: abort preclean due to time 1142116.036: >> [CMS-concurrent-abortable-preclean: 4.858/5.007 secs] [Times: >> user=7.31 sys=0.57, real=5.00 secs] >> 1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: >> [Rescan (parallel) , 0.0389910 secs]1142116.090: [weak refs >> processing, 0.0156130 secs] [1 CMS-remark: 967753K(1048576K)] >> 1377393K(1992320K), 0.0554700 secs] [Times: user=0.50 sys=0.00, >> real=0.06 secs] >> 1142116.107: [CMS-concurrent-sweep-start] >> 1142117.721: [CMS-concurrent-sweep: 1.614/1.614 secs] [Times: >> user=2.41 sys=0.24, real=1.61 secs] >> 1142117.721: [CMS-concurrent-reset-start] >> 1142117.732: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: >> user=0.01 sys=0.00, real=0.01 secs] >> 1142121.278: [GC 1142121.279: [ParNew: 884282K->52652K(943744K), >> 0.0680850 secs] 
1200273K->372087K(1992320K), 0.0690040 secs] [Times: >> user=0.29 sys=0.01, real=0.07 secs] >> 1142133.508: [GC 1142133.508: [ParNew: 891564K->47435K(943744K), >> 0.0682080 secs] 1210999K->370280K(1992320K), 0.0691030 secs] [Times: >> user=0.29 sys=0.01, real=0.07 secs] >> -- >> 1165584.305: [GC 1165584.305: [ParNew: 896212K->59055K(943744K), >> 0.0761290 secs] 1857148K->1023947K(1992320K), 0.0771330 secs] [Times: >> user=0.33 sys=0.00, real=0.08 secs] >> 1165584.398: [GC [1 CMS-initial-mark: 964891K(1048576K)] >> 1024053K(1992320K), 0.0631010 secs] [Times: user=0.06 sys=0.00, >> real=0.06 secs] >> 1165584.463: [CMS-concurrent-mark-start] >> 1165584.933: [CMS-concurrent-mark: 0.423/0.471 secs] [Times: user=2.40 >> sys=0.21, real=0.47 secs] >> 1165584.934: [CMS-concurrent-preclean-start] >> 1165584.954: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: >> user=0.05 sys=0.00, real=0.02 secs] >> 1165584.955: [CMS-concurrent-abortable-preclean-start] >> 1165587.876: [CMS-concurrent-abortable-preclean: 2.884/2.921 secs] >> [Times: user=5.51 sys=0.65, real=2.92 secs] >> 1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: >> [Rescan (parallel) , 0.0746810 secs]1165587.967: [weak refs >> processing, 0.0168870 secs] [1 CMS-remark: 964891K(1048576K)] >> 1443943K(1992320K), 0.0925600 secs] [Times: user=0.91 sys=0.01, >> real=0.09 secs] >> 1165587.986: [CMS-concurrent-sweep-start] >> 1165589.670: [CMS-concurrent-sweep: 1.684/1.684 secs] [Times: >> user=3.39 sys=0.46, real=1.69 secs] >> 1165589.671: [CMS-concurrent-reset-start] >> 1165589.679: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >> user=0.01 sys=0.00, real=0.01 secs] >> 1165591.354: [GC 1165591.354: [ParNew: 897967K->54984K(943744K), >> 0.0862910 secs] 1236513K->397404K(1992320K), 0.0872930 secs] [Times: >> user=0.34 sys=0.00, real=0.09 secs] >> 1165598.887: [GC 1165598.888: [ParNew: 893896K->52086K(943744K), >> 0.0885510 secs] 1236316K->398587K(1992320K), 0.0895820 secs] [Times: >> user=0.31 
sys=0.01, real=0.09 secs] >> -- >> 1166753.770: [GC 1166753.770: [ParNew: 899148K->57315K(943744K), >> 0.0782510 secs] 1862058K->1024198K(1992320K), 0.0793040 secs] [Times: >> user=0.32 sys=0.01, real=0.08 secs] >> 1166753.867: [GC [1 CMS-initial-mark: 966883K(1048576K)] >> 1024305K(1992320K), 0.0642680 secs] [Times: user=0.07 sys=0.00, >> real=0.07 secs] >> 1166753.932: [CMS-concurrent-mark-start] >> 1166754.471: [CMS-concurrent-mark: 0.486/0.538 secs] [Times: user=2.76 >> sys=0.28, real=0.54 secs] >> 1166754.471: [CMS-concurrent-preclean-start] >> 1166754.488: [CMS-concurrent-preclean: 0.015/0.017 secs] [Times: >> user=0.04 sys=0.00, real=0.01 secs] >> 1166754.488: [CMS-concurrent-abortable-preclean-start] >> ?CMS: abort preclean due to time 1166759.533: >> [CMS-concurrent-abortable-preclean: 4.895/5.044 secs] [Times: >> user=9.75 sys=1.21, real=5.05 secs] >> 1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: >> [Rescan (parallel) , 0.5387660 secs]1166760.088: [weak refs >> processing, 0.0139780 secs] [1 CMS-remark: 966883K(1048576K)] >> 1758080K(1992320K), 0.5537750 secs] [Times: user=5.58 sys=0.06, >> real=0.56 secs] >> 1166760.105: [CMS-concurrent-sweep-start] >> 1166760.688: [GC 1166760.689: [ParNew: 896188K->57161K(943744K), >> 0.0727850 secs] 1623884K->788963K(1992320K), 0.0737390 secs] [Times: >> user=0.31 sys=0.02, real=0.08 secs] >> 1166761.593: [CMS-concurrent-sweep: 1.363/1.488 secs] [Times: >> user=3.48 sys=0.49, real=1.49 secs] >> 1166761.593: [CMS-concurrent-reset-start] >> 1166761.602: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >> user=0.02 sys=0.01, real=0.01 secs] >> 1166767.947: [GC 1166767.948: [ParNew: 896053K->58188K(943744K), >> 0.0817680 secs] 1238926K->404605K(1992320K), 0.0828270 secs] [Times: >> user=0.31 sys=0.01, real=0.08 secs] >> -- >> >> Thank you in advance, >> Bartek >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> 
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From todd at cloudera.com Fri Apr 1 12:05:21 2011
From: todd at cloudera.com (Todd Lipcon)
Date: Fri, 1 Apr 2011 12:05:21 -0700
Subject: G1 feedback
In-Reply-To:
References:
Message-ID:

Hi Alex,

I've had similar results - see my threads from a few months back on this mailing list.

The summary is that, when there is a tight pause bound, some regions accumulate whose cost estimates are "stuck" higher than the pause goal. Such regions are never collected, which means that memory usage slowly grows until a full GC is required.

The two situations that caused this in my experiments were:

1) The JVM got context-switched out for several scheduling quanta during the "other" portion of a non-young region collection. Because "other" time is considered constant overhead, even mostly-garbage regions were carrying this as part of their estimate, causing all non-young regions to be deemed "too expensive". I fixed this with a patch to notice when the "other time" estimate was greater than the pause goal and decay it back towards 0.

2) A bad region accumulates many inter-region references in its "remembered set", overflowing into the coarse rset. Once the coarse rset has been used, there's no facility to "uncoarsen" the rset entries even after all the referring regions are dead. If the number of coarse entries is high enough that the time estimate is greater than the pause time goal, then again, this region will never be collected, and memory will fill up until a full GC. I wrote a patch to improve the time estimation for coarse rset entries based on the liveness info in the coarse regions. This helped somewhat for my application.
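
For readers following along, here is a rough sketch of the idea behind fix 1) above, under a much simplified cost model. It is not taken from the patch or from the HotSpot sources: the function names (decay_other_time_ms, region_fits, rounds_until_eligible), the decay factor and the round cap are all hypothetical and only illustrate how an inflated constant "other" estimate can keep a cheap region out of the collection set until it is decayed.

```cpp
#include <cassert>

// Hypothetical helper (not HotSpot code): when the predicted constant
// "other" time exceeds the pause goal, shrink it by decay_factor so it
// drifts back toward 0 instead of staying stuck.
double decay_other_time_ms(double other_ms, double pause_goal_ms,
                           double decay_factor) {
    assert(decay_factor > 0.0 && decay_factor < 1.0);
    return (other_ms > pause_goal_ms) ? other_ms * decay_factor : other_ms;
}

// Simplified, assumed cost model: a region can be chosen only if the
// fixed overhead plus its own copy cost fits within the pause goal.
bool region_fits(double other_ms, double copy_ms, double pause_goal_ms) {
    return other_ms + copy_ms <= pause_goal_ms;
}

// Counts how many decay steps pass before the region becomes eligible.
// Returns -1 if it is still too expensive after max_rounds steps.
int rounds_until_eligible(double other_ms, double copy_ms,
                          double pause_goal_ms, double decay_factor,
                          int max_rounds) {
    for (int round = 0; round <= max_rounds; ++round) {
        if (region_fits(other_ms, copy_ms, pause_goal_ms)) {
            return round;
        }
        other_ms = decay_other_time_ms(other_ms, pause_goal_ms, decay_factor);
    }
    return -1;
}
```

With a 1000 ms goal, an "other" estimate inflated to 3000 ms and a 50 ms copy cost, rounds_until_eligible(3000.0, 50.0, 1000.0, 0.5, 16) returns 2 (3000 -> 1500 -> 750). Without the decay step the same region would be rejected on every pause, which is the "stuck" behaviour described above.
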
Let me know if you'd like to try these patches out, I can dig them up again (or you might find them in the mailing list archives). -Todd On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon wrote: > Hi all > > > > Thoughts on this feedback about G1? > > > > Take care > > > > Alex A > > > > *From:* Alex Aisinzon > *Sent:* Saturday, March 26, 2011 6:46 AM > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* G1 feedback > > > > Hi all > > > > I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running > tests (8 hours) with it: > > With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", > most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 > seconds. > > I then added the objective of keeping the longest pause around 1 second and > used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops > -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for > one pause at 8 seconds and one at 57 seconds. > > The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average > CPU utilization was around 60-65% so it was not over-used. > > Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds > pauses. > > What would you recommend I do to either reduce these longer pauses or give > insights into what happened so that G1 can avoid these very rare but pretty > long pauses in the future? > > > > Thanks in advance > > > > Alex A > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110401/fa01154a/attachment.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From todd at cloudera.com Fri Apr 1 13:55:05 2011 From: todd at cloudera.com (Todd Lipcon) Date: Fri, 1 Apr 2011 13:55:05 -0700 Subject: G1 feedback In-Reply-To: References: Message-ID: On Fri, Apr 1, 2011 at 1:52 PM, Alex Aisinzon wrote: > Hi Todd > > > > This is very interesting. I feel I need to read some additional material > about G1 to fully understand your explanation. I guess the research paper on > it would help. I will plan to read it. > > In any case, I am happy to give your patch a try. The challenge is that I > am not well set to rebuild G1. Is there a way I could get the java > binary/executable with the patch included? > > > Sorry, I'm not well equipped to distribute a binary (and there might be licensing issues, I'm not even sure). -Todd > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 12:05 PM > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > Hi Alex, > > > > I've had similar results - see my threads from a few months back on this > mailing list. > > > > The summary is that, when there is a tight pause bound, some regions will > accumulate which have estimates that are "stuck" higher than the goal. As > these accumulate, they're never collected, which means that memory usage > slowly grows until a full GC is required. > > > > The two situations that caused this in my experiments were: > > 1) The JVM got context-switched out for several scheduling quanta during > the "other" portion of a non-young region collection. 
Because "other" time > is considered constant overhead, even mostly-garbage regions were carrying > this as part of their estimate, causing all non-young regions to be deemed > "too expensive". I fixed this with a patch to notice when the "other time" > estimate was greater than the pause goal and decay it back towards 0. > > > > 2) A bad region accumulates many inter-region references in its "remember > set", overflowing into the coarse rset. Once the coarse rset has been used, > there's no facility to "uncoarsen" the rset entries even after all the > referring regions are dead. If the number of coarse entries is high enough > that the time estimate is greater than the pause time goal, then again, this > region will never be collected, and memory will fill up until a full GC. I > wrote a patch to improve the time estimation for coarse rset entries based > on the liveness info in the coarse regions. This helped somewhat for my > application. > > > > Let me know if you'd like to try these patches out, I can dig them up again > (or you might find them in the mailing list archives). > > > > -Todd > > On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon > wrote: > > Hi all > > > > Thoughts on this feedback about G1? > > > > Take care > > > > Alex A > > > > *From:* Alex Aisinzon > *Sent:* Saturday, March 26, 2011 6:46 AM > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* G1 feedback > > > > Hi all > > > > I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running > tests (8 hours) with it: > > With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", > most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 > seconds. > > I then added the objective of keeping the longest pause around 1 second and > used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops > -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for > one pause at 8 seconds and one at 57 seconds. 
> > The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average > CPU utilization was around 60-65% so it was not over-used. > > Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds > pauses. > > What would you recommend I do to either reduce these longer pauses or give > insights into what happened so that G1 can avoid these very rare but pretty > long pauses in the future? > > > > Thanks in advance > > > > Alex A > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110401/592ed7f1/attachment.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From todd at cloudera.com Fri Apr 1 14:15:25 2011 From: todd at cloudera.com (Todd Lipcon) Date: Fri, 1 Apr 2011 14:15:25 -0700 Subject: G1 feedback In-Reply-To: References: Message-ID: Here's a patch of all my local changes against a checkout from a couple months back (may not completely apply against JDK7 trunk). This has the fixes mentioned as well as a few other experiments I'd done. I expressly grant permission to the JDK team to include this patch or parts thereof in the JDK should they find the code useful. -Todd On Fri, Apr 1, 2011 at 2:07 PM, Alex Aisinzon wrote: > Todd > > > > Please share the patch. I will see if I can build it. 
> > > > Regards > > > > Alex Aisinzon > > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 1:55 PM > > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > On Fri, Apr 1, 2011 at 1:52 PM, Alex Aisinzon > wrote: > > Hi Todd > > > > This is very interesting. I feel I need to read some additional material > about G1 to fully understand your explanation. I guess the research paper on > it would help. I will plan to read it. > > In any case, I am happy to give your patch a try. The challenge is that I > am not well set to rebuild G1. Is there a way I could get the java > binary/executable with the patch included? > > > > > > Sorry, I'm not well equipped to distribute a binary (and there might be > licensing issues, I'm not even sure). > > > > -Todd > > > > > > *From:* Todd Lipcon [mailto:todd at cloudera.com] > *Sent:* Friday, April 01, 2011 12:05 PM > *To:* Alex Aisinzon > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 feedback > > > > Hi Alex, > > > > I've had similar results - see my threads from a few months back on this > mailing list. > > > > The summary is that, when there is a tight pause bound, some regions will > accumulate which have estimates that are "stuck" higher than the goal. As > these accumulate, they're never collected, which means that memory usage > slowly grows until a full GC is required. > > > > The two situations that caused this in my experiments were: > > 1) The JVM got context-switched out for several scheduling quanta during > the "other" portion of a non-young region collection. Because "other" time > is considered constant overhead, even mostly-garbage regions were carrying > this as part of their estimate, causing all non-young regions to be deemed > "too expensive". I fixed this with a patch to notice when the "other time" > estimate was greater than the pause goal and decay it back towards 0. 
> > > > 2) A bad region accumulates many inter-region references in its "remember > set", overflowing into the coarse rset. Once the coarse rset has been used, > there's no facility to "uncoarsen" the rset entries even after all the > referring regions are dead. If the number of coarse entries is high enough > that the time estimate is greater than the pause time goal, then again, this > region will never be collected, and memory will fill up until a full GC. I > wrote a patch to improve the time estimation for coarse rset entries based > on the liveness info in the coarse regions. This helped somewhat for my > application. > > > > Let me know if you'd like to try these patches out, I can dig them up again > (or you might find them in the mailing list archives). > > > > -Todd > > On Fri, Apr 1, 2011 at 11:48 AM, Alex Aisinzon > wrote: > > Hi all > > > > Thoughts on this feedback about G1? > > > > Take care > > > > Alex A > > > > *From:* Alex Aisinzon > *Sent:* Saturday, March 26, 2011 6:46 AM > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* G1 feedback > > > > Hi all > > > > I experimented with G1 and Sun JDK 1.6 update 24 and ran two long running > tests (8 hours) with it: > > With " -server -XX:+UseG1GC -XX:+UseCompressedOops -Xms24576m -Xmx24576m", > most pauses were very short. 4 pauses were around 7 seconds and 4 at 34 > seconds. > > I then added the objective of keeping the longest pause around 1 second and > used "-server -XX:+UseG1GC -XX:MaxGCPauseMillis=1000 -XX:+UseCompressedOops > -Xms24576m -Xmx24576m". Most pauses were a little above 1 second except for > one pause at 8 seconds and one at 57 seconds. > > The server is a dual X5570 (8 cores total) and has 48GB of RAM. Its average > CPU utilization was around 60-65% so it was not over-used. > > Everything would be perfect if it were not for the 7, 34 and 8, 57 seconds > pauses. 
> > What would you recommend I do to either reduce these longer pauses or give > insights into what happened so that G1 can avoid these very rare but pretty > long pauses in the future? > > > > Thanks in advance > > > > Alex A > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > > > > > -- > Todd Lipcon > Software Engineer, Cloudera > -- Todd Lipcon Software Engineer, Cloudera -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110401/18da6b85/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... Name: jdk7-g1-fixes.patch Type: text/x-patch Size: 36582 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110401/18da6b85/attachment-0001.bin -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tom.rodriguez at oracle.com Fri Apr 1 14:55:37 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 1 Apr 2011 14:55:37 -0700 Subject: review for 7032963: StoreCM shouldn't participate in store elimination Message-ID: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> I could push this to hotspot-gc so it gets more CMS testing . http://cr.openjdk.java.net/~never/7032963 7032963: StoreCM shouldn't participate in store elimination Reviewed-by: StoreCM shouldn't participate in redundant store elimination since that could violate the requirement that a StoreCM must be strictly after a field update. 
This results in a large number of redundant StoreCMs being emitted for blocks of field updates, so I added an optimization to fold them up safely. Previously the extra dependence was converted into a precedence edge just before register allocation, but I moved this logic into final_graph_reshape. I then added logic to search through chains of StoreCMs to eliminate earlier redundant ones and transfer their precedence edges to the one that is kept. This ensures that they are scheduled properly. This actually eliminates duplicates that were previously missed, so the code quality is slightly better.

Tested by inspecting code generation with a script to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and -XX:+UseG1GC.

From y.s.ramakrishna at oracle.com Fri Apr 1 15:13:40 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Fri, 01 Apr 2011 15:13:40 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com>
Message-ID: <4D964E14.7040405@oracle.com>

On 4/1/2011 2:55 PM, Tom Rodriguez wrote:
> I could push this to hotspot-gc so it gets more CMS testing .

That would be a very good idea (CMS as well as G1 testing, actually)!

I can't review your changes, lacking sufficient background or familiarity with the code, but ...

> This actually
>> eliminates duplicates that were previously missed so the code quality
>> is slightly better.

wow! What more could one ask for --you fixed a correctness bug _and_ got us a bit more performance. Hmm, by chance does the fix also come with a free bottle of beer for all of us? ;-)

-- ramki

>
> http://cr.openjdk.java.net/~never/7032963
>
> 7032963: StoreCM shouldn't participate in store elimination
> Reviewed-by:
>
> StoreCM shouldn't participate in redundant store elimination since
> that could violate the requirement that a StoreCM must be strictly
> after a field update.
This results in a large number of redundant
> StoreCMs being emitted for blocks of fields updates, so I added an
> optimization to fold them up safely. Previously the extra dependence
> was converted into a precedence edge just before register allocation
> but I moved this logic into final_graph_reshape. I then added logic
> to search through chains of StoreCMs to eliminate earlier redundant
> ones and transfer their precedence edges to the one that is kept.
> This ensures that they are scheduled properly. This actually
> eliminates duplicates that were previously missed so the code quality
> is slightly better. Tested by inspecting code generation with script
> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
> -XX:+UseG1GC.
>

From vladimir.kozlov at oracle.com Fri Apr 1 15:47:04 2011
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 01 Apr 2011 15:47:04 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com>
Message-ID: <4D9655E8.9050003@oracle.com>

You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also, you need to kill the node explicitly; otherwise it will still be connected to its inputs:

+          // Eliminate the previous StoreCM
+          prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
+          assert(mem->outcnt() == 0, "should be dead");
+          mem->disconnect_inputs(NULL);

Vladimir

Tom Rodriguez wrote:
> I could push this to hotspot-gc so it gets more CMS testing .
>
> http://cr.openjdk.java.net/~never/7032963
>
> 7032963: StoreCM shouldn't participate in store elimination
> Reviewed-by:
>
> StoreCM shouldn't participate in redundant store elimination since
> that could violate the requirement that a StoreCM must be strictly
> after a field update.
This results in a large number of redundant
> StoreCMs being emitted for blocks of fields updates, so I added an
> optimization to fold them up safely. Previously the extra dependence
> was converted into a precedence edge just before register allocation
> but I moved this logic into final_graph_reshape. I then added logic
> to search through chains of StoreCMs to eliminate earlier redundant
> ones and transfer their precedence edges to the one that is kept.
> This ensures that they are scheduled properly. This actually
> eliminates duplicates that were previously missed so the code quality
> is slightly better. Tested by inspecting code generation with script
> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
> -XX:+UseG1GC.
>

From tom.rodriguez at oracle.com Fri Apr 1 16:26:54 2011
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 1 Apr 2011 16:26:54 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <4D9655E8.9050003@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com>
Message-ID: <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com>

On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote:

> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs:
>
> +          // Eliminate the previous StoreCM
> +          prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
> +          assert(mem->outcnt() == 0, "should be dead");
> +          mem->disconnect_inputs(NULL);

I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. I believe this is correct:

          // Eliminate the previous StoreCM
          prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
          assert(mem->outcnt() == 0, "should be dead");
          mem->disconnect_inputs(NULL);
        } else {
          prev = mem;
        }
        mem = prev->in(MemNode::Memory);
      }

I think I'll put together a little test case to make sure this is working correctly.

tom

>
> Vladimir
>
> Tom Rodriguez wrote:
>> I could push this to hotspot-gc so it gets more CMS testing .
>> http://cr.openjdk.java.net/~never/7032963
>> 7032963: StoreCM shouldn't participate in store elimination
>> Reviewed-by:
>> StoreCM shouldn't participate in redundant store elimination since
>> that could violate the requirement that a StoreCM must be strictly
>> after a field update. This results in a large number of redundant
>> StoreCMs being emitted for blocks of fields updates, so I added an
>> optimization to fold them up safely. Previously the extra dependence
>> was converted into a precedence edge just before register allocation
>> but I moved this logic into final_graph_reshape. I then added logic
>> to search through chains of StoreCMs to eliminate earlier redundant
>> ones and transfer their precedence edges to the one that is kept.
>> This ensures that they are scheduled properly. This actually
>> eliminates duplicates that were previously missed so the code quality
>> is slightly better. Tested by inspecting code generation with script
>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
>> -XX:+UseG1GC.

From tom.rodriguez at oracle.com Fri Apr 1 16:29:27 2011
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 1 Apr 2011 16:29:27 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <4D964E14.7040405@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D964E14.7040405@oracle.com>
Message-ID: <6988EBE2-402E-4BA6-BF1A-A7DA19547846@oracle.com>

On Apr 1, 2011, at 3:13 PM, Y.
Srinivas Ramakrishna wrote: > On 4/1/2011 2:55 PM, Tom Rodriguez wrote: >> I could push this to hotspot-gc so it gets more CMS testing . > > That would be a very good idea (CMS as well as G1 testing, actually)! > > I can't review your changes, lacking sufficient bkgrd or > familiarity with the code, but ... > >> This actually >>> eliminates duplicates that were previously missed so the code quality >>> is slightly better. > > wow! What more could one ask for --you fixed a correctness > bug _and_ got us a bit more performance. I have some other ideas about improving the code for barriers that occurred to me while looking at the conditional card marks code. It would mainly help with the G1 code and might require a little extra work but it's simpler than the more extensive changes we've talked about before. > Hmm, by chance does > the fix also come with a free bottle of beer for all of us? ;-) If you stop by my office I'd be happy to give you a beer. ;) tom > > -- ramki > >> >> http://cr.openjdk.java.net/~never/7032963 >> >> 7032963: StoreCM shouldn't participate in store elimination >> Reviewed-by: >> >> StoreCM shouldn't participate in redundant store elimination since >> that could violate the requirement that a StoreCM must be strictly >> after a field update. This results in a large number of redundant >> StoreCMs being emitted for blocks of fields updates, so I added an >> optimization to fold them up safely. Previously the extra dependence >> was converted into a precedence edge just before register allocation >> but I moved this logic into final_graph_reshape. I then added logic >> to search through chains of StoreCMs to eliminate earlier redundant >> ones and transfer their precedence edges to the one that is kept. >> This ensures that they are scheduled properly. This actually >> eliminates duplicates that were previously missed so the code quality >> is slightly better. Tested by inspecting code generation with script >> to identify duplicates. 
Also ran CTW with -XX:+UseCondCardMark and
>> -XX:+UseG1GC.
>>
>

From vladimir.kozlov at oracle.com Fri Apr 1 16:37:14 2011
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 01 Apr 2011 16:37:14 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com>
Message-ID: <4D9661AA.6070400@oracle.com>

Another problem: if n is on a branch, you could eliminate a dominated StoreCM which is above the split point, resulting in no StoreCM on the opposite branch.

Vladimir

Tom Rodriguez wrote:
> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote:
>
>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs:
>>
>> +          // Eliminate the previous StoreCM
>> +          prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
>> +          assert(mem->outcnt() == 0, "should be dead");
>> +          mem->disconnect_inputs(NULL);
>
> I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. I believe this is correct:
>
>           // Eliminate the previous StoreCM
>           prev->set_req(MemNode::Memory, mem->in(MemNode::Memory));
>           assert(mem->outcnt() == 0, "should be dead");
>           mem->disconnect_inputs(NULL);
>         } else {
>           prev = mem;
>         }
>         mem = prev->in(MemNode::Memory);
>       }
>
> I think I'll put together a little test case to make sure this is working correctly.
>
> tom
>
>> Vladimir
>>
>> Tom Rodriguez wrote:
>>> I could push this to hotspot-gc so it gets more CMS testing .
>>> http://cr.openjdk.java.net/~never/7032963
>>> 7032963: StoreCM shouldn't participate in store elimination
>>> Reviewed-by:
>>> StoreCM shouldn't participate in redundant store elimination since
>>> that could violate the requirement that a StoreCM must be strictly
>>> after a field update. This results in a large number of redundant
>>> StoreCMs being emitted for blocks of fields updates, so I added an
>>> optimization to fold them up safely. Previously the extra dependence
>>> was converted into a precedence edge just before register allocation
>>> but I moved this logic into final_graph_reshape. I then added logic
>>> to search through chains of StoreCMs to eliminate earlier redundant
>>> ones and transfer their precedence edges to the one that is kept.
>>> This ensures that they are scheduled properly. This actually
>>> eliminates duplicates that were previously missed so the code quality
>>> is slightly better. Tested by inspecting code generation with script
>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
>>> -XX:+UseG1GC.
>

From tom.rodriguez at oracle.com Fri Apr 1 16:58:36 2011
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Fri, 1 Apr 2011 16:58:36 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <4D9661AA.6070400@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com> <4D9661AA.6070400@oracle.com>
Message-ID: <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com>

On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote:

> An other problem if n is on a branch and you could eliminate dominated StoreCM which above the split point resulting in not having StoreCM on opposite branch.

You mean:

  a.f = x
  b.f = y;
  if (test)
    return
  a.b = c;

The StoreCM for a.f has a single user but it's used by the StoreCM of b.f which has multiple users.
So I think the search needs to stop when it encounters multiple users of a StoreCM since that represents a split of control flow. Thanks for catching that. Sounds like a job for partial redundancy elimination. tom > > Vladimir > > Tom Rodriguez wrote: >> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote: >>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs: >>> >>> + // Eliminate the previous StoreCM >>> + prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>> + assert(mem->outcnt() == 0, "should be dead"); >>> + mem->disconnect_inputs(NULL); >> I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. I believe this is correct: >> // Eliminate the previous StoreCM prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >> assert(mem->outcnt() == 0, "should be dead"); >> mem->disconnect_inputs(NULL); >> } else { prev = mem; } >> mem = prev->in(MemNode::Memory); >> } >> I think I'll put together a little test case to make sure this is working correctly. >> tom >>> Vladimir >>> >>> Tom Rodriguez wrote: >>>> I could push this to hotspot-gc so it gets more CMS testing . >>>> http://cr.openjdk.java.net/~never/7032963 >>>> 7032963: StoreCM shouldn't participate in store elimination >>>> Reviewed-by: >>>> StoreCM shouldn't participate in redundant store elimination since >>>> that could violate the requirement that a StoreCM must be strictly >>>> after a field update. This results in a large number of redundant >>>> StoreCMs being emitted for blocks of fields updates, so I added an >>>> optimization to fold them up safely. Previously the extra dependence >>>> was converted into a precedence edge just before register allocation >>>> but I moved this logic into final_graph_reshape. 
I then added logic
>>>> to search through chains of StoreCMs to eliminate earlier redundant
>>>> ones and transfer their precedence edges to the one that is kept.
>>>> This ensures that they are scheduled properly. This actually
>>>> eliminates duplicates that were previously missed so the code quality
>>>> is slightly better. Tested by inspecting code generation with script
>>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
>>>> -XX:+UseG1GC.

From vladimir.kozlov at oracle.com Fri Apr 1 17:08:08 2011
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 01 Apr 2011 17:08:08 -0700
Subject: review for 7032963: StoreCM shouldn't participate in store elimination
In-Reply-To: <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com>
References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com> <4D9661AA.6070400@oracle.com> <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com>
Message-ID: <4D9668E8.3020407@oracle.com>

Actually, I was thinking of a slightly different case:

  a.f = x
  if (test) {
    a.b = y;
  }

But the StoreCM for a.f should have several users (the StoreCM for a.b and mergemem), so your condition (stop the search if there are multiple users) still holds.

Vladimir

Tom Rodriguez wrote:
> On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote:
>
>> An other problem if n is on a branch and you could eliminate dominated StoreCM which above the split point resulting in not having StoreCM on opposite branch.
>
> You mean:
>
>   a.f = x
>   b.f = y;
>   if (test)
>     return
>   a.b = c;
>
> The StoreCM for a.f has a single user but it's used by the StoreCM of b.f which has multiple users. So I think the search needs to stop when it encounters multiple users of a StoreCM since that represents a split of control flow. Thanks for catching that.
>
> Sounds like a job for partial redundancy elimination.
> > tom > >> Vladimir >> >> Tom Rodriguez wrote: >>> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote: >>>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs: >>>> >>>> + // Eliminate the previous StoreCM >>>> + prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>> + assert(mem->outcnt() == 0, "should be dead"); >>>> + mem->disconnect_inputs(NULL); >>> I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. I believe this is correct: >>> // Eliminate the previous StoreCM prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>> assert(mem->outcnt() == 0, "should be dead"); >>> mem->disconnect_inputs(NULL); >>> } else { prev = mem; } >>> mem = prev->in(MemNode::Memory); >>> } >>> I think I'll put together a little test case to make sure this is working correctly. >>> tom >>>> Vladimir >>>> >>>> Tom Rodriguez wrote: >>>>> I could push this to hotspot-gc so it gets more CMS testing . >>>>> http://cr.openjdk.java.net/~never/7032963 >>>>> 7032963: StoreCM shouldn't participate in store elimination >>>>> Reviewed-by: >>>>> StoreCM shouldn't participate in redundant store elimination since >>>>> that could violate the requirement that a StoreCM must be strictly >>>>> after a field update. This results in a large number of redundant >>>>> StoreCMs being emitted for blocks of fields updates, so I added an >>>>> optimization to fold them up safely. Previously the extra dependence >>>>> was converted into a precedence edge just before register allocation >>>>> but I moved this logic into final_graph_reshape. I then added logic >>>>> to search through chains of StoreCMs to eliminate earlier redundant >>>>> ones and transfer their precedence edges to the one that is kept. >>>>> This ensures that they are scheduled properly. 
This actually >>>>> eliminates duplicates that were previously missed so the code quality >>>>> is slightly better. Tested by inspecting code generation with script >>>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and >>>>> -XX:+UseG1GC. > From vladimir.kozlov at oracle.com Fri Apr 1 17:15:31 2011 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 01 Apr 2011 17:15:31 -0700 Subject: review for 7032963: StoreCM shouldn't participate in store elimination In-Reply-To: <4D9668E8.3020407@oracle.com> References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com> <4D9661AA.6070400@oracle.com> <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com> <4D9668E8.3020407@oracle.com> Message-ID: <4D966AA3.804@oracle.com> And, it seems, your current code covers this case already. So my false assumption helped you to find the real problem ;) Thanks, Vladimir Vladimir Kozlov wrote: > Actually I thought about slightly different case: > > a.f = x > if (test) { > a.b = y; > } > > But StoreCM for a.f should have several users (StoreCM for a.b and > mergemem) so your condition (stop serch if multiple users) stays true. > > Vladimir > > Tom Rodriguez wrote: >> On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote: >> >>> An other problem if n is on a branch and you could eliminate >>> dominated StoreCM which above the split point resulting in not having >>> StoreCM on opposite branch. >> >> You mean: >> >> a.f = x >> b.f = y; >> if (test) >> return >> a.b = c; >> >> The StoreCM for a.f has a single user but it's used by the StoreCM of >> b.f which has multiple users. So I think the search needs to stop >> when it encounters multiple users of a StoreCM since that represents a >> split of control flow. Thanks for catching that. >> >> Sounds like a job for partial redundancy elimination. 
>> >> tom >> >>> Vladimir >>> >>> Tom Rodriguez wrote: >>>> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote: >>>>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) >>>>> into locals before the loop. Also you need to kill the node >>>>> explicitly otherwise it still be connected to its inputs: >>>>> >>>>> + // Eliminate the previous StoreCM >>>>> + prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>>> + assert(mem->outcnt() == 0, "should be dead"); >>>>> + mem->disconnect_inputs(NULL); >>>> I'll have to rework the mem traversal a little. Actually I think >>>> there might have been a bug with the old code since it always >>>> updated prev. I believe this is correct: >>>> // Eliminate the previous >>>> StoreCM >>>> prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>> assert(mem->outcnt() == 0, "should be dead"); >>>> mem->disconnect_inputs(NULL); >>>> } else >>>> { >>>> prev = >>>> mem; >>>> } >>>> mem = prev->in(MemNode::Memory); >>>> } >>>> I think I'll put together a little test case to make sure this is >>>> working correctly. >>>> tom >>>>> Vladimir >>>>> >>>>> Tom Rodriguez wrote: >>>>>> I could push this to hotspot-gc so it gets more CMS testing . >>>>>> http://cr.openjdk.java.net/~never/7032963 >>>>>> 7032963: StoreCM shouldn't participate in store elimination >>>>>> Reviewed-by: >>>>>> StoreCM shouldn't participate in redundant store elimination since >>>>>> that could violate the requirement that a StoreCM must be strictly >>>>>> after a field update. This results in a large number of redundant >>>>>> StoreCMs being emitted for blocks of fields updates, so I added an >>>>>> optimization to fold them up safely. Previously the extra dependence >>>>>> was converted into a precedence edge just before register allocation >>>>>> but I moved this logic into final_graph_reshape. 
I then added logic >>>>>> to search through chains of StoreCMs to eliminate earlier redundant >>>>>> ones and transfer their precedence edges to the one that is kept. >>>>>> This ensures that they are scheduled properly. This actually >>>>>> eliminates duplicates that were previously missed so the code quality >>>>>> is slightly better. Tested by inspecting code generation with script >>>>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and >>>>>> -XX:+UseG1GC. >> From tom.rodriguez at oracle.com Fri Apr 1 18:05:15 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 1 Apr 2011 18:05:15 -0700 Subject: review for 7032963: StoreCM shouldn't participate in store elimination In-Reply-To: <4D966AA3.804@oracle.com> References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com> <4D9661AA.6070400@oracle.com> <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com> <4D9668E8.3020407@oracle.com> <4D966AA3.804@oracle.com> Message-ID: <375A0C90-548D-49FB-B6D6-CAB8800C382D@oracle.com> It turns out that the code I wrote is safe from this bug because the CastP2X has control so the card mark addresses don't appear to be the same since the CastP2X's are different. Picking a higher control is the basis of the other idea I had for improving the barrier code since it improves the sharing of card mark computations. If I enable that logic then the bug in my code shows up. I've fixed it by checking for outcnt() == 1 in the main loop control. tom On Apr 1, 2011, at 5:15 PM, Vladimir Kozlov wrote: > And, it seems, your current code covers this case already. 
So my false assumption helped you to find the real problem ;) > > Thanks, > Vladimir > > Vladimir Kozlov wrote: >> Actually I thought about slightly different case: >> a.f = x >> if (test) { >> a.b = y; >> } >> But StoreCM for a.f should have several users (StoreCM for a.b and mergemem) so your condition (stop serch if multiple users) stays true. >> Vladimir >> Tom Rodriguez wrote: >>> On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote: >>> >>>> An other problem if n is on a branch and you could eliminate dominated StoreCM which above the split point resulting in not having StoreCM on opposite branch. >>> >>> You mean: >>> >>> a.f = x >>> b.f = y; >>> if (test) >>> return >>> a.b = c; >>> >>> The StoreCM for a.f has a single user but it's used by the StoreCM of b.f which has multiple users. So I think the search needs to stop when it encounters multiple users of a StoreCM since that represents a split of control flow. Thanks for catching that. >>> >>> Sounds like a job for partial redundancy elimination. >>> >>> tom >>> >>>> Vladimir >>>> >>>> Tom Rodriguez wrote: >>>>> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote: >>>>>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs: >>>>>> >>>>>> + // Eliminate the previous StoreCM >>>>>> + prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>>>> + assert(mem->outcnt() == 0, "should be dead"); >>>>>> + mem->disconnect_inputs(NULL); >>>>> I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. 
I believe this is correct: >>>>> // Eliminate the previous StoreCM prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>>> assert(mem->outcnt() == 0, "should be dead"); >>>>> mem->disconnect_inputs(NULL); >>>>> } else { prev = mem; } >>>>> mem = prev->in(MemNode::Memory); >>>>> } >>>>> I think I'll put together a little test case to make sure this is working correctly. >>>>> tom >>>>>> Vladimir >>>>>> >>>>>> Tom Rodriguez wrote: >>>>>>> I could push this to hotspot-gc so it gets more CMS testing . >>>>>>> http://cr.openjdk.java.net/~never/7032963 >>>>>>> 7032963: StoreCM shouldn't participate in store elimination >>>>>>> Reviewed-by: >>>>>>> StoreCM shouldn't participate in redundant store elimination since >>>>>>> that could violate the requirement that a StoreCM must be strictly >>>>>>> after a field update. This results in a large number of redundant >>>>>>> StoreCMs being emitted for blocks of fields updates, so I added an >>>>>>> optimization to fold them up safely. Previously the extra dependence >>>>>>> was converted into a precedence edge just before register allocation >>>>>>> but I moved this logic into final_graph_reshape. I then added logic >>>>>>> to search through chains of StoreCMs to eliminate earlier redundant >>>>>>> ones and transfer their precedence edges to the one that is kept. >>>>>>> This ensures that they are scheduled properly. This actually >>>>>>> eliminates duplicates that were previously missed so the code quality >>>>>>> is slightly better. Tested by inspecting code generation with script >>>>>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and >>>>>>> -XX:+UseG1GC. 
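The chain walk discussed in this thread, including the outcnt() == 1 check in the loop control, can be sketched with a toy model. This is only an illustration under stated assumptions: ToyStoreCM, fold_redundant_card_marks and the two demo functions are hypothetical names, not HotSpot code, and the toy compares only a card address, whereas the reviewed change compares the node's address and value inputs.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-in for a card-mark store node. This is NOT HotSpot's StoreCMNode;
// every name in this sketch is hypothetical.
struct ToyStoreCM {
    int card_address;    // which card this store marks
    ToyStoreCM* memory;  // previous store on the memory chain, or nullptr
    int outcnt;          // how many nodes use this store as an input
    bool dead;           // set once the store has been folded away
};

// Walk up the memory chain from n and unlink earlier card marks to the same
// card. The walk stops at the first store with more than one user, because
// multiple users indicate a split in control flow.
int fold_redundant_card_marks(ToyStoreCM* n) {
    int eliminated = 0;
    ToyStoreCM* prev = n;
    ToyStoreCM* mem = n->memory;
    while (mem != nullptr && mem->outcnt == 1) {
        if (mem->card_address == n->card_address) {
            // Bypass mem: prev now consumes mem's own memory input. The
            // grandparent loses mem as a user and gains prev, so its user
            // count is unchanged.
            prev->memory = mem->memory;
            mem->outcnt = 0;        // prev was its only user
            mem->memory = nullptr;  // analogous to disconnecting its inputs
            mem->dead = true;
            ++eliminated;
        } else {
            prev = mem;  // only advance prev when mem is kept
        }
        mem = prev->memory;
    }
    return eliminated;
}

// a.f = x; b.f = y; a.b = c;   (straight-line code, cards: a -> 1, b -> 2)
int demo_straight_line() {
    ToyStoreCM a_f{1, nullptr, 1, false};
    ToyStoreCM b_f{2, &a_f, 1, false};
    ToyStoreCM a_b{1, &b_f, 0, false};
    return fold_redundant_card_marks(&a_b);
}

// a.f = x; b.f = y; if (test) return; a.b = c;
// The card mark for b.f has two users: the card mark for a.b and the return path.
int demo_split() {
    ToyStoreCM a_f{1, nullptr, 1, false};
    ToyStoreCM b_f{2, &a_f, 2, false};
    ToyStoreCM a_b{1, &b_f, 0, false};
    return fold_redundant_card_marks(&a_b);
}
```

In the straight-line case the earlier card mark for a is folded away; in the split case the walk stops at the card mark for b.f, so the card mark for a.f survives on the early-return path.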
>>> From vladimir.kozlov at oracle.com Fri Apr 1 18:07:58 2011 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 01 Apr 2011 18:07:58 -0700 Subject: review for 7032963: StoreCM shouldn't participate in store elimination In-Reply-To: <375A0C90-548D-49FB-B6D6-CAB8800C382D@oracle.com> References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D9655E8.9050003@oracle.com> <12783E68-E28E-44BB-85D1-911369EEBA41@oracle.com> <4D9661AA.6070400@oracle.com> <656262C7-8969-45AA-A8D5-FED8D07BFD39@oracle.com> <4D9668E8.3020407@oracle.com> <4D966AA3.804@oracle.com> <375A0C90-548D-49FB-B6D6-CAB8800C382D@oracle.com> Message-ID: <4D9676EE.5000007@oracle.com> Yes, this looks right. Thanks, Vladimir Tom Rodriguez wrote: > It turns out that the code I wrote is safe from this bug because the CastP2X has control so the card mark addresses don't appear to be the same since the CastP2X's are different. Picking a higher control is the basis of the other idea I had for improving the barrier code since it improves the sharing of card mark computations. If I enable that logic then the bug in my code shows up. I've fixed it by checking for outcnt() == 1 in the main loop control. > > tom > > On Apr 1, 2011, at 5:15 PM, Vladimir Kozlov wrote: > >> And, it seems, your current code covers this case already. So my false assumption helped you to find the real problem ;) >> >> Thanks, >> Vladimir >> >> Vladimir Kozlov wrote: >>> Actually I thought about slightly different case: >>> a.f = x >>> if (test) { >>> a.b = y; >>> } >>> But StoreCM for a.f should have several users (StoreCM for a.b and mergemem) so your condition (stop serch if multiple users) stays true. >>> Vladimir >>> Tom Rodriguez wrote: >>>> On Apr 1, 2011, at 4:37 PM, Vladimir Kozlov wrote: >>>> >>>>> An other problem if n is on a branch and you could eliminate dominated StoreCM which above the split point resulting in not having StoreCM on opposite branch. 
>>>> You mean: >>>> >>>> a.f = x >>>> b.f = y; >>>> if (test) >>>> return >>>> a.b = c; >>>> >>>> The StoreCM for a.f has a single user but it's used by the StoreCM of b.f which has multiple users. So I think the search needs to stop when it encounters multiple users of a StoreCM since that represents a split of control flow. Thanks for catching that. >>>> >>>> Sounds like a job for partial redundancy elimination. >>>> >>>> tom >>>> >>>>> Vladimir >>>>> >>>>> Tom Rodriguez wrote: >>>>>> On Apr 1, 2011, at 3:47 PM, Vladimir Kozlov wrote: >>>>>>> You may put n->in(MemNode::Address) and n->in(MemNode::ValueIn) into locals before the loop. Also you need to kill the node explicitly otherwise it still be connected to its inputs: >>>>>>> >>>>>>> + // Eliminate the previous StoreCM >>>>>>> + prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>>>>> + assert(mem->outcnt() == 0, "should be dead"); >>>>>>> + mem->disconnect_inputs(NULL); >>>>>> I'll have to rework the mem traversal a little. Actually I think there might have been a bug with the old code since it always updated prev. I believe this is correct: >>>>>> // Eliminate the previous StoreCM prev->set_req(MemNode::Memory, mem->in(MemNode::Memory)); >>>>>> assert(mem->outcnt() == 0, "should be dead"); >>>>>> mem->disconnect_inputs(NULL); >>>>>> } else { prev = mem; } >>>>>> mem = prev->in(MemNode::Memory); >>>>>> } >>>>>> I think I'll put together a little test case to make sure this is working correctly. >>>>>> tom >>>>>>> Vladimir >>>>>>> >>>>>>> Tom Rodriguez wrote: >>>>>>>> I could push this to hotspot-gc so it gets more CMS testing . >>>>>>>> http://cr.openjdk.java.net/~never/7032963 >>>>>>>> 7032963: StoreCM shouldn't participate in store elimination >>>>>>>> Reviewed-by: >>>>>>>> StoreCM shouldn't participate in redundant store elimination since >>>>>>>> that could violate the requirement that a StoreCM must be strictly >>>>>>>> after a field update. 
This results in a large number of redundant
>>>>>>>> StoreCMs being emitted for blocks of fields updates, so I added an
>>>>>>>> optimization to fold them up safely. Previously the extra dependence
>>>>>>>> was converted into a precedence edge just before register allocation
>>>>>>>> but I moved this logic into final_graph_reshape. I then added logic
>>>>>>>> to search through chains of StoreCMs to eliminate earlier redundant
>>>>>>>> ones and transfer their precedence edges to the one that is kept.
>>>>>>>> This ensures that they are scheduled properly. This actually
>>>>>>>> eliminates duplicates that were previously missed so the code quality
>>>>>>>> is slightly better. Tested by inspecting code generation with script
>>>>>>>> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and
>>>>>>>> -XX:+UseG1GC.
>

From tony.printezis at oracle.com Mon Apr 4 08:23:11 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Mon, 04 Apr 2011 11:23:11 -0400
Subject: CRR: 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty (XXS)
Message-ID: <4D99E25F.2000400@oracle.com>

Tiny patch (only a single digit changed and a short comment added):

http://cr.openjdk.java.net/~tonyp/7033292/webrev.0/

The issue is that in the G1 card cache the epoch that the cache entries are initialized to (e.g., 0) is the same as the value the current epoch is initialized to. This makes the initialized, but not yet populated, cache entries look valid until the first GC. When one of those entries is replaced, the evicted card is incorrectly materialized to be the one that corresponds to the bottom of the heap. The fix is to initialize the current epoch to 1 to automatically invalidate all the cache entries that have not been populated yet.

Many thanks to John Cuthbertson for his help on this.
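The epoch problem described above can be illustrated with a small toy cache. This is a minimal sketch under stated assumptions, not the code in concurrentG1Refine.cpp: ToyCardCache, its members and first_insert_evicts_bogus_card are hypothetical names, and card index 0 stands in for the card at the bottom of the heap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy epoch-tagged card cache. All names are hypothetical; this only models
// the "entry epoch equals current epoch means valid" rule from the description.
struct ToyCardCache {
    struct Entry {
        std::size_t card;  // card index; 0 stands for the bottom of the heap
        unsigned epoch;    // epoch in which the entry was written
    };

    std::vector<Entry> entries;
    unsigned current_epoch;

    ToyCardCache(std::size_t size, unsigned initial_epoch)
        : entries(size, Entry{0, 0}),  // entries always start at epoch 0
          current_epoch(initial_epoch) {}

    // An entry counts as holding a real card only if it was written in the
    // current epoch.
    bool is_valid(std::size_t slot) const {
        return entries[slot].epoch == current_epoch;
    }

    // Put card into slot. Returns true, and reports the displaced card through
    // evicted, if the slot appeared to hold a valid entry.
    bool insert(std::size_t slot, std::size_t card, std::size_t* evicted) {
        bool had_valid = is_valid(slot);
        if (had_valid) {
            *evicted = entries[slot].card;
        }
        entries[slot] = Entry{card, current_epoch};
        return had_valid;
    }
};

// Does the very first insert into a fresh cache "evict" a card that was never
// actually cached?
bool first_insert_evicts_bogus_card(unsigned initial_epoch) {
    ToyCardCache cache(4, initial_epoch);
    std::size_t evicted = 0;
    return cache.insert(0, 42, &evicted);
}
```

With the current epoch starting at 0, the untouched slot looks valid and the toy reports an eviction of card 0; starting the current epoch at 1 makes every untouched slot invalid, which mirrors the one-digit fix.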
Tony From igor.veresov at oracle.com Mon Apr 4 09:47:36 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 04 Apr 2011 09:47:36 -0700 Subject: CRR: 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty (XXS) In-Reply-To: <4D99E25F.2000400@oracle.com> References: <4D99E25F.2000400@oracle.com> Message-ID: <4D99F628.7040102@oracle.com> Good catch. Looks great! igor On 4/4/11 8:23 AM, Tony Printezis wrote: > Tiny patch (only a single digit changed and a short comment added): > > http://cr.openjdk.java.net/~tonyp/7033292/webrev.0/ > > The issue is that in the G1 card cache the epoch the cache entries are > initialized to (e.g., 0) is the same as what the current epoch is also > initialized to. This makes the initialized, but not yet populated, cache > entries to look valid until the first GC. When one of those entries is > replaced the evicted card is incorrectly materialized to be the one that > corresponds to the bottom of the heap. The fix is to initialize the > current epoch to 1 to automatically invalidate all the cache entries > that have not been populate yet. > > Many thanks to John Cuthbertson for his help on this. > > Tony From y.s.ramakrishna at oracle.com Mon Apr 4 09:59:00 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 04 Apr 2011 09:59:00 -0700 Subject: CRR: 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty (XXS) In-Reply-To: <4D99E25F.2000400@oracle.com> References: <4D99E25F.2000400@oracle.com> Message-ID: <4D99F8D4.6090208@oracle.com> looks good. - ramki On 4/4/2011 8:23 AM, Tony Printezis wrote: > Tiny patch (only a single digit changed and a short comment added): > > http://cr.openjdk.java.net/~tonyp/7033292/webrev.0/ > > The issue is that in the G1 card cache the epoch the cache entries are initialized to (e.g., 0) is > the same as what the current epoch is also initialized to. 
This makes the initialized, but not yet > populated, cache entries to look valid until the first GC. When one of those entries is replaced the > evicted card is incorrectly materialized to be the one that corresponds to the bottom of the heap. > The fix is to initialize the current epoch to 1 to automatically invalidate all the cache entries > that have not been populate yet. > > Many thanks to John Cuthbertson for his help on this. > > Tony From tony.printezis at oracle.com Mon Apr 4 09:59:14 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Mon, 04 Apr 2011 12:59:14 -0400 Subject: CRR: 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty (XXS) In-Reply-To: <4D99F8D4.6090208@oracle.com> References: <4D99E25F.2000400@oracle.com> <4D99F8D4.6090208@oracle.com> Message-ID: <4D99F8E2.7050609@oracle.com> Thanks Ramki (and Igor)! All set. I'll push this asap (as soon as a job I currently have in the queue goes through). Tony Y. Srinivas Ramakrishna wrote: > looks good. > > - ramki > > On 4/4/2011 8:23 AM, Tony Printezis wrote: >> Tiny patch (only a single digit changed and a short comment added): >> >> http://cr.openjdk.java.net/~tonyp/7033292/webrev.0/ >> >> The issue is that in the G1 card cache the epoch the cache entries >> are initialized to (e.g., 0) is >> the same as what the current epoch is also initialized to. This makes >> the initialized, but not yet >> populated, cache entries to look valid until the first GC. When one >> of those entries is replaced the >> evicted card is incorrectly materialized to be the one that >> corresponds to the bottom of the heap. >> The fix is to initialize the current epoch to 1 to automatically >> invalidate all the cache entries >> that have not been populate yet. >> >> Many thanks to John Cuthbertson for his help on this. 
>> >> Tony > From tony.printezis at oracle.com Mon Apr 4 12:37:17 2011 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Mon, 04 Apr 2011 19:37:17 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty Message-ID: <20110404193721.15ADF477A7@hg.openjdk.java.net> Changeset: c84ee870e0b9 Author: tonyp Date: 2011-04-04 13:18 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c84ee870e0b9 7033292: G1: nightly failure: Non-dirty cards in region that should be dirty Summary: The epochs on the card cache array are initialized to 0 and our initial epoch also starts at 0. So, until the first GC, it might be possible to successfully "claim" a card which was in fact never initialized. Reviewed-by: johnc, iveresov, ysr ! src/share/vm/gc_implementation/g1/concurrentG1Refine.cpp From tony.printezis at oracle.com Mon Apr 4 14:39:34 2011 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Mon, 04 Apr 2011 21:39:34 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7027766: G1: introduce flag to dump the liveness information per region at the end of marking Message-ID: <20110404213937.F0B76477AD@hg.openjdk.java.net> Changeset: 371bbc844bf1 Author: tonyp Date: 2011-04-04 14:23 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/371bbc844bf1 7027766: G1: introduce flag to dump the liveness information per region at the end of marking Summary: Repurpose the existing flag G1PrintRegionLivenessInfo to print out the liveness distribution across the regions in the heap at the end of marking. Reviewed-by: iveresov, jwilhelm ! src/share/vm/gc_implementation/g1/collectionSetChooser.cpp ! src/share/vm/gc_implementation/g1/collectionSetChooser.hpp ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1_globals.hpp ! 
src/share/vm/gc_implementation/g1/heapRegion.hpp

From john.cuthbertson at oracle.com Mon Apr 4 16:42:26 2011
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Mon, 04 Apr 2011 23:42:26 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 7020042: G1: Partially remove fix for 6994628
Message-ID: <20110404234230.09DA0477B8@hg.openjdk.java.net>

Changeset: 8f1042ff784d
Author: johnc
Date: 2011-02-18 10:07 -0800
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/8f1042ff784d

7020042: G1: Partially remove fix for 6994628
Summary: Disable reference discovery and processing during concurrent marking by disabling fix for 6994628.
Reviewed-by: tonyp, ysr

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp

From john.cuthbertson at oracle.com Mon Apr 4 16:43:16 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Mon, 04 Apr 2011 16:43:16 -0700
Subject: RFR(M): 7009266: G1: assert(obj->is_oop_or_null(true )) failed: Error
In-Reply-To: <4D90C6E0.9000901@oracle.com>
References: <4D7ACDBA.7020003@oracle.com> <4D90C6E0.9000901@oracle.com>
Message-ID: <4D9A5794.1030507@oracle.com>

Hi Everyone,

A new webrev for this fix can be found at: http://cr.openjdk.java.net/~johnc/7009266/webrev.5/

The changes in this revision include:

* Revised barrier set calls in JNI_GetObjectField and Unsafe_getObject so that the value that is returned is the value that gets recorded in an SATB buffer. The code in the previous revision re-read the value of the field, potentially causing the value that is logged to be different from the one returned.

* Added the static checks, suggested by Tom, in library_call.cpp.

* Removed the complicated control flow from the C1 LIR implementation of Unsafe.getObject. Instead the checks have been moved into a code stub while some static checks have been added.
* Re-enabled reference discovery (and therefore reference processing) during concurrent marking as a result of pushing the changes for 7020042 to hs21.

Thanks,

JohnC

On 03/28/11 10:35, John Cuthbertson wrote:
> Hi Everyone,
>
> A new webrev with changes based upon comments from Tom can be found
> at: http://cr.openjdk.java.net/~johnc/7009266/webrev.4/.
>
> The latest changes include inserting a suitably guarded barrier call
> in case the referent field of a Reference object is being read/fetched
> using JNI, reflection, or Unsafe.
>
> Thanks,
>
> JohnC
>
> On 3/11/2011 5:34 PM, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> I'm looking for a few volunteers to review the changes that fix
>> this assertion failure. The latest changes can be found at:
>> http://cr.openjdk.java.net/~johnc/7009266/webrev.3/ and include
>> changes based upon earlier internal reviews. The earlier changes are
>> also on cr.openjdk.java.net for reference.
>>
>> Background:
>> The G1 garbage collector includes a concurrent marking algorithm that
>> makes use of snapshot-at-the-beginning or SATB. With this algorithm
>> the GC will mark all objects that are reachable at the start of
>> marking; objects that are allocated since the start of marking are
>> implicitly considered live. In order to populate the "snapshot" of
>> the object graph that existed at the start of marking, G1 employs a
>> write barrier. When an object is stored into another object's field
>> the write-barrier records the previous value of that field as it was
>> part of the "snapshot" and concurrent marking will trace the
>> sub-graph that is reachable from this previous value.
>>
>> Unfortunately, in the presence of Reference objects, SATB might not
>> be sufficient to mark a referent object as live. Consider that, at
>> the start of marking, we have a weakly reachable object, i.e. an
>> object where the only pointer to that object is the referent field
>> of a Reference object.
If the referent is >> obtained from the Reference object and stored to another object's >> field (making the referent now strongly reachable and hence live) the >> G1 write barrier will record the field's previous value but not the >> value of the referent. >> >> If the referent object is strongly reachable from some other object >> that will be traced by concurrent marking, _or_ there is a subsequent >> assignment to the field where we have written the referent (in which >> case we record the previous value - the referent - in an SATB buffer) >> then the referent will be marked live. Otherwise the referent will >> not be marked. >> >> That is the issue that was causing the failure in this CR. There was >> a Logger object that was only reachable through a WeakReference at >> the start of concurrent marking. During marking the Logger object is >> obtained from the WeakReference and stored into a field of a live >> object. The G1 write barrier recorded the previous value in the field >> (as it is part of the snapshot at the start of marking). Since there >> was no other assignment to the live object's field and there was no >> other strong reference to the Logger object, the Logger object was >> not marked. At the end of concurrent marking the Logger object was >> considered dead and the link between the WeakReference and the Logger >> was severed by clearing the referent field during reference processing. >> >> To solve this (entirely in Hotspot and causing a performance overhead >> for G1 only) it was decided that the best approach was to intrinsify >> the Reference.get() method in the JIT compilers and add new >> interpreter entry points so that the value in the referent field will >> be recorded in an SATB buffer by the G1 pre-barrier code. >> >> The changes for Zero and the C++ interpreters are place holder >> routines but should be straight forward to implement. >> >> None of the individual changes is large - they are just well >> distributed around the JVM. 
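The failure scenario in the background above, and the effect of logging the referent when it is read, can be sketched with a toy marker. This is an illustration only and assumes a drastically simplified heap: ToyObject, ToyMarker and logger_survives are hypothetical names, each object has a single strong field, and the "concurrent" marker is simulated by visiting the snapshot before the mutator runs.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy heap object. Every name in this sketch is hypothetical.
struct ToyObject {
    ToyObject* field = nullptr;     // one strong field, traced by marking
    ToyObject* referent = nullptr;  // weak link on a toy Reference; not traced
};

struct ToyMarker {
    std::set<ToyObject*> marked;
    std::vector<ToyObject*> satb_buffer;
    bool log_referent_on_get;  // models the barrier added to Reference.get()

    // SATB write barrier: record the field's previous value, then store.
    void store_field(ToyObject* holder, ToyObject* value) {
        if (holder->field != nullptr) {
            satb_buffer.push_back(holder->field);
        }
        holder->field = value;
    }

    // Reading the referent; with the fix enabled the value read is also logged.
    ToyObject* reference_get(ToyObject* ref) {
        if (log_referent_on_get && ref->referent != nullptr) {
            satb_buffer.push_back(ref->referent);
        }
        return ref->referent;
    }

    // Mark obj and everything strongly reachable from it.
    void mark_from(ToyObject* obj) {
        while (obj != nullptr && marked.insert(obj).second) {
            obj = obj->field;
        }
    }

    // Process everything the barriers logged.
    void drain_satb_buffer() {
        for (ToyObject* obj : satb_buffer) {
            mark_from(obj);
        }
        satb_buffer.clear();
    }
};

// A Logger reachable only through a weak reference at the start of marking is
// fetched and stored into a live object's field while marking is in progress.
bool logger_survives(bool log_referent_on_get) {
    ToyObject logger, holder, weak_ref;
    weak_ref.referent = &logger;

    ToyMarker marker{{}, {}, log_referent_on_get};
    marker.mark_from(&holder);    // the marker visits the snapshot first
    marker.mark_from(&weak_ref);

    ToyObject* strong = marker.reference_get(&weak_ref);
    marker.store_field(&holder, strong);  // previous value is null: nothing logged

    marker.drain_satb_buffer();
    return marker.marked.count(&logger) != 0;
}
```

Without the read-side logging the toy Logger is never marked even though a live object now points to it; with it, the referent reaches the SATB buffer and is marked when the buffer is drained.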
:) >> >> Testing: white box test; eyeballing the generated compiled and >> interpreter code; the failing Kitchensink big-app on x86 (32/64 bit), >> sparc (32/64 bit), Xint, Xcomp (client and server), with and without >> G1; the GC test suite with and without G1; and jprt. >> >> Thanks and regards, >> >> JohnC > From antcrawlerster at gmail.com Mon Apr 4 22:28:12 2011 From: antcrawlerster at gmail.com (wei he) Date: Tue, 5 Apr 2011 13:28:12 +0800 Subject: No subject Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110405/6d588288/attachment.html From tom.rodriguez at oracle.com Tue Apr 5 11:56:32 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 5 Apr 2011 11:56:32 -0700 Subject: RFR(M): 7009266: G1: assert(obj->is_oop_or_null(true )) failed: Error In-Reply-To: <4D9A5794.1030507@oracle.com> References: <4D7ACDBA.7020003@oracle.com> <4D90C6E0.9000901@oracle.com> <4D9A5794.1030507@oracle.com> Message-ID: <48D28564-37FA-4DC7-A0C2-3531844EB152@oracle.com> Looks good. tom On Apr 4, 2011, at 4:43 PM, John Cuthbertson wrote: > Hi Everyone, > > A new webrev for this fix can be found at: http://cr.openjdk.java.net/~johnc/7009266/webrev.5/ > > The changes in this revision include: > > * Revised barrier set calls in JNI_GetObjectField and Unsafe_getObject so that the value that is returned is the value that gets recorded in an SATB buffer. The code in the previous revision re-read the value of the field potentially causing the value the is logged to be different from that returned. > > * Added the static checks, suggested by Tom, in library_call.cpp. > > * Removed the complicated control flow from the C1 LIR implementation of Unsafe.getObject. Instead the checks have been moved into a code stub while some static checks have been added. 
> > * Re-enabled reference discovery (and therefore reference processing) during concurrent marking as a result of pushing the changes for 7020042 to hs21. > > Thanks, > > JohnC > > On 03/28/11 10:35, John Cuthbertson wrote: >> Hi Everyone, >> >> A new webrev with changes based upon comments from Tom can be found at: http://cr.openjdk.java.net/~johnc/7009266/webrev.4/. >> >> The latest changes include inserting a suitably guarded barrier call in case the referent field of a Reference object is being read/fetched using JNI, reflection, or Unsafe. >> >> Thanks, >> >> JohnC >> >> On 3/11/2011 5:34 PM, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> I'm looking for a few of volunteers to review the changes that fix this assertion failure. The latest changes can be found at: http://cr.openjdk.java.net/~johnc/7009266/webrev.3/ and include changes based upon earlier internal reviews. The earlier changes are also on cr.openjdk.java.net for reference. >>> >>> Background: >>> The G1 garbage collector includes a concurrent marking algorithm that makes use of snapshot-at-the-beginning or SATB. With this algorithm the GC will mark all objects that are reachable at the start of marking; objects that are allocated since the start of marking are implicitly considered live. In order to populate the "snapshot" of the object graph that existed at the start of marking, G1 employs a write barrier. When an object is stored into another object's field the write-barrier records the previous value of that field as it was part of the "snapshot" and concurrent marking will trace the sub-graph that is reachable from this previous value. >>> >>> Unfortunately, in the presence of Reference objects, SATB might not be sufficient to mark a referent object as live. Consider that, at the start of marking, we have a weakly reachable object i.e. an object where the only pointer to that object. 
If the referent is obtained from the Reference object and stored to another object's field (making the referent now strongly reachable and hence live) the G1 write barrier will record the field's previous value but not the value of the referent. >>> >>> If the referent object is strongly reachable from some other object that will be traced by concurrent marking, _or_ there is a subsequent assignment to the field where we have written the referent (in which case we record the previous value - the referent - in an SATB buffer) then the referent will be marked live. Otherwise the referent will not be marked. >>> >>> That is the issue that was causing the failure in this CR. There was a Logger object that was only reachable through a WeakReference at the start of concurrent marking. During marking the Logger object is obtained from the WeakReference and stored into a field of a live object. The G1 write barrier recorded the previous value in the field (as it is part of the snapshot at the start of marking). Since there was no other assignment to the live object's field and there was no other strong reference to the Logger object, the Logger object was not marked. At the end of concurrent marking the Logger object was considered dead and the link between the WeakReference and the Logger was severed by clearing the referent field during reference processing. >>> >>> To solve this (entirely in Hotspot and causing a performance overhead for G1 only) it was decided that the best approach was to intrinsify the Reference.get() method in the JIT compilers and add new interpreter entry points so that the value in the referent field will be recorded in an SATB buffer by the G1 pre-barrier code. >>> >>> The changes for Zero and the C++ interpreters are place holder routines but should be straight forward to implement. >>> >>> None of the individual changes is large - they are just well distributed around the JVM. 
:) >>> >>> Testing: white box test; eyeballing the generated compiled and interpreter code; the failing Kitchensink big-app on x86 (32/64 bit), sparc (32/64 bit), Xint, Xcomp (client and server), with and without G1; the GC test suite with and without G1; and jprt. >>> >>> Thanks and regards, >>> >>> JohnC >> > From John.Coomes at oracle.com Tue Apr 5 15:02:41 2011 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 5 Apr 2011 15:02:41 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling Message-ID: <19867.37249.351733.543415@oracle.com> Hi all, Please review a simple change to improve handling of obsolete options: http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ -John From tom.rodriguez at oracle.com Tue Apr 5 15:09:39 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Tue, 5 Apr 2011 15:09:39 -0700 Subject: review for 7032963: StoreCM shouldn't participate in store elimination In-Reply-To: <4D964E14.7040405@oracle.com> References: <9D6B4DC5-E378-40DD-ADB0-F95AECF23A8D@oracle.com> <4D964E14.7040405@oracle.com> Message-ID: <41D4D88B-6059-4720-8373-9DA7E269F936@oracle.com> On Apr 1, 2011, at 3:13 PM, Y. Srinivas Ramakrishna wrote: > On 4/1/2011 2:55 PM, Tom Rodriguez wrote: >> I could push this to hotspot-gc so it gets more CMS testing . > > That would be a very good idea (CMS as well as G1 testing, actually)! > > I can't review your changes, lacking sufficient bkgrd or > familiarity with the code, but ... So I was going to push this to hotspot-gc. Is that ok? It won't make nightly testing tonight... tom > >> This actually >>> eliminates duplicates that were previously missed so the code quality >>> is slightly better. > > wow! What more could one ask for --you fixed a correctness > bug _and_ got us a bit more performance. Hmm, by chance does > the fix also come with a free bottle of beer for all of us? 
;-) > > -- ramki > >> >> http://cr.openjdk.java.net/~never/7032963 >> >> 7032963: StoreCM shouldn't participate in store elimination >> Reviewed-by: >> >> StoreCM shouldn't participate in redundant store elimination since >> that could violate the requirement that a StoreCM must be strictly >> after a field update. This results in a large number of redundant >> StoreCMs being emitted for blocks of fields updates, so I added an >> optimization to fold them up safely. Previously the extra dependence >> was converted into a precedence edge just before register allocation >> but I moved this logic into final_graph_reshape. I then added logic >> to search through chains of StoreCMs to eliminate earlier redundant >> ones and transfer their precedence edges to the one that is kept. >> This ensures that they are scheduled properly. This actually >> eliminates duplicates that were previously missed so the code quality >> is slightly better. Tested by inspecting code generation with script >> to identify duplicates. Also ran CTW with -XX:+UseCondCardMark and >> -XX:+UseG1GC. >> > From John.Coomes at oracle.com Tue Apr 5 15:12:18 2011 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 5 Apr 2011 15:12:18 -0700 Subject: review request (S) 6841742 par compact - remove unsupported options Message-ID: <19867.37826.560239.745031@oracle.com> I'd appreciate reviews of a change that marks some unused/unsupported par compaction options as obsolete and removes the associated code. http://cr.openjdk.java.net/~jcoomes/6841742-pc-opts/ -John From y.s.ramakrishna at oracle.com Tue Apr 5 16:05:19 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Tue, 05 Apr 2011 16:05:19 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling In-Reply-To: <19867.37249.351733.543415@oracle.com> References: <19867.37249.351733.543415@oracle.com> Message-ID: <4D9BA02F.307@oracle.com> looks fine to me. 
On 04/05/11 15:02, John Coomes wrote: > Hi all, > > Please review a simple change to improve handling of obsolete options: > > http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ > > -John From john.cuthbertson at oracle.com Tue Apr 5 16:57:49 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 05 Apr 2011 16:57:49 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling In-Reply-To: <19867.37249.351733.543415@oracle.com> References: <19867.37249.351733.543415@oracle.com> Message-ID: <4D9BAC7D.2030804@oracle.com> Hi John, Looks good to me. JohnC On 04/05/11 15:02, John Coomes wrote: > Hi all, > > Please review a simple change to improve handling of obsolete options: > > http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ > > -John > From John.Coomes at oracle.com Tue Apr 5 22:38:01 2011 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 5 Apr 2011 22:38:01 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling In-Reply-To: <4D9BA02F.307@oracle.com> References: <19867.37249.351733.543415@oracle.com> <4D9BA02F.307@oracle.com> Message-ID: <19867.64569.543504.929863@oracle.com> Y. S. Ramakrishna (y.s.ramakrishna at oracle.com) wrote: > looks fine to me. Thanks! -John > On 04/05/11 15:02, John Coomes wrote: > > Hi all, > > > > Please review a simple change to improve handling of obsolete options: > > > > http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ > > > > -John From John.Coomes at oracle.com Tue Apr 5 22:38:40 2011 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 5 Apr 2011 22:38:40 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling In-Reply-To: <4D9BAC7D.2030804@oracle.com> References: <19867.37249.351733.543415@oracle.com> <4D9BAC7D.2030804@oracle.com> Message-ID: <19867.64608.989281.686267@oracle.com> John Cuthbertson (john.cuthbertson at oracle.com) wrote: > Hi John, > > Looks good to me. Thanks! 
-John > On 04/05/11 15:02, John Coomes wrote: > > Hi all, > > > > Please review a simple change to improve handling of obsolete options: > > > > http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ > > > > -John > > > From John.Coomes at oracle.com Tue Apr 5 22:43:40 2011 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 5 Apr 2011 22:43:40 -0700 Subject: review request (XS) - 7034133: cleanup obsolete option handling In-Reply-To: <4D9BCF9E.2010504@oracle.com> References: <19867.37249.351733.543415@oracle.com> <4D9BCF9E.2010504@oracle.com> Message-ID: <19867.64908.211747.569381@oracle.com> Poonam Bajaj (poonam.bajaj at oracle.com) wrote: > Hi John, > > > Java HotSpot(TM) Client VM warning: ignoring option HandlePromotionFailure; support was removed in 6.0_24 > > Here, I think it would be appropriate to have the jdk version string > as either 1.6.0_24 or 6u24 Hi Poonam, Thanks for looking at this. I also noticed the version string was unusual, but it's not new with this code. Changing the string representation should be done separately as it's distinct from handling obsolete options, is also used by the error handler and the code in JDK_Version goes to some length to format 1.5.x and later as d.d, while 1.4.2 and earlier appear as d.d.d. 
-John > On 4/6/2011 3:32 AM, John Coomes wrote: > > Hi all, > > Please review a simple change to improve handling of obsolete options: > > http://cr.openjdk.java.net/~jcoomes/7034133-cleanup-obsolete/ > > -John > > > > -- > Best regards, Poonam > > Sun, an Oracle Company > Poonam Bajaj | Staff Engineer > Phone: +66937451 | Mobile: +9844511366 > JVM Sustaining Engineering | Bangalore > From tom.rodriguez at oracle.com Tue Apr 5 23:12:47 2011 From: tom.rodriguez at oracle.com (tom.rodriguez at oracle.com) Date: Wed, 06 Apr 2011 06:12:47 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7032963: StoreCM shouldn't participate in store elimination Message-ID: <20110406061249.65D1A4782A@hg.openjdk.java.net> Changeset: e6beb62de02d Author: never Date: 2011-04-05 19:14 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/e6beb62de02d 7032963: StoreCM shouldn't participate in store elimination Reviewed-by: kvn ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/lcm.cpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/output.cpp From shane.cox at gmail.com Wed Apr 6 13:03:38 2011 From: shane.cox at gmail.com (Shane Cox) Date: Wed, 6 Apr 2011 16:03:38 -0400 Subject: CMS Concurrent Mark blocking? Message-ID: We are observing that certain functions/operations appear to be getting delayed or blocked by the CMS Concurrent Mark, causing multi-second outliers. Typically our requests are processed in less than 1ms. However, we periodically see multi-second processing times that correlate perfectly with the completion of the CM. Excerpts from logs below demonstrate the strong correlation between the outliers and completion of CM.
2011-04-05 13:53:05.845-INFO - ServerGroupAdapter-JGroupsReceiverThread-End process received messages: final queue size 35 time 10343ms 2011-04-05T13:53:05.845-0400: 144421.365: [CMS-concurrent-mark: 23.492/23.498 secs] [Times: user=36.52 sys=0.06, real=23.50 secs] 2011-04-05 14:05:06.873-INFO - ServerGroupAdapter-JGroupsReceiverThread-End process received messages: final queue size 38 time 15325ms 2011-04-05T14:05:06.871-0400: 145142.391: [CMS-concurrent-mark: 25.652/25.746 secs] [Times: user=38.09 sys=0.16, real=25.75 secs] 2011-04-05 14:05:47.562-INFO - ServerGroupAdapter-JGroupsReceiverThread-End process received messages: final queue size 27 time 7552ms 2011-04-05T14:05:47.563-0400: 145183.083: [CMS-concurrent-mark: 25.821/25.831 secs] [Times: user=37.89 sys=0.13, real=25.83 secs] 2011-04-05 14:06:28.677-INFO - ServerGroupAdapter-JGroupsReceiverThread-End process received messages: final queue size 13 time 15781ms 2011-04-05T14:06:28.677-0400: 145224.197: [CMS-concurrent-mark: 26.138/26.143 secs] [Times: user=36.97 sys=0.08, real=26.14 secs] 2011-04-05 14:07:05.283-INFO - ServerGroupAdapter-JGroupsReceiverThread-End process received messages: final queue size 8 time 6034ms 2011-04-05T14:07:05.284-0400: 145260.803: [CMS-concurrent-mark: 21.316/21.330 secs] [Times: user=36.52 sys=0.15, real=21.33 secs] I don't believe that the threads themselves are paused. CPU utilization averaged 25% during this period (2 CMS threads were busy on an 8 core box), so it's not a case of starvation. Our guess is that we're doing some operation/function that's blocked by the CM. Any ideas would be helpful/appreciated. 
java version "1.6.0_21" Java(TM) SE Runtime Environment (build 1.6.0_21-b06) Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode) Linux pdk-pt-cxas-01.intcx.net 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux Server release 5.3 (Tikanga) -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC From y.s.ramakrishna at oracle.com Wed Apr 6 16:53:01 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 06 Apr 2011 16:53:01 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: References: <4D95FD9B.9080909@oracle.com> Message-ID: <4D9CFCDD.9040000@oracle.com> Hi Bartek -- On 04/01/11 10:40, Bartek Markocki wrote: > Hi Ramki, > > On Fri, Apr 1, 2011 at 6:30 PM, Y. Srinivas Ramakrishna > wrote: >> Try -XX:+CMSScavengeBeforeRemark as a temporary workaround >> for this, and let us know if the performance is reasonable >> or not. > We will try to push the +CMSScavengeBeforeRemark to our production but > as we are talking about the production environment it might take some > time to return to you with the results. > >> I'll look at your log (can you send me your whole GC log, >> showing the problem, off-list?). > Just did. > >> I think there's probably an open CR for this, which i'll >> dig up for you. > Thanks a lot! The CR I had in mind is this one:- 6990419 CMS: Remaining work for 6572569: consistently skewed work distribution in (long) re-mark pauses It's an RFE, and I added you to the "Service Request".
If you have a support contract with Oracle, please send the SR# to your support engineer, so he can do the needful. I looked at yr logs and it seems very much like the problem I mention in this CR, although data from -XX:PrintCMSStatistics=2 would help ascertain if that was the issue. (Note to self, make PrintCMSStatistics a manageable flag so it can be turned on in a live JVM rather than to restart a fresh run; ditto for CMSScavengeBeforeRemark: i'll file RFE's for those, although not sure when we can get them done.) An alternative workaround that might also work for you would be -XX:CMSWaitDuration=X where X = at least two times the maximum interscavenge duration observed by yr application. (The RFE is to, among other things, ergonomify that setting.) -- ramki > > Bartek > > >> On 4/1/2011 2:16 AM, Bartek Markocki wrote: >>> Hi all, >>> >>> Can I ask any of you to review the attached extracts from our >>> production GC log and share your thoughts about them? >>> >>> We have a router-type web application running under tomcat 6.0.28 with >>> Java 1.6.0_21 (64bit) on RHEL 5.2 (2.6.18-92.1.22.el5). The GC >>> settings are: >>> -Xmx2048m -Xms2048m -XX:NewSize=1024m >>> -XX:PermSize=64m -XX:MaxPermSize=128m >>> -XX:ThreadStackSize=128 >>> -XX:+DisableExplicitGC >>> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC >>> -XX:+PrintGCDetails >>> >>> What we did: >>> >>> Lately we have changed from ParallelOld to CMS due to unacceptable >>> long Full GC pauses times. In preparation to the change of the >>> collector we performed a lot of GC tuning related tests and found out >>> that the above (simple) set of settings fulfill our needs in the best >>> way. >>> So far we are happy with what we see (frequency of minor scans/CMS >>> cycles, times of STW pauses) with one exception. >>> >>> >>> What is the problem: >>> >>> Some of our remark phases last much longer than others (up to 8 times >>> on avg.). 
Normal remark phase lasts between 55 and 90ms, the longest >>> one lasted for 538ms. >>> At first we thought that this is due to aborting the preceding >>> abortable-preclean phase. After a closer look we found out that >>> depending on the volume of traffic (i.e., time of day) in fact some of >>> our abortable-preclean phases are aborted due to time limit (5sec). >>> Despite that most of the following remark phases times still are >>> within acceptable limit (up to 100ms). So we kept digging. As a result >>> of that we found out that the abnormal long remark phases are preceded >>> by aborted abortable-preclean phase. The phase was always aborted due >>> to the time limit however if we have a look at the following report >>> for the young generation occupancy in all cases we were able to find >>> that YG was occupied in far more than 50%. >>> Per my (current :)) understanding the abortable-preclean phase can be >>> aborted due of the time limit or because YG got full in about 50% (so >>> remark phase will happen midway during two minor collections) - >>> whatever comes first. In our case the 'about 50%' condition is not >>> executed and the phase continues until it hits the time limit. The >>> following remark phase always last longer, i.e., 350-550ms. >>> >>> >>> The big question: >>> >>> What can we do to cut down the time of those long lasting remark phases? >>> >>> >>> Below I enclose three samples from our GC log presenting: >>> first one - a CMS cycle that aborted the abortable-preclean phase due >>> time limit and the following remark phase does not show the abnormal >>> behavior. >>> second one - an "ideal" CMS cycle >>> third one - a CMS cycle with aborted the abortable-preclean phase (due >>> to time limit even though YG occupancy is much greater than 50%) and >>> the following remark phase lasts for 0.5second. 
>>> >>> -- >>> 1142110.458: [GC 1142110.458: [ParNew: 888646K->45370K(943744K), >>> 0.0728880 secs] 1852227K->1013124K(1992320K), 0.0739250 secs] [Times: >>> user=0.33 sys=0.01, real=0.07 secs] >>> 1142110.547: [GC [1 CMS-initial-mark: 967753K(1048576K)] >>> 1013331K(1992320K), 0.0540170 secs] [Times: user=0.06 sys=0.00, >>> real=0.05 secs] >>> 1142110.602: [CMS-concurrent-mark-start] >>> 1142111.010: [CMS-concurrent-mark: 0.408/0.408 secs] [Times: user=1.96 >>> sys=0.07, real=0.41 secs] >>> 1142111.011: [CMS-concurrent-preclean-start] >>> 1142111.028: [CMS-concurrent-preclean: 0.016/0.017 secs] [Times: >>> user=0.02 sys=0.00, real=0.02 secs] >>> 1142111.028: [CMS-concurrent-abortable-preclean-start] >>> CMS: abort preclean due to time 1142116.036: >>> [CMS-concurrent-abortable-preclean: 4.858/5.007 secs] [Times: >>> user=7.31 sys=0.57, real=5.00 secs] >>> 1142116.050: [GC[YG occupancy: 409639 K (943744 K)]1142116.051: >>> [Rescan (parallel) , 0.0389910 secs]1142116.090: [weak refs >>> processing, 0.0156130 secs] [1 CMS-remark: 967753K(1048576K)] >>> 1377393K(1992320K), 0.0554700 secs] [Times: user=0.50 sys=0.00, >>> real=0.06 secs] >>> 1142116.107: [CMS-concurrent-sweep-start] >>> 1142117.721: [CMS-concurrent-sweep: 1.614/1.614 secs] [Times: >>> user=2.41 sys=0.24, real=1.61 secs] >>> 1142117.721: [CMS-concurrent-reset-start] >>> 1142117.732: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: >>> user=0.01 sys=0.00, real=0.01 secs] >>> 1142121.278: [GC 1142121.279: [ParNew: 884282K->52652K(943744K), >>> 0.0680850 secs] 1200273K->372087K(1992320K), 0.0690040 secs] [Times: >>> user=0.29 sys=0.01, real=0.07 secs] >>> 1142133.508: [GC 1142133.508: [ParNew: 891564K->47435K(943744K), >>> 0.0682080 secs] 1210999K->370280K(1992320K), 0.0691030 secs] [Times: >>> user=0.29 sys=0.01, real=0.07 secs] >>> -- >>> 1165584.305: [GC 1165584.305: [ParNew: 896212K->59055K(943744K), >>> 0.0761290 secs] 1857148K->1023947K(1992320K), 0.0771330 secs] [Times: >>> user=0.33 sys=0.00, 
real=0.08 secs] >>> 1165584.398: [GC [1 CMS-initial-mark: 964891K(1048576K)] >>> 1024053K(1992320K), 0.0631010 secs] [Times: user=0.06 sys=0.00, >>> real=0.06 secs] >>> 1165584.463: [CMS-concurrent-mark-start] >>> 1165584.933: [CMS-concurrent-mark: 0.423/0.471 secs] [Times: user=2.40 >>> sys=0.21, real=0.47 secs] >>> 1165584.934: [CMS-concurrent-preclean-start] >>> 1165584.954: [CMS-concurrent-preclean: 0.018/0.021 secs] [Times: >>> user=0.05 sys=0.00, real=0.02 secs] >>> 1165584.955: [CMS-concurrent-abortable-preclean-start] >>> 1165587.876: [CMS-concurrent-abortable-preclean: 2.884/2.921 secs] >>> [Times: user=5.51 sys=0.65, real=2.92 secs] >>> 1165587.892: [GC[YG occupancy: 479051 K (943744 K)]1165587.892: >>> [Rescan (parallel) , 0.0746810 secs]1165587.967: [weak refs >>> processing, 0.0168870 secs] [1 CMS-remark: 964891K(1048576K)] >>> 1443943K(1992320K), 0.0925600 secs] [Times: user=0.91 sys=0.01, >>> real=0.09 secs] >>> 1165587.986: [CMS-concurrent-sweep-start] >>> 1165589.670: [CMS-concurrent-sweep: 1.684/1.684 secs] [Times: >>> user=3.39 sys=0.46, real=1.69 secs] >>> 1165589.671: [CMS-concurrent-reset-start] >>> 1165589.679: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >>> user=0.01 sys=0.00, real=0.01 secs] >>> 1165591.354: [GC 1165591.354: [ParNew: 897967K->54984K(943744K), >>> 0.0862910 secs] 1236513K->397404K(1992320K), 0.0872930 secs] [Times: >>> user=0.34 sys=0.00, real=0.09 secs] >>> 1165598.887: [GC 1165598.888: [ParNew: 893896K->52086K(943744K), >>> 0.0885510 secs] 1236316K->398587K(1992320K), 0.0895820 secs] [Times: >>> user=0.31 sys=0.01, real=0.09 secs] >>> -- >>> 1166753.770: [GC 1166753.770: [ParNew: 899148K->57315K(943744K), >>> 0.0782510 secs] 1862058K->1024198K(1992320K), 0.0793040 secs] [Times: >>> user=0.32 sys=0.01, real=0.08 secs] >>> 1166753.867: [GC [1 CMS-initial-mark: 966883K(1048576K)] >>> 1024305K(1992320K), 0.0642680 secs] [Times: user=0.07 sys=0.00, >>> real=0.07 secs] >>> 1166753.932: [CMS-concurrent-mark-start] >>> 
1166754.471: [CMS-concurrent-mark: 0.486/0.538 secs] [Times: user=2.76 >>> sys=0.28, real=0.54 secs] >>> 1166754.471: [CMS-concurrent-preclean-start] >>> 1166754.488: [CMS-concurrent-preclean: 0.015/0.017 secs] [Times: >>> user=0.04 sys=0.00, real=0.01 secs] >>> 1166754.488: [CMS-concurrent-abortable-preclean-start] >>> CMS: abort preclean due to time 1166759.533: >>> [CMS-concurrent-abortable-preclean: 4.895/5.044 secs] [Times: >>> user=9.75 sys=1.21, real=5.05 secs] >>> 1166759.549: [GC[YG occupancy: 791197 K (943744 K)]1166759.549: >>> [Rescan (parallel) , 0.5387660 secs]1166760.088: [weak refs >>> processing, 0.0139780 secs] [1 CMS-remark: 966883K(1048576K)] >>> 1758080K(1992320K), 0.5537750 secs] [Times: user=5.58 sys=0.06, >>> real=0.56 secs] >>> 1166760.105: [CMS-concurrent-sweep-start] >>> 1166760.688: [GC 1166760.689: [ParNew: 896188K->57161K(943744K), >>> 0.0727850 secs] 1623884K->788963K(1992320K), 0.0737390 secs] [Times: >>> user=0.31 sys=0.02, real=0.08 secs] >>> 1166761.593: [CMS-concurrent-sweep: 1.363/1.488 secs] [Times: >>> user=3.48 sys=0.49, real=1.49 secs] >>> 1166761.593: [CMS-concurrent-reset-start] >>> 1166761.602: [CMS-concurrent-reset: 0.009/0.009 secs] [Times: >>> user=0.02 sys=0.01, real=0.01 secs] >>> 1166767.947: [GC 1166767.948: [ParNew: 896053K->58188K(943744K), >>> 0.0817680 secs] 1238926K->404605K(1992320K), 0.0828270 secs] [Times: >>> user=0.31 sys=0.01, real=0.08 secs] >>> -- >>> >>> Thank you in advance, >>> Bartek >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Wed Apr 6 16:59:54 2011 From: y.s.ramakrishna at oracle.com (Y. S. 
Ramakrishna) Date: Wed, 06 Apr 2011 16:59:54 -0700 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <4D9CFCDD.9040000@oracle.com> References: <4D95FD9B.9080909@oracle.com> <4D9CFCDD.9040000@oracle.com> Message-ID: <4D9CFE7A.6040901@oracle.com> Typo corrected:- On 04/06/11 16:53, Y. S. Ramakrishna wrote: ... > An alternative workaround that might also work > for you would be -XX:CMSWaitDuration=X That should have been: -XX:CMSMaxAbortablePrecleanTime=X > where X = at least two times the maximum interscavenge > duration observed by yr application. (The RFE > is to, among other things, ergonomify that setting.) Sorry about the typo. -- ramki _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From john.cuthbertson at oracle.com Thu Apr 7 14:16:42 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 07 Apr 2011 21:16:42 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7009266: G1: assert(obj->is_oop_or_null(true )) failed: Error Message-ID: <20110407211648.24A694789E@hg.openjdk.java.net> Changeset: e1162778c1c8 Author: johnc Date: 2011-04-07 09:53 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/e1162778c1c8 7009266: G1: assert(obj->is_oop_or_null(true )) failed: Error Summary: A referent object that is only weakly reachable at the start of concurrent marking but is re-attached to the strongly reachable object graph during marking may not be marked as live. This can cause the reference object to be processed prematurely and leave dangling pointers to the referent object. Implement a read barrier for the java.lang.ref.Reference::referent field by intrinsifying the Reference.get() method, and intercepting accesses though JNI, reflection, and Unsafe, so that when a non-null referent object is read it is also logged in an SATB buffer. 
Reviewed-by: kvn, iveresov, never, tonyp, dholmes ! src/cpu/sparc/vm/assembler_sparc.cpp ! src/cpu/sparc/vm/assembler_sparc.hpp ! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp ! src/cpu/sparc/vm/c1_LIRGenerator_sparc.cpp ! src/cpu/sparc/vm/cppInterpreter_sparc.cpp ! src/cpu/sparc/vm/interpreterGenerator_sparc.hpp ! src/cpu/sparc/vm/interpreter_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/c1_CodeStubs_x86.cpp ! src/cpu/x86/vm/c1_LIRGenerator_x86.cpp ! src/cpu/x86/vm/cppInterpreterGenerator_x86.hpp ! src/cpu/x86/vm/cppInterpreter_x86.cpp ! src/cpu/x86/vm/interpreterGenerator_x86.hpp ! src/cpu/x86/vm/templateInterpreter_x86_32.cpp ! src/cpu/x86/vm/templateInterpreter_x86_64.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/cpu/zero/vm/cppInterpreter_zero.cpp ! src/cpu/zero/vm/interpreterGenerator_zero.hpp ! src/share/vm/c1/c1_CodeStubs.hpp ! src/share/vm/c1/c1_GraphBuilder.cpp ! src/share/vm/c1/c1_LIRGenerator.cpp ! src/share/vm/c1/c1_LIRGenerator.hpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/gc_implementation/g1/g1SATBCardTableModRefBS.cpp ! src/share/vm/gc_implementation/g1/g1SATBCardTableModRefBS.hpp ! src/share/vm/gc_implementation/g1/g1_globals.hpp ! src/share/vm/interpreter/abstractInterpreter.hpp ! src/share/vm/interpreter/cppInterpreter.cpp ! src/share/vm/interpreter/interpreter.cpp ! src/share/vm/interpreter/templateInterpreter.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/graphKit.hpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/prims/jni.cpp ! 
src/share/vm/prims/unsafe.cpp From john.coomes at oracle.com Thu Apr 7 19:08:00 2011 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 08 Apr 2011 02:08:00 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7034133: cleanup obsolete option handling Message-ID: <20110408020806.B262D478BA@hg.openjdk.java.net> Changeset: 9c4f56ff88e9 Author: jcoomes Date: 2011-04-07 16:52 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/9c4f56ff88e9 7034133: cleanup obsolete option handling Reviewed-by: ysr, johnc, poonam ! src/share/vm/runtime/arguments.cpp From john.coomes at oracle.com Thu Apr 7 23:23:02 2011 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 08 Apr 2011 06:23:02 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 6841742: par compact - remove unused/unsupported options Message-ID: <20110408062306.4C78B478D0@hg.openjdk.java.net> Changeset: eda9eb483d29 Author: jcoomes Date: 2011-04-07 17:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/eda9eb483d29 6841742: par compact - remove unused/unsupported options Summary: ignore UseParallel{OldGCDensePrefix,OldGCCompacting,DensePrefixUpdate} Reviewed-by: jwilhelm, brutisso ! src/share/vm/gc_implementation/parallelScavenge/psOldGen.cpp ! src/share/vm/gc_implementation/parallelScavenge/psOldGen.hpp ! src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.cpp ! src/share/vm/gc_implementation/parallelScavenge/psParallelCompact.hpp ! src/share/vm/gc_implementation/parallelScavenge/psPermGen.cpp ! src/share/vm/gc_implementation/parallelScavenge/psPermGen.hpp ! src/share/vm/gc_implementation/parallelScavenge/psYoungGen.cpp ! src/share/vm/gc_implementation/parallelScavenge/psYoungGen.hpp ! src/share/vm/runtime/arguments.cpp ! 
src/share/vm/runtime/globals.hpp From shane.cox at gmail.com Fri Apr 8 04:40:27 2011 From: shane.cox at gmail.com (Shane Cox) Date: Fri, 8 Apr 2011 07:40:27 -0400 Subject: understanding CMS logs Message-ID: I'm having trouble reconciling the timings in the following log entries. Could someone please explain? Take the first example (concurrent mark). The cpu time was ~ 25 seconds, elapsed time was ~ 43 seconds. So how can "user" time be 283 seconds? There are only 2 Parallel CMS threads. I would think User time would be more along the lines of 50 seconds. Obviously there is something that I'm misinterpreting. Also, If cpu time is 25 seconds but elapsed time is 43 seconds, does that mean the CM spent 18 seconds doing something other than executing on CPU? I'm wondering if some condition is extending the CM execution time because it cannot obtain adequate cpu resources. Log excerpts: 2011-04-05T19:20:38.944-0400: 164034.340: [CMS-concurrent-mark: 25.221/43.448 secs] [Times: user=283.46 sys=8.05, real=43.44 secs] 2011-04-05T19:20:51.735-0400: 164047.131: [CMS-concurrent-preclean: 10.211/17.222 secs] [Times: user=72.78 sys=2.24, real=17.22 secs] GC Threads: "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x000000005f598000 nid=0x2ab5 runnable "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x000000005f59a000 nid=0x2ab6 runnable "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x000000005f59c000 nid=0x2ab7 runnable "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x000000005f59d800 nid=0x2ab8 runnable "Gang worker#4 (Parallel GC Threads)" prio=10 tid=0x000000005f59f800 nid=0x2ab9 runnable "Gang worker#5 (Parallel GC Threads)" prio=10 tid=0x000000005f5a1800 nid=0x2aba runnable "Gang worker#6 (Parallel GC Threads)" prio=10 tid=0x000000005f5a3000 nid=0x2abb runnable "Gang worker#7 (Parallel GC Threads)" prio=10 tid=0x000000005f5a5000 nid=0x2abc runnable "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x000000005f698800 nid=0x2ac3 runnable "Gang worker#0 (Parallel CMS 
Threads)" prio=10 tid=0x000000005f694800 nid=0x2ac1 runnable "Gang worker#1 (Parallel CMS Threads)" prio=10 tid=0x000000005f696800 nid=0x2ac2 runnable Thanks! From bartosz.markocki at gmail.com Fri Apr 8 05:25:41 2011 From: bartosz.markocki at gmail.com (Bartek Markocki) Date: Fri, 8 Apr 2011 14:25:41 +0200 Subject: Why abortable-preclean phase is not being aborted after YG occupancy exceeds 50%? In-Reply-To: <4D9CFE7A.6040901@oracle.com> References: <4D95FD9B.9080909@oracle.com> <4D9CFCDD.9040000@oracle.com> <4D9CFE7A.6040901@oracle.com> Message-ID: Hi Ramki, Thanks for the information. We are currently midway through rolling out the scavenge-before-remark option to production. I will try my best to add the CMS statistics to the list of changes. As a separate action we will try to increase the max abortable preclean time from 5 to 11-12 seconds. I will update you as soon as we have the data. Thanks, Bartek On Thu, Apr 7, 2011 at 1:59 AM, Y. S. Ramakrishna wrote: > Typo corrected:- > > > On 04/06/11 16:53, Y. S. Ramakrishna wrote: > ... >> >> An alternative workaround that might also work >> for you would be -XX:CMSWaitDuration=X > > That should have been: > > -XX:CMSMaxAbortablePrecleanTime=X > >> where X = at least two times the maximum interscavenge >> duration observed by yr application. (The RFE >> is to, among other things, ergonomify that setting.) > > Sorry about the typo.
> -- ramki > > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From john.cuthbertson at oracle.com Fri Apr 8 11:04:19 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 08 Apr 2011 11:04:19 -0700 Subject: RFR(XS): 7035177: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 Message-ID: <4D9F4E23.6060301@oracle.com> Hi Everyone, Can I have a couple of volunteers to look over the fix for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. The problem is that the node representing the offset (in an Unsafe.getObject compilation) could be typed as a long and generating the compare of offset against java_lang_ref_Reference::referent_offset (typed as an int) caused an assertion failure about the mis-matching types. The fix is to generate a suitably typed constant based upon the type of "offset". Tested using the failing test case from the nightly tests. Thanks, JohnC From tom.rodriguez at oracle.com Fri Apr 8 11:31:47 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 8 Apr 2011 11:31:47 -0700 Subject: RFR(XS): 7035177: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 In-Reply-To: <4D9F4E23.6060301@oracle.com> References: <4D9F4E23.6060301@oracle.com> Message-ID: <5FAB74B1-C3A6-46C4-ABC3-E58380059D70@oracle.com> Actually you want to write this: Node* referent_off = __ ConX(java_lang_ref_Reference::referent_offset); which will pick ConL in 64 bit and ConI in 32 bit. Sorry I didn't catch this in the review. tom On Apr 8, 2011, at 11:04 AM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers to look over the fix for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. 
> > The problem is that the node representing the offset (in an Unsafe.getObject compilation) could be typed as a long and generating the compare of offset against java_lang_ref_Reference::referent_offset (typed as an int) caused an assertion failure about the mis-matching types. > > The fix is to generate a suitably typed constant based upon the type of "offset". > > Tested using the failing test case from the nightly tests. > > Thanks, > > JohnC From john.cuthbertson at oracle.com Fri Apr 8 11:42:18 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 08 Apr 2011 11:42:18 -0700 Subject: RFR(XS): 7035177: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 In-Reply-To: <5FAB74B1-C3A6-46C4-ABC3-E58380059D70@oracle.com> References: <4D9F4E23.6060301@oracle.com> <5FAB74B1-C3A6-46C4-ABC3-E58380059D70@oracle.com> Message-ID: <4D9F570A.60703@oracle.com> Hi Tom, Thanks. I'll make the change (which is what I started with before changing it to its current version). JohnC On 04/08/11 11:31, Tom Rodriguez wrote: > Actually you want to write this: > > Node* referent_off = __ ConX(java_lang_ref_Reference::referent_offset); > > which will pick ConL in 64 bit and ConI in 32 bit. Sorry I didn't catch this in the review. > > tom > > On Apr 8, 2011, at 11:04 AM, John Cuthbertson wrote: > > >> Hi Everyone, >> >> Can I have a couple of volunteers to look over the fix for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. >> >> The problem is that the node representing the offset (in an Unsafe.getObject compilation) could be typed as a long and generating the compare of offset against java_lang_ref_Reference::referent_offset (typed as an int) caused an assertion failure about the mis-matching types. >> >> The fix is to generate a suitably typed constant based upon the type of "offset". >> >> Tested using the failing test case from the nightly tests. 
>> >> Thanks, >> >> JohnC >> > > From y.s.ramakrishna at oracle.com Fri Apr 8 16:21:10 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 08 Apr 2011 16:21:10 -0700 Subject: understanding CMS logs In-Reply-To: References: Message-ID: <4D9F9866.7080906@oracle.com> Hi Shane -- I transferred the snippet from below for easy reference:- >> Log excerpts: >> 2011-04-05T19:20:38.944-0400: 164034.340: [CMS-concurrent-mark: >> 25.221/43.448 secs] [Times: user=283.46 sys=8.05, real=43.44 secs] >> >> 2011-04-05T19:20:51.735-0400: 164047.131: [CMS-concurrent-preclean: >> 10.211/17.222 secs] [Times: user=72.78 sys=2.24, real=17.22 secs] >> ... > I'm having trouble reconciling the timings in the following log entries. > Could someone please explain? Take the first example (concurrent mark). > The cpu time was ~ 25 seconds, elapsed time was ~ 43 seconds. So how can > "user" time be 283 seconds? There are only 2 Parallel CMS threads. I would > think User time would be more along the lines of 50 seconds. Obviously > there is something that I'm misinterpreting. Yes and no. The interpretation that CMS conc mark took about 25 s wall clock time executing, out of a total wall clock time of 43 seconds, is indeed correct. It is also correct that there are only 2 marking threads. The [Times:...] part is however misleading in the sense that it is the time for the whole JVM process, not just the virtual time for the marking threads. In other words, for the real elapsed time of 43 seconds, the entire process executed 283.46 virtual seconds in user mode and 8.05 virtual seconds in system mode. I agree that this can be misleading as presented because it may be interpreted as virtual time attributed just to the marking threads, which is what you did. 
> > > Also, If cpu time is 25 seconds but elapsed time is 43 seconds, does that > mean the CM spent 18 seconds doing something other than executing on CPU? That's correct. Typically, what might happen is that some amount of time may be spent for foreground GC (scavenge) work, or waiting for locks, during which time the marking threads may not be running. Also, the 25 seconds is an upper limit on the time the marking threads may actually have been executing on cpu, because it is calculated by means of bracketing hi-res timers rather than as virtual cpu time. > I'm wondering if some condition is extending the CM execution time because > it cannot obtain adequate cpu resources. One would have to look at the complete logs to understand. You'd first want to total up all of the foreground GC time spent in between and see how much of those 18 seconds is left over. There may be other STW operations (such as bulk bias revocation) that might also interrupt the concurrent marking, as well as, maybe (but I am not certain without checking the code), direct allocations into the old generation or class loading, which can cause allocation into the perm gen. 
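To make the two sets of numbers concrete, here is a minimal Python sketch that splits one concurrent-phase log line into the quantities discussed above. It is only an illustration: the function name `summarize_cms_phase` and the regular expression are assumptions made for this example, and the sample line is the concurrent-mark entry from the excerpt, joined onto a single line.

```python
import re

# Hypothetical pattern for a CMS concurrent-phase line as printed with
# -XX:+PrintGCDetails; it only covers the shape shown in the excerpt.
PHASE_RE = re.compile(
    r"\[CMS-concurrent-(?P<phase>[a-z-]+): "
    r"(?P<busy>[\d.]+)/(?P<wall>[\d.]+) secs\] "
    r"\[Times: user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), "
    r"real=(?P<real>[\d.]+) secs\]"
)


def summarize_cms_phase(line):
    """Return per-phase figures, or None if the line does not match."""
    m = PHASE_RE.search(line)
    if m is None:
        return None
    busy = float(m.group("busy"))   # timer-bracketed time the phase was running
    wall = float(m.group("wall"))   # total elapsed time of the phase
    user = float(m.group("user"))   # whole-process user time
    sys_ = float(m.group("sys"))    # whole-process system time
    real = float(m.group("real"))
    return {
        "phase": m.group("phase"),
        # Time the phase was not running (scavenges, locks, other STW work).
        "not_running_secs": wall - busy,
        # Average number of CPUs the whole JVM process kept busy.
        "process_cpus": (user + sys_) / real,
    }


sample = (
    "2011-04-05T19:20:38.944-0400: 164034.340: [CMS-concurrent-mark: "
    "25.221/43.448 secs] [Times: user=283.46 sys=8.05, real=43.44 secs]"
)
summary = summarize_cms_phase(sample)
print(summary)
```

For the sample line this gives roughly 18.2 seconds during which concurrent marking was not running, and an average of about 6.7 busy CPUs for the process as a whole, which is why the user time cannot be attributed to the 2 marking threads alone.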
-- ramki > > > GC Threads: > "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x000000005f598000 > nid=0x2ab5 runnable > "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x000000005f59a000 > nid=0x2ab6 runnable > "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x000000005f59c000 > nid=0x2ab7 runnable > "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x000000005f59d800 > nid=0x2ab8 runnable > "Gang worker#4 (Parallel GC Threads)" prio=10 tid=0x000000005f59f800 > nid=0x2ab9 runnable > "Gang worker#5 (Parallel GC Threads)" prio=10 tid=0x000000005f5a1800 > nid=0x2aba runnable > "Gang worker#6 (Parallel GC Threads)" prio=10 tid=0x000000005f5a3000 > nid=0x2abb runnable > "Gang worker#7 (Parallel GC Threads)" prio=10 tid=0x000000005f5a5000 > nid=0x2abc runnable > > "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x000000005f698800 nid=0x2ac3 > runnable > "Gang worker#0 (Parallel CMS Threads)" prio=10 tid=0x000000005f694800 > nid=0x2ac1 runnable > "Gang worker#1 (Parallel CMS Threads)" prio=10 tid=0x000000005f696800 > nid=0x2ac2 runnable > > > Thanks! > > > > From jon.masamitsu at oracle.com Mon Apr 11 21:39:52 2011 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Tue, 12 Apr 2011 04:39:52 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 41 new changesets Message-ID: <20110412044104.5E865479C7@hg.openjdk.java.net> Changeset: 7449da4cdab5 Author: schien Date: 2011-03-24 11:20 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/7449da4cdab5 Added tag jdk7-b135 for changeset b898f0fc3ced ! 
.hgtags Changeset: 661c46a8434c Author: trims Date: 2011-03-25 17:26 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/661c46a8434c Added tag hs21-b05 for changeset b898f0fc3ced ! .hgtags Changeset: c10b82a05d58 Author: trims Date: 2011-03-25 18:04 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c10b82a05d58 Merge - test/compiler/6987555/Test6987555.java - test/compiler/6991596/Test6991596.java Changeset: bd586e392d93 Author: trims Date: 2011-03-25 18:04 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/bd586e392d93 7031227: Bump the HS21 build number to 06 Summary: Update the HS21 build number to 06 Reviewed-by: jcoomes ! make/hotspot_version Changeset: 74e790c48cd4 Author: sla Date: 2011-03-28 12:48 +0200 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/74e790c48cd4 7031571: Generate native VS2010 project files Reviewed-by: hosterda, stefank, brutisso ! make/windows/create.bat ! make/windows/makefiles/projectcreator.make ! make/windows/makefiles/rules.make ! src/share/tools/ProjectCreator/Util.java ! src/share/tools/ProjectCreator/WinGammaPlatform.java + src/share/tools/ProjectCreator/WinGammaPlatformVC10.java ! src/share/tools/ProjectCreator/WinGammaPlatformVC7.java Changeset: df553e4a797b Author: acorn Date: 2011-03-30 17:05 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/df553e4a797b Merge Changeset: 151da0c145a8 Author: twisti Date: 2011-03-24 02:11 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/151da0c145a8 7030207: Zero tweak to remove accidentally incorporated code Summary: IcedTea contains a now-unmaintained ARM-specific interpreter and part of that interpreter was accidentally incorporated in one of the webrevs when Zero was initially imported. Reviewed-by: twisti Contributed-by: Gary Benson ! 
src/share/vm/interpreter/bytecodeInterpreter.cpp Changeset: b868d9928221 Author: twisti Date: 2011-03-24 23:04 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b868d9928221 Merge - test/compiler/6987555/Test6987555.java - test/compiler/6991596/Test6991596.java Changeset: f731b22cd52d Author: jcoomes Date: 2011-03-24 23:49 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/f731b22cd52d Merge ! src/share/vm/interpreter/bytecodeInterpreter.cpp Changeset: 322a41ec766c Author: never Date: 2011-03-25 11:29 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/322a41ec766c 7025708: Assertion if using "-XX:+CITraceTypeFlow -XX:+Verbose" together Reviewed-by: never Contributed-by: volker.simonis at gmail.com ! src/share/vm/ci/ciTypeFlow.cpp Changeset: b2949bf39900 Author: never Date: 2011-03-25 18:19 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b2949bf39900 Merge Changeset: 29524004ce17 Author: never Date: 2011-03-25 18:50 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/29524004ce17 7022204: LogFile wildcarding should use %p instead of star Reviewed-by: coleenp, jrose ! src/share/vm/utilities/ostream.cpp Changeset: 7e88bdae86ec Author: roland Date: 2011-03-25 09:35 +0100 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/7e88bdae86ec 7029017: Additional architecture support for c2 compiler Summary: Enables cross building of a c2 VM. Support masking of shift counts when the processor architecture mandates it. Reviewed-by: kvn, never ! make/linux/makefiles/adlc.make ! make/linux/makefiles/gcc.make ! make/linux/makefiles/rules.make ! make/linux/makefiles/sparcWorks.make ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/adlc/main.cpp ! src/share/vm/opto/chaitin.cpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/lcm.cpp ! src/share/vm/opto/matcher.cpp ! 
src/share/vm/opto/matcher.hpp Changeset: 244bf8afbbd3 Author: roland Date: 2011-03-26 08:31 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/244bf8afbbd3 Merge Changeset: 1927db75dd85 Author: never Date: 2011-03-27 00:00 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1927db75dd85 7024475: loop doesn't terminate when compiled Reviewed-by: kvn ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/opto/idealGraphPrinter.cpp ! src/share/vm/opto/idealGraphPrinter.hpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/node.cpp ! src/share/vm/runtime/globals.hpp + test/compiler/7024475/Test7024475.java Changeset: b40d4fa697bf Author: iveresov Date: 2011-03-27 13:17 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b40d4fa697bf 6964776: c2 should ensure the polling page is reachable on 64 bit Summary: Materialize the pointer to the polling page in a register instead of using rip-relative addressing when the distance from the code cache is larger than disp32. Reviewed-by: never, kvn ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/nativeInst_x86.hpp ! src/cpu/x86/vm/relocInfo_x86.cpp ! src/cpu/x86/vm/x86_64.ad Changeset: 3d58a4983660 Author: twisti Date: 2011-03-28 03:58 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3d58a4983660 7022998: JSR 292 recursive method handle calls inline themselves infinitely Reviewed-by: never, kvn ! src/cpu/sparc/vm/sharedRuntime_sparc.cpp ! src/cpu/x86/vm/sharedRuntime_x86_32.cpp ! src/cpu/x86/vm/sharedRuntime_x86_64.cpp ! src/share/vm/c1/c1_GraphBuilder.cpp ! src/share/vm/code/nmethod.cpp ! src/share/vm/code/nmethod.hpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/compiler/compileBroker.hpp ! src/share/vm/opto/bytecodeInfo.cpp ! src/share/vm/opto/doCall.cpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/runtime/sharedRuntime.cpp ! 
src/share/vm/runtime/sharedRuntime.hpp Changeset: a988a7bb3b8a Author: kvn Date: 2011-03-29 09:11 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a988a7bb3b8a 7032133: Enable sse4.2 for new AMD processors Summary: New AMD processors support sse4.2. Enable corresponding instructions in Hotspot. Reviewed-by: kvn Contributed-by: eric.caspole at amd.com ! src/cpu/x86/vm/vm_version_x86.cpp Changeset: b1c22848507b Author: iveresov Date: 2011-03-29 17:35 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b1c22848507b 6741940: Nonvolatile XMM registers not preserved across JNI calls Summary: Save xmm6-xmm15 in call stub on win64 Reviewed-by: kvn, never ! src/cpu/x86/vm/frame_x86.hpp ! src/cpu/x86/vm/stubGenerator_x86_64.cpp Changeset: 2cd0180da6e1 Author: never Date: 2011-03-29 22:05 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/2cd0180da6e1 7032306: Fastdebug build failure on Solaris with SS11 compilers Reviewed-by: kvn, iveresov ! src/share/vm/oops/instanceKlass.cpp Changeset: 348c0df561a9 Author: iveresov Date: 2011-03-29 22:25 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/348c0df561a9 7026307: DEBUG MESSAGE: broken null klass on amd64 Summary: Correct typo introduces in 7020521 Reviewed-by: never, kvn ! src/cpu/x86/vm/stubGenerator_x86_64.cpp Changeset: fe1dbd98e18f Author: iveresov Date: 2011-03-30 03:48 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/fe1dbd98e18f Merge Changeset: 63997f575155 Author: never Date: 2011-03-30 07:47 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/63997f575155 7031614: jmap -permstat fails with java.lang.InternalError in sun.jvm.hotspot.oops.OopField.getValue Reviewed-by: kvn, dcubed ! agent/src/share/classes/sun/jvm/hotspot/jdi/ClassObjectReferenceImpl.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Instance.java ! 
agent/src/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java + agent/src/share/classes/sun/jvm/hotspot/oops/InstanceMirrorKlass.java ! agent/src/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Oop.java ! agent/src/share/classes/sun/jvm/hotspot/oops/OopUtilities.java + agent/src/share/classes/sun/jvm/hotspot/oops/java_lang_Class.java ! agent/src/share/classes/sun/jvm/hotspot/runtime/VM.java ! agent/src/share/classes/sun/jvm/hotspot/tools/FinalizerInfo.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/HeapGXLWriter.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/HeapHprofBinWriter.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/ReversePtrsAnalysis.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/soql/JSJavaFactoryImpl.java ! src/share/vm/oops/instanceMirrorKlass.hpp ! src/share/vm/runtime/vmStructs.cpp Changeset: f9424955eb18 Author: kvn Date: 2011-03-30 12:08 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/f9424955eb18 7029152: Ideal nodes for String intrinsics miss memory edge optimization Summary: In Ideal() method of String intrinsics nodes look for TypeAryPtr::CHARS memory slice if memory is MergeMem. Do not unroll a loop with String intrinsics code. Reviewed-by: never ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/memnode.hpp + test/compiler/7029152/Test.java Changeset: e2eb7f986c64 Author: iveresov Date: 2011-03-30 15:22 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/e2eb7f986c64 6564610: assert(UseCompiler || CompileTheWorld, "UseCompiler should be set by now.") Summary: Remove invalid asserts Reviewed-by: never, kvn ! 
src/share/vm/runtime/compilationPolicy.cpp Changeset: 9d343b8113db Author: iveresov Date: 2011-03-30 18:55 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/9d343b8113db Merge Changeset: 09f96c3ff1ad Author: twisti Date: 2011-03-31 00:27 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/09f96c3ff1ad 7032388: guarantee(VM_Version::supports_cmov()) failed: illegal instruction on i586 after 6919934 Summary: 6919934 added some unguarded cmov instructions which hit a guarantee on older hardware. Reviewed-by: never, iveresov, kvn, phh ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/c1_Runtime1_x86.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp Changeset: 38fea01eb669 Author: twisti Date: 2011-03-31 02:31 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/38fea01eb669 6817525: turn on method handle functionality by default for JSR 292 Summary: After appropriate testing, we need to turn on EnableMethodHandles and EnableInvokeDynamic by default. Reviewed-by: never, kvn, jrose, phh ! src/cpu/sparc/vm/cppInterpreter_sparc.cpp ! src/cpu/sparc/vm/interp_masm_sparc.cpp ! src/cpu/sparc/vm/interpreter_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/interp_masm_x86_32.cpp ! src/cpu/x86/vm/interp_masm_x86_64.cpp ! src/cpu/x86/vm/interpreter_x86_32.cpp ! src/cpu/x86/vm/interpreter_x86_64.cpp ! src/cpu/x86/vm/templateInterpreter_x86_32.cpp ! src/cpu/x86/vm/templateInterpreter_x86_64.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/classFileParser.hpp ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/classfile/systemDictionary.hpp ! src/share/vm/interpreter/linkResolver.cpp ! src/share/vm/oops/constantPoolKlass.cpp ! src/share/vm/oops/constantPoolOop.hpp ! 
src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/klass.cpp ! src/share/vm/oops/methodOop.hpp ! src/share/vm/prims/methodHandles.cpp ! src/share/vm/prims/unsafe.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/sharedRuntime.cpp ! src/share/vm/runtime/thread.cpp Changeset: cb162b348743 Author: kvn Date: 2011-03-31 13:22 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/cb162b348743 7032696: Fix for 7029152 broke VM Summary: StrIntrinsicNode::Ideal() should not optimize memory during Parse. Reviewed-by: jrose, never ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/memnode.cpp Changeset: 352622fd140a Author: never Date: 2011-03-31 14:00 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/352622fd140a 7032129: Native memory usage grow unexpectedly for vm/oom/*InternedString tests Reviewed-by: kvn, kamg, jcoomes ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/javaClasses.hpp ! src/share/vm/classfile/symbolTable.cpp ! src/share/vm/classfile/symbolTable.hpp ! src/share/vm/memory/dump.cpp Changeset: 2a5104162671 Author: never Date: 2011-03-31 15:30 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/2a5104162671 Merge Changeset: 8010c8c623ac Author: kvn Date: 2011-03-31 16:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/8010c8c623ac 7032849: 7022998 changes broke hs_err compile task print Summary: Initialize the time stamp on ostream used for hs_err dumping. Reviewed-by: never ! src/share/vm/utilities/ostream.cpp Changeset: 6b9eb6d07c62 Author: kvn Date: 2011-04-01 15:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/6b9eb6d07c62 Merge Changeset: a1615ff22854 Author: schien Date: 2011-03-31 18:14 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a1615ff22854 Added tag jdk7-b136 for changeset bd586e392d93 ! 
.hgtags Changeset: 2ffcf94550d5 Author: trims Date: 2011-04-01 12:06 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/2ffcf94550d5 Added tag hs21-b06 for changeset bd586e392d93 ! .hgtags Changeset: 7ea7c9c0305c Author: trims Date: 2011-04-01 20:44 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/7ea7c9c0305c Merge Changeset: 2dbcb4a4d8da Author: trims Date: 2011-04-01 20:44 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/2dbcb4a4d8da 7033237: Bump the HS21 build number to 07 Summary: Update the HS21 build number to 07 Reviewed-by: jcoomes ! make/hotspot_version Changeset: 1d1603768966 Author: trims Date: 2011-04-05 14:12 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1d1603768966 7010070: Update all 2010 Oracle-changed OpenJDK files to have the proper copyright dates - second pass Summary: Update the copyright to be 2010 on all changed files in OpenJDK Reviewed-by: ohair ! agent/src/share/classes/sun/jvm/hotspot/CommandProcessor.java ! agent/src/share/classes/sun/jvm/hotspot/HotSpotTypeDataBase.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeLoadConstant.java ! agent/src/share/classes/sun/jvm/hotspot/interpreter/BytecodeWithKlass.java ! agent/src/share/classes/sun/jvm/hotspot/memory/DictionaryEntry.java ! agent/src/share/classes/sun/jvm/hotspot/memory/LoaderConstraintEntry.java ! agent/src/share/classes/sun/jvm/hotspot/memory/PlaceholderEntry.java ! agent/src/share/classes/sun/jvm/hotspot/memory/StringTable.java ! agent/src/share/classes/sun/jvm/hotspot/memory/SymbolTable.java ! agent/src/share/classes/sun/jvm/hotspot/oops/ConstantPool.java ! agent/src/share/classes/sun/jvm/hotspot/oops/GenerateOopMap.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Klass.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Method.java ! agent/src/share/classes/sun/jvm/hotspot/oops/Symbol.java ! agent/src/share/classes/sun/jvm/hotspot/tools/jcore/ClassWriter.java ! 
agent/src/share/classes/sun/jvm/hotspot/types/Field.java ! agent/src/share/classes/sun/jvm/hotspot/ui/classbrowser/HTMLGenerator.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/Hashtable.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/HashtableEntry.java ! make/linux/Makefile ! make/linux/makefiles/arm.make ! make/linux/makefiles/gcc.make ! make/linux/makefiles/mapfile-vers-debug ! make/linux/makefiles/mapfile-vers-product ! make/linux/makefiles/ppc.make ! make/linux/makefiles/sparcWorks.make ! make/linux/makefiles/top.make ! make/linux/makefiles/vm.make ! make/solaris/makefiles/adlc.make ! make/solaris/makefiles/buildtree.make ! make/solaris/makefiles/rules.make ! make/solaris/makefiles/top.make ! make/solaris/makefiles/vm.make ! make/windows/create_obj_files.sh ! make/windows/makefiles/launcher.make ! make/windows/makefiles/vm.make ! src/cpu/sparc/vm/c1_MacroAssembler_sparc.cpp ! src/cpu/sparc/vm/dump_sparc.cpp ! src/cpu/sparc/vm/jni_sparc.h ! src/cpu/sparc/vm/nativeInst_sparc.cpp ! src/cpu/sparc/vm/nativeInst_sparc.hpp ! src/cpu/sparc/vm/relocInfo_sparc.cpp ! src/cpu/x86/vm/jni_x86.h ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/zero/vm/jni_zero.h ! src/os/linux/vm/jvm_linux.cpp ! src/os/linux/vm/osThread_linux.cpp ! src/os/linux/vm/os_linux.inline.hpp ! src/os/linux/vm/thread_linux.inline.hpp ! src/os/solaris/dtrace/generateJvmOffsets.cpp ! src/os/solaris/dtrace/jhelper.d ! src/os/solaris/dtrace/libjvm_db.c ! src/os/solaris/vm/dtraceJSDT_solaris.cpp ! src/os_cpu/linux_sparc/vm/os_linux_sparc.cpp ! src/os_cpu/linux_x86/vm/os_linux_x86.cpp ! src/os_cpu/linux_zero/vm/os_linux_zero.cpp ! src/os_cpu/solaris_sparc/vm/os_solaris_sparc.cpp ! src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp ! src/share/tools/hsdis/hsdis-demo.c ! src/share/tools/hsdis/hsdis.c ! src/share/vm/adlc/main.cpp ! src/share/vm/adlc/output_c.cpp ! src/share/vm/asm/assembler.cpp ! src/share/vm/asm/assembler.hpp ! src/share/vm/asm/codeBuffer.hpp ! src/share/vm/c1/c1_Compilation.hpp ! 
src/share/vm/c1/c1_Defs.hpp ! src/share/vm/c1/c1_FpuStackSim.hpp ! src/share/vm/c1/c1_FrameMap.cpp ! src/share/vm/c1/c1_FrameMap.hpp ! src/share/vm/c1/c1_LIRAssembler.cpp ! src/share/vm/c1/c1_LIRAssembler.hpp ! src/share/vm/c1/c1_LinearScan.cpp ! src/share/vm/c1/c1_LinearScan.hpp ! src/share/vm/c1/c1_MacroAssembler.hpp ! src/share/vm/c1/c1_globals.hpp ! src/share/vm/ci/ciClassList.hpp ! src/share/vm/ci/ciEnv.cpp ! src/share/vm/ci/ciEnv.hpp ! src/share/vm/ci/ciKlass.cpp ! src/share/vm/ci/ciObjArrayKlass.cpp ! src/share/vm/ci/ciObject.hpp ! src/share/vm/ci/ciObjectFactory.hpp ! src/share/vm/ci/ciSignature.cpp ! src/share/vm/ci/ciSignature.hpp ! src/share/vm/ci/ciSymbol.cpp ! src/share/vm/ci/ciSymbol.hpp ! src/share/vm/ci/compilerInterface.hpp ! src/share/vm/classfile/classFileError.cpp ! src/share/vm/classfile/classFileStream.hpp ! src/share/vm/classfile/classLoader.cpp ! src/share/vm/classfile/classLoader.hpp ! src/share/vm/classfile/dictionary.cpp ! src/share/vm/classfile/dictionary.hpp ! src/share/vm/classfile/javaAssertions.cpp ! src/share/vm/classfile/loaderConstraints.cpp ! src/share/vm/classfile/loaderConstraints.hpp ! src/share/vm/classfile/placeholders.cpp ! src/share/vm/classfile/placeholders.hpp ! src/share/vm/classfile/resolutionErrors.cpp ! src/share/vm/classfile/resolutionErrors.hpp ! src/share/vm/classfile/stackMapFrame.cpp ! src/share/vm/classfile/stackMapFrame.hpp ! src/share/vm/classfile/stackMapTable.cpp ! src/share/vm/classfile/stackMapTable.hpp ! src/share/vm/classfile/verificationType.cpp ! src/share/vm/classfile/verificationType.hpp ! src/share/vm/classfile/verifier.hpp ! src/share/vm/code/codeBlob.cpp ! src/share/vm/code/codeCache.hpp ! src/share/vm/code/compiledIC.cpp ! src/share/vm/code/compiledIC.hpp ! src/share/vm/code/dependencies.cpp ! src/share/vm/code/icBuffer.cpp ! src/share/vm/code/relocInfo.cpp ! src/share/vm/code/relocInfo.hpp ! src/share/vm/code/vmreg.hpp ! src/share/vm/compiler/compileLog.hpp ! 
src/share/vm/compiler/compilerOracle.cpp ! src/share/vm/compiler/compilerOracle.hpp ! src/share/vm/compiler/disassembler.cpp ! src/share/vm/compiler/disassembler.hpp ! src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp ! src/share/vm/gc_implementation/g1/g1SATBCardTableModRefBS.hpp ! src/share/vm/gc_implementation/g1/vm_operations_g1.cpp ! src/share/vm/gc_implementation/parallelScavenge/pcTasks.cpp ! src/share/vm/gc_implementation/parallelScavenge/pcTasks.hpp ! src/share/vm/gc_implementation/parallelScavenge/psPromotionManager.cpp ! src/share/vm/gc_implementation/parallelScavenge/psScavenge.inline.hpp ! src/share/vm/gc_implementation/parallelScavenge/psTasks.cpp ! src/share/vm/gc_implementation/shared/allocationStats.hpp ! src/share/vm/gc_implementation/shared/concurrentGCThread.cpp ! src/share/vm/gc_implementation/shared/gcUtil.cpp ! src/share/vm/gc_implementation/shared/markSweep.cpp ! src/share/vm/gc_interface/collectedHeap.cpp ! src/share/vm/interpreter/bytecodeInterpreter.hpp ! src/share/vm/interpreter/bytecodeInterpreter.inline.hpp ! src/share/vm/interpreter/cppInterpreter.hpp ! src/share/vm/interpreter/cppInterpreterGenerator.hpp ! src/share/vm/interpreter/interpreter.hpp ! src/share/vm/interpreter/interpreterGenerator.hpp ! src/share/vm/interpreter/linkResolver.hpp ! src/share/vm/interpreter/templateInterpreter.hpp ! src/share/vm/interpreter/templateInterpreterGenerator.hpp ! src/share/vm/interpreter/templateTable.hpp ! src/share/vm/memory/barrierSet.cpp ! src/share/vm/memory/classify.cpp ! src/share/vm/memory/compactingPermGenGen.cpp ! src/share/vm/memory/genCollectedHeap.cpp ! src/share/vm/memory/genMarkSweep.cpp ! src/share/vm/memory/heap.cpp ! src/share/vm/memory/heapInspection.cpp ! src/share/vm/memory/iterator.hpp ! src/share/vm/memory/restore.cpp ! src/share/vm/memory/serialize.cpp ! src/share/vm/memory/sharedHeap.cpp ! src/share/vm/memory/universe.hpp ! src/share/vm/oops/arrayKlass.cpp ! src/share/vm/oops/arrayOop.cpp ! 
src/share/vm/oops/cpCacheKlass.hpp ! src/share/vm/oops/generateOopMap.hpp ! src/share/vm/oops/klass.cpp ! src/share/vm/oops/markOop.hpp ! src/share/vm/oops/symbol.cpp ! src/share/vm/oops/symbol.hpp ! src/share/vm/oops/typeArrayOop.hpp ! src/share/vm/opto/buildOopMap.cpp ! src/share/vm/opto/c2_globals.hpp ! src/share/vm/opto/c2compiler.cpp ! src/share/vm/opto/chaitin.cpp ! src/share/vm/opto/gcm.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/graphKit.hpp ! src/share/vm/opto/idealKit.cpp ! src/share/vm/opto/idealKit.hpp ! src/share/vm/opto/lcm.cpp ! src/share/vm/opto/locknode.hpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopUnswitch.cpp ! src/share/vm/opto/loopopts.cpp ! src/share/vm/opto/matcher.cpp ! src/share/vm/opto/matcher.hpp ! src/share/vm/opto/memnode.hpp ! src/share/vm/opto/node.cpp ! src/share/vm/opto/output.cpp ! src/share/vm/opto/output.hpp ! src/share/vm/opto/parse1.cpp ! src/share/vm/opto/parse2.cpp ! src/share/vm/opto/regmask.cpp ! src/share/vm/opto/regmask.hpp ! src/share/vm/opto/runtime.cpp ! src/share/vm/opto/stringopts.cpp ! src/share/vm/opto/type.hpp ! src/share/vm/precompiled.hpp ! src/share/vm/prims/forte.cpp ! src/share/vm/prims/jni_md.h ! src/share/vm/prims/jvm_misc.hpp ! src/share/vm/prims/jvmtiClassFileReconstituter.cpp ! src/share/vm/prims/jvmtiClassFileReconstituter.hpp ! src/share/vm/prims/jvmtiEventController.cpp ! src/share/vm/prims/jvmtiRedefineClasses.hpp ! src/share/vm/prims/jvmtiTagMap.hpp ! src/share/vm/runtime/deoptimization.hpp ! src/share/vm/runtime/dtraceJSDT.hpp ! src/share/vm/runtime/fieldDescriptor.hpp ! src/share/vm/runtime/fieldType.cpp ! src/share/vm/runtime/fieldType.hpp ! src/share/vm/runtime/fprofiler.cpp ! src/share/vm/runtime/fprofiler.hpp ! src/share/vm/runtime/frame.hpp ! src/share/vm/runtime/frame.inline.hpp ! src/share/vm/runtime/handles.hpp ! src/share/vm/runtime/icache.hpp ! src/share/vm/runtime/interfaceSupport.cpp ! src/share/vm/runtime/interfaceSupport.hpp ! 
src/share/vm/runtime/javaCalls.cpp ! src/share/vm/runtime/javaCalls.hpp ! src/share/vm/runtime/javaFrameAnchor.hpp ! src/share/vm/runtime/jniHandles.cpp ! src/share/vm/runtime/objectMonitor.cpp ! src/share/vm/runtime/reflection.hpp ! src/share/vm/runtime/reflectionUtils.hpp ! src/share/vm/runtime/registerMap.hpp ! src/share/vm/runtime/rframe.cpp ! src/share/vm/runtime/safepoint.cpp ! src/share/vm/runtime/signature.cpp ! src/share/vm/runtime/signature.hpp ! src/share/vm/runtime/stackValueCollection.cpp ! src/share/vm/runtime/statSampler.cpp ! src/share/vm/runtime/stubCodeGenerator.cpp ! src/share/vm/runtime/sweeper.cpp ! src/share/vm/runtime/synchronizer.cpp ! src/share/vm/runtime/threadLocalStorage.hpp ! src/share/vm/runtime/vframe.cpp ! src/share/vm/runtime/vmStructs.hpp ! src/share/vm/runtime/vm_operations.cpp ! src/share/vm/runtime/vm_operations.hpp ! src/share/vm/runtime/vm_version.cpp ! src/share/vm/runtime/vm_version.hpp ! src/share/vm/services/attachListener.cpp ! src/share/vm/services/attachListener.hpp ! src/share/vm/services/classLoadingService.cpp ! src/share/vm/services/management.hpp ! src/share/vm/services/memoryManager.cpp ! src/share/vm/services/memoryPool.cpp ! src/share/vm/services/memoryService.cpp ! src/share/vm/shark/sharkNativeWrapper.cpp ! src/share/vm/utilities/copy.hpp ! src/share/vm/utilities/debug.cpp ! src/share/vm/utilities/debug.hpp ! src/share/vm/utilities/elfSymbolTable.cpp ! src/share/vm/utilities/exceptions.cpp ! src/share/vm/utilities/exceptions.hpp ! src/share/vm/utilities/globalDefinitions_gcc.hpp ! src/share/vm/utilities/globalDefinitions_sparcWorks.hpp ! src/share/vm/utilities/globalDefinitions_visCPP.hpp ! src/share/vm/utilities/hashtable.cpp ! src/share/vm/utilities/hashtable.hpp ! src/share/vm/utilities/hashtable.inline.hpp ! src/share/vm/utilities/ostream.hpp ! src/share/vm/utilities/taskqueue.hpp ! src/share/vm/utilities/utf8.cpp ! src/share/vm/utilities/utf8.hpp ! src/share/vm/utilities/xmlstream.cpp ! 
src/share/vm/utilities/xmlstream.hpp

Changeset: 4f978fb6c81a
Author:    jmasa
Date:      2011-04-06 16:02 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/4f978fb6c81a

Merge

! src/share/vm/oops/constantPoolKlass.cpp
! src/share/vm/runtime/globals.hpp

Changeset: 92add02409c9
Author:    jmasa
Date:      2011-04-08 14:19 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/92add02409c9

Merge

! src/cpu/sparc/vm/cppInterpreter_sparc.cpp
! src/cpu/sparc/vm/interpreter_sparc.cpp
! src/cpu/sparc/vm/templateTable_sparc.cpp
! src/cpu/x86/vm/assembler_x86.cpp
! src/cpu/x86/vm/assembler_x86.hpp
! src/cpu/x86/vm/templateInterpreter_x86_32.cpp
! src/cpu/x86/vm/templateInterpreter_x86_64.cpp
! src/cpu/x86/vm/templateTable_x86_32.cpp
! src/cpu/x86/vm/templateTable_x86_64.cpp
! src/share/vm/c1/c1_GraphBuilder.cpp
! src/share/vm/gc_implementation/g1/g1SATBCardTableModRefBS.hpp
! src/share/vm/oops/instanceKlass.hpp
! src/share/vm/opto/compile.cpp
! src/share/vm/opto/graphKit.cpp
! src/share/vm/opto/graphKit.hpp
! src/share/vm/opto/lcm.cpp
! src/share/vm/opto/library_call.cpp
! src/share/vm/opto/memnode.cpp
! src/share/vm/opto/output.cpp
! src/share/vm/prims/unsafe.cpp

Changeset: f177ddd59c60
Author:    jmasa
Date:      2011-04-08 14:53 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/f177ddd59c60

Merge

! src/share/vm/runtime/arguments.cpp
! src/share/vm/runtime/globals.hpp

From tony.printezis at oracle.com Tue Apr 12 12:36:46 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 12 Apr 2011 15:36:46 -0400
Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S)
Message-ID: <4DA4A9CE.5070107@oracle.com>

Hi,

Could I get a couple of people to look at this? (I'd like to push this this week if possible)

http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/

The actual fix is reasonably small (leave / join the SuspendibleThreadSet only if we are in concurrent mode).
Most of the changes are new infrastructure to cause a fixed number of overflows during marking (in non-product builds of course) to stress the overflow code. This was the only way I could reliably reproduce the failure.

This did uncover a couple of extra issues which I also fixed:

- If we overflow during remark we should not actually deal with it during remark but we should abort the remark pause and restart a concurrent mark phase. For some reason we were not doing that. I fixed that (for this I had to ensure that the overflow flag is not cleared when we exit the do_marking_step() method).
- Because we were clearing the overflow, it was also possible that the workers would deadlock (for that to happen a worker had to finish handling one overflow and immediately raise another one, so it was highly unlikely to occur in practice; good to find it and eliminate it though).

I've already tested it, I'll run more tests overnight.

Tony

From yumin.qi at oracle.com Tue Apr 12 11:41:43 2011
From: yumin.qi at oracle.com (yumin.qi at oracle.com)
Date: Tue, 12 Apr 2011 11:41:43 -0700
Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications
Message-ID: <4DA49CE7.7020104@oracle.com>

http://cr.openjdk.java.net/~minqi/6941923/webrev.00/

Summary:

This is an RFE for GC log rotation, to prevent a long-running Java application from flooding the disk with GC output.
The implementation supplies three JVM options:
1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file
2) -XX:MaxGCLogFileNumbers= sets the limit on the number of rotation files; defaults to 1, maximum 1024.
3) -XX:GCLogFileSize= lets the user configure how big each log file may grow. Defaults to 10M. If the given value is less than 512K, it is raised to the 512K minimum.
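
As a rough illustration of the bounds listed in options 2) and 3) above, the checks could be sketched as follows. This is Python, not the webrev's C++ code. The function name is hypothetical, and two behaviours are assumptions made for the sketch: a non-positive file count is rejected, and a count above 1024 is capped rather than rejected.

```python
K = 1024
M = 1024 * K

DEFAULT_FILE_COUNT = 1      # "default to 1"
MAX_FILE_COUNT = 1024       # "maximum set to 1024"
DEFAULT_FILE_SIZE = 10 * M  # "Default to 10M"
MIN_FILE_SIZE = 512 * K     # "Minimum set to 512K"


def normalize_rotation_options(file_count=None, file_size=None):
    """Return (file_count, file_size_in_bytes) after applying the stated bounds.

    Hypothetical helper: None means "option not given", so the default applies.
    """
    if file_count is None:
        file_count = DEFAULT_FILE_COUNT
    if file_count <= 0:
        # Assumption: a non-positive count is an error, not silently fixed.
        raise ValueError("file count should be > 0")
    # Assumption: counts above the maximum are capped.
    file_count = min(file_count, MAX_FILE_COUNT)

    if file_size is None:
        file_size = DEFAULT_FILE_SIZE
    # Sizes below the minimum are raised to 512K, as the summary describes.
    file_size = max(file_size, MIN_FILE_SIZE)
    return file_count, file_size


print(normalize_rotation_options())             # defaults
print(normalize_rotation_options(5, 100 * K))   # size raised to 512K
```

The sketch only models the numeric limits; it says nothing about how the JVM parses the option strings.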
If MaxGCLogFileNumbers=1, output rotates within the same file, i.e. writing restarts from the beginning of the file when the file reaches its size cap. With MaxGCLogFileNumbers > 1, files are rotated sequentially as each one reaches the cap: file, file.1, file.2, ..., file. then back to file, file.1, ...
Whether rotation is needed is checked at the end of a safepoint.

Tested with multiple GC choices.

Thanks
Yumin

From suenaga.yasumasa at oss.ntt.co.jp Tue Apr 12 17:55:48 2011
From: suenaga.yasumasa at oss.ntt.co.jp (Yasumasa Suenaga)
Date: Wed, 13 Apr 2011 09:55:48 +0900
Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications
In-Reply-To: <4DA49CE7.7020104@oracle.com>
References: <4DA49CE7.7020104@oracle.com>
Message-ID: <4DA4F494.1000409@oss.ntt.co.jp>

Hi Yumin,

I would like to be able to trigger GC log rotation from outside the JVM.
I think that it can be implemented using AttachListener (AttachOperationFunctionInfo []) and an invoker tool.

Could you look into it?

Best regards,

Yasumasa

(2011/04/13 3:41), yumin.qi at oracle.com wrote:
> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/
>
> Summary:
>
> This is a RFE request for having a GC log rotation to prevent Java application from over flooding disk with GC output running for long time.
> In the implementation, supply three JVM options
> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file
> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, default to 1, maximum set to 1024.
> 3) -XX:GCLogFileSize= can be configured by user how big the file size should be. Default to 10M. Minimum set to 512K if given from option is less than 512K.
>
> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write from beginning of the file when reach cap of the file; with MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap in file, file.1, file.2, ..., file. then back to file, file.1, ...
> Check if rotation needed at safepoint ending.
>
> Tested with multiple GC choices.
> > Thanks > Yumin > > > From jesper.wilhelmsson at oracle.com Wed Apr 13 06:13:59 2011 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Wed, 13 Apr 2011 15:13:59 +0200 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA49CE7.7020104@oracle.com> References: <4DA49CE7.7020104@oracle.com> Message-ID: <4DA5A197.2070504@oracle.com> Yumin, I can't see from the flag name of MaxGCLogFileNumbers what it is supposed to be used for. A better name could perhaps be: MaxGCLogNumberOfFiles. In globals.hpp I find the descriptions in the strings a bit weird. Maybe these strings are used in some non-standard way, but from a pure C point of view the strings will be concatenated into: "Prevent large gclog file for long app running must use with -Xloggc:file" "Maximum number of gclog file roration Default rotate in 1 file" "Default log size 10 Megabytes only used when UseGCLogFileRotation set" I would rather see something like: "Prevent large gclog file for long running app. Requires -Xloggc:file" "Maximum number of gclog files in rotation. Default: 1 file" "Maximum gclog file size. Default: 10MB. Only used with UseGCLogFileRotation" In arguments.cpp there are also a few typos: "-XX:+UseGCLogRotaion must be with -Xloggc:filename in front\n" could be: "You must specify -Xloggc:filename before -XX:+UseGCLogRotaion" "Invalid MaxGCLogFileNumbers(should be > 0): %s\n" add a space: "Invalid MaxGCLogFileNumbers (should be > 0): %s\n" In ostream.hpp there is a comment on line 200: // current logfile rotation number, from 1 to MaxGCLogFileNumbers-1 Is this correct? 
According to the implementation in ostream.cpp it should probably be from 1 to MaxGCLogFileNumbers since you check for strictly greater than in line 403: if (_cur_file_num > MaxGCLogFileNumbers) _cur_file_num = 1; I would prefer if you changed it to be from 0 to MaxGCLogFileNumbers-1 though since it is more of a standard to start at 0, and it would make the code a bit easier to understand. You would also get one less operation in creating the filename :-) Also, Oracle secure coding guidelines disallows the use of sprintf. Use snprintf instead. (ostream.cpp lines 365, 406 and 408) Cheers, /Jesper On 04/12/2011 08:41 PM, yumin.qi at oracle.com wrote: > http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ > > > Summary: > > This is a RFE request for having a GC log rotation to prevent Java application > from over flooding disk with GC output running for long time. > In the implementation, supply three JVM options > 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file > 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, default to 1, > maximum set to 1024. > 3) -XX:GCLogFileSize= can be configured by user how big the file size should > be. Default to 10M. Minimum set to 512K if given from option is less than 512K. > > If MaxGCLogFileNumbers=1, rotating output in same file, i.e write from > beginning of the file when reach cap of the file; with MaxGCLogFileNumbers > 1 > rotating files sequentially after reach cap in file, file.1, file.2, ..., > file. then back to file, file.1, ... > Check if rotation needed at safepoint ending. > > Tested with multiple GC choices. 
> > Thanks > Yumin > > > From jon.masamitsu at oracle.com Wed Apr 13 08:47:17 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 13 Apr 2011 08:47:17 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA4F494.1000409@oss.ntt.co.jp> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> Message-ID: <4DA5C585.7000304@oracle.com> Yasumasa , Sounds like this is a request for an enhancement of the feature that Yumin has implemented. Or rather maybe just a way of turning it on? Or is it something more basic to the feature? Jon On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: > Hi Yumin, > > I would like to rotate GC log triggered by the outside. > I think that it can be implemented using AttachLister > (AttachOperationFunctionInfo []) > and invoker tool. > > Could you examine it? > > > Best regards, > > Yasumasa > > (2011/04/13 3:41), yumin.qi at oracle.com wrote: >> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >> >> >> Summary: >> >> This is a RFE request for having a GC log rotation to prevent Java >> application from over flooding disk with GC output running for long >> time. >> In the implementation, supply three JVM options >> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, >> default to 1, maximum set to 1024. >> 3) -XX:GCLogFileSize= can be configured by user how big the file size >> should be. Default to 10M. Minimum set to 512K if given from option >> is less than 512K. >> >> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write >> from beginning of the file when reach cap of the file; with >> MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap >> in file, file.1, file.2, ..., file. then back >> to file, file.1, ... >> Check if rotation needed at safepoint ending. >> >> Tested with multiple GC choices. 
>> >> Thanks >> Yumin >> >> >> From yumin.qi at oracle.com Wed Apr 13 09:54:58 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Wed, 13 Apr 2011 09:54:58 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA5C585.7000304@oracle.com> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> <4DA5C585.7000304@oracle.com> Message-ID: <4DA5D562.2080903@oracle.com> Current implementation will not work with outside tool, ostream_init called once in JVM starting and flags will not be changed rest of running supposedly. Now need to think of what if flags changed by outside tool. Need reconsideration of changing flags from outside tool. Thanks Yumin On 4/13/2011 8:47 AM, Jon Masamitsu wrote: > Yasumasa , > > Sounds like this is a request for an enhancement of the > feature that Yumin has implemented. Or rather maybe > just a way of turning it on? Or is it something more basic > to the feature? > > Jon > > On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: >> Hi Yumin, >> >> I would like to rotate GC log triggered by the outside. >> I think that it can be implemented using AttachLister >> (AttachOperationFunctionInfo []) >> and invoker tool. >> >> Could you examine it? >> >> >> Best regards, >> >> Yasumasa >> >> (2011/04/13 3:41), yumin.qi at oracle.com wrote: >>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >>> >>> >>> Summary: >>> >>> This is a RFE request for having a GC log rotation to prevent Java >>> application from over flooding disk with GC output running for long >>> time. >>> In the implementation, supply three JVM options >>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, >>> default to 1, maximum set to 1024. >>> 3) -XX:GCLogFileSize= can be configured by user how big the file >>> size should be. Default to 10M. 
Minimum set to 512K if given from >>> option is less than 512K. >>> >>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write >>> from beginning of the file when reach cap of the file; with >>> MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap >>> in file, file.1, file.2, ..., file. then back >>> to file, file.1, ... >>> Check if rotation needed at safepoint ending. >>> >>> Tested with multiple GC choices. >>> >>> Thanks >>> Yumin >>> >>> >>> From john.cuthbertson at oracle.com Wed Apr 13 10:08:48 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 13 Apr 2011 10:08:48 -0700 Subject: RFR(XS): 7035117: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 In-Reply-To: <4D9F4E23.6060301@oracle.com> References: <4D9F4E23.6060301@oracle.com> Message-ID: <4DA5D8A0.8060703@oracle.com> Hi Everyone, I have a new webrev for this CR here: http://cr.openjdk.java.net/~johnc/7035117/webrev.1/ The changes include Tom's suggestion to use ConX in C2 code and the fix for the same test in tiered compilation/C1. Testing: the failing test case, nsk tests, jprt. Thanks, JohnC On 04/08/11 11:04, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers to look over the fix for this CR? > The webrev can be found at: > http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. > > The problem is that the node representing the offset (in an > Unsafe.getObject compilation) could be typed as a long and generating > the compare of offset against java_lang_ref_Reference::referent_offset > (typed as an int) caused an assertion failure about the mis-matching > types. > > The fix is to generate a suitably typed constant based upon the type > of "offset". > > Tested using the failing test case from the nightly tests. 
> > Thanks, > > JohnC > From john.cuthbertson at oracle.com Wed Apr 13 10:16:32 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 13 Apr 2011 10:16:32 -0700 Subject: RFR (XS): 7036021: G1: build failure on win64 and linux with hs21 in jdk6 build environment Message-ID: <4DA5DA70.9060500@oracle.com> Hi Everyone, Can I have a couple of volunteers to review these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7036021/webrev.0/. Issue: Some of the the changes for 7026932 and 7009266 do cause compilation errors when built with the jdk6 build tools. The first is missing parentheses around an expression and the second is passing an uncasted negative one to a unsigned which has been substituted with max_juint. Thanks, JohnC From y.s.ramakrishna at oracle.com Wed Apr 13 10:24:19 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 13 Apr 2011 10:24:19 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA5D562.2080903@oracle.com> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> <4DA5C585.7000304@oracle.com> <4DA5D562.2080903@oracle.com> Message-ID: <4DA5DC43.3060201@oracle.com> A possibility to pursue: You could make the appropriate flags manageable, and re-sample/re-snapshot into private variable (and when appropriate re-initialize the streams) at each safepoint where you now just check for log rotation. -- ramki On 04/13/11 09:54, yumin.qi at oracle.com wrote: > Current implementation will not work with outside tool, ostream_init > called once in JVM starting and flags will not be changed rest of > running supposedly. Now need to think of what if flags changed by > outside tool. Need reconsideration of changing flags from outside tool. 
> > Thanks > Yumin > > On 4/13/2011 8:47 AM, Jon Masamitsu wrote: >> Yasumasa , >> >> Sounds like this is a request for an enhancement of the >> feature that Yumin has implemented. Or rather maybe >> just a way of turning it on? Or is it something more basic >> to the feature? >> >> Jon >> >> On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: >>> Hi Yumin, >>> >>> I would like to rotate GC log triggered by the outside. >>> I think that it can be implemented using AttachLister >>> (AttachOperationFunctionInfo []) >>> and invoker tool. >>> >>> Could you examine it? >>> >>> >>> Best regards, >>> >>> Yasumasa >>> >>> (2011/04/13 3:41), yumin.qi at oracle.com wrote: >>>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >>>> >>>> >>>> Summary: >>>> >>>> This is a RFE request for having a GC log rotation to prevent Java >>>> application from over flooding disk with GC output running for long >>>> time. >>>> In the implementation, supply three JVM options >>>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >>>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, >>>> default to 1, maximum set to 1024. >>>> 3) -XX:GCLogFileSize= can be configured by user how big the file >>>> size should be. Default to 10M. Minimum set to 512K if given from >>>> option is less than 512K. >>>> >>>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write >>>> from beginning of the file when reach cap of the file; with >>>> MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap >>>> in file, file.1, file.2, ..., file. then back >>>> to file, file.1, ... >>>> Check if rotation needed at safepoint ending. >>>> >>>> Tested with multiple GC choices. >>>> >>>> Thanks >>>> Yumin >>>> >>>> >>>> From y.s.ramakrishna at oracle.com Wed Apr 13 10:30:36 2011 From: y.s.ramakrishna at oracle.com (Y. S. 
Ramakrishna) Date: Wed, 13 Apr 2011 10:30:36 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA5DC43.3060201@oracle.com> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> <4DA5C585.7000304@oracle.com> <4DA5D562.2080903@oracle.com> <4DA5DC43.3060201@oracle.com> Message-ID: <4DA5DDBC.40002@oracle.com> But I agree with Jon that that would be a further new RFE, rather than doing all of that at once. Also, since the flags seem to be designed as a supported public interface, a CCC (or appropriate) review request should be filed, so that any changes stemming from that review are addressed before this is pushed... -- ramki On 04/13/11 10:24, Y. S. Ramakrishna wrote: > A possibility to pursue: > > You could make the appropriate flags manageable, and re-sample/re-snapshot > into private variable (and when appropriate re-initialize the streams) > at each safepoint where you now just check for log rotation. > > -- ramki > > On 04/13/11 09:54, yumin.qi at oracle.com wrote: >> Current implementation will not work with outside tool, ostream_init >> called once in JVM starting and flags will not be changed rest of >> running supposedly. Now need to think of what if flags changed by >> outside tool. Need reconsideration of changing flags from outside tool. >> >> Thanks >> Yumin >> >> On 4/13/2011 8:47 AM, Jon Masamitsu wrote: >>> Yasumasa , >>> >>> Sounds like this is a request for an enhancement of the >>> feature that Yumin has implemented. Or rather maybe >>> just a way of turning it on? Or is it something more basic >>> to the feature? >>> >>> Jon >>> >>> On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: >>>> Hi Yumin, >>>> >>>> I would like to rotate GC log triggered by the outside. >>>> I think that it can be implemented using AttachLister >>>> (AttachOperationFunctionInfo []) >>>> and invoker tool. >>>> >>>> Could you examine it? 
>>>> >>>> >>>> Best regards, >>>> >>>> Yasumasa >>>> >>>> (2011/04/13 3:41), yumin.qi at oracle.com wrote: >>>>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >>>>> >>>>> >>>>> Summary: >>>>> >>>>> This is a RFE request for having a GC log rotation to prevent Java >>>>> application from over flooding disk with GC output running for long >>>>> time. >>>>> In the implementation, supply three JVM options >>>>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >>>>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, >>>>> default to 1, maximum set to 1024. >>>>> 3) -XX:GCLogFileSize= can be configured by user how big the file >>>>> size should be. Default to 10M. Minimum set to 512K if given from >>>>> option is less than 512K. >>>>> >>>>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write >>>>> from beginning of the file when reach cap of the file; with >>>>> MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap >>>>> in file, file.1, file.2, ..., file. then >>>>> back to file, file.1, ... >>>>> Check if rotation needed at safepoint ending. >>>>> >>>>> Tested with multiple GC choices. >>>>> >>>>> Thanks >>>>> Yumin >>>>> >>>>> >>>>> > From yumin.qi at oracle.com Wed Apr 13 10:54:21 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Wed, 13 Apr 2011 10:54:21 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA5DDBC.40002@oracle.com> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> <4DA5C585.7000304@oracle.com> <4DA5D562.2080903@oracle.com> <4DA5DC43.3060201@oracle.com> <4DA5DDBC.40002@oracle.com> Message-ID: <4DA5E34D.1030008@oracle.com> Thanks for pointing this out, only manageable flags can be changed from outside. If change UseGCLogFileRotation to manageable, means it can be changed during java app running. So it can be switched on and off if -Xloggc: gives a file name. 
As you pointed out, if this flag were manageable, a CCC request would have to be filed, since the flag could then be changed by outside tools. I agree with you that there is no need to do so now.

Yasumasa's suggestion is to add one more command, "gclogrotate", to AttachOperationFunctionInfo []: if UseGCLogFileRotation is on, the command calls gclog_or_tty->rotate_log(); otherwise it prints an error message. I am not sure I understand the suggestion well. Done this way, it does no harm to the existing implementation, since the log is already in rotation, and it has no effect on the logging mechanism.

My concern was what happens if an outsider changes UseGCLogFileRotation while Java is running. For now that is not a problem, because it is not an external flag. I will keep the current implementation.

Thanks
Yumin

On 4/13/2011 10:30 AM, Y. S. Ramakrishna wrote:
> But I agree with Jon that that would be a further new RFE,
> rather than doing all of that at once.
>
> Also, since the flags seem to be designed as a supported public
> interface,
> a CCC (or appropriate) review request should be filed, so that
> any changes stemming from that review are addressed before
> this is pushed...
>
> -- ramki
>
> On 04/13/11 10:24, Y. S. Ramakrishna wrote:
>> A possibility to pursue:
>>
>> You could make the appropriate flags manageable, and
>> re-sample/re-snapshot
>> into private variable (and when appropriate re-initialize the streams)
>> at each safepoint where you now just check for log rotation.
>>
>> -- ramki
>>
>> On 04/13/11 09:54, yumin.qi at oracle.com wrote:
>>> Current implementation will not work with outside tool,
>>> ostream_init called once in JVM starting and flags will not be
>>> changed rest of running supposedly. Now need to think of what if
>>> flags changed by outside tool. Need reconsideration of changing
>>> flags from outside tool.
>>> >>> Thanks >>> Yumin >>> >>> On 4/13/2011 8:47 AM, Jon Masamitsu wrote: >>>> Yasumasa , >>>> >>>> Sounds like this is a request for an enhancement of the >>>> feature that Yumin has implemented. Or rather maybe >>>> just a way of turning it on? Or is it something more basic >>>> to the feature? >>>> >>>> Jon >>>> >>>> On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: >>>>> Hi Yumin, >>>>> >>>>> I would like to rotate GC log triggered by the outside. >>>>> I think that it can be implemented using AttachLister >>>>> (AttachOperationFunctionInfo []) >>>>> and invoker tool. >>>>> >>>>> Could you examine it? >>>>> >>>>> >>>>> Best regards, >>>>> >>>>> Yasumasa >>>>> >>>>> (2011/04/13 3:41), yumin.qi at oracle.com wrote: >>>>>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >>>>>> >>>>>> >>>>>> Summary: >>>>>> >>>>>> This is a RFE request for having a GC log rotation to prevent >>>>>> Java application from over flooding disk with GC output running >>>>>> for long time. >>>>>> In the implementation, supply three JVM options >>>>>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >>>>>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, >>>>>> default to 1, maximum set to 1024. >>>>>> 3) -XX:GCLogFileSize= can be configured by user how big the file >>>>>> size should be. Default to 10M. Minimum set to 512K if given from >>>>>> option is less than 512K. >>>>>> >>>>>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write >>>>>> from beginning of the file when reach cap of the file; with >>>>>> MaxGCLogFileNumbers > 1 rotating files sequentially after reach >>>>>> cap in file, file.1, file.2, ..., file. >>>>>> then back to file, file.1, ... >>>>>> Check if rotation needed at safepoint ending. >>>>>> >>>>>> Tested with multiple GC choices. 
>>>>>> >>>>>> Thanks >>>>>> Yumin >>>>>> >>>>>> >>>>>> >> From vladimir.kozlov at oracle.com Wed Apr 13 11:58:20 2011 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 13 Apr 2011 11:58:20 -0700 Subject: RFR (XS): 7036021: G1: build failure on win64 and linux with hs21 in jdk6 build environment In-Reply-To: <4DA5DA70.9060500@oracle.com> References: <4DA5DA70.9060500@oracle.com> Message-ID: <4DA5F24C.1080701@oracle.com> Looks good. Why G1 code does not use juint type instead of unsigned int? Vladimir John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers to review these changes? The webrev > can be found at: http://cr.openjdk.java.net/~johnc/7036021/webrev.0/. > > Issue: Some of the the changes for 7026932 and 7009266 do cause > compilation errors when built with the jdk6 build tools. The first is > missing parentheses around an expression and the second is passing an > uncasted negative one to a unsigned which has been substituted with > max_juint. > > Thanks, > > JohnC From y.s.ramakrishna at oracle.com Wed Apr 13 12:28:31 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 13 Apr 2011 12:28:31 -0700 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: Message-ID: <4DA5F95F.1010101@oracle.com> Hi Rafael -- Looks like you need more heap: size your -Xmx bigger to accommodate all of the objects that your Eclipse project creates. 
Here's the state of the old gen in the penultimate display:-

>> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95
>> sys=0.02, real=9.00 secs]
>> (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380
>> secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)]
>> icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, real=24.20
>> secs]
>> [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K),
>> 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

The last line shows that the old gen has:-

2015232 - 2014802 = 430 KB

of free space. Perhaps you were trying to allocate an object bigger than that. I'd suggest running with a larger heap (possibly using a 64-bit JVM if you need more Java heap).

However, the end of your message does not show the heap to be too full. Perhaps Eclipse catches the OOM, and drops all of the objects before it exits, so you see the heap as not full in the final display:- (Eclipse experts on the list might want to weigh in.)

>> This is the end of the output:
>>
>> Heap
>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000,
>> 0x308b0000)
>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000)
>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000)
>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000)
>> concurrent mark-sweep generation total 2015232K, used 61766K
>> [0x308b0000, 0xab8b0000, 0xab8b0000)
>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000,
>> 0xb0e75000, 0xb38b0000)

Asides:-

Never, never use values for MaxTenuringThreshold exceeding 15, unless you are sure you want that kind of behaviour. I'd suggest just leave that option out unless you know how to tune for it (there's lots of experience on this alias with tuning that though, should you need to tune that for performance in the future).
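
The old-gen free-space arithmetic earlier in this message (2015232 - 2014802 = 430 KB) can be reproduced directly from the log line. A minimal Python sketch follows; the function name and the regular expression are illustrative assumptions that only cover the CMS-initial-mark line format quoted in this thread, not GC logs in general.

```python
import re

# Matches "CMS-initial-mark: <used>K(<capacity>K)" as it appears in the
# -XX:+PrintGCDetails output quoted in this thread. Other formats are ignored.
INITIAL_MARK = re.compile(r"CMS-initial-mark: (\d+)K\((\d+)K\)")


def old_gen_free_kb(log_line):
    """Return old-gen free space in KB, or None if the line does not match."""
    match = INITIAL_MARK.search(log_line)
    if match is None:
        return None
    used_kb = int(match.group(1))
    capacity_kb = int(match.group(2))
    return capacity_kb - used_kb


line = ("[GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), "
        "0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]")
print(old_gen_free_kb(line))  # 430
```

Running the same function over every initial-mark line in a log gives a quick view of how close the old generation is to full at the start of each CMS cycle.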
More asides (specific to CMS):-

Depending on what your platform is, if it has anything more than 2 cores, I'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then want to drop other options starting with "CMSIncremental".)

CMS does not unload classes by default. With Eclipse etc. you would want to unload classes concurrently so as not to get OOMs: use -XX:+CMSClassUnloadingEnabled (and if on older JVMs -XX:+CMSPermGenSweepingEnabled).

Bottom line: looks like you need more Java heap.

-- ramki

On 04/13/11 10:25, Rafael Angarita wrote:
> Hello everybody,
>
> I'm building a code generation application as an Eclipse and one of my
> test projects contains around 15000 source files. My application started
> having memory problems, so after doing some optimizations specific to
> the framework I'm using to develop my DSL, I started learning about GC,
> but I think I'm still lost.
>
> I have tried with different JVM options for the GC with no success.
> Currently, I'm trying:
>
> -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails
> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC
> -XX:+CMSIncrementalMode
> -XX:+CMSIncrementalPacing -XX:CMSInitiatingOccupancyFraction=5
> -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSIncrementalDutyCycleMin=1
>
> but this is just one of the several things I have tried.
>
> At first everything seems to go fine, but after a while I get "promotion
> failed" and everything gets really slow, and finally the application
> crashes with java.lang.OutOfMemoryError: Java heap space.
> > > CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 > sys=0.02, real=0.58 secs] > [GC[YG occupancy: 28574 K (29504 K)][Rescan (parallel) , 0.0198420 > secs][weak refs processing, 0.0015200 secs] [1 CMS-remark: > 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: > user=0.03 sys=0.00, real=0.03 secs] > [GC [ParNew: 29096K->2087K(29504K), 0.0270900 secs] > 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] [Times: > user=0.05 sys=0.00, real=0.03 secs] > [GC [ParNew: 28327K->3264K(29504K), 0.0430410 secs] > 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] [Times: > user=0.07 sys=0.01, real=0.04 secs] > [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] > 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: > user=0.11 sys=0.00, real=0.07 secs] > [GC [ParNew: 29504K->3264K(29504K), 0.0630250 secs] > 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: > user=0.11 sys=0.00, real=0.06 secs] > [GC [ParNew: 29504K->3263K(29504K), 0.0435130 secs] > 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] [Times: > user=0.07 sys=0.00, real=0.04 secs] > [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 sys=0.02, > real=2.05 secs] > [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, > real=0.04 secs] > [GC [ParNew (promotion failed): 29503K->29504K(29504K), 0.5729750 > secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] > [Unloading class sun.reflect.GeneratedConstructorAccessor26] > [Unloading class sun.reflect.GeneratedMethodAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor20] > [Unloading class sun.reflect.GeneratedMethodAccessor4] > [Unloading class sun.reflect.GeneratedMethodAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor25] > [Unloading class sun.reflect.GeneratedMethodAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor17] 
> [Unloading class sun.reflect.GeneratedConstructorAccessor27] > [Unloading class sun.reflect.GeneratedConstructorAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor12] > [Unloading class sun.reflect.GeneratedMethodAccessor2] > [Unloading class sun.reflect.GeneratedConstructorAccessor14] > [Unloading class sun.reflect.GeneratedConstructorAccessor28] > [Unloading class sun.reflect.GeneratedConstructorAccessor5] > [Unloading class sun.reflect.GeneratedMethodAccessor16] > [Unloading class sun.reflect.GeneratedMethodAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor11] > [Unloading class sun.reflect.GeneratedConstructorAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor29] > [Unloading class sun.reflect.GeneratedMethodAccessor3] > [Unloading class sun.reflect.GeneratedConstructorAccessor24] > [Unloading class sun.reflect.GeneratedConstructorAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor15] > [Unloading class sun.reflect.GeneratedConstructorAccessor10] > [Unloading class sun.reflect.GeneratedConstructorAccessor16] > [Unloading class sun.reflect.GeneratedConstructorAccessor15] > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 > sys=0.02, real=9.00 secs] > (concurrent mode failure): 2008891K->2014802K(2015232K), 24.2053380 > secs] 2038395K->2014802K(2044736K), [CMS Perm : 50779K->50779K(86244K)] > icms_dc=100 , 24.2054320 secs] [Times: user=24.16 sys=0.03, real=24.20 > secs] > [GC [1 CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), > 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, > 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) > to space 3264K, 0% used [0x30250000, 
0x30250000, 0x30580000) > concurrent mark-sweep generation total 2015232K, used 61766K > [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, > 0xb0e75000, 0xb38b0000) > > > I would appreciate if anybody can give me an advise about this. > > Thank you very much for your help. > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From igor.veresov at oracle.com Wed Apr 13 14:28:33 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Wed, 13 Apr 2011 14:28:33 -0700 Subject: RFR(XS): 7035117: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 In-Reply-To: <4DA5D8A0.8060703@oracle.com> References: <4D9F4E23.6060301@oracle.com> <4DA5D8A0.8060703@oracle.com> Message-ID: <4DA61581.5080700@oracle.com> Looks good. igor On 4/13/11 10:08 AM, John Cuthbertson wrote: > Hi Everyone, > > I have a new webrev for this CR here: > http://cr.openjdk.java.net/~johnc/7035117/webrev.1/ > > The changes include Tom's suggestion to use ConX in C2 code and the fix > for the same test in tiered compilation/C1. > > Testing: the failing test case, nsk tests, jprt. > > Thanks, > > JohnC > > On 04/08/11 11:04, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers to look over the fix for this CR? >> The webrev can be found at: >> http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. 
>> >> The problem is that the node representing the offset (in an >> Unsafe.getObject compilation) could be typed as a long and generating >> the compare of offset against java_lang_ref_Reference::referent_offset >> (typed as an int) caused an assertion failure about the mis-matching >> types. >> >> The fix is to generate a suitably typed constant based upon the type >> of "offset". >> >> Tested using the failing test case from the nightly tests. >> >> Thanks, >> >> JohnC >> > From y.s.ramakrishna at oracle.com Wed Apr 13 17:42:56 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Wed, 13 Apr 2011 17:42:56 -0700 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods Message-ID: <4DA64310.9000308@oracle.com> 7036482: clear argument is redundant and unused in cardtable methods http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ As in synopsis; further motivational detail may be found in the CR. Some further cleanups/deletions will follow in subsequent CR's to be filed. thanks for your reviews. -- ramki From suenaga.yasumasa at oss.ntt.co.jp Wed Apr 13 18:12:33 2011 From: suenaga.yasumasa at oss.ntt.co.jp (Yasumasa Suenaga) Date: Thu, 14 Apr 2011 10:12:33 +0900 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA5E34D.1030008@oracle.com> References: <4DA49CE7.7020104@oracle.com> <4DA4F494.1000409@oss.ntt.co.jp> <4DA5C585.7000304@oracle.com> <4DA5D562.2080903@oracle.com> <4DA5DC43.3060201@oracle.com> <4DA5DDBC.40002@oracle.com> <4DA5E34D.1030008@oracle.com> Message-ID: <4DA64A01.40505@oss.ntt.co.jp> Hi, In the case of Fedora14, syslog and any other log files are rotated by "logrotate" tool. logrotate is invoked by "cron.daily" . I and co-worker want to match GC log rotation to the timing of other logs. In the case of Linux and UNIX, it can be achieved by using signal (e.g. SIGHUP). 
However, if we use signal, its implementation is platform-dependent. So I suggested using AttachListener. If we make GC log rotation invoker like jinfo, jmap, etc... , I'm sure that it's platform-independent. I understood that my suggestion should be a new RFE. After that GC log rotation patch merged HotSpot repository, I want to make the patch to achieve this RFE. Thanks, Yasumasa (2011/04/14 2:54), yumin.qi at oracle.com wrote: > Thanks for pointing this out, only manageable flags can be changed from outside. > If change UseGCLogFileRotation to manageable, means it can be changed during java app running. So it can be switched on and off if -Xloggc: gives a file name. As you pointed out, if this flag is manageable, it should go file a CCC request since it can be changed by outside tools. I agree with you that no need to do so now. > > Yasumasa's suggestion is add one more cmd in AttachOperationFunctionInfo [] as "gclogrotate", if UseGCLogFileRotation is on, this cmd will call gclog_or_tty->rotate_log(), else print out err msg. > > I am not sure if I understand the suggestion well. If this way, it has no harm to the existing implementation since the log is already in rotation. It also has no effect to the logging mechanism. > > What I thought is that what if outsider changes UseGCLogFileRotation when java is running, now it is not a problem --- not an external flag. > > I will keep the current implementation. > > Thanks > Yumin > > > On 4/13/2011 10:30 AM, Y. S. Ramakrishna wrote: >> But I agree with Jon that that would be a further new RFE, >> rather than doing all of that at once. >> >> Also, since the flags seem to be designed as a supported public interface, >> a CCC (or appropriate) review request should be filed, so that >> any changes stemming from that review are addressed before >> this is pushed... >> >> -- ramki >> >> On 04/13/11 10:24, Y. S. 
Ramakrishna wrote: >>> A possibility to pursue: >>> >>> You could make the appropriate flags manageable, and re-sample/re-snapshot >>> into private variable (and when appropriate re-initialize the streams) >>> at each safepoint where you now just check for log rotation. >>> >>> -- ramki >>> >>> On 04/13/11 09:54, yumin.qi at oracle.com wrote: >>>> Current implementation will not work with outside tool, ostream_init called once in JVM starting and flags will not be changed rest of running supposedly. Now need to think of what if flags changed by outside tool. Need reconsideration of changing flags from outside tool. >>>> >>>> Thanks >>>> Yumin >>>> >>>> On 4/13/2011 8:47 AM, Jon Masamitsu wrote: >>>>> Yasumasa , >>>>> >>>>> Sounds like this is a request for an enhancement of the >>>>> feature that Yumin has implemented. Or rather maybe >>>>> just a way of turning it on? Or is it something more basic >>>>> to the feature? >>>>> >>>>> Jon >>>>> >>>>> On 4/12/2011 5:55 PM, Yasumasa Suenaga wrote: >>>>>> Hi Yumin, >>>>>> >>>>>> I would like to rotate GC log triggered by the outside. >>>>>> I think that it can be implemented using AttachLister (AttachOperationFunctionInfo []) >>>>>> and invoker tool. >>>>>> >>>>>> Could you examine it? >>>>>> >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> (2011/04/13 3:41), yumin.qi at oracle.com wrote: >>>>>>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ >>>>>>> >>>>>>> Summary: >>>>>>> >>>>>>> This is a RFE request for having a GC log rotation to prevent Java application from over flooding disk with GC output running for long time. >>>>>>> In the implementation, supply three JVM options >>>>>>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file >>>>>>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, default to 1, maximum set to 1024. >>>>>>> 3) -XX:GCLogFileSize= can be configured by user how big the file size should be. Default to 10M. 
Minimum set to 512K if given from option is less than 512K. >>>>>>> >>>>>>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write from beginning of the file when reach cap of the file; with MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap in file, file.1, file.2, ..., file. then back to file, file.1, ... >>>>>>> Check if rotation needed at safepoint ending. >>>>>>> >>>>>>> Tested with multiple GC choices. >>>>>>> >>>>>>> Thanks >>>>>>> Yumin >>>>>>> >>>>>>> >>>>>>> >>> -- TEL: 03-5860-5105 E-mail: suenaga.yasumasa at oss.ntt.co.jp From Dmitry.Samersoff at oracle.com Wed Apr 13 13:52:32 2011 From: Dmitry.Samersoff at oracle.com (Dmitry Samersoff) Date: Thu, 14 Apr 2011 00:52:32 +0400 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA49CE7.7020104@oracle.com> References: <4DA49CE7.7020104@oracle.com> Message-ID: <4DA60D10.7000902@oracle.com> Yumin, Thank you for taking care of the logging stuff. Personally, I would like to see syslog logging inside the JVM. It allows cu to use dozens of already existing and already set up log rotation, log archiving etc tools and also redirect logging to another machine when necessary. -Dmitry On 2011-04-12 22:41, yumin.qi at oracle.com wrote: > http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ > > > Summary: > > This is a RFE request for having a GC log rotation to prevent Java > application from over flooding disk with GC output running for long time. > In the implementation, supply three JVM options > 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file > 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, default > to 1, maximum set to 1024. > 3) -XX:GCLogFileSize= can be configured by user how big the file size > should be. Default to 10M. Minimum set to 512K if given from option is > less than 512K. 
> > If MaxGCLogFileNumbers=1, rotating output in same file, i.e write from > beginning of the file when reach cap of the file; with > MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap in > file, file.1, file.2, ..., file. then back to > file, file.1, ... > Check if rotation needed at safepoint ending. > > Tested with multiple GC choices. > > Thanks > Yumin > > > -- Dmitry Samersoff Java Hotspot development team, SPB04 * There will come soft rains ... From bengt.rutisson at oracle.com Thu Apr 14 00:00:50 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 14 Apr 2011 09:00:50 +0200 Subject: RFR (XS): 7036021: G1: build failure on win64 and linux with hs21 in jdk6 build environment In-Reply-To: <4DA5DA70.9060500@oracle.com> References: <4DA5DA70.9060500@oracle.com> Message-ID: <25E0A0FE-2EE5-49BE-AE81-30CF4FEE0AC6@oracle.com> Looks good. Bengt 13 apr 2011 kl. 19:16 skrev John Cuthbertson : > Hi Everyone, > > Can I have a couple of volunteers to review these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7036021/webrev.0/. > > Issue: Some of the the changes for 7026932 and 7009266 do cause compilation errors when built with the jdk6 build tools. The first is missing parentheses around an expression and the second is passing an uncasted negative one to a unsigned which has been substituted with max_juint. > > Thanks, > > JohnC From bengt.rutisson at oracle.com Thu Apr 14 00:42:51 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 14 Apr 2011 09:42:51 +0200 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA64310.9000308@oracle.com> References: <4DA64310.9000308@oracle.com> Message-ID: <4DA6A57B.7090100@oracle.com> Ramki, Looks good. One question: In allocationStats.hpp your only change is a style issue that is not really related to removing the clear argument. Is it intentional to include this in the change? 
(I agree that it looks better the way you do it.) Also, copyright year... Bengt On 2011-04-14 02:42, Y. S. Ramakrishna wrote: > > 7036482: clear argument is redundant and unused in cardtable methods > > http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ > > As in synopsis; further motivational detail may be found > in the CR. Some further cleanups/deletions will follow in > subsequent CR's to be filed. > > thanks for your reviews. > -- ramki From John.Coomes at oracle.com Thu Apr 14 01:10:42 2011 From: John.Coomes at oracle.com (John Coomes) Date: Thu, 14 Apr 2011 01:10:42 -0700 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA64310.9000308@oracle.com> References: <4DA64310.9000308@oracle.com> Message-ID: <19878.44034.439619.56575@oracle.com> Y. S. Ramakrishna (y.s.ramakrishna at oracle.com) wrote: > > 7036482: clear argument is redundant and unused in cardtable methods > > http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ The change in allocationStats.hpp is unrelated, is it intended for this fix? Aside from that, looks good to me. > As in synopsis; further motivational detail may be found > in the CR. Some further cleanups/deletions will follow in > subsequent CR's to be filed. mod_oop_in_space_iterate(), the only method that calls non_clean_card_iterate() with clear != false, is unused. Is deleting it one of the further cleanups? Removing it makes it easier to see that 'clear' is not needed, so you could include it with this change. -John From y.s.ramakrishna at oracle.com Thu Apr 14 01:22:19 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 14 Apr 2011 01:22:19 -0700 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA6A57B.7090100@oracle.com> References: <4DA64310.9000308@oracle.com> <4DA6A57B.7090100@oracle.com> Message-ID: <4DA6AEBB.7000206@oracle.com> Thanks for the review Bengt. 
On 4/14/2011 12:42 AM, Bengt Rutisson wrote: > > Ramki, > > Looks good. > > One question: > > In allocationStats.hpp your only change is a style issue that is not really related to removing the > clear argument. Is it intentional to include this in the change? (I agree that it looks better the > way you do it.) I could either sneak it in here without further ado, or drop it. I'll let you and John vote :-) > > > Also, copyright year... I'll fix that; thanks! -- ramki > > Bengt > > > On 2011-04-14 02:42, Y. S. Ramakrishna wrote: >> >> 7036482: clear argument is redundant and unused in cardtable methods >> >> http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ >> >> As in synopsis; further motivational detail may be found >> in the CR. Some further cleanups/deletions will follow in >> subsequent CR's to be filed. >> >> thanks for your reviews. >> -- ramki > From y.s.ramakrishna at oracle.com Thu Apr 14 01:39:48 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 14 Apr 2011 01:39:48 -0700 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <19878.44034.439619.56575@oracle.com> References: <4DA64310.9000308@oracle.com> <19878.44034.439619.56575@oracle.com> Message-ID: <4DA6B2D4.6060209@oracle.com> Thanks for the review John! On 4/14/2011 1:10 AM, John Coomes wrote: > Y. S. Ramakrishna (y.s.ramakrishna at oracle.com) wrote: >> >> 7036482: clear argument is redundant and unused in cardtable methods >> >> http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ > > The change in allocationStats.hpp is unrelated, is it intended for > this fix? Aside from that, looks good to me. I'll let you and Bengt vote on it. Left to myself I'd sneak it in, although I agree that it's unrelated to the synopsis here. (May be i could file a separate bug and include this change also, or I could just mention it in the summary, as an aside.) 
Let me know which you prefer, or if you'd prefer i just dropped that change. > >> As in synopsis; further motivational detail may be found >> in the CR. Some further cleanups/deletions will follow in >> subsequent CR's to be filed. > > mod_oop_in_space_iterate(), the only method that calls > non_clean_card_iterate() with clear != false, is unused. Is deleting > it one of the further cleanups? Removing it makes it easier to see > that 'clear' is not needed, so you could include it with this change. I initially considered deleting mod_oop_in_space_iterate(), but then thought it was useful as an existing interface, rather along the lines of mod_card_iterate(). I did remove the clear arg to it, so that a caller would have to use a "clearing wrapper" around the closure if they wanted the cards cleared. (This would hold for other interfaces which were modified in this manner too, such as mod_card_iterate()). I have a few more cleanups coming, so I will see if the future cleanups tell me that mod_oop_in_space_iterate() should go away. I am not quite sure at the moment if it should, although given that it has been around for so long and is unused should mean that it's useless: may be that should be reason enough to delete it. OK, I'll delete it too :-> Thanks for your review! -- ramki > > -John From bengt.rutisson at oracle.com Thu Apr 14 02:02:49 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 14 Apr 2011 11:02:49 +0200 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA6B2D4.6060209@oracle.com> References: <4DA64310.9000308@oracle.com> <19878.44034.439619.56575@oracle.com> <4DA6B2D4.6060209@oracle.com> Message-ID: <4DA6B839.3000001@oracle.com> [snip] > I'll let you and Bengt vote on it. Left to myself I'd sneak it > in, although I agree that it's unrelated to the synopsis here. 
> (May be i could file a separate bug and include this change also, > or I could just mention it in the summary, as an aside.) > Let me know which you prefer, or if you'd prefer i just dropped > that change. Not a big deal for me. I was mostly just wondering if this was intentional. Mentioning it in the summary is fine as far as I am concerned. Bengt > > >> >>> As in synopsis; further motivational detail may be found >>> in the CR. Some further cleanups/deletions will follow in >>> subsequent CR's to be filed. >> >> mod_oop_in_space_iterate(), the only method that calls >> non_clean_card_iterate() with clear != false, is unused. Is deleting >> it one of the further cleanups? Removing it makes it easier to see >> that 'clear' is not needed, so you could include it with this change. > > I initially considered deleting mod_oop_in_space_iterate(), but then > thought it was useful as an existing interface, rather along the lines of > mod_card_iterate(). I did remove the clear arg to > it, so that a caller would have to use a "clearing wrapper" around the > closure if they wanted the cards cleared. (This would hold for > other interfaces which were modified in this manner too, such > as mod_card_iterate()). > > I have a few more cleanups coming, so I will see if the future cleanups > tell me that mod_oop_in_space_iterate() should go away. I am not > quite sure at the moment if it should, although given that it > has been around for so long and is unused should mean that it's > useless: may be that should be reason enough to delete it. > OK, I'll delete it too :-> > > Thanks for your review! > -- ramki > > >> >> -John > From rednaxelafx at gmail.com Thu Apr 14 02:14:28 2011 From: rednaxelafx at gmail.com (Krystal Mok) Date: Thu, 14 Apr 2011 17:14:28 +0800 Subject: What are the conflicting points between Class-Data Sharing and ParNew/CMS/ParallelScavenge/G1? Message-ID: Hi all, I've had this doubt for quite a long time but was too shy to ask on this list. 
Anyway, here I go: According to the logic in hotspot/src/share/vm/runtime/arguments.cpp, CDS doesn't work with ParNew/CMS/PS/G1. That means only the serial GC works with CDS. I'd like to know, what are the conflicting points between CDS and these parallel/concurrent GCs? Are there any plans to resolve the conflicts, or are there any suggestions on how the conflicts would be resolved? Sincerely, Kris Mok From christian.thalinger at oracle.com Thu Apr 14 02:12:37 2011 From: christian.thalinger at oracle.com (Christian Thalinger) Date: Thu, 14 Apr 2011 11:12:37 +0200 Subject: RFR(XS): 7035117: G1: nsk/stress/jni/jnistress002 fails with an assertion failure caused by changes for 7009266 In-Reply-To: <4DA5D8A0.8060703@oracle.com> References: <4D9F4E23.6060301@oracle.com> <4DA5D8A0.8060703@oracle.com> Message-ID: <7771C78D-C2DF-41FF-9C51-155EC3AA16F1@oracle.com> On Apr 13, 2011, at 7:08 PM, John Cuthbertson wrote: > Hi Everyone, > > I have a new webrev for this CR here: http://cr.openjdk.java.net/~johnc/7035117/webrev.1/ > > The changes include Tom's suggestion to use ConX in C2 code and the fix for the same test in tiered compilation/C1. - Register thread_reg = thread()->as_register(); + Register thread_reg = NOT_LP64(thread()->as_register()) LP64_ONLY(thread()->as_register_lo()); I'm not an expert here but could you use as_pointer_register here instead? -- Christian > > Testing: the failing test case, nsk tests, jprt. > > Thanks, > > JohnC > > On 04/08/11 11:04, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers to look over the fix for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7035177/webrev.0/. 
>> >> The problem is that the node representing the offset (in an Unsafe.getObject compilation) could be typed as a long and generating the compare of offset against java_lang_ref_Reference::referent_offset (typed as an int) caused an assertion failure about the mis-matching types. >> >> The fix is to generate a suitably typed constant based upon the type of "offset". >> >> Tested using the failing test case from the nightly tests. >> >> Thanks, >> >> JohnC From john.cuthbertson at oracle.com Thu Apr 14 02:25:26 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 14 Apr 2011 09:25:26 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7035117: G1: nsk/stress/jni/jnistress002 fails with assertion failure Message-ID: <20110414092532.B9EEF47A8D@hg.openjdk.java.net> Changeset: 59766fd005ff Author: johnc Date: 2011-04-13 17:56 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/59766fd005ff 7035117: G1: nsk/stress/jni/jnistress002 fails with assertion failure Summary: Allow long type for offset in G1 code in compiler implementations of Unsafe.getObject Reviewed-by: never, iveresov ! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp ! src/cpu/x86/vm/c1_CodeStubs_x86.cpp ! src/share/vm/c1/c1_LIRGenerator.cpp ! src/share/vm/opto/library_call.cpp From y.s.ramakrishna at oracle.com Thu Apr 14 02:29:34 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 14 Apr 2011 02:29:34 -0700 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA64310.9000308@oracle.com> References: <4DA64310.9000308@oracle.com> Message-ID: <4DA6BE7E.5080200@oracle.com> Modified based on Bengt's and John's reviews; new webrev here: http://cr.openjdk.java.net/~ysr/7036482/webrev.01 -- ramki On 4/13/2011 5:42 PM, Y. S. 
Ramakrishna wrote: > > 7036482: clear argument is redundant and unused in cardtable methods > > http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ > > As in synopsis; further motivational detail may be found > in the CR. Some further cleanups/deletions will follow in > subsequent CR's to be filed. > > thanks for your reviews. > -- ramki From bengt.rutisson at oracle.com Thu Apr 14 04:15:27 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 14 Apr 2011 13:15:27 +0200 Subject: Request for review (XS): 7036482: clear argument is redundant and unused in cardtable methods In-Reply-To: <4DA6BE7E.5080200@oracle.com> References: <4DA64310.9000308@oracle.com> <4DA6BE7E.5080200@oracle.com> Message-ID: <4DA6D74F.3030906@oracle.com> Looks good to me. Bengt On 2011-04-14 11:29, Y. Srinivas Ramakrishna wrote: > Modified based on Bengt's and John's reviews; new webrev here: > > http://cr.openjdk.java.net/~ysr/7036482/webrev.01 > > -- ramki > > On 4/13/2011 5:42 PM, Y. S. Ramakrishna wrote: >> >> 7036482: clear argument is redundant and unused in cardtable methods >> >> http://cr.openjdk.java.net/~ysr/7036482/webrev.00/ >> >> As in synopsis; further motivational detail may be found >> in the CR. Some further cleanups/deletions will follow in >> subsequent CR's to be filed. >> >> thanks for your reviews. >> -- ramki > From y.s.ramakrishna at oracle.com Thu Apr 14 09:03:09 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 14 Apr 2011 09:03:09 -0700 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: <4DA5F95F.1010101@oracle.com> Message-ID: <4DA71ABD.1050804@oracle.com> May be double the heap using -d64 -Xms5g -Xmx5g (assuming your machine has enough RAM so you are not swapping), and see what happens. If that size of heap usage seems excessive, try and use a heap profiling tool to see why your application is holding on to so much. all the best. -- ramki On 04/14/11 07:17, Rafael Angarita wrote: > Thank you very much! 
> > I took your advise about the JVM GC parameters and removed some of them. > > I used -Xmx2500m. My application gets further with the proccesing it > needs to do, but the whole computer gets really slow and my application > crash anyway. > > I'm trying to get the developers of the framework I'm using for my DSL. > > If any of you guys have more ideas, I'm here to listen and learn. > > Thank you very much. > > On 13 April 2011 14:58, Y. S. Ramakrishna > wrote: > > Hi Rafael -- > > Looks like you need more heap: size your -Xmx bigger to > accommodate all of the objects that your Eclipse project creates. > Here's the state of the old gen in the penultimate display:- > > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: > user=10.95 sys=0.02, real=9.00 secs] (concurrent mode > failure): 2008891K->2014802K(2015232K), 24.2053380 secs] > 2038395K->2014802K(2044736K), [CMS Perm : > 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] > [Times: user=24.16 sys=0.03, real=24.20 secs] [GC [1 > CMS-initial-mark: 2014802K(2015232K)] 2015335K(2044736K), > 0.0023250 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] > > > The last line shows that the old gen has:- > 2015232 - 2014802 = 430 KB > of free space. Perhaps you were trying to allocate an object bigger > than that. > I'd suggest running with a larger heap (possibly using a 64-bit JVM if > you need more Java heap). > > However, the end of your message does not show the heap to be too full. > Perhaps Eclipse catches the OOM, and drops all of the objects before > it exits, > so you see the heap as not full in the final display:- > (Eclipse experts on the list might want to weigh in.) 
> > > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, > 0x308b0000, 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, > 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, > 0x308b0000) > to space 3264K, 0% used [0x30250000, 0x30250000, > 0x30580000) > concurrent mark-sweep generation total 2015232K, used > 61766K [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K > [0xab8b0000, 0xb0e75000, 0xb38b0000) > > > > Asides:- > Never, never use values for MaxTenuringThreshold exceeding 15, unless > you are sure you want that kind of behaviour. I'd suggest > just leave that option out unless you know how to tune for it (there's > lots of experience on this alias with tuning that though, should > you need to tune that for performance in the future). > > More asides (specific to CMS):- > Depending on what your platform is, if it has anything more than 2 > cores, > i'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then > want to drop other options starting with "CMSIncremental". > CMS does not unload classes by default. With Eclipse etc. you would > want to unload classes concurrently so as not to get OOM's: > use -XX:+CMSClassUnloadingEnabled (and if on older JVM's > -XX:+CMSPermGenSweepingEnabled). > > Bottom line: looks like you need more Java heap. > -- ramki > > > On 04/13/11 10:25, Rafael Angarita wrote: > > Hello everybody, > > I'm building a code generation application as an Eclipse and one > of my test projects contains around 15000 source files. My > application started having memory problems, so after doing some > optimizations especific to the framework I'm using to develope > my DSL, I started learning about GC, but I think I'm still lost. > > I have tried with different JVM options for the GC with no > success. 
Currently, I'm trying: > > -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC > -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing > -XX:CMSInitiatingOccupancyFraction=5 > -XX:MaxTenuringThreshold=300 -XX:+UseCMSInitiatingOccupancyOnly > -XX:CMSIncrementalDutyCycleMin=1 > > but this is just one of the several things I have tried. > > At first everything seems to go fine, but after awhile I get > "promotion failed" and everything gets really slow, and finally > the application crash with java.lang.OutOfMemoryError: Java heap > space. > > > CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: > user=0.66 sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K > (29504 K)][Rescan (parallel) , 0.0198420 secs][weak refs > processing, 0.0015200 secs] [1 CMS-remark: 1961760K(2015232K)] > 1990335K(2044736K), 0.0215890 secs] [Times: user=0.03 sys=0.00, > real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), 0.0270900 > secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] > [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: > 28327K->3264K(29504K), 0.0430410 secs] > 1990448K->1969326K(2044736K) icms_dc=100 , 0.0431180 secs] > [Times: user=0.07 sys=0.01, real=0.04 secs] [GC [ParNew: > 29504K->3264K(29504K), 0.0658260 secs] > 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] > [Times: user=0.11 sys=0.00, real=0.07 secs] [GC [ParNew: > 29504K->3264K(29504K), 0.0630250 secs] > 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] > [Times: user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: > 29504K->3263K(29504K), 0.0435130 secs] > 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] > [Times: user=0.07 sys=0.00, real=0.04 secs] > [CMS-concurrent-sweep: 1.813/2.058 secs] [Times: user=3.76 > sys=0.02, real=2.05 secs] [CMS-concurrent-reset: 0.035/0.035 > secs] [Times: user=0.06 sys=0.00, real=0.04 secs] [GC [ParNew > (promotion failed): 29503K->29504K(29504K), 0.5729750 > 
secs][CMS[Unloading class sun.reflect.GeneratedConstructorAccessor6] > [Unloading class sun.reflect.GeneratedConstructorAccessor26] > [Unloading class sun.reflect.GeneratedMethodAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor20] > [Unloading class sun.reflect.GeneratedMethodAccessor4] > [Unloading class sun.reflect.GeneratedMethodAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor25] > [Unloading class sun.reflect.GeneratedMethodAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor17] > [Unloading class sun.reflect.GeneratedConstructorAccessor27] > [Unloading class sun.reflect.GeneratedConstructorAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor12] > [Unloading class sun.reflect.GeneratedMethodAccessor2] > [Unloading class sun.reflect.GeneratedConstructorAccessor14] > [Unloading class sun.reflect.GeneratedConstructorAccessor28] > [Unloading class sun.reflect.GeneratedConstructorAccessor5] > [Unloading class sun.reflect.GeneratedMethodAccessor16] > [Unloading class sun.reflect.GeneratedMethodAccessor19] > [Unloading class sun.reflect.GeneratedConstructorAccessor9] > [Unloading class sun.reflect.GeneratedConstructorAccessor11] > [Unloading class sun.reflect.GeneratedConstructorAccessor8] > [Unloading class sun.reflect.GeneratedConstructorAccessor29] > [Unloading class sun.reflect.GeneratedMethodAccessor3] > [Unloading class sun.reflect.GeneratedConstructorAccessor24] > [Unloading class sun.reflect.GeneratedConstructorAccessor18] > [Unloading class sun.reflect.GeneratedMethodAccessor15] > [Unloading class sun.reflect.GeneratedConstructorAccessor10] > [Unloading class sun.reflect.GeneratedConstructorAccessor16] > [Unloading class sun.reflect.GeneratedConstructorAccessor15] > > [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: > user=10.95 sys=0.02, real=9.00 secs] (concurrent mode failure): > 
2008891K->2014802K(2015232K), 24.2053380 secs] > 2038395K->2014802K(2044736K), [CMS Perm : > 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: > user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: > 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: > user=0.00 sys=0.00, real=0.00 secs] > This is the end of the output: > > Heap > par new generation total 29504K, used 23591K [0x2e8b0000, > 0x308b0000, 0x308b0000) > eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) > from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) > to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) > concurrent mark-sweep generation total 2015232K, used 61766K > [0x308b0000, 0xab8b0000, 0xab8b0000) > concurrent-mark-sweep perm gen total 87828K, used 52696K > [0xab8b0000, 0xb0e75000, 0xb38b0000) > > > I would appreciate if anybody can give me an advise about this. > > Thank you very much for your help. > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From yumin.qi at oracle.com Thu Apr 14 10:06:04 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Thu, 14 Apr 2011 10:06:04 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA65910.40507@oracle.com> References: <4DA49CE7.7020104@oracle.com> 
<4DA65910.40507@oracle.com>
Message-ID: <4DA7297C.6070408@oracle.com>

Poonam, thanks.

On 4/13/2011 7:16 PM, Poonam Bajaj wrote:
> Hello Yumin,
>
> Comments inline:
>
> globals.hpp:
>
> 2337   product(uintx, MaxGCLogFileNumbers, 1,                      \
> 2338           "Maximum number of gclog file roration "            \
> 2339           "Default rotate in 1 file")                         \
> 2340                                                               \
> 2341   product(uintx, GCLogFileSize, 10*M,                         \
> 2342           "Default log size 10 Megabytes "                    \
> 2343           "only used when UseGCLogFileRotation set")          \
> 2344
>
> Here I agree with Jesper that the option names and comments need to be
> changed for clear understanding. Some suggestions:
>  o MaxGCLogFileNumbers can be named as MaxGCLogFiles or
>    MaxGCLogNumberOfFiles.
>  o line 2338 can be worded as "Maximum number of rotating gc log files.
>    Default is 1 "
>  o line 2343 should be "only used when UseGCLogFileRotation is set")
>

Changed as pointed out by Jesper.

>
> arguments.cpp:
> + if (_gc_log_filename == NULL) {
> +   jio_fprintf(defaultStream::error_stream(),
> +     "-XX:+UseGCLogRotaion must be with -Xloggc:filename in front\n");
>
> should be: "-XX:+UseGCLogFileRotation must be used with -Xloggc:\n"
>

Good catch.

> ostream.cpp:
>
> 835   if (UseGCLogFileRotation) {
> 836     gclogStream* rot_tty = new(ResourceObj::C_HEAP)
> 837                            gclogStream();
> 838     if (rot_tty->is_open()) {
> 839       // now we update the time stamp of the GC log to be synced up with tty.
> 840       rot_tty->time_stamp().update_to(tty->time_stamp().ticks());
> 841     }
> 842     gclog_or_tty = rot_tty;
> 843   }
> 844   else {
> 845     fileStream* gc_tty = new(ResourceObj::C_HEAP)
> 846                          fileStream(Arguments::gc_log_filename());
> 847     if (gc_tty->is_open()) {
> 848       // now we update the time stamp of the GC log to be synced up with tty.
> 849       gc_tty->time_stamp().update_to(tty->time_stamp().ticks());
> 850     }
> 851     gclog_or_tty = gc_tty;
> 852   }
> 853 }
>
>  o here, we are referring to the log file, so the variable names should be
>    rot_gclog and gclog respectively.
>  o why is the line at 851 out of the 'if' block? gclog_or_tty should be
>    set to the gclog file only if it is open, and the same holds for the
>    rotating gc log file.
>

I think the code here is: if the file is already open, put a time stamp,
or just assign gclog_or_tty a choice. I may change the code to use the
same fileStream; doing the modification now. Since the flag will be an
external flag, it will respond to change requests from an outside tool.

>
> Thanks,
> Poonam
>
>
>
> On 4/13/2011 12:11 AM, yumin.qi at oracle.com wrote:
>> http://cr.openjdk.java.net/~minqi/6941923/webrev.00/
>>
>>
>> Summary:
>>
>> This is a RFE request for having a GC log rotation to prevent Java
>> application from over flooding disk with GC output running for long
>> time.
>> In the implementation, supply three JVM options
>> 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file
>> 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers,
>> default to 1, maximum set to 1024.
>> 3) -XX:GCLogFileSize= can be configured by user how big the file
>> size should be. Default to 10M. Minimum set to 512K if given from
>> option is less than 512K.
>>
>> If MaxGCLogFileNumbers=1, rotating output in same file, i.e write
>> from beginning of the file when reach cap of the file; with
>> MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap
>> in file, file.1, file.2, ..., file. then back
>> to file, file.1, ...
>> Check if rotation needed at safepoint ending.
>>
>> Tested with multiple GC choices.
>>
>> Thanks
>> Yumin
>>
>>
>>
>
> --
> Best regards, Poonam
>
> Poonam Bajaj | Staff Engineer
> JVM Sustaining Engineering | Bangalore

From jon.masamitsu at oracle.com  Thu Apr 14 15:10:39 2011
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Thu, 14 Apr 2011 15:10:39 -0700
Subject: request for review - 6946385
Message-ID: <4DA770DF.6090309@oracle.com>

6946385: G1: jstat does not support G1 GC

http://cr.openjdk.java.net/~jmasa/6946385/webrev.00/

Thanks.

From y.s.ramakrishna at oracle.com  Thu Apr 14 16:25:48 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Thu, 14 Apr 2011 16:25:48 -0700
Subject: request for review - 6946385
In-Reply-To: <4DA770DF.6090309@oracle.com>
References: <4DA770DF.6090309@oracle.com>
Message-ID: <4DA7827C.4090000@oracle.com>

Still looks good!

-- ramki

On 04/14/11 15:10, Jon Masamitsu wrote:
> 6946385: G1: jstat does not support G1 GC
>
> http://cr.openjdk.java.net/~jmasa/6946385/webrev.00/
>
> Thanks.

From john.cuthbertson at oracle.com  Thu Apr 14 17:48:35 2011
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Fri, 15 Apr 2011 00:48:35 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 34 new changesets
Message-ID: <20110415004935.C000047AD4@hg.openjdk.java.net>

Changeset: 9e6733fb56f8
Author:    schien
Date:      2011-04-07 15:20 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/9e6733fb56f8

Added tag jdk7-b137 for changeset 2dbcb4a4d8da ! 
.hgtags Changeset: 987d9d10a30a Author: trims Date: 2011-04-08 15:56 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/987d9d10a30a Added tag hs21-b07 for changeset 2dbcb4a4d8da ! .hgtags Changeset: 24fbb4b7c2d3 Author: trims Date: 2011-04-08 16:18 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/24fbb4b7c2d3 Merge Changeset: 0930dc920c18 Author: trims Date: 2011-04-08 16:18 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/0930dc920c18 7035259: Bump the HS21 build number to 08 Summary: Update the HS21 build number to 08 Reviewed-by: jcoomes ! make/hotspot_version Changeset: c2323e2ea62b Author: never Date: 2011-03-31 21:05 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c2323e2ea62b 6385687: UseFastEmptyMethods/UseFastAccessorMethods considered harmful Reviewed-by: kvn, jrose, phh ! src/share/vm/prims/jvmtiManageCapabilities.cpp ! src/share/vm/runtime/globals.hpp Changeset: f8b038506985 Author: never Date: 2011-04-01 21:45 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/f8b038506985 6909440: C2 fails with assertion (_always_cold->is_cold(),"must always be cold") Reviewed-by: kvn ! src/share/vm/opto/callGenerator.cpp ! src/share/vm/opto/callGenerator.hpp Changeset: 07acc51c1d2a Author: kvn Date: 2011-04-02 09:49 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/07acc51c1d2a 7032314: Allow to generate CallLeafNoFPNode in IdealKit Summary: Added CallLeafNoFPNode generation to IdealKit. Added i_o synchronization. Reviewed-by: never ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/graphKit.hpp ! src/share/vm/opto/idealKit.cpp ! src/share/vm/opto/idealKit.hpp ! src/share/vm/opto/library_call.cpp Changeset: 08eb13460b3a Author: kvn Date: 2011-04-02 10:54 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/08eb13460b3a 7004535: Clone loop predicate during loop unswitch Summary: Clone loop predicate for clonned loops Reviewed-by: never ! 
src/share/vm/opto/cfgnode.cpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/compile.hpp ! src/share/vm/opto/ifnode.cpp + src/share/vm/opto/loopPredicate.cpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopUnswitch.cpp ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/loopnode.hpp ! src/share/vm/opto/loopopts.cpp ! src/share/vm/opto/phaseX.hpp ! src/share/vm/opto/split_if.cpp ! src/share/vm/opto/superword.cpp ! src/share/vm/opto/vectornode.hpp Changeset: 13bc79b5c9c8 Author: roland Date: 2011-04-03 12:00 +0200 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/13bc79b5c9c8 7033154: Improve C1 arraycopy performance Summary: better static analysis. Take advantage of array copy stubs. Reviewed-by: never ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/share/vm/c1/c1_GraphBuilder.cpp ! src/share/vm/c1/c1_Instruction.cpp ! src/share/vm/c1/c1_Instruction.hpp ! src/share/vm/c1/c1_LIR.hpp ! src/share/vm/c1/c1_LIRGenerator.cpp ! src/share/vm/c1/c1_Optimizer.cpp ! src/share/vm/c1/c1_Runtime1.cpp ! src/share/vm/c1/c1_Runtime1.hpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/runtime/stubRoutines.cpp ! src/share/vm/runtime/stubRoutines.hpp Changeset: e863062e521d Author: twisti Date: 2011-04-04 03:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/e863062e521d 7032458: Zero and Shark fixes Reviewed-by: twisti Contributed-by: Gary Benson ! src/cpu/zero/vm/globals_zero.hpp ! src/cpu/zero/vm/relocInfo_zero.cpp ! src/cpu/zero/vm/sharedRuntime_zero.cpp ! src/share/vm/ci/ciTypeFlow.hpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/interpreter/bytecodeInterpreter.cpp ! src/share/vm/shark/sharkCompiler.cpp ! src/share/vm/shark/sharkCompiler.hpp ! src/share/vm/utilities/globalDefinitions.hpp ! 
src/share/vm/utilities/globalDefinitions_gcc.hpp Changeset: 8b2317d732ec Author: never Date: 2011-04-04 12:57 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/8b2317d732ec 7026957: assert(type2aelembytes(store->as_Mem()->memory_type(), true) == 1 << shift->in(2)->get_int()) failed Reviewed-by: kvn, jrose ! src/share/vm/opto/loopTransform.cpp Changeset: bb22629531fa Author: iveresov Date: 2011-04-04 16:00 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/bb22629531fa 7033732: C1: When calling c2 arraycopy stubs offsets and length must have clear upper 32bits Summary: With 7033154 we started calling c2 arraycopy stubs from c1. On sparcv9 we must clear the upper 32bits for offset (src_pos, dst_pos) and length parameters when calling them. Reviewed-by: never, kvn ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp Changeset: a54519951ff6 Author: iveresov Date: 2011-04-04 18:48 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a54519951ff6 Merge Changeset: 87ce328c6a21 Author: never Date: 2011-04-04 19:03 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/87ce328c6a21 6528013: C1 CTW failure with -XX:+VerifyOops assert(allocates2(pc),"") Reviewed-by: kvn, iveresov ! src/share/vm/c1/c1_LIRAssembler.cpp Changeset: fb37e3eabfd0 Author: never Date: 2011-04-04 22:17 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/fb37e3eabfd0 Merge Changeset: d7a3fed1c1c9 Author: kvn Date: 2011-04-04 19:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/d7a3fed1c1c9 7004547: regular loop unroll should not unroll more than max unrolling Summary: Take into account that after unroll conjoined heads and tails will fold. Reviewed-by: never ! 
src/share/vm/opto/loopTransform.cpp Changeset: 03f2be00fa21 Author: kvn Date: 2011-04-05 00:27 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/03f2be00fa21 Merge Changeset: 479b4b4b6950 Author: never Date: 2011-04-05 00:31 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/479b4b4b6950 6777083: assert(target != __null,"must not be null") Reviewed-by: iveresov, kvn ! src/cpu/x86/vm/assembler_x86.hpp ! src/share/vm/code/relocInfo.cpp ! src/share/vm/code/relocInfo.hpp Changeset: 8e77e1f26188 Author: never Date: 2011-04-05 02:31 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/8e77e1f26188 Merge Changeset: 527977d4f740 Author: never Date: 2011-04-05 19:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/527977d4f740 7033779: CodeCache::largest_free_block may need to hold the CodeCache lock Reviewed-by: kvn ! src/share/vm/code/codeCache.cpp ! src/share/vm/code/codeCache.hpp Changeset: 98c560260039 Author: never Date: 2011-04-06 16:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/98c560260039 7034513: enable fast accessors and empty methods for ZERO and -Xint Reviewed-by: kvn, iveresov ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp Changeset: 55973726c600 Author: kvn Date: 2011-04-06 17:32 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/55973726c600 6992789: assert(phi->_idx >= nodes_size()) failed: only new Phi per instance memory slice Summary: Swap checks: check for regular memory slice first and keep input phi. Reviewed-by: never ! src/share/vm/opto/escape.cpp Changeset: ed69575596ac Author: jrose Date: 2011-04-07 17:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/ed69575596ac 6981791: remove experimental code for JSR 292 Reviewed-by: twisti ! agent/src/share/classes/sun/jvm/hotspot/oops/ConstantPool.java ! agent/src/share/classes/sun/jvm/hotspot/runtime/ClassConstants.java ! 
agent/src/share/classes/sun/jvm/hotspot/tools/jcore/ClassWriter.java ! agent/src/share/classes/sun/jvm/hotspot/ui/classbrowser/HTMLGenerator.java ! agent/src/share/classes/sun/jvm/hotspot/utilities/ConstantTag.java ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/classfile/systemDictionary.hpp ! src/share/vm/classfile/verifier.cpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/interpreter/bytecodeTracer.cpp ! src/share/vm/interpreter/interpreterRuntime.cpp ! src/share/vm/interpreter/linkResolver.cpp ! src/share/vm/interpreter/rewriter.cpp ! src/share/vm/oops/constantPoolKlass.cpp ! src/share/vm/oops/constantPoolOop.cpp ! src/share/vm/oops/constantPoolOop.hpp ! src/share/vm/oops/cpCacheOop.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/instanceKlassKlass.cpp ! src/share/vm/oops/methodOop.cpp ! src/share/vm/prims/jvm.h ! src/share/vm/prims/methodHandleWalk.cpp ! src/share/vm/prims/methodHandles.cpp ! src/share/vm/prims/nativeLookup.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/utilities/constantTag.cpp ! src/share/vm/utilities/constantTag.hpp Changeset: 758ba0bf7bcc Author: jrose Date: 2011-04-07 17:12 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/758ba0bf7bcc 7012087: JSR 292 Misleading exception message for a non-bound MH for a virtual method Summary: Improve error message formatting to give more information to user. Also, catch a corner case related to 6930553 and 6844449. Reviewed-by: kvn ! src/share/vm/classfile/classFileParser.cpp ! 
src/share/vm/runtime/sharedRuntime.cpp Changeset: 4124a5a27707 Author: jrose Date: 2011-04-07 17:12 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/4124a5a27707 7009600: JSR 292 Server compiler crashes in Compile::find_intrinsic(ciMethod*, bool) Summary: catch errors during the compile-time processing of method handles; back out cleanly Reviewed-by: twisti ! src/share/vm/ci/ciMethodHandle.cpp ! src/share/vm/opto/doCall.cpp Changeset: 3f49d30f8184 Author: never Date: 2011-04-07 21:32 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3f49d30f8184 7034957: acquiring lock CodeCache_lock/1 out of order with lock tty_lock/0 -- possible deadlock Reviewed-by: iveresov ! src/share/vm/code/codeCache.cpp Changeset: d86923d96dca Author: iveresov Date: 2011-04-08 17:03 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/d86923d96dca 7034967: C1: assert(false) failed: error (assembler_sparc.cpp:2043) Summary: Fix -XX:+VerifyOops Reviewed-by: kvn, never ! src/cpu/sparc/vm/c1_MacroAssembler_sparc.cpp ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/c1_CodeStubs_x86.cpp ! src/share/vm/c1/c1_LIRGenerator.cpp Changeset: 3af54845df98 Author: kvn Date: 2011-04-08 14:56 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3af54845df98 7004555: Add new policy for one iteration loops Summary: Add new policy for one iteration loops (mostly formal pre- loops). Reviewed-by: never ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/loopnode.hpp Changeset: 46d145ee8e68 Author: kvn Date: 2011-04-08 20:52 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/46d145ee8e68 Merge Changeset: 3fa3c7e4d4f3 Author: never Date: 2011-04-08 23:00 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3fa3c7e4d4f3 7035161: assert(!o->is_null_object()) failed: null object not yet handled here. Reviewed-by: kvn ! 
src/share/vm/ci/ciInstance.cpp Changeset: 6c97c830fb6f Author: jrose Date: 2011-04-09 21:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/6c97c830fb6f Merge ! agent/src/share/classes/sun/jvm/hotspot/oops/ConstantPool.java ! agent/src/share/classes/sun/jvm/hotspot/tools/jcore/ClassWriter.java ! agent/src/share/classes/sun/jvm/hotspot/ui/classbrowser/HTMLGenerator.java ! src/cpu/sparc/vm/c1_MacroAssembler_sparc.cpp ! src/share/vm/c1/c1_LIRAssembler.cpp ! src/share/vm/code/codeCache.hpp ! src/share/vm/code/relocInfo.cpp ! src/share/vm/code/relocInfo.hpp ! src/share/vm/oops/constantPoolKlass.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/graphKit.hpp ! src/share/vm/opto/idealKit.cpp ! src/share/vm/opto/idealKit.hpp ! src/share/vm/opto/loopTransform.cpp ! src/share/vm/opto/loopUnswitch.cpp ! src/share/vm/opto/loopopts.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/utilities/globalDefinitions_gcc.hpp Changeset: 5d046bf49ce7 Author: johnc Date: 2011-04-14 13:45 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/5d046bf49ce7 Merge ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/c1_CodeStubs_x86.cpp ! src/share/vm/c1/c1_GraphBuilder.cpp ! src/share/vm/c1/c1_LIRGenerator.cpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/graphKit.hpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp Changeset: c69b1043dfb1 Author: ysr Date: 2011-04-14 12:10 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c69b1043dfb1 7036482: clear argument is redundant and unused in cardtable methods Summary: Removed the unused clear argument to various cardtbale methods and unused mod_oop_in_space_iterate method. Unrelated to synopsis, added a pair of clarifying parens in AllocationStats constructor. 
Reviewed-by: brutisso, jcoomes

! src/share/vm/gc_implementation/parNew/parCardTableModRefBS.cpp
! src/share/vm/gc_implementation/shared/allocationStats.hpp
! src/share/vm/memory/cardTableModRefBS.cpp
! src/share/vm/memory/cardTableModRefBS.hpp
! src/share/vm/memory/cardTableRS.cpp
! src/share/vm/memory/modRefBarrierSet.hpp

Changeset: 4080db1b5d0a
Author:    johnc
Date:      2011-04-14 13:49 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/4080db1b5d0a

Merge

From john.cuthbertson at oracle.com  Fri Apr 15 10:53:33 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 15 Apr 2011 10:53:33 -0700
Subject: RFR(XXS): 7036706 G1: Use LIR_OprDesc::as_pointer_register in code changes for 7035117
Message-ID: <4DA8861D.1010806@oracle.com>

Hi Everyone,

Can I have a couple of volunteers to review this change? The webrev can
be found at: http://cr.openjdk.java.net/~johnc/7036706/webrev.0/

The change incorporates Christian's suggestion to use
LIR_OprDesc::as_pointer_register() to define thread_reg in the C1 code
stubs for Unsafe.getObject.

Testing: nsk stress tests (including jni stress tests) with
TieredCompilation on both sparc64 and x64.

Thanks

JohnC

From igor.veresov at oracle.com  Fri Apr 15 11:24:13 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Fri, 15 Apr 2011 11:24:13 -0700
Subject: RFR(XXS): 7036706 G1: Use LIR_OprDesc::as_pointer_register in code changes for 7035117
In-Reply-To: <4DA8861D.1010806@oracle.com>
References: <4DA8861D.1010806@oracle.com>
Message-ID: <4DA88D4D.2000908@oracle.com>

Looks good. Sorry for not spotting it the first time.

igor

On 4/15/11 10:53 AM, John Cuthbertson wrote:
> Hi Everyone,
>
> Can I have a couple of volunteers to review this change? The webrev can
> be found at: http://cr.openjdk.java.net/~johnc/7036706/webrev.0/
>
> The change incorporates Christian's suggestion to use
> LIR_OprDesc::as_pointer_register() to define thread_reg in the C1 code
> stubs for Unsafe.getObject.
>
> Testing: nsk stress tests (including jni stress tests) with
> TieredCompilation on both sparc64 and x64.
>
> Thanks
>
> JohnC

From john.cuthbertson at oracle.com  Fri Apr 15 23:47:04 2011
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Sat, 16 Apr 2011 06:47:04 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 7036021: G1: build failure on win64 and linux with hs21 in jdk6 build environment
Message-ID: <20110416064711.1EBDD47B29@hg.openjdk.java.net>

Changeset: edd9b016deb6
Author:    johnc
Date:      2011-04-15 10:10 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/edd9b016deb6

7036021: G1: build failure on win64 and linux with hs21 in jdk6 build environment
Summary: Missing parentheses around a casted expression and some missing casts were causing build failures with the jdk6 build tools.
Reviewed-by: kvn, brutisso

! src/share/vm/gc_implementation/g1/concurrentG1Refine.hpp
! src/share/vm/opto/library_call.cpp

From John.Coomes at oracle.com  Sat Apr 16 13:38:45 2011
From: John.Coomes at oracle.com (John Coomes)
Date: Sat, 16 Apr 2011 13:38:45 -0700
Subject: review request (S): 7037250 cscope.make is silently broken
Message-ID: <19881.65109.537647.669897@oracle.com>

I'd appreciate reviews of the fix for

  7037250: cscope.make database generation is silently broken

I recently resumed using cscope and found cscope.make was rather
broken. Details are in the webrev:

  http://cr.openjdk.java.net/~jcoomes/7037250-cscope/

Thanks for any comments.
-John

From jon.masamitsu at oracle.com  Sun Apr 17 16:32:50 2011
From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com)
Date: Sun, 17 Apr 2011 23:32:50 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 6946385: G1: jstat does not support G1 GC
Message-ID: <20110417233252.E926247B9F@hg.openjdk.java.net>

Changeset: 1d0b856224f8
Author:    jmasa
Date:      2011-04-17 01:24 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1d0b856224f8

6946385: G1: jstat does not support G1 GC
Summary: Added counters for jstat
Reviewed-by: tonyp, jwilhelm, stefank, ysr, johnc

From y.s.ramakrishna at oracle.com  Sun Apr 17 23:49:21 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Sun, 17 Apr 2011 23:49:21 -0700
Subject: question on finalize method
In-Reply-To:
References:
Message-ID: <4DABDEF1.3050007@oracle.com>

Yes; indeed, the spec is deliberately loose because it is
difficult in practice to implement any hard promptness
guarantees in general.

-- ramki

On 4/17/2011 8:55 PM, Ted Yu wrote:
> Is this statement true for Java 1.6 and beyond (
> http://forums.whirlpool.net.au/archive/754353) ?
> In fact, it is perfectly permissible for a Java VM to *never* call it.
>
> Thanks

From bluedavy at gmail.com  Mon Apr 18 06:48:07 2011
From: bluedavy at gmail.com (BlueDavy Lin)
Date: Mon, 18 Apr 2011 21:48:07 +0800
Subject: Crash log when do GC...
Message-ID:

Hi!

Recently two of our apps have often crashed during GC; the crash log is
attached. Can someone give me some advice? Thanks.

PS: I tried setting -XX:-UseCompressedOops, but it still crashes, and
the log is the same.

--
=============================
|  BlueDavy                 |
|  http://www.bluedavy.com  |
=============================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hs_err_pid6208.log
Type: application/octet-stream
Size: 119663 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110418/4096f72f/attachment-0001.obj

From bluedavy at gmail.com  Mon Apr 18 06:50:36 2011
From: bluedavy at gmail.com (BlueDavy Lin)
Date: Mon, 18 Apr 2011 21:50:36 +0800
Subject: Crash log when do GC...
In-Reply-To:
References:
Message-ID:

The code where the crash occurs, from the core dump:

#6
#7  0x00002b9da8b282c3 in ParScanClosure::do_oop_work ()

2011/4/18 BlueDavy Lin :
> Hi!
>
> Recently two of our apps have often crashed during GC; the crash log is
> attached. Can someone give me some advice? Thanks.
>
> PS: I tried setting -XX:-UseCompressedOops, but it still crashes, and
> the log is the same.
>
> --
> =============================
> |  BlueDavy                 |
> |  http://www.bluedavy.com  |
> =============================
>

--
=============================
|  BlueDavy                 |
|  http://www.bluedavy.com  |
=============================

From y.s.ramakrishna at oracle.com  Mon Apr 18 07:58:42 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Mon, 18 Apr 2011 07:58:42 -0700
Subject: Crash log when do GC...
In-Reply-To:
References:
Message-ID: <4DAC51A2.2060506@oracle.com>

Hi,

I have heard a couple of other reports of this sort recently.
But I don't think we have found or fixed any issue recently that
might address this. You might want to try a more recent JVM/JDK
to confirm whether the crash still occurs (which I think it probably
will, going by other such reports).

Do you have a test case? If so, please file a bug through support
or send us your test case off-line.
You can also enable heap verification, at some considerable GC
performance cost, and see if that gets us closer to the root cause.
(From looking at the stack retrace it appears as though GC finds a bad
reference from an object array while copying live objects from the
young generation during a scavenge.)

-- ramki

On 4/18/2011 6:48 AM, BlueDavy Lin wrote:
> Hi!
>
> Recently two of our apps have often crashed during GC; the crash log is
> attached. Can someone give me some advice? Thanks.
>
> PS: I tried setting -XX:-UseCompressedOops, but it still crashes, and
> the log is the same.
>

From y.s.ramakrishna at oracle.com  Mon Apr 18 11:34:02 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Mon, 18 Apr 2011 11:34:02 -0700
Subject: Request for review (S): 7037276: Unnecessary double traversal of dirty card windows
Message-ID: <4DAC841A.4030806@oracle.com>

I'd like a couple of code reviews for:

7037276: Unnecessary double traversal of dirty card windows
http://cr.openjdk.java.net/~ysr/7037276/webrev.00/

Card scanning with ParNew and DefNew scavenges was collecting
dirty card windows in one method and then doing a retraversal
of that window in another method to clear the cards and iterate
over the refs on those cards. This double traversal was unnecessary,
since the collection of contiguous ranges, their clearing and the
iteration over the covered refs could all be done in a single sweep
over the cards covering the region of interest.

I also specialized some of the arguments to these calls,
made some internal methods private and renamed a few others
for greater clarity.

This is the second of a series of about 4 CRs aimed at eventually
fixing 6883834 (the last in the series). Stay tuned for some
more incremental rearrangements of related code in another
one or two subsequent CRs.

A quick performance measurement didn't show any appreciable
change in scavenge times as a result of this change, but more
careful performance measurements are in progress.

Tested with jprt, the test case for 6883834 and refworkload server.
Perf measurements with refworkload server in progress.

Thanks for your reviews.
-- ramki

From john.cuthbertson at oracle.com  Mon Apr 18 19:17:03 2011
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Tue, 19 Apr 2011 02:17:03 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 7036706: G1: Use LIR_OprDesc::as_pointer_register in code changes for 7035117
Message-ID: <20110419021709.532B247BEB@hg.openjdk.java.net>

Changeset: 527b586edf24
Author:    johnc
Date:      2011-04-18 16:27 -0700
URL:       http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/527b586edf24

7036706: G1: Use LIR_OprDesc::as_pointer_register in code changes for 7035117
Summary: Use LIR_OprDesc::as_pointer_register() instead as_register/as_register_lo combination in the code changes for 7035117.
Reviewed-by: iveresov

! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp
! src/cpu/x86/vm/c1_CodeStubs_x86.cpp

From y.s.ramakrishna at oracle.com  Tue Apr 19 19:28:40 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Tue, 19 Apr 2011 19:28:40 -0700
Subject: Request for review (S): 7037276: Unnecessary double traversal of dirty card windows
In-Reply-To: <4DAC841A.4030806@oracle.com>
References: <4DAC841A.4030806@oracle.com>
Message-ID: <4DAE44D8.9030102@oracle.com>

I discovered a small flaw in this which I am going to fix.
If you have not started your review yet, please hold off
until I publish a second version of the changeset that
fixes the flaw I found. Sorry for not having found this
earlier and for the delay.

-- ramki

On 4/18/2011 11:34 AM, Y. Srinivas Ramakrishna wrote:
>
> I'd like a couple of code reviews for:
>
> 7037276: Unnecessary double traversal of dirty card windows
> http://cr.openjdk.java.net/~ysr/7037276/webrev.00/
>
> Card scanning with ParNew and DefNew scavenges was collecting
> dirty card windows in one method and then doing a retraversal
> of that window in another method to clear the cards and iterate
> over the refs on those cards. This double traversal was unnecessary,
> since the collection of contiguous ranges, their clearing and the
> iteration over the covered refs could all be done in a single sweep
> over the cards covering the region of interest.
>
> I also specialized some of the arguments to these calls,
> made some internal methods private and renamed a few others
> for greater clarity.
>
> This is the second of a series of about 4 CRs aimed at eventually
> fixing 6883834 (the last in the series). Stay tuned for some
> more incremental rearrangements of related code in another
> one or two subsequent CRs.
>
> A quick performance measurement didn't show any appreciable
> change in scavenge times as a result of this change, but more
> careful performance measurements are in progress.
>
> Tested with jprt, the test case for 6883834 and refworkload server.
> Perf measurements with refworkload server in progress.
>
> thanks for your reviews.
> -- ramki From tony.printezis at oracle.com Tue Apr 19 19:43:37 2011 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Wed, 20 Apr 2011 02:43:37 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7011855: G1: non-product flag to artificially grow the heap Message-ID: <20110420024343.A2EA647C46@hg.openjdk.java.net> Changeset: 49a67202bc67 Author: tonyp Date: 2011-04-19 15:46 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/49a67202bc67 7011855: G1: non-product flag to artificially grow the heap Summary: It introduces non-product cmd line parameter G1DummyRegionsPerGC which indicates how many "dummy" regions to allocate at the end of each GC. This allows the G1 heap to grow artificially and makes concurrent marking cycles more frequent irrespective of what the application that is running is doing. The dummy regions will be found totally empty during cleanup so this parameter can also be used to stress the concurrent cleanup operation. Reviewed-by: brutisso, johnc ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1_globals.hpp From y.s.ramakrishna at oracle.com Tue Apr 19 20:13:24 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Tue, 19 Apr 2011 20:13:24 -0700 Subject: Request for review (S): 7037276: Unnecessary double traversal of dirty card windows In-Reply-To: <4DAE44D8.9030102@oracle.com> References: <4DAC841A.4030806@oracle.com> <4DAE44D8.9030102@oracle.com> Message-ID: <4DAE4F54.4030306@oracle.com> It's me again; looks like i got a couple of my repos confused, and raised a false alarm. Please go ahead, continuing to use the previous version of the webrev for your review. Sorry for my confusion. -- ramki On 4/19/2011 7:28 PM, Y. Srinivas Ramakrishna wrote: > I discovered a small flaw in this which I am going to fix. 
> If you have not started your review yet, please hold off > until I publish a second version of the changeset that > fixes the flaw I found. > > Sorry for not having found this earlier and for the delay. > -- ramki > > On 4/18/2011 11:34 AM, Y. Srinivas Ramakrishna wrote: >> >> I'd like a couple of code reviews for: >> >> 7037276: Unnecessary double traversal of dirty card windows >> http://cr.openjdk.java.net/~ysr/7037276/webrev.00/ >> >> Card scanning with ParNew and DefNew scavenges was collecting >> dirty card windows in one method and then doing a retraversal >> of that window in another method to clear the cards and iterate >> over the refs on those cards. This double traversal was unnecessary, >> since the collection of contiguous ranges, their clearing and the >> iteration over the covered refs could all be done in a single sweep >> over the cards covering the region of interest. >> >> I also specialized some of the arguments to these calls, >> made some internal methods private and renamed a few others >> for greater clarity. >> >> This is the second of a series of about 4 CRs aimed at eventually >> fixing 6883834 (the last in the series). Stay tuned for some >> more incremental rearrangements of related code in another >> one or two subsequent CRs. >> >> A quick performance measurement didn't show any appreciable >> change in scavenge times as a result of this change, but more >> careful performance measurements are in progress. >> >> Tested with jprt, the test case for 6883834 and refworkload server. >> Perf measurements with refworkload server in progress. >> >> thanks for your reviews. 
>> -- ramki > From tony.printezis at oracle.com Wed Apr 20 12:56:57 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 20 Apr 2011 15:56:57 -0400 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DA4A9CE.5070107@oracle.com> References: <4DA4A9CE.5070107@oracle.com> Message-ID: <4DAF3A89.7010706@oracle.com> Hi all, I'd still like a couple of code reviews for this. Here's the latest version (I only rephrased a couple of comments, so if you're looking at the earlier version already you can ignore this one): http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ Tony Tony Printezis wrote: > Hi, > > Could I get a couple of people to look at this? (I'd like to push this > this week if possible) > > http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ > > The actual fix is reasonably small (leave / join the > SuspendibleThreadSet only if we are in concurrent mode). Most of the > changes are new infrastructure to cause a fixed number of overflows > during marking (in non-product builds of course) to stress the > overflow code. This was the only way I could reliably reproduce the > failure. This did uncover a couple of extra issues which I also fixed: > > - If we overflow during remark we should not actually deal with it > during remark but we should abort the remark pause and restart a > concurrent mark phase. For some reason we were not doing that. I fixed > that (for this I had to ensure that the overflow flag is not cleared > when we exit the do_marking_step() method). > - Because we were clearing the overflow, it was also possible that the > workers would deadlock (for that to happen a worker had to finish > handling one overflow and immediately raise another one, so it was > highly unlikely to occur in practice; good to find it and eliminate it > though). > > I've already tested it, I'll run more tests overnight.
> > Tony From john.cuthbertson at oracle.com Wed Apr 20 14:04:25 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 20 Apr 2011 14:04:25 -0700 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DAF3A89.7010706@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> Message-ID: <4DAF4A59.5060702@oracle.com> Hi Tony, I should get to this later this afternoon. I want to kick off testing of the fix for 7037756 first. JohnC On 04/20/11 12:56, Tony Printezis wrote: > Hi all, > > I'd still like a couple of code reviews for this. Here's the latest > version (I only rephrased a couple of comments, so if you're looking > at the earlier version already you can ignore this one): > > http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ > > Tony > > Tony Printezis wrote: >> Hi, >> >> Could I get a couple of people to look at this? (I'd like to push >> this this week if possible) >> >> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >> >> The actual fix is reasonably small (leave / join the >> SuspendibleThreadSet only if we are in concurrent mode). Most of the >> changes are new infrastructure to cause a fixed number of overflows >> during marking (in non-product builds of course) to stress the >> overflow code. This was the only way I could reliably reproduce the >> failure. This did uncover a couple of extra issues which I also fixed: >> >> - If we overflow during remark we should not actually deal with it >> during remark but we should abort the remark pause and restart a >> concurrent mark phase. For some reason we were not doing that. I >> fixed that (for this I had to ensure that the overflow flag is not >> cleared when we exit the do_marking_step() method). 
>> - Because we were clearing the overflow, it was also possible that >> the workers would deadlock (for that to happen a worker had to finish >> handling one overflow and immediately raise another one, so it was >> highly unlikely to occur in practice; good to find it and eliminate >> it though). >> >> I've already tested it, I'll run more tests overnight. >> >> Tony From igor.veresov at oracle.com Wed Apr 20 21:04:06 2011 From: igor.veresov at oracle.com (igor.veresov at oracle.com) Date: Thu, 21 Apr 2011 04:04:06 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7034464: Support transparent large pages on Linux Message-ID: <20110421040408.88DE147CE3@hg.openjdk.java.net> Changeset: 139667d9836a Author: iveresov Date: 2011-04-20 17:12 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/139667d9836a 7034464: Support transparent large pages on Linux Summary: Support transparent huge pages on Linux available since 2.6.38 Reviewed-by: iveresov, ysr Contributed-by: aph at redhat.com ! src/os/linux/vm/globals_linux.hpp ! src/os/linux/vm/os_linux.cpp ! src/os/linux/vm/os_linux.hpp From y.s.ramakrishna at oracle.com Thu Apr 21 03:25:09 2011 From: y.s.ramakrishna at oracle.com (y.s.ramakrishna at oracle.com) Date: Thu, 21 Apr 2011 10:25:09 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 2 new changesets Message-ID: <20110421102516.A3D5147D18@hg.openjdk.java.net> Changeset: c48ad6ab8bdf Author: ysr Date: 2011-04-20 19:19 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c48ad6ab8bdf 7037276: Unnecessary double traversal of dirty card windows Summary: Short-circuited an unnecessary double traversal of dirty card windows when iterating younger refs. Also renamed some cardtable methods for more clarity. Reviewed-by: jmasa, stefank, poonam ! src/share/vm/gc_implementation/parNew/parCardTableModRefBS.cpp ! src/share/vm/memory/cardTableModRefBS.cpp ! src/share/vm/memory/cardTableModRefBS.hpp ! src/share/vm/memory/cardTableRS.cpp !
src/share/vm/memory/cardTableRS.hpp Changeset: c0dcda80820f Author: ysr Date: 2011-04-21 01:16 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c0dcda80820f Merge From igor.veresov at oracle.com Thu Apr 21 14:22:38 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 21 Apr 2011 14:22:38 -0700 Subject: review(S): 7037939: NUMA: Disable adaptive resizing if SHM large pages are used Message-ID: <4DB0A01E.8040003@oracle.com> The fix has two parts: 1. On Solaris, when ISM shared memory is used it is always allocated round-robin across the nodes, so the NUMA allocator cannot work. The fix is to disable UseNUMA if the ISM method is selected. However, I let UseNUMA win if UseLargePages and UseSHM are not explicitly specified. 2. On Linux, it's impossible to use adaptive resizing with UseNUMA if SHM shared memory is used. That is because we cannot uncommit a page in such a mapping. The solution is to disable adaptive resizing if the user really wants it, that is, when UseNUMA and (UseLargePages or UseSHM) are specified on the command line. Like on Solaris, I let UseNUMA win if it's explicitly specified and UseLargePages and UseSHM are not. Webrev: http://cr.openjdk.java.net/~iveresov/7037939/webrev.00/ Thanks, igor From jon.masamitsu at oracle.com Thu Apr 21 18:53:10 2011 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Fri, 22 Apr 2011 01:53:10 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 6946417: G1: Java VisualVM does not support G1 properly. Message-ID: <20110422015315.78DEA47DD0@hg.openjdk.java.net> Changeset: b52782ae3880 Author: jmasa Date: 2011-04-21 10:23 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b52782ae3880 6946417: G1: Java VisualVM does not support G1 properly. Summary: Added counters for jstat Reviewed-by: tonyp, jwilhelm, stefank, ysr, johnc ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp !
src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp + src/share/vm/gc_implementation/g1/g1MonitoringSupport.cpp + src/share/vm/gc_implementation/g1/g1MonitoringSupport.hpp ! src/share/vm/gc_implementation/shared/generationCounters.cpp ! src/share/vm/gc_implementation/shared/generationCounters.hpp + src/share/vm/gc_implementation/shared/hSpaceCounters.cpp + src/share/vm/gc_implementation/shared/hSpaceCounters.hpp ! src/share/vm/services/g1MemoryPool.cpp ! src/share/vm/services/g1MemoryPool.hpp From jon.masamitsu at oracle.com Fri Apr 22 14:48:38 2011 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Fri, 22 Apr 2011 21:48:38 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 27 new changesets Message-ID: <20110422214928.B6F4447E86@hg.openjdk.java.net> Changeset: 677234770800 Author: dsamersoff Date: 2011-03-30 19:38 +0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/677234770800 7017193: Small memory leak in get_stack_bounds os::create_stack_guard_pages Summary: getline() returns -1 but still allocate memory for str Reviewed-by: dcubed, coleenp ! src/os/linux/vm/os_linux.cpp ! src/share/vm/runtime/os.cpp ! src/share/vm/runtime/os.hpp Changeset: b025bffd6c2c Author: dholmes Date: 2011-03-31 06:54 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/b025bffd6c2c 7032775: Include Shark code in the build again Reviewed-by: ohair Contributed-by: gbenson at redhat.com, ahughes at redhat.com ! make/linux/makefiles/vm.make Changeset: 37be97a58393 Author: andrew Date: 2011-04-01 15:15 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/37be97a58393 7010849: 5/5 Extraneous javac source/target options when building sa-jdi Summary: Make code changes necessary to get rid of the '-source 1.4 -target 1.4' options. Reviewed-by: dholmes, dcubed ! agent/src/share/classes/sun/jvm/hotspot/HelloWorld.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/ByteValueImpl.java ! 
agent/src/share/classes/sun/jvm/hotspot/jdi/CharValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/ConnectorImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/DoubleValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/FieldImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/FloatValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/IntegerValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/LocalVariableImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/LocationImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/LongValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/MethodImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/ReferenceTypeImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/ShortValueImpl.java ! agent/src/share/classes/sun/jvm/hotspot/jdi/VirtualMachineImpl.java ! make/linux/makefiles/sa.make ! make/solaris/makefiles/sa.make ! make/windows/makefiles/sa.make Changeset: 7144a1d6e0a9 Author: kamg Date: 2011-03-31 08:08 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/7144a1d6e0a9 7030388: JCK test failed to reject invalid class check01304m10n. Summary: Restrict fix for 7020118 to only when checking exception handlers Reviewed-by: dcubed, dholmes ! src/share/vm/classfile/stackMapFrame.cpp ! src/share/vm/classfile/stackMapFrame.hpp ! src/share/vm/classfile/stackMapTable.cpp Changeset: 11427f216063 Author: dholmes Date: 2011-04-04 18:15 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/11427f216063 7009276: Add -XX:+IgnoreUnrecognizedVMOptions to several tests Reviewed-by: kvn ! test/compiler/6795161/Test.java Changeset: 1dac0f3af89f Author: ohair Date: 2011-04-07 20:26 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1dac0f3af89f 7019210: Fix misc references to /bugreport websites Reviewed-by: skannan ! 
src/share/vm/runtime/arguments.cpp Changeset: c49c3947b98a Author: brutisso Date: 2011-04-11 11:12 +0200 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c49c3947b98a 7034625: Product builds in Visual Studio projects should produce full symbol information Summary: Add the /debug flag to the linker command in Visual Studio Reviewed-by: mgronlun, poonam, hosterda ! src/share/tools/ProjectCreator/WinGammaPlatformVC10.java Changeset: 6a615eae2f34 Author: dholmes Date: 2011-04-12 02:53 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/6a615eae2f34 7034585: Adjust fillInStackTrace filtering to assist 6998871 Summary: Allow for one or more fillInStackTrace frames to be skipped Reviewed-by: mchung, kvn ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/vmSymbols.hpp Changeset: 3449f5e02cc4 Author: coleenp Date: 2011-04-12 14:18 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3449f5e02cc4 Merge ! make/linux/makefiles/vm.make ! src/share/vm/classfile/javaClasses.cpp ! src/share/vm/classfile/stackMapFrame.cpp ! src/share/vm/classfile/stackMapFrame.hpp ! src/share/vm/classfile/stackMapTable.cpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/runtime/arguments.cpp Changeset: 328926869b15 Author: jrose Date: 2011-04-09 22:55 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/328926869b15 6987991: JSR 292 phpreboot test/testtracefun2.phpr segfaults Summary: Make MH verification tests more correct, robust, and informative. Fix lingering symbol refcount problems. Reviewed-by: twisti ! src/share/vm/oops/methodOop.cpp ! src/share/vm/prims/methodHandleWalk.hpp ! src/share/vm/prims/methodHandles.cpp Changeset: 15c9a0e16269 Author: kvn Date: 2011-04-11 15:30 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/15c9a0e16269 7035713: 3DNow Prefetch Instruction Support Summary: The upcoming processors from AMD are the first that support 3dnow prefetch without supporting the 3dnow instruction set. 
Reviewed-by: kvn Contributed-by: tom.deneau at amd.com ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/cpu/x86/vm/x86_32.ad Changeset: 4b95bbb36464 Author: twisti Date: 2011-04-12 02:40 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/4b95bbb36464 7035870: JSR 292: Zero support Summary: This adds support for JSR 292 to Zero. Reviewed-by: twisti Contributed-by: Gary Benson ! src/cpu/zero/vm/bytecodeInterpreter_zero.hpp ! src/cpu/zero/vm/cppInterpreter_zero.cpp ! src/cpu/zero/vm/cppInterpreter_zero.hpp ! src/cpu/zero/vm/interpreter_zero.cpp ! src/cpu/zero/vm/methodHandles_zero.cpp ! src/share/vm/interpreter/bytecodeInterpreter.cpp ! src/share/vm/interpreter/bytecodeInterpreter.hpp Changeset: 3a808be061ff Author: iveresov Date: 2011-04-13 14:33 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/3a808be061ff 6988308: assert((cnt > 0.0f) && (prob > 0.0f)) failed: Bad frequency assignment in if Summary: Make sure cnt doesn't become negative and integer overflow doesn't happen. Reviewed-by: kvn, twisti ! src/share/vm/opto/parse2.cpp Changeset: dbccacb79c63 Author: iveresov Date: 2011-04-14 00:02 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/dbccacb79c63 7036236: VM crashes assert((!inside_attrs()) || is_error_reported()) failed ... Summary: Eliminate the race condition. Reviewed-by: kvn ! src/share/vm/code/codeCache.cpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/runtime/sweeper.cpp Changeset: 1fcd6e9c3965 Author: twisti Date: 2011-04-14 01:53 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1fcd6e9c3965 7036220: Shark fails to find LLVM 2.9 System headers during build Reviewed-by: gbenson, twisti Contributed-by: Xerxes Ranby ! 
src/share/vm/shark/llvmHeaders.hpp Changeset: e9b9554f7fc3 Author: twisti Date: 2011-04-14 06:46 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/e9b9554f7fc3 Merge Changeset: 97e8046e2562 Author: jrose Date: 2011-04-15 08:29 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/97e8046e2562 Merge Changeset: 5504afd15955 Author: zgu Date: 2011-04-14 11:50 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/5504afd15955 7033100: CreateMinidumpOnCrash does not work for failed asserts Summary: Passing NULL as MINIDUMP_EXCEPTION_INFORMATION when calling MiniDumpWriteDump when crash is due to assertion instead of real exception to avoid creating zero-length mini dump file. Reviewed-by: acorn, dcubed, poonam, coleenp ! src/os/windows/vm/os_windows.cpp Changeset: 6c9cec219ce4 Author: vladidan Date: 2011-04-11 23:02 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/6c9cec219ce4 7005865: Crash when running with PrintIRWithLIR Summary: the failure is caused by uninitialized bci number Reviewed-by: iveresov ! src/share/vm/c1/c1_Instruction.cpp Changeset: c737922fd8bb Author: vladidan Date: 2011-04-12 10:32 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c737922fd8bb Merge Changeset: 208b6c560ff4 Author: vladidan Date: 2011-04-14 11:02 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/208b6c560ff4 Merge ! src/share/vm/c1/c1_Instruction.cpp Changeset: a534c140904e Author: vladidan Date: 2011-04-14 23:06 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/a534c140904e Merge Changeset: 8ce625481709 Author: coleenp Date: 2011-04-15 09:36 -0400 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/8ce625481709 7032407: Crash in LinkResolver::runtime_resolve_virtual_method() Summary: Make CDS reorder vtables so that dump time vtables match run time order, so when redefine classes reinitializes them, they aren't in the wrong order. Reviewed-by: dcubed, acorn ! 
src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/memory/dump.cpp ! src/share/vm/oops/instanceKlassKlass.cpp ! src/share/vm/oops/klass.cpp ! src/share/vm/oops/klassVtable.cpp ! src/share/vm/oops/klassVtable.hpp Changeset: fcc932c8238c Author: thurka Date: 2011-04-16 11:59 +0200 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/fcc932c8238c 7007254: NullPointerException occurs with jvisualvm placed under a dir. including Japanese chars Summary: use java_lang_String::create_from_platform_dependent_str() instead of java_lang_String::create_from_str() in JvmtiEnv::AddToSystemClassLoaderSearch() Reviewed-by: dcubed ! src/share/vm/prims/jvmtiEnv.cpp Changeset: df8a1555b1ea Author: coleenp Date: 2011-04-19 20:40 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/df8a1555b1ea Merge Changeset: 732454aaf5cb Author: jmasa Date: 2011-04-20 20:32 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/732454aaf5cb Merge ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/zero/vm/cppInterpreter_zero.cpp ! src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/runtime/arguments.cpp Changeset: 7f3faf7159fd Author: jmasa Date: 2011-04-22 09:26 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/7f3faf7159fd Merge ! src/os/linux/vm/os_linux.cpp From tony.printezis at oracle.com Fri Apr 22 15:58:38 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 22 Apr 2011 18:58:38 -0400 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DAF3A89.7010706@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> Message-ID: <4DB2081E.5020701@oracle.com> Thanks to John Cuthbertson for looking at this. I took his advice and I'm going to disable the forced overflow by default (by setting the default parameter to 0), but leave the code in as it's helpful. 
Latest version here: http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ Tony Tony Printezis wrote: > Hi all, > > I'd still like a couple of code reviews for this. Here's the latest > version (I only rephrased a couple of comments, so if you're looking > at the earlier version already you can ignore this one): > > http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ > > Tony > > Tony Printezis wrote: >> Hi, >> >> Could I get a couple of people to look at this? (I'd like to push >> this this week if possible) >> >> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >> >> The actual fix is reasonably small (leave / join the >> SuspendibleThreadSet only if we are in concurrent mode). Most of the >> changes are new infrastructure to cause a fixed number of overflows >> during marking (in non-product builds of course) to stress the >> overflow code. This was the only way I could reliably reproduce the >> failure. This did uncover a couple of extra issues which I also fixed: >> >> - If we overflow during remark we should not actually deal with it >> during remark but we should abort the remark pause and restart a >> concurrent mark phase. For some reason we were not doing that. I >> fixed that (for this I had to ensure that the overflow flag is not >> cleared when we exit the do_marking_step() method). >> - Because we were clearing the overflow, it was also possible that >> the workers would deadlock (for that to happen a worker had to finish >> handling one overflow and immediately raise another one, so it was >> highly unlikely to occur in practice; good to find it and eliminate >> it though). >> >> I've already tested it, I'll run more tests overnight.
>> >> Tony > From john.cuthbertson at oracle.com Fri Apr 22 17:28:01 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 22 Apr 2011 17:28:01 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 Message-ID: <4DB21D11.5090308@oracle.com> Hi Everyone, Can I have a couple of volunteers to look over these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.1 The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is in the queue.
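The three-thread cycle summarized above is broken by never letting the holder of the pending list lock wait for the compile queue lock. A minimal sketch of that try-lock idea follows; it is not the code in the webrev. `QueueLock`, `enqueue_compile_task` and the `holds_pending_list_lock` flag are hypothetical names chosen for this example, and the spin lock only stands in for the real MethodCompileQueue_lock.

```cpp
#include <atomic>
#include <cassert>
#include <deque>

// Hypothetical test-and-set lock standing in for MethodCompileQueue_lock.
class QueueLock {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool try_lock() { return !locked_.exchange(true, std::memory_order_acquire); }
  // Spins until the lock is acquired; a real VM lock would block instead.
  void lock() { while (!try_lock()) { } }
  void unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};

// Returns true if `task` was enqueued. A caller that holds the pending list
// lock must not wait here, so it only tries the lock and gives up on failure.
bool enqueue_compile_task(QueueLock& queue_lock, std::deque<int>& queue,
                          int task, bool holds_pending_list_lock) {
  if (holds_pending_list_lock) {
    if (!queue_lock.try_lock()) {
      return false;  // do not enqueue and do not block: the cycle cannot form
    }
  } else {
    queue_lock.lock();  // ordinary callers may wait for the lock
  }
  queue.push_back(task);
  queue_lock.unlock();
  return true;
}
```

Dropping the request is acceptable in this sketch on the assumption that a missed compilation can simply be requested again later, which costs far less than a deadlock.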
Thanks, JohnC From john.cuthbertson at oracle.com Fri Apr 22 17:33:26 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 22 Apr 2011 17:33:26 -0700 Subject: RFR(M): 7004681: G1: Extend marking verification to marking phase of Full GCs Message-ID: <4DB21E56.6070204@oracle.com> Hi Everyone, A new webrev for this CR can be found at: http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.3. I'd like to get at least one more person to look over these changes (Tony has already looked at an earlier version). The latest webrev includes skipping the region set verification if the verification was called from a full GC (in G1 the region sets are torn down at the start of the full GC and so the verification will give a false failure). Testing: GC test suite with +VerifyDuringGC with and without G1. Thanks, JohnC From john.cuthbertson at oracle.com Fri Apr 22 17:38:23 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 22 Apr 2011 17:38:23 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <4DB21D11.5090308@oracle.com> References: <4DB21D11.5090308@oracle.com> Message-ID: <4DB21F7F.9050308@oracle.com> Hi Everyone. Typo.... On 04/22/11 17:28, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers to look over these changes? The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7037756/webrev.1 > > The issue here was very similar to the issue that caused 6789220 - the > difference here was that the reference handler was blocked while > waiting for the MethodCompileQueue_lock rather than waiting on a > blocking compilation. To summarize: > > Thread 6 (reference handler thread), while owning the pending list > lock, requested a compilation and was blocked waiting on the > MethodCompileQueue_lock. > > Thread 11 (compiler thread 1), while owning the Compile_lock, > attempted to allocate a Class mirror which triggered GC.
In the GC it > was blocked attempting to lock the pending list lock. > > Thread 12 (compiler thread 2) was registering a compiled method and, > while owning the MethodCompileQueue_lock, was blocked waiting on the > Compile_lock. > > The solution is to make the reference handler thread not block while > holding the pending list lock. If the requesting thread is the > reference handler thread, then an attempt is made to lock the > MethodCompileQueue_lock in CompileBroker::compile_method_base and, if > that is unsuccessful, we just return with enqueueing the compile task. > Otherwise a regular blocking lock attempt is made. I also tweaked the > fix made by Bengt for 6789220 to make all compilation requests by the > reference handler thread non-blocking. The above paragraph should read: The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. > > Testing: the failing test case has been running successfully on the > VMSQE machine for 2 days (normally I see the deadlock after 20 minutes > or so); the nsk tests; and a jprt job is in the queue.
> > Thanks, > > JohnC > From tom.rodriguez at oracle.com Fri Apr 22 18:48:55 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Fri, 22 Apr 2011 18:48:55 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <4DB21F7F.9050308@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> Message-ID: <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like:

  if (instanceRefKlass::owns_pending_list_lock(JavaThread::current())) {
    return false;
  }

replacing the fix in CompileBroker::is_compile_blocking seems like it should work. tom On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: > Hi EVeryone. > > Typo.... > > On 04/22/11 17:28, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers to look over these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >> >> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >> >> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >> >> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >> >> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >> >> The solution is to make the reference handler thread not block while holding the pending list lock.
If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. > The above paragraph should read; > > The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >> >> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >> >> Thanks, >> >> JohnC >> > From john.coomes at oracle.com Sat Apr 23 08:09:29 2011 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 23 Apr 2011 15:09:29 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7037250: cscope.make database generation is silently broken Message-ID: <20110423150933.20BBB47EFB@hg.openjdk.java.net> Changeset: d6cdc6c77582 Author: jcoomes Date: 2011-04-23 04:20 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/d6cdc6c77582 7037250: cscope.make database generation is silently broken Reviewed-by: stefank + make/cscope.make ! make/linux/Makefile - make/linux/makefiles/cscope.make ! make/solaris/Makefile - make/solaris/makefiles/cscope.make From y.s.ramakrishna at oracle.com Sat Apr 23 13:49:36 2011 From: y.s.ramakrishna at oracle.com (Y. 
Srinivas Ramakrishna) Date: Sat, 23 Apr 2011 13:49:36 -0700 Subject: Request for review (S): 7039089 G1: changeset for 7037276 broke heap verification, and related cleanups Message-ID: <4DB33B60.7080007@oracle.com> http://cr.openjdk.java.net/~ysr/7039089/webrev.00/ The problem was that G1 was calling process_strong_roots() with collecting_perm_gen option set to false, so the closure would be used to scan the younger refs in the perm gen. However, this was not preceded in this case by a save_marks() call, so that if save_marks() had been done earlier and the perm gen had been resized subsequently, we could end up trying to scan non-existent cards in the card table. Changed options to process_strong_root() so we pass "collecting_perm_gen" == true, and appropriate class scanning options. This also allowed us to get rid of the subsequent invalidation of the perm gen cards which would otherwise have been cleared by the younger refs iteration code. Add an assertion in non_clean_cards_iterate_possibly_parallel() to catch such an issue and provide a more informative message. I also noticed that some of the obsoleted code related to the scanning of oops in the symbol table had not been removed when symbols emigrated out of the Java heap. I removed that obsolete code and references to it in the documentation/comments. Allowed VerifyBeforeExit heap verification to be more verbose either of PrintGCDetails is enabled or if Verbose is enabled. Testing: the failing test (LoadUnloadGC2), specjvm with heap verification enabled, refworkload with heap verification enabled and (ongoing) JPRT with and without heap verification enabled. Thanks for your reviews. -- ramki From fancyerii at gmail.com Sun Apr 24 20:26:41 2011 From: fancyerii at gmail.com (Li Li) Date: Mon, 25 Apr 2011 11:26:41 +0800 Subject: is there any resource about gc details of hotspot? Message-ID: hi all, I'd like to learn the detail of each garbage collector such as Serial GC, Parallel GC, G1 GC. 
I am after the basic idea of each algorithm (I don't want to read the OpenJDK code yet because it is hard to understand), such as how they do marking and sweeping, and why some of them need to stop the world while others can run concurrently with the Java application. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html is the official document, but I need something more detailed. Thank you. From zhouyx at linux.vnet.ibm.com Sun Apr 24 22:43:14 2011 From: zhouyx at linux.vnet.ibm.com (Sean Chou) Date: Mon, 25 Apr 2011 13:43:14 +0800 Subject: is there any resource about gc details of hotspot? In-Reply-To: References: Message-ID: Hi, I think what you want is an overview of GC; Wikipedia may help you better than technical documents, and IBM developerWorks has some introductory articles too. http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29 http://www.ibm.com/developerworks/java/library/j-jtp10283/ 2011/4/25 Li Li > hi all, > I'd like to learn the detail of each garbage collector such as > Serial GC, Parallel GC, G1 GC. the basic idea of these algorithm(I > don't want to read the codes of open jdk now because it's hard to > understand). such as how they do marking and sweeping, why some of > them need stopping the world while others can run concurrently with > java application. > http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html > is the official document. but I need something more detailed. thank > you. > -- Best Regards, Sean Chou From y.s.ramakrishna at oracle.com Mon Apr 25 01:05:31 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 01:05:31 -0700 Subject: is there any resource about gc details of hotspot?
In-Reply-To: References: Message-ID: <4DB52B4B.4010503@oracle.com> Try this one:- http://openjdk.java.net/groups/hotspot/docs/StorageManagement.html including the references at the bottom of that page. -- ramki On 4/24/2011 8:26 PM, Li Li wrote: > hi all, > I'd like to learn the detail of each garbage collector such as > Serial GC, Parallel GC, G1 GC. the basic idea of these algorithm(I > don't want to read the codes of open jdk now because it's hard to > understand). such as how they do marking and sweeping, why some of > them need stopping the world while others can run concurrently with > java application. > http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html > is the official document. but I need something more detailed. thank > you. From igor.veresov at oracle.com Mon Apr 25 11:38:48 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 25 Apr 2011 11:38:48 -0700 Subject: Request for review (S): 7039089 G1: changeset for 7037276 broke heap verification, and related cleanups In-Reply-To: <4DB33B60.7080007@oracle.com> References: <4DB33B60.7080007@oracle.com> Message-ID: <4DB5BFB8.5040406@oracle.com> Looks good. igor On 4/23/11 1:49 PM, Y. Srinivas Ramakrishna wrote: > > http://cr.openjdk.java.net/~ysr/7039089/webrev.00/ > > The problem was that G1 was calling process_strong_roots() > with collecting_perm_gen option set to false, so the > closure would be used to scan the younger refs in > the perm gen. However, this was not preceded in this case > by a save_marks() call, so that if save_marks() had been > done earlier and the perm gen had been resized subsequently, > we could end up trying to scan non-existent cards in the > card table. > > Changed options to process_strong_root() so we > pass "collecting_perm_gen" == true, and > appropriate class scanning options. > This also allowed us to get rid of the subsequent > invalidation of the perm gen cards which would otherwise > have been cleared by the younger refs iteration code. 
> > Add an assertion in non_clean_cards_iterate_possibly_parallel() > to catch such an issue and provide a more informative > message. > > I also noticed that some of the obsoleted code related to > the scanning of oops in the symbol table had not been > removed when symbols emigrated out of the Java heap. I removed > that obsolete code and references to it in the documentation/comments. > > Allowed VerifyBeforeExit heap verification to be more verbose > either of PrintGCDetails is enabled or if Verbose is enabled. > > Testing: the failing test (LoadUnloadGC2), specjvm with > heap verification enabled, refworkload with heap verification > enabled and (ongoing) JPRT with and without heap verification > enabled. > > Thanks for your reviews. > -- ramki From fancyerii at gmail.com Sun Apr 24 20:21:21 2011 From: fancyerii at gmail.com (Li Li) Date: Mon, 25 Apr 2011 11:21:21 +0800 Subject: is there any resource about gc details of hotspot? Message-ID: hi all, I'd like to learn the detail of each garbage collector such as Serial GC, Parallel GC, G1 GC. the basic idea of these algorithm(I don't want to read the codes of open jdk now because it's hard to understand). such as how they do marking and sweeping, why some of them need stopping the world while others can run concurrently with java application. http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html is the official document. but I need something more detailed. thank you. 
From igor.veresov at oracle.com Mon Apr 25 12:11:34 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Mon, 25 Apr 2011 12:11:34 -0700 Subject: Fwd: Re: Request for review (S): 7039089 G1: changeset for 7037276 broke heap verification, and related cleanups Message-ID: <4DB5C766.9050307@oracle.com> [resending] -------- Original Message -------- Subject: Re: Request for review (S): 7039089 G1: changeset for 7037276 broke heap verification, and related cleanups Date: Mon, 25 Apr 2011 11:38:48 -0700 From: Igor Veresov To: y.s.ramakrishna at oracle.com CC: hotspot-gc-dev Looks good. igor On 4/23/11 1:49 PM, Y. Srinivas Ramakrishna wrote: > > http://cr.openjdk.java.net/~ysr/7039089/webrev.00/ > > The problem was that G1 was calling process_strong_roots() > with collecting_perm_gen option set to false, so the > closure would be used to scan the younger refs in > the perm gen. However, this was not preceded in this case > by a save_marks() call, so that if save_marks() had been > done earlier and the perm gen had been resized subsequently, > we could end up trying to scan non-existent cards in the > card table. > > Changed options to process_strong_root() so we > pass "collecting_perm_gen" == true, and > appropriate class scanning options. > This also allowed us to get rid of the subsequent > invalidation of the perm gen cards which would otherwise > have been cleared by the younger refs iteration code. > > Add an assertion in non_clean_cards_iterate_possibly_parallel() > to catch such an issue and provide a more informative > message. > > I also noticed that some of the obsoleted code related to > the scanning of oops in the symbol table had not been > removed when symbols emigrated out of the Java heap. I removed > that obsolete code and references to it in the documentation/comments.
> > Allowed VerifyBeforeExit heap verification to be more verbose > either of PrintGCDetails is enabled or if Verbose is enabled. > > Testing: the failing test (LoadUnloadGC2), specjvm with > heap verification enabled, refworkload with heap verification > enabled and (ongoing) JPRT with and without heap verification > enabled. > > Thanks for your reviews. > -- ramki From y.s.ramakrishna at oracle.com Mon Apr 25 13:11:25 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 13:11:25 -0700 Subject: Request for review (S): 7039089 G1: changeset for 7037276 broke heap verification, and related cleanups In-Reply-To: <4DB33B60.7080007@oracle.com> References: <4DB33B60.7080007@oracle.com> Message-ID: <4DB5D56D.7060408@oracle.com> Thanks Igor, Jon and Tony for your reviews & suggestions! Based upon these review/suggestions i have uploaded a new webrev into http://cr.openjdk.java.net/~ysr/7039089/webrev.01/ and plan to push these hopefully in time for the testing tonight, JPRT gremlins permitting. -- ramki On 4/23/2011 1:49 PM, Y. Srinivas Ramakrishna wrote: > > http://cr.openjdk.java.net/~ysr/7039089/webrev.00/ > > The problem was that G1 was calling process_strong_roots() > with collecting_perm_gen option set to false, so the > closure would be used to scan the younger refs in > the perm gen. However, this was not preceded in this case > by a save_marks() call, so that if save_marks() had been > done earlier and the perm gen had been resized subsequently, > we could end up trying to scan non-existent cards in the > card table. > > Changed options to process_strong_root() so we > pass "collecting_perm_gen" == true, and > appropriate class scanning options. > This also allowed us to get rid of the subsequent > invalidation of the perm gen cards which would otherwise > have been cleared by the younger refs iteration code. 
> > Add an assertion in non_clean_cards_iterate_possibly_parallel() > to catch such an issue and provide a more informative > message. > > I also noticed that some of the obsoleted code related to > the scanning of oops in the symbol table had not been > removed when symbols emigrated out of the Java heap. I removed > that obsolete code and references to it in the documentation/comments. > > Allowed VerifyBeforeExit heap verification to be more verbose > either of PrintGCDetails is enabled or if Verbose is enabled. > > Testing: the failing test (LoadUnloadGC2), specjvm with > heap verification enabled, refworkload with heap verification > enabled and (ongoing) JPRT with and without heap verification > enabled. > > Thanks for your reviews. > -- ramki From y.s.ramakrishna at oracle.com Mon Apr 25 15:47:47 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 15:47:47 -0700 Subject: review(S): 7037939: NUMA: Disable adaptive resizing if SHM large pages are used In-Reply-To: <4DB0A01E.8040003@oracle.com> References: <4DB0A01E.8040003@oracle.com> Message-ID: <4DB5FA13.9090006@oracle.com> Looks good to me. -- ramki On 4/21/2011 2:22 PM, Igor Veresov wrote: > The fix has two parts: > 1. On Solaris, when ISM shared memory is used it is always allocated round-robin across the nodes, > so the NUMA allocator cannot work. The fix is to disable UseNUMA if ISM method is selected. I let > UseNUMA win however if UseLargePages and UseSHM are not explicitly specified. > > 2. On Linux, it's impossible to use adaptive resizing with UseNUMA if SHM shared memory is used. > That is because we cannot uncommit a page in such a mmaping. The solution is to disable adaptive > resizing if the userr really wants it, that is when UseNUMA and (UseLargePages or UseSHM) are > specified on the command line. Like on Solaris, I let UseNUMA win if it's explicitly specified and > UseLargePages and UseSHM are not. 
> > > Webrev: http://cr.openjdk.java.net/~iveresov/7037939/webrev.00/ > > > Thanks, > igor From john.cuthbertson at oracle.com Mon Apr 25 16:37:15 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Mon, 25 Apr 2011 16:37:15 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> Message-ID: <4DB605AB.7080201@oracle.com> Hi Everyone, A new webrev that is essentially Tom's suggestion can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that owns the pending list will no longer be blocked in CompileBroker::compile_method_base. Testing: Ran over the weekend with the test case for 7037756; the test case for 6789220 (which fails 50% of the time with Bengt's fix removed); nsk tests; jprt. Thanks, JohnC On 04/22/11 18:48, Tom Rodriguez wrote: > Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like: > > if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) { > return false; > } > > replacing the fix in in CompileBroker::is_compile_blocking seems like it should work. > > tom > > On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: > > >> Hi EVeryone. >> >> Typo.... >> >> On 04/22/11 17:28, John Cuthbertson wrote: >> >>> Hi Everyone, >>> >>> Can I have a couple of volunteers to look over these changes? 
The webrev can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>> >>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>> >>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>> >>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >>> >>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>> >>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>> >> The above paragraph should read; >> >> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. 
>> >>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>> >>> Thanks, >>> >>> JohnC >>> >>> > > From tom.rodriguez at oracle.com Mon Apr 25 17:12:45 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 25 Apr 2011 17:12:45 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <4DB605AB.7080201@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> Message-ID: <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> Seems ok. We'll never compile the main loop of the reference handler thread now but that doesn't seem like it should matter much. tom On Apr 25, 2011, at 4:37 PM, John Cuthbertson wrote: > Hi Everyone, > > A new webrev that is essentially Tom's suggestion can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that owns the pending list will no longer be blocked in CompileBroker::compile_method_base. > > Testing: Ran over the weekend with the test case for 7037756; the test case for 6789220 (which fails 50% of the time with Bengt's fix removed); nsk tests; jprt. > > Thanks, > > JohnC > > > On 04/22/11 18:48, Tom Rodriguez wrote: >> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted.
Something like: >> >> if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) { >> return false; >> } >> >> replacing the fix in in CompileBroker::is_compile_blocking seems like it should work. >> >> tom >> >> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: >> >> >> >>> Hi EVeryone. >>> >>> Typo.... >>> >>> On 04/22/11 17:28, John Cuthbertson wrote: >>> >>> >>>> Hi Everyone, >>>> >>>> Can I have a couple of volunteers to look over these changes? The webrev can be found at: >>>> http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>>> >>>> >>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>>> >>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>>> >>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >>>> >>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>>> >>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. 
>>>> >>>> >>> The above paragraph should read; >>> >>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>> >>> >>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> >>>> >> >> >> > From john.cuthbertson at oracle.com Mon Apr 25 17:56:05 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Mon, 25 Apr 2011 17:56:05 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> Message-ID: <4DB61825.4040101@oracle.com> Hi Tom, Thanks for the review and suggestion. Earlier I instrumented the compile tasks to record and display the requesting thread. In the test case (a jck test suite) I only saw 4 or 5 compilation requests coming from the reference handler thread. If it becomes performance critical then I guess a variant can be resurrected - though I prefer the simpler code. Thanks again. JohnC On 04/25/11 17:12, Tom Rodriguez wrote: > Seems ok. We'll never compile the main loop of the reference handler thread new but that doesn't seem like it should matter much. 
> > tom > > On Apr 25, 2011, at 4:37 PM, John Cuthbertson wrote: > > >> Hi Everyone, >> >> A new webrev that is essentially Tom's suggestion can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that owns the pending list will no longer be blocked in CompileBroker::compile_method_base. >> >> Testing: Ran over the weekend with the test case for 7037756; the test case for 6789220 (which fails 50% of the time with Bengt's fix removed); nsk tests; jprt. >> >> Thanks, >> >> JohnC >> >> >> On 04/22/11 18:48, Tom Rodriguez wrote: >> >>> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like: >>> >>> if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) { >>> return false; >>> } >>> >>> replacing the fix in in CompileBroker::is_compile_blocking seems like it should work. >>> >>> tom >>> >>> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: >>> >>> >>> >>> >>>> Hi EVeryone. >>>> >>>> Typo.... >>>> >>>> On 04/22/11 17:28, John Cuthbertson wrote: >>>> >>>> >>>> >>>>> Hi Everyone, >>>>> >>>>> Can I have a couple of volunteers to look over these changes? The webrev can be found at: >>>>> http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>>>> >>>>> >>>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>>>> >>>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>>>> >>>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. 
In the GC it was blocked attempting to lock the pending list lock. >>>>> >>>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>>>> >>>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>>> >>>>> >>>>> >>>> The above paragraph should read; >>>> >>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>> >>>> >>>> >>>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>>>> >>>>> Thanks, >>>>> >>>>> JohnC >>>>> >>>>> >>>>> >>>>> >>> >>> >>> > > From y.s.ramakrishna at oracle.com Mon Apr 25 19:20:47 2011 From: y.s.ramakrishna at oracle.com (Y.
Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 19:20:47 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> Message-ID: <4DB62BFF.1030503@oracle.com> On 4/25/2011 5:12 PM, Tom Rodriguez wrote: > Seems ok. We'll never compile the main loop of the reference handler thread new but that doesn't seem like it should matter much. May be test with a benchmark that stresses Reference object handling. May be the bottleneck will still not be the reference handler thread running interpreted code, but worth checking (may be later). Can the compiler thread be prevented from trying to allocate in the heap while holding the Compile_lock? (Would it work for it to drop and reacquire the lock around such allocations, or is that not feasible? Just doing some loud thinking.) (Or could one safely pre-compile the reference handler's code before start-up so it doesn't always run interpreted?) -- ramki > > tom > > On Apr 25, 2011, at 4:37 PM, John Cuthbertson wrote: > >> Hi Everyone, >> >> A new webrev that is essentially Tom's suggestion can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that owns the pending list will no longer be blocked in CompileBroker::compile_method_base. >> >> Testing: Ran over the weekend with the test case for 7037756; the test case for 6789220 (which fails 50% of the time with Bengt's fix removed); nsk tests; jprt. >> >> Thanks, >> >> JohnC >> >> >> On 04/22/11 18:48, Tom Rodriguez wrote: >>> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? 
That seems more robust and targeted. Something like: >>> >>> if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) { >>> return false; >>> } >>> >>> replacing the fix in in CompileBroker::is_compile_blocking seems like it should work. >>> >>> tom >>> >>> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: >>> >>> >>> >>>> Hi EVeryone. >>>> >>>> Typo.... >>>> >>>> On 04/22/11 17:28, John Cuthbertson wrote: >>>> >>>> >>>>> Hi Everyone, >>>>> >>>>> Can I have a couple of volunteers to look over these changes? The webrev can be found at: >>>>> http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>>>> >>>>> >>>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>>>> >>>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>>>> >>>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >>>>> >>>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>>>> >>>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. 
>>>>> >>>>> >>>> The above paragraph should read; >>>> >>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>> >>>> >>>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>>>> >>>>> Thanks, >>>>> >>>>> JohnC >>>>> >>>>> >>>>> >>> >>> >>> >> > From tom.rodriguez at oracle.com Mon Apr 25 19:58:20 2011 From: tom.rodriguez at oracle.com (Tom Rodriguez) Date: Mon, 25 Apr 2011 19:58:20 -0700 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <4DB62BFF.1030503@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> <4DB62BFF.1030503@oracle.com> Message-ID: <401F8218-C523-4063-9F74-85F1F4AB536A@oracle.com> On Apr 25, 2011, at 7:20 PM, Y. Srinivas Ramakrishna wrote: > On 4/25/2011 5:12 PM, Tom Rodriguez wrote: >> Seems ok. We'll never compile the main loop of the reference handler thread new but that doesn't seem like it should matter much. > > May be test with a benchmark that stresses Reference object handling. > May be the bottleneck will still not be the reference handler thread > running interpreted code, but worth checking (may be later). > > Can the compiler thread be prevented from trying to allocate > in the heap while holding the Compile_lock? 
> (Would it work
> for it to drop and reacquire the lock around such allocations,
> or is that not feasible? Just doing some loud thinking.)

Well, the compiler itself isn't acquiring the Compile_lock. It appears to be this code in objArrayKlassKlass:

    KlassHandle ek;
    {
      MutexUnlocker mu(MultiArray_lock);
      MutexUnlocker mc(Compile_lock);   // for vtables
      klassOop sk = element_super->array_klass(CHECK_0);
      super_klass = KlassHandle(THREAD, sk);
      for( int i = element_supers->length()-1; i >= 0; i-- ) {
        KlassHandle elem_super (THREAD, element_supers->obj_at(i));
        elem_super->array_klass(CHECK_0);
      }
      // Now retry from the beginning
      klassOop klass_oop = element_klass->array_klass(n, CHECK_0);
      // Create a handle because the enclosing brace, when locking
      // can cause a gc.  Better to have this function return a Handle.
      ek = KlassHandle(THREAD, klass_oop);
    }  // re-lock
    return ek();

I really don't understand the usage of Compile_lock here or the comment "for vtables". All the array klasses do something similar and I don't get it. The main purpose of Compile_lock is to control updates to the system dictionary, but I don't see why the array klasses have to do anything special here.

> (Or could one safely pre-compile the reference handler's code
> before start-up so it doesn't always run interpreted?)

We could have a small side queue that we could enqueue these on, and when a later safe request comes in we can request a non-blocking compile of them. It's kind of gross but it would work, I think. It could even just be a queue of length one instead of a full data structure.

tom

>
> -- ramki
>
>>
>> tom
>>
>> On Apr 25, 2011, at 4:37 PM, John Cuthbertson wrote:
>>
>>> Hi Everyone,
>>>
>>> A new webrev that is essentially Tom's suggestion can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that owns the pending list will no longer be blocked in CompileBroker::compile_method_base.
>>> >>> Testing: Ran over the weekend with the test case for 7037756; the test case for 6789220 (which fails 50% of the time with Bengt's fix removed); nsk tests; jprt. >>> >>> Thanks, >>> >>> JohnC >>> >>> >>> On 04/22/11 18:48, Tom Rodriguez wrote: >>>> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like: >>>> >>>> if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) { >>>> return false; >>>> } >>>> >>>> replacing the fix in in CompileBroker::is_compile_blocking seems like it should work. >>>> >>>> tom >>>> >>>> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote: >>>> >>>> >>>> >>>>> Hi EVeryone. >>>>> >>>>> Typo.... >>>>> >>>>> On 04/22/11 17:28, John Cuthbertson wrote: >>>>> >>>>> >>>>>> Hi Everyone, >>>>>> >>>>>> Can I have a couple of volunteers to look over these changes? The webrev can be found at: >>>>>> http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>>>>> >>>>>> >>>>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>>>>> >>>>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>>>>> >>>>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >>>>>> >>>>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>>>>> >>>>>> The solution is to make the reference handler thread not block while holding the pending list lock. 
If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>>>> >>>>>> >>>>> The above paragraph should read; >>>>> >>>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>>> >>>>> >>>>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> JohnC >>>>>> >>>>>> >>>>>> >>>> >>>> >>>> >>> >> > From bengt.rutisson at oracle.com Mon Apr 25 23:19:32 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 26 Apr 2011 08:19:32 +0200 Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220 In-Reply-To: <4DB605AB.7080201@oracle.com> References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> Message-ID: <4DB663F4.5030606@oracle.com> Hi John, I like this much better than my filtering approach. I think it looks good. Just one nit: I think it might be worth adding to the comment for your code that the deadlock that might occur is with the GC. 
That might make it easier for someone reading the code to figure out why this is dangerous. So, maybe instead of:

    // If the requesting thread is holding the pending list lock
    // then we just return. We can't risk blocking while holding
    // the pending list lock or a deadlock may occur.

Something like:

    // If the requesting thread is holding the pending list lock
    // then we just return. We can't risk blocking while holding
    // the pending list lock or a deadlock with the GC (that needs
    // to take the pending list lock) may occur.

Bengt

On 2011-04-26 01:37, John Cuthbertson wrote:
> Hi Everyone,
>
> A new webrev that is essentially Tom's suggestion can be found at:
> http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted
> Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that
> owns the pending list will no longer be blocked in
> CompileBroker::compile_method_base.
>
> Testing: Ran over the weekend with the test case for 7037756; the test
> case for 6789220 (which fails 50% of the time with Bengt's fix
> removed); nsk tests; jprt.
>
> Thanks,
>
> JohnC
>
>
> On 04/22/11 18:48, Tom Rodriguez wrote:
>> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like:
>>
>> if (instanceRefKlass:owns_pending_list_lock(JavaThread::current()) {
>> return false;
>> }
>>
>> replacing the fix in in CompileBroker::is_compile_blocking seems like it should work.
>>
>> tom
>>
>> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote:
>>
>>
>>> Hi EVeryone.
>>>
>>> Typo....
>>>
>>> On 04/22/11 17:28, John Cuthbertson wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers to look over these changes?
The webrev can be found at:http://cr.openjdk.java.net/~johnc/7037756/webrev.1 >>>> >>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize: >>>> >>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock. >>>> >>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock. >>>> >>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock. >>>> >>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. >>>> >>> The above paragraph should read; >>> >>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking. 
>>> >>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is the queue. >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110426/43de9364/attachment.html From do.chuan at gmail.com Mon Apr 25 18:52:19 2011 From: do.chuan at gmail.com (dochuan) Date: Tue, 26 Apr 2011 09:52:19 +0800 Subject: is there any resource about gc details of hotspot? In-Reply-To: References: Message-ID: <4DB62553.704@gmail.com> book: Garbage Collection: algorithms for automatic dynamic memory management and http://www.hpl.hp.com/personal/Hans_Boehm/ On 11-4-25 ??11:21, Li Li wrote: > hi all, > I'd like to learn the detail of each garbage collector such as > Serial GC, Parallel GC, G1 GC. the basic idea of these algorithm(I > don't want to read the codes of open jdk now because it's hard to > understand). such as how they do marking and sweeping, why some of > them need stopping the world while others can run concurrently with > java application. > http://www.oracle.com/technetwork/java/javase/tech/index-jsp-140228.html > is the official document. but I need something more detailed. thank > you. 
> _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tony.printezis at oracle.com Tue Apr 26 10:02:49 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 26 Apr 2011 13:02:49 -0400 Subject: CRR: 7039627: G1: avoid BOT updates for survivor allocations and dirty survivor regions incrementally (L) Message-ID: <4DB6FAB9.9000705@oracle.com> Hi, This is the long-awaited :-) GC alloc region refactoring for G1 that I've been working on for a while now (in the background). A lot of that allocation code during GC is very similar to the code that manages the mutator allocation regions so we might as well share it. We recently introduced the G1AllocRegion abstraction for mutator alloc regions. Now we're going to re-use it for GC alloc regions too (and remove a lot of replicated code in the process). The webrev is here: http://cr.openjdk.java.net/~tonyp/7039627/webrev.0/ (don't let the number of lines changed intimidate you; over 60% of them correspond to code that was removed) Quick summary of the improvements: - Removed most of the code that manages the GC alloc regions and replaced it with subclasses of G1AllocRegion (one for survivor regions, the other for old regions). We now keep the two GC alloc regions separate (before they could point to the same physical region) as we have to handle them differently (do/don't do BOT updates, retire them differently, etc.) and we don't want to have to add checks everywhere. - No BOT updates for survivor regions (the same way we do not need them for mutator allocation regions). - The cards of survivor regions are now dirtied incrementally (the same way it's done for mutator allocation regions). 
- We do not link the GC alloc regions into a list any more in order to do any post-GC cleanup on them at the end of the GC. Instead, any cleanup that needs to be done is done as each region is retired. So we save the extra post-GC step.
- Apart from not linking the GC alloc regions, I also removed the "is_gc_alloc" flag as we do not need to check it any more (and this saves us having to reset the flag at the end of the GC, which helped in eliminating the post-GC cleanup step).
- The new code also fixes a subtle bug. In the old code, when a GC thread allocated a new region it allowed other threads to allocate out of it before attempting its own allocation (the allocation that essentially caused the new region to be allocated). But that allocation is not guaranteed to succeed (given that other threads might have meanwhile filled up the region) and this was not handled correctly in the code. This would cause an unnecessary evacuation failure. The new code fixes this bug as this case is handled correctly in the G1AllocRegion class (the thread that allocates the region will first satisfy its own allocation request before allowing anybody else to allocate out of the new region).

I'd like a couple of reviews please. :-)

Tony

From tony.printezis at oracle.com Tue Apr 26 10:08:45 2011
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 26 Apr 2011 13:08:45 -0400
Subject: CRR: 7039627: G1: avoid BOT updates for survivor allocations and dirty survivor regions incrementally (L)
In-Reply-To: <4DB6FAB9.9000705@oracle.com>
References: <4DB6FAB9.9000705@oracle.com>
Message-ID: <4DB6FC1D.5010500@oracle.com>

PS Sorry, I should have said: I did some before/after performance measurements. Here are the findings:

- intel : didn't see much change, even though collect/analyze did show that the number of cache misses / branch mispredictions during allocation went down dramatically in the new version.
- sparc (Niagara and non-Niagara) : I saw some modest GC time improvements with the new workspace (a few %). Note that the allocation code that does BOT updates takes a lock. So, by decreasing the number of BOT updates we do will decrease a potential scalability bottleneck. This might be why I saw benefit on sparc : the sparc boxes had more HW parallelism than the intel box I ran on. Tony Tony Printezis wrote: > Hi, > > This is the long-awaited :-) GC alloc region refactoring for G1 that > I've been working on for a while now (in the background). > > A lot of that allocation code during GC is very similar to the code > that manages the mutator allocation regions so we might as well share > it. We recently introduced the G1AllocRegion abstraction for mutator > alloc regions. Now we're going to re-use it for GC alloc regions too > (and remove a lot of replicated code in the process). > > The webrev is here: > > http://cr.openjdk.java.net/~tonyp/7039627/webrev.0/ > > (don't let the number of lines changed intimidate you; over 60% of > them correspond to code that was removed) > > Quick summary of the improvements: > > - Removed most of the code that manages the GC alloc regions and > replaced it with subclasses of G1AllocRegion (one for survivor > regions, the other for old regions). We now keep the two GC alloc > regions separate (before they could point to the same physical region) > as we have to handle them differently (do/don't do BOT updates, retire > them differently, etc.) and we don't want to have to add checks > everywhere. > - No BOT updates for survivor regions (the same way we do not need > them for mutator allocation regions). > - The cards of survivor regions are now dirtied incrementally (the > same way it's done for mutator allocation regions). > - We do not link the GC alloc regions into a list any more in order to > do any post-GC cleanup on them at the end of the GC. Instead, any > cleanup that needs to be done it's done as each region is retired. 
So > we save the extra post-GC step. > - Apart from not linking the GC alloc regions, I also removed the > "is_gc_alloc" flag as we do not need to check it any more (and this > saves us having to reset the flag at the end of the GC, which helped > in eliminating the post-GC cleanup step). > - The new code also fixes a subtle bug. In the old code, when a GC > thread allocated a new region is allowed other threads to allocate out > of it before attempting its allocation (the allocation that > essentially caused the new region to be allocated). But, that > allocation is not guaranteed to succeeded (given that other threads > might have meanwhile filled up the region) and this was not handled > correctly in the code. This would cause an unnecessary evacuation > failure. The new code fixes this bug as this case is handled correctly > in the G1AllocRegion class (the thread that allocates the region will > first satisfy its own allocation request before allowing anybody else > to allocate out of the new region). > > I'd like a couple of reviews please. :-) > > Tony > > From shane.cox at gmail.com Tue Apr 26 10:36:37 2011 From: shane.cox at gmail.com (Shane Cox) Date: Tue, 26 Apr 2011 13:36:37 -0400 Subject: Periodic long minor GC pauses Message-ID: Periodically, our Java app on Linux experiences a long Minor GC pause that cannot be accounted for by the GC time in the log file. Instead, the pause is captured as "real" (wall clock) time and is observable in our application logs. An example is below. The GC completed in 56ms, but the application was paused for 2.45 seconds. 

2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: 943439K->104832K(943744K), 0.0481790 secs] 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 sys=0.03, real=0.04 secs]
2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: 942852K->104832K(943744K), 0.0738000 secs] 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 sys=0.12, real=0.07 secs]
2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]*
2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs]
2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: 919952K->104832K(943744K), 0.0845070 secs] 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 sys=0.11, real=0.09 secs]

Initially I suspected swapping, but according to the free command, 0 bytes of swap are in use.

>free -m
             total       used       free     shared    buffers     cached
Mem:         32168      28118       4050          0        824      12652
-/+ buffers/cache:      14641      17527
Swap:         8191          0       8191

Next, I read about a problem relating to mprotect() on Linux that can be worked around with -XX:+UseMembar. I tried that, but I still see the same unexplained pauses.

Any suggestions/ideas? We've upgraded to the latest JDK, but no luck.

Thanks,
Shane

java version "1.6.0_25"
Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)

Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

-verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar

From y.s.ramakrishna at oracle.com Tue Apr 26 10:45:55 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Tue, 26 Apr 2011 10:45:55 -0700
Subject: Periodic long minor GC pauses
In-Reply-To:
References:
Message-ID: <4DB704D3.20600@oracle.com>

The pause is definitely at the beginning, before the GC collection code itself runs; witness the timestamps:

2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs]

The first timestamp is 2120.686 and the next one is 2123.075, so we have about 2.389 s between those two. If you add to that the GC time of 0.056 s, you get 2.445, which is close enough to the 2.45 s reported.

So we need to figure out what happens in the JVM between those two timestamps, and we can at least bound the culprit.

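The timestamp arithmetic above can be checked mechanically. What follows is only a minimal sketch in Python, not part of any JVM tooling: the regular expression is an assumption derived solely from the ParNew log lines quoted in this thread, and the names ENTRY, pause_breakdown and SLOW are made up for illustration.

```python
import re

# Assumed shape of a -XX:+PrintGCDetails ParNew entry, based only on the
# log lines quoted in this thread (hypothetical pattern, not a general parser).
ENTRY = re.compile(
    r": (?P<requested>\d+\.\d+): \[GC (?P<started>\d+\.\d+): \[ParNew"
    r".*?, (?P<gc_secs>\d+\.\d+) secs\] \[Times:"
    r".*?real=(?P<real>\d+\.\d+) secs\]"
)


def pause_breakdown(line):
    """Return (gap_before_gc, gc_secs, reported_real) in seconds, or None."""
    m = ENTRY.search(line)
    if m is None:
        return None
    # First uptime stamp: when the "[GC" entry was opened.
    requested = float(m.group("requested"))
    # Second uptime stamp: when the ParNew collection itself was logged.
    started = float(m.group("started"))
    gap = started - requested
    return gap, float(m.group("gc_secs")), float(m.group("real"))


# The slow entry from the log above.
SLOW = ("2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: "
        "943744K->79296K(943744K), 0.0559560 secs] "
        "4989811K->4187520K(25060992K), 0.0563970 secs] "
        "[Times: user=0.31 sys=0.09, real=2.45 secs]")

gap, gc_secs, real = pause_breakdown(SLOW)
print(f"gap before GC: {gap:.3f} s, GC: {gc_secs:.3f} s, "
      f"sum: {gap + gc_secs:.3f} s, reported real: {real:.2f} s")
```

Running the same function over a whole log and flagging entries whose gap is well above zero would show whether the long pauses cluster around particular times.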
-- ramki On 04/26/11 10:36, Shane Cox wrote: > Periodically, our Java app on Linux experiences a long Minor GC pause > that cannot be accounted for by the GC time in the log file. Instead, > the pause is captured as "real" (wall clock) time and is observable in > our application logs. An example is below. The GC completed in 56ms, > but the application was paused for 2.45 seconds. > > 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: > 943439K->104832K(943744K), 0.0481790 secs] > 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 > sys=0.03, real=0.04 secs] > 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: > 942852K->104832K(943744K), 0.0738000 secs] > 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 > sys=0.12, real=0.07 secs] > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: > 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), > 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* > 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: > 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), > 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] > 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: > 919952K->104832K(943744K), 0.0845070 secs] > 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 > sys=0.11, real=0.09 secs] > > > Initially I suspected swapping, but according to the free command, 0 > bytes of swap are in use. > >free -m > total used free shared buffers cached > Mem: 32168 28118 4050 0 824 12652 > -/+ buffers/cache: 14641 17527 > Swap: 8191 0 8191 > > > Next, I read about a problem relating to mprotect() on Linux that can be > worked around with -XX:+UseMember. I tried that, but I still see the > same unexplainable pauses. > > > Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
>
> Thanks,
> Shane
>
>
> java version "1.6.0_25"
> Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode)
>
>
> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64
> x86_64 GNU/Linux
>
>
> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m
> -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution
> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar

From john.cuthbertson at oracle.com Tue Apr 26 10:55:11 2011
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Tue, 26 Apr 2011 10:55:11 -0700
Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220
In-Reply-To: <401F8218-C523-4063-9F74-85F1F4AB536A@oracle.com>
References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> <4DB62BFF.1030503@oracle.com> <401F8218-C523-4063-9F74-85F1F4AB536A@oracle.com>
Message-ID: <4DB706FF.9050701@oracle.com>

Hi Tom, Ramki,

I think that the main loop might still get compiled (some of the time) even with this fix. Here's the code of the run method:

    public void run() {
        for (;;) {
            Reference r;
            synchronized (lock) {
                if (pending != null) {
                    r = pending;
                    Reference rn = r.next;
                    pending = (rn == r) ? null : rn;
                    r.next = r;
                } else {
                    try {
                        lock.wait();
                    } catch (InterruptedException x) { }
                    continue;
                }
            }

            // Fast path for cleaners
            if (r instanceof Cleaner) {
                ((Cleaner)r).clean();
                continue;
            }

            ReferenceQueue q = r.queue;
            if (q != ReferenceQueue.NULL) q.enqueue(r);
        }
    }

With -Xcomp, the reference handler thread is not holding the lock when run() is called and so the compilation request would be successful. Normally though the compilation of run() would be an OSR compilation - it looks like there are 3 backward branches and only one of them is inside the locked region. If we have a lot of references to enqueue then I would expect that the backward branch after the enqueue() call would trigger an OSR compile (I don't think the branch inside the locked region would be executed any more frequently).

The actual method I saw being compiled while holding the pending list lock was:

    java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;

Other methods invoked from the locked region are:

    java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
    java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
    java/lang/Object.wait:()V

Of these, java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock; is also invoked outside of the locked region (it's actually at bci:0 - which is the target bci of the backward branches), and java/lang/Object.wait:()V is a native method.

On 04/25/11 19:58, Tom Rodriguez wrote:
>
> We could have a small side queue that we could enqueue these on and when a later safe request comes in we can request a non blocking compile of them. It's kind of gross but it would work I think. It could even just be a queue of length one instead of a full data structure.
>
> tom
>

I think having a special queue for these is a lot uglier than adapting the original fix (i.e. the one with the try_lock) to use ownership of the pending list lock rather than the thread id.

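For readers less familiar with the try_lock idea discussed in this thread, here is a rough sketch of the pattern in Python. It is not HotSpot code and does not reflect the actual webrev: the lock objects, the compile_queue list and the request_compile function are all hypothetical stand-ins, and ownership of the pending list lock is passed in as a plain flag purely for illustration.

```python
import threading

# Hypothetical stand-in for MethodCompileQueue_lock.
compile_queue_lock = threading.Lock()
# Hypothetical stand-in for the compile queue itself.
compile_queue = []


def request_compile(method, holds_pending_list_lock):
    """Try to enqueue a compile request; return True if it was enqueued.

    A caller that owns the pending list lock must never block here,
    because the GC may need that lock while another thread holds the
    queue lock. Such a caller only *tries* the queue lock and gives up
    without enqueueing if it is busy.
    """
    if holds_pending_list_lock:
        if not compile_queue_lock.acquire(blocking=False):
            return False  # give up: no enqueue, no blocking
    else:
        compile_queue_lock.acquire()  # regular blocking lock attempt
    try:
        compile_queue.append(method)
        return True
    finally:
        compile_queue_lock.release()


# Simulate another thread owning the queue lock.
compile_queue_lock.acquire()
dropped = request_compile("Reference.access$202", holds_pending_list_lock=True)
compile_queue_lock.release()

# With the queue lock free, the same request goes through.
accepted = request_compile("Reference.access$202", holds_pending_list_lock=True)
print(dropped, accepted, compile_queue)
```

The cost of this pattern is that a dropped request is simply lost until the method becomes hot enough to be requested again.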
JohnC

From y.s.ramakrishna at oracle.com Tue Apr 26 11:17:46 2011
From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna)
Date: Tue, 26 Apr 2011 11:17:46 -0700
Subject: Periodic long minor GC pauses
In-Reply-To: <4DB704D3.20600@oracle.com>
References: <4DB704D3.20600@oracle.com>
Message-ID: <4DB70C4A.9090203@oracle.com>

I had a quick look and all I could find was the GC prologue code (although I didn't look all that carefully). Basically, GC is invoked, it prints this timestamp, does a bit of global book-keeping and some initialization, and then goes over each generation in the heap and says "I am going to do a collection, do whatever you need to do before I do the collection", and the generations each do a bit of book-keeping and any relevant initialization.

The only thing I can see in the GC prologues other than a bit of lightweight book-keeping is some reporting code that could potentially be heavyweight. But you do not have any of those enabled in your option set, so there should not be anything obviously heavyweight going on.

I'd suggest filing a bug under the category of jvm/hotspot/garbage_collector so someone in support can work with you to get this diagnosed...

Three questions when you file the bug:
(1) have you seen this start happening recently? (version?)
(2) can you check if the longer pauses are "random" or do they always happen "during" CMS concurrent cycles or always outside of such cycles?
(3) test set-up.

-- ramki

On 04/26/11 10:45, Y. S.
Ramakrishna wrote: > The pause is definitely in the beginning, before GC collection code > itself runs; witness the timestamps:- > > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs] > > The first timestamp is 2120.686 and the next one is 2123.075, so we have > about 2.389 s between those two. If you add to that the GC time of 0.056 s, > you get 2.445 which is close enough to the 2.45 s reported. > > So we need to figure out what happens in the JVM between those two > time-stamps and we can at least bound the culprit. > > -- ramki > > On 04/26/11 10:36, Shane Cox wrote: >> Periodically, our Java app on Linux experiences a long Minor GC pause >> that cannot be accounted for by the GC time in the log file. Instead, >> the pause is captured as "real" (wall clock) time and is observable in >> our application logs. An example is below. The GC completed in 56ms, >> but the application was paused for 2.45 seconds. 
>> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: >> 943439K->104832K(943744K), 0.0481790 secs] >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: user=0.34 >> sys=0.03, real=0.04 secs] >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: >> 942852K->104832K(943744K), 0.0738000 secs] >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: user=0.45 >> sys=0.12, real=0.07 secs] >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >> 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: >> 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), >> 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: >> 919952K->104832K(943744K), 0.0845070 secs] >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: user=0.52 >> sys=0.11, real=0.09 secs] >> >> >> Initially I suspected swapping, but according to the free command, 0 >> bytes of swap are in use. >> >free -m >> total used free shared buffers cached >> Mem: 32168 28118 4050 0 824 12652 >> -/+ buffers/cache: 14641 17527 >> Swap: 8191 0 8191 >> >> >> Next, I read about a problem relating to mprotect() on Linux that can be >> worked around with -XX:+UseMember. I tried that, but I still see the >> same unexplainable pauses. >> >> >> Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
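The timestamp arithmetic discussed in this thread (the gap between the first uptime stamp and the one printed when ParNew actually starts, plus the reported GC time, compared against "real") can be automated over a whole log. The following is only a rough sketch: the class name GcStallCheck, the method analyze and the regular expression are illustrative assumptions, not part of HotSpot or of any existing tool, and the pattern expects each -XX:+PrintGCDetails ParNew record on a single, unwrapped line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a HotSpot API): measures the time spent between
// "GC invoked" and "ParNew started" in a single -XX:+PrintGCDetails line.
class GcStallCheck {
    // Groups: 1 = uptime when the GC was invoked, 2 = uptime when ParNew began,
    //         3 = total GC time reported for the pause, 4 = wall-clock "real" time.
    private static final Pattern PARNEW_LINE = Pattern.compile(
        "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew.*?secs\\]"
        + ".*?, (\\d+\\.\\d+) secs\\].*?real=(\\d+\\.\\d+) secs");

    // Returns { stall before collection, stall + GC time, reported real time }.
    static double[] analyze(String logLine) {
        Matcher m = PARNEW_LINE.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("not a ParNew record: " + logLine);
        }
        double invoked = Double.parseDouble(m.group(1));
        double started = Double.parseDouble(m.group(2));
        double gcSecs = Double.parseDouble(m.group(3));
        double real = Double.parseDouble(m.group(4));
        double stall = started - invoked;
        return new double[] { stall, stall + gcSecs, real };
    }

    public static void main(String[] args) {
        String line = "2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: "
            + "[ParNew: 943744K->79296K(943744K), 0.0559560 secs] "
            + "4989811K->4187520K(25060992K), 0.0563970 secs] "
            + "[Times: user=0.31 sys=0.09, real=2.45 secs]";
        double[] r = analyze(line);
        System.out.printf("stall=%.3f s, stall+gc=%.3f s, real=%.2f s%n", r[0], r[1], r[2]);
    }
}
```

For the long pause in the sample log this gives a stall of roughly 2.389 s before the collection starts, which together with the GC time accounts for the 2.45 s reported as real. Records whose stall is near zero, like the other four in the sample, are ordinary collections.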
>> >> Thanks, >> Shane >> >> >> java version "1.6.0_25" >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >> >> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 >> x86_64 GNU/Linux >> >> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m >> -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution >> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled >> -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings -XX:+UseMembar >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From igor.veresov at oracle.com Tue Apr 26 11:25:34 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Tue, 26 Apr 2011 11:25:34 -0700 Subject: review(S): 7037939: NUMA: Disable adaptive resizing if SHM large pages are used In-Reply-To: <4DB5FA13.9090006@oracle.com> References: <4DB0A01E.8040003@oracle.com> <4DB5FA13.9090006@oracle.com> Message-ID: <4DB70E1E.3000500@oracle.com> Thanks, Ramki! igor On 4/25/11 3:47 PM, Y. Srinivas Ramakrishna wrote: > Looks good to me. > > -- ramki > > On 4/21/2011 2:22 PM, Igor Veresov wrote: >> The fix has two parts: >> 1. 
On Solaris, when ISM shared memory is used it is always allocated
>> round-robin across the nodes,
>> so the NUMA allocator cannot work. The fix is to disable UseNUMA if
>> ISM method is selected. I let
>> UseNUMA win however if UseLargePages and UseSHM are not explicitly
>> specified.
>>
>> 2. On Linux, it's impossible to use adaptive resizing with UseNUMA if
>> SHM shared memory is used.
>> That is because we cannot uncommit a page in such a mapping. The
>> solution is to disable adaptive
>> resizing if the user really wants it, that is when UseNUMA and
>> (UseLargePages or UseSHM) are
>> specified on the command line. Like on Solaris, I let UseNUMA win if
>> it's explicitly specified and
>> UseLargePages and UseSHM are not.
>>
>>
>> Webrev: http://cr.openjdk.java.net/~iveresov/7037939/webrev.00/
>>
>>
>> Thanks,
>> igor
>

From shane.cox at gmail.com Tue Apr 26 11:29:42 2011
From: shane.cox at gmail.com (Shane Cox)
Date: Tue, 26 Apr 2011 14:29:42 -0400
Subject: Periodic long minor GC pauses
In-Reply-To: <4DB70C4A.9090203@oracle.com>
References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com>
Message-ID:

Below is an example from a Remark. Of the total 1.3 seconds of elapsed time, 1.2 seconds is found between the first two timestamps. However, I'm not savvy enough to know whether this is the same problem or simply the result of a long scavenge that occurs as part of the Remark. Is there any way to tell?

2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1 CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] [Times: user=0.13 sys=0.01, real=1.32 secs]

On Tue, Apr 26, 2011 at 2:17 PM, Y. S.
Ramakrishna < y.s.ramakrishna at oracle.com> wrote: > I had a quick look and all i could find was the GC prologue > code (although i didn't look all that carefully). > Bascially, GC is invoked, it prints this timestamp, > does a bit of global book-keeping and some initialization, > and then goes over each generation in the heap and > says "i am going to do a collection, do whatever you need > to do before i do the collection", and the generations each do a bit of > book-keeping and any relevant initialization. > > The only thing i can see in the gc prologues other than a bit > of lightweight book-keeping is some reporting code that could > potentially be heavyweight. But you do not have any of those > enabled in your option set, so there should not be anything > obviously heavyweight going on. > > I'd suggest filing a bug under the category of > jvm/hotspot/garbage_collector > so someone in support can work with you to get this diagnosed... > > Three questions when you file the bug: > (1) have you seen this start happening recently? (version?) > (2) can you check if the longer pauses are "random" or do > they always happen "during" CMS concurrent cycles or > always outside of such cycles? > (3) test set-up. > > -- ramki > > > On 04/26/11 10:45, Y. S. Ramakrishna wrote: > >> The pause is definitely in the beginning, before GC collection code >> itself runs; witness the timestamps:- >> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >> 0.0563970 secs] [Times: user=0.31 sys=0.09, real=2.45 secs] >> >> The first timestamp is 2120.686 and the next one is 2123.075, so we have >> about 2.389 s between those two. If you add to that the GC time of 0.056 >> s, >> you get 2.445 which is close enough to the 2.45 s reported. >> >> So we need to figure out what happens in the JVM between those two >> time-stamps and we can at least bound the culprit. 
>> >> -- ramki >> >> On 04/26/11 10:36, Shane Cox wrote: >> >>> Periodically, our Java app on Linux experiences a long Minor GC pause >>> that cannot be accounted for by the GC time in the log file. Instead, the >>> pause is captured as "real" (wall clock) time and is observable in our >>> application logs. An example is below. The GC completed in 56ms, but the >>> application was paused for 2.45 seconds. >>> >>> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: [ParNew: >>> 943439K->104832K(943744K), 0.0481790 secs] 4909998K->4086751K(25060992K), >>> 0.0485110 secs] [Times: user=0.34 sys=0.03, real=0.04 secs] >>> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: [ParNew: >>> 942852K->104832K(943744K), 0.0738000 secs] 4924772K->4150899K(25060992K), >>> 0.0740980 secs] [Times: user=0.45 sys=0.12, real=0.07 secs] >>> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >>> 943744K->79296K(943744K), 0.0559560 secs] 4989811K->4187520K(25060992K), >>> 0.0563970 secs] [Times: user=0.31 sys=0.09, *real=2.45 secs]* >>> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: [ParNew: >>> 918208K->81040K(943744K), 0.0396620 secs] 5026432K->4189265K(25060992K), >>> 0.0400030 secs] [Times: user=0.32 sys=0.00, real=0.04 secs] >>> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: [ParNew: >>> 919952K->104832K(943744K), 0.0845070 secs] 5028177K->4268050K(25060992K), >>> 0.0848300 secs] [Times: user=0.52 sys=0.11, real=0.09 secs] >>> >>> >>> Initially I suspected swapping, but according to the free command, 0 >>> bytes of swap are in use. >>> >free -m >>> total used free shared buffers cached >>> Mem: 32168 28118 4050 0 824 12652 >>> -/+ buffers/cache: 14641 17527 >>> Swap: 8191 0 8191 >>> >>> >>> Next, I read about a problem relating to mprotect() on Linux that can be >>> worked around with -XX:+UseMember. I tried that, but I still see the same >>> unexplainable pauses. >>> >>> >>> Any suggestions/ideas? We've upgraded to the latest JDK, but no luck. 
>>> >>> Thanks, >>> Shane >>> >>> >>> java version "1.6.0_25" >>> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >>> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >>> >>> >>> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 >>> x86_64 GNU/Linux >>> >>> >>> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m >>> -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC >>> -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 >>> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps >>> -XX:+PrintHeapAtGC -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings >>> -XX:+UseMembar >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > -------------- next part -------------- An HTML attachment was scrubbed... 
From tom.rodriguez at oracle.com Tue Apr 26 11:31:12 2011
From: tom.rodriguez at oracle.com (Tom Rodriguez)
Date: Tue, 26 Apr 2011 11:31:12 -0700
Subject: RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220
In-Reply-To: <4DB706FF.9050701@oracle.com>
References: <4DB21D11.5090308@oracle.com> <4DB21F7F.9050308@oracle.com> <28A87274-4AAA-4F1E-9711-D722A103312E@oracle.com> <4DB605AB.7080201@oracle.com> <41B43E13-9FE0-41CF-BA2B-865AEE65A1A1@oracle.com> <4DB62BFF.1030503@oracle.com> <401F8218-C523-4063-9F74-85F1F4AB536A@oracle.com> <4DB706FF.9050701@oracle.com>
Message-ID: <9EF575D0-A5C9-407A-8896-2625A763B379@oracle.com>

I'm not against a hybrid of your original fix and the current one. Use the owns_pending_list_lock check instead of the is_reference_handler_check and keep the rest. The try/lock logic itself is fairly ugly though. Maybe it's just the name LockerMutexLocker that seems wrong. Maybe LockedMutexUnlocker or a variant constructor for MutexLocker like this:

  MutexLocker(Monitor * mutex, bool already_locked) {
    assert(mutex->rank() != Mutex::special,
           "Special ranked mutex should only use MutexLockerEx");
    _mutex = mutex;
    if (already_locked) {
      assert(mutex->owned_by_self(), "must already be locked");
    } else {
      _mutex->lock();
    }
  }

Then the try_lock piece just sets a flag that we pass into that and otherwise the locking proceeds as normal.

tom

On Apr 26, 2011, at 10:55 AM, John Cuthbertson wrote:

> Hi Tom, Ramki,
>
> I think that the main loop might still get compiled (some of the time) even with this fix.
Here's the code of the run method:
>
> public void run() {
>     for (;;) {
>
>         Reference r;
>         synchronized (lock) {
>             if (pending != null) {
>                 r = pending;
>                 Reference rn = r.next;
>                 pending = (rn == r) ? null : rn;
>                 r.next = r;
>             } else {
>                 try {
>                     lock.wait();
>                 } catch (InterruptedException x) { }
>                 continue;
>             }
>         }
>
>         // Fast path for cleaners
>         if (r instanceof Cleaner) {
>             ((Cleaner)r).clean();
>             continue;
>         }
>
>         ReferenceQueue q = r.queue;
>         if (q != ReferenceQueue.NULL) q.enqueue(r);
>     }
> }
>
> With Xcomp, the reference handler thread is not holding the lock when run() is called and so the compilation request would be successful. Normally though the compilation of run() would be an OSR compilation - it looks like there are 3 backward branches, only one of which is inside the locked region. If we have a lot of references to enqueue then I would expect that the backward branch after the enqueue() call would trigger an OSR compile (I don't think the branch inside the locked region would be executed any more frequently).
>
> The actual method I saw being compiled while holding the pending list lock was: java/lang/ref/Reference.access$202:(Ljava/lang/ref/Reference;)Ljava/lang/ref/Reference;
>
> Other methods invoked from the locked region are:
>
> java/lang/ref/Reference.access$200:()Ljava/lang/ref/Reference;
> java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock;
> java/lang/Object.wait:()V
>
> Of these java/lang/ref/Reference.access$100:()Ljava/lang/ref/Reference$Lock; is also invoked outside of the locked region (it's actually at bci:0 - which is the target bci of the backward branches), and java/lang/Object.wait:()V is a native method.
>
>
> On 04/25/11 19:58, Tom Rodriguez wrote:
>>
>> We could have a small side queue that we could enqueue these on and when a later safe request comes in we can request a non blocking compile of them. It's kind of gross but it would work I think.
It could even just be a queue of length one instead of a full data structure. >> >> tom >> >> >> > I think having a special queue for these is a lot uglier than adapting the original fix (i.e. the one with the try_lock) to use ownership of the pending list lock rather than the thread id. > > JohnC > From john.cuthbertson at oracle.com Tue Apr 26 12:01:30 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 26 Apr 2011 12:01:30 -0700 Subject: RFR(M): 7004681: G1: Extend marking verification to marking phase of Full GCs In-Reply-To: <4DB21E56.6070204@oracle.com> References: <4DB21E56.6070204@oracle.com> Message-ID: <4DB7168A.3040909@oracle.com> Hi All, A new webrev is here: http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.4/ The changes made since the last webrev include a suggestion from Tony to fold the check from G1CollectedHeap::checkConcurrentMark into the VerifyObjsInRegionClosure and remove the checkConcurrentMark routine and associated closure. Thanks, JohnC On 04/22/11 17:33, John Cuthbertson wrote: > Hi Everyone, > > A new webrev for this CR can be found at: > http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.3. > > I'd like to get at least another person look over these changes (Tony > has already looked at an earlier version). The latest webrev includes > skipping the region set verification if the verification was called > from a full GC (in G1 the region sets are torn down at the start of > the full GC and so the verification will give a false failure). > > Testing: GC test suite with +VerifyDuringGC with and without G1. > > Thanks, > > JohnC From y.s.ramakrishna at oracle.com Tue Apr 26 12:40:29 2011 From: y.s.ramakrishna at oracle.com (Y. S. 
Ramakrishna) Date: Tue, 26 Apr 2011 12:40:29 -0700 Subject: Periodic long minor GC pauses In-Reply-To: References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> Message-ID: <4DB71FAD.3050905@oracle.com> Well-spotted; it's a version of the same problem as near as i can tell. Please make sure to include a sizable GC log with your bug report (starting from VM start-up, so we can see if there is any clue in when the problem first starts during the life of the VM). thanks. -- ramki On 04/26/11 11:29, Shane Cox wrote: > Below is an example from a Remark. Of the total 1.3 seconds of elapsed > time, 1.2 seconds is found between the first two timestamps. However, > I'm not savvy enough to know whether this is the same problem or simply > the result of a long scavenge that occurs as part of the Remark. Is > there any way to tell? > > 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K > (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak > refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 > secs]9467.408: [scrub symbol & string tables, 0.0458500 secs] [1 > CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] > [Times: user=0.13 sys=0.01, real=1.32 secs] > > > On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna > > wrote: > > I had a quick look and all i could find was the GC prologue > code (although i didn't look all that carefully). > Bascially, GC is invoked, it prints this timestamp, > does a bit of global book-keeping and some initialization, > and then goes over each generation in the heap and > says "i am going to do a collection, do whatever you need > to do before i do the collection", and the generations each do a bit of > book-keeping and any relevant initialization. > > The only thing i can see in the gc prologues other than a bit > of lightweight book-keeping is some reporting code that could > potentially be heavyweight. 
But you do not have any of those > enabled in your option set, so there should not be anything > obviously heavyweight going on. > > I'd suggest filing a bug under the category of > jvm/hotspot/garbage_collector > so someone in support can work with you to get this diagnosed... > > Three questions when you file the bug: > (1) have you seen this start happening recently? (version?) > (2) can you check if the longer pauses are "random" or do > they always happen "during" CMS concurrent cycles or > always outside of such cycles? > (3) test set-up. > > -- ramki > > > On 04/26/11 10:45, Y. S. Ramakrishna wrote: > > The pause is definitely in the beginning, before GC collection code > itself runs; witness the timestamps:- > > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: > 943744K->79296K(943744K), 0.0559560 secs] > 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 > sys=0.09, real=2.45 secs] > > The first timestamp is 2120.686 and the next one is 2123.075, so > we have > about 2.389 s between those two. If you add to that the GC time > of 0.056 s, > you get 2.445 which is close enough to the 2.45 s reported. > > So we need to figure out what happens in the JVM between those two > time-stamps and we can at least bound the culprit. > > -- ramki > > On 04/26/11 10:36, Shane Cox wrote: > > Periodically, our Java app on Linux experiences a long Minor > GC pause that cannot be accounted for by the GC time in the > log file. Instead, the pause is captured as "real" (wall > clock) time and is observable in our application logs. An > example is below. The GC completed in 56ms, but the > application was paused for 2.45 seconds. 
> > 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: > [ParNew: 943439K->104832K(943744K), 0.0481790 secs] > 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: > user=0.34 sys=0.03, real=0.04 secs] > 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: > [ParNew: 942852K->104832K(943744K), 0.0738000 secs] > 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: > user=0.45 sys=0.12, real=0.07 secs] > 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: > [ParNew: 943744K->79296K(943744K), 0.0559560 secs] > 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: > user=0.31 sys=0.09, *real=2.45 secs]* > 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: > [ParNew: 918208K->81040K(943744K), 0.0396620 secs] > 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: > user=0.32 sys=0.00, real=0.04 secs] > 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: > [ParNew: 919952K->104832K(943744K), 0.0845070 secs] > 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: > user=0.52 sys=0.11, real=0.09 secs] > > > Initially I suspected swapping, but according to the free > command, 0 bytes of swap are in use. > >free -m > total used free shared > buffers cached > Mem: 32168 28118 4050 0 > 824 12652 > -/+ buffers/cache: 14641 17527 > Swap: 8191 0 8191 > > > Next, I read about a problem relating to mprotect() on Linux > that can be worked around with -XX:+UseMember. I tried > that, but I still see the same unexplainable pauses. > > > Any suggestions/ideas? We've upgraded to the latest JDK, > but no luck. 
> > Thanks, > Shane > > > java version "1.6.0_25" > Java(TM) SE Runtime Environment (build 1.6.0_25-b06) > Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) > > > Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 > x86_64 x86_64 x86_64 GNU/Linux > > > -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k > -XX:PermSize=256m -XX:MaxPermSize=256m > -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC > -XX:+CMSParallelRemarkEnabled > -XX:CMSInitiatingOccupancyFraction=70 > -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails > -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC > -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings > -XX:+UseMembar > > > ------------------------------------------------------------------------ > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From yumin.qi at oracle.com Tue Apr 26 14:45:56 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Tue, 26 Apr 2011 14:45:56 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DA49CE7.7020104@oracle.com> References: <4DA49CE7.7020104@oracle.com> Message-ID: <4DB73D14.8000808@oracle.com> Based on first round review comments, new webrev: http://cr.openjdk.java.net/~minqi/6941923/webrev.01 One more flag added, GCLogFile which can be used as -Xloggc:: -XX:GCLogFile= Deleted class gclogStream, modify existing class fileStream instead to have the functions of the former. 
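The rotation rules in the summary quoted further down in this message (size cap with a 512K floor and a 10M default, file count between 1 and 1024, wrap-around back to the first file) can be modelled in a few lines. This is only an illustrative sketch and not the webrev's implementation, which lives in HotSpot's C++ code. The class and method names are hypothetical, treating a non-positive size as "use the default" is an assumption, and so is the suffix of the last file in the sequence, which is elided in the archived text.

```java
// Hypothetical model of the described GC log rotation policy (not HotSpot code).
class GcLogRotationSketch {
    static final long DEFAULT_SIZE = 10L * 1024 * 1024; // "Default to 10M"
    static final long MIN_SIZE = 512L * 1024;           // "Minimum set to 512K"
    static final int MAX_FILES = 1024;                  // "maximum set to 1024"

    // Clamp a requested -XX:GCLogFileSize value to the stated minimum.
    // Assumption: a non-positive request means the option was not given.
    static long effectiveSize(long requestedBytes) {
        if (requestedBytes <= 0) {
            return DEFAULT_SIZE;
        }
        return Math.max(requestedBytes, MIN_SIZE);
    }

    // Clamp a requested -XX:MaxGCLogFileNumbers value to the range 1..1024.
    static int effectiveFileCount(int requested) {
        return Math.max(1, Math.min(requested, MAX_FILES));
    }

    // Index 0 is the base file; later indices get a numeric suffix.
    // Assumption: the last suffix is fileCount - 1.
    static String fileName(String base, int index) {
        return index == 0 ? base : base + "." + index;
    }

    // After the current file reaches its cap, move to the next one and wrap
    // around; with a single file this always returns 0 (rewrite in place).
    static int nextIndex(int current, int fileCount) {
        return (current + 1) % fileCount;
    }

    public static void main(String[] args) {
        int files = effectiveFileCount(3);
        int index = 0;
        for (int i = 0; i < 5; i++) {
            System.out.println(fileName("gc.log", index));
            index = nextIndex(index, files);
        }
    }
}
```

Under these assumptions, three files cycle through gc.log, gc.log.1 and gc.log.2 before returning to gc.log, and a count of 1 keeps rewriting the same file from the beginning.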
Thanks Yumin On 4/12/2011 11:41 AM, yumin.qi at oracle.com wrote: > http://cr.openjdk.java.net/~minqi/6941923/webrev.00/ > > > Summary: > > This is a RFE request for having a GC log rotation to prevent Java > application from over flooding disk with GC output running for long time. > In the implementation, supply three JVM options > 1) -XX:+UseGCLogFileRotation must be used with -Xloggc:file > 2) -XX:MaxGCLogFileNumbers= set limit of rotation file numbers, > default to 1, maximum set to 1024. > 3) -XX:GCLogFileSize= can be configured by user how big the file size > should be. Default to 10M. Minimum set to 512K if given from option is > less than 512K. > > If MaxGCLogFileNumbers=1, rotating output in same file, i.e write from > beginning of the file when reach cap of the file; with > MaxGCLogFileNumbers > 1 rotating files sequentially after reach cap in > file, file.1, file.2, ..., file. then back to > file, file.1, ... > Check if rotation needed at safepoint ending. > > Tested with multiple GC choices. > > Thanks > Yumin > > > From igor.veresov at oracle.com Tue Apr 26 20:40:59 2011 From: igor.veresov at oracle.com (igor.veresov at oracle.com) Date: Wed, 27 Apr 2011 03:40:59 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7037939: NUMA: Disable adaptive resizing if SHM large pages are used Message-ID: <20110427034108.5AD2147022@hg.openjdk.java.net> Changeset: c303b3532d4a Author: iveresov Date: 2011-04-26 11:46 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/c303b3532d4a 7037939: NUMA: Disable adaptive resizing if SHM large pages are used Summary: Make the NUMA allocator behave properly with SHM and ISM large pages. Reviewed-by: ysr ! src/os/linux/vm/os_linux.cpp ! src/os/solaris/vm/os_solaris.cpp From bhorowit at gmail.com Tue Apr 26 21:09:53 2011 From: bhorowit at gmail.com (Ben Horowitz) Date: Wed, 27 Apr 2011 04:09:53 -0000 Subject: Java library + JVMTI agent for getting GC pause times? 
Message-ID:

Hi all,

Does anyone know of a Java library/JVMTI agent pair that allows a Java application access to GC pause times soon after GC pauses occur? It'd be useful to incorporate this information into the logs written by my Java application for easier correlation with other application behavior. Of course, we get this information from GC log files, but correlating multiple log files is time consuming.

I'm thinking of writing such a library/agent pair myself, but don't want to duplicate work that's already been done.

Thanks,
Ben

From bharathwork at yahoo.com Fri Apr 22 16:00:24 2011
From: bharathwork at yahoo.com (Bharath Mundlapudi)
Date: Fri, 22 Apr 2011 16:00:24 -0700 (PDT)
Subject: Is CMS cycle can collect finalize objects
In-Reply-To: <4DA87354.7050206@oracle.com>
References: <522727.98997.qm@web110708.mail.gq1.yahoo.com> <4DA87354.7050206@oracle.com>
Message-ID: <774851.44950.qm@web110705.mail.gq1.yahoo.com>

Hi Ramki,

Thanks for the detailed explanation. I was trying to run some tests for your questions. Here are the answers to some of your questions.

>>What are the symptoms?

java.net.SocksSocketImpl objects are not getting cleaned up after a CMS cycle. I see a direct correlation with java.lang.ref.Finalizer objects. Over time, this fills up the old generation and CMS goes into a loop, fully occupying one core. But when we trigger a Full GC, these objects are garbage collected.

You mentioned that a CMS cycle does clean up these objects provided we enable class unloading. Are you suggesting -XX:+ClassUnloading or -XX:+CMSClassUnloadingEnabled?
I have tried the latter and didn't succeed. Our perm gen is relatively constant; by enabling this, are we introducing performance overhead? We have room for CPU cycles and perm gen is relatively small, so this may be fine. We just want to see these objects GC'ed in a CMS cycle. Do you have any suggestion as to which flags I should be using to trigger this?

>> What does jmap -finalizerinfo on your process show?
>> What does -XX:+PrintClassHistogram show as accumulating in the heap?
>> (Are they one specific type of Finalizer objects or all varieties?)

jmap -histo shows that the above class keeps accumulating. In fact, finalizerinfo doesn't show any objects for this process.

>>Did the problem start in 6u21? Or are those the only versions
>>you tested and found that there was an issue?

We have seen this problem in 6u21. We were on 6u12 earlier and didn't run into this problem. But we can't say this is specific to a particular build, since lots of things have changed.

Thanks in anticipation,
-Bharath

________________________________
From: Y. Srinivas Ramakrishna
To: Bharath Mundlapudi
Cc: hotspot-gc-use at openjdk.java.net
Sent: Friday, April 15, 2011 9:33 AM
Subject: Re: Is CMS cycle can collect finalize objects

Hi Bharath --

On 4/15/2011 7:12 AM, Bharath Mundlapudi wrote:
> We have tuned our server to not run into Full GC with CMS collector. One thing, we noted recently was - java.lang.ref.Finalizer objects getting incremented with load. Due to this, CMS cycle threshold was reached and CMS went into loop and run continuously.
>
> To verify if CMS is cleaning up these Finalizer objects, I have tested on another setup. I have noticed that Finalizer objects are not getting cleaned but when i force full gc, these objects are getting garbage collected.
>
>
> I have the following questions:
> 1. Is there a way (JVM cmd option) to tell CMS to cleanup Finalizer objects when CMS runs rather than via Full GC?

CMS does process finalizable objects without the need for a full STW gc.
Once an object with a finalizer is determined by the CMS collector to be unreachable, it will be placed on the finalizable queue, whence the finalizer thread will pull those objects and finalize them. At the next CMS cycle the space used by those objects will become available for new allocation. The only catch is that the CMS collector will only detect such objects in the old generation (and if you have enabled class unloading, in the perm generation). (That is not to say that those in the younger gen will not be finalized; they will be if both the FinalRef and the referent object are in the young gen at the time that the referent became unreachable. If they happen to be split between the two generations (which is unlikely to happen in practice but is not impossible), then we'll need to wait until the object that is in the younger generation migrates to the older generation and then they will be discovered at the next CMS cycle. And then there will need to be another CMS cycle to actually reclaim the space used by them following finalization.)

In your case, what is the symptom? Is the finalizer thread's "to be finalized" queue of objects growing, or are you saying that the CMS collector does not detect and enqueue unreachable objects into the finalizer's queue?

What does jmap -finalizerinfo on your process show?
What does -XX:+PrintClassHistogram show as accumulating in the heap?
(Are they one specific type of Finalizer objects or all varieties?)

> 2. I see that there is a System.runFinalization() method to notify GC to clean up the finalizer queue. Is this a better approach for server-side applications?

runFinalization() will only cause the objects in the finalizer queue to get finalized in a new thread. If objects are not already on the queue nothing will happen.

> 3. Is there any JMX API to invoke finalization from an external process?

Don't know.

>
>
> Versions verified:
>
> We are using JDK 1.6.0 update 21/23 on Redhat 5.4.

Did the problem start in 6u21?
Or are those the only versions you tested and found that there was an issue? Do you have a test case that demonstrates the issue you encounter? If so, could you send it in and open a suitable bug report? -- ramki > > > Thanks in anticipation, > Bharath > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From chkwok at digibites.nl Thu Apr 14 08:47:20 2011 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Thu, 14 Apr 2011 17:47:20 +0200 Subject: Java heap space, GC, and Promotion Failed In-Reply-To: References: <4DA5F95F.1010101@oracle.com> Message-ID: Well, just use a profiler to see why the app is using so much memory. It's either leaking or it just requires that much RAM to process everything. Don't allocate more memory than you have. I doubt this problem has anything to do with GC at all, so remove all the flags except for -Xmx. On Thu, Apr 14, 2011 at 4:17 PM, Rafael Angarita wrote: > Thank you very much! > > I took your advise about the JVM GC parameters and removed some of them. > > I used -Xmx2500m. My application gets further with the proccesing it needs > to do, but the whole computer gets really slow and my application crash > anyway. > > I'm trying to get the developers of the framework I'm using for my DSL. > > If any of you guys have more ideas, I'm here to listen and learn. > > Thank you very much. > > On 13 April 2011 14:58, Y. S. Ramakrishna wrote: > >> Hi Rafael -- >> >> Looks like you need more heap: size your -Xmx bigger to >> accommodate all of the objects that your Eclipse project creates. 
>> Here's the state of the old gen in the penultimate display:- >> >> >> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >>>> sys=0.02, real=9.00 secs] (concurrent mode failure): >>>> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >>>> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >>>> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >>>> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >>>> sys=0.00, real=0.00 secs] >>>> >>> >> The last line shows that the old gen has:- >> 2015232 - 2014802 = 430 KB >> of free space. Perhaps you were trying to allocate an object bigger than >> that. >> I'd suggest running with a larger heap (possibly using a 64-bit JVM if >> you need more Java heap). >> >> However, the end of your message does not show the heap to be too full. >> Perhaps Eclipse catches the OOM, and drops all of the objects before it >> exits, >> so you see the heap as not full in the final display:- >> (Eclipse experts on the list might want to weigh in.) >> >> >> This is the end of the output: >>>> >>>> Heap >>>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >>>> 0x308b0000) >>>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 0x30250000) >>>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >>>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >>>> concurrent mark-sweep generation total 2015232K, used 61766K >>>> [0x308b0000, 0xab8b0000, 0xab8b0000) >>>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >>>> 0xb0e75000, 0xb38b0000) >>>> >>> >> >> Asides:- >> Never, never use values for MaxTenuringThreshold exceeding 15, unless >> you are sure you want that kind of behaviour. 
I'd suggest >> just leaving that option out unless you know how to tune for it (there's >> lots of experience on this alias with tuning that though, should >> you need to tune that for performance in the future). >> >> More asides (specific to CMS):- >> Depending on what your platform is, if it has anything more than 2 cores, >> I'd advise dropping the -XX:+CMSIncrementalMode option. (You'd then >> want to drop other options starting with "CMSIncremental".) >> CMS does not unload classes by default. With Eclipse etc. you would >> want to unload classes concurrently so as not to get OOM's: >> use -XX:+CMSClassUnloadingEnabled (and if on older JVM's >> -XX:+CMSPermGenSweepingEnabled). >> >> Bottom line: looks like you need more Java heap. >> -- ramki >> >> >> On 04/13/11 10:25, Rafael Angarita wrote: >> >>> Hello everybody, >>> >>> I'm building a code generation application as an Eclipse and one of my >>> test projects contains around 15000 source files. My application started >>> having memory problems, so after doing some optimizations specific to the >>> framework I'm using to develop my DSL, I started learning about GC, but I >>> think I'm still lost. >>> >>> I have tried with different JVM options for the GC with no success. >>> Currently, I'm trying: >>> >>> -Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails >>> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC >>> -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing >>> -XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 >>> -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1 >>> >>> but this is just one of the several things I have tried. >>> >>> At first everything seems to go fine, but after a while I get "promotion >>> failed" and everything gets really slow, and finally the application crashes >>> with java.lang.OutOfMemoryError: Java heap space. 
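The asides in the reply can be applied mechanically to the option list posted above. The following is only a sketch: the class name CmsFlagReview and the method review are hypothetical, and the two rules it encodes (MaxTenuringThreshold above 15, and options starting with "CMSIncremental") are taken from the reply in this thread rather than from any JVM documentation.

```java
import java.util.ArrayList;
import java.util.List;

public class CmsFlagReview {
    /** Returns remarks for the options that the advice in this thread calls out. */
    public static List<String> review(String commandLine) {
        List<String> remarks = new ArrayList<>();
        for (String opt : commandLine.trim().split("\\s+")) {
            if (opt.startsWith("-XX:MaxTenuringThreshold=")) {
                // Numeric value follows the '=' sign.
                int value = Integer.parseInt(opt.substring(opt.indexOf('=') + 1));
                if (value > 15) {
                    remarks.add(opt + ": exceeds 15; consider leaving the option out");
                }
            } else if (opt.startsWith("-XX:+CMSIncremental")
                    || opt.startsWith("-XX:CMSIncremental")) {
                remarks.add(opt + ": incremental CMS option; consider dropping it"
                        + " on machines with more than 2 cores");
            }
        }
        return remarks;
    }

    public static void main(String[] args) {
        // Option list as posted in the original message.
        String posted = "-Xms2000m -Xmx2000m -verbosegc -XX:+PrintGCDetails "
                + "-XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC "
                + "-XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing "
                + "-XX:CMSInitiatingOccupancyFraction=5 -XX:MaxTenuringThreshold=300 "
                + "-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSIncrementalDutyCycleMin=1";
        for (String remark : review(posted)) {
            System.out.println(remark);
        }
    }
}
```

For the posted list this flags the three CMSIncremental options and MaxTenuringThreshold=300; it does not judge the heap size itself, which the reply identifies as the main problem.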
>>> >>> >>> CMS-concurrent-abortable-preclean: 0.070/0.587 secs] [Times: user=0.66 >>> sys=0.02, real=0.58 secs] [GC[YG occupancy: 28574 K (29504 K)][Rescan >>> (parallel) , 0.0198420 secs][weak refs processing, 0.0015200 secs] [1 >>> CMS-remark: 1961760K(2015232K)] 1990335K(2044736K), 0.0215890 secs] [Times: >>> user=0.03 sys=0.00, real=0.03 secs] [GC [ParNew: 29096K->2087K(29504K), >>> 0.0270900 secs] 1990735K->1965523K(2044736K) icms_dc=100 , 0.0271650 secs] >>> [Times: user=0.05 sys=0.00, real=0.03 secs] [GC [ParNew: >>> 28327K->3264K(29504K), 0.0430410 secs] 1990448K->1969326K(2044736K) >>> icms_dc=100 , 0.0431180 secs] [Times: user=0.07 sys=0.01, real=0.04 secs] >>> [GC [ParNew: 29504K->3264K(29504K), 0.0658260 secs] >>> 1995091K->1975795K(2044736K) icms_dc=100 , 0.0659090 secs] [Times: user=0.11 >>> sys=0.00, real=0.07 secs] [GC [ParNew: 29504K->3264K(29504K), 0.0630250 >>> secs] 2001944K->1982760K(2044736K) icms_dc=100 , 0.0631060 secs] [Times: >>> user=0.11 sys=0.00, real=0.06 secs] [GC [ParNew: 29504K->3263K(29504K), >>> 0.0435130 secs] 2008711K->1985752K(2044736K) icms_dc=100 , 0.0436310 secs] >>> [Times: user=0.07 sys=0.00, real=0.04 secs] [CMS-concurrent-sweep: >>> 1.813/2.058 secs] [Times: user=3.76 sys=0.02, real=2.05 secs] >>> [CMS-concurrent-reset: 0.035/0.035 secs] [Times: user=0.06 sys=0.00, >>> real=0.04 secs] [GC [ParNew (promotion failed): 29503K->29504K(29504K), >>> 0.5729750 secs][CMS[Unloading class >>> sun.reflect.GeneratedConstructorAccessor6] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor26] >>> [Unloading class sun.reflect.GeneratedMethodAccessor9] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor17] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor20] >>> [Unloading class sun.reflect.GeneratedMethodAccessor4] >>> [Unloading class sun.reflect.GeneratedMethodAccessor8] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor25] >>> [Unloading class sun.reflect.GeneratedMethodAccessor18] >>> 
[Unloading class sun.reflect.GeneratedMethodAccessor17] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor27] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor19] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor12] >>> [Unloading class sun.reflect.GeneratedMethodAccessor2] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor14] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor28] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor5] >>> [Unloading class sun.reflect.GeneratedMethodAccessor16] >>> [Unloading class sun.reflect.GeneratedMethodAccessor19] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor9] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor11] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor8] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor29] >>> [Unloading class sun.reflect.GeneratedMethodAccessor3] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor24] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor18] >>> [Unloading class sun.reflect.GeneratedMethodAccessor15] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor10] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor16] >>> [Unloading class sun.reflect.GeneratedConstructorAccessor15] >>> >>> [Full GC [CMS[CMS-concurrent-mark: 8.811/9.001 secs] [Times: user=10.95 >>> sys=0.02, real=9.00 secs] (concurrent mode failure): >>> 2008891K->2014802K(2015232K), 24.2053380 secs] 2038395K->2014802K(2044736K), >>> [CMS Perm : 50779K->50779K(86244K)] icms_dc=100 , 24.2054320 secs] [Times: >>> user=24.16 sys=0.03, real=24.20 secs] [GC [1 CMS-initial-mark: >>> 2014802K(2015232K)] 2015335K(2044736K), 0.0023250 secs] [Times: user=0.00 >>> sys=0.00, real=0.00 secs] >>> This is the end of the output: >>> >>> Heap >>> par new generation total 29504K, used 23591K [0x2e8b0000, 0x308b0000, >>> 0x308b0000) >>> eden space 26240K, 77% used [0x2e8b0000, 0x2fc89f20, 
0x30250000) >>> from space 3264K, 100% used [0x30580000, 0x308b0000, 0x308b0000) >>> to space 3264K, 0% used [0x30250000, 0x30250000, 0x30580000) >>> concurrent mark-sweep generation total 2015232K, used 61766K >>> [0x308b0000, 0xab8b0000, 0xab8b0000) >>> concurrent-mark-sweep perm gen total 87828K, used 52696K [0xab8b0000, >>> 0xb0e75000, 0xb38b0000) >>> >>> >>> I would appreciate if anybody can give me an advise about this. >>> >>> Thank you very much for your help. >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110414/f2e20b49/attachment.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Fri Apr 15 09:33:24 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Fri, 15 Apr 2011 09:33:24 -0700 Subject: Is CMS cycle can collect finalize objects In-Reply-To: <522727.98997.qm@web110708.mail.gq1.yahoo.com> References: <522727.98997.qm@web110708.mail.gq1.yahoo.com> Message-ID: <4DA87354.7050206@oracle.com> Hi Bharath -- On 4/15/2011 7:12 AM, Bharath Mundlapudi wrote: > We have tuned our server to not run into Full GC with CMS collector. One thing, we noted recently was - java.lang.ref.Finalizer objects getting incremented with load. 
Due to this, CMS cycle threshold was reached and CMS went into loop and run continuously. > > To verify if CMS is cleaning up these Finalizer objects, I have tested on another setup. I have noticed that Finalizer objects are not getting cleaned but when i force full gc, these objects are getting garbage collected. > > > I have the following questions: > 1. Is there a way (JVM cmd option) to tell CMS to cleanup Finalizer objects when CMS runs rather than via Full GC? CMS does process finalizable objects without the need for a full STW gc. Once an object with a finalizer is determined by the CMS collector to be unreachable, it will be placed on the finalizable queue, whence the finalizer thread will pull those objects and finalize them. At the next CMS cycle the space used by those objects will become available for new allocation. The only catch is that the CMS collector will only detect such objects in the old generation (and if you have enabled class unloading, in the perm generation). (That is not to say that those in the younger gen will not be finalized; they will be if both the FinalRef and the referent object are in the young gen at the time that the referent became unreachable. If they happen to be split between the two generations (which is unlikely to happen in practice but is not impossible), then we'll need to wait until the object that is in the younger generation migrates to the older generation and then they will be discovered at the next CMS cycle. (And then there will need to be another CMS cycle to actually reclaim the space used by them following finalization.) In your case, what is the symptom? Is the finalizer thread's "to be finalized" queue of objects growing, or are you saying that the CMS collector does not detect and enqueue unreachable objects into the finalizer's queue? What does jmap -finalizerinfo on your process show? What does -XX:+PrintClassHistogram show as accumulating in the heap? 
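Alongside jmap -finalizerinfo, one way to see whether the "to be finalized" queue is growing is to sample the standard MemoryMXBean. This is a minimal sketch, not something proposed in the thread; the class name PendingFinalization is hypothetical. The value is documented as approximate, and on recent JDKs, where finalization is deprecated, the call may produce a deprecation warning.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class PendingFinalization {
    /** Approximate number of objects whose finalizers have not yet been run. */
    public static int pendingCount() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getObjectPendingFinalizationCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; a steadily rising value suggests the finalizer
        // thread is not keeping up with the objects enqueued by the collector.
        for (int i = 0; i < 3; i++) {
            System.out.println("objects pending finalization: " + pendingCount());
            Thread.sleep(500);
        }
    }
}
```

The same figure is exposed to remote JMX clients as the ObjectPendingFinalizationCount attribute of the java.lang:type=Memory MBean, so it can be read from an external process. It only reports the count; it does not trigger finalization.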
(Are they one specific type of Finalizer objects or all varieties?) > 2. I see that, we can there is System.runFinalization() method to notify GC to cleanup the finalizer queue. Is this better approach for server-side applications? runFinalization() will only cause the objects in the finalizer queue to get finalized in a new thread. If objects are not already on the queue nothing will happen. > 3. Is there any JMX API to invoke finalization from an external process? Don't know. > > > Versions verified: > > We are using JDK 1.6.0 update 21/23 on Redhat 5.4. Did the problem start in 6u21? Or are those the only versions you tested and found that there was an issue? Do you have a test case that demonstrates the issue you encounter? If so, could you send it in and open a suitable bug report? -- ramki > > > Thanks in anticipation, > Bharath > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Mon Apr 25 09:48:08 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Mon, 25 Apr 2011 09:48:08 -0700 Subject: Is CMS cycle can collect finalize objects Message-ID: <4DB5A5C8.7070307@oracle.com> Forgot to cc the alias; response attached. -------------- next part -------------- An embedded message was scrubbed... From: "Y. 
Srinivas Ramakrishna" Subject: Re: Is CMS cycle can collect finalize objects Date: Mon, 25 Apr 2011 09:37:30 -0700 Size: 4853 Url: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110425/9d895eee/attachment.eml -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From jon.masamitsu at oracle.com Tue Apr 26 21:34:09 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 26 Apr 2011 21:34:09 -0700 Subject: Periodic long minor GC pauses In-Reply-To: <4DB71FAD.3050905@oracle.com> References: <4DB704D3.20600@oracle.com> <4DB70C4A.9090203@oracle.com> <4DB71FAD.3050905@oracle.com> Message-ID: <4DB79CC1.8040707@oracle.com> Shane, Have you tried running with -XX:+AlwaysPreTouch ? We've occasionally seen intermittent long pauses as the heap grows into newly committed pages. This flag causes pages to be touched as they are committed. I don't know how this fits into Ramki's observation but it might be worth a shot. Jon On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote: > Well-spotted; it's a version of the same problem as near as > i can tell. Please make sure to include a sizable GC log with > your bug report (starting from VM start-up, so we can see if > there is any clue in when the problem first starts during > the life of the VM). > > thanks. > -- ramki > > On 04/26/11 11:29, Shane Cox wrote: >> Below is an example from a Remark. Of the total 1.3 seconds of elapsed >> time, 1.2 seconds is found between the first two timestamps. However, >> I'm not savvy enough to know whether this is the same problem or simply >> the result of a long scavenge that occurs as part of the Remark. Is >> there any way to tell? 
>> >> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K >> (943744 K)]9467.353: [Rescan (parallel) , 0.0106370 secs]9467.374: [weak >> refs processing, 0.0159250 secs]9467.390: [class unloading, 0.0180420 >> secs]9467.408: [scrub symbol& string tables, 0.0458500 secs] [1 >> CMS-remark: 12520949K(24117248K)] 13233450K(25060992K), 0.1052950 secs] >> [Times: user=0.13 sys=0.01, real=1.32 secs] >> >> >> On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna >> > wrote: >> >> I had a quick look and all i could find was the GC prologue >> code (although i didn't look all that carefully). >> Bascially, GC is invoked, it prints this timestamp, >> does a bit of global book-keeping and some initialization, >> and then goes over each generation in the heap and >> says "i am going to do a collection, do whatever you need >> to do before i do the collection", and the generations each do a bit of >> book-keeping and any relevant initialization. >> >> The only thing i can see in the gc prologues other than a bit >> of lightweight book-keeping is some reporting code that could >> potentially be heavyweight. But you do not have any of those >> enabled in your option set, so there should not be anything >> obviously heavyweight going on. >> >> I'd suggest filing a bug under the category of >> jvm/hotspot/garbage_collector >> so someone in support can work with you to get this diagnosed... >> >> Three questions when you file the bug: >> (1) have you seen this start happening recently? (version?) >> (2) can you check if the longer pauses are "random" or do >> they always happen "during" CMS concurrent cycles or >> always outside of such cycles? >> (3) test set-up. >> >> -- ramki >> >> >> On 04/26/11 10:45, Y. S. 
Ramakrishna wrote: >> >> The pause is definitely in the beginning, before GC collection code >> itself runs; witness the timestamps:- >> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: [ParNew: >> 943744K->79296K(943744K), 0.0559560 secs] >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: user=0.31 >> sys=0.09, real=2.45 secs] >> >> The first timestamp is 2120.686 and the next one is 2123.075, so >> we have >> about 2.389 s between those two. If you add to that the GC time >> of 0.056 s, >> you get 2.445 which is close enough to the 2.45 s reported. >> >> So we need to figure out what happens in the JVM between those two >> time-stamps and we can at least bound the culprit. >> >> -- ramki >> >> On 04/26/11 10:36, Shane Cox wrote: >> >> Periodically, our Java app on Linux experiences a long Minor >> GC pause that cannot be accounted for by the GC time in the >> log file. Instead, the pause is captured as "real" (wall >> clock) time and is observable in our application logs. An >> example is below. The GC completed in 56ms, but the >> application was paused for 2.45 seconds. 
>> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157: >> [ParNew: 943439K->104832K(943744K), 0.0481790 secs] >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times: >> user=0.34 sys=0.03, real=0.04 secs] >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317: >> [ParNew: 942852K->104832K(943744K), 0.0738000 secs] >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times: >> user=0.45 sys=0.12, real=0.07 secs] >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: >> [ParNew: 943744K->79296K(943744K), 0.0559560 secs] >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times: >> user=0.31 sys=0.09, *real=2.45 secs]* >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928: >> [ParNew: 918208K->81040K(943744K), 0.0396620 secs] >> 5026432K->4189265K(25060992K), 0.0400030 secs] [Times: >> user=0.32 sys=0.00, real=0.04 secs] >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445: >> [ParNew: 919952K->104832K(943744K), 0.0845070 secs] >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times: >> user=0.52 sys=0.11, real=0.09 secs] >> >> >> Initially I suspected swapping, but according to the free >> command, 0 bytes of swap are in use. >> >free -m >> total used free shared >> buffers cached >> Mem: 32168 28118 4050 0 >> 824 12652 >> -/+ buffers/cache: 14641 17527 >> Swap: 8191 0 8191 >> >> >> Next, I read about a problem relating to mprotect() on Linux >> that can be worked around with -XX:+UseMembar. I tried >> that, but I still see the same unexplainable pauses. >> >> >> Any suggestions/ideas? We've upgraded to the latest JDK, >> but no luck. 
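The timestamp arithmetic used in the reply (the gap between the two uptime stamps, plus the reported GC time, compared with the "real" figure) can be sketched as follows. The class name GcPauseGap and the regular expression are assumptions made for illustration; the pattern only targets ParNew lines shaped like the ones in this log and is not a general GC log parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcPauseGap {
    // Captures: (1) uptime when the GC event was logged, (2) uptime when ParNew
    // started, (3) total GC duration in seconds, (4) wall-clock "real" seconds.
    private static final Pattern PARNEW_LINE = Pattern.compile(
            "(\\d+\\.\\d+): \\[GC (\\d+\\.\\d+): \\[ParNew.*, (\\d+\\.\\d+) secs\\] "
            + "\\[Times:.*real=(\\d+\\.\\d+) secs");

    /** Returns {gap before the collection started, GC duration, reported real time}. */
    public static double[] analyze(String logLine) {
        Matcher m = PARNEW_LINE.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("not a ParNew line: " + logLine);
        }
        double eventStart = Double.parseDouble(m.group(1));
        double parNewStart = Double.parseDouble(m.group(2));
        double gcSeconds = Double.parseDouble(m.group(3));
        double realSeconds = Double.parseDouble(m.group(4));
        return new double[] { parNewStart - eventStart, gcSeconds, realSeconds };
    }

    public static void main(String[] args) {
        // The slow collection from the log above, joined onto one line.
        String line = "2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075: "
                + "[ParNew: 943744K->79296K(943744K), 0.0559560 secs] "
                + "4989811K->4187520K(25060992K), 0.0563970 secs] "
                + "[Times: user=0.31 sys=0.09, real=2.45 secs]";
        double[] r = analyze(line);
        System.out.printf("gap before collection: %.3f s%n", r[0]);
        System.out.printf("gap + GC time: %.3f s (log reports real=%.2f s)%n",
                r[0] + r[1], r[2]);
    }
}
```

For the other four collections in the sample the two uptime stamps are identical, so the gap is zero and the "real" time matches the GC time; only the 12:50:45 entry shows the roughly 2.39 s spent before the collection itself began.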
>> >> Thanks, >> Shane >> >> >> java version "1.6.0_25" >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06) >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11, mixed mode) >> >> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 >> x86_64 x86_64 x86_64 GNU/Linux >> >> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k >> -XX:PermSize=256m -XX:MaxPermSize=256m >> -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC >> -XX:+CMSParallelRemarkEnabled >> -XX:CMSInitiatingOccupancyFraction=70 >> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails >> -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC >> -XX:+HeapDumpOnOutOfMemoryError -XX:+UseCompressedStrings >> -XX:+UseMembar >> >> >> ------------------------------------------------------------------------ >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From y.s.ramakrishna at oracle.com Wed Apr 27 02:47:05 2011 From: y.s.ramakrishna at oracle.com (y.s.ramakrishna at oracle.com) Date: Wed, 27 Apr 2011 09:47:05 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7039089: G1: changeset for 7037276 broke heap verification, and related cleanups Message-ID: <20110427094712.02B994703B@hg.openjdk.java.net> Changeset: 1f4413413144 Author: ysr Date: 2011-04-26 21:17 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/1f4413413144 7039089: G1: 
changeset for 7037276 broke heap verification, and related cleanups Summary: In G1 heap verification, we no longer scan perm to G1-collected heap refs as part of process_strong_roots() but rather in a separate explicit oop iteration over the perm gen. This preserves the original perm card-marks. Added a new assertion in younger_refs_iterate() to catch a simple subcase where the user may have forgotten a prior save_marks() call, as happened in the case of G1's attempt to iterate perm to G1 refs when verifying the heap before exit. The assert was deliberately weakened for ParNew+CMS and will be fixed for that combination in a future CR. Also made some (non-G1) cleanups related to code and comments obsoleted by the migration of Symbols to the native heap. Reviewed-by: iveresov, jmasa, tonyp ! src/share/vm/gc_implementation/concurrentMarkSweep/compactibleFreeListSpace.cpp ! src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/memory/cardTableRS.cpp ! src/share/vm/memory/genCollectedHeap.hpp ! src/share/vm/memory/sharedHeap.cpp ! src/share/vm/memory/sharedHeap.hpp ! src/share/vm/runtime/vmThread.cpp From paul.hohensee at oracle.com Wed Apr 27 05:20:59 2011 From: paul.hohensee at oracle.com (Paul Hohensee) Date: Wed, 27 Apr 2011 08:20:59 -0400 Subject: What are the conflicting points between Class-Data Sharing and ParNew/CMS/ParallelScavenge/G1? In-Reply-To: References: Message-ID: <4DB80A2B.60207@oracle.com> CDS stores a non-relocatable permgen image, so in order to use it, the jvm must be able to load it at the same address it occupied when it was generated. If I remember correctly, serial and CMS/Parnew/G1 put the permgen above the old/young gens in the address space, while the parallel collectors put it below the old/young gens. There can be issues with trying to map the permgen in the latter case, esp. 
on windows where it's already a pain to find contiguous virtual address space for the heap (dlls are loaded all over the place). I believe CDS more-or-less works with CMS/Parnew/G1, but no one's had time to get all the bugs out. We'd welcome help. :) Paul On 4/14/11 5:14 AM, Krystal Mok wrote: > Hi all, > > I've had this doubt for quite a long time but was too shy to ask on > this list. Anyway, here I go: > > According to the logic in hotspot/src/share/vm/runtime/arguments.cpp, > CDS doesn't work with ParNew/CMS/PS/G1. That means only serial GCs > works with CDS. > I'd like to know, what are the conflicting points between CDS and > these parallel/concurrent GCs? Are there any plans to resolve the > conflicts, or are there any suggestions on how the conflicts would be > resolved? > > Sincerely, > Kris Mok -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110427/5509dfc1/attachment-0001.html From john.cuthbertson at oracle.com Wed Apr 27 18:10:17 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Thu, 28 Apr 2011 01:10:17 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7037756: Deadlock in compiler thread similiar to 6789220 Message-ID: <20110428011024.32A344706A@hg.openjdk.java.net> Changeset: 86ebb26bcdeb Author: johnc Date: 2011-04-27 14:40 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/86ebb26bcdeb 7037756: Deadlock in compiler thread similiar to 6789220 Summary: Avoid blocking in CompileBroker::compile_method_base() if the current thread holds the pending list lock. Reviewed-by: never, brutisso, ysr ! 
src/share/vm/compiler/compileBroker.cpp From bengt.rutisson at oracle.com Thu Apr 28 00:35:42 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 28 Apr 2011 09:35:42 +0200 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB2081E.5020701@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> Message-ID: <4DB918CE.2080105@oracle.com> Tony, I think this fix looks good. I like that you have a way of testing this, but if it is possible I would like to see if it can be made more obvious in the code what parts are "real" code and what parts are there just for testing. For example, I think I would prefer that concurrentMark.cpp had calls like: DEBUG_ONLY(force_overflow()->update()); rather than having this in concurrentMark.hpp: void update() PRODUCT_RETURN; That way someone browsing the code in concurrentMark.cpp can quickly see the difference between test code and product code. I realize that this might be a matter of taste, so I'm ok with the way it is now as well. Bengt On 2011-04-23 00:58, Tony Printezis wrote: > Thanks to John Cuthbertson for looking at this. I took his advice and > I'm going to disable the forced overflow by default (by setting the > default parameter to 0), but leave the code in as it's helpful. Latest > version here: > > http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ > > Tony > > Tony Printezis wrote: >> Hi all, >> >> I'd still like a couple of code reviews for this. Here's the latest >> version (I only rephrased a couple of comments, so if you're looking >> at the earlier version already you can ignore this one): >> >> http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ >> >> Tony >> >> Tony Printezis wrote: >>> Hi, >>> >>> Could I get a couple of people to look at this? 
(I'd like to push >>> this this week if possible) >>> >>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >>> >>> The actual fix is reasonably small (leave / join the >>> SuspendibleThreadSet only if we are in concurrent mode). Most of the >>> changes are new infrastructure to cause a fixed number of overflows >>> during marking (in non-product builds of course) to stress the >>> overflow code. This was the only way I could reliably reproduce the >>> failure. This did uncover a couple of extra issues which I also fixed: >>> >>> - If we overflow during remark we should not actually deal with it >>> during remark but we should abort the remark pause and restart a >>> concurrent mark phase. For some reason we were not doing that. I >>> fixed that (for this I had to ensure that the overflow flag is not >>> cleared when we exit the do_marking_step() method). >>> - Because we were clearing the overflow, it was also possible that >>> the workers would deadlock (for that to happen a worker had to >>> finish handling one overflow and immediately raise another one, so >>> it was highly unlikely to occur in practice; good to find it and >>> eliminate it though). >>> >>> I've already tested it, I'll run more tests overnight. >>> >>> Tony >> From jesper.wilhelmsson at oracle.com Thu Apr 28 02:17:56 2011 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Thu, 28 Apr 2011 11:17:56 +0200 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB918CE.2080105@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> Message-ID: <4DB930C4.30800@oracle.com> On 2011-04-28 09:35, Bengt Rutisson wrote: > > Tony, > > I think this fix looks good. 
> > I like that you have a way of testing this, but if it is possible I would like > to see if it can be made more obvious in the code what parts are "real" code > and what parts are there just for testing. > > For example, I think I would prefer that concurrentMark.cpp had calls like: > > DEBUG_ONLY(force_overflow()->update()); > > rather than having this in concurrentMark.hpp: > > void update() PRODUCT_RETURN; > > That way someone browsing the code in concurrentMark.cpp can quickly see the > difference between test code and product code. I realize that this might be a > matter of taste, so I'm ok with the way it is now as well. I agree with Bengt that it would be preferred to make it visible at the call site that this call is for debugging only. I don't think this is just a matter of taste, it is also a matter of clarity and consistency. /Jesper > > Bengt > > On 2011-04-23 00:58, Tony Printezis wrote: >> Thanks to John Cuthbertson for looking at this. I took his advice and I'm >> going to disable the forced overflow by default (by setting the default >> parameter to 0), but leave the code in as it's helpful. Latest version here: >> >> http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ >> >> Tony >> >> Tony Printezis wrote: >>> Hi all, >>> >>> I'd still like a couple of code reviews for this. Here's the latest version >>> (I only rephrased a couple of comments, so if you're looking at the earlier >>> version already you can ignore this one): >>> >>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ >>> >>> Tony >>> >>> Tony Printezis wrote: >>>> Hi, >>>> >>>> Could I get a couple of people to look at this? (I'd like to push this this >>>> week if possible) >>>> >>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >>>> >>>> The actual fix is reasonably small (leave / join the SuspendibleThreadSet >>>> only if we are in concurrent mode). 
Most of the changes are new >>>> infrastructure to cause a fixed number of overflows during marking (in >>>> non-product builds of course) to stress the overflow code. This was the >>>> only way I could reliably reproduce the failure. This did uncover a couple >>>> of extra issues which I also fixed: >>>> >>>> - If we overflow during remark we should not actually deal with it during >>>> remark but we should abort the remark pause and restart a concurrent mark >>>> phase. For some reason we were not doing that. I fixed that (for this I had >>>> to ensure that the overflow flag is not cleared when we exit the >>>> do_marking_step() method). >>>> - Because we were clearing the overflow, it was also possible that the >>>> workers would deadlock (for that to happen a worker had to finish handling >>>> one overflow and immediately raise another one, so it was highly unlikely >>>> to occur in prcatice; good to find it and eliminate it though). >>>> >>>> I've already tested it, I'll run more tests overnight. >>>> >>>> Tony >>> > From jon.masamitsu at oracle.com Thu Apr 28 06:25:53 2011 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 28 Apr 2011 06:25:53 -0700 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DB6DDB5.4040804@xs4all.nl> References: <4DB6DDB5.4040804@xs4all.nl> Message-ID: <4DB96AE1.2020202@oracle.com> John, You're telling G1 (UseG1GC) to limit pauses to 2ms. (-XX:MaxGCPauseMillis=2) but seemed to have tuned CMS (UseConcMarkSweepGC) toward a 20ms goal. G1 is trying to do very short collections and needs to do many of them to keep up with the allocation rate. Did you mean you are setting MaxGCPauseMillis to 20? Jon On 4/26/2011 7:59 AM, John Hendrikx wrote: > Hi list, > > I've been testing Java 1.6 performance vs Java 1.7 performance with a > timing critical application -- it's essential that garbage collection > pauses are very short. 
What I've found is that Java 1.6 seems to > perform significantly better than 1.7 (b137) in this respect, although > with certain settings 1.6 will also fail catastrophically. I've used > the following options: > > For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC > For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 > > The amount of garbage created is roughly 150 MB/sec. The application > demands a response time of about 20 ms and uses half a dozen threads > which deal with buffering and decoding of information. > > With the above settings, the 1.6 VM will meet this goal over a 2 minute > period>99% of the time (with an average CPU consumption of 65% per CPU > core for two cores) -- from verbosegc I gather that the pause times are > around 0.01-0.02 seconds: > > [GC 187752K->187559K(258880K), 0.0148198 secs] > [GC 192156K(258880K), 0.0008281 secs] > [GC 144561K->144372K(258880K), 0.0153497 secs] > [GC 148965K(258880K), 0.0008028 secs] > [GC 166187K->165969K(258880K), 0.0146546 secs] > [GC 187935K->187754K(258880K), 0.0150638 secs] > [GC 192344K(258880K), 0.0008422 secs] > > Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. > It can also introduce OutOfMemory conditions and other catastrophic > failures (one time the GC took 10 seconds after the application had only > been running 20 seconds). How stable 1.6 will perform with the initial > settings remains to be seen; the results with more RAM worry me somewhat. > > The 1.7 VM however performs significantly worse. 
Here is some of its > output (over roughly a one second period): > > [GC concurrent-mark-end, 0.0197681 sec] > [GC remark, 0.0030323 secs] > [GC concurrent-count-start] > [GC concurrent-count-end, 0.0060561] > [GC cleanup 177M->103M(256M), 0.0005319 secs] > [GC concurrent-cleanup-start] > [GC concurrent-cleanup-end, 0.0000676] > [GC pause (partial) 136M->136M(256M), 0.0046206 secs] > [GC pause (partial) 139M->139M(256M), 0.0039039 secs] > [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] > [GC concurrent-mark-start] > [GC concurrent-mark-end, 0.0152915 sec] > [GC remark, 0.0033085 secs] > [GC concurrent-count-start] > [GC concurrent-count-end, 0.0085232] > [GC cleanup 163M->129M(256M), 0.0004847 secs] > [GC concurrent-cleanup-start] > [GC concurrent-cleanup-end, 0.0000363] > > From the above output one would not expect the performance to be worse, > however, the application fails to meet its goals 10-20% of the time. > The amount of garbage created is the same. CPU time however is hovering > around 90-95%, which is likely the cause of the poor performance. The > GC seems to take a significantly larger amount of time to do its work > causing these stalls in my test application. > > I've experimented with memory sizes and max pause times with the 1.7 VM, > and although it seemed to be doing better with more RAM, it never comes > even close to the performance observed with the 1.6 VM. > > I'm not sure if there are other useful options I can try to see if I can > tune the 1.7 VM performance a bit better. I can provide more > information, although not any (useful) source code at this time due to > external dependencies (JNA/JNI) of this application. > > I'm wondering if I'm missing something as it seems strange to me that > 1.7 is actually underperforming for me when in general most seem to > agree that the G1GC is a huge improvement.
> > --John > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From john.cuthbertson at oracle.com Thu Apr 28 10:59:59 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 28 Apr 2011 10:59:59 -0700 Subject: RFR(M): 7004681: G1: Extend marking verification to marking phase of Full GCs In-Reply-To: <4DB7168A.3040909@oracle.com> References: <4DB21E56.6070204@oracle.com> <4DB7168A.3040909@oracle.com> Message-ID: <4DB9AB1F.9020806@oracle.com> Hi All, Another new webrev for this CR can be found at: http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.5/ The changes since the last one are just a reconciliation, in g1CollectedHeap.cpp, between the previous changes and Ramki's changes for 7039089. Testing: I ran the GC test suite overnight with VerifyDuringGC and a low initial occupancy threshold (5%). Thanks, JohnC On 04/26/11 12:01, John Cuthbertson wrote: > Hi All, > > A new webrev is here: > http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.4/ > > The changes made since the last webrev include a suggestion from Tony > to fold the check from G1CollectedHeap::checkConcurrentMark into the > VerifyObjsInRegionClosure and remove the checkConcurrentMark routine > and associated closure. > > Thanks, > > JohnC > > On 04/22/11 17:33, John Cuthbertson wrote: >> Hi Everyone, >> >> A new webrev for this CR can be found at: >> http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.3. >> >> I'd like to get at least another person to look over these changes (Tony >> has already looked at an earlier version).
The latest webrev includes >> skipping the region set verification if the verification was called >> from a full GC (in G1 the region sets are torn down at the start of >> the full GC and so the verification will give a false failure). >> >> Testing: GC test suite with +VerifyDuringGC with and without G1. >> >> Thanks, >> >> JohnC > From john.cuthbertson at oracle.com Thu Apr 28 12:27:57 2011 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 28 Apr 2011 12:27:57 -0700 Subject: RFR(XXXS): 7040410: -Xloggc: incorrectly enables TraceClassUnloading causing tracing on tty Message-ID: <4DB9BFBD.3090902@oracle.com> Hi Everyone, Very simple and small change: http://cr.openjdk.java.net/~johnc/7040410/webrev.1/ Verified with running crypto benchmarks with -Xloggc. Thanks, JohnC From y.s.ramakrishna at oracle.com Thu Apr 28 12:41:25 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 28 Apr 2011 12:41:25 -0700 Subject: RFR(XXXS): 7040410: -Xloggc: incorrectly enables TraceClassUnloading causing tracing on tty In-Reply-To: <4DB9BFBD.3090902@oracle.com> References: <4DB9BFBD.3090902@oracle.com> Message-ID: <4DB9C2E5.3090201@oracle.com> Looks good! On 04/28/11 12:27, John Cuthbertson wrote: > Hi Everyone, > > Very simple and small change: > http://cr.openjdk.java.net/~johnc/7040410/webrev.1/ > > Verified with running crypto benchmarks with -Xloggc. > > Thanks, > > JohnC From tony.printezis at oracle.com Thu Apr 28 12:57:11 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Thu, 28 Apr 2011 15:57:11 -0400 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB918CE.2080105@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> Message-ID: <4DB9C697.6020408@oracle.com> Hi Bengt, Again, thanks for the code review! See inline. 
Bengt Rutisson wrote: > > Tony, > > I think this fix looks good. > > I like that you have a way of testing this, but if it is possible I > would like to see if it can be made more obvious in the code what > parts are "real" code and what parts are there just for testing. Basically, the fix is just the conditional calling of stsLeave() and stsJoin(). The rest is instrumentation to artificially cause N overflows per concurrent cycle and remark. We could try to write a test to cause that condition but, IMHO, it'd be quite hard (in particular, it'd be hard to cause the overflow during remark, which is why we've seen this failure only once). So, with the instrumentation enabled, the overflow condition will be caused every time there's a marking cycle. > For example, I think I would prefer that concurrentMark.cpp had calls > like: > > DEBUG_ONLY(force_overflow()->update()); > > rather than having this in concurrentMark.hpp: > > void update() PRODUCT_RETURN; > > That way someone browsing the code in concurrentMark.cpp can quickly > see the difference between test code and product code. I realize that > this might be a matter of taste, so I'm ok with the way it is now as > well. I do see the point you and Jesper are making (he made it in his subsequent e-mail) and, of course :-), we're generally not consistent in taking one approach over the other in HotSpot. Having said this, this is the sort of situation for which the PRODUCT_RETURN macro was introduced in the first place. I personally like using PRODUCT_RETURN since it cuts down the clutter. Compare:

    void foo() PRODUCT_RETURN;

    #ifndef PRODUCT
    void foo() {
      ...
    }
    #endif

    foo();

with:

    NOT_PRODUCT(void foo();)

    #ifndef PRODUCT
    void foo() {
      ...
    }
    #endif

    NOT_PRODUCT(foo();)

But I'd be OK with the latter for the reasons you and Jesper brought up. Anyone else in the group have a strong opinion one way or another? Tony > Bengt > > On 2011-04-23 00:58, Tony Printezis wrote: >> Thanks to John Cuthbertson for looking at this.
I took his advice and >> I'm going to disable the forced overflow by default (by setting the >> default parameter to 0), but leave the code in as it's helpful. >> Latest version here: >> >> http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ >> >> Tony >> >> Tony Printezis wrote: >>> Hi all, >>> >>> I'd still like a couple of code reviews for this. Here's the latest >>> version (I only rephrased a couple of comments, so if you're looking >>> at the earlier version already you can ignore this one): >>> >>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ >>> >>> Tony >>> >>> Tony Printezis wrote: >>>> Hi, >>>> >>>> Could I get a couple of people to look at this? (I'd like to push >>>> this this week if possible) >>>> >>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >>>> >>>> The actual fix is reasonably small (leave / join the >>>> SuspendibleThreadSet only if we are in concurrent mode). Most of >>>> the changes are new infrastructure to cause a fixed number of >>>> overflows during marking (in non-product builds of course) to >>>> stress the overflow code. This was the only way I could reliably >>>> reproduce the failure. This did uncover a couple of extra issues >>>> which I also fixed: >>>> >>>> - If we overflow during remark we should not actually deal with it >>>> during remark but we should abort the remark pause and restart a >>>> concurrent mark phase. For some reason we were not doing that. I >>>> fixed that (for this I had to ensure that the overflow flag is not >>>> cleared when we exit the do_marking_step() method). >>>> - Because we were clearing the overflow, it was also possible that >>>> the workers would deadlock (for that to happen a worker had to >>>> finish handling one overflow and immediately raise another one, so >>>> it was highly unlikely to occur in prcatice; good to find it and >>>> eliminate it though). >>>> >>>> I've already tested it, I'll run more tests overnight. 
>>>> >>>> Tony >>> > From yumin.qi at oracle.com Thu Apr 28 14:18:51 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Thu, 28 Apr 2011 14:18:51 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications Message-ID: <4DB9D9BB.6010905@oracle.com> Hi, Need your review on the second time changes: http://cr.openjdk.java.net/~minqi/6941923/webrev.01 Any comments on the revised version? thanks in advance. Yumin -------------- next part -------------- An embedded message was scrubbed... From: yumin.qi at oracle.com Subject: Re: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications Date: Tue, 26 Apr 2011 14:45:56 -0700 Size: 4935 Url: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110428/daa6e014/attachment.nws From igor.veresov at oracle.com Thu Apr 28 15:06:57 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 28 Apr 2011 15:06:57 -0700 Subject: review(XS) 7040485: Use transparent huge page on linux by default Message-ID: <4DB9E501.4050003@oracle.com> We should enable the use of transparent huge pages on Linux by default. The solution is to set the UseLargePages flag to true by default, but try only using UseHugeTLBFS by default if UseLargePages is not specified on the command line. We would try both UseSHM and UseHugeTLBFS methods if UseLargePages is explicitly set. This way we maintain compatibility and do not start eating into shared memory by default. Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.00/ Thanks, igor From y.s.ramakrishna at oracle.com Thu Apr 28 15:28:14 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 28 Apr 2011 15:28:14 -0700 Subject: review(XS) 7040485: Use transparent huge page on linux by default In-Reply-To: <4DB9E501.4050003@oracle.com> References: <4DB9E501.4050003@oracle.com> Message-ID: <4DB9E9FE.6060608@oracle.com> Looks good to me. 
I guess I missed this nuance in my previous review. Thanks also for fixing the return value. I had one comment about large_page_init(): it returns a boolean value and sets the value of UseLargePages (on Solaris and Linux). The caller then uses the return value to set the value of UseLargePages again. Seems a bit unpleasant. [Windows doesn't set the value in large_page_init(); the caller uses the return value to set UseLargePages.] What if the return value was just discarded and the value of the flag always set in the method itself prior to return? (You would then change the method in the Windows implementation, and change the callers in the Solaris and Linux versions.) But of course all of that predates your current changes, and is not directly related to the synopsis, so would just be a small related clean-up you might want to roll into your changeset. Your choice. -- ramki On 04/28/11 15:06, Igor Veresov wrote: > We should enable the use of transparent huge pages on Linux by default. > The solution is to set the UseLargePages flag to true by default, but > try only using UseHugeTLBFS by default if UseLargePages is not specified > on the command line. We would try both UseSHM and UseHugeTLBFS methods > if UseLargePages is explicitly set. > This way we maintain compatibility and do not start eating into shared > memory by default. > > Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.00/ > > > Thanks, > igor From y.s.ramakrishna at oracle.com Thu Apr 28 15:58:26 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 28 Apr 2011 15:58:26 -0700 Subject: RFR(M): 7004681: G1: Extend marking verification to marking phase of Full GCs In-Reply-To: <4DB9AB1F.9020806@oracle.com> References: <4DB21E56.6070204@oracle.com> <4DB7168A.3040909@oracle.com> <4DB9AB1F.9020806@oracle.com> Message-ID: <4DB9F112.9060007@oracle.com> Looks good, except for one comment for your consideration/thought.
It appeared to me that since the verification of mark sweep is now called after the stringtable, symbol table and system dictionary, code cache and other weak roots structures have been visited and cleaned up, the process_strong_roots code might in fact want to verify those structures with the rootsCL as well, via enabling the relevant bits in the scanning option: i.e. instead of your current:-

    // We apply the relevant closures to all the oops in the
    // system dictionary, the string table and the code cache.
    int so = SharedHeap::SO_AllClasses | SharedHeap::SO_Strings | SharedHeap::SO_CodeCache;

    if (vo == VerifyOption_G1UseMarkWord) {
      // We need to match the parameters used in G1MarkSweep::mark_sweep_phase1
      // to match the set of marked roots. Otherwise we could see spurious
      // verification failures where a (non-marked or external) root referencing
      // a dead oop. For example we could have an entry in the dictionary
      // pointing to an unmarked class, or we could have an unmarked klassOop
      // in Perm referencing its class loader. This is not a marking failure
      // as the dead objects will be cleaned up during the current full GC.
      so = SharedHeap::SO_SystemClasses;
    }

simply keep:

    const int so = SharedHeap::SO_AllClasses | SharedHeap::SO_Strings | SharedHeap::SO_CodeCache;

I read your comment as saying these weak root structures may still be left pointing to dead objects, but it appears from the mark sweep phase 1 code that at the end of phase 1 we have in fact cleaned up all these weak roots left pointing to dead objects by now. Anyway, worth checking this suggestion, just in case it allows a slightly tighter verification than what you might otherwise get. Otherwise looks good.
-- ramki On 04/28/11 10:59, John Cuthbertson wrote: > Hi All, > > Another new webrev for this CR can be found at: > http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.5/ > > The changes since the last one are just a reconciliation, in > g1CollectedHeap.cpp, between the previous changes and Ramki's changes > for 7039089. > > Testing: I ran the GC test suite overnight with VerifyDuringGC and a low > initial occupancy threadhold (5%) > > Thanks, > > JohnC > > On 04/26/11 12:01, John Cuthbertson wrote: >> Hi All, >> >> A new webrev is here: >> http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.4/ >> >> The changes made since the last webrev include a suggestion from Tony >> to fold the check from G1CollectedHeap::checkConcurrentMark into the >> VerifyObjsInRegionClosure and remove the checkConcurrentMark routine >> and associated closure. >> >> Thanks, >> >> JohnC >> >> On 04/22/11 17:33, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> A new webrev for this CR can be found at: >>> http://cr.openjdk.java.net/~johnc/MarkSweep-VerifyMark/webrev.3. >>> >>> I'd like to get at least another person look over these changes (Tony >>> has already looked at an earlier version). The latest webrev includes >>> skipping the region set verification if the verification was called >>> from a full GC (in G1 the region sets are torn down at the start of >>> the full GC and so the verification will give a false failure). >>> >>> Testing: GC test suite with +VerifyDuringGC with and without G1. >>> >>> Thanks, >>> >>> JohnC >> > From y.s.ramakrishna at oracle.com Thu Apr 28 16:08:59 2011 From: y.s.ramakrishna at oracle.com (Y. S. 
Ramakrishna) Date: Thu, 28 Apr 2011 16:08:59 -0700 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB9C697.6020408@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> Message-ID: <4DB9F38B.1080805@oracle.com> I'm OK with either. The general question (not the specific one in regards to this specific changeset) would benefit from input from the bigger hotspot-dev alias (for reasons of global consistency, as was pointed out). I agree that having a clear discriminator at the call-site definitely aids clarity. The clutter from the macro perhaps offsets that a bit -- but the clutter itself is the "signal" here, even if that may sound a bit oxymoronic, so maybe there's an aesthetic balancing act to do here, I suspect. So global style guidelines will need to dictate behaviour, because as everyone noted consistency (or in general, a monotonic move towards consistency) would be good to keep the system from going into an oscillatory state. One has of course favoured the PRODUCT_RETURN* form for historical reasons, but it's a good idea to revisit historical norms every once in a while -- as long as such revision is consistent and global, so that the system as a whole advances towards the desired goal. I'll now get off the soap-box :-) -- ramki On 04/28/11 12:57, Tony Printezis wrote: ... > I do see the point you and Jesper (he made in his subsequent e-mail) > and, of course :-), we're generally not consistent in taking one > approach over the other in HotSpot. Having said this, this is the sort > of situation for which the PRODUCT_RETURN macro was introduced in the > first place. I personally like using PRODUCT_RETURN since it cuts down > the clutter. Compare: > > void foo() PRODUCT_RETURN; > > #ifndef PRODUCT > void foo() { > ...
> } > #endif > > foo(); > > with: > > NOT_PRODUCT(void foo();) > > #ifndef PRODUCT > void foo() { > ... > } > #endif > > NOT_PRODUCT(foo();) > > > But I'd be OK with the latter for the reasons you and Jesper brought up. > Anyone else in group have a strong opinion one way or another? > > Tony From igor.veresov at oracle.com Thu Apr 28 17:58:44 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 28 Apr 2011 17:58:44 -0700 Subject: review(XS) 7040485: Use transparent huge page on linux by default In-Reply-To: <4DB9E9FE.6060608@oracle.com> References: <4DB9E501.4050003@oracle.com> <4DB9E9FE.6060608@oracle.com> Message-ID: <4DBA0D44.7000702@oracle.com> Ramki, Sure! Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.01/ igor On 4/28/11 3:28 PM, Y. S. Ramakrishna wrote: > Looks good to me. I guess I missed this nuance in my > previour review. Thanks also for fixing the return value. > > I had one comment about large_page_init(): it returns a > boolean value and sets the value of UseLargePages (on > Solaris and Linux). The caller then uses the return > value to set the value of UseLargePages again. Seems > a bit unpleasant. [Windows doesn't set the value > in large_page_init(); the caller uses the return > value to set UseLargePages.] > > What if the return value was just discarded and the > value of the flag always set in the method itself > prior to return? (You would then change the method > in the windows implementation, and change the callers > in the solaris and linux versions.] > > But of course all of that predates your current changes, > and is not directly related to the synopsis, so would > just be a small related clean-up you might want to roll into > your changset. Your choice. > > -- ramki > > On 04/28/11 15:06, Igor Veresov wrote: >> We should enable the use of transparent huge pages on Linux by >> default. 
The solution is to set the UseLargePages flag to true by >> default, but try only using UseHugeTLBFS by default if UseLargePages >> is not specified on the command line. We would try both UseSHM and >> UseHugeTLBFS methods if UseLargePages is explicitly set. >> This way we maintain compatibility and do not start eating into shared >> memory by default. >> >> Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.00/ >> >> >> Thanks, >> igor From y.s.ramakrishna at oracle.com Thu Apr 28 18:45:40 2011 From: y.s.ramakrishna at oracle.com (Y. S. Ramakrishna) Date: Thu, 28 Apr 2011 18:45:40 -0700 Subject: review(XS) 7040485: Use transparent huge page on linux by default In-Reply-To: <4DBA0D44.7000702@oracle.com> References: <4DB9E501.4050003@oracle.com> <4DB9E9FE.6060608@oracle.com> <4DBA0D44.7000702@oracle.com> Message-ID: <4DBA1844.50005@oracle.com> Looks good to me! On 04/28/11 17:58, Igor Veresov wrote: > Ramki, > > Sure! > > Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.01/ > > igor > > On 4/28/11 3:28 PM, Y. S. Ramakrishna wrote: >> Looks good to me. I guess I missed this nuance in my >> previour review. Thanks also for fixing the return value. >> >> I had one comment about large_page_init(): it returns a >> boolean value and sets the value of UseLargePages (on >> Solaris and Linux). The caller then uses the return >> value to set the value of UseLargePages again. Seems >> a bit unpleasant. [Windows doesn't set the value >> in large_page_init(); the caller uses the return >> value to set UseLargePages.] >> >> What if the return value was just discarded and the >> value of the flag always set in the method itself >> prior to return? (You would then change the method >> in the windows implementation, and change the callers >> in the solaris and linux versions.] 
>> >> But of course all of that predates your current changes, >> and is not directly related to the synopsis, so would >> just be a small related clean-up you might want to roll into >> your changset. Your choice. >> >> -- ramki >> >> On 04/28/11 15:06, Igor Veresov wrote: >>> We should enable the use of transparent huge pages on Linux by >>> default. The solution is to set the UseLargePages flag to true by >>> default, but try only using UseHugeTLBFS by default if UseLargePages >>> is not specified on the command line. We would try both UseSHM and >>> UseHugeTLBFS methods if UseLargePages is explicitly set. >>> This way we maintain compatibility and do not start eating into shared >>> memory by default. >>> >>> Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.00/ >>> >>> >>> Thanks, >>> igor > From john.cuthbertson at oracle.com Thu Apr 28 19:44:13 2011 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Fri, 29 Apr 2011 02:44:13 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7040410: -Xloggc: incorrectly enables TraceClassUnloading causing tracing on tty Message-ID: <20110429024419.C48F8470CD@hg.openjdk.java.net> Changeset: da0fffdcc453 Author: johnc Date: 2011-04-28 15:29 -0700 URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/da0fffdcc453 7040410: -Xloggc: incorrectly enables TraceClassUnloading causing tracing on tty Summary: Don't enable TraceClassUnloading whne -Xloggc is specified. Reviewed-by: tonyp, ysr ! 
src/share/vm/runtime/arguments.cpp From bengt.rutisson at oracle.com Thu Apr 28 22:45:48 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 29 Apr 2011 07:45:48 +0200 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB9C697.6020408@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> Message-ID: <4DBA508C.6080803@oracle.com> Tony, >> I like that you have a way of testing this, but if it is possible I >> would like to see if it can be made more obvious in the code what >> parts are "real" code and what parts are there just for testing. > > Basically, the fix is just the conditional calling of stsLeave() and > stsJoin(). The rest is instrumentation to artificially cause N > overflows per concurrent cycle and remark. We could try to write a > test to cause that condition but, IMHO, it'd be quite hard (in > particular, it'd be hard to cause the overflow during remark; which is > why we've only seen this failure only once). So, with the > instrumentation enabled, the overflow condition will be caused every > time there's a marking cycle. Sorry for being unclear. I agree with you here. I think it would be difficult to write a test case that will trigger the overflows in the desired way. This is not what I meant when I talked about "real code" and "testing code". What I meant was what you are pointing out further down: whether there is an acceptable way to signal already at the call site that a call has no effect in product builds. I'll comment on that in a response to Ramki's email shortly.
Bengt > >> For example, I think I would prefer that concurrentMark.cpp had calls >> like: >> >> DEBUG_ONLY(force_overflow()->update()); >> >> rather than having this in concurrentMark.hpp: >> >> void update() PRODUCT_RETURN; >> >> That way someone browsing the code in concurrentMark.cpp can quickly >> see the difference between test code and product code. I realize that >> this might be a matter of taste, so I'm ok with the way it is now as >> well. > > I do see the point you and Jesper (he made in his subsequent e-mail) > and, of course :-), we're generally not consistent in taking one > approach over the other in HotSpot. Having said this, this is the sort > of situation for which the PRODUCT_RETURN macro was introduced in the > first place. I personally like using PRODUCT_RETURN since it cuts down > the clutter. Compare: > > void foo() PRODUCT_RETURN; > > #ifndef PRODUCT > void foo() { > ... > } > #endif > > foo(); > > with: > > NOT_PRODUCT(void foo();) > > #ifndef PRODUCT > void foo() { > ... > } > #endif > > NOT_PRODUCT(foo();) > > > But I'd be OK with the latter for the reasons you and Jesper brought > up. Anyone else in group have a strong opinion one way or another? > > Tony > Tony > > >> Bengt >> >> On 2011-04-23 00:58, Tony Printezis wrote: >>> Thanks to John Cuthbertson for looking at this. I took his advice >>> and I'm going to disable the forced overflow by default (by setting >>> the default parameter to 0), but leave the code in as it's helpful. >>> Latest version here: >>> >>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ >>> >>> Tony >>> >>> Tony Printezis wrote: >>>> Hi all, >>>> >>>> I'd still like a couple of code reviews for this. 
Here's the latest >>>> version (I only rephrased a couple of comments, so if you're >>>> looking at the earlier version already you can ignore this one): >>>> >>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ >>>> >>>> Tony >>>> >>>> Tony Printezis wrote: >>>>> Hi, >>>>> >>>>> Could I get a couple of people to look at this? (I'd like to push >>>>> this this week if possible) >>>>> >>>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >>>>> >>>>> The actual fix is reasonably small (leave / join the >>>>> SuspendibleThreadSet only if we are in concurrent mode). Most of >>>>> the changes are new infrastructure to cause a fixed number of >>>>> overflows during marking (in non-product builds of course) to >>>>> stress the overflow code. This was the only way I could reliably >>>>> reproduce the failure. This did uncover a couple of extra issues >>>>> which I also fixed: >>>>> >>>>> - If we overflow during remark we should not actually deal with it >>>>> during remark but we should abort the remark pause and restart a >>>>> concurrent mark phase. For some reason we were not doing that. I >>>>> fixed that (for this I had to ensure that the overflow flag is not >>>>> cleared when we exit the do_marking_step() method). >>>>> - Because we were clearing the overflow, it was also possible that >>>>> the workers would deadlock (for that to happen a worker had to >>>>> finish handling one overflow and immediately raise another one, so >>>>> it was highly unlikely to occur in prcatice; good to find it and >>>>> eliminate it though). >>>>> >>>>> I've already tested it, I'll run more tests overnight. 
>>>>> >>>>> Tony >>>> >> From bengt.rutisson at oracle.com Thu Apr 28 23:05:36 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 29 Apr 2011 08:05:36 +0200 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB9F38B.1080805@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> <4DB9F38B.1080805@oracle.com> Message-ID: <4DBA5530.5040302@oracle.com> Ramki, I totally agree with you. Most important is to have some kind of general guideline so that we can start working towards a common way of implementing non-product code. I see your (and Tony's) point about PRODUCT_RETURN having been the way to do this. However, I don't fully appreciate the "clutter argument" against using a macro at the call site. To me the line "force_overflow()->update()" below is clutter if I am debugging this code from a product build since it does not do anything. But from just going through this code it is not clear that it is clutter. I actually have to go and understand what force_overflow() does. So, changing the line to "NOT_PRODUCT(force_overflow()->update())" is actually, IMHO, adding information - not adding clutter. The clutter is already there. It just doesn't look like clutter. (Sorry for getting all philosophical here :-) ) if (task_num == 0) { clear_marking_state(concurrent() /* clear_overflow */); force_overflow()->update(); if (PrintGC) { gclog_or_tty->date_stamp(PrintGCDateStamps); gclog_or_tty->stamp(PrintGCTimeStamps); gclog_or_tty->print_cr("[GC concurrent-mark-reset-for-overflow]"); } } Anyway, Tony, since we don't have any common rule for this, or maybe the rule is even to use PRODUCT_RETURN, I don't think you have to re-write your code for this change. I am more interested in the general direction that we will go in. Bengt On 2011-04-29 01:08, Y. S. 
Ramakrishna wrote: > I'm OK with either. The general question > (not the specific one in regards to this > specific changeset) would benefit from > input from the bigger hotspot-dev alias > (for reasons of global consistency, as was > pointed out). > > I agree that having a clear discriminator at the > call-site definitely aids clarity. The clutter from > the macro perhaps offsets that a bit -- but the clutter > itself is the "signal" here, even if that may sound a > bit oxymoronic, so may be there's an aesthetic balancing > act here to do, i suspect. So global style guidelines > will need to dictate behaviour, because as everyone noted > consistency (or in general, a monotonic move towards > consistency) would be good to keep the system from > going into an oscillatory state. > > Historically, one has of course favoured the PRODUCT_RETURN* > form for historical reasons, but it's a good > idea to revisit historical norms every once in a > while -- as long as such revision is consistent and > global, so that the system as a whole advances towards the > desired goal. > > I'll now get off the soap-box :-) > -- ramki > > On 04/28/11 12:57, Tony Printezis wrote: > ... >> I do see the point you and Jesper (he made in his subsequent e-mail) >> and, of course :-), we're generally not consistent in taking one >> approach over the other in HotSpot. Having said this, this is the >> sort of situation for which the PRODUCT_RETURN macro was introduced >> in the first place. I personally like using PRODUCT_RETURN since it >> cuts down the clutter. Compare: >> >> void foo() PRODUCT_RETURN; >> >> #ifndef PRODUCT >> void foo() { >> ... >> } >> #endif >> >> foo(); >> >> with: >> >> NOT_PRODUCT(void foo();) >> >> #ifndef PRODUCT >> void foo() { >> ... >> } >> #endif >> >> NOT_PRODUCT(foo();) >> >> >> But I'd be OK with the latter for the reasons you and Jesper brought >> up. Anyone else in group have a strong opinion one way or another? 
>> >> Tony From y.s.ramakrishna at oracle.com Thu Apr 28 23:16:28 2011 From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna) Date: Thu, 28 Apr 2011 23:16:28 -0700 Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep? In-Reply-To: <4DBA5640.9080203@xs4all.nl> References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> Message-ID: <4DBA57BC.40604@oracle.com> John -- How about posting performance/times of each of the 4 combinations from the following cartesian product:- {JDK7, JDK6} X {CMS, G1} Perhaps you are conflating JDK changes with GC changes, because of changing both axes/dimensions at the same time? -- ramki On 4/28/2011 11:10 PM, John Hendrikx wrote: > I tried many -XX:MaxGCPauseMillis settings, including not setting it at > all, 20, 10, 5, 2. The results were similar each time -- it didn't > really have much of an effect. In retrospect you might say that the > total CPU use is what is causing the problems, not necessarily the > length of the pauses -- whether this extra CPU use is caused by the > collector or because of some other change in Java 7 I do not know; the > program is the same. Is there perhaps another collector that I could > try to see if this lowers CPU use? Or settings (even non-GC related) > that could lower CPU use? > > Java 6's CMS I didn't need to tune. After determining that the length > of GC pauses was causing problems in the application, I tried turning > CMS on and it resolved the problems. > > What I observe is that even though with Java 7 the pauses seem (are?) > very short, the CPU use is a lot higher (from 65% under Java 6 to 95% > with 7). This could be related to other causes (perhaps threading > overhead, debug code in Java 7, etc) but I doubt it is in any specific > Java code that I wrote as most of the heavy lifting is happening in > native methods.
It could for example be that several ByteBuffers being > used are being copied under Java 7 while under 6 direct access was possible. > > John. > > Jon Masamitsu wrote: >> John, >> >> You're telling G1 (UseG1GC) to limit pauses to 2ms. >> (-XX:MaxGCPauseMillis=2) but seemed to have tuned >> CMS (UseConcMarkSweepGC) toward a 20ms goal. >> G1 is trying to do very short collections and needs to do many >> of them to keep up with the allocation rate. Did you >> mean you are setting MaxGCPauseMillis to 20? >> >> Jon >> >> On 4/26/2011 7:59 AM, John Hendrikx wrote: >> >>> Hi list, >>> >>> I've been testing Java 1.6 performance vs Java 1.7 performance with a >>> timing critical application -- it's essential that garbage collection >>> pauses are very short. What I've found is that Java 1.6 seems to >>> perform significantly better than 1.7 (b137) in this respect, although >>> with certain settings 1.6 will also fail catastrophically. I've used >>> the following options: >>> >>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC >>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2 >>> >>> The amount of garbage created is roughly 150 MB/sec. The application >>> demands a response time of about 20 ms and uses half a dozen threads >>> which deal with buffering and decoding of information. >>> >>> With the above settings, the 1.6 VM will meet this goal over a 2 minute >>> period>99% of the time (with an average CPU consumption of 65% per CPU >>> core for two cores) -- from verbosegc I gather that the pause times are >>> around 0.01-0.02 seconds: >>> >>> [GC 187752K->187559K(258880K), 0.0148198 secs] >>> [GC 192156K(258880K), 0.0008281 secs] >>> [GC 144561K->144372K(258880K), 0.0153497 secs] >>> [GC 148965K(258880K), 0.0008028 secs] >>> [GC 166187K->165969K(258880K), 0.0146546 secs] >>> [GC 187935K->187754K(258880K), 0.0150638 secs] >>> [GC 192344K(258880K), 0.0008422 secs] >>> >>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit. 
>>> It can also introduce OutOfMemory conditions and other catastrophic >>> failures (one time the GC took 10 seconds after the application had only >>> been running 20 seconds). How stable 1.6 will perform with the initial >>> settings remains to be seen; the results with more RAM worry me somewhat. >>> >>> The 1.7 VM however performs significantly worse. Here is some of its >>> output (over roughly a one-second period): >>> >>> [GC concurrent-mark-end, 0.0197681 sec] >>> [GC remark, 0.0030323 secs] >>> [GC concurrent-count-start] >>> [GC concurrent-count-end, 0.0060561] >>> [GC cleanup 177M->103M(256M), 0.0005319 secs] >>> [GC concurrent-cleanup-start] >>> [GC concurrent-cleanup-end, 0.0000676] >>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs] >>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs] >>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs] >>> [GC concurrent-mark-start] >>> [GC concurrent-mark-end, 0.0152915 sec] >>> [GC remark, 0.0033085 secs] >>> [GC concurrent-count-start] >>> [GC concurrent-count-end, 0.0085232] >>> [GC cleanup 163M->129M(256M), 0.0004847 secs] >>> [GC concurrent-cleanup-start] >>> [GC concurrent-cleanup-end, 0.0000363] >>> >>> From the above output one would not expect the performance to be worse, >>> however, the application fails to meet its goals 10-20% of the time. >>> The amount of garbage created is the same. CPU time however is hovering >>> around 90-95%, which is likely the cause of the poor performance. The >>> GC seems to take a significantly larger amount of time to do its work >>> causing these stalls in my test application. >>> >>> I've experimented with memory sizes and max pause times with the 1.7 VM, >>> and although it seemed to be doing better with more RAM, it never comes >>> even close to the performance observed with the 1.6 VM. >>> >>> I'm not sure if there are other useful options I can try to see if I can >>> tune the 1.7 VM performance a bit better.
I can provide more >>> information, although not any (useful) source code at this time due to >>> external dependencies (JNA/JNI) of this application. >>> >>> I'm wondering if I'm missing something as it seems strange to me that >>> 1.7 is actually underperforming for me when in general most seem to >>> agree that the G1GC is a huge improvement. >>> >>> --John _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bengt.rutisson at oracle.com Fri Apr 29 04:27:06 2011 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 29 Apr 2011 13:27:06 +0200 Subject: RFR(XXXS): 7040410: -Xloggc: incorrectly enables TraceClassUnloading causing tracing on tty In-Reply-To: <4DB9C2E5.3090201@oracle.com> References: <4DB9BFBD.3090902@oracle.com> <4DB9C2E5.3090201@oracle.com> Message-ID: <4DBAA08A.9090801@oracle.com> Yes, looks good. Bengt On 2011-04-28 21:41, Y. S. Ramakrishna wrote: > Looks good! > > On 04/28/11 12:27, John Cuthbertson wrote: >> Hi Everyone, >> >> Very simple and small change: >> http://cr.openjdk.java.net/~johnc/7040410/webrev.1/ >> >> Verified with running crypto benchmarks with -Xloggc.
>> >> Thanks, >> >> JohnC From jesper.wilhelmsson at oracle.com Fri Apr 29 05:25:02 2011 From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson) Date: Fri, 29 Apr 2011 14:25:02 +0200 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DB9D9BB.6010905@oracle.com> References: <4DB9D9BB.6010905@oracle.com> Message-ID: <4DBAAE1E.20904@oracle.com> Yumin, In ostream.hpp lines 199 - 215 you have added a block of code that is commented out. Personally I don't think we should have code that is commented out in there unless there is a good documentation reason for it. I don't see such a reason here. Looks good otherwise. /Jesper On 04/28/2011 11:18 PM, yumin.qi at oracle.com wrote: > Hi, > > Need your review on the second time changes: > > http://cr.openjdk.java.net/~minqi/6941923/webrev.01 > > Any comments on the revised version? thanks in advance. > > Yumin > From tony.printezis at oracle.com Fri Apr 29 07:38:47 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 29 Apr 2011 10:38:47 -0400 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DBA508C.6080803@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> <4DBA508C.6080803@oracle.com> Message-ID: <4DBACD77.8000006@oracle.com> Bengt, I clearly misread your e-mail. Apologies for the confusion! Tony On 4/29/2011 1:45 AM, Bengt Rutisson wrote: > > Tony, > >>> I like that you have a way of testing this, but if it is possible I >>> would like to see if it can be made more obvious in the code what >>> parts are "real" code and what parts are there just for testing. >> >> Basically, the fix is just the conditional calling of stsLeave() and >> stsJoin(). 
The rest is instrumentation to artificially cause N >> overflows per concurrent cycle and remark. We could try to write a >> test to cause that condition but, IMHO, it'd be quite hard (in >> particular, it'd be hard to cause the overflow during remark; which >> is why we've seen this failure only once). So, with the >> instrumentation enabled, the overflow condition will be caused every >> time there's a marking cycle. > > Sorry for being unclear. I agree with you here. I think it would be > difficult to write a test case that will trigger the overflows in the > desired way. This is not what I meant when I talked about "real code" > and "testing code". What I meant was what you are pointing out further > down. Whether there is an acceptable way to signal already at the call > site that a call has no effect in product builds. > > I'll comment on that in a response to Ramki's email shortly. > > Bengt > > >> >>> For example, I think I would prefer that concurrentMark.cpp had >>> calls like: >>> >>> DEBUG_ONLY(force_overflow()->update()); >>> >>> rather than having this in concurrentMark.hpp: >>> >>> void update() PRODUCT_RETURN; >>> >>> That way someone browsing the code in concurrentMark.cpp can quickly >>> see the difference between test code and product code. I realize >>> that this might be a matter of taste, so I'm ok with the way it is >>> now as well. >> >> I do see the point you and Jesper (he made in his subsequent e-mail) >> and, of course :-), we're generally not consistent in taking one >> approach over the other in HotSpot. Having said this, this is the >> sort of situation for which the PRODUCT_RETURN macro was introduced >> in the first place. I personally like using PRODUCT_RETURN since it >> cuts down the clutter. Compare: >> >> void foo() PRODUCT_RETURN; >> >> #ifndef PRODUCT >> void foo() { >> ...
>> } >> #endif >> >> NOT_PRODUCT(foo();) >> >> >> But I'd be OK with the latter for the reasons you and Jesper brought >> up. Anyone else in group have a strong opinion one way or another? >> >> Tony >> Tony >> >> >>> Bengt >>> >>> On 2011-04-23 00:58, Tony Printezis wrote: >>>> Thanks to John Cuthbertson for looking at this. I took his advice >>>> and I'm going to disable the forced overflow by default (by setting >>>> the default parameter to 0), but leave the code in as it's helpful. >>>> Latest version here: >>>> >>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.2/ >>>> >>>> Tony >>>> >>>> Tony Printezis wrote: >>>>> Hi all, >>>>> >>>>> I'd still like a couple of code reviews for this. Here's the >>>>> latest version (I only rephrased a couple of comments, so if >>>>> you're looking at the earlier version already you can ignore this >>>>> one): >>>>> >>>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.1/ >>>>> >>>>> Tony >>>>> >>>>> Tony Printezis wrote: >>>>>> Hi, >>>>>> >>>>>> Could I get a couple of people to look at this? (I'd like to push >>>>>> this this week if possible) >>>>>> >>>>>> http://cr.openjdk.java.net/~tonyp/7034139/webrev.0/ >>>>>> >>>>>> The actual fix is reasonably small (leave / join the >>>>>> SuspendibleThreadSet only if we are in concurrent mode). Most of >>>>>> the changes are new infrastructure to cause a fixed number of >>>>>> overflows during marking (in non-product builds of course) to >>>>>> stress the overflow code. This was the only way I could reliably >>>>>> reproduce the failure. This did uncover a couple of extra issues >>>>>> which I also fixed: >>>>>> >>>>>> - If we overflow during remark we should not actually deal with >>>>>> it during remark but we should abort the remark pause and restart >>>>>> a concurrent mark phase. For some reason we were not doing that. >>>>>> I fixed that (for this I had to ensure that the overflow flag is >>>>>> not cleared when we exit the do_marking_step() method). 
>>>>>> - Because we were clearing the overflow, it was also possible >>>>>> that the workers would deadlock (for that to happen a worker had >>>>>> to finish handling one overflow and immediately raise another >>>>>> one, so it was highly unlikely to occur in practice; good to find >>>>>> it and eliminate it though). >>>>>> >>>>>> I've already tested it, I'll run more tests overnight. >>>>>> >>>>>> Tony >>>>> >>> > From tony.printezis at oracle.com Fri Apr 29 07:48:23 2011 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 29 Apr 2011 10:48:23 -0400 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DBA5530.5040302@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> <4DB9F38B.1080805@oracle.com> <4DBA5530.5040302@oracle.com> Message-ID: <4DBACFB7.6060500@oracle.com> Bengt and Ramki, Thanks for your thoughts, which I agree with. I'll go ahead and push this with the PRODUCT_RETURN version given that it is consistent with what we've been doing so far, as both Ramki and Bengt suggested. And, yes, it'd be nice to revisit this in the future. FWIW, I'd like to see the whole set of assert / product / not product / debug only macros (and friends) revisited, given that there's a lot of redundancy between them and we use them inconsistently throughout the codebase. Tony On 4/29/2011 2:05 AM, Bengt Rutisson wrote: > > Ramki, > > I totally agree with you. Most important is to have some kind of > general guideline so that we can start working towards a common way of > implementing non-product code. > > I see your (and Tony's) point about PRODUCT_RETURN having been the way > to do this. However, I don't fully appreciate the "clutter argument" > against using a macro at the call site.
> > To me the line "force_overflow()->update()" below is clutter if I am > debugging this code from a product build since it does not do > anything. But from just going through this code it is not clear that > it is clutter. I actually have to go and understand what > force_overflow() does. So, changing the line to > "NOT_PRODUCT(force_overflow()->update())" is actually, IMHO, adding > information - not adding clutter. The clutter is already there. It > just doesn't look like clutter. (Sorry for getting all philosophical > here :-) ) > > if (task_num == 0) { > clear_marking_state(concurrent() /* clear_overflow */); > force_overflow()->update(); > > if (PrintGC) { > gclog_or_tty->date_stamp(PrintGCDateStamps); > gclog_or_tty->stamp(PrintGCTimeStamps); > gclog_or_tty->print_cr("[GC concurrent-mark-reset-for-overflow]"); > } > } > > Anyway, Tony, since we don't have any common rule for this, or maybe > the rule is even to use PRODUCT_RETURN, I don't think you have to > re-write your code for this change. I am more interested in the > general direction that we will go in. > > Bengt > > > On 2011-04-29 01:08, Y. S. Ramakrishna wrote: >> I'm OK with either. The general question >> (not the specific one in regards to this >> specific changeset) would benefit from >> input from the bigger hotspot-dev alias >> (for reasons of global consistency, as was >> pointed out). >> >> I agree that having a clear discriminator at the >> call-site definitely aids clarity. The clutter from >> the macro perhaps offsets that a bit -- but the clutter >> itself is the "signal" here, even if that may sound a >> bit oxymoronic, so may be there's an aesthetic balancing >> act here to do, i suspect. So global style guidelines >> will need to dictate behaviour, because as everyone noted >> consistency (or in general, a monotonic move towards >> consistency) would be good to keep the system from >> going into an oscillatory state. 
>> >> Historically, one has of course favoured the PRODUCT_RETURN* >> form for historical reasons, but it's a good >> idea to revisit historical norms every once in a >> while -- as long as such revision is consistent and >> global, so that the system as a whole advances towards the >> desired goal. >> >> I'll now get off the soap-box :-) >> -- ramki >> >> On 04/28/11 12:57, Tony Printezis wrote: >> ... >>> I do see the point you and Jesper (he made in his subsequent e-mail) >>> and, of course :-), we're generally not consistent in taking one >>> approach over the other in HotSpot. Having said this, this is the >>> sort of situation for which the PRODUCT_RETURN macro was introduced >>> in the first place. I personally like using PRODUCT_RETURN since it >>> cuts down the clutter. Compare: >>> >>> void foo() PRODUCT_RETURN; >>> >>> #ifndef PRODUCT >>> void foo() { >>> ... >>> } >>> #endif >>> >>> foo(); >>> >>> with: >>> >>> NOT_PRODUCT(void foo();) >>> >>> #ifndef PRODUCT >>> void foo() { >>> ... >>> } >>> #endif >>> >>> NOT_PRODUCT(foo();) >>> >>> >>> But I'd be OK with the latter for the reasons you and Jesper brought >>> up. Anyone else in group have a strong opinion one way or another? >>> >>> Tony > From yumin.qi at oracle.com Fri Apr 29 10:15:09 2011 From: yumin.qi at oracle.com (yumin.qi at oracle.com) Date: Fri, 29 Apr 2011 10:15:09 -0700 Subject: Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications In-Reply-To: <4DBAAE1E.20904@oracle.com> References: <4DB9D9BB.6010905@oracle.com> <4DBAAE1E.20904@oracle.com> Message-ID: <4DBAF21D.8010502@oracle.com> Jesper, Thanks. Deleted the comments part, this is the new version: http://cr.openjdk.java.net/~minqi/6941923/webrev.02 Thanks Yumin On 4/29/2011 5:25 AM, Jesper Wilhelmsson wrote: > Yumin, > > In ostream.hpp lines 199 - 215 you have added a block of code that is > commented out. 
Personally I don't think we should have code that is > commented out in there unless there is a good documentation reason for > it. I don't see such a reason here. > > Looks good otherwise. > /Jesper > > > > On 04/28/2011 11:18 PM, yumin.qi at oracle.com wrote: >> Hi, >> >> Need your review on the second time changes: >> >> http://cr.openjdk.java.net/~minqi/6941923/webrev.01 >> >> Any comments on the revised version? thanks in advance. >> >> Yumin >> From simone.bordet at gmail.com Fri Apr 29 10:59:12 2011 From: simone.bordet at gmail.com (Simone Bordet) Date: Fri, 29 Apr 2011 19:59:12 +0200 Subject: G1 feedback: frantic GC cycles Message-ID: Hi, I am running with JDK 7: java version "1.7.0-ea" Java(TM) SE Runtime Environment (build 1.7.0-ea-b139) Java HotSpot(TM) 64-Bit Server VM (build 21.0-b09, mixed mode) The application is IntelliJ IDEA, and I am using these command lines options: -Xms1024m -Xmx1024m -Xmn512m -XX:MaxPermSize=256m -ea -verbose:gc -XX:+PrintGCDetails -XX:+DisableExplicitGC -XX:+PrintGCDateStamps -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:+TieredCompilation -XX:+PrintCommandLineFlags -XX:+AggressiveOpts The application works well, I use it heavily for coding, and G1 works well: small pauses (which is what I like when I have keyboard-based mind-less automatisms) and stable. However, from time to time, say 2-3 times a day, I experience a few seconds of frantic GC cycles from G1, for which I have attached a log. The log starts at 18:44:10 (previous logging was fine and I hope not relevant), and the frantic cycles start at 19:14:08 (around line 4741 of the log) and lasts roughly 20 seconds. During these 20 seconds, the application was unresponsive (that is what allowed me to detect these frantic GC cycles) so I stopped working on it to look at the GC logs and waited until the GC came back to normal activity (i.e. the frantic cycles stopped). 
After those 20 seconds, G1 seems to be able to go back to normal activity and the application works well again. During those 20 seconds, there was roughly one collection every 100 ms or so. It is entirely possible that the application is leaking (I'm just a user, not a developer of the application). I was wondering if someone has seen this before? From the logs and jconsole, it seems that the old generation is quite full (jconsole reported 540 M occupied), but I still have ~512 M of young generation available (and jconsole confirms that a young GC brings the new generation to 6 M or less). Other than using G1 for the IDE, I often consult on GC tuning, so any additional information or improvement to G1 is good news for me, and understanding this behavior will help with recommending G1 to customers or tuning G1 better. I am pretty sure I saw the same anomaly (frantic cycles) with G1 on JDK 6 as well (1.6.0_24), so it seems more a G1 behavior than JDK's. I can fairly easily reproduce the problem on a daily basis, so any suggestion to track down this issue is welcome (like additional command line switches, or confirmation that it happens with JDK 6 as well, etc.) Thanks! Simon -- http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz -------------- next part -------------- A non-text attachment was scrubbed...
Name: g1_spin.log.gz Type: application/x-gzip Size: 52987 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20110429/d2e8fb4c/attachment-0001.bin From simone.bordet at gmail.com Fri Apr 29 11:16:47 2011 From: simone.bordet at gmail.com (Simone Bordet) Date: Fri, 29 Apr 2011 20:16:47 +0200 Subject: G1 feedback: frantic GC cycles In-Reply-To: References: Message-ID: Forgot to say that this is on: Linux pitt 2.6.38-8-generic #42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux Ubuntu 11.04 and it's on a real computer (i.e. it's not a virtualized host). Simon -- http://bordet.blogspot.com --- Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz From John.Coomes at oracle.com Fri Apr 29 11:30:36 2011 From: John.Coomes at oracle.com (John Coomes) Date: Fri, 29 Apr 2011 11:30:36 -0700 Subject: CRR: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this (S) In-Reply-To: <4DB9C697.6020408@oracle.com> References: <4DA4A9CE.5070107@oracle.com> <4DAF3A89.7010706@oracle.com> <4DB2081E.5020701@oracle.com> <4DB918CE.2080105@oracle.com> <4DB9C697.6020408@oracle.com> Message-ID: <19899.972.548741.863279@oracle.com> Tony Printezis (tony.printezis at oracle.com) wrote: > ... > > For example, I think I would prefer that concurrentMark.cpp had calls > > like: > > > > DEBUG_ONLY(force_overflow()->update()); > > > > rather than having this in concurrentMark.hpp: > > > > void update() PRODUCT_RETURN; > > > > That way someone browsing the code in concurrentMark.cpp can quickly > > see the difference between test code and product code. I realize that > > this might be a matter of taste, so I'm ok with the way it is now as > > well.
> > I do see the point you and Jesper (he made in his subsequent e-mail) > and, of course :-), we're generally not consistent in taking one > approach over the other in HotSpot. Having said this, this is the sort > of situation for which the PRODUCT_RETURN macro was introduced in the > first place. I personally like using PRODUCT_RETURN since it cuts down > the clutter. Compare: > > void foo() PRODUCT_RETURN; > > #ifndef PRODUCT > void foo() { > ... > } > #endif > > foo(); > > with: > > NOT_PRODUCT(void foo();) > > #ifndef PRODUCT > void foo() { > ... > } > #endif > > NOT_PRODUCT(foo();) > > > But I'd be OK with the latter for the reasons you and Jesper brought up. > Anyone else in group have a strong opinion one way or another? Not a strong opinion, but I favor the latter a bit. It also depends on the code in question. If the use point is something like verify_card_table() then it's pretty obvious that it's non-product code. But if the use point is more generic like 'update()' or 'invalidate()' then you have to go look. -John From igor.veresov at oracle.com Fri Apr 29 12:23:37 2011 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 29 Apr 2011 12:23:37 -0700 Subject: review(XS) 7040485: Use transparent huge page on linux by default In-Reply-To: <4DBA1844.50005@oracle.com> References: <4DB9E501.4050003@oracle.com> <4DB9E9FE.6060608@oracle.com> <4DBA0D44.7000702@oracle.com> <4DBA1844.50005@oracle.com> Message-ID: <4DBB1039.4010503@oracle.com> Thanks, Ramki! On 4/28/11 6:45 PM, Y. S. Ramakrishna wrote: > Looks good to me! > > On 04/28/11 17:58, Igor Veresov wrote: >> Ramki, >> >> Sure! >> >> Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.01/ >> >> igor >> >> On 4/28/11 3:28 PM, Y. S. Ramakrishna wrote: >>> Looks good to me. I guess I missed this nuance in my >>> previous review. Thanks also for fixing the return value.
>>> >>> I had one comment about large_page_init(): it returns a >>> boolean value and sets the value of UseLargePages (on >>> Solaris and Linux). The caller then uses the return >>> value to set the value of UseLargePages again. Seems >>> a bit unpleasant. [Windows doesn't set the value >>> in large_page_init(); the caller uses the return >>> value to set UseLargePages.] >>> >>> What if the return value was just discarded and the >>> value of the flag always set in the method itself >>> prior to return? (You would then change the method >>> in the Windows implementation, and change the callers >>> in the Solaris and Linux versions.) >>> >>> But of course all of that predates your current changes, >>> and is not directly related to the synopsis, so would >>> just be a small related clean-up you might want to roll into >>> your changeset. Your choice. >>> >>> -- ramki >>> >>> On 04/28/11 15:06, Igor Veresov wrote: >>>> We should enable the use of transparent huge pages on Linux by >>>> default. The solution is to set the UseLargePages flag to true by >>>> default, but try only using UseHugeTLBFS by default if UseLargePages >>>> is not specified on the command line. We would try both UseSHM and >>>> UseHugeTLBFS methods if UseLargePages is explicitly set. >>>> This way we maintain compatibility and do not start eating into shared >>>> memory by default. >>>> >>>> Webrev: http://cr.openjdk.java.net/~iveresov/7040485/webrev.00/ >>>> >>>> >>>> Thanks, >>>> igor >> From tony.printezis at oracle.com Fri Apr 29 12:32:08 2011 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Fri, 29 Apr 2011 19:32:08 +0000 Subject: hg: jdk7/hotspot-gc/hotspot: 7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this.
Message-ID: <20110429193209.D49D84710B@hg.openjdk.java.net>

Changeset: cd8e33b2a8ad
Author: tonyp
Date: 2011-04-29 12:40 -0400
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/cd8e33b2a8ad

7034139: G1: assert(Thread::current()->is_ConcurrentGC_thread()) failed: only a conc GC thread can call this.
Summary: We were calling STS join and leave during a STW pause and we are not supposed to. I now only call those during the concurrent phase. I also added stress code in the non-product builds to force overflows (the condition that was uncovering the bug) to make sure it does not happen again.
Reviewed-by: johnc, brutisso

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/concurrentMark.hpp
! src/share/vm/gc_implementation/g1/g1_globals.hpp

From y.s.ramakrishna at oracle.com Fri Apr 29 12:57:46 2011
From: y.s.ramakrishna at oracle.com (Y. Srinivas Ramakrishna)
Date: Fri, 29 Apr 2011 12:57:46 -0700
Subject: CMS option for Java7 ignored (was Re: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?)
In-Reply-To: <4DBA675F.70801@xs4all.nl>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl> <4DBA57BC.40604@oracle.com> <4DBA675F.70801@xs4all.nl>
Message-ID: <4DBB183A.6090608@oracle.com>

John,

You brought up several related but somewhat orthogonal issues, so it is
best to deal with each in a separate sub-thread of the main thread.

On 4/29/2011 12:23 AM, John Hendrikx wrote:
> .... I don't know how to activate CMS for Java 7, it ignores the option that I'd use
> for Java 6.
...
> Java6: -ea -Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc
...
>>>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC

Are you saying that if you do:

-Xms256M -Xmx256M -XX:UseConcMarkSweepGC -verbose:gc

you do not get CMS? If not, what does the GC log say?
Can you provide the following details:

% java -version

and also

% jinfo <pid>

as well as:

% jinfo -flag UseConcMarkSweepGC <pid>

where <pid> is your JVM process.
The main issue will be dealt with in the original thread.
Sorry for the digression.

-- ramki
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From jon.masamitsu at oracle.com Fri Apr 29 13:21:55 2011
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Fri, 29 Apr 2011 13:21:55 -0700
Subject: 1.7 G1GC significantly slower than 1.6 Mark and Sweep?
In-Reply-To: <4DBA5640.9080203@xs4all.nl>
References: <4DB6DDB5.4040804@xs4all.nl> <4DB96AE1.2020202@oracle.com> <4DBA5640.9080203@xs4all.nl>
Message-ID: <4DBB1DE3.9070304@oracle.com>

John,

If you do additional runs to look at this issue, please add
-XX:+PrintGCTimeStamps. It helps with plotting vs. time. Thanks.

Jon

On 4/28/2011 11:10 PM, John Hendrikx wrote:
> I tried many -XX:MaxGCPauseMillis settings, including not setting it
> at all, 20, 10, 5, 2. The results were similar each time -- it didn't
> really have much of an effect. In retrospect you might say that the
> total CPU use is what is causing the problems, not necessarily the
> length of the pauses -- whether this extra CPU use is caused by the
> collector or because of some other change in Java 7 I do not know; the
> program is the same. Is there perhaps another collector that I could
> try to see if this lowers CPU use? Or settings (even non-GC related)
> that could lower CPU use?
>
> Java 6's CMS I didn't need to tune. After determining that the length
> of GC pauses was causing problems in the application, I tried turning
> CMS on and it resolved the problems.
>
> What I observe is that even though with Java 7 the pauses seem (are?)
> very short, the CPU use is a lot higher (from 65% under Java 6 to 95%
> with 7). This could be related to other causes (perhaps threading
> overhead, debug code in Java 7, etc) but I doubt it is in any specific
> Java code that I wrote as most of the heavy lifting is happening in
> native methods.
> It could for example be that several ByteBuffers
> being used are being copied under Java 7 while under 6 direct access
> was possible.
>
> John.
>
> Jon Masamitsu wrote:
>> John,
>>
>> You're telling G1 (UseG1GC) to limit pauses to 2ms
>> (-XX:MaxGCPauseMillis=2) but seem to have tuned
>> CMS (UseConcMarkSweepGC) toward a 20ms goal.
>> G1 is trying to do very short collections and needs to do many
>> of them to keep up with the allocation rate. Did you
>> mean you are setting MaxGCPauseMillis to 20?
>>
>> Jon
>>
>> On 4/26/2011 7:59 AM, John Hendrikx wrote:
>>> Hi list,
>>>
>>> I've been testing Java 1.6 performance vs Java 1.7 performance with a
>>> timing critical application -- it's essential that garbage collection
>>> pauses are very short. What I've found is that Java 1.6 seems to
>>> perform significantly better than 1.7 (b137) in this respect, although
>>> with certain settings 1.6 will also fail catastrophically. I've used
>>> the following options:
>>>
>>> For 1.6.0_22: -Xms256M -Xmx256M -XX:+UseConcMarkSweepGC
>>> For 1.7.0b137: -Xms256M -Xmx256M -XX:+UseG1GC -XX:MaxGCPauseMillis=2
>>>
>>> The amount of garbage created is roughly 150 MB/sec. The application
>>> demands a response time of about 20 ms and uses half a dozen threads
>>> which deal with buffering and decoding of information.
>>>
>>> With the above settings, the 1.6 VM will meet this goal over a 2 minute
>>> period >99% of the time (with an average CPU consumption of 65% per CPU
>>> core for two cores) -- from verbosegc I gather that the pause times are
>>> around 0.01-0.02 seconds:
>>>
>>> [GC 187752K->187559K(258880K), 0.0148198 secs]
>>> [GC 192156K(258880K), 0.0008281 secs]
>>> [GC 144561K->144372K(258880K), 0.0153497 secs]
>>> [GC 148965K(258880K), 0.0008028 secs]
>>> [GC 166187K->165969K(258880K), 0.0146546 secs]
>>> [GC 187935K->187754K(258880K), 0.0150638 secs]
>>> [GC 192344K(258880K), 0.0008422 secs]
>>>
>>> Giving the 1.6 VM more RAM (-Xms1G -Xmx1G) increases these times a bit.
>>> It can also introduce OutOfMemory conditions and other catastrophic
>>> failures (one time the GC took 10 seconds after the application had
>>> only been running 20 seconds). How stably 1.6 will perform with the
>>> initial settings remains to be seen; the results with more RAM worry
>>> me somewhat.
>>>
>>> The 1.7 VM however performs significantly worse. Here is some of its
>>> output (over roughly a one second period):
>>>
>>> [GC concurrent-mark-end, 0.0197681 sec]
>>> [GC remark, 0.0030323 secs]
>>> [GC concurrent-count-start]
>>> [GC concurrent-count-end, 0.0060561]
>>> [GC cleanup 177M->103M(256M), 0.0005319 secs]
>>> [GC concurrent-cleanup-start]
>>> [GC concurrent-cleanup-end, 0.0000676]
>>> [GC pause (partial) 136M->136M(256M), 0.0046206 secs]
>>> [GC pause (partial) 139M->139M(256M), 0.0039039 secs]
>>> [GC pause (partial) (initial-mark) 158M->157M(256M), 0.0039424 secs]
>>> [GC concurrent-mark-start]
>>> [GC concurrent-mark-end, 0.0152915 sec]
>>> [GC remark, 0.0033085 secs]
>>> [GC concurrent-count-start]
>>> [GC concurrent-count-end, 0.0085232]
>>> [GC cleanup 163M->129M(256M), 0.0004847 secs]
>>> [GC concurrent-cleanup-start]
>>> [GC concurrent-cleanup-end, 0.0000363]
>>>
>>> From the above output one would not expect the performance to be
>>> worse; however, the application fails to meet its goals 10-20% of the
>>> time. The amount of garbage created is the same. CPU time however is
>>> hovering around 90-95%, which is likely the cause of the poor
>>> performance. The GC seems to take a significantly larger amount of
>>> time to do its work, causing these stalls in my test application.
>>>
>>> I've experimented with memory sizes and max pause times with the 1.7
>>> VM, and although it seemed to be doing better with more RAM, it never
>>> comes even close to the performance observed with the 1.6 VM.
>>>
>>> I'm not sure if there are other useful options I can try to see if I
>>> can tune the 1.7 VM performance a bit better.
>>> I can provide more
>>> information, although not any (useful) source code at this time due to
>>> external dependencies (JNA/JNI) of this application.
>>>
>>> I'm wondering if I'm missing something as it seems strange to me that
>>> 1.7 is actually underperforming for me when in general most seem to
>>> agree that the G1GC is a huge improvement.
>>>
>>> --John
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From tony.printezis at oracle.com Fri Apr 29 18:36:15 2011
From: tony.printezis at oracle.com (tony.printezis at oracle.com)
Date: Sat, 30 Apr 2011 01:36:15 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 7035144: G1: nightly failure: Non-dirty cards in region that should be dirty (failures still exist...)
Message-ID: <20110430013618.AAEF14711E@hg.openjdk.java.net>

Changeset: 063382f9b575
Author: tonyp
Date: 2011-04-29 14:59 -0400
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/063382f9b575

7035144: G1: nightly failure: Non-dirty cards in region that should be dirty (failures still exist...)
Summary: We should only undirty cards after we decide that they are not on a young region, not before. The fix also includes improvements to the verify_dirty_region() method, which now prints out which cards were not found dirty.
Reviewed-by: johnc, brutisso

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp
! src/share/vm/gc_implementation/g1/g1RemSet.cpp
! src/share/vm/gc_implementation/g1/heapRegion.cpp
! src/share/vm/gc_implementation/g1/heapRegion.hpp
! src/share/vm/memory/cardTableModRefBS.cpp
! src/share/vm/memory/cardTableModRefBS.hpp
! src/share/vm/memory/modRefBarrierSet.hpp

From igor.veresov at oracle.com Fri Apr 29 22:40:34 2011
From: igor.veresov at oracle.com (igor.veresov at oracle.com)
Date: Sat, 30 Apr 2011 05:40:34 +0000
Subject: hg: jdk7/hotspot-gc/hotspot: 2 new changesets
Message-ID: <20110430054038.540E84712F@hg.openjdk.java.net>

Changeset: 188c9a5d6a6d
Author: iveresov
Date: 2011-04-29 12:39 -0700
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/188c9a5d6a6d

7040485: Use transparent huge page on linux by default
Summary: Turn on UseLargePages by default but try only HugeTLBFS method if it is not explicitly specified on the command line.
Reviewed-by: ysr

! src/os/linux/vm/globals_linux.hpp
! src/os/linux/vm/os_linux.cpp
! src/os/solaris/vm/os_solaris.cpp
! src/os/windows/vm/os_windows.cpp
! src/share/vm/runtime/os.hpp

Changeset: 6dd3d74b2674
Author: iveresov
Date: 2011-04-29 20:42 -0700
URL: http://hg.openjdk.java.net/jdk7/hotspot-gc/hotspot/rev/6dd3d74b2674

Merge

From igor.veresov at oracle.com Sat Apr 30 00:04:57 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Sat, 30 Apr 2011 00:04:57 -0700
Subject: G1 feedback: frantic GC cycles
In-Reply-To:
References:
Message-ID: <4DBBB499.1080607@oracle.com>

This is pretty weird. Simone, could you run it with -XX:+PrintHeapAtGC?
It would be nice to see how big the survivor spaces are. It seems like
at the time a collection happens there is only one young region.

igor

On 4/29/11 10:59 AM, Simone Bordet wrote:
> Hi,
>
> I am running with JDK 7:
>
> java version "1.7.0-ea"
> Java(TM) SE Runtime Environment (build 1.7.0-ea-b139)
> Java HotSpot(TM) 64-Bit Server VM (build 21.0-b09, mixed mode)
>
> The application is IntelliJ IDEA, and I am using these command line options:
>
> -Xms1024m
> -Xmx1024m
> -Xmn512m
> -XX:MaxPermSize=256m
> -ea
> -verbose:gc
> -XX:+PrintGCDetails
> -XX:+DisableExplicitGC
> -XX:+PrintGCDateStamps
> -XX:+UnlockExperimentalVMOptions
> -XX:+UseG1GC
> -XX:+TieredCompilation
> -XX:+PrintCommandLineFlags
> -XX:+AggressiveOpts
>
> The application works well, I use it heavily for coding, and G1 works
> well: small pauses (which is what I like when I have keyboard-based
> mindless automatisms) and stable.
>
> However, from time to time, say 2-3 times a day, I experience a few
> seconds of frantic GC cycles from G1, for which I have attached a log.
> The log starts at 18:44:10 (previous logging was fine and I hope not
> relevant), and the frantic cycles start at 19:14:08 (around line 4741
> of the log) and last roughly 20 seconds.
> During these 20 seconds, the application was unresponsive (that is
> what allowed me to detect these frantic GC cycles) so I stopped
> working on it to look at the GC logs and waited until the GC came back
> to normal activity (i.e. the frantic cycles stopped).
>
> After those 20 seconds, G1 seems to be able to go back to normal
> activity and the application works well again.
> During those 20 seconds, there has been roughly one collection every
> 100 ms or so.
>
> It is entirely possible that the application is leaking (I'm just a
> user, not a developer of the application).
> I was wondering if someone saw this before ?
>
> From the logs and jconsole, it seems that the old generation is quite
> full (jconsole reported 540 M occupied), but I still have ~512 M
> of young generation available (and jconsole confirms that a
> young GC brings the new generation to 6 M or less).
>
> Other than using G1 for the IDE, I often consult on GC tuning, so any
> additional information or improvement to G1 is good news for me, and
> understanding this behavior will help with suggesting G1 to customers or
> tuning G1 better.
>
> I am pretty sure I saw the same anomaly (frantic cycles) with G1 on
> JDK 6 as well (1.6.0_24), so it seems more a G1 behavior than JDK's.
>
> I can fairly easily reproduce the problem on a daily basis, so any
> suggestion to track down this issue is welcome (like additional
> command line switches, or confirmation that it happens with JDK 6 as
> well, etc.)
>
> Thanks !
>
> Simon

From sbordet at intalio.com Sat Apr 30 01:38:35 2011
From: sbordet at intalio.com (Simone Bordet)
Date: Sat, 30 Apr 2011 10:38:35 +0200
Subject: G1 feedback: frantic GC cycles
In-Reply-To: <4DBBB499.1080607@oracle.com>
References: <4DBBB499.1080607@oracle.com>
Message-ID:

Hi,

On Sat, Apr 30, 2011 at 09:04, Igor Veresov wrote:
> This is pretty weird. Simone, could you run it with -XX:+PrintHeapAtGC, it
> would be nice to see how big the survivor spaces are.

Ok.

> It seems like at the time a collection happens there is only one young region.

Can I ask how do you know that ? From the logs ?

Thanks !

Simon
--
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless. Victoria Livschitz

From igor.veresov at oracle.com Sat Apr 30 02:44:51 2011
From: igor.veresov at oracle.com (Igor Veresov)
Date: Sat, 30 Apr 2011 02:44:51 -0700
Subject: G1 feedback: frantic GC cycles
In-Reply-To:
References: <4DBBB499.1080607@oracle.com>
Message-ID: <4DBBDA13.1060308@oracle.com>

On 4/30/11 1:38 AM, Simone Bordet wrote:
> Hi,
>
> On Sat, Apr 30, 2011 at 09:04, Igor Veresov wrote:
>> This is pretty weird. Simone, could you run it with -XX:+PrintHeapAtGC, it
>> would be nice to see how big the survivor spaces are.
>
> Ok.
>
>> It seems like at the time a collection happens there is only one young region.
>
> Can I ask how do you know that ? From the logs ?

Actually I was mistaken, it's not always like that, but there is one
instance. These are the sizes at the end of four consecutive
collections:

[ 518M->517M(1024M)]
[ 518M->517M(1024M)]
[ 518M->518M(1024M)]
[ 519M->518M(1024M)]

It does look like we've filled about 1M before the next collection
happened, which is the default size of a region. PrintHeapAtGC should
give more details on how many regions there are in each portion of the
heap.

igor
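
The "about 1M between collections" reading above can be checked mechanically. The following is only a rough sketch, not part of any JDK tool: it assumes the heap-size entries have exactly the "[ before->after(capacity)]" shape quoted in the message, and the names LOG, ENTRY, parse_entries and allocated_between are made up for this illustration. Since the log only reports whole megabytes, the result is an approximation.

```python
import re

# Heap occupancy entries copied from the message above:
# before->after(capacity), all in MB.
LOG = """
[ 518M->517M(1024M)]
[ 518M->517M(1024M)]
[ 518M->518M(1024M)]
[ 519M->518M(1024M)]
"""

# Hypothetical pattern for this exact entry shape only.
ENTRY = re.compile(r"\[\s*(\d+)M->(\d+)M\((\d+)M\)\]")


def parse_entries(text):
    """Return a list of (before_mb, after_mb, capacity_mb) tuples."""
    return [tuple(int(g) for g in m.groups()) for m in ENTRY.finditer(text)]


def allocated_between(entries):
    """MB allocated between the end of one collection and the start of the next."""
    return [nxt[0] - prev[1] for prev, nxt in zip(entries, entries[1:])]


entries = parse_entries(LOG)
deltas = allocated_between(entries)
print(deltas)  # [1, 1, 1]
```

Each delta is the occupancy at the start of a collection minus the occupancy at the end of the previous one. All three come out at 1M, which matches the region size mentioned above and is consistent with a collection being triggered after roughly one region's worth of allocation.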