From john.cuthbertson at oracle.com Tue Jan 3 10:55:56 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 03 Jan 2012 10:55:56 -0800 Subject: RFR(S): 7121496: G1: do the per-region evacuation failure handling work in parallel In-Reply-To: <4EFB058A.4090304@oracle.com> References: <4EF4D6A8.2070301@oracle.com> <4EFB058A.4090304@oracle.com> Message-ID: <4F034F3C.9050804@oracle.com> Hi Tony, Thanks for looking over the code change. On 12/28/11 04:03, Tony Printezis wrote: > John, > > Thanks for doing this, it looks good. A few comments: > > g1CollectedHeap.cpp: > > 4102 assert(_g1h->g1_policy()->assertMarkedBytesDataOK(), "Should > be!"); > > Since this will now be executed concurrently with other workers doing > the self-forward removal it might pick up inconsistent information > (not 100% sure that this will be the case, but I'd like to be careful!). > Why don't you stop calling it per region and only call it once at the > start of remove_self_forwarding_pointers() (it's already called at the > end). Sure. No problem. > 4192 // Now reset the claim values in the regions in the collection > set. > 4193 ResetClaimValuesClosure reset_cv_cl; > 4194 collection_set_iterate(&reset_cv_cl); > > This is fine but we already have: > > reset_heap_region_claim_values() > check_heap_region_claim_values() > > and > > check_cset_heap_region_claim_values() > > Can you maybe introduce reset_cset_heap_region_claim_values() for > consistency? Again no problem. > > BTW, I liked that you declared the update_rset_cl once per task. Thanks. It seemed the most natural way to do it. I'll also follow Igor's suggestion of moving these closures into a new .hpp file. 
Thanks, JohnC From john.coomes at oracle.com Tue Jan 3 13:02:38 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Tue, 03 Jan 2012 21:02:38 +0000 Subject: hg: hsx/hotspot-gc/langtools: 12 new changesets Message-ID: <20120103210307.8059047865@hg.openjdk.java.net> Changeset: 4822dfe0922b Author: ohair Date: 2011-12-12 08:15 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/4822dfe0922b 7119829: Adjust default jprt testing configuration Reviewed-by: alanb ! make/jprt.properties Changeset: 3809292620c9 Author: jjg Date: 2011-12-13 11:21 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/3809292620c9 7120736: refactor javac option handling Reviewed-by: mcimadamore ! src/share/classes/com/sun/tools/javac/api/JavacTool.java ! src/share/classes/com/sun/tools/javac/code/Source.java ! src/share/classes/com/sun/tools/javac/comp/Check.java ! src/share/classes/com/sun/tools/javac/comp/Enter.java ! src/share/classes/com/sun/tools/javac/comp/Lower.java ! src/share/classes/com/sun/tools/javac/file/Locations.java ! src/share/classes/com/sun/tools/javac/jvm/ClassReader.java ! src/share/classes/com/sun/tools/javac/jvm/ClassWriter.java ! src/share/classes/com/sun/tools/javac/jvm/Gen.java ! src/share/classes/com/sun/tools/javac/jvm/Target.java ! src/share/classes/com/sun/tools/javac/main/JavaCompiler.java ! src/share/classes/com/sun/tools/javac/main/Main.java ! src/share/classes/com/sun/tools/javac/nio/JavacPathFileManager.java ! src/share/classes/com/sun/tools/javac/processing/JavacProcessingEnvironment.java ! src/share/classes/com/sun/tools/javac/util/BaseFileManager.java ! src/share/classes/com/sun/tools/javac/util/Log.java ! src/share/classes/com/sun/tools/javac/util/Options.java ! 
test/tools/javac/diags/examples/UnsupportedEncoding.java Changeset: 4e4fed1d02f9 Author: jjg Date: 2011-12-13 14:33 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/4e4fed1d02f9 7121164: renamed files not committed Reviewed-by: ksrini - src/share/classes/com/sun/tools/javac/main/JavacOption.java + src/share/classes/com/sun/tools/javac/main/Option.java + src/share/classes/com/sun/tools/javac/main/OptionHelper.java - src/share/classes/com/sun/tools/javac/main/OptionName.java - src/share/classes/com/sun/tools/javac/main/RecognizedOptions.java Changeset: 4261dc8af622 Author: jjg Date: 2011-12-14 16:16 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/4261dc8af622 7111022: javac no long prints last round of processing 7121323: Sqe tests using -Xstdout option fail with an invalid flag error message Reviewed-by: darcy ! src/share/classes/com/sun/tools/javac/main/Option.java ! src/share/classes/com/sun/tools/javac/processing/JavacProcessingEnvironment.java ! src/share/classes/com/sun/tools/javac/util/Log.java ! test/tools/javac/4846262/Test.sh + test/tools/javac/processing/options/testPrintProcessorInfo/TestWithXstdout.java ! test/tools/javac/util/T6597678.java Changeset: 281eeedf9755 Author: jjg Date: 2011-12-14 17:52 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/281eeedf9755 7121681: compiler message file broken for javac -fullversion Reviewed-by: jjh ! src/share/classes/com/sun/tools/javac/main/Option.java Changeset: 42ffceeceeca Author: jjg Date: 2011-12-14 21:52 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/42ffceeceeca 7121682: remove obsolete import Reviewed-by: jjh ! 
test/tools/javac/api/T6838467.java Changeset: ab2a880cc23b Author: lana Date: 2011-12-15 19:53 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/ab2a880cc23b Merge Changeset: 6b773fdeb633 Author: jjg Date: 2011-12-16 13:49 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/6b773fdeb633 7121961: javadoc is missing a resource property Reviewed-by: bpatel ! src/share/classes/com/sun/tools/doclets/formats/html/resources/standard.properties Changeset: a7a2720c7897 Author: jjh Date: 2011-12-16 16:41 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/a7a2720c7897 7122342: testPrintProcessorInfo/TestWithXstdout.java failed for JDK8 nightly build at 12/16/2011 Summary: Do not pass empty args to javac Reviewed-by: jjg ! test/tools/javac/processing/options/testPrintProcessorInfo/TestWithXstdout.java Changeset: 1ae5988e201b Author: mcimadamore Date: 2011-12-19 12:07 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/1ae5988e201b 7120463: Fix method reference parser support in order to avoid ambiguities Summary: Add lookahead routine to disambiguate between method reference in method context and binary expression Reviewed-by: jjg, dlsmith ! src/share/classes/com/sun/tools/javac/parser/JavacParser.java ! test/tools/javac/lambda/MethodReferenceParserTest.java Changeset: 77b2c066084c Author: lana Date: 2011-12-23 16:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/77b2c066084c Merge - src/share/classes/com/sun/tools/javac/main/JavacOption.java - src/share/classes/com/sun/tools/javac/main/OptionName.java - src/share/classes/com/sun/tools/javac/main/RecognizedOptions.java Changeset: ffd294128a48 Author: katleman Date: 2011-12-29 15:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/ffd294128a48 Added tag jdk8-b19 for changeset 77b2c066084c ! 
.hgtags From stefan.karlsson at oracle.com Wed Jan 4 06:58:49 2012 From: stefan.karlsson at oracle.com (stefan.karlsson at oracle.com) Date: Wed, 04 Jan 2012 14:58:49 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7125503: Compiling collectedHeap.cpp fails with -Werror=int-to-pointer-cast with g++ 4.6.1 Message-ID: <20120104145853.B252747877@hg.openjdk.java.net> Changeset: b6a04c79ccbc Author: stefank Date: 2012-01-02 10:01 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b6a04c79ccbc 7125503: Compiling collectedHeap.cpp fails with -Werror=int-to-pointer-cast with g++ 4.6.1 Summary: Used uintptr_t and void* for all the casts and checks in test_is_in. Reviewed-by: tonyp, jmasa ! src/share/vm/gc_interface/collectedHeap.cpp From jon.masamitsu at oracle.com Wed Jan 4 09:43:34 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 04 Jan 2012 09:43:34 -0800 Subject: ParNew garbage collection In-Reply-To: <21ED8E3420CDB647B88C7F80A7D64DAC06F04BC659@exnjmb89.nam.nsroot.net> References: <21ED8E3420CDB647B88C7F80A7D64DAC06F04BC659@exnjmb89.nam.nsroot.net> Message-ID: <4F048FC6.30907@oracle.com> Try turning on TraceSafepointCleanupTime. I haven't used it myself. If that's not it, look in share/vm/runtime/globals.hpp for some other flag that traces safepoints. On 1/3/2012 1:36 PM, Darji, Kinnari wrote: > Hello GC team, > I have question regarding ParNew collection. As in logs below, the GC is taking only 0.04 sec but application was stopped for 1.71 sec. What could possibly cause this? Please advise. 
> > 2012-01-03T14:37:04.975-0500: 30982.368: [GC 30982.368: [ParNew > Desired survivor size 19628032 bytes, new threshold 4 (max 4) > - age 1: 4466024 bytes, 4466024 total > - age 2: 3568136 bytes, 8034160 total > - age 3: 3559808 bytes, 11593968 total > - age 4: 1737520 bytes, 13331488 total > : 330991K->18683K(345024K), 0.0357400 secs] 5205809K->4894299K(26176064K), 0.0366240 secs] [Times: user=0.47 sys=0.04, real=0.04 secs] > Total time for which application threads were stopped: 1.7197830 seconds > Application time: 8.4134780 seconds > > > > Thank you > Kinnari > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From ysr1729 at gmail.com Wed Jan 4 09:53:52 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 4 Jan 2012 09:53:52 -0800 Subject: ParNew garbage collection In-Reply-To: <4F048FC6.30907@oracle.com> References: <21ED8E3420CDB647B88C7F80A7D64DAC06F04BC659@exnjmb89.nam.nsroot.net> <4F048FC6.30907@oracle.com> Message-ID: Maybe also +PrintSafepointStatistics (and related parms) to drill down a bit further, although TraceSafepointCleanupTime would probably provide all of the info on a per-safepoint basis. There was an old issue wrt monitor deflation that was fixed a few releases ago, so Kinnari should check the version of the JVM she's running on as well.... 
(There are now a couple of flags related to monitor list handling policies I believe but I have no experience with them and do not have the code in front of me -- make sure to cc the runtime list if that turns out to be the issue again and you are already on a very recent version of the JVM.) -- ramki On Wed, Jan 4, 2012 at 9:43 AM, Jon Masamitsu wrote: > Try turning on TraceSafepointCleanupTime. I haven't used it myself. If > that's not it, look in share/vm/runtime/globals.hpp for some other flag > that traces safepoints. > > > On 1/3/2012 1:36 PM, Darji, Kinnari wrote: > > Hello GC team, > I have question regarding ParNew collection. As in logs below, the GC is taking only 0.04 sec but application was stopped for 1.71 sec. What could possibly cause this? Please advise. > > 2012-01-03T14:37:04.975-0500: 30982.368: [GC 30982.368: [ParNew > Desired survivor size 19628032 bytes, new threshold 4 (max 4) > - age 1: 4466024 bytes, 4466024 total > - age 2: 3568136 bytes, 8034160 total > - age 3: 3559808 bytes, 11593968 total > - age 4: 1737520 bytes, 13331488 total > : 330991K->18683K(345024K), 0.0357400 secs] 5205809K->4894299K(26176064K), 0.0366240 secs] [Times: user=0.47 sys=0.04, real=0.04 secs] > Total time for which application threads were stopped: 1.7197830 seconds > Application time: 8.4134780 seconds > > > > Thank you > Kinnari 
From jon.masamitsu at oracle.com Wed Jan 4 10:22:02 2012 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Wed, 04 Jan 2012 18:22:02 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 5 new changesets Message-ID: <20120104182215.44B4547879@hg.openjdk.java.net> Changeset: 4ceaf61479fc Author: dcubed Date: 2011-12-22 12:50 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4ceaf61479fc 7122253: Instrumentation.retransformClasses() leaks class bytes Summary: Change ClassFileParser::parseClassFile() to use the instanceKlass:_cached_class_file_bytes field to avoid leaking the cache. Reviewed-by: coleenp, acorn, poonam ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/prims/jvmtiEnv.cpp ! src/share/vm/prims/jvmtiExport.cpp ! 
src/share/vm/prims/jvmtiRedefineClasses.cpp Changeset: 4ec93d767458 Author: vladidan Date: 2011-12-26 20:36 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4ec93d767458 Merge Changeset: 3db6ea5ce021 Author: vladidan Date: 2011-12-29 20:09 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/3db6ea5ce021 Merge Changeset: 5ee33ff9b1c4 Author: jmasa Date: 2012-01-03 10:22 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5ee33ff9b1c4 Merge Changeset: 4753e3dda3c8 Author: jmasa Date: 2012-01-04 07:56 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4753e3dda3c8 Merge From john.cuthbertson at oracle.com Wed Jan 4 11:20:50 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 04 Jan 2012 11:20:50 -0800 Subject: CRR (S): 7121623: G1: always be able to reliably calculate the length of a forwarded chunked array In-Reply-To: <4EFAF192.7080605@oracle.com> References: <4EE9208B.5040203@oracle.com> <4EF25CC1.6030105@oracle.com> <4EF43DB8.1080104@oracle.com> <4EF9D448.80305@oracle.com> <4EFAF192.7080605@oracle.com> Message-ID: <4F04A692.8090103@oracle.com> Hi Tony, This looks good to me. JohnC On 12/28/2011 2:38 AM, Tony Printezis wrote: > Ramki, > > Quick follow-up on this. See below. > > On 12/27/2011 09:20 AM, Tony Printezis wrote: >> >>> It is probably true that the post-image's length is not used >>> during GC once it's been copied, but it'd be good to check (I'm >>> especially wary of CMS... but of course >>> this is limited to G1 -- does G1 ever need to scan or iterate over >>> regions that are subject to being copied >>> into during an incremental pause?) >> >> This is of course something I was also worried about. 
In G1 we should >> not be scanning to-space objects that are being copied during GC, not >> only because the length might be incorrect due to this change but >> also because there are no guarantees that the objects are well formed >> (another thread might be in the process of copying them). For all >> regions we copy objects into we call save_marks() so that we never go >> over saved_mark() during scanning. >> > > The above is correct. However your observation made me think of > something related: we do of course scan the to-image of an object > after we copy it to identify what it points to. When the object is > chunked we use oop_iterate_range() to scan each chunk. I checked the > definition of that method and it does not use the object's size / > length when doing the scanning, it relies only on the start / end > parameters passed to it. So, we're safe. :-) I updated the latest > webrev I posted: > > http://cr.openjdk.java.net/~tonyp/7121623/webrev.1/ > > to include the following comment: > 4674 // Process indexes [start,end). It will also process the header > 4675 // along with the first chunk (i.e., the chunk with start == 0). > 4676 // Note that at this point the length field of to_obj_array is not > 4677 // correct given that we are using it to keep track of the next > 4678 // start index. oop_iterate_range() (thankfully!) ignores the length > 4679 // field and only relies on the start / end parameters. It does > 4680 // however return the size of the object which will be incorrect. So > 4681 // we have to ignore it even if we wanted to use it. > 4682 to_obj_array->oop_iterate_range(&_scanner, start, end); > > Tony > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120104/64f7c193/attachment-0001.html From tony.printezis at oracle.com Wed Jan 4 11:42:48 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 04 Jan 2012 14:42:48 -0500 Subject: CRR (S): 7121623: G1: always be able to reliably calculate the length of a forwarded chunked array In-Reply-To: <4F04A692.8090103@oracle.com> References: <4EE9208B.5040203@oracle.com> <4EF25CC1.6030105@oracle.com> <4EF43DB8.1080104@oracle.com> <4EF9D448.80305@oracle.com> <4EFAF192.7080605@oracle.com> <4F04A692.8090103@oracle.com> Message-ID: <4F04ABB8.9070106@oracle.com> Thanks John. Any chance of getting one more review for this so that I can push it? The marking changes rely on it. Thanks, Tony On 1/4/2012 2:20 PM, John Cuthbertson wrote: > Hi Tony, > > This looks good to me. > > JohnC > > On 12/28/2011 2:38 AM, Tony Printezis wrote: >> Ramki, >> >> Quick follow-up on this. See below. >> >> On 12/27/2011 09:20 AM, Tony Printezis wrote: >>> >>>> It is probably true that the post-image's length is not used >>>> during GC once it's been copied, but it'd be good to check (I'm >>>> especially wary of CMS... but of course >>>> this is limited to G1 -- does G1 ever need to scan or iterate over >>>> regions that are subject to being copied >>>> into during an incremental pause?) >>> >>> This is of course something I was also worried about. In G1 we >>> should not be scanning to-space objects that are being copied during >>> GC, not only because the length might be incorrect due to this >>> change but also because there are no guarantees that the objects are >>> well formed (another thread might be in the process of copying >>> them). For all regions we copy objects into we call save_marks() so >>> that we never go over saved_mark() during scanning. >>> >> >> The above is correct. 
However your observation made me think of >> something related: we do of course scan the to-image of an object >> after we copy it to identify what it points to. When the object is >> chunked we use oop_iterate_range() to scan each chunk. I checked the >> definition of that method and it does not use the object's size / >> length when doing the scanning, it relies only on the start / end >> parameters passed to it. So, we're safe. :-) I updated the latest >> webrev I posted: >> >> http://cr.openjdk.java.net/~tonyp/7121623/webrev.1/ >> >> to include the following comment: >> 4674 // Process indexes [start,end). It will also process the header >> 4675 // along with the first chunk (i.e., the chunk with start == 0). >> 4676 // Note that at this point the length field of to_obj_array is not >> 4677 // correct given that we are using it to keep track of the next >> 4678 // start index. oop_iterate_range() (thankfully!) ignores the length >> 4679 // field and only relies on the start / end parameters. It does >> 4680 // however return the size of the object which will be incorrect. So >> 4681 // we have to ignore it even if we wanted to use it. >> 4682 to_obj_array->oop_iterate_range(&_scanner, start, end); >> >> Tony >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120104/6dbeb400/attachment.html From tony.printezis at oracle.com Thu Jan 5 05:24:12 2012 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Thu, 05 Jan 2012 13:24:12 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7113006: G1: excessive ergo output when an evac failure happens Message-ID: <20120105132416.C82DB4789C@hg.openjdk.java.net> Changeset: bacb651cf5bf Author: tonyp Date: 2012-01-05 05:54 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/bacb651cf5bf 7113006: G1: excessive ergo output when an evac failure happens Summary: Introduce a flag that is set when a heap expansion attempt during a GC fails so that we do not constantly attempt to expand the heap when it's going to fail anyway. This not only prevents the excessive ergo output (which is generated when a region allocation fails) but also avoids excessive and ultimately unsuccessful expansion attempts. Reviewed-by: jmasa, johnc ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp From john.cuthbertson at oracle.com Thu Jan 5 11:55:54 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 05 Jan 2012 11:55:54 -0800 Subject: RFR(S): 7121496: G1: do the per-region evacuation failure handling work in parallel In-Reply-To: <4EF4D6A8.2070301@oracle.com> References: <4EF4D6A8.2070301@oracle.com> Message-ID: <4F06004A.5010701@oracle.com> Hi Everyone, I have a new webrev for this CR based upon feedback from Tony and Igor. The biggest difference is the moving of the closures and abstract gang task that removes the self-forwarded pointers into their own header file. The new webrev can be found at: http://cr.openjdk.java.net/~johnc/7121496/webrev.1/ Thanks, JohnC On 12/23/2011 11:29 AM, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers look at this set of changes? 
The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7121496/webrev.0/ > > Summary: > The work that gets done for each heap region in the collection set, in > the event of an evacuation failure, (e.g. removing self-forwarding > pointers, updating the BOT etc.) was serial. I parallelized it by > simply wrapping the work done for each region inside a HeapRegion > closure, whose doHeapRegion method claims a region and does the work > for that region. This HeapRegion closure is, in turn, wrapped in an > AbstractGangTask. > > Testing: GC test suite with both deferred and immediate RSet updates > (in some of the configurations - SPECjbb2000, SPECjbb2005, and > GCBasher can experience a number of evacuation failures); Kitchensink > with a forced evacuation failure mechanism. > > Thanks, > > JohnC > > From tony.printezis at oracle.com Thu Jan 5 12:17:04 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Thu, 05 Jan 2012 15:17:04 -0500 Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces In-Reply-To: <4EFA08D8.8040009@oracle.com> References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> Message-ID: <4F060540.3070005@oracle.com> Hi all, Updated webrev after making some changes based on comments from John (thanks John!): http://cr.openjdk.java.net/~tonyp/6888336/webrev.2/ I'd like to clarify something: this change relies on the array chunking changes (7121623) but the webrev does not include those changes (despite what the index page says). So, if you want to try this patch out you'll need to apply the array chunking changes first. 
Tony On 12/27/2011 01:05 PM, Tony Printezis wrote: > Hi all, > > Here's an updated webrev for this change that takes into account the > new approach of chunking object arrays (see previous e-mails on 7121623): > > http://cr.openjdk.java.net/~tonyp/6888336/webrev.1/ > > If anything, the new approach simplified the code a bit since now > we can always read an object's size from its from-image instead of > having to check one or the other depending on whether it's a chunked > array or not. I also moved the body of some methods from > heapRegion.hpp to the .inline.hpp and .cpp files (as they were getting > a bit large to keep in the .hpp file). > > Tony > > On 12/21/2011 05:37 PM, Tony Printezis wrote: >> Hi all, >> >> I'd like a couple of code reviews for the following non-trivial >> changes (large, not necessarily in lines of code modified but more due >> to the fact that the evacuation pause / concurrent marking >> interaction is changed quite dramatically): >> >> http://cr.openjdk.java.net/~tonyp/6888336/webrev.0/ >> >> Here's some background, motivation, and a summary of the changes (I >> felt that it was important to write a longer than usual explanation): >> >> * Background / Motivation >> >> Each G1 heap region has a field top-at-mark-start (aka TAMS) which >> denotes where the top of the region was when marking started. An >> object is considered implicitly live if it's over TAMS (i.e., it was >> allocated since marking started) or explicitly live if it's below >> TAMS (i.e., it was allocated before marking started) and marked on >> the bitmap. (It follows that it's unnecessary to explicitly mark >> objects over TAMS.) >> >> In fact, we have two copies of the above marking information: "Next >> TAMS / Next Bitmap" and "Prev TAMS / Prev Bitmap". Prev is the copy >> that was obtained by the last marking cycle that was successfully >> completed (so, it is consistent: all live objects should appear as >> live in the prev marking information). 
Next is the copy that will be >> obtained / is currently being obtained and it's not consistent >> because it's not guaranteed to be complete. >> >> G1 uses SATB marking which has the advantage of not requiring objects >> allocated since the start of marking to be visited at all by the >> marking threads (they are implicitly live and they do not need to be >> scanned). So, the active marking cycle can totally ignore objects >> over NTAMS (since they have been allocated since marking started). >> >> The current interaction between evacuation pauses (let's call these >> "GCs" from now on) and concurrent marking is very tricky. Even though >> marking ignores all objects over NTAMS (currently: all objects in >> Eden regions) it still has to visit and mark objects in the Survivors >> regions. But those will be moved by subsequent GCs. So, a GC needs to >> be aware that it's moving objects that have been marked by the >> marking threads and not only propagate those marks but also notify >> the marking threads that said objects have been moved. For that we >> use several data structures: pushes to the global marking stack and >> also to what's referred to as the "region stack" which is only used >> by the GC to push a group of objects instead of pushing them >> individually ("region" here is a memory region and smaller than a G1 >> region). >> >> Additionally, because the marking threads could come across objects >> that could potentially move we have to make sure that we don't leave >> references to regions that have been evacuated on any marking data >> structure. To do that we treat as roots all entries on the taskqueues >> / global stack and drain all SATB buffers (both active buffers and >> also enqueued buffers). >> >> The first issue with the above interaction is that it has performance >> issues. Draining all SATB buffers and scanning the mark stack and >> taskqueues has been shown to be very time-consuming in some cases. 
>> Also, having to check whether objects are marked and propagate the >> marks appropriately during GC is an extra overhead. >> >> The second issue is that it has been shown to be very fragile. We >> have discovered and fixed many issues over time which were subtle and >> hard to reproduce. >> >> We really need to simplify the GC/marking interaction to both improve >> performance of GCs during marking, as well as improve our >> reliability. This changeset does exactly that. >> >> * Explanation of the changes >> >> The goal is to ensure that all the objects that are copied by the GC >> do not need to be visited by the marking threads and as a result do >> not need to be explicitly marked, pushed, etc. >> >> The first observation is that most objects copied during a GC are >> allocated after marking starts and are therefore implicitly live. >> This is the case for all objects on Eden regions, as well as most >> objects on Survivor regions. The only exceptions are objects on the >> Survivor regions during the initial-mark pause. Unfortunately, it's >> not easy to track those separately as they will get mixed in with >> future Survivors. The first decision to deal with this is to turn off >> Survivors during the initial-mark pause. This ensures that each >> subsequent GC will only copy objects that >> have been allocated since marking started and are therefore >> implicitly live (i.e., over NTAMS). This allows us to totally >> eliminate the code that propagates marks during the GC. We just have >> to make sure that all copied objects are over NTAMS. Turning off >> Survivors during an initial-mark pause is a bit of a "big hammer" >> approach, but it will suffice for now. We have ideas on how to >> re-enable them in the future and we'll explore a couple of alternatives. 
>> >> Given that the GC only copies objects that are implicitly marked it >> follows that none of the objects that are copied during any GC should >> appear on either the taskqueues or the global marking stack. Also >> remember that we filter SATB buffers before enqueueing them which >> will filter out all implicitly marked objects. It follows that no >> enqueued SATB buffer should have references to objects that are being >> moved. This leaves the currently active SATB buffers given that the >> code that populates them is unconditional. But if we run the >> filtering on those during each GC such "offending" references are >> also quickly eliminated. So, instead of having to scan all stacks and >> all SATB buffers we only have to filter the active SATB buffers, >> which should be much, much faster. >> >> * Implementation Notes >> >> The actual changes are not too extensive as they mostly >> disable functionality in the GC code. The tricky part was to get the >> TAMS fields correct at various phases (start of copying, start of >> marking, etc.) and especially when an evacuation failure occurs. I >> put all that functionality in methods on HeapRegion which do the >> right thing when a GC starts, a marking starts, etc. >> >> The most important changes are in the "main" GC code, i.e. >> G1ParCopyHelper::do_oop_work() and >> G1ParCopyHelper::copy_to_survivor_space(). Instead of having to >> propagate marks we now only need to mark objects directly reachable >> from roots during the initial-mark pause. The resulting code is much >> simplified (and hopefully more performant!). >> >> I also added a method verify_no_cset_oops() which checks that indeed >> all the marking data structures do not point to regions that are >> being GCed at the start / end of each GC. (BTW, I'm considering >> adding a develop flag to enable this on demand.) >> >> I should point out that this changeset will leave a lot of dead code. 
>> However, I took the decision to keep the changes to a minimum in >> order not to overwhelm the code reviewers and make the important changes >> clearer. (I also discussed this with a couple of potential code >> reviewers and they agreed that this is a good approach.) I >> temporarily added guarantees to ensure that methods that should not >> be called are not called. I will remove all dead code with a future >> push. >> >> I also have to apologize to John Cuthbertson for removing a lot of >> code he's added to deal with various bugs we had in the GC/marking >> interaction. Hopefully the new code will be less fragile compared to >> what we've had so far and John will be able to concentrate on more >> interesting features than trying to track down hard-to-reproduce >> failures! >> >> Tony >> From igor.veresov at oracle.com Thu Jan 5 17:26:41 2012 From: igor.veresov at oracle.com (Igor Veresov) Date: Thu, 5 Jan 2012 17:26:41 -0800 Subject: RFR(S): 7121496: G1: do the per-region evacuation failure handling work in parallel In-Reply-To: <4F06004A.5010701@oracle.com> References: <4EF4D6A8.2070301@oracle.com> <4F06004A.5010701@oracle.com> Message-ID: <0102A6C251A347B182E09F99590A231E@oracle.com> Thanks for doing that! Looks good. igor On Thursday, January 5, 2012 at 11:55 AM, John Cuthbertson wrote: > Hi Everyone, > > I have a new webrev for this CR based upon feedback from Tony and Igor. > The biggest difference is the moving of the closures and abstract gang > task that removes the self-forwarded pointers into their own header file. > > The new webrev can be found at: > http://cr.openjdk.java.net/~johnc/7121496/webrev.1/ > > Thanks, > > JohnC > > On 12/23/2011 11:29 AM, John Cuthbertson wrote: > > Hi Everyone, > > > > Can I have a couple of volunteers look at this set of changes? 
The > > webrev can be found at: > > http://cr.openjdk.java.net/~johnc/7121496/webrev.0/ > > > > Summary: > > The work that gets done for each heap region in the collection set, in > > the event of an evacuation failure, (e.g. removing self-forwarding > > pointers, updating the BOT etc.) was serial. I parallelized it by > > simply wrapping the work done for each region inside a HeapRegion > > closure, whose doHeapRegion method claims a region and does the work > > for that region. This HeapRegion closure is, in turn, wrapped in an > > AbstractGangTask. > > > > Testing: GC test suite with both deferred and immediate RSet updates > > (in some of the configurations - SPECjbb2000, SPECjbb2005, and > > GCBasher can experience a number of evacuation failures); Kitchensink > > with a forced evacuation failure mechanism. > > > > Thanks, > > > > JohnC From john.coomes at oracle.com Thu Jan 5 20:55:50 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 04:55:50 +0000 Subject: hg: hsx/hotspot-gc: Added tag jdk8-b20 for changeset 5a5eaf6374bc Message-ID: <20120106045550.53DE3478BF@hg.openjdk.java.net> Changeset: cc771d92284f Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/cc771d92284f Added tag jdk8-b20 for changeset 5a5eaf6374bc ! .hgtags From john.coomes at oracle.com Thu Jan 5 20:55:57 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 04:55:57 +0000 Subject: hg: hsx/hotspot-gc/corba: Added tag jdk8-b20 for changeset 51d8b6cb18c0 Message-ID: <20120106045559.D1C54478C0@hg.openjdk.java.net> Changeset: f157fc2a71a3 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/f157fc2a71a3 Added tag jdk8-b20 for changeset 51d8b6cb18c0 ! 
.hgtags From john.coomes at oracle.com Thu Jan 5 20:56:07 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 04:56:07 +0000 Subject: hg: hsx/hotspot-gc/jaxp: Added tag jdk8-b20 for changeset f052abb8f374 Message-ID: <20120106045607.9694D478C1@hg.openjdk.java.net> Changeset: d41eeadf5c13 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/d41eeadf5c13 Added tag jdk8-b20 for changeset f052abb8f374 ! .hgtags From john.coomes at oracle.com Thu Jan 5 20:56:14 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 04:56:14 +0000 Subject: hg: hsx/hotspot-gc/jaxws: Added tag jdk8-b20 for changeset 2b2818e3386f Message-ID: <20120106045614.975CE478C2@hg.openjdk.java.net> Changeset: dc2ee8b87884 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/dc2ee8b87884 Added tag jdk8-b20 for changeset 2b2818e3386f ! .hgtags From john.coomes at oracle.com Thu Jan 5 20:57:07 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 04:57:07 +0000 Subject: hg: hsx/hotspot-gc/jdk: 9 new changesets Message-ID: <20120106045910.B0495478C3@hg.openjdk.java.net> Changeset: 172d70c50c65 Author: cgruszka Date: 2011-09-15 13:59 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/172d70c50c65 7066713: Separate demos from the bundles on Solaris and Linux Summary: add new license files to demos and samples, new directory for bundling Reviewed-by: katleman, ohair, billyh ! make/common/Release.gmk ! make/common/shared/Defs-control.gmk Changeset: eaf967fd25c5 Author: cgruszka Date: 2011-10-18 14:21 -0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/eaf967fd25c5 7099017: jdk7u2-dev does not build Summary: changes to skip demo/DEMOS_LICENSE and sample/SAMPLES_LICENSE when building OPENJDK Reviewed-by: ohair, billyh ! 
make/common/Release.gmk Changeset: 39b7f01203c9 Author: cgruszka Date: 2011-11-17 16:57 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/39b7f01203c9 Merge Changeset: b64e7263b4fd Author: cgruszka Date: 2011-11-18 01:03 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/b64e7263b4fd Merge Changeset: e98869ff9f1e Author: cgruszka Date: 2011-12-05 12:41 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e98869ff9f1e Merge - test/java/io/FileDescriptor/FileChannelFDTest.java - test/java/io/etc/FileDescriptorSharing.java Changeset: ffa36a6a46f5 Author: cgruszka Date: 2011-12-16 15:01 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/ffa36a6a46f5 Merge - make/sun/motif12/reorder-i586 - make/sun/motif12/reorder-sparc - make/sun/motif12/reorder-sparcv9 - src/share/native/java/util/zip/zlib-1.2.3/ChangeLog - src/share/native/java/util/zip/zlib-1.2.3/README - src/share/native/java/util/zip/zlib-1.2.3/compress.c - src/share/native/java/util/zip/zlib-1.2.3/crc32.h - src/share/native/java/util/zip/zlib-1.2.3/deflate.c - src/share/native/java/util/zip/zlib-1.2.3/deflate.h - src/share/native/java/util/zip/zlib-1.2.3/gzio.c - src/share/native/java/util/zip/zlib-1.2.3/infback.c - src/share/native/java/util/zip/zlib-1.2.3/inffast.c - src/share/native/java/util/zip/zlib-1.2.3/inffast.h - src/share/native/java/util/zip/zlib-1.2.3/inffixed.h - src/share/native/java/util/zip/zlib-1.2.3/inflate.c - src/share/native/java/util/zip/zlib-1.2.3/inflate.h - src/share/native/java/util/zip/zlib-1.2.3/inftrees.c - src/share/native/java/util/zip/zlib-1.2.3/inftrees.h - src/share/native/java/util/zip/zlib-1.2.3/patches/ChangeLog_java - src/share/native/java/util/zip/zlib-1.2.3/patches/crc32.c.diff - src/share/native/java/util/zip/zlib-1.2.3/patches/inflate.c.diff - src/share/native/java/util/zip/zlib-1.2.3/patches/zconf.h.diff - src/share/native/java/util/zip/zlib-1.2.3/patches/zlib.h.diff - src/share/native/java/util/zip/zlib-1.2.3/trees.c - 
src/share/native/java/util/zip/zlib-1.2.3/trees.h - src/share/native/java/util/zip/zlib-1.2.3/uncompr.c - src/share/native/java/util/zip/zlib-1.2.3/zadler32.c - src/share/native/java/util/zip/zlib-1.2.3/zconf.h - src/share/native/java/util/zip/zlib-1.2.3/zcrc32.c - src/share/native/java/util/zip/zlib-1.2.3/zlib.h - src/share/native/java/util/zip/zlib-1.2.3/zutil.c - src/share/native/java/util/zip/zlib-1.2.3/zutil.h - src/solaris/classes/sun/awt/motif/AWTLockAccess.java - src/solaris/classes/sun/awt/motif/MFontPeer.java - src/solaris/classes/sun/awt/motif/MToolkit.java - src/solaris/classes/sun/awt/motif/MToolkitThreadBlockedHandler.java - src/solaris/classes/sun/awt/motif/MWindowAttributes.java - src/solaris/classes/sun/awt/motif/X11FontMetrics.java - src/solaris/native/sun/awt/MouseInfo.c - src/solaris/native/sun/awt/XDrawingArea.c - src/solaris/native/sun/awt/XDrawingArea.h - src/solaris/native/sun/awt/XDrawingAreaP.h - src/solaris/native/sun/awt/awt_Cursor.h - src/solaris/native/sun/awt/awt_KeyboardFocusManager.h - src/solaris/native/sun/awt/awt_MToolkit.c - src/solaris/native/sun/awt/awt_MToolkit.h - src/solaris/native/sun/awt/awt_MenuItem.h - src/solaris/native/sun/awt/awt_PopupMenu.h - src/solaris/native/sun/awt/awt_TopLevel.h - src/solaris/native/sun/awt/awt_Window.h - src/solaris/native/sun/awt/awt_mgrsel.c - src/solaris/native/sun/awt/awt_mgrsel.h - src/solaris/native/sun/awt/awt_motif.h - src/solaris/native/sun/awt/awt_wm.c - src/solaris/native/sun/awt/awt_wm.h - src/solaris/native/sun/awt/awt_xembed.h - src/solaris/native/sun/awt/awt_xembed_server.c - src/solaris/native/sun/awt/awt_xembed_server.h - test/java/util/ResourceBundle/Control/ExpirationTest.java - test/java/util/ResourceBundle/Control/ExpirationTest.sh Changeset: 5fe1525e6e2c Author: cgruszka Date: 2011-12-23 10:43 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/5fe1525e6e2c Merge Changeset: 39e938cd1b82 Author: cgruszka Date: 2012-01-03 14:34 -0500 URL: 
http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/39e938cd1b82 Merge Changeset: fc050750f8a0 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/fc050750f8a0 Added tag jdk8-b20 for changeset 39e938cd1b82 ! .hgtags From john.coomes at oracle.com Thu Jan 5 21:00:56 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 06 Jan 2012 05:00:56 +0000 Subject: hg: hsx/hotspot-gc/langtools: Added tag jdk8-b20 for changeset ffd294128a48 Message-ID: <20120106050102.A1D17478C4@hg.openjdk.java.net> Changeset: 020819eb56d2 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/020819eb56d2 Added tag jdk8-b20 for changeset ffd294128a48 ! .hgtags From jon.masamitsu at oracle.com Thu Jan 5 23:27:44 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 05 Jan 2012 23:27:44 -0800 Subject: Promotion failures: indication of CMS fragmentation? In-Reply-To: References: <4EF9FCAC.3030208@oracle.com> Message-ID: <4F06A270.3010701@oracle.com> On 1/5/2012 3:32 PM, Taras Tielkes wrote: > Hi Jon, > > We've enabled the PrintPromotionFailure flag in our preprod > environment, but so far, no failures yet. > We know that the load we generate there is not representative. But > perhaps we'll catch something, given enough patience. > > The flag will also be enabled in our production environment next week > - so one way or the other, we'll get more diagnostic data soon. > I'll also do some allocation profiling of the application in isolation > - I know that there is abusive large byte[] and char[] allocation in > there. 
> > I've got two questions for now: > > 1) From googling around on the output to expect > (http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html), > I see that -XX:+PrintPromotionFailure will generate output like this: > ------- > 592.079: [ParNew (0: promotion failure size = 2698) (promotion > failed): 135865K->134943K(138240K), 0.1433555 secs] > ------- > In that example line, what does the "0" stand for? It's the index of the GC worker thread that experienced the promotion failure. > 2) Below is a snippet of (real) gc log from our production application: > ------- > 2011-12-30T22:42:12.684+0100: 2136581.585: [GC 2136581.585: [ParNew: > 345951K->40960K(368640K), 0.0676780 secs] > 3608692K->3323692K(5201920K), 0.0680220 secs] [Times: user=0.36 > sys=0.01, real=0.06 secs] > 2011-12-30T22:42:22.984+0100: 2136591.886: [GC 2136591.886: [ParNew: > 368640K->40959K(368640K), 0.0618880 secs] > 3651372K->3349928K(5201920K), 0.0622330 secs] [Times: user=0.31 > sys=0.00, real=0.06 secs] > 2011-12-30T22:42:23.052+0100: 2136591.954: [GC [1 CMS-initial-mark: > 3308968K(4833280K)] 3350041K(5201920K), 0.0377420 secs] [Times: > user=0.04 sys=0.00, real=0.04 secs] > 2011-12-30T22:42:23.090+0100: 2136591.992: [CMS-concurrent-mark-start] > 2011-12-30T22:42:24.076+0100: 2136592.978: [CMS-concurrent-mark: > 0.986/0.986 secs] [Times: user=2.05 sys=0.04, real=0.99 secs] > 2011-12-30T22:42:24.076+0100: 2136592.978: [CMS-concurrent-preclean-start] > 2011-12-30T22:42:24.099+0100: 2136593.000: [CMS-concurrent-preclean: > 0.021/0.023 secs] [Times: user=0.03 sys=0.00, real=0.02 secs] > 2011-12-30T22:42:24.099+0100: 2136593.001: > [CMS-concurrent-abortable-preclean-start] > CMS: abort preclean due to time 2011-12-30T22:42:29.335+0100: > 2136598.236: [CMS-concurrent-abortable-preclean: 5.209/5.236 secs] > [Times: user=5.70 sys=0.23, real=5.23 secs] > 2011-12-30T22:42:29.340+0100: 2136598.242: [GC[YG occupancy: 123870 K > (368640 K)]2011-12-30T22:42:29.341+0100: 2136598.242: [GC 
2136598.242: > [ParNew (promotion failed): 123870K->105466K(368640K), 7.4939280 secs] > 3432839K->3423755K(5201920 > K), 7.4942670 secs] [Times: user=9.08 sys=2.10, real=7.49 secs] > 2136605.737: [Rescan (parallel) , 0.0644050 secs]2136605.801: [weak > refs processing, 0.0034280 secs]2136605.804: [class unloading, > 0.0289480 secs]2136605.833: [scrub symbol& string tables, 0.0093940 > secs] [1 CMS-remark: 3318289K(4833280K > )] 3423755K(5201920K), 7.6077990 secs] [Times: user=9.54 sys=2.10, > real=7.61 secs] > 2011-12-30T22:42:36.949+0100: 2136605.850: [CMS-concurrent-sweep-start] > 2011-12-30T22:42:45.006+0100: 2136613.907: [Full GC 2136613.908: > [CMS2011-12-30T22:42:51.038+0100: 2136619.939: [CMS-concurrent-sweep: > 12.231/14.089 secs] [Times: user=15.14 sys=5.36, real=14.08 secs] > (concurrent mode failure): 3141235K->291853K(4833280K), 10.2906040 > secs] 3491471K->291853K(5201920K), [CMS Perm : > 121784K->121765K(262144K)], 10.2910040 secs] [Times: user=10.29 > sys=0.00, real=10.29 secs] > 2011-12-30T22:42:56.281+0100: 2136625.183: [GC 2136625.183: [ParNew: > 327680K->25286K(368640K), 0.0287220 secs] 619533K->317140K(5201920K), > 0.0291610 secs] [Times: user=0.13 sys=0.00, real=0.03 secs] > 2011-12-30T22:43:10.516+0100: 2136639.418: [GC 2136639.418: [ParNew: > 352966K->26737K(368640K), 0.0586400 secs] 644820K->338758K(5201920K), > 0.0589640 secs] [Times: user=0.31 sys=0.00, real=0.06 secs] > ------- > > In this case I don't know how to interpret the output. > a) There's a promotion failure that took 7.49 secs This is the time it took to attempt the minor collection (ParNew) and to do recovery from the failure. > b) There's a full GC that took 14.08 secs > c) There's a concurrent mode failure that took 10.29 secs Not sure about b) and c) because the output is mixed up with the concurrent-sweep output but I think the "concurrent mode failure" message is part of the "Full GC" message. 
My guess is that the 10.29 is the time for the Full GC and the 14.08 may be part of the concurrent-sweep message. Really hard to be sure. Jon > How are these events, and their (real) times related to each other? > > Thanks in advance, > Taras > > On Tue, Dec 27, 2011 at 6:13 PM, Jon Masamitsu wrote: >> Taras, >> >> PrintPromotionFailure seems like it would go a long >> way toward identifying the root of your promotion failures (or >> at least eliminating some possible causes). I think it >> would help focus the discussion if you could send >> the result of that experiment early. >> >> Jon >> >> On 12/27/2011 5:07 AM, Taras Tielkes wrote: >>> Hi, >>> >>> We're running an application with the CMS/ParNew collectors that is >>> experiencing occasional promotion failures. >>> Environment is Linux 2.6.18 (x64), JVM is 1.6.0_29 in server mode. >>> I've listed the specific JVM options used below (a). >>> >>> The application is deployed across a handful of machines, and the >>> promotion failures are fairly uniform across those. >>> >>> The first kind of failure we observe is a promotion failure during >>> ParNew collection; I've included a snippet from the gc log below (b). >>> The second kind of failure is a concurrent mode failure (perhaps >>> triggered by the same cause), see (c) below. >>> The frequency (after running for some weeks) is approximately once >>> per day. This is bearable, but obviously we'd like to improve on this. >>> >>> Apart from high-volume request handling (which allocates a lot of >>> small objects), the application also runs a few dozen background >>> threads that download and process XML documents, typically in the 5-30 >>> MB range. >>> A known deficiency in the existing code is that the XML content is >>> copied twice before processing (once to a byte[], and later again to a >>> String/char[]).
>>> Given that a 30 MB XML stream will result in a 60 MB >>> java.lang.String/char[], my suspicion is that these big array >>> allocations are causing us to run into the CMS fragmentation issue. >>> >>> My questions are: >>> 1) Does the data from the GC logs provide sufficient evidence to >>> conclude that CMS fragmentation is the cause of the promotion failure? >>> 2) If not, what's the next step in investigating the cause? >>> 3) We're planning to at least add -XX:+PrintPromotionFailure to get a >>> feeling for the size of the objects that fail promotion. >>> Overall, it seems that -XX:PrintFLSStatistics=1 is actually the only >>> reliable approach to diagnose CMS fragmentation. Is this indeed the >>> case? >>> >>> Thanks in advance, >>> Taras >>> >>> a) Current JVM options: >>> -------------------------------- >>> -server >>> -Xms5g >>> -Xmx5g >>> -Xmn400m >>> -XX:PermSize=256m >>> -XX:MaxPermSize=256m >>> -XX:+PrintGCTimeStamps >>> -verbose:gc >>> -XX:+PrintGCDateStamps >>> -XX:+PrintGCDetails >>> -XX:SurvivorRatio=8 >>> -XX:+UseConcMarkSweepGC >>> -XX:+UseParNewGC >>> -XX:+DisableExplicitGC >>> -XX:+UseCMSInitiatingOccupancyOnly >>> -XX:+CMSClassUnloadingEnabled >>> -XX:+CMSScavengeBeforeRemark >>> -XX:CMSInitiatingOccupancyFraction=68 >>> -Xloggc:gc.log >>> -------------------------------- >>> >>> b) Promotion failure during ParNew >>> -------------------------------- >>> 2011-12-08T18:14:40.966+0100: 219729.868: [GC 219729.868: [ParNew: >>> 368640K->40959K(368640K), 0.0693460 secs] >>> 3504917K->3195098K(5201920K), 0.0696500 secs] [Times: user=0.39 >>> sys=0.01, real=0.07 secs] >>> 2011-12-08T18:14:43.778+0100: 219732.679: [GC 219732.679: [ParNew: >>> 368639K->31321K(368640K), 0.0511400 secs] >>> 3522778K->3198316K(5201920K), 0.0514420 secs] [Times: user=0.28 >>> sys=0.00, real=0.05 secs] >>> 2011-12-08T18:14:46.945+0100: 219735.846: [GC 219735.846: [ParNew: >>> 359001K->18694K(368640K), 0.0272970 secs] >>> 3525996K->3185690K(5201920K), 0.0276080 secs]
[Times: user=0.19 >>> sys=0.00, real=0.03 secs] >>> 2011-12-08T18:14:49.036+0100: 219737.938: [GC 219737.938: [ParNew >>> (promotion failed): 338813K->361078K(368640K), 0.1321200 >>> secs]219738.070: [CMS: 3167747K->434291K(4833280K), 4.8881570 secs] >>> 3505808K->434291K >>> (5201920K), [CMS Perm : 116893K->116883K(262144K)], 5.0206620 secs] >>> [Times: user=5.24 sys=0.00, real=5.02 secs] >>> 2011-12-08T18:14:54.721+0100: 219743.622: [GC 219743.623: [ParNew: >>> 327680K->40960K(368640K), 0.0949460 secs] 761971K->514584K(5201920K), >>> 0.0952820 secs] [Times: user=0.52 sys=0.04, real=0.10 secs] >>> 2011-12-08T18:14:55.580+0100: 219744.481: [GC 219744.482: [ParNew: >>> 368640K->40960K(368640K), 0.1299190 secs] 842264K->625681K(5201920K), >>> 0.1302190 secs] [Times: user=0.72 sys=0.01, real=0.13 secs] >>> 2011-12-08T18:14:58.050+0100: 219746.952: [GC 219746.952: [ParNew: >>> 368640K->40960K(368640K), 0.0870940 secs] 953361K->684121K(5201920K), >>> 0.0874110 secs] [Times: user=0.48 sys=0.01, real=0.09 secs] >>> -------------------------------- >>> >>> c) Promotion failure during CMS >>> -------------------------------- >>> 2011-12-14T08:29:26.628+0100: 703015.530: [GC 703015.530: [ParNew: >>> 357228K->40960K(368640K), 0.0525110 secs] >>> 3603068K->3312743K(5201920K), 0.0528120 secs] [Times: user=0.37 >>> sys=0.00, real=0.05 secs] >>> 2011-12-14T08:29:28.864+0100: 703017.766: [GC 703017.766: [ParNew: >>> 366075K->37119K(368640K), 0.0479780 secs] >>> 3637859K->3317662K(5201920K), 0.0483090 secs] [Times: user=0.24 >>> sys=0.01, real=0.05 secs] >>> 2011-12-14T08:29:29.553+0100: 703018.454: [GC 703018.455: [ParNew: >>> 364792K->40960K(368640K), 0.0421740 secs] >>> 3645334K->3334944K(5201920K), 0.0424810 secs] [Times: user=0.30 >>> sys=0.00, real=0.04 secs] >>> 2011-12-14T08:29:29.600+0100: 703018.502: [GC [1 CMS-initial-mark: >>> 3293984K(4833280K)] 3335025K(5201920K), 0.0272490 secs] [Times: >>> user=0.02 sys=0.00, real=0.03 secs] >>> 2011-12-14T08:29:29.628+0100: 
703018.529: [CMS-concurrent-mark-start] >>> 2011-12-14T08:29:30.718+0100: 703019.620: [GC 703019.620: [ParNew: >>> 368640K->40960K(368640K), 0.0836690 secs] >>> 3662624K->3386039K(5201920K), 0.0839690 secs] [Times: user=0.50 >>> sys=0.01, real=0.08 secs] >>> 2011-12-14T08:29:30.827+0100: 703019.729: [CMS-concurrent-mark: >>> 1.108/1.200 secs] [Times: user=6.83 sys=0.23, real=1.20 secs] >>> 2011-12-14T08:29:30.827+0100: 703019.729: [CMS-concurrent-preclean-start] >>> 2011-12-14T08:29:30.938+0100: 703019.840: [CMS-concurrent-preclean: >>> 0.093/0.111 secs] [Times: user=0.48 sys=0.02, real=0.11 secs] >>> 2011-12-14T08:29:30.938+0100: 703019.840: >>> [CMS-concurrent-abortable-preclean-start] >>> 2011-12-14T08:29:32.337+0100: 703021.239: >>> [CMS-concurrent-abortable-preclean: 1.383/1.399 secs] [Times: >>> user=6.68 sys=0.27, real=1.40 secs] >>> 2011-12-14T08:29:32.343+0100: 703021.244: [GC[YG occupancy: 347750 K >>> (368640 K)]2011-12-14T08:29:32.343+0100: 703021.244: [GC 703021.244: >>> [ParNew (promotion failed): 347750K->347750K(368640K), 9.8729020 secs] >>> 3692829K->3718580K(5201920K), 9.8732380 secs] [Times: user=12.00 >>> sys=2.58, real=9.88 secs] >>> 703031.118: [Rescan (parallel) , 0.2826110 secs]703031.400: [weak refs >>> processing, 0.0014780 secs]703031.402: [class unloading, 0.0176610 >>> secs]703031.419: [scrub symbol& string tables, 0.0094960 secs] [1 CMS >>> -remark: 3370830K(4833280K)] 3718580K(5201920K), 10.1916910 secs] >>> [Times: user=13.73 sys=2.59, real=10.19 secs] >>> 2011-12-14T08:29:42.535+0100: 703031.436: [CMS-concurrent-sweep-start] >>> 2011-12-14T08:29:42.591+0100: 703031.493: [Full GC 703031.493: >>> [CMS2011-12-14T08:29:48.616+0100: 703037.518: [CMS-concurrent-sweep: >>> 6.046/6.082 secs] [Times: user=6.18 sys=0.01, real=6.09 secs] >>> (concurrent mode failure): 3370829K->433437K(4833280K), 10.9594300 >>> secs] 3739469K->433437K(5201920K), [CMS Perm : >>> 121702K->121690K(262144K)], 10.9597540 secs] [Times: user=10.95 >>> sys=0.00, 
real=10.96 secs] >>> 2011-12-14T08:29:53.997+0100: 703042.899: [GC 703042.899: [ParNew: >>> 327680K->40960K(368640K), 0.0799960 secs] 761117K->517836K(5201920K), >>> 0.0804100 secs] [Times: user=0.46 sys=0.00, real=0.08 secs] >>> 2011-12-14T08:29:54.649+0100: 703043.551: [GC 703043.551: [ParNew: >>> 368640K->40960K(368640K), 0.0784460 secs] 845516K->557872K(5201920K), >>> 0.0787920 secs] [Times: user=0.40 sys=0.01, real=0.08 secs] >>> 2011-12-14T08:29:56.418+0100: 703045.320: [GC 703045.320: [ParNew: >>> 368640K->40960K(368640K), 0.0784040 secs] 885552K->603017K(5201920K), >>> 0.0787630 secs] [Times: user=0.41 sys=0.01, real=0.07 secs] >>> -------------------------------- >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From tony.printezis at oracle.com Fri Jan 6 00:51:52 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 06 Jan 2012 03:51:52 -0500 Subject: RFR(S): 7121496: G1: do the per-region evacuation failure handling work in parallel In-Reply-To: <4F06004A.5010701@oracle.com> References: <4EF4D6A8.2070301@oracle.com> <4F06004A.5010701@oracle.com> Message-ID: <4F06B628.60607@oracle.com> John, Looks good, ship it asap. :-) Tony On 01/05/2012 02:55 PM, John Cuthbertson wrote: > Hi Everyone, > > I have a new webrev for this CR based upon feedback from Tony and > Igor. 
The biggest difference is the moving of the closures and > abstract gang task that removes the self-forwarded pointers into their > own header file. > > The new webrev can be found at: > http://cr.openjdk.java.net/~johnc/7121496/webrev.1/ > > Thanks, > > JohnC > > On 12/23/2011 11:29 AM, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers look at this set of changes? The >> webrev can be found at: >> http://cr.openjdk.java.net/~johnc/7121496/webrev.0/ >> >> Summary: >> The work that gets done for each heap region in the collection set, >> in the event of an evacuation failure, (e.g. removing self-forwarding >> pointers, updating the BOT etc.) was serial. I parallelized it by >> simply wrapping the work done for each region inside a HeapRegion >> closure, whose doHeapRegion method claims a region and does the work >> for that region. This HeapRegion closure is, in turn, wrapped in an >> AbstractGangTask. >> >> Testing: GC test suite with both deferred and immediate RSet updates >> (in some of the configurations - SPECjbb2000, SPECjbb2005, and >> GCBasher can experience a number of evacuation failures); Kitchensink >> with a forced evacuation failure mechanism. >> >> Thanks, >> >> JohnC >> >> > From jon.masamitsu at oracle.com Fri Jan 6 11:17:45 2012 From: jon.masamitsu at oracle.com (jon.masamitsu at oracle.com) Date: Fri, 06 Jan 2012 19:17:45 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 11 new changesets Message-ID: <20120106191814.B0771478D3@hg.openjdk.java.net> Changeset: 75c0a73eee98 Author: coleenp Date: 2011-11-17 12:53 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/75c0a73eee98 7102776: Pack instanceKlass boolean fields into single u1 field Summary: Reduce class runtime memory usage by packing 4 instanceKlass boolean fields into single u1 field. Save 4-byte for each loaded class. Reviewed-by: dholmes, bobv, phh, twisti, never, coleenp Contributed-by: Jiangli Zhou !
agent/src/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java ! src/share/vm/code/dependencies.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/instanceKlassKlass.cpp ! src/share/vm/runtime/vmStructs.cpp Changeset: da4dd142ea01 Author: bobv Date: 2011-11-29 14:44 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/da4dd142ea01 Merge ! src/share/vm/code/dependencies.cpp Changeset: 52b5d32fbfaf Author: coleenp Date: 2011-12-06 18:28 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/52b5d32fbfaf 7117052: instanceKlass::_init_state can be u1 type Summary: Change instanceKlass::_init_state field to u1 type. Reviewed-by: bdelsart, coleenp, dholmes, phh, never Contributed-by: Jiangli Zhou ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_Runtime1_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/c1_Runtime1_x86.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/share/vm/ci/ciInstanceKlass.cpp ! src/share/vm/memory/dump.cpp ! src/share/vm/oops/instanceKlass.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/opto/parseHelper.cpp ! src/share/vm/runtime/vmStructs.cpp Changeset: eccc4b1f8945 Author: vladidan Date: 2011-12-07 16:47 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/eccc4b1f8945 7050298: ARM: SIGSEGV in JNIHandleBlock::allocate_handle Summary: missing release barrier in Monitor::IUnlock Reviewed-by: dholmes, dice ! src/share/vm/runtime/mutex.cpp Changeset: 2685ea97b89f Author: jiangli Date: 2011-12-09 11:29 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2685ea97b89f Merge ! 
src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp Changeset: 8fdf463085e1 Author: jiangli Date: 2011-12-16 17:33 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/8fdf463085e1 Merge Changeset: dca455dea3a7 Author: bdelsart Date: 2011-12-20 12:33 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/dca455dea3a7 7116216: StackOverflow GC crash Summary: GC crash for explicit stack overflow checks after a C2I transition. Reviewed-by: coleenp, never Contributed-by: yang02.wang at sap.com, bertrand.delsart at oracle.com ! src/cpu/sparc/vm/stubGenerator_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/x86/vm/stubGenerator_x86_32.cpp ! src/cpu/x86/vm/stubGenerator_x86_64.cpp ! src/cpu/x86/vm/templateInterpreter_x86_32.cpp ! src/cpu/x86/vm/templateInterpreter_x86_64.cpp + test/compiler/7116216/LargeFrame.java + test/compiler/7116216/StackOverflow.java Changeset: cd5d8cafcc84 Author: jiangli Date: 2011-12-28 12:15 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/cd5d8cafcc84 7123315: instanceKlass::_static_oop_field_count and instanceKlass::_java_fields_count should be u2 type. Summary: Change instanceKlass::_static_oop_field_count and instanceKlass::_java_fields_count to u2 type. Reviewed-by: never, bdelsart, dholmes Contributed-by: Jiangli Zhou ! src/share/vm/classfile/classFileParser.cpp ! src/share/vm/classfile/classFileParser.hpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/runtime/vmStructs.cpp Changeset: 05de27e852c4 Author: jiangli Date: 2012-01-04 12:36 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/05de27e852c4 Merge ! 
src/share/vm/classfile/classFileParser.cpp Changeset: 2ee4167627a3 Author: jmasa Date: 2012-01-05 21:02 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2ee4167627a3 Merge Changeset: 5fd354a959c5 Author: jmasa Date: 2012-01-05 21:21 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5fd354a959c5 Merge From john.cuthbertson at oracle.com Fri Jan 6 18:35:28 2012 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Sat, 07 Jan 2012 02:35:28 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7121496: G1: do the per-region evacuation failure handling work in parallel Message-ID: <20120107023535.83E88478DA@hg.openjdk.java.net> Changeset: 023652e49ac0 Author: johnc Date: 2011-12-23 11:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/023652e49ac0 7121496: G1: do the per-region evacuation failure handling work in parallel Summary: Parallelize the removal of self forwarding pointers etc. by wrapping in a HeapRegion closure, which is then wrapped inside an AbstractGangTask. Reviewed-by: tonyp, iveresov ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp + src/share/vm/gc_implementation/g1/g1EvacFailure.hpp ! 
src/share/vm/gc_implementation/g1/heapRegion.hpp From tony.printezis at oracle.com Fri Jan 6 20:43:34 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 06 Jan 2012 23:43:34 -0500 Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces In-Reply-To: <4F060540.3070005@oracle.com> References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> Message-ID: <4F07CD76.9080502@oracle.com> Hi again, Here's an updated webrev after I merged my changes with John's latest push: http://cr.openjdk.java.net/~tonyp/6888336/webrev.3/ The code is basically the same, I just had to move some of it to different places. Testing update: I've been testing the changes continuously on three machines over the holidays and I haven't seen any failures since the single failure over Xmas which was caused by the race in the array chunking changes which has now been resolved. I did additional testing with a patch from John (thanks again!) which artificially forces evacuation failures to stress that code and, again, I saw no issues with that either. Tony On 01/05/2012 03:17 PM, Tony Printezis wrote: > Hi all, > > Updated webrev after making some changes based on comments from John > (thanks John!): > > http://cr.openjdk.java.net/~tonyp/6888336/webrev.2/ > > I'd like to clarify something: this change relies on the array > chunking changes (7121623) but the webrev does not include those > changes (despite what the index page says). So, if you want to try > this patch out you'll need to apply the array chunking changes first. 
> > Tony > > On 12/27/2011 01:05 PM, Tony Printezis wrote: >> Hi all, >> >> Here's an updated webrev for this change that takes into account the >> new approach of chunking object arrays (see previous e-mails on >> 7121623): >> >> http://cr.openjdk.java.net/~tonyp/6888336/webrev.1/ >> >> If anything, the new approach simplified the code a bit since now >> we can always read an object's size from its from-image instead of >> having to check one or the other depending on whether it's a chunked >> array or not. I also moved the body of some methods from >> heapRegion.hpp to the .inline.hpp and .cpp files (as they were >> getting a bit large to keep in the .hpp file). >> >> Tony >> >> On 12/21/2011 05:37 PM, Tony Printezis wrote: >>> Hi all, >>> >>> I'd like a couple of code reviews for the following non-trivial >>> changes (large, not necessarily in lines of code modified but more due >>> to the fact that the evacuation pause / concurrent marking >>> interaction is changed quite dramatically): >>> >>> http://cr.openjdk.java.net/~tonyp/6888336/webrev.0/ >>> >>> Here's some background, motivation, and a summary of the changes (I >>> felt that it was important to write a longer than usual explanation): >>> >>> * Background / Motivation >>> >>> Each G1 heap region has a field top-at-mark-start (aka TAMS) which >>> denotes where the top of the region was when marking started. An >>> object is considered implicitly live if it's over TAMS (i.e., it was >>> allocated since marking started) or explicitly live if it's below >>> TAMS (i.e., it was allocated before marking started) and marked on >>> the bitmap. (It follows that it's unnecessary to explicitly mark >>> objects over TAMS.) >>> >>> In fact, we have two copies of the above marking information: "Next >>> TAMS / Next Bitmap" and "Prev TAMS / Prev Bitmap".
>>> Prev is the copy
>>> that was obtained by the last marking cycle that was successfully
>>> completed (so, it is consistent: all live objects should appear as
>>> live in the prev marking information). Next is the copy that will be
>>> obtained / is currently being obtained and it's not consistent
>>> because it's not guaranteed to be complete.
>>>
>>> G1 uses SATB marking which has the advantage of not requiring objects
>>> allocated since the start of marking to be visited at all by the
>>> marking threads (they are implicitly live and they do not need to be
>>> scanned). So, the active marking cycle can totally ignore objects
>>> over NTAMS (since they have been allocated since marking started).
>>>
>>> The current interaction between evacuation pauses (let's call these
>>> "GCs" from now on) and concurrent marking is very tricky. Even
>>> though marking ignores all objects over NTAMS (currently: all
>>> objects in Eden regions) it still has to visit and mark objects in
>>> the Survivor regions. But those will be moved by subsequent GCs.
>>> So, a GC needs to be aware that it's moving objects that have been
>>> marked by the marking threads and not only propagate those marks but
>>> also notify the marking threads that said objects have been moved.
>>> For that we use several data structures: pushes to the global
>>> marking stack and also to what's referred to as the "region stack"
>>> which is only used by the GC to push a group of objects instead of
>>> pushing them individually ("region" here is a mem region and
>>> smaller than a G1 region).
>>>
>>> Additionally, because the marking threads could come across objects
>>> that could potentially move we have to make sure that we don't leave
>>> references to regions that have been evacuated on any marking data
>>> structure. To do that we treat as roots all entries on the
>>> taskqueues / global stack and drain all SATB buffers (both active
>>> buffers and also enqueued buffers).
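
The liveness rule described in the background above (over TAMS means
implicitly live, below TAMS means live only if marked on the bitmap)
can be sketched roughly as follows. This is a minimal illustration
under stated assumptions, not the HotSpot code: the Region struct, its
fields and is_live() are hypothetical names, and addresses are modelled
as word indices inside a single region.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified model of one heap region (not HotSpot's HeapRegion).
struct Region {
    std::size_t tams;          // top-at-mark-start, as a word index
    std::size_t top;           // current allocation top, as a word index
    std::vector<bool> bitmap;  // one mark bit per word below TAMS
};

// Returns true if the object starting at word index `addr` counts as live
// with respect to this copy of the marking information.
bool is_live(const Region& r, std::size_t addr) {
    if (addr >= r.top) {
        return false;          // nothing has been allocated here yet
    }
    if (addr >= r.tams) {
        return true;           // allocated since marking started: implicitly live
    }
    return r.bitmap[addr];     // allocated before marking: live only if marked
}
```

With tams = 4 and top = 8, any object at indices 4..7 is live without
ever touching the bitmap, which is why the marking threads can skip
everything over TAMS.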
>>>
>>> The first issue with the above interaction is that it has
>>> performance issues. Draining all SATB buffers and scanning the mark
>>> stack and taskqueues has been shown to be very time-consuming in
>>> some cases. Also, having to check whether objects are marked and
>>> propagate the marks appropriately during GC is an extra overhead.
>>>
>>> The second issue is that it has been shown to be very fragile. We
>>> have discovered and fixed many issues over time which were subtle
>>> and hard to reproduce.
>>>
>>> We really need to simplify the GC/marking interaction to both
>>> improve performance of GCs during marking, as well as improve our
>>> reliability. This changeset does exactly that.
>>>
>>> * Explanation of the changes
>>>
>>> The goal is to ensure that all the objects that are copied by the GC
>>> do not need to be visited by the marking threads and as a result do
>>> not need to be explicitly marked, pushed, etc.
>>>
>>> The first observation is that most objects copied during a GC are
>>> allocated after marking starts and are therefore implicitly live.
>>> This is the case for all objects on Eden regions, as well as most
>>> objects on Survivor regions. The only exceptions are objects on the
>>> Survivor regions during the initial-mark pause. Unfortunately, it's
>>> not easy to track those separately as they will get mixed in with
>>> future Survivors. The first decision to deal with this is to turn
>>> off Survivors during the initial-mark pause. This ensures that each
>>> subsequent GC will only copy objects that have been allocated since
>>> marking started and are therefore implicitly live (i.e., over
>>> NTAMS). This allows us to totally eliminate the code that
>>> propagates marks during the GC. We just
>>> have to make sure that all copied objects are over NTAMS. Turning
>>> off Survivors during an initial-mark pause is a bit of a "big
>>> hammer" approach, but it will suffice for now.
>>> We have ideas on how
>>> to re-enable them in the future and we'll explore a couple of
>>> alternatives.
>>>
>>> Given that the GC only copies objects that are implicitly marked it
>>> follows that none of the objects that are copied during any GC
>>> should appear on either the taskqueues or the global marking stack.
>>> Also remember that we filter SATB buffers before enqueueing them
>>> which will filter out all implicitly marked objects. It follows that
>>> no enqueued SATB buffer should have references to objects that are
>>> being moved. This leaves the currently active SATB buffers given
>>> that the code that populates them is unconditional. But if we run
>>> the filtering on those during each GC such "offending" references
>>> are also quickly eliminated. So, instead of having to scan all
>>> stacks and all SATB buffers we only have to filter the active SATB
>>> buffers, which should be much, much faster.
>>>
>>> * Implementation Notes
>>>
>>> The actual changes are not too extensive as they mostly
>>> disable functionality in the GC code. The tricky part was to get the
>>> TAMS fields correct at various phases (start of copying, start of
>>> marking, etc.) and especially when an evacuation failure occurs. I
>>> put all that functionality in methods on HeapRegion which do the
>>> right thing when a GC starts, a marking starts, etc.
>>>
>>> The most important changes are in the "main" GC code, i.e.
>>> G1ParCopyHelper::do_oop_work() and
>>> G1ParCopyHelper::copy_to_survivor_space(). Instead of having to
>>> propagate marks we now only need to mark objects directly reachable
>>> from roots during the initial-mark pause. The resulting code is much
>>> simplified (and hopefully more performant!).
>>>
>>> I also added a method verify_no_cset_oops() which checks that indeed
>>> all the marking data structures do not point to regions that are
>>> being GCed at the start / end of each GC.
>>> (BTW, I'm considering
>>> adding a develop flag to enable this on demand.)
>>>
>>> I should point out that this changeset will leave a lot of dead
>>> code. However, I took the decision to keep the changes to a minimum
>>> in order not to overwhelm the code reviewers and make the important
>>> changes clearer. (I also discussed this with a couple of potential
>>> code reviewers and they agreed that this is a good approach.) I
>>> temporarily added guarantees to ensure that methods that should not
>>> be called are not called. I will remove all dead code with a future
>>> push.
>>>
>>> I also have to apologize to John Cuthbertson for removing a lot of
>>> code he's added to deal with various bugs we had in the GC/marking
>>> interaction. Hopefully the new code will be less fragile compared to
>>> what we've had so far and John will be able to concentrate on more
>>> interesting features than trying to track down hard-to-reproduce
>>> failures!
>>>
>>> Tony
>>>

From tony.printezis at oracle.com  Sat Jan  7 01:37:32 2012
From: tony.printezis at oracle.com (tony.printezis at oracle.com)
Date: Sat, 07 Jan 2012 09:37:32 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7121623: G1: always be able to reliably calculate the length of a forwarded chunked array
Message-ID: <20120107093734.3026D478E0@hg.openjdk.java.net>

Changeset: 02838862dec8
Author:    tonyp
Date:      2012-01-07 00:43 -0500
URL:       http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/02838862dec8

7121623: G1: always be able to reliably calculate the length of a forwarded chunked array
Summary: Store the "next chunk start index" in the length field of the to-space object, instead of the from-space object, so that we can always reliably read the size of all from-space objects.
Reviewed-by: johnc, ysr, jmasa

! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp

From java at java4.info  Mon Jan  9 03:08:28 2012
From: java at java4.info (Florian Binder)
Date: Mon, 09 Jan 2012 12:08:28 +0100
Subject: Very long young gc pause (ParNew with CMS)
Message-ID: <4F0ACAAC.8020103@java4.info>

Hi everybody,

I am using CMS (with ParNew) gc and have very long (> 6 seconds) young
gc pauses.
As you can see in the log below the old-gen-heap consists of one large
block, the new size is 256m, it uses 13 worker threads and it has to
copy 27505761 words (~210mb) directly from eden to old gen.
I have seen that this problem occurs only after about one week of
uptime, even though we run a full (compacting) gc every night.
Since real-time > user-time I assume it might be a synchronization
problem. Can this be true?

Do you have any ideas how I can speed up these gcs?

Please let me know if you need more information.

Thank you,
Flo


##### java -version #####
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)

##### The startup parameters: #####
-Xms28G -Xmx28G
-XX:+UseConcMarkSweepGC \
-XX:CMSMaxAbortablePrecleanTime=10000 \
-XX:SurvivorRatio=8 \
-XX:TargetSurvivorRatio=90 \
-XX:MaxTenuringThreshold=31 \
-XX:CMSInitiatingOccupancyFraction=80 \
-XX:NewSize=256M \

-verbose:gc \
-XX:+PrintFlagsFinal \
-XX:PrintFLSStatistics=1 \
-XX:+PrintGCDetails \
-XX:+PrintGCDateStamps \
-XX:-TraceClassUnloading \
-XX:+PrintGCApplicationConcurrentTime \
-XX:+PrintGCApplicationStoppedTime \
-XX:+PrintTenuringDistribution \
-XX:+CMSClassUnloadingEnabled \
-Dsun.rmi.dgc.server.gcInterval=9223372036854775807 \
-Dsun.rmi.dgc.client.gcInterval=9223372036854775807 \

-Djava.awt.headless=true

##### From the out-file (as of +PrintFlagsFinal): #####
ParallelGCThreads = 13

##### The gc.log-excerpt: #####
Application time: 20,0617700 seconds
2011-12-22T12:02:03.289+0100: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 1183290963
Max Chunk Size: 1183290963
Number of Blocks: 1
Av. Block Size: 1183290963
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
[ParNew
Desired survivor size 25480392 bytes, new threshold 1 (max 31)
- age 1: 28260160 bytes, 28260160 total
: 249216K->27648K(249216K), 6,1808130 secs]
20061765K->20056210K(29332480K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 1155785202
Max Chunk Size: 1155785202
Number of Blocks: 1
Av. Block Size: 1155785202
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
, 6,1809440 secs] [Times: user=3,08 sys=0,51, real=6,18 secs]
Total time for which application threads were stopped: 6,1818730 seconds
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From ysr1729 at gmail.com  Mon Jan  9 10:40:43 2012
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Mon, 9 Jan 2012 10:40:43 -0800
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To: <4F0ACAAC.8020103@java4.info>
References: <4F0ACAAC.8020103@java4.info>
Message-ID:

Haven't looked at any logs, but setting MaxTenuringThreshold to 31 can
be bad. I'd dial that down to 8, or leave it at the default of 15.
(Your GC logs which must presumably include the tenuring distribution
should inform you as to a more optimal size to use. As Kirk noted,
premature promotion can be bad, and so can survivor space overflow,
which can lead to premature promotion and exacerbate fragmentation.)
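
To illustrate how the tenuring distribution in a log like the one
above relates to the reported threshold, here is a rough sketch. It is
a simplified model under stated assumptions, not the HotSpot source:
the function names are made up for illustration, heap words are
assumed to be 8 bytes (64-bit VM), and the 27648K left in the young
gen after the collection is assumed to be one full survivor space.

```cpp
#include <cstddef>
#include <vector>

// Assumption: 8-byte heap words, as on a 64-bit VM.
constexpr std::size_t kBytesPerWord = 8;

// Desired survivor occupancy in words: survivor capacity scaled by
// TargetSurvivorRatio (a percentage), truncated to whole words.
std::size_t desired_survivor_words(std::size_t survivor_capacity_bytes,
                                   unsigned target_survivor_ratio) {
    std::size_t capacity_words = survivor_capacity_bytes / kBytesPerWord;
    return static_cast<std::size_t>(
        static_cast<double>(capacity_words) * target_survivor_ratio / 100.0);
}

// age_bytes[i] is the "age i+1" figure printed by
// -XX:+PrintTenuringDistribution. The threshold is the first age at which
// the running total exceeds the desired survivor size, capped at
// max_threshold; if the total never exceeds it, the cap is returned.
unsigned tenuring_threshold(const std::vector<std::size_t>& age_bytes,
                            std::size_t desired_words,
                            unsigned max_threshold) {
    std::size_t total_words = 0;
    unsigned age = 1;
    for (std::size_t bytes : age_bytes) {
        total_words += bytes / kBytesPerWord;
        if (total_words > desired_words) {
            return age < max_threshold ? age : max_threshold;
        }
        ++age;
    }
    return max_threshold;
}
```

With a 27648K survivor space and TargetSurvivorRatio=90 the desired
size comes out at 25480392 bytes, and the 28260160 bytes at age 1
already exceed it, so the threshold is 1 regardless of the maximum.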
-- ramki

On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder wrote:
> Hi everybody,
>
> I am using CMS (with ParNew) gc and have very long (> 6 seconds) young
> gc pauses.
> [...]

From java at java4.info  Mon Jan  9 11:18:13 2012
From: java at java4.info (Florian Binder)
Date: Mon, 09 Jan 2012 20:18:13 +0100
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To:
References: <4F0ACAAC.8020103@java4.info>
Message-ID: <4F0B3D75.9060602@java4.info>

Hi Ramki,

Yes, I agree with you. 31 is too large and I have removed the
parameter (using default now). Nevertheless this is not the problem as
the max used age was always 1.
Since most (more than 90%) of the newly allocated objects in our
application live for a long time (>1h), we will mostly have premature
promotion. Is there a way to optimize this?

I have seen that most of the time, when a young gc takes long
(> 6 secs), there is only one large block in the old gen. If there has
been a cms-old-gen-collection and there is more than one block in the
old generation, it is mostly (not always) much faster (needs less than
200ms). Is it possible that premature promotion cannot be done in
parallel if there is only one large block in the old gen?

In the past we have had a problem with fragmentation on this server
but this is gone since we increased memory for it and triggered a full
gc (compacting) every night, like Tony advised us. With the initiating
occupancy fraction set to 80% we have only a few (~10) old generation
collections (which are very fast) and the heap fragmentation is low.

Flo

On 09.01.2012 19:40, Srinivas Ramakrishna wrote:
> Haven't looked at any logs, but setting MaxTenuringThreshold to 31 can
> be bad. I'd dial that down to 8,
> or leave it at the default of 15. (Your GC logs which must presumably
> include the tenuring distribution should
> inform you as to a more optimal size to use. As Kirk noted, premature
> promotion can be bad, and so can
> survivor space overflow, which can lead to premature promotion and
> exacerbate fragmentation.)
>
> -- ramki
>
> On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder wrote:
> [...]

From jon.masamitsu at oracle.com  Mon Jan  9 11:24:05 2012
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Mon, 09 Jan 2012 11:24:05 -0800
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To: <4F0ACAAC.8020103@java4.info>
References: <4F0ACAAC.8020103@java4.info>
Message-ID: <4F0B3ED5.6010802@oracle.com>

Florian,

Have you ever turned on PrintReferenceGC to see if you are spending a
significant amount of time doing Reference processing?
If you do see significant Reference processing times, you can try
turning on ParallelRefProcEnabled.

Jon

On 01/09/12 03:08, Florian Binder wrote:
> Hi everybody,
>
> I am using CMS (with ParNew) gc and have very long (> 6 seconds) young
> gc pauses.
> [...]

From kirk at kodewerk.com  Mon Jan  9 11:06:26 2012
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Mon, 9 Jan 2012 20:06:26 +0100
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To:
References: <4F0ACAAC.8020103@java4.info>
Message-ID:

Hi Ramki,

AFAICT given the limited GC log, the calculated tenuring threshold is
always 1, which means he's always flooding survivor spaces (i.e.
suffering from premature promotion). My guess is that the tuning
strategy assumes cost of long lived objects dominates and so heap is
configured to minimize (survivor) copy costs.
But it would appear that this strategy has backfired. Look at young gen
size and if you do the maths you can see that there is no chance of
there not being premature promotion. With the 80% initiating occupancy
fraction.. well, that can't lead to anything good either. With the VM
so misconfigured it's difficult to estimate true live set size which
could then be used to calculate more reasonable pool sizes.

So, with all the promotion going on, I suspect that fragmentation is
making it difficult to reallocate the objects in tenured... hence long
pause time. Would you say with these large data structures that it
might be difficult for the CMS to parallelize the scan for roots? The
abortable pre-clean aborts on time which means that it's not able to
clear out much and given the apparent life-cycle, is it worth running
this phase? In fact, would you not guess that the parallel collector
would do better in this scenario?

-- Kirk

ps. I'm always happy to beat you to the punch.. 'cos it's very
difficult to do. ;-)

On 2012-01-09, at 7:40 PM, Srinivas Ramakrishna wrote:
> Haven't looked at any logs, but setting MaxTenuringThreshold to 31 can be bad. I'd dial that down to 8,
> or leave it at the default of 15. (Your GC logs which must presumably include the tenuring distribution should
> inform you as to a more optimal size to use. As Kirk noted, premature promotion can be bad, and so can
> survivor space overflow, which can lead to premature promotion and exacerbate fragmentation.)
>
> -- ramki
>
> On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder wrote:
> [...]

From java at java4.info  Mon Jan  9 11:47:32 2012
From: java at java4.info (Florian Binder)
Date: Mon, 09 Jan 2012 20:47:32 +0100
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To:
References: <4F0ACAAC.8020103@java4.info>
Message-ID: <4F0B4454.2010206@java4.info>

Yes!
You are right!
I have a lot of page faults when gc is taking so much time. For example (sar -B): 00:00:01 pgpgin/s pgpgout/s fault/s majflt/s 00:50:01 0,01 45,18 162,29 0,00 01:00:01 0,02 46,58 170,45 0,00 01:10:02 25313,71 27030,39 27464,37 0,02 01:20:02 23456,85 25371,28 13621,92 0,01 01:30:01 22778,76 22918,60 10136,71 0,03 01:40:11 19020,44 22723,65 8617,42 0,15 01:50:01 5,52 44,22 147,26 0,05 What is this meaning and how can I avoid it? Flo Am 09.01.2012 20:33, schrieb Chi Ho Kwok: > Just making sure the obvious case is covered: is it just me or is 6s > real > 3.5s user+sys with 13 threads just plain weird? That means > there was 0.5 thread actually running on the average during that > collection. > > Do a sar -B (requires package sysstat) and see if there were any major > pagefaults (or indirectly via cacti and other monitoring tools via > memory usage, load average etc, or even via cat /proc/vmstat and > pgmajfault), I've seen those cause these kind of times during GC. > > > Chi Ho Kwok > > On Mon, Jan 9, 2012 at 12:08 PM, Florian Binder > wrote: > > Hi everybody, > > I am using CMS (with ParNew) gc and have very long (> 6 seconds) young > gc pauses. > As you can see in the log below the old-gen-heap consists of one large > block, the new Size has 256m, it uses 13 worker threads and it has to > copy 27505761 words (~210mb) directly from eden to old gen. > I have seen that this problem occurs only after about one week of > uptime. Even thought we make a full (compacting) gc every night. > Since real-time > user-time I assume it might be a synchronization > problem. Can this be true? > > Do you have any Ideas how I can speed up this gcs? > > Please let me know, if you need more informations. 
> > Thank you, > Flo > > > ##### java -version ##### > java version "1.6.0_29" > Java(TM) SE Runtime Environment (build 1.6.0_29-b11) > Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode) > > ##### The startup parameters: ##### > -Xms28G -Xmx28G > -XX:+UseConcMarkSweepGC \ > -XX:CMSMaxAbortablePrecleanTime=10000 \ > -XX:SurvivorRatio=8 \ > -XX:TargetSurvivorRatio=90 \ > -XX:MaxTenuringThreshold=31 \ > -XX:CMSInitiatingOccupancyFraction=80 \ > -XX:NewSize=256M \ > > -verbose:gc \ > -XX:+PrintFlagsFinal \ > -XX:PrintFLSStatistics=1 \ > -XX:+PrintGCDetails \ > -XX:+PrintGCDateStamps \ > -XX:-TraceClassUnloading \ > -XX:+PrintGCApplicationConcurrentTime \ > -XX:+PrintGCApplicationStoppedTime \ > -XX:+PrintTenuringDistribution \ > -XX:+CMSClassUnloadingEnabled \ > -Dsun.rmi.dgc.server.gcInterval=9223372036854775807 \ > -Dsun.rmi.dgc.client.gcInterval=9223372036854775807 \ > > -Djava.awt.headless=true > > ##### From the out-file (as of +PrintFlagsFinal): ##### > ParallelGCThreads = 13 > > ##### The gc.log-excerpt: ##### > Application time: 20,0617700 seconds > 2011-12-22T12:02:03.289+0100: [GC Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: 1183290963 > Max Chunk Size: 1183290963 > Number of Blocks: 1 > Av. Block Size: 1183290963 > Tree Height: 1 > Before GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: 0 > Max Chunk Size: 0 > Number of Blocks: 0 > Tree Height: 0 > [ParNew > Desired survivor size 25480392 bytes, new threshold 1 (max 31) > - age 1: 28260160 bytes, 28260160 total > : 249216K->27648K(249216K), 6,1808130 secs] > 20061765K->20056210K(29332480K)After GC: > Statistics for BinaryTreeDictionary: > ------------------------------------ > Total Free Space: 1155785202 > Max Chunk Size: 1155785202 > Number of Blocks: 1 > Av. 
Block Size: 1155785202
> Tree Height: 1
> After GC:
> Statistics for BinaryTreeDictionary:
> ------------------------------------
> Total Free Space: 0
> Max Chunk Size: 0
> Number of Blocks: 0
> Tree Height: 0
> , 6,1809440 secs] [Times: user=3,08 sys=0,51, real=6,18 secs]
> Total time for which application threads were stopped: 6,1818730 seconds

From tony.printezis at oracle.com Mon Jan 9 15:18:49 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Mon, 09 Jan 2012 18:18:49 -0500
Subject: CRR (XS): 7125281: G1: heap expansion code is replicated
Message-ID: <4F0B75D9.5030009@oracle.com>

Hi all,

I'd like one code review for this very small change (I already have one from Bengt, thanks!):

http://cr.openjdk.java.net/~tonyp/7125281/webrev.0/

We had accidentally replicated the code that expands the heap.

Tony

From tony.printezis at oracle.com Mon Jan 9 15:27:47 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Mon, 09 Jan 2012 18:27:47 -0500
Subject: CRR (XS): 7125281: G1: heap expansion code is replicated
In-Reply-To: <4F0B75D9.5030009@oracle.com>
References: <4F0B75D9.5030009@oracle.com>
Message-ID: <4F0B77F3.5060804@oracle.com>

All set thanks to John! I'll push this asap (after a bit more testing).
Tony

On 01/09/2012 06:18 PM, Tony Printezis wrote:
> Hi all,
>
> I'd like one code review for this very small change (I already have
> one from Bengt, thanks!):
>
> http://cr.openjdk.java.net/~tonyp/7125281/webrev.0/
>
> We had accidentally replicated the code that expands the heap.
>
> Tony
>

From vitalyd at gmail.com Mon Jan 9 21:43:36 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Tue, 10 Jan 2012 00:43:36 -0500
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To:
References: <4F0ACAAC.8020103@java4.info> <4F0B4454.2010206@java4.info>
Message-ID:

Apparently pgpgin/pgpgout may not be that accurate to determine swap file usage:
http://help.lockergnome.com/linux/pgpgin-pgpgout-measure--ftopict506279.html

May need to use vmstat and look at si/so instead.

On Jan 10, 2012 12:24 AM, "Chi Ho Kwok" wrote:
> Hi Florian,
>
> Uh, you might want to try sar -r as well, that reports memory usage (and
> man sar for other reporting options, and -f /var/log/sysstat/saXX where xx
> is the day for older data is useful as well). Page in / out means reading
> or writing to the swap file, usual cause here is one or more huge
> background task / cron jobs taking up too much memory forcing other things
> to swap out. You can try reducing the size of the heap and see if it helps
> if you're just a little bit short, but otherwise I don't think you can
> solve this with just VM options.
>
>
> Here's the relevant section from the manual:
>
> -B     Report paging statistics. Some of the metrics below are
>>       available only with post 2.5 kernels. The following values are displayed:
>>       pgpgin/s
>>              Total number of kilobytes the system paged in from
>>              disk per second. Note: With old kernels (2.2.x) this
>>              value is a number of blocks per second (and not kilobytes).
>>       pgpgout/s
>>              Total number of kilobytes the system paged out to
>>              disk per second. Note: With old kernels (2.2.x) this
>>              value is a number of blocks per second (and not kilobytes).
>> fault/s >> Number of page faults (major + minor) made by the >> system per second. This is not a count of page faults that generate I/O, >> because >> some page faults can be resolved without I/O. >> majflt/s >> Number of major faults the system has made per >> second, those which have required loading a memory page from disk. > > > I'm not sure what kernel you're on, but pgpgin / out being high is a bad > thing. Sar seems to report that all faults are minor, but that conflicts > with the first two columns. > > > Chi Ho Kwok > > On Mon, Jan 9, 2012 at 8:47 PM, Florian Binder wrote: > >> Yes! >> You are right! >> I have a lot of page faults when gc is taking so much time. >> >> For example (sar -B): >> 00:00:01 pgpgin/s pgpgout/s fault/s majflt/s >> 00:50:01 0,01 45,18 162,29 0,00 >> 01:00:01 0,02 46,58 170,45 0,00 >> 01:10:02 25313,71 27030,39 27464,37 0,02 >> 01:20:02 23456,85 25371,28 13621,92 0,01 >> 01:30:01 22778,76 22918,60 10136,71 0,03 >> 01:40:11 19020,44 22723,65 8617,42 0,15 >> 01:50:01 5,52 44,22 147,26 0,05 >> >> What is this meaning and how can I avoid it? >> >> >> Flo >> >> >> >> Am 09.01.2012 20:33, schrieb Chi Ho Kwok: >> >> Just making sure the obvious case is covered: is it just me or is 6s real >> > 3.5s user+sys with 13 threads just plain weird? That means there was 0.5 >> thread actually running on the average during that collection. >> >> Do a sar -B (requires package sysstat) and see if there were any major >> pagefaults (or indirectly via cacti and other monitoring tools via memory >> usage, load average etc, or even via cat /proc/vmstat and pgmajfault), I've >> seen those cause these kind of times during GC. >> >> >> Chi Ho Kwok >> >> On Mon, Jan 9, 2012 at 12:08 PM, Florian Binder wrote: >> >>> Hi everybody, >>> >>> I am using CMS (with ParNew) gc and have very long (> 6 seconds) young >>> gc pauses. 
>>> As you can see in the log below the old-gen-heap consists of one large >>> block, the new Size has 256m, it uses 13 worker threads and it has to >>> copy 27505761 words (~210mb) directly from eden to old gen. >>> I have seen that this problem occurs only after about one week of >>> uptime. Even thought we make a full (compacting) gc every night. >>> Since real-time > user-time I assume it might be a synchronization >>> problem. Can this be true? >>> >>> Do you have any Ideas how I can speed up this gcs? >>> >>> Please let me know, if you need more informations. >>> >>> Thank you, >>> Flo >>> >>> >>> ##### java -version ##### >>> java version "1.6.0_29" >>> Java(TM) SE Runtime Environment (build 1.6.0_29-b11) >>> Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode) >>> >>> ##### The startup parameters: ##### >>> -Xms28G -Xmx28G >>> -XX:+UseConcMarkSweepGC \ >>> -XX:CMSMaxAbortablePrecleanTime=10000 \ >>> -XX:SurvivorRatio=8 \ >>> -XX:TargetSurvivorRatio=90 \ >>> -XX:MaxTenuringThreshold=31 \ >>> -XX:CMSInitiatingOccupancyFraction=80 \ >>> -XX:NewSize=256M \ >>> >>> -verbose:gc \ >>> -XX:+PrintFlagsFinal \ >>> -XX:PrintFLSStatistics=1 \ >>> -XX:+PrintGCDetails \ >>> -XX:+PrintGCDateStamps \ >>> -XX:-TraceClassUnloading \ >>> -XX:+PrintGCApplicationConcurrentTime \ >>> -XX:+PrintGCApplicationStoppedTime \ >>> -XX:+PrintTenuringDistribution \ >>> -XX:+CMSClassUnloadingEnabled \ >>> -Dsun.rmi.dgc.server.gcInterval=9223372036854775807 \ >>> -Dsun.rmi.dgc.client.gcInterval=9223372036854775807 \ >>> >>> -Djava.awt.headless=true >>> >>> ##### From the out-file (as of +PrintFlagsFinal): ##### >>> ParallelGCThreads = 13 >>> >>> ##### The gc.log-excerpt: ##### >>> Application time: 20,0617700 seconds >>> 2011-12-22T12:02:03.289+0100: [GC Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 1183290963 >>> Max Chunk Size: 1183290963 >>> Number of Blocks: 1 >>> Av. 
Block Size: 1183290963 >>> Tree Height: 1 >>> Before GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 0 >>> Max Chunk Size: 0 >>> Number of Blocks: 0 >>> Tree Height: 0 >>> [ParNew >>> Desired survivor size 25480392 bytes, new threshold 1 (max 31) >>> - age 1: 28260160 bytes, 28260160 total >>> : 249216K->27648K(249216K), 6,1808130 secs] >>> 20061765K->20056210K(29332480K)After GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 1155785202 >>> Max Chunk Size: 1155785202 >>> Number of Blocks: 1 >>> Av. Block Size: 1155785202 >>> Tree Height: 1 >>> After GC: >>> Statistics for BinaryTreeDictionary: >>> ------------------------------------ >>> Total Free Space: 0 >>> Max Chunk Size: 0 >>> Number of Blocks: 0 >>> Tree Height: 0 >>> , 6,1809440 secs] [Times: user=3,08 sys=0,51, real=6,18 secs] >>> Total time for which application threads were stopped: 6,1818730 seconds >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> >> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
From tony.printezis at oracle.com Mon Jan 9 23:34:30 2012
From: tony.printezis at oracle.com (tony.printezis at oracle.com)
Date: Tue, 10 Jan 2012 07:34:30 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 7125281: G1: heap expansion code is replicated
Message-ID: <20120110073436.28104478FC@hg.openjdk.java.net>

Changeset: 97c00e21fecb
Author:    tonyp
Date:      2012-01-09 23:50 -0500
URL:       http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/97c00e21fecb

7125281: G1: heap expansion code is replicated
Reviewed-by: brutisso, johnc

! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp

From bengt.rutisson at oracle.com Mon Jan 9 23:57:24 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Tue, 10 Jan 2012 08:57:24 +0100
Subject: Request for review (XXS): 7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80
Message-ID: <4F0BEF64.8030905@oracle.com>

Hi all,

Could I have a couple of reviews for this really small change? I am just changing the default value of a newly introduced flag from 50 to 80.

7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80
http://monaco.us.oracle.com/detail.jsf?cr=7128532
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7128532

Some background:

As part of the fix for 7113021 I introduced two new develop flags that are used as default values for the minimum and maximum size of the G1 young gen. Initially I set these to 20% and 50% respectively. Monica has now been running some performance tests to see if it would be better with other default values. It looks like we would benefit from a larger maximum value.

It should be pretty safe to have a large maximum value.
The heuristics in G1 calculate the actual young gen size based on the pause target. So, just because we have a large max does not mean that we will ever get a young gen that big. Also, it is always possible to override the default value with -XX:MaxNewSize.

Here are some SPECjbb2005 results from Monica:

Min 20, Max 77: Valid run, Score is 356592; Full GC (outside of the system GCs): 0
Min 33, Max 77: Valid run, Score is 353053; Full GC (outside of the system GCs): 0
Min 20, Max 60: Valid run, Score is 352318; Full GC (outside of the system GCs): 2
Min 15, Max 77: Valid run, Score is 349376; Full GC (outside of the system GCs): 0
default (Min: 20, Max 50): Valid run, Score is 347647; Full GC (outside of the system GCs): 2
Min 20, Max 70: Valid run, Score is 346375; Full GC (outside of the system GCs): 0
Min 33, Max 60: Valid run, Score is 333381; Full GC (outside of the system GCs): 0

For comparison:
parallelold: Valid run, Score is 358668; Full GC (outside of the system GCs): 13

Here is a webrev:
http://cr.openjdk.java.net/~brutisso/7128532/webrev.01/

But since the change is so small I am including the diff here as well:

diff --git a/src/share/vm/gc_implementation/g1/g1_globals.hpp b/src/share/vm/gc_implementation/g1/g1_globals.hpp
--- a/src/share/vm/gc_implementation/g1/g1_globals.hpp
+++ b/src/share/vm/gc_implementation/g1/g1_globals.hpp
@@ -295,7 +295,7 @@
           "Percentage (0-100) of the heap size to use as minimum "          \
           "young gen size.")                                                \
                                                                             \
-  develop(uintx, G1DefaultMaxNewGenPercent, 50,                            \
+  develop(uintx, G1DefaultMaxNewGenPercent, 80,                            \
           "Percentage (0-100) of the heap size to use as maximum "          \
           "young gen size.")

Thanks,
Bengt

From fancyerii at gmail.com Tue Jan 10 01:31:07 2012
From: fancyerii at gmail.com (Li Li)
Date: Tue, 10 Jan 2012 17:31:07 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
Message-ID:

hi all

I have an application that generates many large objects and then discards them. I found that a full GC can bring memory usage down from 70% to 40%.
I want to keep these objects in the young generation longer. I found -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold, but I also found a blog that says MaxTenuringThreshold is not used by ParNewGC, and I use ParNewGC+CMS. I tried setting MaxTenuringThreshold=10, but it seems to make no difference.

From fancyerii at gmail.com Tue Jan 10 01:49:10 2012
From: fancyerii at gmail.com (Li Li)
Date: Tue, 10 Jan 2012 17:49:10 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

btw, is there any web page that lists all the JVM parameters and their default values? I find it confusing that they are spread across many documents and that some are deprecated.

On Tue, Jan 10, 2012 at 5:31 PM, Li Li wrote:
> hi all
> I have an application that generating many large objects and then
> discard them. I found that full gc can free memory from 70% to 40%.
> btw, is there any web page that list all JVM parameters and their default
> values?
>
> I want to let this objects in young generation longer. I found
> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
> But I found a blog that says MaxTenuringThreshold is not used in
> ParNewGC.
> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it
> seems no difference.
>
From java at java4.info Tue Jan 10 02:23:26 2012
From: java at java4.info (Florian Binder)
Date: Tue, 10 Jan 2012 11:23:26 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID: <4F0C119E.7090600@java4.info>

At
http://cr.openjdk.java.net/~brutisso/7016112/webrev.02/src/share/vm/runtime/globals.hpp.html
you can find the source code that defines most JVM parameters.
I know it is a webrev and not the newest version of the file, but it lists most parameters with a short description ;-)

Another way is to enable PrintFlagsFinal or PrintFlagsInitial, or just run:
java -XX:+PrintFlagsFinal

Flo

On 10.01.2012 10:49, Li Li wrote:
> btw, is there any web page that list all the jvm parameters and their
> default values? I am confused that they are distributed in many
> documents and some are deprecated.
>
> On Tue, Jan 10, 2012 at 5:31 PM, Li Li wrote:
>
> hi all
> I have an application that generating many large objects and
> then discard them. I found that full gc can free memory from 70%
> to 40%.
> btw, is there any web page that list all JVM parameters and their
> default values?
>
>
> I want to let this objects in young generation longer. I found
> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
> But I found a blog that says MaxTenuringThreshold is not used
> in ParNewGC.
> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10,
> but it seems no difference.
>
From bengt.rutisson at oracle.com Tue Jan 10 05:17:18 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Tue, 10 Jan 2012 14:17:18 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To: <4F0C119E.7090600@java4.info>
References: <4F0C119E.7090600@java4.info>
Message-ID: <4F0C3A5E.5010300@oracle.com>

On 2012-01-10 11:23, Florian Binder wrote:
> At
> http://cr.openjdk.java.net/~brutisso/7016112/webrev.02/src/share/vm/runtime/globals.hpp.html

This is actually a link to one of my webrevs. It could be removed any day. A more stable way of finding the source for globals.hpp is to look in the mercurial repository for OpenJDK:

http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/file/97c00e21fecb/src/share/vm/runtime/globals.hpp

Bengt

>
> you have the source code with most jvm-parameters.
> I know, it is a webrev and not the newest file, but there are the most
> parameters with a short description ;-)
>
> An other way is to enable PrintFlagsFinal or PrintFlagsInitial or just
> run:
> java -XX:+PrintFlagsFinal
>
> Flo
>
>
> On 10.01.2012 10:49, Li Li wrote:
>> btw, is there any web page that list all the jvm parameters and their
>> default values? I am confused that they are distributed in many
>> documents and some are deprecated.
>>
>> On Tue, Jan 10, 2012 at 5:31 PM, Li Li wrote:
>>
>> hi all
>> I have an application that generating many large objects and
>> then discard them. I found that full gc can free memory from 70%
>> to 40%.
>> btw, is there any web page that list all JVM parameters and their
>> default values?
>>
>>
>> I want to let this objects in young generation longer. I found
>> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
>> But I found a blog that says MaxTenuringThreshold is not used
>> in ParNewGC.
>> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10,
>> but it seems no difference.
>>

From jon.masamitsu at oracle.com Tue Jan 10 08:00:57 2012
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Tue, 10 Jan 2012 08:00:57 -0800
Subject: Request for review (XXS): 7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80
In-Reply-To: <4F0BEF64.8030905@oracle.com>
References: <4F0BEF64.8030905@oracle.com>
Message-ID: <4F0C60B9.6050803@oracle.com>

Perfect.

On 01/09/12 23:57, Bengt Rutisson wrote:
>
> Hi all,
>
> Could I have a couple of reviews for this really small change? I am
> just changing the default value of a newly introduced flag from 50 to 80.
>
> 7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80
> http://monaco.us.oracle.com/detail.jsf?cr=7128532
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7128532
>
> Some background:
>
> As part of the fix for 7113021 I introduced two new develop flags that
> are used as default values for the minimum and maximum size of the G1
> young gen. Initially I set these to 20% and 50% respectively. Monica
> has now been running some performance tests to see if it would be
> better with other default values. It looks like we would benefit from
> a larger maximum value.
>
> It should be pretty safe to have a large maximum value. The heuristics
> in G1 calculates the actual young gen size based on the pause target.
> So, just because we have a large max does not mean that we will ever
> get a young gen that big. Also, it is always possible to override the
> default value with -XX:MaxNewSize.
>
> Here are some SPECjbb2005 results from Monica:
>
> Min 20, Max 77: Valid run, Score is 356592; Full GC (outside of the system GCs): 0
> Min 33, Max 77: Valid run, Score is 353053; Full GC (outside of the system GCs): 0
> Min 20, Max 60: Valid run, Score is 352318; Full GC (outside of the system GCs): 2
> Min 15, Max 77: Valid run, Score is 349376; Full GC (outside of the system GCs): 0
> default (Min: 20, Max 50): Valid run, Score is 347647; Full GC (outside of the system GCs): 2
> Min 20, Max 70: Valid run, Score is 346375; Full GC (outside of the system GCs): 0
> Min 33, Max 60: Valid run, Score is 333381; Full GC (outside of the system GCs): 0
>
> For comparison:
> parallelold: Valid run, Score is 358668; Full GC (outside of the system GCs): 13
>
> Here is a webrev:
> http://cr.openjdk.java.net/~brutisso/7128532/webrev.01/
>
> But since the change is so small I am including the diff here as well:
>
> diff --git a/src/share/vm/gc_implementation/g1/g1_globals.hpp b/src/share/vm/gc_implementation/g1/g1_globals.hpp
> --- a/src/share/vm/gc_implementation/g1/g1_globals.hpp
> +++ b/src/share/vm/gc_implementation/g1/g1_globals.hpp
> @@ -295,7 +295,7 @@
>            "Percentage (0-100) of the heap size to use as minimum "          \
>            "young gen size.")                                                \
>                                                                              \
> -  develop(uintx, G1DefaultMaxNewGenPercent, 50,                            \
> +  develop(uintx, G1DefaultMaxNewGenPercent, 80,                            \
>            "Percentage (0-100) of the heap size to use as maximum "          \
>            "young gen size.")
>
>
> Thanks,
> Bengt
>

From ysr1729 at gmail.com Tue Jan 10 09:23:53 2012
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Tue, 10 Jan 2012 09:23:53 -0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

I recommend Charlie's excellent book as well.

To answer your question: yes, CMS + ParNew does use MaxTenuringThreshold (henceforth MTT), but in order to allow objects to age you also need survivor spaces large enough to hold them for however long you wish. Otherwise the adaptive tenuring policy will lower the "current" tenuring threshold so as to prevent survivor-space overflow. That may be what you saw. Check out the info printed by -XX:+PrintTenuringDistribution.

-- ramki

On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote:
> hi all
> I have an application that generating many large objects and then
> discard them. I found that full gc can free memory from 70% to 40%.
> I want to let this objects in young generation longer. I found
> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
> But I found a blog that says MaxTenuringThreshold is not used in
> ParNewGC.
> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it
> seems no difference.
>
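The adaptive tenuring behaviour described in the reply above can be sketched roughly as follows. This is a simplified illustration, not HotSpot's actual code: the function name `compute_tenuring_threshold` and its parameters are hypothetical. The idea is to walk the per-age survivor volumes, stop at the first age where the running total no longer fits in the desired survivor size, and cap the result at MaxTenuringThreshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not HotSpot's actual implementation.
// bytes_by_age[0] is the volume of surviving objects at age 1,
// bytes_by_age[1] the volume at age 2, and so on.
int compute_tenuring_threshold(const std::vector<std::size_t>& bytes_by_age,
                               std::size_t desired_survivor_bytes,
                               int max_tenuring_threshold) {
  std::size_t total = 0;
  int age = 1;
  for (std::size_t bytes : bytes_by_age) {
    total += bytes;
    if (total > desired_survivor_bytes) {
      break;  // survivors up to this age no longer fit
    }
    ++age;
  }
  // Never exceed the configured maximum.
  return std::min(age, max_tenuring_threshold);
}
```

With the figures from the ParNew log quoted earlier in this archive (28260160 bytes at age 1, a desired survivor size of 25480392 bytes, and a maximum of 31), the sketch yields a threshold of 1, which is consistent with the "new threshold 1 (max 31)" line in that log. The desired survivor size itself is roughly the 27648K survivor space multiplied by the TargetSurvivorRatio of 90%.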
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120110/1116311d/attachment.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bengt.rutisson at oracle.com Tue Jan 10 11:01:36 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 10 Jan 2012 20:01:36 +0100 Subject: Request for review (XXS): 7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80 In-Reply-To: <4F0C60B9.6050803@oracle.com> References: <4F0BEF64.8030905@oracle.com> <4F0C60B9.6050803@oracle.com> Message-ID: <4F0C8B10.6010902@oracle.com> Thanks Jon and Tony! All set to push this now. Bengt On 2012-01-10 17:00, Jon Masamitsu wrote: > Perfect. > > On 01/09/12 23:57, Bengt Rutisson wrote: >> >> Hi all, >> >> Could I have a couple of reviews for this really small change? I am >> just changing the default value of a newly introduced flag from 50 to >> 80. >> >> 7128532 G1: Change default value of G1DefaultMaxNewGenPercent to 80 >> http://monaco.us.oracle.com/detail.jsf?cr=7128532 >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7128532 >> >> Some background: >> >> As part of the fix for 7113021 I introduced two new develop flags >> that are used as default values for the minimum and maximum size of >> the G1 young gen. Initially I set these to 20% and 50% respectively. >> Monica has now been running some performance tests to see if it would >> be better with other default values. It looks like we would benefit >> from a larger maximum value. >> >> It should be pretty safe to have a large maximum value. The >> heuristics in G1 calculates the actual young gen size based on the >> pause target. So, just because we have a large max does not mean that >> we will ever get a young gen that big. Also, it is always possible to >> override the default value with -XX:MaxNewSize. 
>> >> Here are some SPECjbb2005 results from Monica: >> >> Min 20, Max 77: Valid run, Score is 356592; Full GC (outside of the >> system GCs): 0 >> Min 33, Max 77: Valid run, Score is 353053; Full GC (outside of the >> system GCs): 0 >> Min 20, Max 60: Valid run, Score is 352318; Full GC (outside of the >> system GCs): 2 >> Min 15, Max 77: Valid run, Score is 349376; Full GC (outside of the >> system GCs): 0 >> default (Min: 20, Max 50): Valid run, Score is 347647; Full GC >> (outside of the system GCs): 2 >> Min 20, Max 70: Valid run, Score is 346375; Full GC (outside of the >> system GCs): 0 >> Min 33, Max 60: Valid run, Score is 333381; Full GC (outside of the >> system GCs): 0 >> >> For comparison: >> parallelold: Valid run, Score is 358668; Full GC (outside of the >> system GCs): 13 >> >> Here is a webrev: >> http://cr.openjdk.java.net/~brutisso/7128532/webrev.01/ >> >> But since the change is so small I am including the diff here as well: >> >> diff --git a/src/share/vm/gc_implementation/g1/g1_globals.hpp >> b/src/share/vm/gc_implementation/g1/g1_globals.hpp >> --- a/src/share/vm/gc_implementation/g1/g1_globals.hpp >> +++ b/src/share/vm/gc_implementation/g1/g1_globals.hpp >> @@ -295,7 +295,7 @@ >> "Percentage (0-100) of the heap size to use as minimum >> " \ >> "young gen >> size.") \ >> >> \ >> - develop(uintx, G1DefaultMaxNewGenPercent, >> 50, \ >> + develop(uintx, G1DefaultMaxNewGenPercent, >> 80, \ >> "Percentage (0-100) of the heap size to use as maximum >> " \ >> "young gen size.") >> >> >> Thanks, >> Bengt >> From bengt.rutisson at oracle.com Tue Jan 10 13:29:04 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Tue, 10 Jan 2012 21:29:04 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7128532: G1: Change default value of G1DefaultMaxNewGenPercent to 80 Message-ID: <20120110212906.E859E4790A@hg.openjdk.java.net> Changeset: 1d6185f732aa Author: brutisso Date: 2012-01-10 20:02 +0100 URL: 
http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1d6185f732aa

7128532: G1: Change default value of G1DefaultMaxNewGenPercent to 80
Reviewed-by: tonyp, jmasa

! src/share/vm/gc_implementation/g1/g1_globals.hpp

From bengt.rutisson at oracle.com Tue Jan 10 13:41:01 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Tue, 10 Jan 2012 22:41:01 +0100
Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
In-Reply-To: <4F07CD76.9080502@oracle.com>
References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com>
Message-ID: <4F0CB06D.2030008@oracle.com>

Hi Tony,

I think this looks good. Ship it!

A couple of coding style questions:

In g1CollectorPolicy.cpp, lines 908-920 there is this if statement:

if (!during_initial_mark_pause()) {
  ...
} else {
  ...
}

I realize that "!during_initial_mark_pause()" is the more common case, but to me it seems more natural to avoid having a negation in the test, so I'd swap the if and else blocks. What's your opinion on this?

In g1OopClosures.hpp you swapped the lines 151 and 152, which makes it look like this:

149   G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState* par_scan_state,
150                    ReferenceProcessor* rp) :
151     G1ParCopyHelper(g1, par_scan_state, &_scanner),
152     _scanner(g1, par_scan_state, rp) {

I guess you want the call to the super class constructor before other initialization. To me it looks strange that the _scanner is passed to the super class, but is now actually not initialized until after the call to the super constructor. I would have preferred the order as it was before. I guess the new order works fine as long as the super constructor never tries to use the _scanner.

Anyway, just details. All in all it looks great!
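A note on the initializer order discussed above: in standard C++ a base class is always constructed before any non-static data member, whatever order the initializer list is written in, so listing the base first only makes the text match what already happens. The sketch below is a minimal illustration with made-up types (`Scanner`, `Helper`, and `Closure` are hypothetical stand-ins, not the G1 classes). It records the order in which constructors run and shows that handing the member's address to the base is fine as long as the base only stores the pointer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which constructors run.
static std::vector<std::string> g_order;

struct Scanner {  // hypothetical stand-in for the _scanner member
  int value;
  Scanner() : value(42) { g_order.push_back("Scanner"); }
};

struct Helper {   // hypothetical stand-in for the base class
  Scanner* scanner;
  // Only stores the pointer; the Scanner is not yet constructed here,
  // so it must not be dereferenced in this constructor.
  explicit Helper(Scanner* s) : scanner(s) { g_order.push_back("Helper"); }
};

struct Closure : Helper {
  Scanner _scanner;
  // The base is constructed first regardless of how this list is written;
  // writing it first simply mirrors the real order (compilers commonly
  // warn when the written order differs).
  Closure() : Helper(&_scanner), _scanner() { g_order.push_back("Closure"); }
};
```

Constructing a `Closure` records the sequence Helper, Scanner, Closure, and afterwards the stored pointer refers to the fully constructed member.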
Bengt

On 2012-01-07 05:43, Tony Printezis wrote:
> Hi again,
>
> Here's an updated webrev after I merged my changes with John's latest push:
>
> http://cr.openjdk.java.net/~tonyp/6888336/webrev.3/
>
> The code is basically the same; I just had to move some of it to different places.
>
> Testing update: I've been testing the changes continuously on three machines over the holidays and I haven't seen any failures since the single failure over Xmas, which was caused by the race in the array chunking changes that has now been resolved. I did additional testing with a patch from John (thanks again!) which artificially forces evacuation failures to stress that code and, again, I saw no issues with that either.
>
> Tony
>
> On 01/05/2012 03:17 PM, Tony Printezis wrote:
>> Hi all,
>>
>> Updated webrev after making some changes based on comments from John (thanks John!):
>>
>> http://cr.openjdk.java.net/~tonyp/6888336/webrev.2/
>>
>> I'd like to clarify something: this change relies on the array chunking changes (7121623) but the webrev does not include those changes (despite what the index page says). So, if you want to try this patch out you'll need to apply the array chunking changes first.
>>
>> Tony
>>
>> On 12/27/2011 01:05 PM, Tony Printezis wrote:
>>> Hi all,
>>>
>>> Here's an updated webrev for this change that takes into account the new approach of chunking object arrays (see previous e-mails on 7121623):
>>>
>>> http://cr.openjdk.java.net/~tonyp/6888336/webrev.1/
>>>
>>> If anything, the new approach simplified the code a bit since now we can always read an object's size from its from-image instead of having to check one or the other depending on whether it's a chunked array or not. I also moved the body of some methods from heapRegion.hpp to the .inline.hpp and .cpp files (as they were getting a bit large to keep in the .hpp file).
>>>
>>> Tony
>>>
>>> On 12/21/2011 05:37 PM, Tony Printezis wrote:
>>>> Hi all,
>>>>
>>>> I'd like a couple of code reviews for the following non-trivial changes (large, not necessarily in lines of code modified but more due to the fact that the evacuation pause / concurrent marking interaction is changed quite dramatically):
>>>>
>>>> http://cr.openjdk.java.net/~tonyp/6888336/webrev.0/
>>>>
>>>> Here's some background, motivation, and a summary of the changes (I felt that it was important to write a longer than usual explanation):
>>>>
>>>> * Background / Motivation
>>>>
>>>> Each G1 heap region has a field top-at-mark-start (aka TAMS) which denotes where the top of the region was when marking started. An object is considered implicitly live if it's over TAMS (i.e., it was allocated since marking started) or explicitly live if it's below TAMS (i.e., it was allocated before marking started) and marked on the bitmap. (It follows that it's unnecessary to explicitly mark objects over TAMS.)
>>>>
>>>> In fact, we have two copies of the above marking information: "Next TAMS / Next Bitmap" and "Prev TAMS / Prev Bitmap". Prev is the copy that was obtained by the last marking cycle that was successfully completed (so, it is consistent: all live objects should appear as live in the prev marking information). Next is the copy that will be obtained / is currently being obtained and it's not consistent because it's not guaranteed to be complete.
>>>>
>>>> G1 uses SATB marking, which has the advantage of not requiring objects allocated since the start of marking to be visited at all by the marking threads (they are implicitly live and they do not need to be scanned). So, the active marking cycle can totally ignore objects over NTAMS (since they have been allocated since marking started).
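The liveness rule described above (implicitly live at or above TAMS, explicitly live below TAMS only when marked on the bitmap) can be sketched as follows. This is a simplified model under stated assumptions, not HotSpot code: `RegionModel`, the word-index addresses, and the `std::vector<bool>` bitmap are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one heap region; not the HotSpot HeapRegion class.
struct RegionModel {
  std::size_t bottom;        // first word index covered by the region
  std::size_t tams;          // top-at-mark-start, as a word index
  std::vector<bool> bitmap;  // one mark bit per word in [bottom, tams)
};

// Objects at or above TAMS were allocated after marking started.
bool is_implicitly_live(const RegionModel& r, std::size_t addr) {
  return addr >= r.tams;
}

// Objects below TAMS are live only if marking has set their bit.
bool is_explicitly_live(const RegionModel& r, std::size_t addr) {
  return addr >= r.bottom && addr < r.tams && r.bitmap[addr - r.bottom];
}

bool is_live(const RegionModel& r, std::size_t addr) {
  return is_implicitly_live(r, addr) || is_explicitly_live(r, addr);
}

// Example region covering words 100..109 below TAMS, with word 103 marked.
RegionModel make_example_region() {
  RegionModel r;
  r.bottom = 100;
  r.tams = 110;
  r.bitmap.assign(r.tams - r.bottom, false);
  r.bitmap[3] = true;
  return r;
}
```

In the example region, address 103 is live because it is marked, 104 is not live, and anything from 110 upward is live without ever touching the bitmap, which is why objects over TAMS never need explicit marking.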
>>>>
>>>> The current interaction between evacuation pauses (let's call these "GCs" from now on) and concurrent marking is very tricky. Even though marking ignores all objects over NTAMS (currently: all objects in Eden regions) it still has to visit and mark objects in the Survivor regions. But those will be moved by subsequent GCs. So, a GC needs to be aware that it's moving objects that have been marked by the marking threads and not only propagate those marks but also notify the marking threads that said objects have been moved. For that we use several data structures: pushes to the global marking stack and also to what's referred to as the "region stack" which is only used by the GC to push a group of objects instead of pushing them individually ("region" here is a mem region and smaller than a G1 region).
>>>>
>>>> Additionally, because the marking threads could come across objects that could potentially move we have to make sure that we don't leave references to regions that have been evacuated on any marking data structure. To do that we treat as roots all entries on the taskqueues / global stack and drain all SATB buffers (both active buffers and also enqueued buffers).
>>>>
>>>> The first issue with the above interaction is performance. Draining all SATB buffers and scanning the mark stack and taskqueues has been shown to be very time-consuming in some cases. Also, having to check whether objects are marked and propagate the marks appropriately during GC is an extra overhead.
>>>>
>>>> The second issue is that it has been shown to be very fragile. We have discovered and fixed many issues over time which were subtle and hard to reproduce.
>>>>
>>>> We really need to simplify the GC/marking interaction to improve both the performance of GCs during marking and our reliability. This changeset does exactly that.
>>>>
>>>> * Explanation of the changes
>>>>
>>>> The goal is to ensure that all the objects that are copied by the GC do not need to be visited by the marking threads and as a result do not need to be explicitly marked, pushed, etc.
>>>>
>>>> The first observation is that most objects copied during a GC are allocated after marking starts and are therefore implicitly live. This is the case for all objects on Eden regions, as well as most objects on Survivor regions. The only exceptions are objects on the Survivor regions during the initial-mark pause. Unfortunately, it's not easy to track those separately as they will get mixed in with future Survivors. The first decision to deal with this is to turn off Survivors during the initial-mark pause. This ensures that each subsequent GC will only copy objects that have been allocated since marking started and are therefore implicitly live (i.e., over NTAMS). This allows us to totally eliminate the code that propagates marks during the GC. We just have to make sure that all copied objects are over NTAMS. Turning off Survivors during an initial-mark pause is a bit of a "big hammer" approach, but it will suffice for now. We have ideas on how to re-enable them in the future and we'll explore a couple of alternatives.
>>>>
>>>> Given that the GC only copies objects that are implicitly marked it follows that none of the objects that are copied during any GC should appear on either the taskqueues or the global marking stack. Also remember that we filter SATB buffers before enqueueing them which will filter out all implicitly marked objects. It follows that no enqueued SATB buffer should have references to objects that are being moved. This leaves the currently active SATB buffers given that the code that populates them is unconditional. But if we run the filtering on those during each GC such "offending" references are also quickly eliminated. So, instead of having to scan all stacks and all SATB buffers we only have to filter the active SATB buffers, which should be much, much faster.
>>>>
>>>> * Implementation Notes
>>>>
>>>> The actual changes are not too extensive as they mostly disable functionality in the GC code. The tricky part was to get the TAMS fields correct at various phases (start of copying, start of marking, etc.) and especially when an evacuation failure occurs. I put all that functionality in methods on HeapRegion which do the right thing when a GC starts, a marking starts, etc.
>>>>
>>>> The most important changes are in the "main" GC code, i.e. G1ParCopyHelper::do_oop_work() and G1ParCopyHelper::copy_to_survivor_space(). Instead of having to propagate marks we now only need to mark objects directly reachable from roots during the initial-mark pause. The resulting code is much simplified (and hopefully more performant!).
>>>>
>>>> I also added a method verify_no_cset_oops() which checks that indeed all the marking data structures do not point to regions that are being GCed at the start / end of each GC. (BTW, I'm considering adding a develop flag to enable this on demand.)
>>>>
>>>> I should point out that this changeset will leave a lot of dead code. However, I took the decision to keep the changes to a minimum in order not to overwhelm the code reviewers and to make the important changes clearer. (I also discussed this with a couple of potential code reviewers and they agreed that this is a good approach.) I temporarily added guarantees to ensure that methods that should not be called are not called. I will remove all dead code with a future push.
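The filtering of the active SATB buffers described earlier in this message (dropping entries that refer to implicitly marked objects) might look roughly like the sketch below. It is an illustration under loud assumptions, not the HotSpot implementation: addresses are plain word indices, a single `ntams` value stands in for the per-region lookup a real collector would do, the function names are hypothetical, and a real filter may apply further checks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: an entry is implicitly marked when it lies at or
// above NTAMS, i.e. the object was allocated after marking started.
bool is_implicitly_marked(std::size_t addr, std::size_t ntams) {
  return addr >= ntams;
}

// Removes implicitly marked entries from one SATB buffer in place and
// returns the number of entries retained. Relative order is preserved.
std::size_t filter_active_satb_buffer(std::vector<std::size_t>& buf,
                                      std::size_t ntams) {
  buf.erase(std::remove_if(buf.begin(), buf.end(),
                           [ntams](std::size_t a) {
                             return is_implicitly_marked(a, ntams);
                           }),
            buf.end());
  return buf.size();
}

// Example: with NTAMS at 100, entries 120 and 100 are dropped.
std::vector<std::size_t> filtered_example() {
  std::vector<std::size_t> buf = {5, 120, 42, 100, 99};
  filter_active_satb_buffer(buf, 100);
  return buf;
}
```

Because the objects a GC copies are all over NTAMS under the new scheme, a pass like this over only the active buffers is enough to remove any reference to an object that is about to move.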
>>>>
>>>> I also have to apologize to John Cuthbertson for removing a lot of code he's added to deal with various bugs we had in the GC/marking interaction. Hopefully the new code will be less fragile compared to what we've had so far and John will be able to concentrate on more interesting features than trying to track down hard-to-reproduce failures!
>>>>
>>>> Tony
>>>>

From tony.printezis at oracle.com Tue Jan 10 15:57:21 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 10 Jan 2012 18:57:21 -0500
Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
In-Reply-To: <4F0CB06D.2030008@oracle.com>
References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com> <4F0CB06D.2030008@oracle.com>
Message-ID: <4F0CD061.50408@oracle.com>

Bengt,

Hi, thanks for looking at it! See inline.

On 01/10/2012 04:41 PM, Bengt Rutisson wrote:
>
> Hi Tony,
>
> I think this looks good. Ship it!
>
> A couple of coding style questions:
>
> In g1CollectorPolicy.cpp, lines 908-920 there is this if statement:
>
> if (!during_initial_mark_pause()) {
>   ...
> } else {
>   ...
> }
>
> I realize that "!during_initial_mark_pause()" is the more common case, but to me it seems more natural to avoid having a negation in the test, so I'd swap the if and else blocks. What's your opinion on this?
>

As you pointed out I tend to put the common case in the if-block even if I have to negate the condition. This is a nice unambiguous way to structure the if-statement (well, assuming we can unambiguously decide what the "common" case is!). I'd like to leave this as is. If it makes you feel any better: I already have the changes to re-enable survivors so that test will disappear soon-ish. :-)

> In g1OopClosures.hpp you swapped the lines 151 and 152, which makes it look like this:
>
> 149   G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState* par_scan_state,
> 150                    ReferenceProcessor* rp) :
> 151     G1ParCopyHelper(g1, par_scan_state, &_scanner),
> 152     _scanner(g1, par_scan_state, rp) {
>
> I guess you want the call to the super class constructor before other initialization. To me it looks strange that the _scanner is passed to the super class, but is now actually not initialized until after the call to the super constructor. I would have preferred the order as it was before. I guess the new order works fine as long as the super constructor never tries to use the _scanner.

You're right. I wanted the constructor to be called first but I was clearly a bit careless and I missed the dependency. Thanks for looking at this carefully! Even though the dependency is benign here (the ref to _scanner is only stored locally in the super class) I think we should avoid any future surprises. So I'll undo the change.

I'll push this asap.

Tony

> Anyway, just details. All in all it looks great!
>
> Bengt
>
>
> On 2012-01-07 05:43, Tony Printezis wrote:
>> Hi again,
>>
>> Here's an updated webrev after I merged my changes with John's latest push:
>>
>> http://cr.openjdk.java.net/~tonyp/6888336/webrev.3/
>>
>> The code is basically the same; I just had to move some of it to different places.
>>
>> Testing update: I've been testing the changes continuously on three machines over the holidays and I haven't seen any failures since the single failure over Xmas, which was caused by the race in the array chunking changes that has now been resolved. I did additional testing with a patch from John (thanks again!) which artificially forces evacuation failures to stress that code and, again, I saw no issues with that either.
>>>>>
>>>>> Tony
>>>>>
>

From fancyerii at gmail.com Tue Jan 10 20:45:21 2012
From: fancyerii at gmail.com (Li Li)
Date: Wed, 11 Jan 2012 12:45:21 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

So if the young generation is too small to hold the survivors, it has to promote them to the old generation, and when the JVM detects this it lowers the tenuring threshold?

I set MaxTenuringThreshold to 10 and found that full GCs are less frequent and each full GC collects less garbage, so the parameter seems to have an effect. But the load average went up, young GC time is much higher than before, and the response time has also increased. I guess there are now more objects in the young generation, so it has to do more young GC work. Although they are garbage, it's not a good idea to collect them too early. Because ParNew GC stops the world, the response time increases.

So I changed MaxTenuringThreshold to 3 and saw no noticeable difference. Maybe I should use an object pool in my application, because it uses many large temporary objects.

Another question: after my application has run for about 1-2 days, the response time increases. I guess the problem is the large young generation. In the beginning, total memory usage is about 4-5GB and the young generation is 100-200MB; the rest is old generation. After running for days, total memory usage is 8GB and the young generation is about 2GB (I set the new ratio to 1:3).

I am curious about how the heap size is adjusted. I found -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio; the default values are 40 and 70. The memory management white paper says that if the free heap space is less than 40%, the heap is grown, and if the free space is more than 70%, the heap is shrunk. But why do I see a young generation of 200MB while old is 4GB? Is the adjustment of the young generation related to the old generation?
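As a rough sketch of the arithmetic behind these numbers: with a new ratio of 3 the young generation is a quarter of the heap, and the free-ratio thresholds decide whether the heap grows or shrinks. The function and type names below are hypothetical, the survivor ratio of 8 is an assumption (it matches the Eden:s0:s1=8:1:1 layout reported later in this thread), and a real JVM also applies alignment and collector-specific policy that this ignores.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; sizes are in megabytes.
struct YoungLayout {
  std::uint64_t young_mb;
  std::uint64_t eden_mb;
  std::uint64_t survivor_mb;  // size of each of the two survivor spaces
};

// NewRatio is old/young, so young = heap / (NewRatio + 1).
// SurvivorRatio is eden/survivor, so one survivor = young / (SurvivorRatio + 2).
YoungLayout young_layout(std::uint64_t heap_mb, unsigned new_ratio,
                         unsigned survivor_ratio) {
  YoungLayout y;
  y.young_mb = heap_mb / (new_ratio + 1);
  y.survivor_mb = y.young_mb / (survivor_ratio + 2);
  y.eden_mb = y.young_mb - 2 * y.survivor_mb;
  return y;
}

enum class Resize { Grow, Shrink, Keep };

// Applies the rule quoted from the white paper: grow below the minimum
// free percentage, shrink above the maximum. capacity_mb must be non-zero.
Resize resize_decision(std::uint64_t free_mb, std::uint64_t capacity_mb,
                       unsigned min_free_pct = 40,
                       unsigned max_free_pct = 70) {
  std::uint64_t free_pct = free_mb * 100 / capacity_mb;
  if (free_pct < min_free_pct) return Resize::Grow;
  if (free_pct > max_free_pct) return Resize::Shrink;
  return Resize::Keep;
}
```

For an 8GB heap (8192MB) with a new ratio of 3 and a survivor ratio of 8, `young_layout` gives 2048MB of young generation, about 204MB per survivor space and about 1640MB of eden. That matches the roughly 2GB young generation seen once the heap has grown to its maximum.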
I read in http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ that the young generation should be less than 512MB. Is that correct?

On Wed, Jan 11, 2012 at 1:23 AM, Srinivas Ramakrishna wrote:

> I recommend Charlie's excellent book as well.
>
> To answer your question, yes, CMS + ParNew does use MaxTenuringThreshold (henceforth MTT), but in order to allow objects to age you also need sufficiently large survivor spaces to hold them for however long you wish, otherwise the adaptive tenuring policy will adjust the "current" tenuring threshold so as to prevent overflow. That may be what you saw. Check out the info printed by -XX:+PrintTenuringDistribution.
>
> -- ramki
>
> On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote:
>
>> hi all
>> I have an application that generates many large objects and then discards them. I found that a full GC can bring memory usage down from 70% to 40%.
>> I want to keep these objects in the young generation longer. I found -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
>> But I found a blog that says MaxTenuringThreshold is not used in ParNewGC.
>> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it seems to make no difference.
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>

From fancyerii at gmail.com Tue Jan 10 23:47:29 2012
From: fancyerii at gmail.com (Li Li)
Date: Wed, 11 Jan 2012 15:47:29 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

1. I don't understand why tenuring thresholds are calculated to be 1
2. I don't set Xms, I just set Xmx=8g
3. as for memory leak, I will try to find it.

On Wed, Jan 11, 2012 at 3:18 PM, Kirk Pepperdine wrote:

> Hi Li LI,
>
> I fear that you are off in the wrong direction. Resetting tenuring thresholds in this case will never work because they are being calculated to be 1. You're suggesting numbers greater than 1 and so 1 will always be used, which explains why you're not seeing a difference between runs. Having a calculated tenuring threshold set to 1 implies that the memory pool is too small. If a memory pool is too small the only thing you can do to fix that is to make it bigger. In this case, your young generational space (as I've indicated in previous postings) is too small. Also, the cost of a young generational collection is dependent mostly upon the number of surviving objects, not dead ones. Pooling temporary objects will only make the problem worse. If I recall your flag settings, you've set NewSize to a fixed value. That setting will override the new ratio setting. You also set Xmx==Xms and that also overrides adaptive sizing. Also you are using CMS which is inherently not size adaptable.
>
> Last point, and this is the biggest one. The numbers you're publishing right now suggest that you have a memory leak. There is no way you're going to stabilize the memory/GC behaviour with a memory leak.
> Things will get progressively worse as you consume more and more heap. This is a blocking issue to all tuning efforts. It is the first thing that must be dealt with.
>
> To find the leak:
> Identify the leaking object using VisualVM's memory profiler with generational counts and the collection of allocation stack traces turned on. Sort the profile by generational counts. When you've identified the leaking object (the domain class with the highest and always increasing generational count), take an allocation stack trace snapshot and a heap dump. The heap dump should be loaded into a heap walker. Use the knowledge gained from generational counts to inspect the linkages for the leaking object and then use that information in the allocation stack traces to identify causal execution paths for creation. After that, it's into application code to determine the fix.
>
> Kind regards,
> Kirk Pepperdine
>

From fancyerii at gmail.com Wed Jan 11 00:06:48 2012
From: fancyerii at gmail.com (Li Li)
Date: Wed, 11 Jan 2012 16:06:48 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

I understand the first one. As for Xmx, when the heap reaches the maximum of 8GB, the young generation is indeed 1.8G and Eden:s0:s1=8:1:1. That's correct. But a few minutes after I restart it, old is 4GB while young is 200-300MB.

I don't think there is a memory leak because it has been running for more than a month without an OOM. My application uses lucene+solr to provide a search service, which needs a lot of memory.

On Wed, Jan 11, 2012 at 3:55 PM, Kirk Pepperdine wrote:

>
> On 2012-01-11, at 8:47 AM, Li Li wrote:
>
> 1. I don't understand why tenuring thresholds are calculated to be 1
>
> because the number of expected survivors exceeds the size of the survivor space
>
> 2. I don't set Xms, I just set Xmx=8g
>
> with a new ratio of 3.. you should have 2 gigs of young gen meaning a .2 gigs for each survivor space and 1.6 for eden. Do you have a GC log you can use to confirm these values?
If not try visualvm and this plugin > should give you a clear view (www.java.net/projects/memorypoolview). > > > 3. as for memory leak, I will try to find it. > > On Wed, Jan 11, 2012 at 3:18 PM, Kirk Pepperdine wrote: > >> Hi Li LI, >> >> I fear that you are off in the wrong direction. Resetting tenuring >> thresholds in this case will never work because they are being calculated >> to be 1. You're suggesting numbers greater than 1 and so 1 will always be >> used which explains why you're not seeing a difference between runs. Having >> a calculated tenuring threshold set to 1 implies that the memory pool is >> too small. If the a memory pool is too small the only thing you can do to >> fix that is to make it bigger. In this case, your young generational space >> (as I've indicated in previous postings) is too small. Also, the cost of a >> young generational collection is dependent mostly upon the number of >> surviving objects, not dead ones. Pooling temporary objects will only make >> the problem worse. If I recall your flag settings, you've set netsize to a >> fixed value. That setting will override the the new ratio setting. You also >> set Xmx==Xms and that also override adaptive sizing. Also you are using CMS >> which is inherently not size adaptable. >> >> Last point, and this is the biggest one. The numbers you're publishing >> right now suggest that you have a memory leak. There is no way you're going >> to stabilize the memory /gc behaviour with a memory leak. Things will get >> progressively worse as you consume more and more heap. This is a blocking >> issue to all tuning efforts. It is the first thing that must be dealt with. >> >> To find the leak; >> Identify the leaking object useing VisualVM's memory profiler with >> generational counts and collect allocation stack traces turned on. Sort the >> profile by generational counts. When you've identified the leaking object, >> the domain class with the highest and always increasing generational count. 
>> take an allocation stack trace snapshot and a heap dump. The heap dump >> should be loaded into a heap walker. Use the knowledge gained from >> generational counts to inspect the linkages for the leaking object and then >> use that information in the allocation stack traces to identify casual >> execution paths for creation. After that, it's into application code to >> determine the fix. >> >> Kind regards, >> Kirk Pepperdine >> >> On 2012-01-11, at 5:45 AM, Li Li wrote: >> >> if the young generation is too small that it can't afford space for >> survivors and it have to throw them to old generation. and jvm found this, >> it will turn down TenuringThreshold ? >> I set TenuringThreshold to 10. and found that the full gc is less >> frequent and every full gc collect less garbage. it seems the parameter >> have the effect. But I found the load average is up and young gc time is >> much more than before. And the response time is also increased. >> I guess that there are more objects in young generation. so it have to >> do more young gc. although they are garbage, it's not a good idea to >> collect them too early. because ParNewGC will stop the world, the response >> time is increasing. >> So I adjust TenuringThreshold to 3 and there are no remarkable >> difference. >> maybe I should use object pool for my application because it use many >> large temporary objects. >> Another question, when my application runs for about 1-2 days. I found >> the response time increases. I guess it's the problem of large young >> generation. >> in the beginning, the total memory usage is about 4-5GB and young >> generation is 100-200MB, the rest is old generation. >> After running for days, the total memory usage is 8GB and young >> generation is about 2GB(I set new Ration 1:3) >> I am curious about the heap size adjusting. I found ?XX:MinHeapFreeRation >> and ?XX:MaxHeapFreeRation >> the default value is 40 and 70. 
the memory manage white paper says if >> the total heap free space is less than 40%, it will increase heap. if the >> free space is larger than 70%, it will decrease heap size. >> But why I see the young generation is 200mb while old is 4gb. does the >> adjustment of young related to old generation? >> I read in >> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ young >> generation should be less than 512MB, is it correct? >> >> >> >> On Wed, Jan 11, 2012 at 1:23 AM, Srinivas Ramakrishna wrote: >> >>> I recommend Charlie's excellent book as well. >>> >>> To answer yr question, yes, CMS + Parew does use MaxTenuringThreshold >>> (henceforth MTT), >>> but in order to allow objects to age you also need sufficiently large >>> survivor spaces to hold >>> them for however long you wish, otherwise the adaptive tenuring policy >>> will adjust the >>> "current" tenuring threshold so as to prevent overflow. That may be what >>> you saw. >>> Check out the info printed by +PrintTenuringThreshold. >>> >>> -- ramki >>> >>> On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote: >>> >>>> hi all >>>> I have an application that generating many large objects and then >>>> discard them. I found that full gc can free memory from 70% to 40%. >>>> I want to let this objects in young generation longer. I found >>>> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold. >>>> But I found a blog that says MaxTenuringThreshold is not used in >>>> ParNewGC. >>>> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it >>>> seems no difference. 
>>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120111/243309d6/attachment.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From ysr1729 at gmail.com Wed Jan 11 01:00:59 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Wed, 11 Jan 2012 01:00:59 -0800 Subject: Very long young gc pause (ParNew with CMS) In-Reply-To: <4F0ACAAC.8020103@java4.info> References: <4F0ACAAC.8020103@java4.info> Message-ID: On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder wrote: > ... > I have seen that this problem occurs only after about one week of > uptime. Even thought we make a full (compacting) gc every night. > Since real-time > user-time I assume it might be a synchronization > problem. Can this be true? > > Together with your and Chi-Ho's conclusion that this is possibly related to paging, a question to ponder is why this happens only after a week. Since your process' heap size is presumably fixed and you have seen multiple full GC's (from which i assume that your heap's pages have all been touched), have you checked to see if the size of either this process (i.e. its native size) or of another process on the machine has grown during the week so that you start swapping? 
I also find it interesting that you state that whenever you see this problem
there's always a single block in the old gen, and that the problem seems to go
away when there is more than one block in the old gen. That would seem
to throw out the paging theory, and point the finger of suspicion at some kind
of bottleneck in the allocation out of a large block. You also state that you
do a compacting collection every night, but the bad behaviour sets in only
after a week.

So let me ask you if you see that the slow scavenge happens to be the first
scavenge after a full gc, or does the condition persist for a long time and
is independent of whether a full gc has happened recently?

Try turning on -XX:+PrintOldPLAB to see if it sheds any light...

-- ramki

From fancyerii at gmail.com Wed Jan 11 01:24:02 2012
From: fancyerii at gmail.com (Li Li)
Date: Wed, 11 Jan 2012 17:24:02 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

The log is too large to post here, so I am posting just some lines. I grepped the lines where the gc time is larger than 100ms.
The question is: at the beginning, the young generation is about 50M, but after running a while it grows to 1.8GB. 1.75GB is Eden and 0.2G is s0 and s1.
e.g. [GC [ParNew: 1843200K->204800K(1843200K), 0.2584570 secs]
It is clear that the eden is 1843200K (1.75G) and s0 is 0.2G. Before young gc, eden is all used.
after gc, s1 is all used(other live object are moved to old generation)

2012-01-10T18:26:45.992+0800: [GC [ParNew: 58732K->6528K(59072K), 0.1234300 secs] 1391982K->1375194K(1707564K), 0.1234900 secs] [Times: user=1.44 sys=0.02, real=0.12 secs]
2012-01-10T18:26:47.185+0800: [GC [ParNew: 59072K->6528K(59072K), 0.1335480 secs] 1507767K->1490151K(2340184K), 0.1336020 secs] [Times: user=1.60 sys=0.01, real=0.13 secs]
2012-01-10T18:26:56.605+0800: [GC [ParNew: 59072K->6528K(59072K), 0.0992650 secs] 1523647K->1509678K(2522312K), 0.0993220 secs] [Times: user=1.22 sys=0.01, real=0.10 secs]
2012-01-10T18:26:57.395+0800: [GC [ParNew: 52998K->6528K(59072K), 0.1948650 secs] 1556149K->1544918K(2522312K), 0.1949120 secs] [Times: user=2.46 sys=0.01, real=0.19 secs]
2012-01-10T18:27:05.072+0800: [GC [ParNew: 38463K->6528K(59072K), 0.1571700 secs] 2449032K->2447103K(2864820K), 0.1572150 secs] [Times: user=1.98 sys=0.02, real=0.16 secs]
2012-01-10T18:27:06.220+0800: [GC [ParNew: 59072K->6528K(59072K), 0.1641610 secs] 2499647K->2483866K(2864820K), 0.1642060 secs] [Times: user=2.07 sys=0.01, real=0.17 secs]
2012-01-10T22:24:08.939+0800: [GC [ParNew: 1826901K->204800K(1843200K), 0.1418510 secs] 3923985K->2352398K(7987200K), 0.1420700 secs] [Times: user=1.59 sys=0.05, real=0.14 secs]
2012-01-10T22:24:09.343+0800: [GC [ParNew: 1843200K->175652K(1843200K), 0.1994980 secs] 3990798K->2536312K(7987200K), 0.1996880 secs] [Times: user=1.98 sys=0.02, real=0.20 secs]
2012-01-10T22:24:10.049+0800: [GC [ParNew: 1814052K->151709K(1843200K), 0.1409050 secs] 4174712K->2618929K(7987200K), 0.1410940 secs] [Times: user=1.51 sys=0.00, real=0.14 secs]
2012-01-10T22:24:11.015+0800: [GC [ParNew: 1843200K->204800K(1843200K), 0.2584570 secs] 4311783K->2831783K(7987200K), 0.2586440 secs] [Times: user=2.83 sys=0.00, real=0.26 secs]
2012-01-10T22:24:11.543+0800: [GC [ParNew: 1843200K->188261K(1843200K), 0.2356920 secs] 4470183K->3028255K(7987200K), 0.2358800 secs] [Times: user=2.41 sys=0.01, real=0.24 secs]

On
Wed, Jan 11, 2012 at 4:24 PM, Kirk Pepperdine wrote: > > On 2012-01-11, at 9:06 AM, Li Li wrote: > > I understand the first one. > as for Xmx, when it reach the maxium 8GB, the young generation is in deed > 1.8G and Eden:s0:s1=8:1:1. That's correct. > but when I restart it for a few minutes. old is 4GB while young is > 200-300MB > > > Right, ratios are adaptive and if you're using CMS, will require a full GC > to occur before they will adapt. Size will start off small and then get > bigger as needed. > > I don't think there is memory leak because it has running for more than a > month without OOV. > My application is using lucene+solr to provide search service which need > large memory. > > > Well, if memory use stabilizes than you don't have a leak. But I'd need to > see a GC log to give you better advice. All I can say is that the more > switches you touch the more you've got to understand about how things work > in order to make effective changes. I generally start with minimal switch > settings and then adjust as needed. Starting with a ratio is better than > starting with a fixed value. If the ratio isn't working for you then moved > to a fixed size. But use the data in the gc log to tell you how to proceed. > > Also, if your application is swapping during GC you will increase the > duration of the collection. You need to monitor system level activity as > part of the investigation. > > Regards, > Kirk > > > On Wed, Jan 11, 2012 at 3:55 PM, Kirk Pepperdine < > kirk.pepperdine at gmail.com> wrote: > >> >> On 2012-01-11, at 8:47 AM, Li Li wrote: >> >> 1. I don't understand why tenuring thresholds are >> calculated to be 1 >> >> >> because the number of expected survivors exceeds the size of the survivor >> space >> >> 2. I don't set Xms, I just set Xmx=8g >> >> >> with a new ratio of 3.. you should have 2 gigs of young gen meaning a .2 >> gigs for each survivor space and 1.6 for young gen. Do you have a GC log >> you can use to confirm these values? 
If not try visualvm and this plugin >> should give you a clear view (www.java.net/projects/memorypoolview). >> >> >> 3. as for memory leak, I will try to find it. >> >> On Wed, Jan 11, 2012 at 3:18 PM, Kirk Pepperdine wrote: >> >>> Hi Li LI, >>> >>> I fear that you are off in the wrong direction. Resetting tenuring >>> thresholds in this case will never work because they are being calculated >>> to be 1. You're suggesting numbers greater than 1 and so 1 will always be >>> used which explains why you're not seeing a difference between runs. Having >>> a calculated tenuring threshold set to 1 implies that the memory pool is >>> too small. If the a memory pool is too small the only thing you can do to >>> fix that is to make it bigger. In this case, your young generational space >>> (as I've indicated in previous postings) is too small. Also, the cost of a >>> young generational collection is dependent mostly upon the number of >>> surviving objects, not dead ones. Pooling temporary objects will only make >>> the problem worse. If I recall your flag settings, you've set netsize to a >>> fixed value. That setting will override the the new ratio setting. You also >>> set Xmx==Xms and that also override adaptive sizing. Also you are using CMS >>> which is inherently not size adaptable. >>> >>> Last point, and this is the biggest one. The numbers you're publishing >>> right now suggest that you have a memory leak. There is no way you're going >>> to stabilize the memory /gc behaviour with a memory leak. Things will get >>> progressively worse as you consume more and more heap. This is a blocking >>> issue to all tuning efforts. It is the first thing that must be dealt with. >>> >>> To find the leak; >>> Identify the leaking object useing VisualVM's memory profiler with >>> generational counts and collect allocation stack traces turned on. Sort the >>> profile by generational counts. 
When you've identified the leaking object, >>> the domain class with the highest and always increasing generational count. >>> take an allocation stack trace snapshot and a heap dump. The heap dump >>> should be loaded into a heap walker. Use the knowledge gained from >>> generational counts to inspect the linkages for the leaking object and then >>> use that information in the allocation stack traces to identify casual >>> execution paths for creation. After that, it's into application code to >>> determine the fix. >>> >>> Kind regards, >>> Kirk Pepperdine >>> >>> On 2012-01-11, at 5:45 AM, Li Li wrote: >>> >>> if the young generation is too small that it can't afford space for >>> survivors and it have to throw them to old generation. and jvm found this, >>> it will turn down TenuringThreshold ? >>> I set TenuringThreshold to 10. and found that the full gc is less >>> frequent and every full gc collect less garbage. it seems the parameter >>> have the effect. But I found the load average is up and young gc time is >>> much more than before. And the response time is also increased. >>> I guess that there are more objects in young generation. so it have >>> to do more young gc. although they are garbage, it's not a good idea to >>> collect them too early. because ParNewGC will stop the world, the response >>> time is increasing. >>> So I adjust TenuringThreshold to 3 and there are no remarkable >>> difference. >>> maybe I should use object pool for my application because it use many >>> large temporary objects. >>> Another question, when my application runs for about 1-2 days. I >>> found the response time increases. I guess it's the problem of large young >>> generation. >>> in the beginning, the total memory usage is about 4-5GB and young >>> generation is 100-200MB, the rest is old generation. >>> After running for days, the total memory usage is 8GB and young >>> generation is about 2GB(I set new Ration 1:3) >>> I am curious about the heap size adjusting. 
I found ?XX:MinHeapFreeRation >>> and ?XX:MaxHeapFreeRation >>> the default value is 40 and 70. the memory manage white paper says if >>> the total heap free space is less than 40%, it will increase heap. if the >>> free space is larger than 70%, it will decrease heap size. >>> But why I see the young generation is 200mb while old is 4gb. does >>> the adjustment of young related to old generation? >>> I read in >>> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ young >>> generation should be less than 512MB, is it correct? >>> >>> >>> >>> On Wed, Jan 11, 2012 at 1:23 AM, Srinivas Ramakrishna >> > wrote: >>> >>>> I recommend Charlie's excellent book as well. >>>> >>>> To answer yr question, yes, CMS + Parew does use MaxTenuringThreshold >>>> (henceforth MTT), >>>> but in order to allow objects to age you also need sufficiently large >>>> survivor spaces to hold >>>> them for however long you wish, otherwise the adaptive tenuring policy >>>> will adjust the >>>> "current" tenuring threshold so as to prevent overflow. That may be >>>> what you saw. >>>> Check out the info printed by +PrintTenuringThreshold. >>>> >>>> -- ramki >>>> >>>> On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote: >>>> >>>>> hi all >>>>> I have an application that generating many large objects and then >>>>> discard them. I found that full gc can free memory from 70% to 40%. >>>>> I want to let this objects in young generation longer. I found >>>>> -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold. >>>>> But I found a blog that says MaxTenuringThreshold is not used in >>>>> ParNewGC. >>>>> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it >>>>> seems no difference. 
>>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120111/325a0052/attachment-0001.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From fancyerii at gmail.com Wed Jan 11 01:32:30 2012 From: fancyerii at gmail.com (Li Li) Date: Wed, 11 Jan 2012 17:32:30 +0800 Subject: MaxTenuringThreshold available in ParNewGC? In-Reply-To: References: Message-ID: after a concurrent mode failure. the young generation changed from about 50MB to 1.8GB What's the logic behind this? 
2012-01-10T22:23:54.544+0800: [GC [ParNew: 55389K->6528K(59072K), 0.0175440 secs] 5886124K->5839323K(6195204K), 0.0177480 secs] [Times: user=0.20 sys=0.00, real=0.01 secs]
2012-01-10T22:23:54.575+0800: [GC [ParNew: 59072K->6528K(59072K), 0.0234040 secs] 5891867K->5845823K(6201540K), 0.0236070 secs] [Times: user=0.24 sys=0.00, real=0.02 secs]
2012-01-10T22:23:54.612+0800: [GC [ParNew (promotion failed): 59072K->58862K(59072K), 2.3119860 secs][CMS2012-01-10T22:23:57.153+0800: [CMS-concurrent-preclean: 10.999/28.245 secs] [Times: user=290.41 sys=4.65, real=28.24 secs]
 (concurrent mode failure): 5841457K->2063142K(6144000K), 8.8971660 secs] 5898367K->2063142K(6203072K), [CMS Perm : 31369K->31131K(52316K)], 11.2110080 secs] [Times: user=11.73 sys=0.51, real=11.21 secs]
2012-01-10T22:24:06.125+0800: [GC [ParNew: 1638400K->46121K(1843200K), 0.0225800 secs] 3701542K->2109263K(7987200K), 0.0228190 secs] [Times: user=0.26 sys=0.02, real=0.02 secs]
2012-01-10T22:24:06.357+0800: [GC [ParNew: 1684521K->111262K(1843200K), 0.0381370 secs] 3747663K->2174404K(7987200K), 0.0383860 secs] [Times: user=0.44 sys=0.04, real=0.04 secs]

From java at java4.info Wed Jan 11 01:45:28 2012
From: java at java4.info (Florian Binder)
Date: Wed, 11 Jan 2012 10:45:28 +0100
Subject: Very long young gc pause (ParNew with CMS)
In-Reply-To:
References: <4F0ACAAC.8020103@java4.info>
Message-ID: <4F0D5A38.1090906@java4.info>

I do not know why it has worked for a week. Maybe it is because this was the xmas week ;-)

In the night there are a lot of disk operations (2 TB of data is written).
Therefore the operating system caches a lot of files and tries to free memory for this, so unused pages are moved to swap space.
I assume heap fragmentation avoids swapping, since more pages are touched while the application is running.
After a compacting gc there is one large (free) block which is not touched until a young gc copies the objects from eden space. This leads the operating system to move the pages of this one free block to swap, and at every young gc it has to read them back from swap.
After a CMS collection the following young gcs are much faster because the gaps in the heap are not swapped.

Yesterday we turned off the swap on this machine and now all young gcs take less than 200ms (instead of 6s) :-)

Thanks again to Chi Ho Kwok for giving the key hint :-)

Flo

On 11.01.2012 10:00, Srinivas Ramakrishna wrote:
>
>
> On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder wrote:
>
> ...
> I have seen that this problem occurs only after about one week of
> uptime. Even thought we make a full (compacting) gc every night.
> Since real-time > user-time I assume it might be a synchronization
> problem. Can this be true?
>
>
> Together with your and Chi-Ho's conclusion that this is possibly related to paging,
> a question to ponder is why this happens only after a week. Since your process'
> heap size is presumably fixed and you have seen multiple full GC's (from which
> i assume that your heap's pages have all been touched), have you checked to
> see if the size of either this process (i.e. its native size) or of another process
> on the machine has grown during the week so that you start swapping?
>
> I also find it interesting that you state that whenever you see this problem
> there's always a single block in the old gen, and that the problem seems to go
> away when there are more than one block in the old gen.
> That would seem to throw out the paging theory, and point the finger of suspicion to some kind
> of bottleneck in the allocation out of a large block. You also state that you
> do a compacting collection every night, but the bad behaviour sets in only
> after a week.
>
> So let me ask you if you see that the slow scavenge happens to be the first
> scavenge after a full gc, or does the condition persist for a long time and
> is independent if whether a full gc has happened recently?
>
> Try turning on -XX:+PrintOldPLAB to see if it sheds any light...
>
> -- ramki

From tony.printezis at oracle.com Wed Jan 11 07:24:34 2012
From: tony.printezis at oracle.com (tony.printezis at oracle.com)
Date: Wed, 11 Jan 2012 15:24:34 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
Message-ID: <20120111152438.DE1C547920@hg.openjdk.java.net>

Changeset: 2ace1c4ee8da
Author: tonyp
Date: 2012-01-10 18:58 -0500
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2ace1c4ee8da

6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
Summary: This change simplifies the interaction between GC and concurrent marking. By disabling survivor spaces during the initial-mark pause we don't need to propagate marks of objects we copy during each GC (since we never need to copy an explicitly marked object).
Reviewed-by: johnc, brutisso

! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/concurrentMark.hpp
! src/share/vm/gc_implementation/g1/concurrentMark.inline.hpp
!
src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp
! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp
! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp
! src/share/vm/gc_implementation/g1/g1EvacFailure.hpp
! src/share/vm/gc_implementation/g1/g1OopClosures.hpp
! src/share/vm/gc_implementation/g1/heapRegion.cpp
! src/share/vm/gc_implementation/g1/heapRegion.hpp
! src/share/vm/gc_implementation/g1/heapRegion.inline.hpp
! src/share/vm/gc_implementation/g1/ptrQueue.hpp
! src/share/vm/gc_implementation/g1/satbQueue.cpp
! src/share/vm/gc_implementation/g1/satbQueue.hpp

From kirk.pepperdine at gmail.com Wed Jan 11 01:48:36 2012
From: kirk.pepperdine at gmail.com (Kirk Pepperdine)
Date: Wed, 11 Jan 2012 10:48:36 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID: <9462441C-C11B-4CDC-83BD-D78E2A1138AB@gmail.com>

CMS is not adaptive. To reconfigure the heap, for many reasons, a full GC needs to occur. The response to a concurrent mode failure is always a full GC. That gave the JVM the opportunity to resize heap space. If this behaviour isn't happening when it should, or is causing other problems, it's time to either set the young gen size directly with NewSize or switch to the parallel collector with the adaptive sizing policy turned on. The logic here is that if you want to avoid long pauses, use CMS. If CMS is giving you long pauses, then the parallel collector might be a better choice.

Regards,
Kirk

On 2012-01-11, at 10:32 AM, Li Li wrote:

> after a concurrent mode failure. the young generation changed from about 50MB to 1.8GB
> What's the logic behind this?
> > 2012-01-10T22:23:54.544+0800: [GC [ParNew: 55389K->6528K(59072K), 0.0175440 secs] 5886124K->5839323K(6195204K), 0.0177480 secs] [Times: user=0.20 sys=0.00, real=0.01 secs] > 2012-01-10T22:23:54.575+0800: [GC [ParNew: 59072K->6528K(59072K), 0.0234040 secs] 5891867K->5845823K(6201540K), 0.0236070 secs] [Times: user=0.24 sys=0.00, real=0.02 secs] > 2012-01-10T22:23:54.612+0800: [GC [ParNew (promotion failed): 59072K->58862K(59072K), 2.3119860 secs][CMS2012-01-10T22:23:57.153+0800: [CMS-concurrent-preclean: 10.999/28.245 secs] [Times: user=290.41 sys=4.65, real=28.24 secs] > (concurrent mode failure): 5841457K->2063142K(6144000K), 8.8971660 secs] 5898367K->2063142K(6203072K), [CMS Perm : 31369K->31131K(52316K)], 11.2110080 secs] [Times: user=11.73 sys=0.51, real=11.21 secs] > 2012-01-10T22:24:06.125+0800: [GC [ParNew: 1638400K->46121K(1843200K), 0.0225800 secs] 3701542K->2109263K(7987200K), 0.0228190 secs] [Times: user=0.26 sys=0.02, real=0.02 secs] > 2012-01-10T22:24:06.357+0800: [GC [ParNew: 1684521K->111262K(1843200K), 0.0381370 secs] 3747663K->2174404K(7987200K), 0.0383860 secs] [Times: user=0.44 sys=0.04, real=0.04 secs] > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From kirk.pepperdine at gmail.com Tue Jan 10 23:55:34 2012 From: kirk.pepperdine at gmail.com (Kirk Pepperdine) Date: Wed, 11 Jan 2012 08:55:34 +0100 Subject: MaxTenuringThreshold available in ParNewGC? In-Reply-To: References: Message-ID: On 2012-01-11, at 8:47 AM, Li Li wrote: > 1. I don't understand why tenuring thresholds are > calculated to be 1 because the number of expected survivors exceeds the size of the survivor space > 2. I don't set Xms, I just set Xmx=8g with a new ratio of 3.. 
you should have 2 gigs of young gen meaning a .2 gigs for each survivor space and 1.6 for young gen. Do you have a GC log you can use to confirm these values? If not try visualvm and this plugin should give you a clear view (www.java.net/projects/memorypoolview). > 3. as for memory leak, I will try to find it. > > On Wed, Jan 11, 2012 at 3:18 PM, Kirk Pepperdine wrote: > Hi Li LI, > > I fear that you are off in the wrong direction. Resetting tenuring thresholds in this case will never work because they are being calculated to be 1. You're suggesting numbers greater than 1 and so 1 will always be used which explains why you're not seeing a difference between runs. Having a calculated tenuring threshold set to 1 implies that the memory pool is too small. If the a memory pool is too small the only thing you can do to fix that is to make it bigger. In this case, your young generational space (as I've indicated in previous postings) is too small. Also, the cost of a young generational collection is dependent mostly upon the number of surviving objects, not dead ones. Pooling temporary objects will only make the problem worse. If I recall your flag settings, you've set netsize to a fixed value. That setting will override the the new ratio setting. You also set Xmx==Xms and that also override adaptive sizing. Also you are using CMS which is inherently not size adaptable. > > Last point, and this is the biggest one. The numbers you're publishing right now suggest that you have a memory leak. There is no way you're going to stabilize the memory /gc behaviour with a memory leak. Things will get progressively worse as you consume more and more heap. This is a blocking issue to all tuning efforts. It is the first thing that must be dealt with. > > To find the leak; > Identify the leaking object useing VisualVM's memory profiler with generational counts and collect allocation stack traces turned on. Sort the profile by generational counts. 
When you've identified the leaking object, the domain class with the highest and always increasing generational count. take an allocation stack trace snapshot and a heap dump. The heap dump should be loaded into a heap walker. Use the knowledge gained from generational counts to inspect the linkages for the leaking object and then use that information in the allocation stack traces to identify casual execution paths for creation. After that, it's into application code to determine the fix. > > Kind regards, > Kirk Pepperdine > > On 2012-01-11, at 5:45 AM, Li Li wrote: > >> if the young generation is too small that it can't afford space for survivors and it have to throw them to old generation. and jvm found this, it will turn down TenuringThreshold ? >> I set TenuringThreshold to 10. and found that the full gc is less frequent and every full gc collect less garbage. it seems the parameter have the effect. But I found the load average is up and young gc time is much more than before. And the response time is also increased. >> I guess that there are more objects in young generation. so it have to do more young gc. although they are garbage, it's not a good idea to collect them too early. because ParNewGC will stop the world, the response time is increasing. >> So I adjust TenuringThreshold to 3 and there are no remarkable difference. >> maybe I should use object pool for my application because it use many large temporary objects. >> Another question, when my application runs for about 1-2 days. I found the response time increases. I guess it's the problem of large young generation. >> in the beginning, the total memory usage is about 4-5GB and young generation is 100-200MB, the rest is old generation. >> After running for days, the total memory usage is 8GB and young generation is about 2GB(I set new Ration 1:3) >> I am curious about the heap size adjusting. I found ?XX:MinHeapFreeRation and ?XX:MaxHeapFreeRation >> the default value is 40 and 70. 
>> the memory manage white paper says if the total heap free space is less than 40%, it will increase heap. if the free space is larger than 70%, it will decrease heap size.
>> But why I see the young generation is 200mb while old is 4gb. does the adjustment of young related to old generation?
>> I read in http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ young generation should be less than 512MB, is it correct?
>>
>>
>>
>> On Wed, Jan 11, 2012 at 1:23 AM, Srinivas Ramakrishna wrote:
>> I recommend Charlie's excellent book as well.
>>
>> To answer your question, yes, CMS + ParNew does use MaxTenuringThreshold (henceforth MTT),
>> but in order to allow objects to age you also need sufficiently large survivor spaces to hold
>> them for however long you wish, otherwise the adaptive tenuring policy will adjust the
>> "current" tenuring threshold so as to prevent overflow. That may be what you saw.
>> Check out the info printed by -XX:+PrintTenuringDistribution.
>>
>> -- ramki
>>
>> On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote:
>> hi all
>> I have an application that generating many large objects and then discard them. I found that full gc can free memory from 70% to 40%.
>> I want to let this objects in young generation longer. I found -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
>> But I found a blog that says MaxTenuringThreshold is not used in ParNewGC.
>> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it seems no difference.
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120111/1eafdc88/attachment-0001.html
-------------- next part --------------
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From kirk at kodewerk.com Tue Jan 10 23:18:04 2012
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Wed, 11 Jan 2012 08:18:04 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References:
Message-ID:

Hi Li LI,

I fear that you are off in the wrong direction. Resetting tenuring thresholds in this case will never work because they are being calculated to be 1. You're suggesting numbers greater than 1 and so 1 will always be used, which explains why you're not seeing a difference between runs. Having a calculated tenuring threshold set to 1 implies that the memory pool is too small. If a memory pool is too small, the only thing you can do to fix that is to make it bigger. In this case, your young generational space (as I've indicated in previous postings) is too small. Also, the cost of a young generational collection is dependent mostly upon the number of surviving objects, not dead ones. Pooling temporary objects will only make the problem worse. If I recall your flag settings, you've set NewSize to a fixed value. That setting will override the new ratio setting. You also set Xmx==Xms and that also overrides adaptive sizing. Also you are using CMS, which is inherently not size adaptable.

Last point, and this is the biggest one. The numbers you're publishing right now suggest that you have a memory leak. There is no way you're going to stabilize the memory/GC behaviour with a memory leak. Things will get progressively worse as you consume more and more heap. This is a blocking issue to all tuning efforts. It is the first thing that must be dealt with.

To find the leak:
Identify the leaking object using VisualVM's memory profiler with generational counts and collect allocation stack traces turned on. Sort the profile by generational counts. When you've identified the leaking object (the domain class with the highest and always increasing generational count), take an allocation stack trace snapshot and a heap dump. The heap dump should be loaded into a heap walker. Use the knowledge gained from generational counts to inspect the linkages for the leaking object and then use that information in the allocation stack traces to identify causal execution paths for creation. After that, it's into application code to determine the fix.

Kind regards,
Kirk Pepperdine

On 2012-01-11, at 5:45 AM, Li Li wrote:

> if the young generation is too small that it can't afford space for survivors and it have to throw them to old generation. and jvm found this, it will turn down TenuringThreshold ?
> I set TenuringThreshold to 10. and found that the full gc is less frequent and every full gc collect less garbage. it seems the parameter have the effect. But I found the load average is up and young gc time is much more than before. And the response time is also increased.
> I guess that there are more objects in young generation. so it have to do more young gc. although they are garbage, it's not a good idea to collect them too early. because ParNewGC will stop the world, the response time is increasing.
> So I adjust TenuringThreshold to 3 and there are no remarkable difference.
> maybe I should use object pool for my application because it use many large temporary objects.
> Another question, when my application runs for about 1-2 days. I found the response time increases. I guess it's the problem of large young generation.
> in the beginning, the total memory usage is about 4-5GB and young generation is 100-200MB, the rest is old generation.
> After running for days, the total memory usage is 8GB and young generation is about 2GB (I set NewRatio 1:3)
> I am curious about the heap size adjusting. I found -XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio
> the default value is 40 and 70. the memory manage white paper says if the total heap free space is less than 40%, it will increase heap. if the free space is larger than 70%, it will decrease heap size.
> But why I see the young generation is 200mb while old is 4gb. does the adjustment of young related to old generation?
> I read in http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ young generation should be less than 512MB, is it correct?
>
>
>
> On Wed, Jan 11, 2012 at 1:23 AM, Srinivas Ramakrishna wrote:
> I recommend Charlie's excellent book as well.
>
> To answer your question, yes, CMS + ParNew does use MaxTenuringThreshold (henceforth MTT),
> but in order to allow objects to age you also need sufficiently large survivor spaces to hold
> them for however long you wish, otherwise the adaptive tenuring policy will adjust the
> "current" tenuring threshold so as to prevent overflow. That may be what you saw.
> Check out the info printed by -XX:+PrintTenuringDistribution.
>
> -- ramki
>
> On Tue, Jan 10, 2012 at 1:31 AM, Li Li wrote:
> hi all
> I have an application that generating many large objects and then discard them. I found that full gc can free memory from 70% to 40%.
> I want to let this objects in young generation longer. I found -XX:MaxTenuringThreshold and -XX:PretenureSizeThreshold.
> But I found a blog that says MaxTenuringThreshold is not used in ParNewGC.
> And I use ParNewGC+CMS. I tried to set MaxTenuringThreshold=10, but it seems no difference.
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120111/20772300/attachment.html
-------------- next part --------------
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From jon.masamitsu at oracle.com Wed Jan 11 08:54:28 2012
From: jon.masamitsu at oracle.com (Jon Masamitsu)
Date: Wed, 11 Jan 2012 08:54:28 -0800
Subject: Promotion failures: indication of CMS fragmentation?
In-Reply-To:
References: <4EF9FCAC.3030208@oracle.com> <4F06A270.3010701@oracle.com>
Message-ID: <4F0DBEC4.7040907@oracle.com>

Taras,

> I assume that the large sizes for the promotion failures during ParNew
> are confirming that eliminating large array allocations might help
> here. Do you agree?

I agree that eliminating the large array allocation will help, but you are still having promotion failures when the allocation size is small (I think it was 1026). That says that you are filling up the old (CMS) generation faster than the GC can collect it. The large arrays are aggravating the problem but are not necessarily the cause.

If these are still your heap sizes,

> -Xms5g
> -Xmx5g
> -Xmn400m

Start by increasing the young gen size, as may already have been suggested. If you have a test setup where you can experiment, try doubling the young gen size to start.

If you have not seen this, it might be helpful.
http://blogs.oracle.com/jonthecollector/entry/what_the_heck_s_a > I'm not sure what to make of the concurrent mode The concurrent mode failure is a consequence of the promotion failure. Once the promotion failure happens the concurrent mode failure is inevitable. Jon > . On 1/11/2012 3:00 AM, Taras Tielkes wrote: > Hi Jon, > > We've added the -XX:+PrintPromotionFailure flag to our production > application yesterday. > The application is running on 4 (homogenous) nodes. > > In the gc logs of 3 out of 4 nodes, I've found a promotion failure > event during ParNew: > > node-002 > ------- > 2012-01-11T09:39:14.353+0100: 102975.594: [GC 102975.594: [ParNew: > 357592K->23382K(368640K), 0.0298150 secs] > 3528237K->3194027K(5201920K), 0.0300860 secs] [Times: user=0.22 > sys=0.01, real=0.03 secs] > 2012-01-11T09:39:17.489+0100: 102978.730: [GC 102978.730: [ParNew: > 351062K->39795K(368640K), 0.0401170 secs] > 3521707K->3210439K(5201920K), 0.0403800 secs] [Times: user=0.28 > sys=0.00, real=0.04 secs] > 2012-01-11T09:39:19.869+0100: 102981.110: [GC 102981.110: [ParNew (4: > promotion failure size = 4281460) (promotion failed): > 350134K->340392K(368640K), 0.1378780 secs]102981.248: [CMS: > 3181346K->367952K(4833280K), 4.7036230 secs] 3520778K > ->367952K(5201920K), [CMS Perm : 116828K->116809K(262144K)], 4.8418590 > secs] [Times: user=5.10 sys=0.00, real=4.84 secs] > 2012-01-11T09:39:25.264+0100: 102986.504: [GC 102986.505: [ParNew: > 327680K->40960K(368640K), 0.0415470 secs] 695632K->419560K(5201920K), > 0.0418770 secs] [Times: user=0.26 sys=0.01, real=0.04 secs] > 2012-01-11T09:39:26.035+0100: 102987.276: [GC 102987.276: [ParNew: > 368640K->40960K(368640K), 0.0925740 secs] 747240K->481611K(5201920K), > 0.0928570 secs] [Times: user=0.54 sys=0.01, real=0.09 secs] > > node-003 > ------- > 2012-01-10T17:48:28.369+0100: 45929.686: [GC 45929.686: [ParNew: > 346950K->21342K(368640K), 0.0333090 secs] > 2712364K->2386756K(5201920K), 0.0335740 secs] [Times: user=0.23 > sys=0.00, 
real=0.03 secs] > 2012-01-10T17:48:32.933+0100: 45934.250: [GC 45934.250: [ParNew: > 345070K->32211K(368640K), 0.0369260 secs] > 2710484K->2397625K(5201920K), 0.0372380 secs] [Times: user=0.25 > sys=0.00, real=0.04 secs] > 2012-01-10T17:48:34.201+0100: 45935.518: [GC 45935.518: [ParNew (0: > promotion failure size = 1266955) (promotion failed): > 359891K->368640K(368640K), 0.1395570 secs]45935.658: [CMS: > 2387690K->348838K(4833280K), 4.5680670 secs] 2725305K->3 > 48838K(5201920K), [CMS Perm : 116740K->116715K(262144K)], 4.7079640 > secs] [Times: user=5.03 sys=0.00, real=4.71 secs] > 2012-01-10T17:48:40.572+0100: 45941.889: [GC 45941.889: [ParNew: > 327680K->40960K(368640K), 0.0486510 secs] 676518K->405004K(5201920K), > 0.0489930 secs] [Times: user=0.26 sys=0.00, real=0.05 secs] > 2012-01-10T17:48:41.959+0100: 45943.276: [GC 45943.277: [ParNew: > 360621K->40960K(368640K), 0.0833240 secs] 724666K->479857K(5201920K), > 0.0836120 secs] [Times: user=0.48 sys=0.01, real=0.08 secs] > > node-004 > ------- > 2012-01-10T18:59:02.338+0100: 50163.649: [GC 50163.649: [ParNew: > 358429K->40960K(368640K), 0.0629910 secs] > 3569331K->3283304K(5201920K), 0.0632710 secs] [Times: user=0.40 > sys=0.02, real=0.06 secs] > 2012-01-10T18:59:08.137+0100: 50169.448: [GC 50169.448: [ParNew: > 368640K->40960K(368640K), 0.0819780 secs] > 3610984K->3323445K(5201920K), 0.0822430 secs] [Times: user=0.40 > sys=0.00, real=0.08 secs] > 2012-01-10T18:59:13.945+0100: 50175.256: [GC 50175.256: [ParNew (6: > promotion failure size = 2788662) (promotion failed): > 367619K->364864K(368640K), 0.2024350 secs]50175.458: [CMS: > 3310044K->330922K(4833280K), 4.5104170 secs] > 3650104K->330922K(5201920K), [CMS Perm : 116747K->116728K(262144K)], > 4.7132220 secs] [Times: user=4.99 sys=0.01, real=4.72 secs] > 2012-01-10T18:59:20.539+0100: 50181.850: [GC 50181.850: [ParNew: > 327680K->37328K(368640K), 0.0270660 secs] 658602K->368251K(5201920K), > 0.0273800 secs] [Times: user=0.15 sys=0.00, real=0.02 secs] > 
2012-01-10T18:59:25.183+0100: 50186.494: [GC 50186.494: [ParNew: > 363504K->15099K(368640K), 0.0388710 secs] 694427K->362063K(5201920K), > 0.0391790 secs] [Times: user=0.18 sys=0.00, real=0.04 secs] > > On a fourth node, I've found a different event: promotion failure > during CMS, with a much smaller size: > > node-001 > ------- > 2012-01-10T18:30:07.471+0100: 48428.764: [GC 48428.764: [ParNew: > 354039K->40960K(368640K), 0.0667340 secs] > 3609061K->3318149K(5201920K), 0.0670150 secs] [Times: user=0.37 > sys=0.01, real=0.06 secs] > 2012-01-10T18:30:08.706+0100: 48429.999: [GC 48430.000: [ParNew: > 368640K->40960K(368640K), 0.2586390 secs] > 3645829K->3417273K(5201920K), 0.2589050 secs] [Times: user=0.73 > sys=0.13, real=0.26 secs] > 2012-01-10T18:30:08.974+0100: 48430.267: [GC [1 CMS-initial-mark: > 3376313K(4833280K)] 3427492K(5201920K), 0.0743900 secs] [Times: > user=0.07 sys=0.00, real=0.07 secs] > 2012-01-10T18:30:09.049+0100: 48430.342: [CMS-concurrent-mark-start] > 2012-01-10T18:30:10.009+0100: 48431.302: [CMS-concurrent-mark: > 0.933/0.960 secs] [Times: user=4.59 sys=0.13, real=0.96 secs] > 2012-01-10T18:30:10.009+0100: 48431.302: [CMS-concurrent-preclean-start] > 2012-01-10T18:30:10.089+0100: 48431.382: [CMS-concurrent-preclean: > 0.060/0.080 secs] [Times: user=0.34 sys=0.02, real=0.08 secs] > 2012-01-10T18:30:10.089+0100: 48431.382: > [CMS-concurrent-abortable-preclean-start] > 2012-01-10T18:30:10.586+0100: 48431.880: [GC 48431.880: [ParNew: > 368640K->40960K(368640K), 0.1214420 secs] > 3744953K->3490912K(5201920K), 0.1217480 secs] [Times: user=0.66 > sys=0.05, real=0.12 secs] > 2012-01-10T18:30:12.785+0100: 48434.078: > [CMS-concurrent-abortable-preclean: 2.526/2.696 secs] [Times: > user=10.72 sys=0.48, real=2.70 secs] > 2012-01-10T18:30:12.787+0100: 48434.081: [GC[YG occupancy: 206521 K > (368640 K)]2012-01-10T18:30:12.788+0100: 48434.081: [GC 48434.081: > [ParNew (promotion failure size = 1026) (promotion failed): > 206521K->206521K(368640K), 0.1667280 
secs] > 3656474K->3696197K(5201920K), 0.1670260 secs] [Times: user=0.48 > sys=0.04, real=0.17 secs] > 48434.248: [Rescan (parallel) , 0.1972570 secs]48434.445: [weak refs > processing, 0.0011570 secs]48434.446: [class unloading, 0.0277750 > secs]48434.474: [scrub symbol& string tables, 0.0088370 secs] [1 > CMS-remark: 3489675K(4833280K)] 36961 > 97K(5201920K), 0.4088040 secs] [Times: user=1.62 sys=0.05, real=0.41 secs] > 2012-01-10T18:30:13.197+0100: 48434.490: [CMS-concurrent-sweep-start] > 2012-01-10T18:30:17.427+0100: 48438.720: [Full GC 48438.720: > [CMS2012-01-10T18:30:21.636+0100: 48442.929: [CMS-concurrent-sweep: > 7.949/8.439 secs] [Times: user=15.89 sys=1.57, real=8.44 secs] > (concurrent mode failure): 2505348K->334385K(4833280K), 8.6109050 > secs] 2873988K->334385K(5201920K), [CMS Perm : > 117788K->117762K(262144K)], 8.6112520 secs] [Times: user=8.61 > sys=0.00, real=8.61 secs] > 2012-01-10T18:30:26.716+0100: 48448.009: [GC 48448.010: [ParNew: > 327680K->40960K(368640K), 0.0407520 secs] 662065K->394656K(5201920K), > 0.0411550 secs] [Times: user=0.25 sys=0.00, real=0.04 secs] > 2012-01-10T18:30:28.825+0100: 48450.118: [GC 48450.118: [ParNew: > 368639K->40960K(368640K), 0.0662780 secs] 722335K->433355K(5201920K), > 0.0666190 secs] [Times: user=0.35 sys=0.00, real=0.06 secs] > > I assume that the large sizes for the promotion failures during ParNew > are confirming that eliminating large array allocations might help > here. Do you agree? > I'm not sure what to make of the concurrent mode failure. > > Thanks in advance for any suggestions, > Taras > > On Fri, Jan 6, 2012 at 8:27 AM, Jon Masamitsu wrote: >> >> On 1/5/2012 3:32 PM, Taras Tielkes wrote: >>> Hi Jon, >>> >>> We've enabled the PrintPromotionFailure flag in our preprod >>> environment, but so far, no failures yet. >>> We know that the load we generate there is not representative. But >>> perhaps we'll catch something, given enough patience. 
>>> >>> The flag will also be enabled in our production environment next week >>> - so one way or the other, we'll get more diagnostic data soon. >>> I'll also do some allocation profiling of the application in isolation >>> - I know that there is abusive large byte[] and char[] allocation in >>> there. >>> >>> I've got two questions for now: >>> >>> 1) From googling around on the output to expect >>> (http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html), >>> I see that -XX:+PrintPromotionFailure will generate output like this: >>> ------- >>> 592.079: [ParNew (0: promotion failure size = 2698) (promotion >>> failed): 135865K->134943K(138240K), 0.1433555 secs] >>> ------- >>> In that example line, what does the "0" stand for? >> It's the index of the GC worker thread that experienced the promotion >> failure. >> >>> 2) Below is a snippet of (real) gc log from our production application: >>> ------- >>> 2011-12-30T22:42:12.684+0100: 2136581.585: [GC 2136581.585: [ParNew: >>> 345951K->40960K(368640K), 0.0676780 secs] >>> 3608692K->3323692K(5201920K), 0.0680220 secs] [Times: user=0.36 >>> sys=0.01, real=0.06 secs] >>> 2011-12-30T22:42:22.984+0100: 2136591.886: [GC 2136591.886: [ParNew: >>> 368640K->40959K(368640K), 0.0618880 secs] >>> 3651372K->3349928K(5201920K), 0.0622330 secs] [Times: user=0.31 >>> sys=0.00, real=0.06 secs] >>> 2011-12-30T22:42:23.052+0100: 2136591.954: [GC [1 CMS-initial-mark: >>> 3308968K(4833280K)] 3350041K(5201920K), 0.0377420 secs] [Times: >>> user=0.04 sys=0.00, real=0.04 secs] >>> 2011-12-30T22:42:23.090+0100: 2136591.992: [CMS-concurrent-mark-start] >>> 2011-12-30T22:42:24.076+0100: 2136592.978: [CMS-concurrent-mark: >>> 0.986/0.986 secs] [Times: user=2.05 sys=0.04, real=0.99 secs] >>> 2011-12-30T22:42:24.076+0100: 2136592.978: [CMS-concurrent-preclean-start] >>> 2011-12-30T22:42:24.099+0100: 2136593.000: [CMS-concurrent-preclean: >>> 0.021/0.023 secs] [Times: user=0.03 sys=0.00, real=0.02 secs] >>> 
2011-12-30T22:42:24.099+0100: 2136593.001: >>> [CMS-concurrent-abortable-preclean-start] >>> CMS: abort preclean due to time 2011-12-30T22:42:29.335+0100: >>> 2136598.236: [CMS-concurrent-abortable-preclean: 5.209/5.236 secs] >>> [Times: user=5.70 sys=0.23, real=5.23 secs] >>> 2011-12-30T22:42:29.340+0100: 2136598.242: [GC[YG occupancy: 123870 K >>> (368640 K)]2011-12-30T22:42:29.341+0100: 2136598.242: [GC 2136598.242: >>> [ParNew (promotion failed): 123870K->105466K(368640K), 7.4939280 secs] >>> 3432839K->3423755K(5201920 >>> K), 7.4942670 secs] [Times: user=9.08 sys=2.10, real=7.49 secs] >>> 2136605.737: [Rescan (parallel) , 0.0644050 secs]2136605.801: [weak >>> refs processing, 0.0034280 secs]2136605.804: [class unloading, >>> 0.0289480 secs]2136605.833: [scrub symbol& string tables, 0.0093940 >>> secs] [1 CMS-remark: 3318289K(4833280K >>> )] 3423755K(5201920K), 7.6077990 secs] [Times: user=9.54 sys=2.10, >>> real=7.61 secs] >>> 2011-12-30T22:42:36.949+0100: 2136605.850: [CMS-concurrent-sweep-start] >>> 2011-12-30T22:42:45.006+0100: 2136613.907: [Full GC 2136613.908: >>> [CMS2011-12-30T22:42:51.038+0100: 2136619.939: [CMS-concurrent-sweep: >>> 12.231/14.089 secs] [Times: user=15.14 sys=5.36, real=14.08 secs] >>> (concurrent mode failure): 3141235K->291853K(4833280K), 10.2906040 >>> secs] 3491471K->291853K(5201920K), [CMS Perm : >>> 121784K->121765K(262144K)], 10.2910040 secs] [Times: user=10.29 >>> sys=0.00, real=10.29 secs] >>> 2011-12-30T22:42:56.281+0100: 2136625.183: [GC 2136625.183: [ParNew: >>> 327680K->25286K(368640K), 0.0287220 secs] 619533K->317140K(5201920K), >>> 0.0291610 secs] [Times: user=0.13 sys=0.00, real=0.03 secs] >>> 2011-12-30T22:43:10.516+0100: 2136639.418: [GC 2136639.418: [ParNew: >>> 352966K->26737K(368640K), 0.0586400 secs] 644820K->338758K(5201920K), >>> 0.0589640 secs] [Times: user=0.31 sys=0.00, real=0.06 secs] >>> ------- >>> >>> In this case I don't know how to interpret the output. 
>>> a) There's a promotion failure that took 7.49 secs >> This is the time it took to attempt the minor collection (ParNew) and to >> do recovery >> from the failure. >> >>> b) There's a full GC that took 14.08 secs >>> c) There's a concurrent mode failure that took 10.29 secs >> Not sure about b) and c) because the output is mixed up with the >> concurrent-sweep >> output but I think the "concurrent mode failure" message is part of the >> "Full GC" >> message. My guess is that the 10.29 is the time for the Full GC and the >> 14.08 >> maybe is part of the concurrent-sweep message. Really hard to be sure. >> >> Jon >>> How are these events, and their (real) times related to each other? >>> >>> Thanks in advance, >>> Taras >>> >>> On Tue, Dec 27, 2011 at 6:13 PM, Jon Masamitsu wrote: >>>> Taras, >>>> >>>> PrintPromotionFailure seems like it would go a long >>>> way to identify the root of your promotion failures (or >>>> at least eliminating some possible causes). I think it >>>> would help focus the discussion if you could send >>>> the result of that experiment early. >>>> >>>> Jon >>>> >>>> On 12/27/2011 5:07 AM, Taras Tielkes wrote: >>>>> Hi, >>>>> >>>>> We're running an application with the CMS/ParNew collectors that is >>>>> experiencing occasional promotion failures. >>>>> Environment is Linux 2.6.18 (x64), JVM is 1.6.0_29 in server mode. >>>>> I've listed the specific JVM options used below (a). >>>>> >>>>> The application is deployed across a handful of machines, and the >>>>> promotion failures are fairly uniform across those. >>>>> >>>>> The first kind of failure we observe is a promotion failure during >>>>> ParNew collection, I've included a snipped from the gc log below (b). >>>>> The second kind of failure is a concurrrent mode failure (perhaps >>>>> triggered by the same cause), see (c) below. >>>>> The frequency (after running for a some weeks) is approximately once >>>>> per day. This is bearable, but obviously we'd like to improve on this. 
>>>>> >>>>> Apart from high-volume request handling (which allocates a lot of >>>>> small objects), the application also runs a few dozen background >>>>> threads that download and process XML documents, typically in the 5-30 >>>>> MB range. >>>>> A known deficiency in the existing code is that the XML content is >>>>> copied twice before processing (once to a byte[], and later again to a >>>>> String/char[]). >>>>> Given that a 30 MB XML stream will result in a 60 MB >>>>> java.lang.String/char[], my suspicion is that these big array >>>>> allocations are causing us to run into the CMS fragmentation issue. >>>>> >>>>> My questions are: >>>>> 1) Does the data from the GC logs provide sufficient evidence to >>>>> conclude that CMS fragmentation is the cause of the promotion failure? >>>>> 2) If not, what's the next step of investigating the cause? >>>>> 3) We're planning to at least add -XX:+PrintPromotionFailure to get a >>>>> feeling for the size of the objects that fail promotion. >>>>> Overall, it seem that -XX:PrintFLSStatistics=1 is actually the only >>>>> reliable approach to diagnose CMS fragmentation. Is this indeed the >>>>> case? 
>>>>> >>>>> Thanks in advance, >>>>> Taras >>>>> >>>>> a) Current JVM options: >>>>> -------------------------------- >>>>> -server >>>>> -Xms5g >>>>> -Xmx5g >>>>> -Xmn400m >>>>> -XX:PermSize=256m >>>>> -XX:MaxPermSize=256m >>>>> -XX:+PrintGCTimeStamps >>>>> -verbose:gc >>>>> -XX:+PrintGCDateStamps >>>>> -XX:+PrintGCDetails >>>>> -XX:SurvivorRatio=8 >>>>> -XX:+UseConcMarkSweepGC >>>>> -XX:+UseParNewGC >>>>> -XX:+DisableExplicitGC >>>>> -XX:+UseCMSInitiatingOccupancyOnly >>>>> -XX:+CMSClassUnloadingEnabled >>>>> -XX:+CMSScavengeBeforeRemark >>>>> -XX:CMSInitiatingOccupancyFraction=68 >>>>> -Xloggc:gc.log >>>>> -------------------------------- >>>>> >>>>> b) Promotion failure during ParNew >>>>> -------------------------------- >>>>> 2011-12-08T18:14:40.966+0100: 219729.868: [GC 219729.868: [ParNew: >>>>> 368640K->40959K(368640K), 0.0693460 secs] >>>>> 3504917K->3195098K(5201920K), 0.0696500 secs] [Times: user=0.39 >>>>> sys=0.01, real=0.07 secs] >>>>> 2011-12-08T18:14:43.778+0100: 219732.679: [GC 219732.679: [ParNew: >>>>> 368639K->31321K(368640K), 0.0511400 secs] >>>>> 3522778K->3198316K(5201920K), 0.0514420 secs] [Times: user=0.28 >>>>> sys=0.00, real=0.05 secs] >>>>> 2011-12-08T18:14:46.945+0100: 219735.846: [GC 219735.846: [ParNew: >>>>> 359001K->18694K(368640K), 0.0272970 secs] >>>>> 3525996K->3185690K(5201920K), 0.0276080 secs] [Times: user=0.19 >>>>> sys=0.00, real=0.03 secs] >>>>> 2011-12-08T18:14:49.036+0100: 219737.938: [GC 219737.938: [ParNew >>>>> (promotion failed): 338813K->361078K(368640K), 0.1321200 >>>>> secs]219738.070: [CMS: 3167747K->434291K(4833280K), 4.8881570 secs] >>>>> 3505808K->434291K >>>>> (5201920K), [CMS Perm : 116893K->116883K(262144K)], 5.0206620 secs] >>>>> [Times: user=5.24 sys=0.00, real=5.02 secs] >>>>> 2011-12-08T18:14:54.721+0100: 219743.622: [GC 219743.623: [ParNew: >>>>> 327680K->40960K(368640K), 0.0949460 secs] 761971K->514584K(5201920K), >>>>> 0.0952820 secs] [Times: user=0.52 sys=0.04, real=0.10 secs] >>>>> 
2011-12-08T18:14:55.580+0100: 219744.481: [GC 219744.482: [ParNew: >>>>> 368640K->40960K(368640K), 0.1299190 secs] 842264K->625681K(5201920K), >>>>> 0.1302190 secs] [Times: user=0.72 sys=0.01, real=0.13 secs] >>>>> 2011-12-08T18:14:58.050+0100: 219746.952: [GC 219746.952: [ParNew: >>>>> 368640K->40960K(368640K), 0.0870940 secs] 953361K->684121K(5201920K), >>>>> 0.0874110 secs] [Times: user=0.48 sys=0.01, real=0.09 secs] >>>>> -------------------------------- >>>>> >>>>> c) Promotion failure during CMS >>>>> -------------------------------- >>>>> 2011-12-14T08:29:26.628+0100: 703015.530: [GC 703015.530: [ParNew: >>>>> 357228K->40960K(368640K), 0.0525110 secs] >>>>> 3603068K->3312743K(5201920K), 0.0528120 secs] [Times: user=0.37 >>>>> sys=0.00, real=0.05 secs] >>>>> 2011-12-14T08:29:28.864+0100: 703017.766: [GC 703017.766: [ParNew: >>>>> 366075K->37119K(368640K), 0.0479780 secs] >>>>> 3637859K->3317662K(5201920K), 0.0483090 secs] [Times: user=0.24 >>>>> sys=0.01, real=0.05 secs] >>>>> 2011-12-14T08:29:29.553+0100: 703018.454: [GC 703018.455: [ParNew: >>>>> 364792K->40960K(368640K), 0.0421740 secs] >>>>> 3645334K->3334944K(5201920K), 0.0424810 secs] [Times: user=0.30 >>>>> sys=0.00, real=0.04 secs] >>>>> 2011-12-14T08:29:29.600+0100: 703018.502: [GC [1 CMS-initial-mark: >>>>> 3293984K(4833280K)] 3335025K(5201920K), 0.0272490 secs] [Times: >>>>> user=0.02 sys=0.00, real=0.03 secs] >>>>> 2011-12-14T08:29:29.628+0100: 703018.529: [CMS-concurrent-mark-start] >>>>> 2011-12-14T08:29:30.718+0100: 703019.620: [GC 703019.620: [ParNew: >>>>> 368640K->40960K(368640K), 0.0836690 secs] >>>>> 3662624K->3386039K(5201920K), 0.0839690 secs] [Times: user=0.50 >>>>> sys=0.01, real=0.08 secs] >>>>> 2011-12-14T08:29:30.827+0100: 703019.729: [CMS-concurrent-mark: >>>>> 1.108/1.200 secs] [Times: user=6.83 sys=0.23, real=1.20 secs] >>>>> 2011-12-14T08:29:30.827+0100: 703019.729: [CMS-concurrent-preclean-start] >>>>> 2011-12-14T08:29:30.938+0100: 703019.840: [CMS-concurrent-preclean: >>>>> 
0.093/0.111 secs] [Times: user=0.48 sys=0.02, real=0.11 secs] >>>>> 2011-12-14T08:29:30.938+0100: 703019.840: >>>>> [CMS-concurrent-abortable-preclean-start] >>>>> 2011-12-14T08:29:32.337+0100: 703021.239: >>>>> [CMS-concurrent-abortable-preclean: 1.383/1.399 secs] [Times: >>>>> user=6.68 sys=0.27, real=1.40 secs] >>>>> 2011-12-14T08:29:32.343+0100: 703021.244: [GC[YG occupancy: 347750 K >>>>> (368640 K)]2011-12-14T08:29:32.343+0100: 703021.244: [GC 703021.244: >>>>> [ParNew (promotion failed): 347750K->347750K(368640K), 9.8729020 secs] >>>>> 3692829K->3718580K(5201920K), 9.8732380 secs] [Times: user=12.00 >>>>> sys=2.58, real=9.88 secs] >>>>> 703031.118: [Rescan (parallel) , 0.2826110 secs]703031.400: [weak refs >>>>> processing, 0.0014780 secs]703031.402: [class unloading, 0.0176610 >>>>> secs]703031.419: [scrub symbol& string tables, 0.0094960 secs] [1 CMS >>>>> -remark: 3370830K(4833280K)] 3718580K(5201920K), 10.1916910 secs] >>>>> [Times: user=13.73 sys=2.59, real=10.19 secs] >>>>> 2011-12-14T08:29:42.535+0100: 703031.436: [CMS-concurrent-sweep-start] >>>>> 2011-12-14T08:29:42.591+0100: 703031.493: [Full GC 703031.493: >>>>> [CMS2011-12-14T08:29:48.616+0100: 703037.518: [CMS-concurrent-sweep: >>>>> 6.046/6.082 secs] [Times: user=6.18 sys=0.01, real=6.09 secs] >>>>> (concurrent mode failure): 3370829K->433437K(4833280K), 10.9594300 >>>>> secs] 3739469K->433437K(5201920K), [CMS Perm : >>>>> 121702K->121690K(262144K)], 10.9597540 secs] [Times: user=10.95 >>>>> sys=0.00, real=10.96 secs] >>>>> 2011-12-14T08:29:53.997+0100: 703042.899: [GC 703042.899: [ParNew: >>>>> 327680K->40960K(368640K), 0.0799960 secs] 761117K->517836K(5201920K), >>>>> 0.0804100 secs] [Times: user=0.46 sys=0.00, real=0.08 secs] >>>>> 2011-12-14T08:29:54.649+0100: 703043.551: [GC 703043.551: [ParNew: >>>>> 368640K->40960K(368640K), 0.0784460 secs] 845516K->557872K(5201920K), >>>>> 0.0787920 secs] [Times: user=0.40 sys=0.01, real=0.08 secs] >>>>> 2011-12-14T08:29:56.418+0100: 703045.320: [GC 
703045.320: [ParNew: >>>>> 368640K->40960K(368640K), 0.0784040 secs] 885552K->603017K(5201920K), >>>>> 0.0787630 secs] [Times: user=0.41 sys=0.01, real=0.07 secs] >>>>> -------------------------------- >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From John.Coomes at oracle.com Wed Jan 11 17:16:51 2012 From: John.Coomes at oracle.com (John Coomes) Date: Wed, 11 Jan 2012 17:16:51 -0800 Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces In-Reply-To: <4F0CD061.50408@oracle.com> References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com> <4F0CB06D.2030008@oracle.com> <4F0CD061.50408@oracle.com> Message-ID: <20238.13443.469295.34166@oracle.com> Tony Printezis (tony.printezis at oracle.com) wrote: > Bengt, > > Hi, thanks for looking at it! See inline. 
>
> On 01/10/2012 04:41 PM, Bengt Rutisson wrote:
> > ...
> > In g1OopClosures.hpp you swapped the lines 151 and 152, which makes
> > it look like this:
> >
> > 149 G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState*
> > par_scan_state,
> > 150 ReferenceProcessor* rp) :
> > 151 G1ParCopyHelper(g1, par_scan_state, &_scanner),
> > 152 _scanner(g1, par_scan_state, rp) {
> >
> > I guess you want the call to the super class constructor before other
> > initialization. To me it looks strange that the _scanner is passed to
> > the super class, but is now actually not initialized until after the
> > call to the super constructor. I would have preferred the order as it
> > was before. I guess the new order works fine as long as the super
> > constructor never tries to use the _scanner.
>
> You're right. I wanted the constructor to be called first but I was
> clearly a bit careless and I missed the dependency. Thanks for looking
> at this carefully! Even though the dependency is benign here (the ref to
> _scanner is only stored locally in the super class) I think we should
> avoid any future surprises. So I'll undo the change.

I learned this the hard way: the order you write it in the source
doesn't affect the initialization order. The initialization order is
defined by the declaration order (i.e., superclasses are initialized
first in the order they appear in the class declaration, followed by
data members in the order they're declared in the class).

So the c++ compiler will call the G1ParCopyHelper ctor first, no
matter how you write it :-).
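To make the rule concrete, here is a minimal stand-alone sketch. The class names (Scanner, Helper, Closure) and the init_log() helper are hypothetical stand-ins invented for illustration, not the HotSpot classes; the sketch only assumes standard C++ initialization rules. The member is deliberately listed before the base class in the initializer list, and the base class constructor still runs first (compilers typically warn about the mismatch, e.g. via -Wreorder).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which constructors run (illustration only).
static std::vector<std::string>& init_log() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for the _scanner member's type.
struct Scanner {
  Scanner() { init_log().push_back("Scanner"); }
};

// Hypothetical stand-in for the helper base class.
struct Helper {
  explicit Helper(Scanner* s) : _scanner_ref(s) {
    // Storing the pointer is fine; dereferencing it here would touch an
    // object whose constructor has not run yet.
    assert(s != nullptr);
    init_log().push_back("Helper");
  }
  Scanner* _scanner_ref;
};

// Hypothetical stand-in for the closure class.
struct Closure : public Helper {
  // The member is listed first on purpose; the base class is still
  // constructed first because bases always precede data members.
  Closure() : _scanner(), Helper(&_scanner) {
    init_log().push_back("Closure");
  }
  Scanner _scanner;
};

// Builds one Closure and returns the observed construction order.
std::vector<std::string> construction_order() {
  init_log().clear();
  Closure c;
  return init_log();
}
```

Calling construction_order() reports Helper, then Scanner, then Closure, regardless of how the initializer list is written.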
-John

From bengt.rutisson at oracle.com Wed Jan 11 23:53:16 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 12 Jan 2012 08:53:16 +0100
Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
In-Reply-To: <20238.13443.469295.34166@oracle.com>
References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com> <4F0CB06D.2030008@oracle.com> <4F0CD061.50408@oracle.com> <20238.13443.469295.34166@oracle.com>
Message-ID: <4F0E916C.7010801@oracle.com>

John,

Inline...

On 2012-01-12 02:16, John Coomes wrote:
> Tony Printezis (tony.printezis at oracle.com) wrote:
>> Bengt,
>>
>> Hi, thanks for looking at it! See inline.
>>
>> On 01/10/2012 04:41 PM, Bengt Rutisson wrote:
>>> ...
>>> In g1OopClosures.hpp you swapped the lines 151 and 152, which makes
>>> it look like this:
>>>
>>> 149 G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState*
>>> par_scan_state,
>>> 150 ReferenceProcessor* rp) :
>>> 151 G1ParCopyHelper(g1, par_scan_state, &_scanner),
>>> 152 _scanner(g1, par_scan_state, rp) {
>>>
>>> I guess you want the call to the super class constructor before other
>>> initialization. To me it looks strange that the _scanner is passed to
>>> the super class, but is now actually not initialized until after the
>>> call to the super constructor. I would have preferred the order as it
>>> was before. I guess the new order works fine as long as the super
>>> constructor never tries to use the _scanner.
>> You're right. I wanted the constructor to be called first but I was
>> clearly a bit careless and I missed the dependency. Thanks for looking
>> at this carefully! Even though the dependency is benign here (the ref to
>> _scanner is only stored locally in the super class) I think we should
>> avoid any future surprises. So I'll undo the change.
> I learned this the hard way: the order you write it in the source
> doesn't affect the initialization order. The initialization order is
> defined by the declaration order (i.e., superclasses are initialized
> first in the order they appear in the class declaration, followed by
> data members in the order they're declared in the class).
>
> So the c++ compiler will call the G1ParCopyHelper ctor first, no
> matter how you write it :-).

That's good to know! Thanks for pointing this out.

This makes the dependency between G1ParCopyHelper and G1ParCopyClosure
kind of scary to me. I agree with Tony that it is not an issue as it is
right now, but if anybody in the future tries to access the _scanner
field in the constructor of G1ParCopyHelper we are in trouble.

I don't think we need to fix it right away, but maybe we should think
about a fix for it. As far as I can tell the G1ParCopyHelper is only
needed to get around some template issues with G1ParCopyClosure. Maybe
there is a cleaner way to solve that?

Bengt

>
> -John

From john.cuthbertson at oracle.com Thu Jan 12 00:26:25 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Thu, 12 Jan 2012 00:26:25 -0800
Subject: RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking
In-Reply-To: <4EF2127E.5050809@oracle.com>
References: <4E8A40BE.9020800@oracle.com> <4EC2B317.3000006@oracle.com> <4ED38788.4010106@oracle.com> <4EF0DEF9.30306@oracle.com> <4EF1AF4E.80107@oracle.com> <4EF2127E.5050809@oracle.com>
Message-ID: <4F0E9931.9070303@oracle.com>

Hi Everyone,

The latest incarnation of these changes can be found at:
http://cr.openjdk.java.net/~johnc/6484965/webrev.3/

The changes in this version include:

* Conditionally using a lock so that the output of the verification
  closure executed by different threads does not interfere with each
  other (suggested by Bengt).
* Merging up to the latest hotspot-gc tip (including Tony's marking
  changes).
  This involved changing the evacuation failure code and adding a
  suitable mark/count routine for use in ConcurrentMark::grayRoot(). I
  also removed the counting changes from code that has been made
  obsolete as a result of Tony's marking changes.

Testing: a few runs of the GC test suite with low marking thresholds (2
and 10%) with and without verification, and jprt.

Thanks,

JohnC

On 12/21/2011 9:08 AM, John Cuthbertson wrote:
> Hi Bengt,
>
> That's a good observation. I guess it is possible but I haven't seen
> it in practice (though I was typically only using 4 threads when
> debugging a verification failure). It won't do any harm so I'll add
> the locking.
>
> Thanks,
>
> JohnC
>
>
>
> On 12/21/2011 2:05 AM, Bengt Rutisson wrote:
>>
>> Hi John,
>>
>> Thanks for updating your fix! Looks good.
>>
>> One question:
>> In concurrentMark.cpp it seems to me that the
>> VerifyLiveObjectDataHRClosure could get the same kind of messed up
>> output that Tony just fixed with 7123165 for the VerifyLiveClosure in
>> heapRegion.cpp. There are several workers simultaneously doing the
>> verification, right? Is it worth adding the same kind of locking that
>> Tony added?
>>
>> Bengt
>>
>> On 2011-12-20 20:16, John Cuthbertson wrote:
>>> Hi Bengt,
>>>
>>> As I mentioned earlier - thanks for the code review. I've applied
>>> your suggestions, merged with the latest changeset in
>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int <->
>>> size_t issue you also mentioned, and retested using the GC test
>>> suite. A new webrev can be found at:
>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/
>>>
>>> Specific replies are inline.
>>>
>>> On 11/28/11 05:07, Bengt Rutisson wrote:
>>>>
>>>> John,
>>>>
>>>> A little late, but here are some comments on this webrev. I know
>>>> you have some more improvements to this change coming, but overall
>>>> I think it looks good. Most of my comments are just minor coding
>>>> style comments.
>>>>
>>>> Bengt
>>>>
>>>> concurrentMark.hpp
>>>>
>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() or
>>>> ConcurrentMark::unmark()? The comment you added is definitely
>>>> needed to understand what this method does. But it would be even
>>>> better if it was possible to get that from the method name itself.
>>>
>>> Done.
>>>
>>>> It seems like everywhere we use count_marked_bytes_for(int
>>>> worker_i) we almost directly use the array returned to index with
>>>> the heap region that we are interested in. How about wrapping all
>>>> of this in something like count_set_marked_bytes_for(int
>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int
>>>> worker_i, int hrs_index) ? That way the data structure does not
>>>> have to be exposed outside ConcurrentMark. It would mean that
>>>> ConcurrentMark::count_region() would have to take a worker_i value
>>>> instead of a marked_bytes_array.
>>>
>>> I did not do this. I embed the marked_bytes array for a worker into
>>> the CMTask for that worker to save a de-reference. This was one of
>>> the requests from the original code walk-through. Avoiding the
>>> de-reference in the CMTask::do_marking_step() shaves a couple of
>>> points off the marking time. I think your suggestion would reinstate
>>> the de-reference again and we would lose those few percentage points
>>> again.
>>>
>>>> If you don't agree with the suggestion above I would suggest to
>>>> change the name from count_marked_bytes_for() to
>>>> count_marked_bytes_array_for() since in every place that it is
>>>> being called the resulting value is stored in a local variable
>>>> called marked_bytes_array, which seems like a more informative name
>>>> to me.
>>>
>>> Done. I agree - the new name sounds better.
>>>
>>>> I think this comment:
>>>>
>>>> // As above - but we don't know the heap region containing the
>>>> // object and so have to supply it.
>>>> inline bool par_mark_and_count(oop obj, int worker_i);
>>>>
>>>> should be something like "we don't know the heap region containing
>>>> the object so we will have to look it up".
>>>>
>>>> Same thing here:
>>>>
>>>> // As above - but we don't have the heap region containing the
>>>> // object, so we have to supply it.
>>>> // Should *not* be called from parallel code.
>>>> inline bool mark_and_count(oop obj);
>>>>
>>>>
>>>
>>> Comments were changed to:
>>>
>>>
>>>> concurrentMark.cpp
>>>>
>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion()
>>>> anyway, could you please remove this unused code (1393-1397):
>>>>
>>>> /*
>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.",
>>>> obj_card_num - _bottom_card_num,
>>>> obj_last_card_num - _bottom_card_num);
>>>> */
>>>>
>>>>
>>>
>>> Done.
>>>
>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I
>>>> remember Tony mentioning that it won't be called. Do you still want
>>>> to keep the code?
>>>
>>> I removed the entire destructor - I don't see it being called in the
>>> experiments I've run.
>>>
>>>> FinalCountDataUpdateClosure::set_bit_for_region()
>>>> Probably not worth it, but would it make sense to add information
>>>> in a startsHumongous HeapRegion to be able to give you the last
>>>> continuesHumongous region? Since we know this when we set the
>>>> regions up it seems like a waste to have to iterate over the region
>>>> list to find it.
>>>
>>> If you read the original comment - the original author did not want
>>> to make any assumptions about the internal field values of the
>>> HeapRegions spanned by a humongous object and so used the loop
>>> technique. I think you are correct and I now use the information in
>>> the startsHumongous region to find the index of the last
>>> continuesHumongous region spanned by the H-obj.
>>>
>>>> G1ParFinalCountTask
>>>> To me it is a bit surprising that we mix in the verify code inside
>>>> this closure.
>>>> Would it be possible to extract this code out somehow?
>>>
>>> I did it this way to avoid another iteration over the heap regions.
>>> But it probably does make more sense to separate them and use
>>> another iteration to do the verify. Done.
>>>
>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on
>>>> doing?
>>>
>>> I removed the comment. I was thinking of doing this as fill_to_bytes
>>> is typically implemented using (a possibly specialized version of)
>>> memset. But it's probably not worth it in this case.
>>>
>>>> G1ParFinalCountTask::work()
>>>> Just for the record. I don't really like the way we have to set up
>>>> both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even
>>>> though we will only use them if we have VerifyDuringGC enabled. I
>>>> realize it is due to the scoping, but I still think it obstructs the
>>>> code flow and introduces unnecessary work. Unfortunately I don't
>>>> have a good suggestion for how to work around it.
>>>>
>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are
>>>> StackObjs I assume it is not possible to get around the issue with
>>>> a ResourceMark.
>>>
>>> Now that the verification is performed in a separate iteration of
>>> the heap regions there's no need to create the
>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here.
>>> Done. I have also removed the now-redundant Mux2HRClosure.
>>>
>>> Hopefully the new webrev addresses these comments.
>>>
>>> Thanks again for looking.
>>>
>>> JohnC
>>>
>>
>

From fancyerii at gmail.com Thu Jan 12 04:09:28 2012
From: fancyerii at gmail.com (Li Li)
Date: Thu, 12 Jan 2012 20:09:28 +0800
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To: <9462441C-C11B-4CDC-83BD-D78E2A1138AB@gmail.com>
References: <9462441C-C11B-4CDC-83BD-D78E2A1138AB@gmail.com>
Message-ID:

Yesterday we set the maxNewSize to 256mb, and it works as we expected.
But an hour ago there was a promotion failure and a concurrent mode
failure which cost 14s!
Could anyone explain the gc logs for me, or point me to any documents
explaining the gc log format?

1. Desired survivor size 3342336 bytes, new threshold 5 (max 5)
   it says survivor size is 3mb
2. 58282K->57602K(59072K), 0.0543930 secs]
   it says before young gc the memory used is 58282K, after young gc,
   there are 57602K live objects and the total young space is 59072K
3. (concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs]
   before old gc, 7.9GB is used. after old gc 3GB is alive. total old
   space is 7.9GB

In which situations will a promotion failure and a concurrent mode
failure occur?
from http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
the author says that when CMS is doing concurrent work and the JVM is
asked for more memory, if there isn't any space for the new allocation,
a concurrent mode failure occurs, and the JVM stops the world and does
a serial old gc. If enough space exists but it is fragmented, a
promotion failure occurs.
Am I right?
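As a worked illustration of how the numbers in one of these records fit together (a sketch added for clarity, under the usual reading of the "before->after(capacity)" fields): in the first ParNew record of the log that follows, the young generation goes 58548K->5738K while the whole heap goes 7958648K->7908502K. Whatever left the young generation without leaving the heap was, roughly, promoted to the old generation. The struct and function names below are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Occupancy before and after one collection, in KB, as printed in the
// "before->after(capacity)" form of a GC log record.
struct Occupancy {
  long before_kb;
  long after_kb;
};

// KB released from the given space by the collection.
long freed_kb(const Occupancy& o) {
  return o.before_kb - o.after_kb;
}

// Rough estimate of the KB promoted to the old generation: the amount
// that left the young generation but did not leave the heap.
long promoted_kb(const Occupancy& young, const Occupancy& heap) {
  long promoted = freed_kb(young) - freed_kb(heap);
  assert(promoted >= 0);
  return promoted;
}
```

For that record, the young generation released 52810K and the heap as a whole released 50146K, which puts the promoted amount at about 2664K.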
2012-01-12T18:27:32.582+0800: [GC [ParNew Desired survivor size 3342336 bytes, new threshold 1 (max 5) - age 1: 4594648 bytes, 4594648 total - age 2: 569200 bytes, 5163848 total : 58548K->5738K(59072K), 0.0159400 secs] 7958648K->7908502K(7984352K), 0.0160610 secs] [Times: user=0.17 sys=0.00, real=0.02 secs] 2012-01-12T18:27:32.609+0800: [GC [ParNew (promotion failed) Desired survivor size 3342336 bytes, new threshold 5 (max 5) - age 1: 1666376 bytes, 1666376 total : 58282K->57602K(59072K), 0.0543930 secs][CMS2012-01-12T18:27:33.804+0800: [CMS-concurrent-preclean: 14.098/34.323 secs] [Times: user=370.28 sys=5.65, real=34.31 secs] (concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs] On Wed, Jan 11, 2012 at 5:48 PM, Kirk Pepperdine wrote: > CMS is not adaptive. To reconfigure heap, for many reasons, you need a > full GC to occur. The response to a concurrent mode failure is always a > full GC. That gave the JVM the opportunity to resize heap space. If this > behaviour isn't happening when it should or is cause other problems it's > time to either set the young gen size directly with NewSize or switch to > the parallel collector with the adaptive sizing policy turned on. Logic > here is that you want to avoid long pauses, use CMS. If CMS is giving you > long pauses, than the parallel collector might be a better choice. > > Regards, > Kirk > > On 2012-01-11, at 10:32 AM, Li Li wrote: > > > after a concurrent mode failure. the young generation changed from about > 50MB to 1.8GB > > What's the logic behind this? 
> > > > 2012-01-10T22:23:54.544+0800: [GC [ParNew: 55389K->6528K(59072K), > 0.0175440 secs] 5886124K->5839323K(6195204K), 0.0177480 secs] [Times: > user=0.20 sys=0.00, real=0.01 secs] > > 2012-01-10T22:23:54.575+0800: [GC [ParNew: 59072K->6528K(59072K), > 0.0234040 secs] 5891867K->5845823K(6201540K), 0.0236070 secs] [Times: > user=0.24 sys=0.00, real=0.02 secs] > > 2012-01-10T22:23:54.612+0800: [GC [ParNew (promotion failed): > 59072K->58862K(59072K), 2.3119860 secs][CMS2012-01-10T22:23:57.153+0800: > [CMS-concurrent-preclean: 10.999/28.245 secs] [Times: user=290.41 sys=4.65, > real=28.24 secs] > > (concurrent mode failure): 5841457K->2063142K(6144000K), 8.8971660 > secs] 5898367K->2063142K(6203072K), [CMS Perm : 31369K->31131K(52316K)], > 11.2110080 secs] [Times: user=11.73 sys=0.51, real=11.21 secs] > > 2012-01-10T22:24:06.125+0800: [GC [ParNew: 1638400K->46121K(1843200K), > 0.0225800 secs] 3701542K->2109263K(7987200K), 0.0228190 secs] [Times: > user=0.26 sys=0.02, real=0.02 secs] > > 2012-01-10T22:24:06.357+0800: [GC [ParNew: 1684521K->111262K(1843200K), > 0.0381370 secs] 3747663K->2174404K(7987200K), 0.0383860 secs] [Times: > user=0.44 sys=0.04, real=0.04 secs] > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120112/586eb629/attachment-0001.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From bartosz.markocki at gmail.com Thu Jan 12 05:56:38 2012 From: bartosz.markocki at gmail.com (Bartek Markocki) Date: Thu, 12 Jan 2012 14:56:38 +0100 Subject: How can we cut down those two CMS STW times? 
Message-ID:

Hi all,

We have a backend type of application whose primary purpose is to cache
user-specific graphs of objects. The graphs are relatively small in
size; however, the rate at which they can change (based on users'
requests) is key here. Our main challenge was to figure out JVM settings
that will handle the peak memory allocation at the level of 4.5GB/s.
To make things a bit more challenging ;) we have a limited amount of RAM
on the box (as there are multiple applications co-located on the box and
the box has just 64GB of RAM).

After a decent amount of testing we figured out the following settings
work for us:

-Xms6G -Xmx6G -Xmn3G -Xss256k -XX:MaxPermSize=512m -XX:PermSize=512m
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:TargetSurvivorRatio=90
-XX:SurvivorRatio=8 -XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC
-XX:+CMSScavengeBeforeRemark -XX:+CMSPermGenSweepingEnabled
-XX:+CMSClassUnloadingEnabled

We run Java6 update 27 (64bit, server) on Solaris10.

The above settings work for us with the exception of one
CMS-initial-mark and one CMS-remark. By working I mean the STW is less
than 1 second for any STW pause.

There is one case when CMS-initial mark took 3.44 seconds.
Here is the extract from the log showing this situation: 90516.053: [GC 90516.053: [ParNew: 2633949K->154547K(2831168K), 0.1729255 secs] 4874963K->2395755K(5976896K), 0.1734674 secs] [Times: user=3.07 sys=0.01, real=0.17 secs] 90516.846: [GC 90516.846: [ParNew: 2671155K->106975K(2831168K), 0.2183780 secs] 4906534K->2365720K(5976896K), 0.2188906 secs] [Times: user=3.62 sys=0.05, real=0.22 secs] 90517.684: [GC 90517.684: [ParNew: 2623583K->106936K(2831168K), 0.0690728 secs] 4833212K->2316857K(5976896K), 0.0695870 secs] [Times: user=1.20 sys=0.01, real=0.07 secs] 90517.976: [CMS-concurrent-sweep: 4.574/5.767 secs] [Times: user=121.01 sys=1.90, real=5.77 secs] 90517.976: [CMS-concurrent-reset-start] 90518.112: [CMS-concurrent-reset: 0.136/0.136 secs] [Times: user=2.76 sys=0.05, real=0.14 secs] 90520.117: [GC [1 CMS-initial-mark: 2209921K(3145728K)] 4768007K(5976896K), 3.4458003 secs] [Times: user=3.45 sys=0.00, real=3.45 secs] 90523.564: [CMS-concurrent-mark-start] 90523.623: [GC 90523.623: [ParNew: 2623544K->119747K(2831168K), 0.1848339 secs] 4833465K->2329818K(5976896K), 0.1853529 secs] [Times: user=3.29 sys=0.01, real=0.19 secs] 90526.087: [CMS-concurrent-mark: 2.314/2.523 secs] [Times: user=18.11 sys=0.18, real=2.52 secs] 90526.087: [CMS-concurrent-preclean-start] 90526.155: [CMS-concurrent-preclean: 0.058/0.068 secs] [Times: user=0.16 sys=0.00, real=0.07 secs] 90526.155: [CMS-concurrent-abortable-preclean-start] 90531.301: [GC 90531.301: [ParNew: 2636355K->45254K(2831168K), 0.0206247 secs] 4846426K->2255579K(5976896K), 0.0211745 secs] [Times: user=0.33 sys=0.00, real=0.02 secs] CMS: abort preclean due to time 90531.470: [CMS-concurrent-abortable-preclean: 5.271/5.315 secs] [Times: user=18.85 sys=0.26, real=5.32 secs] 90531.476: [GC[YG occupancy: 662977 K (2831168 K)]90531.476: [GC 90531.476: [ParNew: 662977K->21990K(2831168K), 0.0342782 secs] 2873302K->2232487K(5976896K), 0.0347927 secs] [Times: user=0.40 sys=0.01, real=0.04 secs] 90531.511: [Rescan (parallel) , 
0.0074306 secs]90531.519: [weak refs processing, 0.0000864 secs]90531.519: [class unloading, 0.0350356 secs]90531.554: [scrub symbol & string tables, 0.0266258 secs] [1 CMS-remark: 2210497K(3145728K)] 2232487K(5976896K), 0.1197919 secs] [Times: user=0.59 sys=0.01, real=0.12 secs]
90531.597: [CMS-concurrent-sweep-start]
90532.212: [GC 90532.212: [ParNew: 2538598K->14216K(2831168K), 0.0162798 secs] 4744071K->2219741K(5976896K), 0.0167729 secs] [Times: user=0.26 sys=0.00, real=0.02 secs]
90532.865: [GC 90532.865: [ParNew: 2530824K->18587K(2831168K), 0.0192318 secs] 4732677K->2220478K(5976896K), 0.0197659 secs] [Times: user=0.31 sys=0.00, real=0.02 secs]
90533.500: [GC 90533.500: [ParNew: 2535195K->20886K(2831168K), 0.0206055 secs] 4731793K->2217494K(5976896K), 0.0211439 secs] [Times: user=0.33 sys=0.00, real=0.02 secs]

Of course almost immediately one can notice that the young generation
was almost full during that time, so what happened should not be a
surprise. After some googling I found that a similar topic was discussed
on this group in 2010, with an indication that it is caused by the
6412968 bug (CMS: Long initial mark). We tried the suggested workarounds
and found out that they cannot be applied in our case (limited amount of
available RAM) and sooner or later we hit a promotion and/or concurrent
mode failure with an even worse STW time.
Unfortunately, as in 2010, bugs.sun.com does not show the bug so I
cannot check if there was any update for the bug, so here comes my first
question: was there any update for the bug (what's the status of the
bug)?

The next problem that we faced was related to another CMS related bug
(6990419). After we applied the suggested workaround (to enable scavenge
before remark) the problem was almost completely removed, with one
exception.
There is the following CMS remark: 199848.296: [GC 199848.296: [ParNew: 2689127K->119235K(2831168K), 0.0906522 secs] 4868292K->2309941K(5976896K), 0.0912736 secs] [Times: user=1.22 sys=0.02, real=0.09 secs] 199853.617: [GC 199853.617: [ParNew: 2635843K->91628K(2831168K), 0.1040602 secs] 4826549K->2311078K(5976896K), 0.1046178 secs] [Times: user=1.15 sys=0.01, real=0.10 secs] 199853.726: [GC [1 CMS-initial-mark: 2219449K(3145728K)] 2311170K(5976896K), 0.1208219 secs] [Times: user=0.12 sys=0.00, real=0.12 secs] 199853.847: [CMS-concurrent-mark-start] 199856.405: [CMS-concurrent-mark: 2.547/2.557 secs] [Times: user=18.49 sys=0.35, real=2.56 secs] 199856.405: [CMS-concurrent-preclean-start] 199856.438: [CMS-concurrent-preclean: 0.031/0.033 secs] [Times: user=0.13 sys=0.00, real=0.03 secs] 199856.439: [CMS-concurrent-abortable-preclean-start] CMS: abort preclean due to time 199861.899: [CMS-concurrent-abortable-preclean: 5.443/5.460 secs] [Times: user=27.67 sys=1.14, real=5.46 secs] 199861.903: [GC[YG occupancy: 1353639 K (2831168 K)]199861.903: [Rescan (parallel) , 1.4282026 secs]199863.332: [weak refs processing, 0.0019473 secs]199863.334: [class unloading, 0.0365617 secs]199863.370: [scrub symbol & string tables, 0.0267902 secs] [1 CMS-remark: 2219449K(3145728K)] 3573089K(5976896K), 1.5099836 secs] [Times: user=12.20 sys=0.17, real=1.51 secs] 199863.414: [CMS-concurrent-sweep-start] 199863.420: [GC 199863.421: [ParNew: 1355519K->53699K(2831168K), 0.1129519 secs] 3574969K->2308972K(5976896K), 0.1138995 secs] [Times: user=1.10 sys=0.01, real=0.11 secs] 199865.857: [CMS-concurrent-sweep: 2.324/2.443 secs] [Times: user=10.15 sys=0.61, real=2.44 secs] 199865.857: [CMS-concurrent-reset-start] 199865.888: [CMS-concurrent-reset: 0.031/0.031 secs] [Times: user=0.05 sys=0.00, real=0.03 secs] 199893.779: [GC 199893.780: [ParNew: 2570307K->58285K(2831168K), 0.0397922 secs] 4620179K->2108197K(5976896K), 0.0403072 secs] [Times: user=0.68 sys=0.00, real=0.04 secs] 199906.510: [GC 
199906.510: [ParNew: 2574893K->55484K(2831168K), 0.0390212 secs] 4624805K->2105432K(5976896K), 0.0395148 secs] [Times: user=0.67 sys=0.01, real=0.04 secs]

There are two things to notice here:
1. The time of this rescan was 20 times longer than any other rescan
   time (1.4 seconds compared to 58 ms)
2. There was no minor GC before CMS-remark even though it was explicitly
   requested.

The question here is: is this something already covered by 6990419, for
which the workaround simply does not work, or something else?

Thank you,
Bartek

From tony.printezis at oracle.com Thu Jan 12 07:51:07 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 12 Jan 2012 10:51:07 -0500
Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size
Message-ID: <4F0F016B.8040707@oracle.com>

Hi all,

I'd like a couple of code reviews for this very small change (one line!):

http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/

Currently, all the G1 memory pools return "undefined" (-1) as their max
size given that there are no hard boundaries between them. Jon Masamitsu
suggested to at least return the heap max for the old memory pool so
that the pool data is a little bit more informative.
Tony

From tony.printezis at oracle.com Thu Jan 12 10:32:33 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 12 Jan 2012 13:32:33 -0500
Subject: CRR (S): 7097586: G1: improve the per-space output when using jmap -heap
Message-ID: <4F0F2741.4090705@oracle.com>

Hi all,

I'd like a couple of code reviews for this change that enhances the heap
summary information generated by the SA (which is used for the jmap
-heap output):

http://cr.openjdk.java.net/~tonyp/7097586/webrev.0/

Currently, the heap summary generated for G1 is as close as possible to
what's generated for the other GCs. Bengt made a good suggestion that
it'd be helpful to enhance the output with some G1-specific information
in order to make it more informative. The important changes are the 15
lines or so that were changed in HeapSummary.java, the rest is
boilerplate to be able to access specific fields and objects from the
SA. I included before / after jmap -heap output below.

Note that we actually had a small bug in the code which caused the
sizing information in the G1MonitoringSupport object to become
inconsistent between a cleanup and the subsequent GC: the old space
information was not updated to reflect any old region reclamation during
cleanup. I fixed this as part of this change too (I'll add a note to the
CR).

Tony

BEFORE:

using thread-local object allocation.
Garbage-First (G1) GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize = 1073741824 (1024.0MB)
   NewSize = 1048576 (1.0MB)
   MaxNewSize = 4294967295 (4095.9999990463257MB)
   OldSize = 4194304 (4.0MB)
   NewRatio = 2
   SurvivorRatio = 8
   PermSize = 16777216 (16.0MB)
   MaxPermSize = 67108864 (64.0MB)

Heap Usage:
G1 Young Generation
Eden Space:
   capacity = 19922944 (19.0MB)
   used = 3145728 (3.0MB)
   free = 16777216 (16.0MB)
   15.789473684210526% used
From Space:
   capacity = 2097152 (2.0MB)
   used = 2097152 (2.0MB)
   free = 0 (0.0MB)
   100.0% used
To Space:
   capacity = 0 (0.0MB)
   used = 0 (0.0MB)
   free = 0 (0.0MB)
   0.0% used
G1 Old Generation
   capacity = 19922944 (19.0MB)
   used = 5849192 (5.578224182128906MB)
   free = 14073752 (13.421775817871094MB)
   29.359074642783717% used
Perm Generation:
   capacity = 16777216 (16.0MB)
   used = 2749208 (2.6218490600585938MB)
   free = 14028008 (13.378150939941406MB)
   16.38655662536621% used

1719 interned Strings occupying 137520 bytes.

AFTER (I marked the changes with bold; note that now there's only one
Survivor section, as G1 does not have the concept of two survivors that
are always allocated):

using thread-local object allocation.
Garbage-First (G1) GC with 8 thread(s)

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize = 67108864 (64.0MB)
   NewSize = 1048576 (1.0MB)
   MaxNewSize = 4294967295 (4095.9999990463257MB)
   OldSize = 4194304 (4.0MB)
   NewRatio = 2
   SurvivorRatio = 8
   PermSize = 16777216 (16.0MB)
   MaxPermSize = 67108864 (64.0MB)
   * G1HeapRegionSize = 1048576 (1.0MB) *

Heap Usage:
*G1 Heap:
   regions = 57
   capacity = 59768832 (57.0MB)
   used = 18018304 (17.18359375MB)
   free = 41750528 (39.81640625MB)
   30.146655701754387% used *
G1 Young Generation:
Eden Space:
   * regions = 3 *
   capacity = 30408704 (29.0MB)
   used = 3145728 (3.0MB)
   free = 27262976 (26.0MB)
   10.344827586206897% used
*Survivor Space:
   regions = 2 *
   capacity = 2097152 (2.0MB)
   used = 2097152 (2.0MB)
   free = 0 (0.0MB)
   100.0% used
G1 Old Generation:
   * regions = 13 *
   capacity = 27262976 (26.0MB)
   used = 12775424 (12.18359375MB)
   free = 14487552 (13.81640625MB)
   46.85997596153846% used
Perm Generation:
   capacity = 16777216 (16.0MB)
   used = 2741840 (2.6148223876953125MB)
   free = 14035376 (13.385177612304688MB)
   16.342639923095703% used

1710 interned Strings occupying 136904 bytes.

From john.cuthbertson at oracle.com Thu Jan 12 11:28:49 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Thu, 12 Jan 2012 11:28:49 -0800
Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output
Message-ID: <4F0F3471.6070108@oracle.com>

Hi Everyone,

Can I have a couple of volunteers review the changes for this CR? The
webrev can be found at:
http://cr.openjdk.java.net/~johnc/7129271/webrev.0/

The issue was that when PrintGC or PrintGCDetails was enabled, during an
initial pause, the "concurrent-mark-start" message from the
ConcurrentMark thread was interfering with the output (by the VM thread)
from the GC pause.
This added randomness and irregularity to the output, which made it
difficult to parse. It was also seen more frequently when the GC logging
output was directed to a file rather than stdout.

The solution is to move the code that signals the Concurrent Mark thread
to after the output from the GC pause is complete.

In the webrev, please ignore the counts of the number of lines changed.
I added an inner scope and so indented a bunch of code, which I think
has confused the webrev tool. Fortunately the actual web diffs seem not
to have included the extra whitespace.

Testing: the GC test suite (with a low marking threshold - 2% to create
lots of marking cycles)

Thanks,

JohnC

From kirk.pepperdine at gmail.com Thu Jan 12 11:03:41 2012
From: kirk.pepperdine at gmail.com (Kirk Pepperdine)
Date: Thu, 12 Jan 2012 20:03:41 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To:
References: <9462441C-C11B-4CDC-83BD-D78E2A1138AB@gmail.com>
Message-ID:

Hi,

CMS failures occur as a result of a trend over time. It's almost
impossible to recommend a correction from a single incident. That said,
Jon's blog entry explains CMS failure very clearly. The record you've
sent suggests that young gen is way too small, but again, I can't say
anything from a single record.

Regards,
Kirk

On 2012-01-12, at 1:09 PM, Li Li wrote:

> yesterday, we set the maxNewSize to 256mb. And it works as we expected. but an hours ago, there is a promotion failure and a concurrent mode failure which cost 14s! could anyone explain the gc logs for me?
> or any documents for the gc log format explanation?
>
> 1. Desired survivor size 3342336 bytes, new threshold 5 (max 5)
> it says survivor size is 3mb
> 2. 58282K->57602K(59072K), 0.0543930 secs]
> it says before young gc the memory used is 58282K, after young gc, there are 57602K live objects and the total young space is 59072K
> 3.
(concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs] > before old gc, 7.9GB is used. after old gc 3GB is alive. total old space is 7.9GB > > in which situation will occur promotion failure and concurrent mode failure? > from http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ > the author says when CMS is doing concurrent work and JVM is asked for more memory. if there isn't any space for new allocation. then it will occur concurrent mode failure and it will stop the world and do a serial old gc. > if there exist enough space but they are fragemented, then a promotion failure will occur. > am I right? > > 2012-01-12T18:27:32.582+0800: [GC [ParNew > Desired survivor size 3342336 bytes, new threshold 1 (max 5) > - age 1: 4594648 bytes, 4594648 total > - age 2: 569200 bytes, 5163848 total > : 58548K->5738K(59072K), 0.0159400 secs] 7958648K->7908502K(7984352K), 0.0160610 secs] [Times: user=0.17 sys=0.00, real=0.02 secs] > 2012-01-12T18:27:32.609+0800: [GC [ParNew (promotion failed) > Desired survivor size 3342336 bytes, new threshold 5 (max 5) > - age 1: 1666376 bytes, 1666376 total > : 58282K->57602K(59072K), 0.0543930 secs][CMS2012-01-12T18:27:33.804+0800: [CMS-concurrent-preclean: 14.098/34.323 secs] [Times: user=370.28 sys=5.65, real=34.31 secs] > (concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs] > > On Wed, Jan 11, 2012 at 5:48 PM, Kirk Pepperdine wrote: > CMS is not adaptive. To reconfigure heap, for many reasons, you need a full GC to occur. The response to a concurrent mode failure is always a full GC. That gave the JVM the opportunity to resize heap space. 
If this behaviour isn't happening when it should or is cause other problems it's time to either set the young gen size directly with NewSize or switch to the parallel collector with the adaptive sizing policy turned on. Logic here is that you want to avoid long pauses, use CMS. If CMS is giving you long pauses, than the parallel collector might be a better choice. > > Regards, > Kirk > > On 2012-01-11, at 10:32 AM, Li Li wrote: > > > after a concurrent mode failure. the young generation changed from about 50MB to 1.8GB > > What's the logic behind this? > > > > 2012-01-10T22:23:54.544+0800: [GC [ParNew: 55389K->6528K(59072K), 0.0175440 secs] 5886124K->5839323K(6195204K), 0.0177480 secs] [Times: user=0.20 sys=0.00, real=0.01 secs] > > 2012-01-10T22:23:54.575+0800: [GC [ParNew: 59072K->6528K(59072K), 0.0234040 secs] 5891867K->5845823K(6201540K), 0.0236070 secs] [Times: user=0.24 sys=0.00, real=0.02 secs] > > 2012-01-10T22:23:54.612+0800: [GC [ParNew (promotion failed): 59072K->58862K(59072K), 2.3119860 secs][CMS2012-01-10T22:23:57.153+0800: [CMS-concurrent-preclean: 10.999/28.245 secs] [Times: user=290.41 sys=4.65, real=28.24 secs] > > (concurrent mode failure): 5841457K->2063142K(6144000K), 8.8971660 secs] 5898367K->2063142K(6203072K), [CMS Perm : 31369K->31131K(52316K)], 11.2110080 secs] [Times: user=11.73 sys=0.51, real=11.21 secs] > > 2012-01-10T22:24:06.125+0800: [GC [ParNew: 1638400K->46121K(1843200K), 0.0225800 secs] 3701542K->2109263K(7987200K), 0.0228190 secs] [Times: user=0.26 sys=0.02, real=0.02 secs] > > 2012-01-10T22:24:06.357+0800: [GC [ParNew: 1684521K->111262K(1843200K), 0.0381370 secs] 3747663K->2174404K(7987200K), 0.0383860 secs] [Times: user=0.44 sys=0.04, real=0.04 secs] > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > 
hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From kirk.pepperdine at gmail.com  Thu Jan 12 11:08:17 2012
From: kirk.pepperdine at gmail.com (Kirk Pepperdine)
Date: Thu, 12 Jan 2012 20:08:17 +0100
Subject: MaxTenuringThreshold available in ParNewGC?
In-Reply-To: <4F0EE66D.7090406@oracle.com>
References: <9462441C-C11B-4CDC-83BD-D78E2A1138AB@gmail.com> <4F0EE66D.7090406@oracle.com>
Message-ID: <1C5F8123-6AA8-4D08-A248-50F3CE4A5516@gmail.com>

Charlie,

You shameless shameless self promoter!!!!! Shame on you!!!!

LiLi, please ignore Charlie's shameless self promotion and run out and buy the book. I think it will be of great help to your understanding of the problems you're currently facing.

Charlie, what's my commission on the sale?

Regards,
Kirk

ps ;-)

On 2012-01-12, at 2:55 PM, charlie hunt wrote:

> At the risk of sounding self-promotional ....based on the questions you are asking, I think you'd find a lot of value in the Java Performance book:
> http://www.amazon.com/Java-Performance-Charlie-Hunt/dp/0137142528
>
> Many of the folks on the mailing list were key contributors to its content.
>
> Almost forget ... yes, the book offers a description of the GC log and it also offers suggestions on how you can use the "Desired survivor size", "new threshold" and tenuring distribution information to help determine how to size young generation.
>
> hths,
>
> charlie ...
>
> On 01/12/12 06:09 AM, Li Li wrote:
>>
>> yesterday, we set the maxNewSize to 256mb. And it works as we expected.
but an hours ago, there is a promotion failure and a concurrent mode failure which cost 14s! could anyone explain the gc logs for me? >> or any documents for the gc log format explanation? >> >> 1. Desired survivor size 3342336 bytes, new threshold 5 (max 5) >> it says survivor size is 3mb >> 2. 58282K->57602K(59072K), 0.0543930 secs] >> it says before young gc the memory used is 58282K, after young gc, there are 57602K live objects and the total young space is 59072K >> 3. (concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs] >> before old gc, 7.9GB is used. after old gc 3GB is alive. total old space is 7.9GB >> >> in which situation will occur promotion failure and concurrent mode failure? >> from http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/ >> the author says when CMS is doing concurrent work and JVM is asked for more memory. if there isn't any space for new allocation. then it will occur concurrent mode failure and it will stop the world and do a serial old gc. >> if there exist enough space but they are fragemented, then a promotion failure will occur. >> am I right? 
>> >> 2012-01-12T18:27:32.582+0800: [GC [ParNew >> Desired survivor size 3342336 bytes, new threshold 1 (max 5) >> - age 1: 4594648 bytes, 4594648 total >> - age 2: 569200 bytes, 5163848 total >> : 58548K->5738K(59072K), 0.0159400 secs] 7958648K->7908502K(7984352K), 0.0160610 secs] [Times: user=0.17 sys=0.00, real=0.02 secs] >> 2012-01-12T18:27:32.609+0800: [GC [ParNew (promotion failed) >> Desired survivor size 3342336 bytes, new threshold 5 (max 5) >> - age 1: 1666376 bytes, 1666376 total >> : 58282K->57602K(59072K), 0.0543930 secs][CMS2012-01-12T18:27:33.804+0800: [CMS-concurrent-preclean: 14.098/34.323 secs] [Times: user=370.28 sys=5.65, real=34.31 secs] >> (concurrent mode failure): 7907405K->3086848K(7929856K), 14.3005340 secs] 7961046K->3086848K(7988928K), [CMS Perm : 32296K->31852K(53932K)], 14.3552450 secs] [Times: user=14.53 sys=0.01, real=14.35 secs] >> >> On Wed, Jan 11, 2012 at 5:48 PM, Kirk Pepperdine wrote: >> CMS is not adaptive. To reconfigure heap, for many reasons, you need a full GC to occur. The response to a concurrent mode failure is always a full GC. That gave the JVM the opportunity to resize heap space. If this behaviour isn't happening when it should or is cause other problems it's time to either set the young gen size directly with NewSize or switch to the parallel collector with the adaptive sizing policy turned on. Logic here is that you want to avoid long pauses, use CMS. If CMS is giving you long pauses, than the parallel collector might be a better choice. >> >> Regards, >> Kirk >> >> On 2012-01-11, at 10:32 AM, Li Li wrote: >> >> > after a concurrent mode failure. the young generation changed from about 50MB to 1.8GB >> > What's the logic behind this? 
>> > >> > 2012-01-10T22:23:54.544+0800: [GC [ParNew: 55389K->6528K(59072K), 0.0175440 secs] 5886124K->5839323K(6195204K), 0.0177480 secs] [Times: user=0.20 sys=0.00, real=0.01 secs] >> > 2012-01-10T22:23:54.575+0800: [GC [ParNew: 59072K->6528K(59072K), 0.0234040 secs] 5891867K->5845823K(6201540K), 0.0236070 secs] [Times: user=0.24 sys=0.00, real=0.02 secs] >> > 2012-01-10T22:23:54.612+0800: [GC [ParNew (promotion failed): 59072K->58862K(59072K), 2.3119860 secs][CMS2012-01-10T22:23:57.153+0800: [CMS-concurrent-preclean: 10.999/28.245 secs] [Times: user=290.41 sys=4.65, real=28.24 secs] >> > (concurrent mode failure): 5841457K->2063142K(6144000K), 8.8971660 secs] 5898367K->2063142K(6203072K), [CMS Perm : 31369K->31131K(52316K)], 11.2110080 secs] [Times: user=11.73 sys=0.51, real=11.21 secs] >> > 2012-01-10T22:24:06.125+0800: [GC [ParNew: 1638400K->46121K(1843200K), 0.0225800 secs] 3701542K->2109263K(7987200K), 0.0228190 secs] [Times: user=0.26 sys=0.02, real=0.02 secs] >> > 2012-01-10T22:24:06.357+0800: [GC [ParNew: 1684521K->111262K(1843200K), 0.0381370 secs] 3747663K->2174404K(7987200K), 0.0383860 secs] [Times: user=0.44 sys=0.04, real=0.04 secs] >> > >> > _______________________________________________ >> > hotspot-gc-use mailing list >> > hotspot-gc-use at openjdk.java.net >> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120112/80268148/attachment-0001.html -------------- next part -------------- _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From John.Coomes at oracle.com Thu Jan 12 12:55:55 2012 From: John.Coomes at oracle.com (John Coomes) Date: Thu, 12 Jan 2012 12:55:55 -0800 Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces In-Reply-To: <4F0E916C.7010801@oracle.com> References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com> <4F0CB06D.2030008@oracle.com> <4F0CD061.50408@oracle.com> <20238.13443.469295.34166@oracle.com> <4F0E916C.7010801@oracle.com> Message-ID: <20239.18651.806742.982612@oracle.com> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: > > John, > > Inline... > > On 2012-01-12 02:16, John Coomes wrote: > > Tony Printezis (tony.printezis at oracle.com) wrote: > >> Bengt, > >> > >> Hi, thanks for looking at it! See inline. > >> > >> On 01/10/2012 04:41 PM, Bengt Rutisson wrote: > >>> ... > >>> In g1OopClosures.hpp you swapped the lines 151 and 152, which makes > >>> it look like this: > >>> > >>> 149 G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState* > >>> par_scan_state, > >>> 150 ReferenceProcessor* rp) : > >>> 151 G1ParCopyHelper(g1, par_scan_state,&_scanner), > >>> 152 _scanner(g1, par_scan_state, rp) { > >>> > >>> I guess you want the call to the super class constructor before other > >>> initialization. To me it looks strange that the _scanner is passed to > >>> ... > > > > So the c++ compiler will call the G1ParCopyHelper ctor first, no > > matter how you write it :-). > > That's good to know! Thanks for pointing this out. > > This makes the dependency between G1ParCopyHelper and G1ParCopyClosure > kind of scary to me. 
I agree with Tony that it is not an issue as it is
> right now, but if anybody in the future will try to access the _scanner
> field in the constructor of G1ParCopyHelper we are in trouble.

I agree, it's a (minor) accident waiting to happen.

> I don't think we need to fix it right away, but maybe we should think
> about a fix for it. As far as I can tell the G1ParCopyHelper is only
> needed to get around some template issues with G1ParCopyClosure. Maybe
> there is a cleaner way to solve that?

I think _scanner and its accessor can move to G1ParCopyHelper, and the ReferenceProcessor needed to initialize _scanner can be passed to the G1ParCopyHelper ctor. I don't see any other types being used for _scanner, so making it a data member of G1ParCopyHelper instead of a pointer (the latter allows virtual dispatch) should be fine.

-John

From bengt.rutisson at oracle.com  Thu Jan 12 14:26:15 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 12 Jan 2012 23:26:15 +0100
Subject: Request for review (M): 6976060 G1: humongous object allocations should initiate marking cycles when necessary
Message-ID: <4F0F5E07.6050203@oracle.com>

Hi all,

Could I have a couple of reviews for this fix?
http://cr.openjdk.java.net/~brutisso/6976060/webrev.03/

6976060 G1: humongous object allocations should initiate marking cycles when necessary
http://monaco.us.oracle.com/detail.jsf?cr=6976060
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6976060

Background:
A humongous allocation can take us past the threshold at which we should initiate a concurrent marking cycle. This fix checks that threshold after each humongous object allocation, to make sure that we don't miss the chance to run a concurrent mark and have to revert to full GCs.

Testing:
I wrote a small Java app that only allocates humongous objects. Before my fix I got only full GCs. With my change I avoid full GCs altogether.
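
The test app itself is not part of this message. As a rough illustration only, a program of that kind might look like the sketch below. Everything in it is an assumption rather than Bengt's actual test: the class name, the 1 MB region size, the rule of thumb that an object of about half a region or more is treated as humongous, and the loop counts.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stress test: every allocation is sized so that G1 would be
// expected to treat it as humongous. All names and sizes are illustrative.
class HumongousAllocator {
    // Assumed region size. The real value depends on the heap size or on
    // an explicit -XX:G1HeapRegionSize setting.
    static final int ASSUMED_REGION_BYTES = 1024 * 1024;

    // Assumption: an object of about half a region or more is humongous.
    // The exact boundary is a VM implementation detail.
    static boolean isLikelyHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    // Allocates `count` byte arrays of `arrayBytes` each, keeps at most
    // `keepLive` of them reachable so the rest become garbage, and returns
    // the total number of bytes requested.
    static long allocate(int count, int arrayBytes, int keepLive) {
        Deque<byte[]> live = new ArrayDeque<>();
        long total = 0;
        for (int i = 0; i < count; i++) {
            byte[] chunk = new byte[arrayBytes];
            chunk[0] = 1; // touch the array so the allocation is really used
            live.addLast(chunk);
            if (live.size() > keepLive) {
                live.removeFirst(); // drop the oldest array
            }
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // One full assumed region per array, well above the half-region mark.
        long total = allocate(200, ASSUMED_REGION_BYTES, 8);
        System.out.println("Requested " + total + " bytes in humongous-sized arrays");
    }
}
```

On a JDK of that era it could be run with something like `java -XX:+UseG1GC -Xmx64m -XX:+PrintGC HumongousAllocator`, checking whether the log shows concurrent marking cycles or only Full GC entries. The small heap is chosen here only so that the occupancy threshold is reached quickly.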
Thanks,
Bengt

From bengt.rutisson at oracle.com  Fri Jan 13 06:19:29 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Fri, 13 Jan 2012 15:19:29 +0100
Subject: CRR (L / updated): 6888336: G1: avoid explicitly marking and pushing objects in survivor spaces
In-Reply-To: <20239.18651.806742.982612@oracle.com>
References: <4EF25FB8.5050507@oracle.com> <4EFA08D8.8040009@oracle.com> <4F060540.3070005@oracle.com> <4F07CD76.9080502@oracle.com> <4F0CB06D.2030008@oracle.com> <4F0CD061.50408@oracle.com> <20238.13443.469295.34166@oracle.com> <4F0E916C.7010801@oracle.com> <20239.18651.806742.982612@oracle.com>
Message-ID: <4F103D71.4000803@oracle.com>

John,

With some good help from Mikael Gerdin I was able to work out how to fix the templates for G1ParCopyClosure to remove the need for G1ParCopyHelper.

Here is a webrev where I have removed the G1ParCopyHelper completely:
http://cr.openjdk.java.net/~brutisso/template-fix/webrev.01/

Should I file a bug and fix this, or should we leave it as it is? Personally I think that it is always nice to be able to reduce the inheritance graph.

Thanks,
Bengt

On 2012-01-12 21:55, John Coomes wrote:
> Bengt Rutisson (bengt.rutisson at oracle.com) wrote:
>> John,
>>
>> Inline...
>>
>> On 2012-01-12 02:16, John Coomes wrote:
>>> Tony Printezis (tony.printezis at oracle.com) wrote:
>>>> Bengt,
>>>>
>>>> Hi, thanks for looking at it! See inline.
>>>>
>>>> On 01/10/2012 04:41 PM, Bengt Rutisson wrote:
>>>>> ...
>>>>> In g1OopClosures.hpp you swapped the lines 151 and 152, which makes
>>>>> it look like this:
>>>>>
>>>>> 149 G1ParCopyClosure(G1CollectedHeap* g1, G1ParScanThreadState*
>>>>> par_scan_state,
>>>>> 150 ReferenceProcessor* rp) :
>>>>> 151 G1ParCopyHelper(g1, par_scan_state,&_scanner),
>>>>> 152 _scanner(g1, par_scan_state, rp) {
>>>>>
>>>>> I guess you want the call to the super class constructor before other
>>>>> initialization. To me it looks strange that the _scanner is passed to
>>>>> ...
>>> So the c++ compiler will call the G1ParCopyHelper ctor first, no >>> matter how you write it :-). >> That's good to know! Thanks for pointing this out. >> >> This makes the dependency between G1ParCopyHelper and G1ParCopyClosure >> kind of scary to me. I agree with Tony that it is not an issue as it is >> right now, but if anybody in the future will try to access the _scanner >> field in the constructor of G1ParCopyHelper we are in trouble. > I agree, it's a (minor) accident waiting to happen. > >> I don't think we need to fix it right away, but maybe we should think >> about a fix for it. As far as I can tell the G1ParCopyHelper is only >> needed to get around some template issues with G1ParCopyClosure. Maybe >> there is a cleaner way to solve that? > I think _scanner and its accessor can move to G1ParCopyHelper, and the > ReferenceProcessor needed to initialize _scanner can be passed to the > G1ParCopyHelper ctor. I don't see any other types being used for > _scanner, so making it a data member of G1ParCopyHelper instead of a > pointer (the latter allows virtual dispatch) should be fine. > > -John From john.cuthbertson at oracle.com Fri Jan 13 09:11:11 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 13 Jan 2012 09:11:11 -0800 Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output In-Reply-To: <4F0F3471.6070108@oracle.com> References: <4F0F3471.6070108@oracle.com> Message-ID: <4F1065AF.4060902@oracle.com> Hi Everyone, Based upon feedback from Tony, I have a much simpler version of this change. The new changes can be found at: http://cr.openjdk.java.net/~johnc/7129271/webrev.1/ Thanks, JohnC On 01/12/12 11:28, John Cuthbertson wrote: > Hi Everyone, > > Can I have a couple of volunteers review the changes for this CR? 
The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7129271/webrev.0/ > > The issue was that the when PrintGC or PrintGCDetails was enabled, > during an initial pause, the "concurrent-mark-start" message from the > ConcurrentMark thread was interfering with the output (by the VM > thread) from the GC pause. This was adding a randomness and > irregularity to the output that was making it difficult to parse. It > was also seen more frequently when the GC logging output was directed > to a file rather than stdout. > > The solution is to move the code that signals the Concurrent Mark > thread to after when the output from the GC pause is complete. > > In the webrev, please ignore the counts of the number of lines > changed. I added an inner scope and so indented a bunch of code which > I think has confused the webrev tool. Fortunately the actual web diffs > seem to have not included the extra whitespace. > > Testing: the GC test suite (with a low marking threshold - 2% to > create lots of marking cycles) > > Thanks, > > JohnC > > From tony.printezis at oracle.com Fri Jan 13 09:25:15 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 13 Jan 2012 12:25:15 -0500 Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output In-Reply-To: <4F1065AF.4060902@oracle.com> References: <4F0F3471.6070108@oracle.com> <4F1065AF.4060902@oracle.com> Message-ID: <4F1068FB.5040403@oracle.com> John, Thanks for taking into account my feedback! My only comment is that you should maybe move the doConcurrentMark() call to even further down (i.e., as the last thing we do in that method before return true;) given that there's still some output that can be generated after its current location (e.g., PrintHeapAtGC). Tony On 01/13/2012 12:11 PM, John Cuthbertson wrote: > Hi Everyone, > > Based upon feedback from Tony, I have a much simpler version of this > change. 
The new changes can be found at: > http://cr.openjdk.java.net/~johnc/7129271/webrev.1/ > > Thanks, > > JohnC > > On 01/12/12 11:28, John Cuthbertson wrote: >> Hi Everyone, >> >> Can I have a couple of volunteers review the changes for this CR? The >> webrev can be found at: >> http://cr.openjdk.java.net/~johnc/7129271/webrev.0/ >> >> The issue was that the when PrintGC or PrintGCDetails was enabled, >> during an initial pause, the "concurrent-mark-start" message from the >> ConcurrentMark thread was interfering with the output (by the VM >> thread) from the GC pause. This was adding a randomness and >> irregularity to the output that was making it difficult to parse. It >> was also seen more frequently when the GC logging output was directed >> to a file rather than stdout. >> >> The solution is to move the code that signals the Concurrent Mark >> thread to after when the output from the GC pause is complete. >> >> In the webrev, please ignore the counts of the number of lines >> changed. I added an inner scope and so indented a bunch of code which >> I think has confused the webrev tool. Fortunately the actual web >> diffs seem to have not included the extra whitespace. >> >> Testing: the GC test suite (with a low marking threshold - 2% to >> create lots of marking cycles) >> >> Thanks, >> >> JohnC >> >> > From john.cuthbertson at oracle.com Fri Jan 13 09:48:30 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Fri, 13 Jan 2012 09:48:30 -0800 Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output In-Reply-To: <4F1068FB.5040403@oracle.com> References: <4F0F3471.6070108@oracle.com> <4F1065AF.4060902@oracle.com> <4F1068FB.5040403@oracle.com> Message-ID: <4F106E6E.2050605@oracle.com> Hi Tony, Thanks. Good point. I missed the print_heap_after_gc() call. Consider it done. JohnC On 01/13/12 09:25, Tony Printezis wrote: > John, > > Thanks for taking into account my feedback! 
My only comment is that > you should maybe move the doConcurrentMark() call to even further down > (i.e., as the last thing we do in that method before return true;) > given that there's still some output that can be generated after its > current location (e.g., PrintHeapAtGC). > > Tony > > On 01/13/2012 12:11 PM, John Cuthbertson wrote: >> Hi Everyone, >> >> Based upon feedback from Tony, I have a much simpler version of this >> change. The new changes can be found at: >> http://cr.openjdk.java.net/~johnc/7129271/webrev.1/ >> >> Thanks, >> >> JohnC >> >> On 01/12/12 11:28, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> Can I have a couple of volunteers review the changes for this CR? >>> The webrev can be found at: >>> http://cr.openjdk.java.net/~johnc/7129271/webrev.0/ >>> >>> The issue was that the when PrintGC or PrintGCDetails was enabled, >>> during an initial pause, the "concurrent-mark-start" message from >>> the ConcurrentMark thread was interfering with the output (by the VM >>> thread) from the GC pause. This was adding a randomness and >>> irregularity to the output that was making it difficult to parse. It >>> was also seen more frequently when the GC logging output was >>> directed to a file rather than stdout. >>> >>> The solution is to move the code that signals the Concurrent Mark >>> thread to after when the output from the GC pause is complete. >>> >>> In the webrev, please ignore the counts of the number of lines >>> changed. I added an inner scope and so indented a bunch of code >>> which I think has confused the webrev tool. Fortunately the actual >>> web diffs seem to have not included the extra whitespace. 
>>> >>> Testing: the GC test suite (with a low marking threshold - 2% to >>> create lots of marking cycles) >>> >>> Thanks, >>> >>> JohnC >>> >>> >> From tony.printezis at oracle.com Fri Jan 13 09:48:55 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 13 Jan 2012 12:48:55 -0500 Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output In-Reply-To: <4F106E6E.2050605@oracle.com> References: <4F0F3471.6070108@oracle.com> <4F1065AF.4060902@oracle.com> <4F1068FB.5040403@oracle.com> <4F106E6E.2050605@oracle.com> Message-ID: <4F106E87.4040802@oracle.com> Thanks John, ship it. :-) On 01/13/2012 12:48 PM, John Cuthbertson wrote: > Hi Tony, > > Thanks. Good point. I missed the print_heap_after_gc() call. Consider > it done. > > JohnC > > On 01/13/12 09:25, Tony Printezis wrote: >> John, >> >> Thanks for taking into account my feedback! My only comment is that >> you should maybe move the doConcurrentMark() call to even further >> down (i.e., as the last thing we do in that method before return >> true;) given that there's still some output that can be generated >> after its current location (e.g., PrintHeapAtGC). >> >> Tony >> >> On 01/13/2012 12:11 PM, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> Based upon feedback from Tony, I have a much simpler version of this >>> change. The new changes can be found at: >>> http://cr.openjdk.java.net/~johnc/7129271/webrev.1/ >>> >>> Thanks, >>> >>> JohnC >>> >>> On 01/12/12 11:28, John Cuthbertson wrote: >>>> Hi Everyone, >>>> >>>> Can I have a couple of volunteers review the changes for this CR? >>>> The webrev can be found at: >>>> http://cr.openjdk.java.net/~johnc/7129271/webrev.0/ >>>> >>>> The issue was that the when PrintGC or PrintGCDetails was enabled, >>>> during an initial pause, the "concurrent-mark-start" message from >>>> the ConcurrentMark thread was interfering with the output (by the >>>> VM thread) from the GC pause. 
This was adding a randomness and >>>> irregularity to the output that was making it difficult to parse. >>>> It was also seen more frequently when the GC logging output was >>>> directed to a file rather than stdout. >>>> >>>> The solution is to move the code that signals the Concurrent Mark >>>> thread to after when the output from the GC pause is complete. >>>> >>>> In the webrev, please ignore the counts of the number of lines >>>> changed. I added an inner scope and so indented a bunch of code >>>> which I think has confused the webrev tool. Fortunately the actual >>>> web diffs seem to have not included the extra whitespace. >>>> >>>> Testing: the GC test suite (with a low marking threshold - 2% to >>>> create lots of marking cycles) >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> >>> > From bengt.rutisson at oracle.com Fri Jan 13 11:48:02 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 13 Jan 2012 19:48:02 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 28 new changesets Message-ID: <20120113194902.089CD47964@hg.openjdk.java.net> Changeset: 7ab5f6318694 Author: phh Date: 2012-01-01 11:17 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7ab5f6318694 7125934: Add a fast unordered timestamp capability to Hotspot on x86/x64 Summary: Add rdtsc detection and inline generation. Reviewed-by: kamg, dholmes Contributed-by: karen.kinnear at oracle.com ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/os_cpu/bsd_x86/vm/os_bsd_x86.hpp + src/os_cpu/bsd_x86/vm/os_bsd_x86.inline.hpp ! src/os_cpu/linux_x86/vm/os_linux_x86.hpp + src/os_cpu/linux_x86/vm/os_linux_x86.inline.hpp ! src/os_cpu/solaris_x86/vm/os_solaris_x86.hpp + src/os_cpu/solaris_x86/vm/os_solaris_x86.inline.hpp ! src/os_cpu/solaris_x86/vm/solaris_x86_32.il ! src/os_cpu/solaris_x86/vm/solaris_x86_64.il ! src/os_cpu/windows_x86/vm/os_windows_x86.hpp + src/os_cpu/windows_x86/vm/os_windows_x86.inline.hpp ! src/share/vm/runtime/init.cpp ! 
src/share/vm/runtime/os.cpp ! src/share/vm/runtime/os.hpp + src/share/vm/runtime/os_ext.hpp Changeset: b16494a69d3d Author: phh Date: 2012-01-03 15:11 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b16494a69d3d 7126185: Clean up lasterror handling, add os::get_last_error() Summary: Add os::get_last_error(), replace getLastErrorString() by os::lasterror() in os_windows.cpp. Reviewed-by: kamg, dholmes Contributed-by: erik.gahlin at oracle.com ! src/os/posix/vm/os_posix.cpp ! src/os/windows/vm/os_windows.cpp ! src/share/vm/runtime/os.hpp Changeset: 5b58979183f9 Author: dcubed Date: 2012-01-05 06:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5b58979183f9 7127032: fix for 7122253 adds a JvmtiThreadState earlier than necessary Summary: Use JavaThread::jvmti_thread_state() instead of JvmtiThreadState::state_for(). Reviewed-by: coleenp, poonam, acorn ! src/share/vm/classfile/classFileParser.cpp Changeset: 8a63c6323842 Author: fparain Date: 2012-01-05 07:26 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/8a63c6323842 7125594: C-heap growth issue in ThreadService::find_deadlocks_at_safepoint Reviewed-by: sspitsyn, dcubed, mchung, dholmes ! src/share/vm/services/threadService.cpp Changeset: 2e0ef19fc891 Author: phh Date: 2012-01-05 17:14 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2e0ef19fc891 7126480: Make JVM start time in milliseconds since the Java epoch available Summary: Expose existing Management::_begin_vm_creation_time via new accessor Management::begin_vm_creation_time(). Reviewed-by: acorn, dcubed ! src/share/vm/services/management.hpp Changeset: 66259eca2bf7 Author: phh Date: 2012-01-05 17:16 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/66259eca2bf7 Merge Changeset: 2b3acb34791f Author: dcubed Date: 2012-01-06 16:18 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2b3acb34791f Merge ! src/os/windows/vm/os_windows.cpp ! 
src/share/vm/classfile/classFileParser.cpp ! src/share/vm/runtime/os.hpp Changeset: abcceac2f7cd Author: iveresov Date: 2011-12-12 12:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/abcceac2f7cd 7119730: Tiered: SIGSEGV in AdvancedThresholdPolicy::is_method_profiled(methodOop) Summary: Added handles for references to methods in select_task() Reviewed-by: twisti, kvn ! src/share/vm/runtime/advancedThresholdPolicy.cpp Changeset: 7bca37d28f32 Author: roland Date: 2011-12-13 10:54 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7bca37d28f32 7114106: C1: assert(goto_state->is_same(sux_state)) failed: states must match now Summary: fix C1's CEE to take inlining into account when the stacks in states are compared. Reviewed-by: iveresov, never ! src/share/vm/c1/c1_Optimizer.cpp Changeset: d725f0affb1a Author: iveresov Date: 2011-12-13 17:10 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d725f0affb1a 7121111: -server -Xcomp -XX:+TieredCompilation does not invoke C2 compiler Summary: Exercise C2 more in tiered mode with Xcomp Reviewed-by: kvn, never ! src/share/vm/runtime/arguments.cpp Changeset: 127b3692c168 Author: kvn Date: 2011-12-14 14:54 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/127b3692c168 7116452: Add support for AVX instructions Summary: Added support for AVX extension to the x86 instruction set. Reviewed-by: never ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp ! src/cpu/x86/vm/assembler_x86.inline.hpp ! src/cpu/x86/vm/nativeInst_x86.cpp ! src/cpu/x86/vm/nativeInst_x86.hpp ! src/cpu/x86/vm/register_definitions_x86.cpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! 
src/share/vm/runtime/globals.hpp Changeset: 669f6a7d5b70 Author: never Date: 2011-12-19 14:16 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/669f6a7d5b70 7121073: secondary_super_cache memory slice has incorrect bounds in flatten_alias_type Reviewed-by: kvn ! src/share/vm/opto/compile.cpp Changeset: 65149e74c706 Author: kvn Date: 2011-12-20 00:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/65149e74c706 7121648: Use 3-operands SIMD instructions on x86 with AVX Summary: Use 3-operands SIMD instructions in C2 generated code for machines with AVX. Reviewed-by: never ! make/bsd/makefiles/adlc.make ! make/linux/makefiles/adlc.make ! make/solaris/makefiles/adlc.make ! make/windows/makefiles/adlc.make ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/assembler_x86.hpp + src/cpu/x86/vm/x86.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/opto/matcher.cpp Changeset: 069ab3f976d3 Author: stefank Date: 2011-12-07 11:35 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/069ab3f976d3 7118863: Move sizeof(klassOopDesc) into the *Klass::*_offset_in_bytes() functions Summary: Moved sizeof(klassOopDesc), changed the return type to ByteSize and removed the _in_bytes suffix. Reviewed-by: never, bdelsart, coleenp, jrose ! src/cpu/sparc/vm/assembler_sparc.cpp ! src/cpu/sparc/vm/c1_CodeStubs_sparc.cpp ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_MacroAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_Runtime1_sparc.cpp ! src/cpu/sparc/vm/cppInterpreter_sparc.cpp ! src/cpu/sparc/vm/methodHandles_sparc.cpp ! src/cpu/sparc/vm/stubGenerator_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/assembler_x86.cpp ! src/cpu/x86/vm/c1_CodeStubs_x86.cpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/c1_MacroAssembler_x86.cpp ! src/cpu/x86/vm/c1_Runtime1_x86.cpp ! src/cpu/x86/vm/cppInterpreter_x86.cpp ! 
src/cpu/x86/vm/methodHandles_x86.cpp ! src/cpu/x86/vm/stubGenerator_x86_32.cpp ! src/cpu/x86/vm/stubGenerator_x86_64.cpp ! src/cpu/x86/vm/templateInterpreter_x86_32.cpp ! src/cpu/x86/vm/templateInterpreter_x86_64.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/c1/c1_LIRGenerator.cpp ! src/share/vm/oops/arrayKlass.hpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/klass.cpp ! src/share/vm/oops/klass.hpp ! src/share/vm/oops/klassOop.hpp ! src/share/vm/oops/objArrayKlass.hpp ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/parse1.cpp ! src/share/vm/opto/parseHelper.cpp ! src/share/vm/shark/sharkIntrinsics.cpp ! src/share/vm/shark/sharkTopLevelBlock.cpp Changeset: 1dc233a8c7fe Author: roland Date: 2011-12-20 16:56 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1dc233a8c7fe 7121140: Allocation paths require explicit memory synchronization operations for RMO systems Summary: adds store store barrier after initialization of header and body of objects. Reviewed-by: never, kvn ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/adlc/formssel.cpp ! src/share/vm/opto/callnode.hpp ! src/share/vm/opto/classes.hpp ! src/share/vm/opto/escape.cpp ! src/share/vm/opto/graphKit.cpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/memnode.hpp ! src/share/vm/opto/node.hpp Changeset: e5ac210043cd Author: roland Date: 2011-12-22 10:55 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e5ac210043cd 7123108: C1: assert(if_state != NULL) failed: states do not match up Summary: In CEE, ensure if and common successor state are at the same inline level Reviewed-by: never ! 
src/share/vm/c1/c1_Optimizer.cpp + test/compiler/7123108/Test7123108.java Changeset: b642b49f9738 Author: roland Date: 2011-12-23 09:36 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b642b49f9738 7123253: C1: in store check code, usage of registers may be incorrect Summary: fix usage of input register in assembly code for store check. Reviewed-by: never ! src/share/vm/c1/c1_LIR.cpp Changeset: 40c2484c09e1 Author: kvn Date: 2011-12-23 15:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/40c2484c09e1 7110832: ctw/.../org_apache_avalon_composition_util_StringHelper crashes the VM Summary: Distance is too large for one short branch in string_indexofC8(). Reviewed-by: iveresov ! src/cpu/x86/vm/assembler_x86.cpp ! src/share/vm/asm/assembler.cpp ! src/share/vm/asm/assembler.hpp Changeset: d12a66fa3820 Author: kvn Date: 2011-12-27 15:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d12a66fa3820 7123954: Some CTW test crash with SIGSEGV Summary: Correct Allocate expansion code to preserve i_o when only slow call is generated. Reviewed-by: iveresov ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/macro.cpp Changeset: 8940fd98d540 Author: kvn Date: 2011-12-29 11:37 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/8940fd98d540 Merge ! src/cpu/x86/vm/assembler_x86.cpp ! src/share/vm/runtime/globals.hpp Changeset: 9c87bcb3b4dd Author: kvn Date: 2011-12-30 11:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9c87bcb3b4dd 7125879: assert(proj != NULL) failed: must be found Summary: Leave i_o attached to slow allocation call when there are no i_o users after the call. Reviewed-by: iveresov, twisti ! 
src/share/vm/opto/macro.cpp + test/compiler/7125879/Test7125879.java Changeset: 1cb50d7a9d95 Author: iveresov Date: 2012-01-05 17:25 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/1cb50d7a9d95 7119294: Two command line options cause JVM to crash Summary: Setup thread register in MacroAssembler::incr_allocated_bytes() on x64 Reviewed-by: kvn ! src/cpu/x86/vm/assembler_x86.cpp Changeset: 22cee0ee8927 Author: kvn Date: 2012-01-06 20:09 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/22cee0ee8927 Merge ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_Runtime1_sparc.cpp ! src/cpu/sparc/vm/stubGenerator_sparc.cpp ! src/cpu/sparc/vm/templateInterpreter_sparc.cpp ! src/cpu/sparc/vm/templateTable_sparc.cpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp ! src/cpu/x86/vm/c1_Runtime1_x86.cpp ! src/cpu/x86/vm/stubGenerator_x86_32.cpp ! src/cpu/x86/vm/stubGenerator_x86_64.cpp ! src/cpu/x86/vm/templateInterpreter_x86_32.cpp ! src/cpu/x86/vm/templateInterpreter_x86_64.cpp ! src/cpu/x86/vm/templateTable_x86_32.cpp ! src/cpu/x86/vm/templateTable_x86_64.cpp ! src/cpu/x86/vm/vm_version_x86.cpp ! src/cpu/x86/vm/vm_version_x86.hpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/opto/parseHelper.cpp Changeset: 8f8b94305aff Author: dcubed Date: 2012-01-11 19:54 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/8f8b94305aff 7129240: backout fix for 7102776 until 7128770 is resolved Reviewed-by: phh, bobv, coleenp, dcubed Contributed-by: Jiangli Zhou ! agent/src/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java ! src/share/vm/code/dependencies.cpp ! src/share/vm/oops/instanceKlass.hpp ! src/share/vm/oops/instanceKlassKlass.cpp ! src/share/vm/runtime/vmStructs.cpp Changeset: 4f25538b54c9 Author: fparain Date: 2012-01-09 10:27 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4f25538b54c9 7120511: Add diagnostic commands Reviewed-by: acorn, phh, dcubed, sspitsyn ! 
src/share/vm/classfile/vmSymbols.hpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.cpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/init.cpp ! src/share/vm/services/attachListener.cpp ! src/share/vm/services/diagnosticCommand.cpp ! src/share/vm/services/diagnosticCommand.hpp ! src/share/vm/services/diagnosticFramework.cpp ! src/share/vm/services/diagnosticFramework.hpp ! src/share/vm/services/management.cpp Changeset: 865e0817f32b Author: kamg Date: 2012-01-10 15:47 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/865e0817f32b Merge ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/globals.hpp Changeset: efdf6985a3a2 Author: kamg Date: 2012-01-12 09:59 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/efdf6985a3a2 Merge Changeset: 9d4f4a1825e4 Author: brutisso Date: 2012-01-13 01:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9d4f4a1825e4 Merge From john.cuthbertson at oracle.com Fri Jan 13 21:51:05 2012 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Sat, 14 Jan 2012 05:51:05 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7121547: G1: High number mispredicted branches while iterating over the marking bitmap Message-ID: <20120114055110.73C0547974@hg.openjdk.java.net> Changeset: 2e966d967c5c Author: johnc Date: 2012-01-13 13:27 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2e966d967c5c 7121547: G1: High number mispredicted branches while iterating over the marking bitmap Summary: There is a high number of mispredicted branches associated with calling BitMap::iteratate() from within CMBitMapRO::iterate(). Implement a version of CMBitMapRO::iterate() directly using inline-able routines. Reviewed-by: tonyp, iveresov ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/concurrentMark.inline.hpp ! 
src/share/vm/utilities/bitMap.inline.hpp

From bengt.rutisson at oracle.com Mon Jan 16 02:36:44 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 16 Jan 2012 11:36:44 +0100
Subject: Request for review (XXS): 7130334 G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp
Message-ID: <4F13FDBC.5040907@oracle.com>

Hi all,

Can I have a couple of reviews for this really small and simple change:
http://cr.openjdk.java.net/~brutisso/7130334/webrev.01/

Background

Charlie Hunt got the following error message when he was running low on native memory while using G1: "failed: couldn't reseve backing store for CMS bit map". Kind of strange to mention CMS when you are using G1.

The error comes from concurrentMark.cpp and it turns out that there were a couple of error messages and some comments there that mention CMS. I removed some of the comments and updated others to say "concurrent marking", which I think is closer to what was intended.

I have been searching the rest of the G1 code for references to CMS, but the other places I have found seem correct. So, this issue seems to only concern concurrentMark.cpp/hpp.

CR:
7130334 G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7130334
http://monaco.sfbay.sun.com/detail.jsf?cr=7130334

Bengt

From bengt.rutisson at oracle.com Mon Jan 16 02:47:02 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 16 Jan 2012 11:47:02 +0100
Subject: RFR (S): 7129271 G1: Interference from multiple threads in PrintGC/PrintGCDetails output
In-Reply-To: <4F106E87.4040802@oracle.com>
References: <4F0F3471.6070108@oracle.com> <4F1065AF.4060902@oracle.com> <4F1068FB.5040403@oracle.com> <4F106E6E.2050605@oracle.com> <4F106E87.4040802@oracle.com>
Message-ID: <4F140026.2030104@oracle.com>

John,

With the doConcurrentMark() call just before the "return true" statement I think this looks good too.
Nice that it no longer conflicts with my marking cycle changes for humongous object allocations. :-)

Ship it!
Bengt

On 2012-01-13 18:48, Tony Printezis wrote:
> Thanks John, ship it. :-)
>
> On 01/13/2012 12:48 PM, John Cuthbertson wrote:
>> Hi Tony,
>>
>> Thanks. Good point. I missed the print_heap_after_gc() call. Consider
>> it done.
>>
>> JohnC
>>
>> On 01/13/12 09:25, Tony Printezis wrote:
>>> John,
>>>
>>> Thanks for taking into account my feedback! My only comment is that
>>> you should maybe move the doConcurrentMark() call even further
>>> down (i.e., as the last thing we do in that method before return
>>> true;) given that there's still some output that can be generated
>>> after its current location (e.g., PrintHeapAtGC).
>>>
>>> Tony
>>>
>>> On 01/13/2012 12:11 PM, John Cuthbertson wrote:
>>>> Hi Everyone,
>>>>
>>>> Based upon feedback from Tony, I have a much simpler version of
>>>> this change. The new changes can be found at:
>>>> http://cr.openjdk.java.net/~johnc/7129271/webrev.1/
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>>
>>>> On 01/12/12 11:28, John Cuthbertson wrote:
>>>>> Hi Everyone,
>>>>>
>>>>> Can I have a couple of volunteers review the changes for this CR?
>>>>> The webrev can be found at:
>>>>> http://cr.openjdk.java.net/~johnc/7129271/webrev.0/
>>>>>
>>>>> The issue was that when PrintGC or PrintGCDetails was enabled,
>>>>> during an initial pause, the "concurrent-mark-start" message from
>>>>> the ConcurrentMark thread was interfering with the output (by the
>>>>> VM thread) from the GC pause. This was adding randomness and
>>>>> irregularity to the output that was making it difficult to parse.
>>>>> It was also seen more frequently when the GC logging output was
>>>>> directed to a file rather than stdout.
>>>>>
>>>>> The solution is to move the code that signals the Concurrent Mark
>>>>> thread to after the output from the GC pause is complete.
>>>>>
>>>>> In the webrev, please ignore the counts of the number of lines
>>>>> changed. I added an inner scope and so indented a bunch of code
>>>>> which I think has confused the webrev tool. Fortunately the actual
>>>>> web diffs seem to have not included the extra whitespace.
>>>>>
>>>>> Testing: the GC test suite (with a low marking threshold, 2%, to
>>>>> create lots of marking cycles)
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JohnC
>>>>>
>>>>>
>>>>
>>

From bengt.rutisson at oracle.com Mon Jan 16 06:22:54 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 16 Jan 2012 15:22:54 +0100
Subject: Request for review (M): 6976060 G1: humongous object allocations should initiate marking cycles when necessary
In-Reply-To: <4F0F5E07.6050203@oracle.com>
References: <4F0F5E07.6050203@oracle.com>
Message-ID: <4F1432BE.6060409@oracle.com>

Hi again,

Updated webrev based on comments from Tony:
http://cr.openjdk.java.net/~brutisso/6976060/webrev.04/

Thanks, Tony, for the review! Still need one more...

Bengt

On 2012-01-12 23:26, Bengt Rutisson wrote:
>
> Hi all,
>
> Could I have a couple of reviews for this fix?
> http://cr.openjdk.java.net/~brutisso/6976060/webrev.03/
>
> 6976060 G1: humongous object allocations should initiate marking
> cycles when necessary
> http://monaco.us.oracle.com/detail.jsf?cr=6976060
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6976060
>
> Background:
>
> We can hit the threshold where we should initiate a concurrent marking
> cycle when we do a humongous allocation. This fix will check that
> threshold after each humongous object allocation to make sure that we
> don't miss the chance to run a concurrent mark and have to revert to
> full GCs.
>
> Testing:
> I wrote a small Java app that only allocates humongous objects. Before
> my fix I get only full GCs. With my change I avoid full GCs altogether.
>
> Thanks,
> Bengt

From bengt.rutisson at oracle.com Mon Jan 16 13:24:28 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Mon, 16 Jan 2012 22:24:28 +0100
Subject: Request for review (M): 6976060 G1: humongous object allocations should initiate marking cycles when necessary
In-Reply-To: <4F1432BE.6060409@oracle.com>
References: <4F0F5E07.6050203@oracle.com> <4F1432BE.6060409@oracle.com>
Message-ID: <4F14958C.3010805@oracle.com>

Hi again,

Updated webrev with just one change compared to the last one. Tony pointed out that I mixed up one of the if statements a bit based on his initial comments. Thanks Tony for catching it!

http://cr.openjdk.java.net/~brutisso/6976060/webrev.05/

Bengt

On 2012-01-16 15:22, Bengt Rutisson wrote:
>
> Hi again,
>
> Updated webrev based on comments from Tony:
> http://cr.openjdk.java.net/~brutisso/6976060/webrev.04/
>
> Thanks, Tony, for the review! Still need one more...
>
> Bengt
>
> On 2012-01-12 23:26, Bengt Rutisson wrote:
>>
>> Hi all,
>>
>> Could I have a couple of reviews for this fix?
>> http://cr.openjdk.java.net/~brutisso/6976060/webrev.03/
>>
>> 6976060 G1: humongous object allocations should initiate marking
>> cycles when necessary
>> http://monaco.us.oracle.com/detail.jsf?cr=6976060
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6976060
>>
>> Background:
>>
>> We can hit the threshold where we should initiate a concurrent
>> marking cycle when we do a humongous allocation. This fix will check
>> that threshold after each humongous object allocation to make sure
>> that we don't miss the chance to run a concurrent mark and have to
>> revert to full GCs.
>>
>> Testing:
>> I wrote a small Java app that only allocates humongous objects.
>> Before my fix I get only full GCs. With my change I avoid full GCs
>> altogether.
>> >> Thanks, >> Bengt > From bengt.rutisson at oracle.com Tue Jan 17 04:28:44 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 17 Jan 2012 13:28:44 +0100 Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size In-Reply-To: <4F0F016B.8040707@oracle.com> References: <4F0F016B.8040707@oracle.com> Message-ID: <4F15697C.60504@oracle.com> Hi Tony, This looks good! :-) One question. The CR says "Similarly, we should consider setting the old minimum capacity to the heap minimum capacity. This is only used by jstat and currently we set the minimum capacity of all the spaces to 0.". Do you want to do that as well? Bengt On 2012-01-12 16:51, Tony Printezis wrote: > Hi all, > > I'd like a couple of code reviews for this very small change (one > line!): > > http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/ > > Currently, all the G1 memory pools return "undefined" (-1) as their > max size given that there are no hard boundaries between them. Jon > Masamitsu suggested to at least return the heap max for the old memory > pool so that the pool data is a little bit more informative. > > Tony > From bengt.rutisson at oracle.com Tue Jan 17 04:52:01 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 17 Jan 2012 13:52:01 +0100 Subject: CRR (S): 7097586: G1: improve the per-space output when using jmap -heap In-Reply-To: <4F0F2741.4090705@oracle.com> References: <4F0F2741.4090705@oracle.com> Message-ID: <4F156EF1.6000003@oracle.com> Hi Tony, Looks good! Ship it! Bengt On 2012-01-12 19:32, Tony Printezis wrote: > Hi all, > > I'd like a couple of code reviews for this change that enhances the > heap summary information generated by the SA (which is used for the > jmap -heap output): > > http://cr.openjdk.java.net/~tonyp/7097586/webrev.0/ > > Currently, the heap summary generated for G1 is as close as possible > to what's generated for the other GCs. 
Bengt made a good suggestion > that it'd be helpful to enhance the output with some G1-specific > information in order to make it more informative. The important > changes are the 15 lines or so that were changed in HeapSummary.java, > the rest is boilerplate to be able to access specific fields and > objects from the SA. I included before / after jmap -heap output below. > > Note that we actually had a small bug in the code which caused the > sizing information in the G1MonitoringSupport object to become > inconsistent between a cleanup and the subsequent GC: the old space > information was not updated to reflect any old region reclamation > during cleanup. I fixed this as part of this change too (I'll add a > note to the CR). > > Tony > > BEFORE: > > using thread-local object allocation. > Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 1073741824 (1024.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > > Heap Usage: > G1 Young Generation > Eden Space: > capacity = 19922944 (19.0MB) > used = 3145728 (3.0MB) > free = 16777216 (16.0MB) > 15.789473684210526% used > From Space: > capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > To Space: > capacity = 0 (0.0MB) > used = 0 (0.0MB) > free = 0 (0.0MB) > 0.0% used > G1 Old Generation > capacity = 19922944 (19.0MB) > used = 5849192 (5.578224182128906MB) > free = 14073752 (13.421775817871094MB) > 29.359074642783717% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2749208 (2.6218490600585938MB) > free = 14028008 (13.378150939941406MB) > 16.38655662536621% used > > 1719 interned Strings occupying 137520 bytes.
> > > AFTER (I marked the changes with bold; note that now there's only one > Survivor section, as G1 does not have the concept of two survivors > that are always allocated): > > using thread-local object allocation. > Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 67108864 (64.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > * G1HeapRegionSize = 1048576 (1.0MB) > * > Heap Usage: > *G1 Heap: > regions = 57 > capacity = 59768832 (57.0MB) > used = 18018304 (17.18359375MB) > free = 41750528 (39.81640625MB) > 30.146655701754387% used > *G1 Young Generation: > Eden Space: > * regions = 3 > * capacity = 30408704 (29.0MB) > used = 3145728 (3.0MB) > free = 27262976 (26.0MB) > 10.344827586206897% used > *Survivor Space: > regions = 2 > * capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > G1 Old Generation: > * regions = 13 > * capacity = 27262976 (26.0MB) > used = 12775424 (12.18359375MB) > free = 14487552 (13.81640625MB) > 46.85997596153846% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2741840 (2.6148223876953125MB) > free = 14035376 (13.385177612304688MB) > 16.342639923095703% used > > 1710 interned Strings occupying 136904 bytes. > > >
From jon.masamitsu at oracle.com Tue Jan 17 08:47:24 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 17 Jan 2012 08:47:24 -0800 Subject: Request for review (XXS): 7130334 G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp In-Reply-To: <4F13FDBC.5040907@oracle.com> References: <4F13FDBC.5040907@oracle.com> Message-ID: <4F15A61C.5000401@oracle.com> Looks fine. On 01/16/12 02:36, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this really small and simple change: > http://cr.openjdk.java.net/~brutisso/7130334/webrev.01/ > > Background > Charlie Hunt got the following error message when he was running low > on native memory while using G1: "failed: couldn't reseve backing > store for CMS bit map". Kind of a strange to mention CMS when you are > using G1. > > The error comes from concurrentMark.cpp and it turns out that there > were a couple of error messages and some comments there that mention > CMS. I removed some of the comments and updated others to say > "concurren marking" which I think is more what was intended. > > I have been searching the rest of the G1 code for references to CMS, > but the other places I have found seem correct. So, this issue seems > to only concern concurrentMark.cpp/hpp.
> > CR: > 7130334 G1: Change comments and error messages that refer to CMS in > g1/concurrentMark.cpp/hpp > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7130334 > http://monaco.sfbay.sun.com/detail.jsf?cr=7130334 > > Bengt From john.cuthbertson at oracle.com Tue Jan 17 09:44:26 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 17 Jan 2012 09:44:26 -0800 Subject: Request for review (XXS): 7130334 G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp In-Reply-To: <4F13FDBC.5040907@oracle.com> References: <4F13FDBC.5040907@oracle.com> Message-ID: <4F15B37A.3010007@oracle.com> Hi Bengt, Looks good to me. JohnC On 01/16/12 02:36, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of reviews for this really small and simple change: > http://cr.openjdk.java.net/~brutisso/7130334/webrev.01/ > > Background > Charlie Hunt got the following error message when he was running low > on native memory while using G1: "failed: couldn't reseve backing > store for CMS bit map". Kind of a strange to mention CMS when you are > using G1. > > The error comes from concurrentMark.cpp and it turns out that there > were a couple of error messages and some comments there that mention > CMS. I removed some of the comments and updated others to say > "concurren marking" which I think is more what was intended. > > I have been searching the rest of the G1 code for references to CMS, > but the other places I have found seem correct. So, this issue seems > to only concern concurrentMark.cpp/hpp. 
> > CR: > 7130334 G1: Change comments and error messages that refer to CMS in > g1/concurrentMark.cpp/hpp > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7130334 > http://monaco.sfbay.sun.com/detail.jsf?cr=7130334 > > Bengt From bengt.rutisson at oracle.com Tue Jan 17 09:55:27 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 17 Jan 2012 18:55:27 +0100 Subject: Request for review (XXS): 7130334 G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp In-Reply-To: <4F15B37A.3010007@oracle.com> References: <4F13FDBC.5040907@oracle.com> <4F15B37A.3010007@oracle.com> Message-ID: <4F15B60F.80403@oracle.com> Thanks Tony, Jon and John for the reviews! All set to push this now. Bengt On 2012-01-17 18:44, John Cuthbertson wrote: > Hi Bengt, > > Looks good to me. > > JohnC > > On 01/16/12 02:36, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have a couple of reviews for this really small and simple change: >> http://cr.openjdk.java.net/~brutisso/7130334/webrev.01/ >> >> Background >> Charlie Hunt got the following error message when he was running low >> on native memory while using G1: "failed: couldn't reseve backing >> store for CMS bit map". Kind of a strange to mention CMS when you >> are using G1. >> >> The error comes from concurrentMark.cpp and it turns out that there >> were a couple of error messages and some comments there that mention >> CMS. I removed some of the comments and updated others to say >> "concurren marking" which I think is more what was intended. >> >> I have been searching the rest of the G1 code for references to CMS, >> but the other places I have found seem correct. So, this issue seems >> to only concern concurrentMark.cpp/hpp. 
>> >> CR: >> 7130334 G1: Change comments and error messages that refer to CMS in >> g1/concurrentMark.cpp/hpp >> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7130334 >> http://monaco.sfbay.sun.com/detail.jsf?cr=7130334 >> >> Bengt > From tony.printezis at oracle.com Tue Jan 17 12:47:26 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 17 Jan 2012 15:47:26 -0500 Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size In-Reply-To: <4F15697C.60504@oracle.com> References: <4F0F016B.8040707@oracle.com> <4F15697C.60504@oracle.com> Message-ID: <4F15DE5E.3050401@oracle.com> Bengt, Good point, I'll add it to the change and do some more testing before publishing a new webrev. Thanks, Tony On 01/17/2012 07:28 AM, Bengt Rutisson wrote: > > Hi Tony, > > This looks good! :-) > > One question. The CR says "Similarly, we should consider setting the > old minimum capacity to the heap minimum capacity. This is only used > by jstat and currently we set the minimum capacity of all the spaces > to 0.". Do you want to do that as well? > > Bengt > > On 2012-01-12 16:51, Tony Printezis wrote: >> Hi all, >> >> I'd like a couple of code reviews for this very small change (one >> line!): >> >> http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/ >> >> Currently, all the G1 memory pools return "undefined" (-1) as their >> max size given that there are no hard boundaries between them. Jon >> Masamitsu suggested to at least return the heap max for the old >> memory pool so that the pool data is a little bit more informative. >> >> Tony >> > From jon.masamitsu at oracle.com Tue Jan 17 13:13:48 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Tue, 17 Jan 2012 13:13:48 -0800 Subject: Removing hotspot-gc-dev from hotspot-gc-use mailing list Message-ID: <4F15E48C.50903@oracle.com> I'm going to be removing hotspot-gc-dev from the hotspot-gc-use mailing list. 
If you would like to continue to see the postings to hotspot-gc-use, please subscribe.

http://mail.openjdk.java.net/mailman/listinfo

It seemed like a good idea at the time, but the burden of duplicate mail it causes doesn't feel like it is worth it any longer.

Thanks.

From tony.printezis at oracle.com Tue Jan 17 13:19:46 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Tue, 17 Jan 2012 16:19:46 -0500
Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size
In-Reply-To: <4F15DE5E.3050401@oracle.com>
References: <4F0F016B.8040707@oracle.com> <4F15697C.60504@oracle.com> <4F15DE5E.3050401@oracle.com>
Message-ID: <4F15E5F2.4000304@oracle.com>

Bengt,

I don't think what I had written on the CR actually makes sense. The old gen min capacity cannot be the heap min capacity given that, when the heap capacity is at the minimum, the old gen capacity could be smaller (we need space for the young gen, for example). So, I think I'll leave it as is (and I'll update the CR if you agree). I did update a couple of related comments in the code; here's the updated webrev:

http://cr.openjdk.java.net/~tonyp/7078465/webrev.1/

I'll push it tomorrow as long as no one has any objections.

Tony

On 01/17/2012 03:47 PM, Tony Printezis wrote:
> Bengt,
>
> Good point, I'll add it to the change and do some more testing before
> publishing a new webrev. Thanks,
>
> Tony
>
> On 01/17/2012 07:28 AM, Bengt Rutisson wrote:
>>
>> Hi Tony,
>>
>> This looks good! :-)
>>
>> One question. The CR says "Similarly, we should consider setting the
>> old minimum capacity to the heap minimum capacity. This is only used
>> by jstat and currently we set the minimum capacity of all the spaces
>> to 0.". Do you want to do that as well?
>> >> Bengt >> >> On 2012-01-12 16:51, Tony Printezis wrote: >>> Hi all, >>> >>> I'd like a couple of code reviews for this very small change (one >>> line!): >>> >>> http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/ >>> >>> Currently, all the G1 memory pools return "undefined" (-1) as their >>> max size given that there are no hard boundaries between them. Jon >>> Masamitsu suggested to at least return the heap max for the old >>> memory pool so that the pool data is a little bit more informative. >>> >>> Tony >>> >> From john.cuthbertson at oracle.com Tue Jan 17 13:58:37 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 17 Jan 2012 13:58:37 -0800 Subject: CRR (S): 7097586: G1: improve the per-space output when using jmap -heap In-Reply-To: <4F0F2741.4090705@oracle.com> References: <4F0F2741.4090705@oracle.com> Message-ID: <4F15EF0D.1060300@oracle.com> Hi Tony, This looks good to me. JohnC On 01/12/12 10:32, Tony Printezis wrote: > Hi all, > > I'd like a couple of code reviews for this change that enhances the > heap summary information generated by the SA (which is used for the > jmap -heap output): > > http://cr.openjdk.java.net/~tonyp/7097586/webrev.0/ > > Currently, the heap summary generated for G1 is as close as possible > to what's generated for the other GCs. Bengt made a good suggestion > that it'd be helpful to enhance the output with some G1-specific > information in order to make it more informative. The important > changes are the 15 lines or so that were changed in HeapSummary.java, > the rest is boilerplate to be able to access specific fields and > objects from the SA. I included before / after jmap -heap output below. > > Note that we actually had a small bug in the code which caused the > sizing information in the G1MonitoringSupport object to become > inconsistent between a cleanup and the subsequent GC: the old space > information was not updated to reflect any old region reclamation > during cleanup. 
I fixed this as part of this change too (I'll add a > note to the CR). > > Tony > > BEFORE: > > using thread-local object allocation. > Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 1073741824 (1024.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > > Heap Usage: > G1 Young Generation > Eden Space: > capacity = 19922944 (19.0MB) > used = 3145728 (3.0MB) > free = 16777216 (16.0MB) > 15.789473684210526% used > From Space: > capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > To Space: > capacity = 0 (0.0MB) > used = 0 (0.0MB) > free = 0 (0.0MB) > 0.0% used > G1 Old Generation > capacity = 19922944 (19.0MB) > used = 5849192 (5.578224182128906MB) > free = 14073752 (13.421775817871094MB) > 29.359074642783717% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2749208 (2.6218490600585938MB) > free = 14028008 (13.378150939941406MB) > 16.38655662536621% used > > 1719 interned Strings occupying 137520 bytes. > > > AFTER (I marked the changes with bold; note that now there's only one > Survivor section, as G1 does not have the concept of two survivors > that are always allocated): > > using thread-local object allocation.
> Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 67108864 (64.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > * G1HeapRegionSize = 1048576 (1.0MB) > * > Heap Usage: > *G1 Heap: > regions = 57 > capacity = 59768832 (57.0MB) > used = 18018304 (17.18359375MB) > free = 41750528 (39.81640625MB) > 30.146655701754387% used > *G1 Young Generation: > Eden Space: > * regions = 3 > * capacity = 30408704 (29.0MB) > used = 3145728 (3.0MB) > free = 27262976 (26.0MB) > 10.344827586206897% used > *Survivor Space: > regions = 2 > * capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > G1 Old Generation: > * regions = 13 > * capacity = 27262976 (26.0MB) > used = 12775424 (12.18359375MB) > free = 14487552 (13.81640625MB) > 46.85997596153846% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2741840 (2.6148223876953125MB) > free = 14035376 (13.385177612304688MB) > 16.342639923095703% used > > 1710 interned Strings occupying 136904 bytes. > > > From tony.printezis at oracle.com Tue Jan 17 14:04:48 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 17 Jan 2012 17:04:48 -0500 Subject: CRR (S): 7097586: G1: improve the per-space output when using jmap -heap In-Reply-To: <4F15EF0D.1060300@oracle.com> References: <4F0F2741.4090705@oracle.com> <4F15EF0D.1060300@oracle.com> Message-ID: <4F15F080.4040604@oracle.com> Thanks John! On 01/17/2012 04:58 PM, John Cuthbertson wrote: > Hi Tony, > > This looks good to me.
> > JohnC > > On 01/12/12 10:32, Tony Printezis wrote: >> Hi all, >> >> I'd like a couple of code reviews for this change that enhances the >> heap summary information generated by the SA (which is used for the >> jmap -heap output): >> >> http://cr.openjdk.java.net/~tonyp/7097586/webrev.0/ >> >> Currently, the heap summary generated for G1 is as close as possible >> to what's generated for the other GCs. Bengt made a good suggestion >> that it'd be helpful to enhance the output with some G1-specific >> information in order to make it more informative. The important >> changes are the 15 lines or so that were changed in HeapSummary.java, >> the rest is boilerplate to be able to access specific fields and >> objects from the SA. I included before / after jmap -heap output below. >> >> Note that we actually had a small bug in the code which caused the >> sizing information in the G1MonitoringSupport object to become >> inconsistent between a cleanup and the subsequent GC: the old space >> information was not updated to reflect any old region reclamation >> during cleanup. I fixed this as part of this change too (I'll add a >> note to the CR). >> >> Tony >> >> BEFORE: >> >> using thread-local object allocation. 
>> Garbage-First (G1) GC with 8 thread(s)
>>
>> Heap Configuration:
>> MinHeapFreeRatio = 40
>> MaxHeapFreeRatio = 70
>> MaxHeapSize = 1073741824 (1024.0MB)
>> NewSize = 1048576 (1.0MB)
>> MaxNewSize = 4294967295 (4095.9999990463257MB)
>> OldSize = 4194304 (4.0MB)
>> NewRatio = 2
>> SurvivorRatio = 8
>> PermSize = 16777216 (16.0MB)
>> MaxPermSize = 67108864 (64.0MB)
>>
>> Heap Usage:
>> G1 Young Generation
>> Eden Space:
>> capacity = 19922944 (19.0MB)
>> used = 3145728 (3.0MB)
>> free = 16777216 (16.0MB)
>> 15.789473684210526% used
>> From Space:
>> capacity = 2097152 (2.0MB)
>> used = 2097152 (2.0MB)
>> free = 0 (0.0MB)
>> 100.0% used
>> To Space:
>> capacity = 0 (0.0MB)
>> used = 0 (0.0MB)
>> free = 0 (0.0MB)
>> 0.0% used
>> G1 Old Generation
>> capacity = 19922944 (19.0MB)
>> used = 5849192 (5.578224182128906MB)
>> free = 14073752 (13.421775817871094MB)
>> 29.359074642783717% used
>> Perm Generation:
>> capacity = 16777216 (16.0MB)
>> used = 2749208 (2.6218490600585938MB)
>> free = 14028008 (13.378150939941406MB)
>> 16.38655662536621% used
>>
>> 1719 interned Strings occupying 137520 bytes.
>>
>>
>> AFTER (I marked the changes with bold; note that now there's only one
>> Survivor section, as G1 does not have the concept of two survivors
>> that are always allocated):
>>
>> using thread-local object allocation.
>> Garbage-First (G1) GC with 8 thread(s) >> >> Heap Configuration: >> MinHeapFreeRatio = 40 >> MaxHeapFreeRatio = 70 >> MaxHeapSize = 67108864 (64.0MB) >> NewSize = 1048576 (1.0MB) >> MaxNewSize = 4294967295 (4095.9999990463257MB) >> OldSize = 4194304 (4.0MB) >> NewRatio = 2 >> SurvivorRatio = 8 >> PermSize = 16777216 (16.0MB) >> MaxPermSize = 67108864 (64.0MB) >> * G1HeapRegionSize = 1048576 (1.0MB) >> * >> Heap Usage: >> *G1 Heap: >> regions = 57 >> capacity = 59768832 (57.0MB) >> used = 18018304 (17.18359375MB) >> free = 41750528 (39.81640625MB) >> 30.146655701754387% used >> *G1 Young Generation: >> Eden Space: >> * regions = 3 >> * capacity = 30408704 (29.0MB) >> used = 3145728 (3.0MB) >> free = 27262976 (26.0MB) >> 10.344827586206897% used >> *Survivor Space: >> regions = 2 >> * capacity = 2097152 (2.0MB) >> used = 2097152 (2.0MB) >> free = 0 (0.0MB) >> 100.0% used >> G1 Old Generation: >> * regions = 13 >> * capacity = 27262976 (26.0MB) >> used = 12775424 (12.18359375MB) >> free = 14487552 (13.81640625MB) >> 46.85997596153846% used >> Perm Generation: >> capacity = 16777216 (16.0MB) >> used = 2741840 (2.6148223876953125MB) >> free = 14035376 (13.385177612304688MB) >> 16.342639923095703% used >> >> 1710 interned Strings occupying 136904 bytes. >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120117/fdaa4dbe/attachment-0001.html From tony.printezis at oracle.com Tue Jan 17 15:48:16 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 17 Jan 2012 18:48:16 -0500 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause Message-ID: <4F1608C0.2040601@oracle.com> Hi all, Can I have a couple of code reviews for this change that re-enables the use of survivor regions during the initial-mark pause? 
http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/

From the CR:

We could scan the survivors as we're copying them; however, this would require more work during the initial-mark GCs (in particular, special-case code in the fast path).

A better approach is to let the concurrent marking threads scan the survivors and mark everything reachable from them a) before any more concurrent marking work is done (so that we can just mark the objects, without needing to push them on a stack, and let the "finger" algorithm discover them) and b) before the next GC starts (since, if we copy them, we won't know which of the new survivors are the ones we need to scan).

This approach has the advantage that it does not require any extra work during the initial-mark GCs and all the work is done by the concurrent marking threads. However, it has the disadvantage that the survivor scanning might hold up the next GC. In most cases this should not be an issue as GCs take place at a reasonably low rate. If it does become a problem we could consider the following:

- like when the GC locker is active, try to extend the eden to give a bit more time to the marking threads to finish scanning the survivors
- instead of waiting for the marking threads, a GC can take over and finish up scanning the remaining survivors (typically, we have more GC threads than marking threads, so the overhead will be reduced)
- if we supported region pinning, we could pin all the regions that were not scanned by the time the GC started so that the marking threads can resume scanning them after the GC completes

Implementation notes:

I introduced the concept of "snapshot regions" in ConcurrentMark: a set of regions that need to be scanned at the start of a concurrent cycle. Currently, these can only be survivors, but maybe we can use the same concept for something else in the future.
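
For readers following along without the webrev, the hand-off described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code under review: the Region struct, the SnapshotRegions class, and every method name except wait_until_scan_finished() (which is mentioned in the review of this change) are hypothetical, and standard C++ primitives stand in for HotSpot's own monitors. The set is filled during the initial-mark pause, marking threads claim and scan regions from it, and a GC that arrives early blocks until the last region has been reported as scanned.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a G1 heap region; the flag only records
// that a marking thread has visited it.
struct Region {
  bool scanned = false;
};

// Illustrative only: this is not the class from the webrev.
class SnapshotRegions {
 public:
  // Initial-mark pause: remember which regions (survivors) must be scanned.
  void prepare_for_scan(const std::vector<Region*>& survivors) {
    std::lock_guard<std::mutex> guard(_lock);
    _regions = survivors;
    _next = 0;
    _num_scanned = 0;
    _scan_in_progress = !_regions.empty();
  }

  // Marking thread: claim the next unclaimed region, or nullptr if none left.
  Region* claim_next() {
    std::lock_guard<std::mutex> guard(_lock);
    return _next < _regions.size() ? _regions[_next++] : nullptr;
  }

  // Marking thread: report one claimed region as fully scanned. The last
  // report ends the scan and wakes any GC waiting on it.
  void region_scanned() {
    std::lock_guard<std::mutex> guard(_lock);
    assert(_num_scanned < _regions.size());
    if (++_num_scanned == _regions.size()) {
      _scan_in_progress = false;
      _finished.notify_all();
    }
  }

  // GC: block until every snapshot region has been scanned.
  void wait_until_scan_finished() {
    std::unique_lock<std::mutex> guard(_lock);
    _finished.wait(guard, [this] { return !_scan_in_progress; });
  }

  bool scan_in_progress() {
    std::lock_guard<std::mutex> guard(_lock);
    return _scan_in_progress;
  }

 private:
  std::mutex _lock;
  std::condition_variable _finished;
  std::vector<Region*> _regions;
  std::size_t _next = 0;
  std::size_t _num_scanned = 0;
  bool _scan_in_progress = false;
};

// What one marking thread would do: drain the snapshot set.
inline std::size_t scan_claimed_regions(SnapshotRegions& snapshot) {
  std::size_t count = 0;
  while (Region* r = snapshot.claim_next()) {
    r->scanned = true;  // stand-in for marking everything reachable from r
    snapshot.region_scanned();
    ++count;
  }
  return count;
}
```

In this sketch the GC simply waits; the alternatives listed in the CR (extending the eden, letting the GC finish the scan itself, or pinning unscanned regions) would replace the body of wait_until_scan_finished() rather than change the claiming protocol.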
Tony From poonam.bajaj at oracle.com Tue Jan 17 22:44:11 2012 From: poonam.bajaj at oracle.com (Poonam Bajaj) Date: Wed, 18 Jan 2012 12:14:11 +0530 Subject: CRR (S): 7097586: G1: improve the per-space output when using jmap -heap In-Reply-To: <4F0F2741.4090705@oracle.com> References: <4F0F2741.4090705@oracle.com> Message-ID: <4F166A3B.6020008@oracle.com> Hi Tony, Looks good! Thanks for making these serviceability changes.. regards, Poonam On 1/13/2012 12:02 AM, Tony Printezis wrote: > Hi all, > > I'd like a couple of code reviews for this change that enhances the > heap summary information generated by the SA (which is used for the > jmap -heap output): > > http://cr.openjdk.java.net/~tonyp/7097586/webrev.0/ > > Currently, the heap summary generated for G1 is as close as possible > to what's generated for the other GCs. Bengt made a good suggestion > that it'd be helpful to enhance the output with some G1-specific > information in order to make it more informative. The important > changes are the 15 lines or so that were changed in HeapSummary.java, > the rest is boilerplate to be able to access specific fields and > objects from the SA. I included before / after jmap -heap output below. > > Note that we actually had a small bug in the code which caused the > sizing information in the G1MonitoringSupport object to become > inconsistent between a cleanup and the subsequent GC: the old space > information was not updated to reflect any old region reclamation > during cleanup. I fixed this as part of this change too (I'll add a > note to the CR). > > Tony > > BEFORE: > > using thread-local object allocation. 
> Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 1073741824 (1024.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > > Heap Usage: > G1 Young Generation > Eden Space: > capacity = 19922944 (19.0MB) > used = 3145728 (3.0MB) > free = 16777216 (16.0MB) > 15.789473684210526% used > From Space: > capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > To Space: > capacity = 0 (0.0MB) > used = 0 (0.0MB) > free = 0 (0.0MB) > 0.0% used > G1 Old Generation > capacity = 19922944 (19.0MB) > used = 5849192 (5.578224182128906MB) > free = 14073752 (13.421775817871094MB) > 29.359074642783717% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2749208 (2.6218490600585938MB) > free = 14028008 (13.378150939941406MB) > 16.38655662536621% used > > 1719 interned Strings occupying 137520 bytes. > > > AFTER (I marked the changes with bold; note that now there's only one > Survivor section, as G1 does not have the concept of two survivors > that are always allocated): > > using thread-local object allocation. 
> Garbage-First (G1) GC with 8 thread(s) > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 67108864 (64.0MB) > NewSize = 1048576 (1.0MB) > MaxNewSize = 4294967295 (4095.9999990463257MB) > OldSize = 4194304 (4.0MB) > NewRatio = 2 > SurvivorRatio = 8 > PermSize = 16777216 (16.0MB) > MaxPermSize = 67108864 (64.0MB) > * G1HeapRegionSize = 1048576 (1.0MB) > * > Heap Usage: > *G1 Heap: > regions = 57 > capacity = 59768832 (57.0MB) > used = 18018304 (17.18359375MB) > free = 41750528 (39.81640625MB) > 30.146655701754387% used > *G1 Young Generation: > Eden Space: > * regions = 3 > * capacity = 30408704 (29.0MB) > used = 3145728 (3.0MB) > free = 27262976 (26.0MB) > 10.344827586206897% used > *Survivor Space: > regions = 2 > * capacity = 2097152 (2.0MB) > used = 2097152 (2.0MB) > free = 0 (0.0MB) > 100.0% used > G1 Old Generation: > * regions = 13 > * capacity = 27262976 (26.0MB) > used = 12775424 (12.18359375MB) > free = 14487552 (13.81640625MB) > 46.85997596153846% used > Perm Generation: > capacity = 16777216 (16.0MB) > used = 2741840 (2.6148223876953125MB) > free = 14035376 (13.385177612304688MB) > 16.342639923095703% used > > 1710 interned Strings occupying 136904 bytes. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120118/423f7da0/attachment.html From bengt.rutisson at oracle.com Wed Jan 18 00:36:29 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Wed, 18 Jan 2012 08:36:29 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7130334: G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp Message-ID: <20120118083634.0B65A479B2@hg.openjdk.java.net> Changeset: 851b58c26def Author: brutisso Date: 2012-01-16 11:21 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/851b58c26def 7130334: G1: Change comments and error messages that refer to CMS in g1/concurrentMark.cpp/hpp Summary: Removed references to CMS in the concurrentMark.cpp/hpp files. Reviewed-by: tonyp, jmasa, johnc ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp From bengt.rutisson at oracle.com Wed Jan 18 02:07:52 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 18 Jan 2012 11:07:52 +0100 Subject: RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking In-Reply-To: <4F0E9931.9070303@oracle.com> References: <4E8A40BE.9020800@oracle.com> <4EC2B317.3000006@oracle.com> <4ED38788.4010106@oracle.com> <4EF0DEF9.30306@oracle.com> <4EF1AF4E.80107@oracle.com> <4EF2127E.5050809@oracle.com> <4F0E9931.9070303@oracle.com> Message-ID: <4F1699F8.7040405@oracle.com> Hi John, I had a quick look through the changes. Looks good to me. I looked a little closer at the log output and locking in VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to take the ParGCRareEvent_lock to avoid interleaved output from different threads, but I am having second thoughts about this. All your output is prefixed with "Region %d" and logged with print_cr(), so even if several threads do logging at the same time it should be easy enough to parse the log files. 
There is one exception to this and that is this block: 1612 if (failures > 0 && _verbose) { 1613 gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", ntams: " 1614 PTR_FORMAT", top: "PTR_FORMAT", end: "PTR_FORMAT, 1615 hr->hrs_index(), hr->bottom(), hr->next_top_at_mark_start(), 1616 hr->top(), hr->end()); 1617 gclog_or_tty->print_cr(", marked_bytes: calc/actual "SIZE_FORMAT"/"SIZE_FORMAT, 1618 _calc_cl.region_marked_bytes(), 1619 hr->next_marked_bytes()); 1620 } I guess this is what triggered my suggestion to use the lock in the last review. But now that I look at it I think it would be better to just merge the print() and print_cr() statements into one single print_cr() statement. If you do that I don't see the need for taking the lock. Sorry for going back and forth on this topic. The rest of the changes look good to me! Bengt On 2012-01-12 09:26, John Cuthbertson wrote: > Hi Everyone, > > The latest incarnation of these changes can be found at: > http://cr.openjdk.java.net/~johnc/6484965/webrev.3/ > > The changes in this version include: > * Conditionally using a lock so that the output of the verification > closure executed by different threads does not interfere with each > other (suggested by Bengt). > * Merging up to the latest hotspot-gc tip (including Tony's marking > changes). This involved changing the evacuation failure code and > adding a suitable mark/count routine for use in > ConcurrentMark::grayRoot(). I also removed the counting changes from > code that has been made obsolete as a result of Tony's marking changes. > > Testing: a few runs of the GC test suite with low marking thresholds > (2 and 10%) with and without verification, and jprt. > > Thanks, > > JohnC > > On 12/21/2011 9:08 AM, John Cuthbertson wrote: >> Hi Bengt, >> >> That's a good observation. I guess it is possible but I haven't seen >> it in practice (though I was typically only using 4 threads when >> debugging a verification failure). 
It won't do any harm so I'll add >> the locking. >> >> Thanks, >> >> JohnC >> >> >> >> On 12/21/2011 2:05 AM, Bengt Rutisson wrote: >>> >>> Hi John, >>> >>> Thanks for updating your fix! Looks good. >>> >>> One question: >>> In concurrentMark.cpp it seems to me that the >>> VerifyLiveObjectDataHRClosure could get the same kind of messed up >>> output that Tony just fixed with 7123165 for the VerifyLiveClosure >>> in heapRegion.cpp. There are several workers simultaneously doing >>> the verification, right? Is it worth adding the same kind of locking >>> that Tony added? >>> >>> Bengt >>> >>> On 2011-12-20 20:16, John Cuthbertson wrote: >>>> Hi Bengt, >>>> >>>> As I mentioned earlier - thanks for the code review. I've applied >>>> your suggestions, merged with the latest changeset in >>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int <-> >>>> size_t issue you also mentioned, and retested using the GC test >>>> suite. A new webrev can be found at: >>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/ >>>> >>>> Specific replies are inline. >>>> >>>> On 11/28/11 05:07, Bengt Rutisson wrote: >>>>> >>>>> John, >>>>> >>>>> A little late, but here are some comments on this webrev. I know >>>>> you have some more improvements to this change coming, but overall >>>>> I think it looks good. Most of my comments are just minor coding >>>>> style comments. >>>>> >>>>> Bengt >>>>> >>>>> concurrentMark.hpp >>>>> >>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() or >>>>> ConcurrentMark::unmark()? The comment you added is definitely >>>>> needed to understand what this method does. But it would be even >>>>> better if it was possible to get that from the method name itself. >>>> >>>> Done. >>>> >>>>> It seems like everywhere we use count_marked_bytes_for(int >>>>> worker_i) we almost directly use the array returned to index with >>>>> the heap region that we are interested in.
How about wrapping all >>>>> of this in something like count_set_marked_bytes_for(int >>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int >>>>> worker_i, int hrs_index)? That way the data structure does not >>>>> have to be exposed outside ConcurrentMark. It would mean that >>>>> ConcurrentMark::count_region() would have to take a worker_i value >>>>> instead of a marked_bytes_array. >>>> >>>> I did not do this. I embed the marked_bytes array for a worker into >>>> the CMTask for that worker to save a de-reference. This was one of >>>> the requests from the original code walk-through. Avoiding the >>>> de-reference in the CMTask::do_marking_step() shaves a couple of >>>> points off the marking time. I think your suggestion would >>>> reinstate the de-reference and we would lose those few >>>> percentage points again. >>>> >>>>> If you don't agree with the suggestion above I would suggest to >>>>> change the name from count_marked_bytes_for() to >>>>> count_marked_bytes_array_for() since in every place that it is >>>>> being called the resulting value is stored in a local variable >>>>> called marked_bytes_array, which seems like a more informative >>>>> name to me. >>>> >>>> Done. I agree - the new name sounds better. >>>> >>>>> I think this comment: >>>>> >>>>> // As above - but we don't know the heap region containing the >>>>> // object and so have to supply it. >>>>> inline bool par_mark_and_count(oop obj, int worker_i); >>>>> >>>>> should be something like "we don't know the heap region containing >>>>> the object so we will have to look it up". >>>>> >>>>> Same thing here: >>>>> >>>>> // As above - but we don't have the heap region containing the >>>>> // object, so we have to supply it. >>>>> // Should *not* be called from parallel code.
>>>>> inline bool mark_and_count(oop obj); >>>>> >>>>> >>>> >>>> Comments were changed to: >>>> >>>> >>>>> concurrentMark.cpp >>>>> >>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() >>>>> anyway, could you please remove this unused code (1393-1397): >>>>> >>>>> /* >>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.", >>>>> obj_card_num - _bottom_card_num, >>>>> obj_last_card_num - _bottom_card_num); >>>>> */ >>>>> >>>>> >>>> >>>> Done. >>>> >>>>> What about the destructor ConcurrentMark::~ConcurrentMark()? I >>>>> remember Tony mentioning that it won't be called. Do you still >>>>> want to keep the code? >>>> >>>> I removed the entire destructor - I don't see it being called in >>>> the experiments I've run. >>>> >>>>> FinalCountDataUpdateClosure::set_bit_for_region() >>>>> Probably not worth it, but would it make sense to add information >>>>> in a startsHumongous HeapRegion to be able to give you the last >>>>> continuesHumongous region? Since we know this when we set the >>>>> regions up it seems like a waste to have to iterate over the >>>>> region list to find it. >>>> >>>> If you read the original comment - the original author did not want >>>> to make any assumptions about the internal field values of the >>>> HeapRegions spanned by a humongous object and so used the loop >>>> technique. I think you are correct and I now use the information in >>>> the startsHumongous region to find the index of the last >>>> continuesHumongous region spanned by the H-obj. >>>> >>>>> G1ParFinalCountTask >>>>> To me it is a bit surprising that we mix in the verify code inside >>>>> this closure. Would it be possible to extract this code out somehow? >>>> >>>> I did it this way to avoid another iteration over the heap regions. >>>> But it probably does make more sense to separate them and use >>>> another iteration to do the verify. Done. >>>> >>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on >>>>> doing?
>>>> >>>> I removed the comment. I was thinking of doing this as >>>> fill_to_bytes is typically implemented using (a possibly >>>> specialized version of) memset. But it's probably not worth it in >>>> this case. >>>> >>>>> G1ParFinalCountTask::work() >>>>> Just for the record. I don't really like the way we have to set up >>>>> both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even >>>>> though we will only use them if we have VerifyDuringGC enabled. I >>>>> realize it is due to the scoping, but I still think it obstructs >>>>> the code flow and introduces unnecessary work. Unfortunately I >>>>> don't have a good suggestion for how to work around it. >>>>> >>>>> Since both VerifyLiveObjectDataHRClosure and Mux2HRClosure are >>>>> StackObjs I assume it is not possible to get around the issue with >>>>> a ResourceMark. >>>> >>>> Now that the verification is performed in a separate iteration of >>>> the heap regions there's no need to create the >>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here. >>>> Done. I have also removed the now-redundant Mux2HRClosure. >>>> >>>> Hopefully the new webrev addresses these comments. >>>> >>>> Thanks again for looking. >>>> >>>> JohnC >>>> >>> >> > From bengt.rutisson at oracle.com Wed Jan 18 03:38:27 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 18 Jan 2012 12:38:27 +0100 Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size In-Reply-To: <4F15E5F2.4000304@oracle.com> References: <4F0F016B.8040707@oracle.com> <4F15697C.60504@oracle.com> <4F15DE5E.3050401@oracle.com> <4F15E5F2.4000304@oracle.com> Message-ID: <4F16AF33.3060202@oracle.com> Tony, On 2012-01-17 22:19, Tony Printezis wrote: > Bengt, > > I don't think what I had written on the CR actually makes sense. The
The > old gen min capacity cannot be the heap min capacity given that, when > the heap capacity is at the minimum, the old gen capacity could be > smaller (we need space for the young gen for example). So, I think > I'll leave it as is (and I'll update the CR if you agree). Good point. I completely agree. > I did update a couple of related comments in the code, here's the > updated webrev: > > http://cr.openjdk.java.net/~tonyp/7078465/webrev.1/ Looks good. Great that you found the comment too. Copyright year needs to be 2012... ;-) > I'll push it tomorrow as long as noone has any objections. I am all for it! Bengt > > Tony > > On 01/17/2012 03:47 PM, Tony Printezis wrote: >> Bengt, >> >> Good point, I'll add it to the change and do some more testing before >> publishing a new webrev. Thanks, >> >> Tony >> >> On 01/17/2012 07:28 AM, Bengt Rutisson wrote: >>> >>> Hi Tony, >>> >>> This looks good! :-) >>> >>> One question. The CR says "Similarly, we should consider setting the >>> old minimum capacity to the heap minimum capacity. This is only used >>> by jstat and currently we set the minimum capacity of all the spaces >>> to 0.". Do you want to do that as well? >>> >>> Bengt >>> >>> On 2012-01-12 16:51, Tony Printezis wrote: >>>> Hi all, >>>> >>>> I'd like a couple of code reviews for this very small change (one >>>> line!): >>>> >>>> http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/ >>>> >>>> Currently, all the G1 memory pools return "undefined" (-1) as their >>>> max size given that there are no hard boundaries between them. Jon >>>> Masamitsu suggested to at least return the heap max for the old >>>> memory pool so that the pool data is a little bit more informative. 
>>>> >>>> Tony >>>> >>> From tony.printezis at oracle.com Wed Jan 18 05:32:51 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 18 Jan 2012 08:32:51 -0500 Subject: CRR (XXS): 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size In-Reply-To: <4F16AF33.3060202@oracle.com> References: <4F0F016B.8040707@oracle.com> <4F15697C.60504@oracle.com> <4F15DE5E.3050401@oracle.com> <4F15E5F2.4000304@oracle.com> <4F16AF33.3060202@oracle.com> Message-ID: <4F16CA03.80903@oracle.com> Bengt, Thanks for the comments below and for catching that I forgot to update the copyright year! I will try to push this today after you get your second changeset in. Tony On 1/18/2012 6:38 AM, Bengt Rutisson wrote: > > Tony, > > On 2012-01-17 22:19, Tony Printezis wrote: >> Bengt, >> >> I don't think what I had written on the CR actually makes sense. The >> old gen min capacity cannot be the heap min capacity given that, when >> the heap capacity is at the minimum, the old gen capacity could be >> smaller (we need space for the young gen for example). So, I think >> I'll leave it as is (and I'll update the CR if you agree). > > Good point. I completely agree. > >> I did update a couple of related comments in the code, here's the >> updated webrev: >> >> http://cr.openjdk.java.net/~tonyp/7078465/webrev.1/ > > Looks good. Great that you found the comment too. > > Copyright year needs to be 2012... ;-) > >> I'll push it tomorrow as long as no one has any objections. > > I am all for it! > > Bengt > >> >> Tony >> >> On 01/17/2012 03:47 PM, Tony Printezis wrote: >>> Bengt, >>> >>> Good point, I'll add it to the change and do some more testing >>> before publishing a new webrev. Thanks, >>> >>> Tony >>> >>> On 01/17/2012 07:28 AM, Bengt Rutisson wrote: >>>> >>>> Hi Tony, >>>> >>>> This looks good! :-) >>>> >>>> One question. The CR says "Similarly, we should consider setting >>>> the old minimum capacity to the heap minimum capacity.
This is only >>>> used by jstat and currently we set the minimum capacity of all the >>>> spaces to 0.". Do you want to do that as well? >>>> >>>> Bengt >>>> >>>> On 2012-01-12 16:51, Tony Printezis wrote: >>>>> Hi all, >>>>> >>>>> I'd like a couple of code reviews for this very small change (one >>>>> line!): >>>>> >>>>> http://cr.openjdk.java.net/~tonyp/7078465/webrev.0/ >>>>> >>>>> Currently, all the G1 memory pools return "undefined" (-1) as >>>>> their max size given that there are no hard boundaries between >>>>> them. Jon Masamitsu suggested to at least return the heap max for >>>>> the old memory pool so that the pool data is a little bit more >>>>> informative. >>>>> >>>>> Tony >>>>> >>>> > From bengt.rutisson at oracle.com Wed Jan 18 06:01:38 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Wed, 18 Jan 2012 14:01:38 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 6976060: G1: humongous object allocations should initiate marking cycles when necessary Message-ID: <20120118140140.DA1C8479B6@hg.openjdk.java.net> Changeset: 9509c20bba28 Author: brutisso Date: 2012-01-16 22:10 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9509c20bba28 6976060: G1: humongous object allocations should initiate marking cycles when necessary Reviewed-by: tonyp, johnc ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp ! src/share/vm/gc_implementation/g1/vm_operations_g1.cpp ! src/share/vm/gc_interface/gcCause.cpp ! 
src/share/vm/gc_interface/gcCause.hpp

From bengt.rutisson at oracle.com Wed Jan 18 06:23:51 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Wed, 18 Jan 2012 15:23:51 +0100
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F1608C0.2040601@oracle.com>
References: <4F1608C0.2040601@oracle.com>
Message-ID: <4F16D5F7.4070605@oracle.com>

Tony,

Overall this looks really good. Thanks for fixing it.

Some comments:

First, a general question regarding naming and logging. We now talk about "snapshot" a lot. It is a pretty good name, but maybe it needs some more context to be understandable in the code and the GC log. I don't have any really good names, but maybe "survivor_snapshot" or "initial_mark_snapshot"?

concurrentMark.inline.hpp

  if (hr == NULL) {
    hr = _g1h->heap_region_containing_raw(addr);
    // Given that we're looking for a region that contains an object
    // header it's impossible to get back a HC region.
    assert(!hr->continuesHumongous(), "sanity");
  } else {
    assert(hr->is_in(addr), "pre-condition");
  }

The first assert should probably hold even for regions that are passed in to grayRoot(), right? So, maybe something like:

  if (hr == NULL) {
    hr = _g1h->heap_region_containing_raw(addr);
  } else {
    assert(hr->is_in(addr), "pre-condition");
  }
  // Given that we need a region that contains an object
  // header it's impossible for it to be a HC region.
  assert(!hr->continuesHumongous(), "sanity");

concurrentMarkThread.cpp

ConcurrentMarkThread::run()

Why do we do the explicit time/date stamping?

  gclog_or_tty->date_stamp(PrintGCDateStamps);
  gclog_or_tty->stamp(PrintGCTimeStamps);
  gclog_or_tty->print_cr("[GC concurrent-snapshot-scan-start]");

Why is the normal -XX:+PrintGCTimeStamps information not enough? This is probably correct since I see this pattern in other places. But I would like to understand why we do it.

g1CollectedHeap.cpp: G1CollectedHeap::do_collection() Is it worth logging how long we had to wait in _cm->snapshot_regions()->wait_until_scan_finished(), the same way that we do in G1CollectedHeap::do_collection_pause_at_safepoint()? Finally, just some food for thought. Could this be generalized to more roots? I mean take a snapshot and scan it concurrently. Bengt On 2012-01-18 00:48, Tony Printezis wrote: > Hi all, > > Can I have a couple of code reviews for this change that re-enables > the use of survivor regions during the initial-mark pause? > > http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ > > From the CR: > > We could scan the survivors as we're copying them, however this will > require more work during the initial-mark GCs (and in particular: > special-case code in the fast path). > > A better approach is to let the concurrent marking threads scan the > survivors and mark everything reachable from them a) before any more > concurrent marking work is done (so that we can just mark the objects, > without needing to push them on a stack, and let the "finger" > algorithm discover them) and b) before the next GC starts (since, if > we copy them, we won't know which of the new survivors are the ones we > need to scan). > > This approach has the advantage that it does not require any extra > work during the initial-mark GCs and all the work is done by the > concurrent marking threads. However, it has the disadvantage that the > survivor scanning might hold up the next GC. In most cases this should > not be an issue as GCs take place at a reasonably low rate. 
If it does > become a problem we could consider the following: > > - like when the GC locker is active, try to extend the eden to give a > bit more time to the marking threads to finish scanning the survivors > - instead of waiting for the marking threads, a GC can take over and > finish up scanning the remaining survivors (typically, we have more GC > threads than marking threads, so the overhead will be reduced) > - if we supported region pinning, we could pin all the regions that > were not scanned by the time the GC started so that the marking > threads can resume scanning them after the GC completes > > Implementation notes: > > I introduced the concept of a "snapshot regions" in the ConcurrentMark > which is a set of regions that need to be scanned at the start of a > concurrent cycle. Currently, these can only be survivors but maybe we > can use the same concept for something else in the future. > > Tony > > From john.cuthbertson at oracle.com Wed Jan 18 09:53:28 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 18 Jan 2012 09:53:28 -0800 Subject: RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking In-Reply-To: <4F1699F8.7040405@oracle.com> References: <4E8A40BE.9020800@oracle.com> <4EC2B317.3000006@oracle.com> <4ED38788.4010106@oracle.com> <4EF0DEF9.30306@oracle.com> <4EF1AF4E.80107@oracle.com> <4EF2127E.5050809@oracle.com> <4F0E9931.9070303@oracle.com> <4F1699F8.7040405@oracle.com> Message-ID: <4F170718.3090601@oracle.com> Hi Bengt, Not a problem. I'll take out the locking and combine the print statements. If you don't mind I'll also change the prints in the verification code to use HR_FORMAT and HR_FORMAT_PARAMS to make them a bit more consistent with others. Thanks again. JohnC On 1/18/2012 2:07 AM, Bengt Rutisson wrote: > > Hi John, > > I had a quick look through the changes. Looks good to me. > > I looked a little closer at the log output and locking in > VerifyLiveObjectDataHRClosure::doHeapRegion(). 
I know I suggested to > take the ParGCRareEvent_lock to avoid interleaved output from > different threads, but I am having second thoughts about this. > > All your output is prefixed with "Region %d" and logged with > print_cr(), so even if several threads do logging at the same time it > should be easy enough to parse the log files. There is one exception > to this and that is this block: > > 1612 if (failures > 0 && _verbose) { > 1613 gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", ntams: " > 1614 PTR_FORMAT", top: "PTR_FORMAT", end: > "PTR_FORMAT, > 1615 hr->hrs_index(), hr->bottom(), > hr->next_top_at_mark_start(), > 1616 hr->top(), hr->end()); > 1617 gclog_or_tty->print_cr(", marked_bytes: calc/actual > "SIZE_FORMAT"/"SIZE_FORMAT, > 1618 _calc_cl.region_marked_bytes(), > 1619 hr->next_marked_bytes()); > 1620 } > > I guess this is what triggered my suggestion to use the lock in the > last review. But now that I look at it I think it would be better to > just merge the print() and print_cr() statements into one single > print_cr() statement. If you do that I don't see the need for taking > the lock. > > Sorry for going back and forth on this topic. > > The rest of the changes look good to me! > Bengt > > On 2012-01-12 09:26, John Cuthbertson wrote: >> Hi Everyone, >> >> The latest incarnation of these changes can be found at: >> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/ >> >> The changes in this version include: >> * Conditionally using a lock so that the output of the verification >> closure executed by different threads does not interfere with each >> other (suggested by Bengt). >> * Merging up to the latest hotspot-gc tip (including Tony's marking >> changes). This involved changing the evacuation failure code and >> adding a suitable mark/count routine for use in >> ConcurrentMark::grayRoot(). I also removed the counting changes from >> code that has been made obsolete as a result of Tony's marking changes. 
>> >> Testing: a few runs of the GC test suite with low marking thresholds >> (2 and 10%) with and without verification, and jprt. >> >> Thanks, >> >> JohnC >> >> On 12/21/2011 9:08 AM, John Cuthbertson wrote: >>> Hi Bengt, >>> >>> That's a good observation. I guess it is possible but I haven't seen >>> it in practice (though I was typically only using 4 threads when >>> debugging a verification failure). It won't do any harm so I'll add >>> the locking. >>> >>> Thanks, >>> >>> JohnC >>> >>> >>> >>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote: >>>> >>>> Hi John, >>>> >>>> Thanks for updating your fix! Looks good. >>>> >>>> One question: >>>> In concurrentMark.cpp it seems to me that the >>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up >>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure >>>> in heapRegion.cpp. There are several workers simultaneously doing >>>> the verification, right? Is it worth adding the same kind of >>>> locking that Tony added? >>>> >>>> Bengt >>>> >>>> On 2011-12-20 20:16, John Cuthbertson wrote: >>>>> Hi Bengt, >>>>> >>>>> As I mentioned earlier - thanks for the code review. I've applied >>>>> your suggestions, merged with the latest changeset in >>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int >>>>> <-> size_t issue you also mentioned, and retested using the GC >>>>> test suite. A new webrev can be found at: >>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/ >>>>> >>>>> Specific replies are inline. >>>>> >>>>> On 11/28/11 05:07, Bengt Rutisson wrote: >>>>>> >>>>>> John, >>>>>> >>>>>> A little late, but here are some comments on this webrev. I know >>>>>> you have some more improvements to this change coming, but >>>>>> overall I think it looks good. Most of my comments are just minor >>>>>> coding style comments.
>>>>>> >>>>>> Bengt >>>>>> >>>>>> concurrentMark.hpp >>>>>> >>>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() >>>>>> or ConcurrentMark::unmark()? The comment you added is definitely >>>>>> needed to understand what this method does. But it would be even >>>>>> better if it was possible to get that from the method name itself. >>>>> >>>>> Done. >>>>> >>>>>> It seems like everywhere we use count_marked_bytes_for(int >>>>>> worker_i) we almost directly use the array returned to index with >>>>>> the heap region that we are interested in. How about wrapping all >>>>>> of this in something like count_set_marked_bytes_for(int >>>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int >>>>>> worker_i, int hrs_index) ? That way the data structure does not >>>>>> have to be exposed outside ConcurrentMark. It would mean that >>>>>> ConcurrentMark::count_region() would have to take a worker_i >>>>>> value instead of a marked_bytes_array. >>>>> >>>>> I did not do this. I embed the marked_bytes array for a worker >>>>> into the CMTask for that worker to save a de-reference. This was >>>>> one of the requests from the original code walk-through. Avoiding >>>>> the de-reference in the CMTask::do_marking_step() shaves a couple >>>>> of points off the marking time. I think your suggestion would >>>>> reinstate the de-reference again and we would lose those few >>>>> percentage points again. >>>>> >>>>>> If you don't agree with the suggestion above I would suggest to >>>>>> change the name from count_marked_bytes_for() to >>>>>> count_marked_bytes_array_for() since in every place that it is >>>>>> being called the resulting value is stored in a local variable >>>>>> called marked_bytes_array, which seems like a more informative >>>>>> name to me. >>>>> >>>>> Done. I agree - the new name sounds better.
>>>>> >>>>>> I think this comment: >>>>>> >>>>>> // As above - but we don't know the heap region containing the >>>>>> // object and so have to supply it. >>>>>> inline bool par_mark_and_count(oop obj, int worker_i); >>>>>> >>>>>> should be something like "we don't know the heap region >>>>>> containing the object so we will have to look it up". >>>>>> >>>>>> Same thing here: >>>>>> >>>>>> // As above - but we don't have the heap region containing the >>>>>> // object, so we have to supply it. >>>>>> // Should *not* be called from parallel code. >>>>>> inline bool mark_and_count(oop obj); >>>>>> >>>>>> >>>>> >>>>> Comments were changed to: >>>>> >>>>> >>>>>> concurrentMark.cpp >>>>>> >>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() >>>>>> anyway, could you please remove this unused code (1393-1397): >>>>>> >>>>>> /* >>>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.", >>>>>> obj_card_num - _bottom_card_num, >>>>>> obj_last_card_num - >>>>>> _bottom_card_num); >>>>>> */ >>>>>> >>>>>> >>>>> >>>>> Done. >>>>> >>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I >>>>>> remember Tony mentioning that it won't be called. Do you still >>>>>> want to keep the code? >>>>> >>>>> I removed the entire destructor - I don't see it being called in >>>>> the experiments I've run. >>>>> >>>>>> FinalCountDataUpdateClosure::set_bit_for_region() >>>>>> Probably not worth it, but would it make sense to add information >>>>>> in a startsHumongous HeapRegion to be able to give you the last >>>>>> continuesHumongous region? Since we know this when we set the >>>>>> regions up it seems like a waste to have to iterate over the >>>>>> region list to find it. >>>>> >>>>> If you read the original comment - the original author did not >>>>> want to make any assumptions about the internal field values of >>>>> the HeapRegions spanned by a humongous object and so used the loop >>>>> technique. 
I think you are correct and I now use the information >>>>> in the startsHumongous region to find the index of the last >>>>> continuesHumongous region spanned by the H-obj. >>>>> >>>>>> G1ParFinalCountTask >>>>>> To me it is a bit surprising that we mix in the verify code >>>>>> inside this closure. Would it be possible to extract this code >>>>>> out somehow? >>>>> >>>>> I did it this way to avoid another iteration over the heap >>>>> regions. But it probably does make more sense to separate them and >>>>> use another iteration to do the verify. Done. >>>>> >>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on >>>>>> doing? >>>>> >>>>> I removed the comment. I was thinking of doing this as >>>>> fill_to_bytes is typically implemented using (a possibly >>>>> specialized version of) memset. But it's probably not worth it in >>>>> this case. >>>>> >>>>>> G1ParFinalCountTask::work() >>>>>> Just for the record. I don't really like the way we have to set >>>>>> up both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even >>>>>> though we will only use them if we have VerifyDuringGC enabled. I >>>>>> realize it is due to the scoping, but I still think it obstructs >>>>>> the code flow and introduces unnecessary work. Unfortunately I >>>>>> don't have a good suggestion for how to work around it. >>>>>> >>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are >>>>>> StackObjs I assume it is not possible to get around the issue >>>>>> with a ResourceMark. >>>>> >>>>> Now that the verification is performed in a separate iteration of >>>>> the heap regions there's no need to create the >>>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here. >>>>> Done. I have also removed the now-redundant Mux2HRClosure. >>>>> >>>>> Hopefully the new webrev addresses these comments. >>>>> >>>>> Thanks again for looking.
>>>>> >>>>> JohnC >>>>> >>>> >>> >> > From john.cuthbertson at oracle.com Wed Jan 18 10:08:40 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 18 Jan 2012 10:08:40 -0800 Subject: RFR(XS): 7129514: time warp warnings after 7117303 Message-ID: <4F170AA8.5070403@oracle.com> Hi Everyone, While making the changes for 7117303, I missed a few calls to os::javaTimeMillis() (specifically when updating the time since the last GC). As a consequence we can still see the occasional time-warp warning. The issue is that os::javaTimeMillis() returns values that are not guaranteed to be monotonically non-decreasing and so they can go backwards. I've replaced these calls with an equivalent that uses os::javaTimeNanos(), which will return values that are monotonically non-decreasing if the underlying system time source supports such a mode. Many thanks to David Holmes for diagnosing the issue. Thanks, JohnC From bengt.rutisson at oracle.com Wed Jan 18 11:19:36 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 18 Jan 2012 20:19:36 +0100 Subject: RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking In-Reply-To: <4F170718.3090601@oracle.com> References: <4E8A40BE.9020800@oracle.com> <4EC2B317.3000006@oracle.com> <4ED38788.4010106@oracle.com> <4EF0DEF9.30306@oracle.com> <4EF1AF4E.80107@oracle.com> <4EF2127E.5050809@oracle.com> <4F0E9931.9070303@oracle.com> <4F1699F8.7040405@oracle.com> <4F170718.3090601@oracle.com> Message-ID: <4F171B48.6060404@oracle.com> John, On 2012-01-18 18:53, John Cuthbertson wrote: > Hi Bengt, > > Not a problem. I'll take out the locking and combine the print > statements. If you don't mind I'll also change the prints in the > verification code to use HR_FORMAT and HR_FORMAT_PARAMS to make them a > bit more consistent with others. Sounds great. Ship it! Thanks, Bengt > > Thanks again. > > JohnC > > On 1/18/2012 2:07 AM, Bengt Rutisson wrote: >> >> Hi John, >> >> I had a quick look through the changes.
Looks good to me. >> >> I looked a little closer at the log output and locking in >> VerifyLiveObjectDataHRClosure::doHeapRegion(). I know I suggested to >> take the ParGCRareEvent_lock to avoid interleaved output from >> different threads, but I am having second thoughts about this. >> >> All your output is prefixed with "Region %d" and logged with >> print_cr(), so even if several threads do logging at the same time it >> should be easy enough to parse the log files. There is one exception >> to this and that is this block: >> >> 1612 if (failures > 0 && _verbose) { >> 1613 gclog_or_tty->print("Region %d: bottom: "PTR_FORMAT", >> ntams: " >> 1614 PTR_FORMAT", top: "PTR_FORMAT", end: >> "PTR_FORMAT, >> 1615 hr->hrs_index(), hr->bottom(), >> hr->next_top_at_mark_start(), >> 1616 hr->top(), hr->end()); >> 1617 gclog_or_tty->print_cr(", marked_bytes: calc/actual >> "SIZE_FORMAT"/"SIZE_FORMAT, >> 1618 _calc_cl.region_marked_bytes(), >> 1619 hr->next_marked_bytes()); >> 1620 } >> >> I guess this is what triggered my suggestion to use the lock in the >> last review. But now that I look at it I think it would be better to >> just merge the print() and print_cr() statements into one single >> print_cr() statement. If you do that I don't see the need for taking >> the lock. >> >> Sorry for going back and forth on this topic. >> >> The rest of the changes look good to me! >> Bengt >> >> On 2012-01-12 09:26, John Cuthbertson wrote: >>> Hi Everyone, >>> >>> The latest incarnation of these changes can be found at: >>> http://cr.openjdk.java.net/~johnc/6484965/webrev.3/ >>> >>> The changes in this version include: >>> * Conditionally using a lock so that the output of the verification >>> closure executed by different threads does not interfere with each >>> other (suggested by Bengt). >>> * Merging up to the latest hotspot-gc tip (including Tony's marking >>> changes). 
This involved changing the evacuation failure code and >>> adding a suitable mark/count routine for use in >>> ConcurrentMark::grayRoot(). I also removed the counting changes from >>> code that has been made obsolete as a result of Tony's marking changes. >>> >>> Testing: a few runs of the GC test suite with low marking thresholds >>> (2 and 10%) with and without verification, and jprt. >>> >>> Thanks, >>> >>> JohnC >>> >>> On 12/21/2011 9:08 AM, John Cuthbertson wrote: >>>> Hi Bengt, >>>> >>>> That's a good observation. I guess it is possible but I haven't >>>> seen it in practice (though I was typically only using 4 threads >>>> when debugging a verification failure). It won't do any harm so >>>> I'll add the locking. >>>> >>>> Thanks, >>>> >>>> JohnC >>>> >>>> >>>> >>>> On 12/21/2011 2:05 AM, Bengt Rutisson wrote: >>>>> >>>>> Hi John, >>>>> >>>>> Thanks for updating your fix! Looks good. >>>>> >>>>> One question: >>>>> In concurrentMark.cpp it seems to me that the >>>>> VerifyLiveObjectDataHRClosure could get the same kind of messed up >>>>> output that Tony just fixed with 7123165 for the VerifyLiveClosure >>>>> in heapRegion.cpp. There are several workers simultaneously doing >>>>> the verification, right? Is it worth adding the same kind of >>>>> locking that Tony added? >>>>> >>>>> Bengt >>>>> >>>>> On 2011-12-20 20:16, John Cuthbertson wrote: >>>>>> Hi Bengt, >>>>>> >>>>>> As I mentioned earlier - thanks for the code review. I've applied >>>>>> your suggestions, merged with the latest changeset in >>>>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int >>>>>> <-> size_t issue you also mentioned, and retested using the GC >>>>>> test suite. A new webrev can be found at: >>>>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/ >>>>>> >>>>>> Specific replies are inline. >>>>>> >>>>>> On 11/28/11 05:07, Bengt Rutisson wrote: >>>>>>> >>>>>>> John, >>>>>>> >>>>>>> A little late, but here are some comments on this webrev.
I know >>>>>>> you have some more improvements to this change coming, but >>>>>>> overall I think it looks good. Most of my comments are just >>>>>>> minor coding style comments. >>>>>>> >>>>>>> Bengt >>>>>>> >>>>>>> concurrentMark.hpp >>>>>>> >>>>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() >>>>>>> or ConcurrentMark::unmark()? The comment you added is >>>>>>> definitely needed to understand what this method does. But it >>>>>>> would be even better if it was possible to get that from the >>>>>>> method name itself. >>>>>> >>>>>> Done. >>>>>> >>>>>>> It seems like everywhere we use count_marked_bytes_for(int >>>>>>> worker_i) we almost directly use the array returned to index >>>>>>> with the heap region that we are interested in. How about >>>>>>> wrapping all of this in something like >>>>>>> count_set_marked_bytes_for(int worker_i, int hrs_index) and >>>>>>> count_get_marked_bytes_for(int worker_i, int hrs_index) ? That >>>>>>> way the data structure does not have to be exposed outside >>>>>>> ConcurrentMark. It would mean that >>>>>>> ConcurrentMark::count_region() would have to take a worker_i >>>>>>> value instead of a marked_bytes_array. >>>>>> >>>>>> I did not do this. I embed the marked_bytes array for a worker >>>>>> into the CMTask for that worker to save a de-reference. This was >>>>>> one of the requests from the original code walk-through. Avoiding >>>>>> the de-reference in the CMTask::do_marking_step() shaves a couple >>>>>> of points off the marking time. I think your suggestion would >>>>>> reinstate the de-reference again and we would lose those few >>>>>> percentage points again.
>>>>>> >>>>>>> If you don't agree with the suggestion above I would suggest to >>>>>>> change the name from count_marked_bytes_for() to >>>>>>> count_marked_bytes_array_for() since in every place that it is >>>>>>> being called the resulting value is stored in a local variable >>>>>>> called marked_bytes_array, which seems like a more informative >>>>>>> name to me. >>>>>> >>>>>> Done. I agree - the new name sounds better. >>>>>> >>>>>>> I think this comment: >>>>>>> >>>>>>> // As above - but we don't know the heap region containing the >>>>>>> // object and so have to supply it. >>>>>>> inline bool par_mark_and_count(oop obj, int worker_i); >>>>>>> >>>>>>> should be something like "we don't know the heap region >>>>>>> containing the object so we will have to look it up". >>>>>>> >>>>>>> Same thing here: >>>>>>> >>>>>>> // As above - but we don't have the heap region containing the >>>>>>> // object, so we have to supply it. >>>>>>> // Should *not* be called from parallel code. >>>>>>> inline bool mark_and_count(oop obj); >>>>>>> >>>>>>> >>>>>> >>>>>> Comments were changed to: >>>>>> >>>>>> >>>>>>> concurrentMark.cpp >>>>>>> >>>>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() >>>>>>> anyway, could you please remove this unused code (1393-1397): >>>>>>> >>>>>>> /* >>>>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.", >>>>>>> obj_card_num - _bottom_card_num, >>>>>>> obj_last_card_num - >>>>>>> _bottom_card_num); >>>>>>> */ >>>>>>> >>>>>>> >>>>>> >>>>>> Done. >>>>>> >>>>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I >>>>>>> remember Tony mentioning that it won't be called. Do you still >>>>>>> want to keep the code? >>>>>> >>>>>> I removed the entire destructor - I don't see it being called in >>>>>> the experiments I've run. 
>>>>>> >>>>>>> FinalCountDataUpdateClosure::set_bit_for_region() >>>>>>> Probably not worth it, but would it make sense to add >>>>>>> information in a startsHumongous HeapRegion to be able to give >>>>>>> you the last continuesHumongous region? Since we know this when >>>>>>> we set the regions up it seems like a waste to have to iterate >>>>>>> over the region list to find it. >>>>>> >>>>>> If you read the original comment - the original author did not >>>>>> want to make any assumptions about the internal field values of >>>>>> the HeapRegions spanned by a humongous object and so used the >>>>>> loop technique. I think you are correct and I now use the >>>>>> information in the startsHumongous region to find the index of >>>>>> the last continuesHumongous region spanned by the H-obj. >>>>>> >>>>>>> G1ParFinalCountTask >>>>>>> To me it is a bit surprising that we mix in the verify code >>>>>>> inside this closure. Would it be possible to extract this code >>>>>>> out somehow? >>>>>> >>>>>> I did it this way to avoid another iteration over the heap >>>>>> regions. But it probably does make more sense to separate them >>>>>> and use another iteration to do the verify. Done. >>>>>> >>>>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on >>>>>>> doing? >>>>>> >>>>>> I removed the comment. I was thinking of doing this as >>>>>> fill_to_bytes is typically implemented using (a possibly >>>>>> specialized version of) memset. But it's probably not worth it in >>>>>> this case. >>>>>> >>>>>>> G1ParFinalCountTask::work() >>>>>>> Just for the record. I don't really like the way we have to set >>>>>>> up both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even >>>>>>> though we will only use them if we have VerifyDuringGC enabled. >>>>>>> I realize it is due to the scoping, but I still think it >>>>>>> obstructs the code flow and introduces unnecessary work. >>>>>>> Unfortunately I don't have a good suggestion for how to work >>>>>>> around it.
>>>>>>> >>>>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are >>>>>>> StackObjs I assume it is not possible to get around the issue >>>>>>> with a ResourceMark. >>>>>> >>>>>> Now that the verification is performed in a separate iteration of >>>>>> the heap regions there's no need to create the >>>>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here. >>>>>> Done. I have also removed the now-redundant Mux2HRClosure. >>>>>> >>>>>> Hopefully the new webrev addresses these comments. >>>>>> >>>>>> Thanks again for looking. >>>>>> >>>>>> JohnC >>>>>> >>>>> >>>> >>> >> > From john.cuthbertson at oracle.com Wed Jan 18 11:38:44 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 18 Jan 2012 11:38:44 -0800 Subject: RFR(XS): 7129514: time warp warnings after 7117303 In-Reply-To: <4F170AA8.5070403@oracle.com> References: <4F170AA8.5070403@oracle.com> Message-ID: <4F171FC4.3020708@oracle.com> Hi Everyone, I forgot to include the webrev link: http://cr.openjdk.java.net/~johnc/7129514/webrev.0/ Thanks to Ramki for pointing this out. JohnC On 1/18/2012 10:08 AM, John Cuthbertson wrote: > Hi Everyone, > > While making the changes for 7117303, I missed a few calls to > os::javaTimeMillis() (specifically when updating the time since the > last GC). As a consequence we can still see the occasional time-warp > warning. The issue is that os::javaTimeMillis() returns values that > are not guaranteed to be monotonically non-decreasing and so they can > go backwards. I've replaced these calls with an equivalent that uses > os::javaTimeNanos(), which will return values that are monotonically > non-decreasing if the underlying system time source supports such a mode. > > Many thanks to David Holmes for diagnosing the issue.
> > Thanks, > > JohnC From tony.printezis at oracle.com Wed Jan 18 12:51:34 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 18 Jan 2012 15:51:34 -0500 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause In-Reply-To: <4F16D5F7.4070605@oracle.com> References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> Message-ID: <4F1730D6.1070004@oracle.com> Hi Bengt, Thanks for looking at this so quickly! Inline. Bengt Rutisson wrote: > > Tony, > > Overall this looks really good. Thanks for fixing it. > > Some comments: > > First, a general question regarding naming and logging. We now talk > about "snapshot" a lot. It is a pretty good name, but maybe it needs > some more context to be understandable in the code and the GC log. I > don't have any really good names, but maybe "survivor_snapshot" I'd rather not mention "survivors" given that we might add non-survivor regions in the future. > or "initial_mark_snapshot"? I like "initial-mark snapshot" better. Having said that CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and _im_snapshot_regions if that's OK. > concurrentMark.inline.hpp > > if (hr == NULL) { > hr = _g1h->heap_region_containing_raw(addr); > // Given that we're looking for a region that contains an object > // header it's impossible to get back a HC region. > assert(!hr->continuesHumongous(), "sanity"); > } else { > assert(hr->is_in(addr), "pre-condition"); > } > > The first assert should probably hold even for regions that are passed > in to grayRoot() right? So, maybe something like: > > if (hr == NULL) { > hr = _g1h->heap_region_containing_raw(addr); > } else { > assert(hr->is_in(addr), "pre-condition"); > } > // Given that we need a region that contains an object > // header it's impossible for it to be a HC region. > assert(!hr->continuesHumongous(), "sanity"); Good observation! I changed to the above. 
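
For readers following the grayRoot() exchange above, the point of moving the assert can be shown with a small standalone sketch. This is not the HotSpot code: `Region`, `region_containing()` and `resolve_region()` are hypothetical stand-ins for HeapRegion, heap_region_containing_raw() and the lookup at the top of grayRoot(), and the region table is a plain array assumed only for illustration. With the check hoisted out of the `if`, the "not a continues-humongous region" condition is verified whether the region was looked up or passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HeapRegion: only what this sketch needs.
struct Region {
  std::uintptr_t bottom;      // inclusive start address
  std::uintptr_t end;         // exclusive end address
  bool continues_humongous;   // true for the tail regions of a humongous object

  bool is_in(std::uintptr_t addr) const { return addr >= bottom && addr < end; }
};

// Hypothetical stand-in for heap_region_containing_raw(): a linear search
// over a small region table. Returns nullptr if addr is outside every region.
inline const Region* region_containing(const Region* regions, std::size_t count,
                                       std::uintptr_t addr) {
  for (std::size_t i = 0; i < count; ++i) {
    if (regions[i].is_in(addr)) {
      return &regions[i];
    }
  }
  return nullptr;
}

// Resolve the region for an object-header address. The caller may pass the
// region in (hr != nullptr) or leave it to be looked up (hr == nullptr).
inline const Region* resolve_region(const Region* regions, std::size_t count,
                                    std::uintptr_t addr, const Region* hr) {
  if (hr == nullptr) {
    hr = region_containing(regions, count, addr);
    assert(hr != nullptr && "address must lie inside the heap");
  } else {
    assert(hr->is_in(addr) && "pre-condition");
  }
  // Hoisted check: an object header can never sit in a continues-humongous
  // region, regardless of how hr was obtained.
  assert(!hr->continues_humongous && "sanity");
  return hr;
}
```

Calling `resolve_region()` with a null `hr` exercises the lookup path, and calling it with a region supplied exercises the pre-condition path; both now reach the same sanity check.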
> concurrentMarkThread.cpp > > ConcurrentMarkThread::run() > > Why do we do the explicit time/date stamping? > > gclog_or_tty->date_stamp(PrintGCDateStamps); > gclog_or_tty->stamp(PrintGCTimeStamps); > gclog_or_tty->print_cr("[GC concurrent-snapshot-scan-start]"); > > why is it not enough with the normal -XX:+PrintGCTimeStamps information? Not quite sure what you mean with "is it not enough with the normal ... information". Each log record needs either a GC time stamp or a GC date stamp and we have to print either or both depending on the two -XX parameters. Unfortunately, the logging code has not been well abstracted and/or refactored so we have this unfortunate replication throughout the GCs. > This is probably correct since I see this pattern in other places. But > I would like to understand why we do it. > > > g1CollectedHeap.cpp: > > G1CollectedHeap::do_collection() > > Is it worth logging how long we had to wait in > _cm->snapshot_regions()->wait_until_scan_finished(), the same way that > we do in G1CollectedHeap::do_collection_pause_at_safepoint()? Currently, the GC log records for the evacuation pauses have a lot of extra information when +PrintGCDetails is set and it was reasonable to add an extra record with the wait time. And it's more important to know how the wait for snapshot region scanning affects evacuation pauses, which are more critical. The Full GC log records are currently one line and I don't think we want to extend them further (at least, not before we put a decent GC logging framework in place). On the other hand, the snapshot region scanning aborts when a marking cycle is aborted due to a Full GC. So, this wait time should not be long. How about I add a comment in the code saying that, when we introduce a more extensible logging framework, we could add the wait time to the Full GC log records? 
Something like: // Note: When we have a more flexible GC logging framework that // allows us to add optional attributes to a GC log record we // could consider timing and reporting how long we wait in the // following two methods. wait_while_free_regions_coming(); // ... _cm->snapshot_regions()->wait_until_scan_finished(); > Finally, just some food for thought. Could this be generalized to more > roots? I mean take a snapshot and scan it concurrently. By scanning the IM snapshot regions we say "these guys are all roots, instead of scanning them during the GC we will scan them concurrently". And we can do that for any object / region a) as long as we know they will not move while they are being scanned and b) because we have the pre-barrier. If any references on the snapshot objects are updated, the pre-barrier ensures that their values at the start of marking will be enqueued and processed. For external arbitrary roots we don't have a write barrier (and we shouldn't as it'd be too expensive). So, we cannot do that for non-object roots without a "pre-barrier"-type mechanism. Tony > Bengt > > > > On 2012-01-18 00:48, Tony Printezis wrote: >> Hi all, >> >> Can I have a couple of code reviews for this change that re-enables >> the use of survivor regions during the initial-mark pause? >> >> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >> >> From the CR: >> >> We could scan the survivors as we're copying them, however this will >> require more work during the initial-mark GCs (and in particular: >> special-case code in the fast path). 
>> >> A better approach is to let the concurrent marking threads scan the >> survivors and mark everything reachable from them a) before any more >> concurrent marking work is done (so that we can just mark the >> objects, without needing to push them on a stack, and let the >> "finger" algorithm discover them) and b) before the next GC starts >> (since, if we copy them, we won't know which of the new survivors are >> the ones we need to scan). >> >> This approach has the advantage that it does not require any extra >> work during the initial-mark GCs and all the work is done by the >> concurrent marking threads. However, it has the disadvantage that the >> survivor scanning might hold up the next GC. In most cases this >> should not be an issue as GCs take place at a reasonably low rate. If >> it does become a problem we could consider the following: >> >> - like when the GC locker is active, try to extend the eden to give a >> bit more time to the marking threads to finish scanning the survivors >> - instead of waiting for the marking threads, a GC can take over and >> finish up scanning the remaining survivors (typically, we have more >> GC threads than marking threads, so the overhead will be reduced) >> - if we supported region pinning, we could pin all the regions that >> were not scanned by the time the GC started so that the marking >> threads can resume scanning them after the GC completes >> >> Implementation notes: >> >> I introduced the concept of a "snapshot regions" in the >> ConcurrentMark which is a set of regions that need to be scanned at >> the start of a concurrent cycle. Currently, these can only be >> survivors but maybe we can use the same concept for something else in >> the future. 
>>
>> Tony
>>
>>
>

From tony.printezis at oracle.com Wed Jan 18 13:31:51 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Wed, 18 Jan 2012 16:31:51 -0500
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F1730D6.1070004@oracle.com>
References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com>
Message-ID: <4F173A47.9040705@oracle.com>

Bengt,

Here's a webrev with the renaming:

http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/

I have to say I'm not sure I really like the term "initial-mark / IM snapshot regions". I'll try to come up with an alternative name for them....

Tony

Tony Printezis wrote:
> Hi Bengt,
>
> Thanks for looking at this so quickly! Inline.
>
> Bengt Rutisson wrote:
>>
>> Tony,
>>
>> Overall this looks really good. Thanks for fixing it.
>>
>> Some comments:
>>
>> First, a general question regarding naming and logging. We now talk
>> about "snapshot" a lot. It is a pretty good name, but maybe it needs
>> some more context to be understandable in the code and the GC log. I
>> don't have any really good names, but maybe "survivor_snapshot"
>
> I'd rather not mention "survivors" given that we might add
> non-survivor regions in the future.
>
>> or "initial_mark_snapshot"?
>
> I like "initial-mark snapshot" better. Having said that
> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are
> kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and
> _im_snapshot_regions if that's OK.
>
>> concurrentMark.inline.hpp
>>
>> if (hr == NULL) {
>>   hr = _g1h->heap_region_containing_raw(addr);
>>   // Given that we're looking for a region that contains an object
>>   // header it's impossible to get back a HC region.
>>   assert(!hr->continuesHumongous(), "sanity");
>> } else {
>>   assert(hr->is_in(addr), "pre-condition");
>> }
>>
>> The first assert should probably hold even for regions that are
>> passed in to grayRoot() right? So, maybe something like:
>>
>> if (hr == NULL) {
>>   hr = _g1h->heap_region_containing_raw(addr);
>> } else {
>>   assert(hr->is_in(addr), "pre-condition");
>> }
>> // Given that we need a region that contains an object
>> // header it's impossible for it to be a HC region.
>> assert(!hr->continuesHumongous(), "sanity");
>
> Good observation! I changed to the above.
>
>> concurrentMarkThread.cpp
>>
>> ConcurrentMarkThread::run()
>>
>> Why do we do the explicit time/date stamping?
>>
>>     gclog_or_tty->date_stamp(PrintGCDateStamps);
>>     gclog_or_tty->stamp(PrintGCTimeStamps);
>>     gclog_or_tty->print_cr("[GC concurrent-snapshot-scan-start]");
>>
>> why is it not enough with the normal -XX:+PrintGCTimeStamps information?
>
> Not quite sure what you mean with "is it not enough with the normal
> ... information". Each log record needs either a GC time stamp or a GC
> date stamp and we have to print either or both depending on the two
> -XX parameters. Unfortunately, the logging code has not been well
> abstracted and/or refactored so we have this unfortunate replication
> throughout the GCs.
>
>> This is probably correct since I see this pattern in other places.
>> But I would like to understand why we do it.
>>
>>
>> g1CollectedHeap.cpp:
>>
>> G1CollectedHeap::do_collection()
>>
>> Is it worth logging how long we had to wait in
>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way
>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()?
>
> Currently, the GC log records for the evacuation pauses have a lot of
> extra information when +PrintGCDetails is set and it was reasonable to
> add an extra record with the wait time. And it's more important to
> know how the wait for snapshot region scanning affects evacuation
> pauses, which are more critical. The Full GC log records are currently
> one line and I don't think we want to extend them further (at least,
> not before we put a decent GC logging framework in place). On the
> other hand, the snapshot region scanning aborts when a marking cycle
> is aborted due to a Full GC. So, this wait time should not be long.
> How about I add a comment in the code saying that, when we introduce a
> more extensible logging framework, we could add the wait time to the
> Full GC log records? Something like:
>
>   // Note: When we have a more flexible GC logging framework that
>   // allows us to add optional attributes to a GC log record we
>   // could consider timing and reporting how long we wait in the
>   // following two methods.
>   wait_while_free_regions_coming();
>   // ...
>   _cm->snapshot_regions()->wait_until_scan_finished();
>
>
>> Finally, just some food for thought. Could this be generalized to
>> more roots? I mean take a snapshot and scan it concurrently.
>
> By scanning the IM snapshot regions we say "these guys are all roots,
> instead of scanning them during the GC we will scan them
> concurrently". And we can do that for any object / region a) as long
> as we know they will not move while they are being scanned and b)
> because we have the pre-barrier. If any references on the snapshot
> objects are updated, the pre-barrier ensures that their values at the
> start of marking will be enqueued and processed.
>
> For external arbitrary roots we don't have a write barrier (and we
> shouldn't as it'd be too expensive). So, we cannot do that for
> non-object roots without a "pre-barrier"-type mechanism.
>
> Tony
>
>
>> Bengt
>>
>>
>>
>> On 2012-01-18 00:48, Tony Printezis wrote:
>>> Hi all,
>>>
>>> Can I have a couple of code reviews for this change that re-enables
>>> the use of survivor regions during the initial-mark pause?
>>>
>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/
>>>
>>> From the CR:
>>>
>>> We could scan the survivors as we're copying them, however this will
>>> require more work during the initial-mark GCs (and in particular:
>>> special-case code in the fast path).
>>> >>> A better approach is to let the concurrent marking threads scan the >>> survivors and mark everything reachable from them a) before any more >>> concurrent marking work is done (so that we can just mark the >>> objects, without needing to push them on a stack, and let the >>> "finger" algorithm discover them) and b) before the next GC starts >>> (since, if we copy them, we won't know which of the new survivors >>> are the ones we need to scan). >>> >>> This approach has the advantage that it does not require any extra >>> work during the initial-mark GCs and all the work is done by the >>> concurrent marking threads. However, it has the disadvantage that >>> the survivor scanning might hold up the next GC. In most cases this >>> should not be an issue as GCs take place at a reasonably low rate. >>> If it does become a problem we could consider the following: >>> >>> - like when the GC locker is active, try to extend the eden to give >>> a bit more time to the marking threads to finish scanning the survivors >>> - instead of waiting for the marking threads, a GC can take over and >>> finish up scanning the remaining survivors (typically, we have more >>> GC threads than marking threads, so the overhead will be reduced) >>> - if we supported region pinning, we could pin all the regions that >>> were not scanned by the time the GC started so that the marking >>> threads can resume scanning them after the GC completes >>> >>> Implementation notes: >>> >>> I introduced the concept of a "snapshot regions" in the >>> ConcurrentMark which is a set of regions that need to be scanned at >>> the start of a concurrent cycle. Currently, these can only be >>> survivors but maybe we can use the same concept for something else >>> in the future. 
>>>
>>> Tony
>>>
>>>
>>
>

From bengt.rutisson at oracle.com Thu Jan 19 00:39:12 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 19 Jan 2012 09:39:12 +0100
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F1730D6.1070004@oracle.com>
References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com>
Message-ID: <4F17D6B0.8040901@oracle.com>

Tony,

Inline.

On 2012-01-18 21:51, Tony Printezis wrote:
> Hi Bengt,
>
> Thanks for looking at this so quickly! Inline.
>
> Bengt Rutisson wrote:
>>
>> Tony,
>>
>> Overall this looks really good. Thanks for fixing it.
>>
>> Some comments:
>>
>> First, a general question regarding naming and logging. We now talk
>> about "snapshot" a lot. It is a pretty good name, but maybe it needs
>> some more context to be understandable in the code and the GC log. I
>> don't have any really good names, but maybe "survivor_snapshot"
>
> I'd rather not mention "survivors" given that we might add
> non-survivor regions in the future.

Agreed.

>
>> or "initial_mark_snapshot"?
>
> I like "initial-mark snapshot" better. Having said that
> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are
> kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and
> _im_snapshot_regions if that's OK.

I'll look at your new webrev and comment on that. Thanks for trying it out!

>> concurrentMark.inline.hpp
>>
>> if (hr == NULL) {
>>   hr = _g1h->heap_region_containing_raw(addr);
>>   // Given that we're looking for a region that contains an object
>>   // header it's impossible to get back a HC region.
>>   assert(!hr->continuesHumongous(), "sanity");
>> } else {
>>   assert(hr->is_in(addr), "pre-condition");
>> }
>>
>> The first assert should probably hold even for regions that are
>> passed in to grayRoot() right? So, maybe something like:
>>
>> if (hr == NULL) {
>>   hr = _g1h->heap_region_containing_raw(addr);
>> } else {
>>   assert(hr->is_in(addr), "pre-condition");
>> }
>> // Given that we need a region that contains an object
>> // header it's impossible for it to be a HC region.
>> assert(!hr->continuesHumongous(), "sanity");
>
> Good observation! I changed to the above.

Great.

>
>> concurrentMarkThread.cpp
>>
>> ConcurrentMarkThread::run()
>>
>> Why do we do the explicit time/date stamping?
>>
>>     gclog_or_tty->date_stamp(PrintGCDateStamps);
>>     gclog_or_tty->stamp(PrintGCTimeStamps);
>>     gclog_or_tty->print_cr("[GC concurrent-snapshot-scan-start]");
>>
>> why is it not enough with the normal -XX:+PrintGCTimeStamps information?
>
> Not quite sure what you mean with "is it not enough with the normal
> ... information". Each log record needs either a GC time stamp or a GC
> date stamp and we have to print either or both depending on the two
> -XX parameters. Unfortunately, the logging code has not been well
> abstracted and/or refactored so we have this unfortunate replication
> throughout the GCs.

Sorry, this was a misunderstanding from my side. I have never really thought about how the time stamping was implemented. I could not imagine that it was implemented as three different print statements, so I thought you were doing something extra here. I realize now that this is the "normal" time stamping.

Stating the obvious: We desperately need a new logging framework!

>> This is probably correct since I see this pattern in other places.
>> But I would like to understand why we do it.
>>
>>
>> g1CollectedHeap.cpp:
>>
>> G1CollectedHeap::do_collection()
>>
>> Is it worth logging how long we had to wait in
>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way
>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()?
>
> Currently, the GC log records for the evacuation pauses have a lot of
> extra information when +PrintGCDetails is set and it was reasonable to
> add an extra record with the wait time. And it's more important to
> know how the wait for snapshot region scanning affects evacuation
> pauses, which are more critical. The Full GC log records are currently
> one line and I don't think we want to extend them further (at least,
> not before we put a decent GC logging framework in place). On the
> other hand, the snapshot region scanning aborts when a marking cycle
> is aborted due to a Full GC. So, this wait time should not be long.
> How about I add a comment in the code saying that, when we introduce a
> more extensible logging framework, we could add the wait time to the
> Full GC log records? Something like:
>
>   // Note: When we have a more flexible GC logging framework that
>   // allows us to add optional attributes to a GC log record we
>   // could consider timing and reporting how long we wait in the
>   // following two methods.
>   wait_while_free_regions_coming();
>   // ...
>   _cm->snapshot_regions()->wait_until_scan_finished();

Ok. That's fine.

>> Finally, just some food for thought. Could this be generalized to
>> more roots? I mean take a snapshot and scan it concurrently.
>
> By scanning the IM snapshot regions we say "these guys are all roots,
> instead of scanning them during the GC we will scan them
> concurrently". And we can do that for any object / region a) as long
> as we know they will not move while they are being scanned and b)
> because we have the pre-barrier. If any references on the snapshot
> objects are updated, the pre-barrier ensures that their values at the
> start of marking will be enqueued and processed.
>
> For external arbitrary roots we don't have a write barrier (and we
> shouldn't as it'd be too expensive). So, we cannot do that for
> non-object roots without a "pre-barrier"-type mechanism.

But if we have roots that do not change between initial mark and the first young GC we could do it without the pre-barrier, right? I'm thinking of classes for example. But then again there is class redefinition, so maybe they can change...

Anyway, not something we should do now, just wanted to mention it.

Bengt

>
> Tony
>
>
>> Bengt
>>
>>
>>
>> On 2012-01-18 00:48, Tony Printezis wrote:
>>> Hi all,
>>>
>>> Can I have a couple of code reviews for this change that re-enables
>>> the use of survivor regions during the initial-mark pause?
>>>
>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/
>>>
>>> From the CR:
>>>
>>> We could scan the survivors as we're copying them, however this will
>>> require more work during the initial-mark GCs (and in particular:
>>> special-case code in the fast path).
>>>
>>> A better approach is to let the concurrent marking threads scan the
>>> survivors and mark everything reachable from them a) before any more
>>> concurrent marking work is done (so that we can just mark the
>>> objects, without needing to push them on a stack, and let the
>>> "finger" algorithm discover them) and b) before the next GC starts
>>> (since, if we copy them, we won't know which of the new survivors
>>> are the ones we need to scan).
>>>
>>> This approach has the advantage that it does not require any extra
>>> work during the initial-mark GCs and all the work is done by the
>>> concurrent marking threads. However, it has the disadvantage that
>>> the survivor scanning might hold up the next GC. In most cases this
>>> should not be an issue as GCs take place at a reasonably low rate.
>>> If it does become a problem we could consider the following:
>>>
>>> - like when the GC locker is active, try to extend the eden to give
>>> a bit more time to the marking threads to finish scanning the survivors
>>> - instead of waiting for the marking threads, a GC can take over and
>>> finish up scanning the remaining survivors (typically, we have more
>>> GC threads than marking threads, so the overhead will be reduced)
>>> - if we supported region pinning, we could pin all the regions that
>>> were not scanned by the time the GC started so that the marking
>>> threads can resume scanning them after the GC completes
>>>
>>> Implementation notes:
>>>
>>> I introduced the concept of a "snapshot regions" in the
>>> ConcurrentMark which is a set of regions that need to be scanned at
>>> the start of a concurrent cycle. Currently, these can only be
>>> survivors but maybe we can use the same concept for something else
>>> in the future.
>>>
>>> Tony
>>>
>>>
>>

From bengt.rutisson at oracle.com Thu Jan 19 01:03:24 2012
From: bengt.rutisson at oracle.com (Bengt Rutisson)
Date: Thu, 19 Jan 2012 10:03:24 +0100
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F173A47.9040705@oracle.com>
References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com>
Message-ID: <4F17DC5C.4050403@oracle.com>

Tony,

On 2012-01-18 22:31, Tony Printezis wrote:
> Bengt,
>
> Here's a webrev with the renaming:
>
> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/
>
> I have to say I'm not sure I really like the term "initial-mark / IM
> snapshot regions". I'll try to come up with an alternative name for
> them....

Looked quickly at the new webrev.

I agree that IM-snapshot might not be optimal. Still I like the fact that it is not just "snapshot" since I think that can easily be confused with the SATB terms. It is of course part of the SATB snapshot, but not the whole thing.

Just thinking aloud here, what about not using the word "snapshot" at all? How about "to_be_scanned_regions", "root_regions" or "concurrent_roots"?

Really not sure what a good name is here...I kind of like the log message "Concurrent root scanning took 0.000x ms".

Bengt

>
> Tony
>
> Tony Printezis wrote:
>> Hi Bengt,
>>
>> Thanks for looking at this so quickly! Inline.
>>
>> Bengt Rutisson wrote:
>>>
>>> Tony,
>>>
>>> Overall this looks really good. Thanks for fixing it.
>>>
>>> Some comments:
>>>
>>> First, a general question regarding naming and logging. We now talk
>>> about "snapshot" a lot. It is a pretty good name, but maybe it needs
>>> some more context to be understandable in the code and the GC log. I
>>> don't have any really good names, but maybe "survivor_snapshot"
>>
>> I'd rather not mention "survivors" given that we might add
>> non-survivor regions in the future.
>>
>>> or "initial_mark_snapshot"?
>>
>> I like "initial-mark snapshot" better. Having said that
>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are
>> kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and
>> _im_snapshot_regions if that's OK.
>>
>>> concurrentMark.inline.hpp
>>>
>>> if (hr == NULL) {
>>>   hr = _g1h->heap_region_containing_raw(addr);
>>>   // Given that we're looking for a region that contains an object
>>>   // header it's impossible to get back a HC region.
>>>   assert(!hr->continuesHumongous(), "sanity");
>>> } else {
>>>   assert(hr->is_in(addr), "pre-condition");
>>> }
>>>
>>> The first assert should probably hold even for regions that are
>>> passed in to grayRoot() right? So, maybe something like:
>>>
>>> if (hr == NULL) {
>>>   hr = _g1h->heap_region_containing_raw(addr);
>>> } else {
>>>   assert(hr->is_in(addr), "pre-condition");
>>> }
>>> // Given that we need a region that contains an object
>>> // header it's impossible for it to be a HC region.
>>> assert(!hr->continuesHumongous(), "sanity");
>>
>> Good observation! I changed to the above.
>> >>> concurrentMarkThread.cpp >>> >>> ConcurrentMarkThread::run() >>> >>> Why do we do the explicit time/date stamping? >>> >>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>> gclog_or_tty->stamp(PrintGCTimeStamps); >>> gclog_or_tty->print_cr("[GC >>> concurrent-snapshot-scan-start]"); >>> >>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>> information? >> >> Not quite sure what you mean with "is it not enough with the normal >> ... information". Each log record needs either a GC time stamp or a >> GC date stamp and we have to print either or both depending on the >> two -XX parameters. Unfortunately, the logging code has not been well >> abstracted and/or refactored so we have this unfortunate replication >> throughout the GCs. >> >>> This is probably correct since I see this pattern in other places. >>> But I would like to understand why we do it. >>> >>> >>> g1CollectedHeap.cpp: >>> >>> G1CollectedHeap::do_collection() >>> >>> Is it worth logging how long we had to wait in >>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way >>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()? >> >> Currently, the GC log records for the evacuation pauses have a lot of >> extra information when +PrintGCDetails is set and it was reasonable >> to add an extra record with the wait time. And it's more important to >> know how the wait for snapshot region scanning affects evacuation >> pauses, which are more critical. The Full GC log records are >> currently one line and I don't think we want to extend them further >> (at least, not before we put a decent GC logging framework in place). >> On the other hand, the snapshot region scanning aborts when a marking >> cycle is aborted due to a Full GC. So, this wait time should not be >> long. How about I add a comment in the code saying that, when we >> introduce a more extensible logging framework, we could add the wait >> time to the Full GC log records? 
Something like: >> >> // Note: When we have a more flexible GC logging framework that >> // allows us to add optional attributes to a GC log record we >> // could consider timing and reporting how long we wait in the >> // following two methods. >> wait_while_free_regions_coming(); >> // ... >> _cm->snapshot_regions()->wait_until_scan_finished(); >> >> >>> Finally, just some food for thought. Could this be generalized to >>> more roots? I mean take a snapshot and scan it concurrently. >> >> By scanning the IM snapshot regions we say "these guys are all roots, >> instead of scanning them during the GC we will scan them >> concurrently". And we can do that for any object / region a) as long >> as we know they will not move while they are being scanned and b) >> because we have the pre-barrier. If any references on the snapshot >> objects are updated, the pre-barrier ensures that their values at the >> start of marking will be enqueued and processed. >> >> For external arbitrary roots we don't have a write barrier (and we >> shouldn't as it'd be too expensive). So, we cannot do that for >> non-object roots without a "pre-barrier"-type mechanism. >> >> Tony >> >> >>> Bengt >>> >>> >>> >>> On 2012-01-18 00:48, Tony Printezis wrote: >>>> Hi all, >>>> >>>> Can I have a couple of code reviews for this change that re-enables >>>> the use of survivor regions during the initial-mark pause? >>>> >>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>> >>>> From the CR: >>>> >>>> We could scan the survivors as we're copying them, however this >>>> will require more work during the initial-mark GCs (and in >>>> particular: special-case code in the fast path). 
>>>> >>>> A better approach is to let the concurrent marking threads scan the >>>> survivors and mark everything reachable from them a) before any >>>> more concurrent marking work is done (so that we can just mark the >>>> objects, without needing to push them on a stack, and let the >>>> "finger" algorithm discover them) and b) before the next GC starts >>>> (since, if we copy them, we won't know which of the new survivors >>>> are the ones we need to scan). >>>> >>>> This approach has the advantage that it does not require any extra >>>> work during the initial-mark GCs and all the work is done by the >>>> concurrent marking threads. However, it has the disadvantage that >>>> the survivor scanning might hold up the next GC. In most cases this >>>> should not be an issue as GCs take place at a reasonably low rate. >>>> If it does become a problem we could consider the following: >>>> >>>> - like when the GC locker is active, try to extend the eden to give >>>> a bit more time to the marking threads to finish scanning the >>>> survivors >>>> - instead of waiting for the marking threads, a GC can take over >>>> and finish up scanning the remaining survivors (typically, we have >>>> more GC threads than marking threads, so the overhead will be reduced) >>>> - if we supported region pinning, we could pin all the regions that >>>> were not scanned by the time the GC started so that the marking >>>> threads can resume scanning them after the GC completes >>>> >>>> Implementation notes: >>>> >>>> I introduced the concept of a "snapshot regions" in the >>>> ConcurrentMark which is a set of regions that need to be scanned at >>>> the start of a concurrent cycle. Currently, these can only be >>>> survivors but maybe we can use the same concept for something else >>>> in the future. 
>>>>
>>>> Tony
>>>>
>>>>
>>>
>>

From john.cuthbertson at oracle.com Thu Jan 19 05:12:33 2012
From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com)
Date: Thu, 19 Jan 2012 13:12:33 +0000
Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets
Message-ID: <20120119131244.6D929479EB@hg.openjdk.java.net>

Changeset: 0b3d1ec6eaee
Author: tonyp
Date: 2012-01-18 10:30 -0500
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0b3d1ec6eaee

7097586: G1: improve the per-space output when using jmap -heap
Summary: Extend the jmap -heap output for G1 to include some more G1-specific information.
Reviewed-by: brutisso, johnc, poonam

! agent/src/share/classes/sun/jvm/hotspot/gc_implementation/g1/G1CollectedHeap.java
! agent/src/share/classes/sun/jvm/hotspot/gc_implementation/g1/G1MonitoringSupport.java
+ agent/src/share/classes/sun/jvm/hotspot/gc_implementation/g1/HeapRegionSetBase.java
! agent/src/share/classes/sun/jvm/hotspot/tools/HeapSummary.java
! src/share/vm/gc_implementation/g1/concurrentMark.cpp
! src/share/vm/gc_implementation/g1/heapRegionSet.hpp
! src/share/vm/gc_implementation/g1/vmStructs_g1.hpp

Changeset: 7ca7be5a6a0b
Author: johnc
Date: 2012-01-17 10:21 -0800
URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/7ca7be5a6a0b

7129271: G1: Interference from multiple threads in PrintGC/PrintGCDetails output
Summary: During an initial mark pause, signal the Concurrent Mark thread after the pause output from PrintGC/PrintGCDetails is complete.
Reviewed-by: tonyp, brutisso

! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp

From tony.printezis at oracle.com Thu Jan 19 05:53:34 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 19 Jan 2012 08:53:34 -0500
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F17D6B0.8040901@oracle.com>
References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F17D6B0.8040901@oracle.com>
Message-ID: <4F18205E.2020200@oracle.com>

Hi Bengt,

Inline.

On 1/19/2012 3:39 AM, Bengt Rutisson wrote:
>
>>> why is it not enough with the normal -XX:+PrintGCTimeStamps
>>> information?
>>
>> Not quite sure what you mean with "is it not enough with the normal
>> ... information". Each log record needs either a GC time stamp or a
>> GC date stamp and we have to print either or both depending on the
>> two -XX parameters. Unfortunately, the logging code has not been well
>> abstracted and/or refactored so we have this unfortunate replication
>> throughout the GCs.
>
>
> Sorry, this was a misunderstanding from my side. I have never really
> thought about how the time stamping was implemented. I could not
> imagine that it was implemented as three different print statements,
> so I thought you were doing something extra here. I realize now that
> this is the "normal" time stamping.
>
> Stating the obvious: We desperately need a new logging framework!

+1

>
>>> Finally, just some food for thought. Could this be generalized to
>>> more roots? I mean take a snapshot and scan it concurrently.
>>
>> By scanning the IM snapshot regions we say "these guys are all roots,
>> instead of scanning them during the GC we will scan them
>> concurrently". And we can do that for any object / region a) as long
>> as we know they will not move while they are being scanned and b)
>> because we have the pre-barrier. If any references on the snapshot
>> objects are updated, the pre-barrier ensures that their values at the
>> start of marking will be enqueued and processed.
>>
>> For external arbitrary roots we don't have a write barrier (and we
>> shouldn't as it'd be too expensive). So, we cannot do that for
>> non-object roots without a "pre-barrier"-type mechanism.
>
> But if we have roots that do not change between initial mark and the
> first young GC we could do it without the pre-barrier, right?

This is indeed correct. However, also note that those roots might also point into the collection set so we have to scan them during the initial-mark pause anyway. So, might as well mark what they point to. Having said that, we can somehow tag roots as "non-young" after we know they do not point into the young gen so that we don't have to scan them during subsequent young GCs. If we know that they also don't change, we could scan them concurrently.

> I'm thinking of classes for example. But then again there is class
> redefinition, so maybe they can change...

Again, this is a good point. We can talk with the folks who are working on the perm gen removal project to classify (if they have not already done it) which refs on metadata can be updated after a class is initialized and which cannot. There might be some opportunities for improvement here.

> Anyway, not something we should do now, just wanted to mention it.
>

Agreed. And I have to say the first phase I'd try to optimize would be the remark (do it mostly concurrently), not the initial-mark.
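
To illustrate the pre-barrier behaviour quoted earlier in this message, here is a minimal sketch on a toy object model. It only shows the idea that, while marking is active, a store first records the value it is about to overwrite so the marker can still treat it as live. The names Obj, SatbQueue, store_field and drain are hypothetical and are not taken from the HotSpot sources.

```cpp
#include <cassert>
#include <vector>

// Toy object with a single reference field (hypothetical, for illustration).
struct Obj {
  Obj* field = nullptr;
  bool marked = false;
};

// Hypothetical stand-in for a SATB buffer: it holds overwritten values.
struct SatbQueue {
  bool marking_active = false;
  std::vector<Obj*> enqueued;
};

// Store with a pre-barrier: while marking is active, remember the old
// value of the field before it is overwritten.
inline void store_field(SatbQueue& q, Obj* holder, Obj* new_value) {
  assert(holder != nullptr);
  if (q.marking_active) {
    Obj* old_value = holder->field;
    if (old_value != nullptr) {
      q.enqueued.push_back(old_value);
    }
  }
  holder->field = new_value;
}

// Marker side of the sketch: every enqueued value is treated as live.
// A real marker would also trace from these objects.
inline void drain(SatbQueue& q) {
  for (Obj* o : q.enqueued) {
    o->marked = true;
  }
  q.enqueued.clear();
}
```

In this sketch, overwriting a survivor's field while marking_active is set leaves the old target in the queue, so a later drain() still marks it even though the survivor no longer points to it.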

Tony

From tony.printezis at oracle.com Thu Jan 19 06:04:02 2012
From: tony.printezis at oracle.com (Tony Printezis)
Date: Thu, 19 Jan 2012 09:04:02 -0500
Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause
In-Reply-To: <4F17DC5C.4050403@oracle.com>
References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com> <4F17DC5C.4050403@oracle.com>
Message-ID: <4F1822D2.2060202@oracle.com>

Bengt,

Inline (again!)

On 1/19/2012 4:03 AM, Bengt Rutisson wrote:
>
> Tony,
>
> On 2012-01-18 22:31, Tony Printezis wrote:
>> Bengt,
>>
>> Here's a webrev with the renaming:
>>
>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/
>>
>> I have to say I'm not sure I really like the term "initial-mark / IM
>> snapshot regions". I'll try to come up with an alternative name for
>> them....
>
> Looked quickly at the new webrev.
>
> I agree that IM-snapshot might not be optimal. Still I like the fact
> that it is not just "snapshot" since I think that can easily be
> confused with the SATB terms.

Well, it's supposed to be. In SATB, anything that's reachable from the "snapshot" at initial-mark time will be retained. Since we want to avoid explicitly marking the survivors, we make them all implicitly live and part of the "snapshot" which is why we have to scan them (the same way we scan roots during the initial-mark pause). So, "snapshot regions" is not unreasonable given that if we were not using SATB we would not be able to do this.

> It is of course part of the SATB snapshot, but not the whole thing.
>
> Just thinking aloud here, what about not using the word "snapshot" at
> all? How about "to_be_scanned_regions", "root_regions" or
> "concurrent_roots"?

I like "root regions". Let's go with that (it's shorter too!).

Tony

> Really not sure what a good name is here...I kind of like the log
> message "Concurrent root scanning took 0.000x ms".
> > Bengt > >> >> Tony >> >> Tony Printezis wrote: >>> Hi Bengt, >>> >>> Thanks for looking at this so quickly! Inline. >>> >>> Bengt Rutisson wrote: >>>> >>>> Tony, >>>> >>>> Overall this looks really good. Thanks for fixing it. >>>> >>>> Some comments: >>>> >>>> First, a general question regarding naming and logging. We now talk >>>> about "snapshot" a lot. It is a pretty good name, but maybe it >>>> needs some more context to be understandable in the code and the GC >>>> log. I don't have any really good names, but maybe "survivor_snapshot" >>> >>> I'd rather not mention "survivors" given that we might add >>> non-survivor regions in the future. >>> >>>> or "initial_mark_snapshot"? >>> >>> I like "initial-mark snapshot" better. Having said that >>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are >>> kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and >>> _im_snapshot_regions if that's OK. >>> >>>> concurrentMark.inline.hpp >>>> >>>> if (hr == NULL) { >>>> hr = _g1h->heap_region_containing_raw(addr); >>>> // Given that we're looking for a region that contains an object >>>> // header it's impossible to get back a HC region. >>>> assert(!hr->continuesHumongous(), "sanity"); >>>> } else { >>>> assert(hr->is_in(addr), "pre-condition"); >>>> } >>>> >>>> The first assert should probably hold even for regions that are >>>> passed in to grayRoot() right? So, maybe something like: >>>> >>>> if (hr == NULL) { >>>> hr = _g1h->heap_region_containing_raw(addr); >>>> } else { >>>> assert(hr->is_in(addr), "pre-condition"); >>>> } >>>> // Given that we need a region that contains an object >>>> // header it's impossible for it to be a HC region. >>>> assert(!hr->continuesHumongous(), "sanity"); >>> >>> Good observation! I changed to the above. >>> >>>> concurrentMarkThread.cpp >>>> >>>> ConcurrentMarkThread::run() >>>> >>>> Why do we do the explicit time/date stamping? 
>>>> >>>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>>> gclog_or_tty->stamp(PrintGCTimeStamps); >>>> gclog_or_tty->print_cr("[GC >>>> concurrent-snapshot-scan-start]"); >>>> >>>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>>> information? >>> >>> Not quite sure what you mean with "is it not enough with the normal >>> ... information". Each log record needs either a GC time stamp or a >>> GC date stamp and we have to print either or both depending on the >>> two -XX parameters. Unfortunately, the logging code has not been >>> well abstracted and/or refactored so we have this unfortunate >>> replication throughout the GCs. >>> >>>> This is probably correct since I see this pattern in other places. >>>> But I would like to understand why we do it. >>>> >>>> >>>> g1CollectedHeap.cpp: >>>> >>>> G1CollectedHeap::do_collection() >>>> >>>> Is it worth logging how long we had to wait in >>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way >>>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()? >>> >>> Currently, the GC log records for the evacuation pauses have a lot >>> of extra information when +PrintGCDetails is set and it was >>> reasonable to add an extra record with the wait time. And it's more >>> important to know how the wait for snapshot region scanning affects >>> evacuation pauses, which are more critical. The Full GC log records >>> are currently one line and I don't think we want to extend them >>> further (at least, not before we put a decent GC logging framework >>> in place). On the other hand, the snapshot region scanning aborts >>> when a marking cycle is aborted due to a Full GC. So, this wait time >>> should not be long. How about I add a comment in the code saying >>> that, when we introduce a more extensible logging framework, we >>> could add the wait time to the Full GC log records? 
Something like: >>> >>> // Note: When we have a more flexible GC logging framework that >>> // allows us to add optional attributes to a GC log record we >>> // could consider timing and reporting how long we wait in the >>> // following two methods. >>> wait_while_free_regions_coming(); >>> // ... >>> _cm->snapshot_regions()->wait_until_scan_finished(); >>> >>> >>>> Finally, just some food for thought. Could this be generalized to >>>> more roots? I mean take a snapshot and scan it concurrently. >>> >>> By scanning the IM snapshot regions we say "these guys are all >>> roots, instead of scanning them during the GC we will scan them >>> concurrently". And we can do that for any object / region a) as long >>> as we know they will not move while they are being scanned and b) >>> because we have the pre-barrier. If any references on the snapshot >>> objects are updated, the pre-barrier ensures that their values at >>> the start of marking will be enqueued and processed. >>> >>> For external arbitrary roots we don't have a write barrier (and we >>> shouldn't as it'd be too expensive). So, we cannot do that for >>> non-object roots without a "pre-barrier"-type mechanism. >>> >>> Tony >>> >>> >>>> Bengt >>>> >>>> >>>> >>>> On 2012-01-18 00:48, Tony Printezis wrote: >>>>> Hi all, >>>>> >>>>> Can I have a couple of code reviews for this change that >>>>> re-enables the use of survivor regions during the initial-mark pause? >>>>> >>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>>> >>>>> From the CR: >>>>> >>>>> We could scan the survivors as we're copying them, however this >>>>> will require more work during the initial-mark GCs (and in >>>>> particular: special-case code in the fast path). 
>>>>> >>>>> A better approach is to let the concurrent marking threads scan >>>>> the survivors and mark everything reachable from them a) before >>>>> any more concurrent marking work is done (so that we can just mark >>>>> the objects, without needing to push them on a stack, and let the >>>>> "finger" algorithm discover them) and b) before the next GC starts >>>>> (since, if we copy them, we won't know which of the new survivors >>>>> are the ones we need to scan). >>>>> >>>>> This approach has the advantage that it does not require any extra >>>>> work during the initial-mark GCs and all the work is done by the >>>>> concurrent marking threads. However, it has the disadvantage that >>>>> the survivor scanning might hold up the next GC. In most cases >>>>> this should not be an issue as GCs take place at a reasonably low >>>>> rate. If it does become a problem we could consider the following: >>>>> >>>>> - like when the GC locker is active, try to extend the eden to >>>>> give a bit more time to the marking threads to finish scanning the >>>>> survivors >>>>> - instead of waiting for the marking threads, a GC can take over >>>>> and finish up scanning the remaining survivors (typically, we have >>>>> more GC threads than marking threads, so the overhead will be >>>>> reduced) >>>>> - if we supported region pinning, we could pin all the regions >>>>> that were not scanned by the time the GC started so that the >>>>> marking threads can resume scanning them after the GC completes >>>>> >>>>> Implementation notes: >>>>> >>>>> I introduced the concept of a "snapshot regions" in the >>>>> ConcurrentMark which is a set of regions that need to be scanned >>>>> at the start of a concurrent cycle. Currently, these can only be >>>>> survivors but maybe we can use the same concept for something else >>>>> in the future. 
>>>>> >>>>> Tony >>>>> >>>>> >>>> >>> > From tony.printezis at oracle.com Thu Jan 19 06:40:37 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Thu, 19 Jan 2012 09:40:37 -0500 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause In-Reply-To: <4F1822D2.2060202@oracle.com> References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com> <4F17DC5C.4050403@oracle.com> <4F1822D2.2060202@oracle.com> Message-ID: <4F182B65.3050701@oracle.com> Bengt (and all), Updated webrev using "root regions" now: http://cr.openjdk.java.net/~tonyp/7127706/webrev.2/ Tony On 01/19/2012 09:04 AM, Tony Printezis wrote: > Bengt, > > Inline (again!) > > On 1/19/2012 4:03 AM, Bengt Rutisson wrote: >> >> Tony, >> >> On 2012-01-18 22:31, Tony Printezis wrote: >>> Bengt, >>> >>> Here's a webrev with the renaming: >>> >>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/ >>> >>> I have to say I'm not sure I really like the term "initial-mark / IM >>> snapshot regions". I'll try to come up with an alternative name for >>> them.... >> >> Looked quickly at the new webrev. >> >> I agree that IM-snapshot might not be optimal. Still I like the fact >> that it is not just "snapshot" since I think that can easily be >> confused with the SATB terms. > > Well, it's supposed to be. In SATB, anything that's reachable from the > "snapshot" at initial-mark time will be retained. Since we want to > avoid explicitly marking the survivors, we make them all implicitly > live and part of the "snapshot" which is why we have to scan them (the > same way we scan roots during the initial-mark pause). So, "snapshot > regions" is not unreasonable given that if we were not using SATB we > would not be able to do this. > >> It is of course part of the SATB snapshot, but not the whole thing. >> >> Just thinking aloud here, what about not using the word "snapshot" at >> all? 
How about "to_be_scanned_regions", "root_regions" or >> "concurrent_roots"? > > I like "root regions". Let's go with that (it's shorter too!). > > Tony > >> Really not sure what a good name is here...I kind of like the log >> message "Concurrent root scanning took 0.000x ms". >> >> Bengt >> >>> >>> Tony >>> >>> Tony Printezis wrote: >>>> Hi Bengt, >>>> >>>> Thanks for looking at this so quickly! Inline. >>>> >>>> Bengt Rutisson wrote: >>>>> >>>>> Tony, >>>>> >>>>> Overall this looks really good. Thanks for fixing it. >>>>> >>>>> Some comments: >>>>> >>>>> First, a general question regarding naming and logging. We now >>>>> talk about "snapshot" a lot. It is a pretty good name, but maybe >>>>> it needs some more context to be understandable in the code and >>>>> the GC log. I don't have any really good names, but maybe >>>>> "survivor_snapshot" >>>> >>>> I'd rather not mention "survivors" given that we might add >>>> non-survivor regions in the future. >>>> >>>>> or "initial_mark_snapshot"? >>>> >>>> I like "initial-mark snapshot" better. Having said that >>>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions are >>>> kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and >>>> _im_snapshot_regions if that's OK. >>>> >>>>> concurrentMark.inline.hpp >>>>> >>>>> if (hr == NULL) { >>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>> // Given that we're looking for a region that contains an object >>>>> // header it's impossible to get back a HC region. >>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>> } else { >>>>> assert(hr->is_in(addr), "pre-condition"); >>>>> } >>>>> >>>>> The first assert should probably hold even for regions that are >>>>> passed in to grayRoot() right? 
So, maybe something like: >>>>> >>>>> if (hr == NULL) { >>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>> } else { >>>>> assert(hr->is_in(addr), "pre-condition"); >>>>> } >>>>> // Given that we need a region that contains an object >>>>> // header it's impossible for it to be a HC region. >>>>> assert(!hr->continuesHumongous(), "sanity"); >>>> >>>> Good observation! I changed to the above. >>>> >>>>> concurrentMarkThread.cpp >>>>> >>>>> ConcurrentMarkThread::run() >>>>> >>>>> Why do we do the explicit time/date stamping? >>>>> >>>>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>>>> gclog_or_tty->stamp(PrintGCTimeStamps); >>>>> gclog_or_tty->print_cr("[GC >>>>> concurrent-snapshot-scan-start]"); >>>>> >>>>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>>>> information? >>>> >>>> Not quite sure what you mean with "is it not enough with the normal >>>> ... information". Each log record needs either a GC time stamp or a >>>> GC date stamp and we have to print either or both depending on the >>>> two -XX parameters. Unfortunately, the logging code has not been >>>> well abstracted and/or refactored so we have this unfortunate >>>> replication throughout the GCs. >>>> >>>>> This is probably correct since I see this pattern in other places. >>>>> But I would like to understand why we do it. >>>>> >>>>> >>>>> g1CollectedHeap.cpp: >>>>> >>>>> G1CollectedHeap::do_collection() >>>>> >>>>> Is it worth logging how long we had to wait in >>>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way >>>>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()? >>>> >>>> Currently, the GC log records for the evacuation pauses have a lot >>>> of extra information when +PrintGCDetails is set and it was >>>> reasonable to add an extra record with the wait time. And it's more >>>> important to know how the wait for snapshot region scanning affects >>>> evacuation pauses, which are more critical. 
The Full GC log records >>>> are currently one line and I don't think we want to extend them >>>> further (at least, not before we put a decent GC logging framework >>>> in place). On the other hand, the snapshot region scanning aborts >>>> when a marking cycle is aborted due to a Full GC. So, this wait >>>> time should not be long. How about I add a comment in the code >>>> saying that, when we introduce a more extensible logging framework, >>>> we could add the wait time to the Full GC log records? Something like: >>>> >>>> // Note: When we have a more flexible GC logging framework that >>>> // allows us to add optional attributes to a GC log record we >>>> // could consider timing and reporting how long we wait in the >>>> // following two methods. >>>> wait_while_free_regions_coming(); >>>> // ... >>>> _cm->snapshot_regions()->wait_until_scan_finished(); >>>> >>>> >>>>> Finally, just some food for thought. Could this be generalized to >>>>> more roots? I mean take a snapshot and scan it concurrently. >>>> >>>> By scanning the IM snapshot regions we say "these guys are all >>>> roots, instead of scanning them during the GC we will scan them >>>> concurrently". And we can do that for any object / region a) as >>>> long as we know they will not move while they are being scanned and >>>> b) because we have the pre-barrier. If any references on the >>>> snapshot objects are updated, the pre-barrier ensures that their >>>> values at the start of marking will be enqueued and processed. >>>> >>>> For external arbitrary roots we don't have a write barrier (and we >>>> shouldn't as it'd be too expensive). So, we cannot do that for >>>> non-object roots without a "pre-barrier"-type mechanism. >>>> >>>> Tony >>>> >>>> >>>>> Bengt >>>>> >>>>> >>>>> >>>>> On 2012-01-18 00:48, Tony Printezis wrote: >>>>>> Hi all, >>>>>> >>>>>> Can I have a couple of code reviews for this change that >>>>>> re-enables the use of survivor regions during the initial-mark >>>>>> pause? 
>>>>>> >>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>>>> >>>>>> From the CR: >>>>>> >>>>>> We could scan the survivors as we're copying them, however this >>>>>> will require more work during the initial-mark GCs (and in >>>>>> particular: special-case code in the fast path). >>>>>> >>>>>> A better approach is to let the concurrent marking threads scan >>>>>> the survivors and mark everything reachable from them a) before >>>>>> any more concurrent marking work is done (so that we can just >>>>>> mark the objects, without needing to push them on a stack, and >>>>>> let the "finger" algorithm discover them) and b) before the next >>>>>> GC starts (since, if we copy them, we won't know which of the new >>>>>> survivors are the ones we need to scan). >>>>>> >>>>>> This approach has the advantage that it does not require any >>>>>> extra work during the initial-mark GCs and all the work is done >>>>>> by the concurrent marking threads. However, it has the >>>>>> disadvantage that the survivor scanning might hold up the next >>>>>> GC. In most cases this should not be an issue as GCs take place >>>>>> at a reasonably low rate. 
If it does become a problem we could >>>>>> consider the following: >>>>>> >>>>>> - like when the GC locker is active, try to extend the eden to >>>>>> give a bit more time to the marking threads to finish scanning >>>>>> the survivors >>>>>> - instead of waiting for the marking threads, a GC can take over >>>>>> and finish up scanning the remaining survivors (typically, we >>>>>> have more GC threads than marking threads, so the overhead will >>>>>> be reduced) >>>>>> - if we supported region pinning, we could pin all the regions >>>>>> that were not scanned by the time the GC started so that the >>>>>> marking threads can resume scanning them after the GC completes >>>>>> >>>>>> Implementation notes: >>>>>> >>>>>> I introduced the concept of a "snapshot regions" in the >>>>>> ConcurrentMark which is a set of regions that need to be scanned >>>>>> at the start of a concurrent cycle. Currently, these can only be >>>>>> survivors but maybe we can use the same concept for something >>>>>> else in the future. >>>>>> >>>>>> Tony >>>>>> >>>>>> >>>>> >>>> >> From tony.printezis at oracle.com Thu Jan 19 12:00:03 2012 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Thu, 19 Jan 2012 20:00:03 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size Message-ID: <20120119200005.C38134709E@hg.openjdk.java.net> Changeset: a8a126788ea0 Author: tonyp Date: 2012-01-19 09:13 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a8a126788ea0 7078465: G1: Don't use the undefined value (-1) for the G1 old memory pool max size Reviewed-by: johnc, brutisso ! src/share/vm/gc_implementation/g1/g1MonitoringSupport.hpp ! 
src/share/vm/services/g1MemoryPool.hpp From bengt.rutisson at oracle.com Fri Jan 20 08:06:52 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 17:06:52 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 Message-ID: <4F19911C.8040701@oracle.com> Hi all, Can I have a couple of quick reviews for this small change: http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 This should hopefully fix the 500+ failures in the G1 nightlies. So, I would like to get it in before the nightlies tonight. The issue is that we call collect() which will trigger a collection without protecting the memory that we just allocated for a humongous object. The fix (thanks Tony for helping me out!!!) is to fake an object and create a handle to it before we call collect. Bengt From stefan.karlsson at oracle.com Fri Jan 20 08:21:22 2012 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 20 Jan 2012 17:21:22 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F19911C.8040701@oracle.com> References: <4F19911C.8040701@oracle.com> Message-ID: <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> On 20 jan 2012, at 17:06, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of quick reviews for this small change: > http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 > > This should hopefully fix the 500+ failures in the G1 nightlies. So, I would like to get it in before the nightlies tonight. > > The issue is that we call collect() which will trigger a collection without protecting the memory that we just allocated for a humongous object. The fix (thanks Tony for helping me out!!!) is to fake an object and create a handle to it before we call collect. 1067 Handle h((oop)result); 1068 collect(GCCause::_g1_humongous_allocation); 1069 } 1070 return result; 1071 } Can we really have a handle to uninitialized memory? 
Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. StefanK > > Bengt From bengt.rutisson at oracle.com Fri Jan 20 08:43:29 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 17:43:29 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F19911C.8040701@oracle.com> References: <4F19911C.8040701@oracle.com> Message-ID: <4F1999B1.2050309@oracle.com> Hi again, Here is an updated webrev based on some comments from Tony. The code comments were updated and I pass false as the zap parameter to CollectedHeap::fill_with_object(result, word_size, false); Thanks Tony for the review! Bengt On 2012-01-20 17:06, Bengt Rutisson wrote: > > Hi all, > > Can I have a couple of quick reviews for this small change: > http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 > > This should hopefully fix the 500+ failures in the G1 nightlies. So, I > would like to get it in before the nightlies tonight. > > The issue is that we call collect() which will trigger a collection > without protecting the memory that we just allocated for a humongous > object. The fix (thanks Tony for helping me out!!!) is to fake an > object and create a handle to it before we call collect. > > Bengt From bengt.rutisson at oracle.com Fri Jan 20 08:46:24 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 17:46:24 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> Message-ID: <4F199A60.6070506@oracle.com> Stefan, Thanks for the prompt review! Comments inline.
On 2012-01-20 17:21, Stefan Karlsson wrote: > On 20 jan 2012, at 17:06, Bengt Rutisson > wrote: > >> >> Hi all, >> >> Can I have a couple of quick reviews for this small change: >> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >> >> >> This should hopefully fix the 500+ failures in the G1 nightlies. So, >> I would like to get it in before the nightlies tonight. >> >> The issue is that we call collect() which will trigger a collection >> without protecting the memory that we just allocated for a humongous >> object. The fix (thanks Tony for helping me out!!!) is to fake an >> object and create a handle to it before we call collect. > > 1067 Handle h((oop)result); > 1068 collect(GCCause::_g1_humongous_allocation); > 1069 } > 1070 return result; > 1071 } > Can we really have a handle to uninitialized memory? The memory is not uninitialized since I fake an object there with the call to CollectedHeap::fill_with_object(result, word_size, false); just before the code you have above. > Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. Humongous objects will not be moved by G1 collections so I think we are ok. Thanks for the prompt review! Bengt > > StefanK > >> >> Bengt -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/5adb2fb7/attachment.html From tony.printezis at oracle.com Fri Jan 20 08:44:28 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 20 Jan 2012 11:44:28 -0500 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> Message-ID: <4F1999EC.5090208@oracle.com> Stefan, On 01/20/2012 11:21 AM, Stefan Karlsson wrote: > 1067 Handle h((oop)result); > 1068 collect(GCCause::_g1_humongous_allocation); > 1069 } > 1070 return result; > 1071 } > Can we really have a handle to uninitialized memory? Yes. The "uninitialized" memory is made to look like a scalar array (of the correct size) before creating the handle. So, I can't see any issues. > Are you sure that the humongous object will not be moved by a full collection. Yes, humongous objects never move during evacuation pauses. > You should probably return h() instead of result. Maybe we could add an assert that h() == result? Tony > StefanK > >> >> Bengt From stefan.karlsson at oracle.com Fri Jan 20 08:44:45 2012 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 20 Jan 2012 17:44:45 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F199A60.6070506@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> Message-ID: <4F1999FD.6020009@oracle.com> On 2012-01-20 17:46, Bengt Rutisson wrote: > > Stefan, > > Thanks for the prompt review! > > Comments inline.
> > On 2012-01-20 17:21, Stefan Karlsson wrote: >> On 20 jan 2012, at 17:06, Bengt Rutisson > > wrote: >> >>> >>> Hi all, >>> >>> Can I have a couple of quick reviews for this small change: >>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>> >>> >>> This should hopefully fix the 500+ failures in the G1 nightlies. So, >>> I would like to get it in before the nightlies tonight. >>> >>> The issue is that we call collect() which will trigger a collection >>> without protecting the memory that we just allocated for a humongous >>> object. The fix (thanks Tony for helping me out!!!) is to fake an >>> object and create a handle to it before we call collect. >> >> 1067 Handle h((oop)result); >> 1068 collect(GCCause::_g1_humongous_allocation); >> 1069 } >> 1070 return result; >> 1071 } >> Can we really have a handle to uninitialized memory? > > The memory is not uninitialized since I fake an object there with the > call to CollectedHeap::fill_with_object(result, word_size, false); > just before the code you have above. I missed that. > >> Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. > > Humongous objects will not be moved by G1 collections so I think we > are ok. OK. But maybe we should be a bit defensive and return h() here. StefanK > > Thanks for the prompt review! > > Bengt >> >> StefanK >> >>> >>> Bengt > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/e0082881/attachment.html From bengt.rutisson at oracle.com Fri Jan 20 08:56:00 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 17:56:00 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F1999FD.6020009@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> <4F1999FD.6020009@oracle.com> Message-ID: <4F199CA0.1090606@oracle.com> Stefan, are you OK with adding the assert that Tony suggested? Bengt On 2012-01-20 17:44, Stefan Karlsson wrote: > On 2012-01-20 17:46, Bengt Rutisson wrote: >> >> Stefan, >> >> Thanks for the prompt review! >> >> Comments inline. >> >> On 2012-01-20 17:21, Stefan Karlsson wrote: >>> On 20 jan 2012, at 17:06, Bengt Rutisson >> > wrote: >>> >>>> >>>> Hi all, >>>> >>>> Can I have a couple of quick reviews for this small change: >>>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>>> >>>> >>>> This should hopefully fix the 500+ failures in the G1 nightlies. >>>> So, I would like to get it in before the nightlies tonight. >>>> >>>> The issue is that we call collect() which will trigger a collection >>>> without protecting the memory that we just allocated for a >>>> humongous object. The fix (thanks Tony for helping me out!!!) is to >>>> fake an object and create a handle to it before we call collect. >>> >>> 1067 Handle h((oop)result); >>> 1068 collect(GCCause::_g1_humongous_allocation); >>> 1069 } >>> 1070 return result; >>> 1071 } >>> Can we really have a handle to uninitialized memory? >> >> The memory is not uninitialized since I fake an object there with the >> call to CollectedHeap::fill_with_object(result, word_size, false); >> just before the code you have above. > > I missed that. > >> >>> Are you sure that the humongous object will not be moved by a full collection. 
You should probably return h() instead of result. >> >> Humongous objects will not be moved by G1 collections so I think we >> are ok. > > OK. But maybe we should be a bit defensive and return h() here. > > StefanK > >> >> Thanks for the prompt review! >> >> Bengt >>> >>> StefanK >>> >>>> >>>> Bengt >> > From stefan.karlsson at oracle.com Fri Jan 20 08:53:55 2012 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 20 Jan 2012 17:53:55 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F199CA0.1090606@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> <4F1999FD.6020009@oracle.com> <4F199CA0.1090606@oracle.com> Message-ID: <4F199C23.5020007@oracle.com> On 2012-01-20 17:56, Bengt Rutisson wrote: > > Stefan, are you OK with adding the assert that Tony suggested? Use the assert if you want to, but I don't see why that would be a better solution. If we ever start moving humongous objects, returning h() will work in production code, while the assert will only be found in debug builds. StefanK > > Bengt > > On 2012-01-20 17:44, Stefan Karlsson wrote: >> On 2012-01-20 17:46, Bengt Rutisson wrote: >>> >>> Stefan, >>> >>> Thanks for the prompt review! >>> >>> Comments inline. >>> >>> On 2012-01-20 17:21, Stefan Karlsson wrote: >>>> On 20 jan 2012, at 17:06, Bengt Rutisson >>> > wrote: >>>> >>>>> >>>>> Hi all, >>>>> >>>>> Can I have a couple of quick reviews for this small change: >>>>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>>>> >>>>> >>>>> This should hopefully fix the 500+ failures in the G1 nightlies. >>>>> So, I would like to get it in before the nightlies tonight.
>>>>> >>>>> The issue is that we call collect() which will trigger a >>>>> collection without protecting the memory that we just allocated >>>>> for a humongous object. The fix (thanks Tony for helping me >>>>> out!!!) is to fake an object and create a handle to it before we >>>>> call collect. >>>> >>>> 1067 Handle h((oop)result); >>>> 1068 collect(GCCause::_g1_humongous_allocation); >>>> 1069 } >>>> 1070 return result; >>>> 1071 } >>>> Can we really have a handle to uninitialized memory? >>> >>> The memory is not uninitialized since I fake an object there with >>> the call to CollectedHeap::fill_with_object(result, word_size, >>> false); just before the code you have above. >> >> I missed that. >> >>> >>>> Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. >>> >>> Humongous objects will not be moved by G1 collections so I think we >>> are ok. >> >> OK. But maybe we should be a bit defensive and return h() here. >> >> StefanK >> >>> >>> Thanks for the prompt review! >>> >>> Bengt >>>> >>>> StefanK >>>> >>>>> >>>>> Bengt >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/6e4fe166/attachment-0001.html From tony.printezis at oracle.com Fri Jan 20 08:57:20 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 20 Jan 2012 11:57:20 -0500 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F199C23.5020007@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> <4F1999FD.6020009@oracle.com> <4F199CA0.1090606@oracle.com> <4F199C23.5020007@oracle.com> Message-ID: <4F199CF0.2030307@oracle.com> Stefan, It's a fundamental assumption of G1 that we never move humongous objects during a GC. 
In the future, we might consider them for collection during a GC; however, they will either be reclaimed or be left where they are. Given that a humongous object takes up whole regions, there's not much point in moving it somewhere else. Tony On 01/20/2012 11:53 AM, Stefan Karlsson wrote: > On 2012-01-20 17:56, Bengt Rutisson wrote: >> >> Stefan, are you OK with adding the assert that Tony suggested? > > Use the assert if you want to, but I don't see why that would be a > better solution. If we ever start moving humongous objects, returning > h() will work in production code, while the assert will only be found > in debug builds. > > StefanK > >> >> Bengt >> >> On 2012-01-20 17:44, Stefan Karlsson wrote: >>> On 2012-01-20 17:46, Bengt Rutisson wrote: >>>> >>>> Stefan, >>>> >>>> Thanks for the prompt review! >>>> >>>> Comments inline. >>>> >>>> On 2012-01-20 17:21, Stefan Karlsson wrote: >>>>> On 20 jan 2012, at 17:06, Bengt Rutisson >>>>> > wrote: >>>>> >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Can I have a couple of quick reviews for this small change: >>>>>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>>>>> >>>>>> >>>>>> This should hopefully fix the 500+ failures in the G1 nightlies. >>>>>> So, I would like to get it in before the nightlies tonight. >>>>>> >>>>>> The issue is that we call collect() which will trigger a >>>>>> collection without protecting the memory that we just allocated >>>>>> for a humongous object. The fix (thanks Tony for helping me >>>>>> out!!!) is to fake an object and create a handle to it before we >>>>>> call collect. >>>>> >>>>> 1067 Handle h((oop)result); >>>>> 1068 collect(GCCause::_g1_humongous_allocation); >>>>> 1069 } >>>>> 1070 return result; >>>>> 1071 } >>>>> Can we really have a handle to uninitialized memory? >>>> >>>> The memory is not uninitialized since I fake an object there with >>>> the call to CollectedHeap::fill_with_object(result, word_size, >>>> false); just before the code you have above.
>>> >>> I missed that. >>> >>>> >>>>> Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. >>>> >>>> Humongous objects will not be moved by G1 collections so I think we >>>> are ok. >>> >>> OK. But maybe we should be a bit defensive and return h() here. >>> >>> StefanK >>> >>>> >>>> Thanks for the prompt review! >>>> >>>> Bengt >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Bengt >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/64baf537/attachment.html From bengt.rutisson at oracle.com Fri Jan 20 09:05:16 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 18:05:16 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F199CA0.1090606@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> <4F1999FD.6020009@oracle.com> <4F199CA0.1090606@oracle.com> Message-ID: <4F199ECC.1020809@oracle.com> Here is an updated webrev with the added assert: http://cr.openjdk.java.net/~brutisso/7131791/webrev.05 Bengt On 2012-01-20 17:56, Bengt Rutisson wrote: > > Stefan, are you OK with adding the assert that Tony suggested? > > Bengt > > On 2012-01-20 17:44, Stefan Karlsson wrote: >> On 2012-01-20 17:46, Bengt Rutisson wrote: >>> >>> Stefan, >>> >>> Thanks for the prompt review! >>> >>> Comments inline. >>> >>> On 2012-01-20 17:21, Stefan Karlsson wrote: >>>> On 20 jan 2012, at 17:06, Bengt Rutisson >>> > wrote: >>>> >>>>> >>>>> Hi all, >>>>> >>>>> Can I have a couple of quick reviews for this small change: >>>>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>>>> >>>>> >>>>> This should hopefully fix the 500+ failures in the G1 nightlies. >>>>> So, I would like to get it in before the nightlies tonight. 
>>>>> >>>>> The issue is that we call collect() which will trigger a >>>>> collection without protecting the memory that we just allocated >>>>> for a humongous object. The fix (thanks Tony for helping me >>>>> out!!!) is to fake an object and create a handle to it before we >>>>> call collect. >>>> >>>> 1067 Handle h((oop)result); >>>> 1068 collect(GCCause::_g1_humongous_allocation); >>>> 1069 } >>>> 1070 return result; >>>> 1071 } >>>> Can we really have a handle to uninitialized memory? >>> >>> The memory is not uninitialized since I fake an object there with >>> the call to CollectedHeap::fill_with_object(result, word_size, >>> false); just before the code you have above. >> >> I missed that. >> >>> >>>> Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. >>> >>> Humongous objects will not be moved by G1 collections so I think we >>> are ok. >> >> OK. But maybe we should be a bit defensive and return h() here. >> >> StefanK >> >>> >>> Thanks for the prompt review! >>> >>> Bengt >>>> >>>> StefanK >>>> >>>>> >>>>> Bengt >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/4ff99d37/attachment.html From tony.printezis at oracle.com Fri Jan 20 09:13:39 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Fri, 20 Jan 2012 12:13:39 -0500 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause In-Reply-To: <4F182B65.3050701@oracle.com> References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com> <4F17DC5C.4050403@oracle.com> <4F1822D2.2060202@oracle.com> <4F182B65.3050701@oracle.com> Message-ID: <4F19A0C3.1060200@oracle.com> Hi all, New webrev for this based on comments from John: http://cr.openjdk.java.net/~tonyp/7127706/webrev.3/ Now, the CMRootRegions::claim_next() method does not check the ConcurrentMark::has_aborted() flag to know when to abort (and return NULL) but, instead, a _has_aborted flag I added to the CMRootRegions class. This is set to true at the start of a Full GC and before the Full GC waits for the root region scan to finish. Additionally, I now call _root_regions.reset() from CM::checkpointRootsInitialPost() instead of calling it explicitly from do_collection_pause_at_safepoint(). Tony On 01/19/2012 09:40 AM, Tony Printezis wrote: > Bengt (and all), > > Updated webrev using "root regions" now: > > http://cr.openjdk.java.net/~tonyp/7127706/webrev.2/ > > Tony > > On 01/19/2012 09:04 AM, Tony Printezis wrote: >> Bengt, >> >> Inline (again!) >> >> On 1/19/2012 4:03 AM, Bengt Rutisson wrote: >>> >>> Tony, >>> >>> On 2012-01-18 22:31, Tony Printezis wrote: >>>> Bengt, >>>> >>>> Here's a webrev with the renaming: >>>> >>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/ >>>> >>>> I have to say I'm not sure I really like the term "initial-mark / >>>> IM snapshot regions". I'll try to come up with an alternative name >>>> for them.... >>> >>> Looked quickly at the new webrev. >>> >>> I agree that IM-snapshot might not be optimal. 
Still I like the fact >>> that it is not just "snapshot" since I think that can easily be >>> confused with the SATB terms. >> >> Well, it's supposed to be. In SATB, anything that's reachable from >> the "snapshot" at initial-mark time will be retained. Since we want >> to avoid explicitly marking the survivors, we make them all >> implicitly live and part of the "snapshot" which is why we have to >> scan them (the same way we scan roots during the initial-mark pause). >> So, "snapshot regions" is not unreasonable given that if we were not >> using SATB we would not be able to do this. >> >>> It is of course part of the SATB snapshot, but not the whole thing. >>> >>> Just thinking aloud here, what about not using the word "snapshot" >>> at all? How about "to_be_scanned_regions", "root_regions" or >>> "concurrent_roots"? >> >> I like "root regions". Let's go with that (it's shorter too!). >> >> Tony >> >>> Really not sure what a good name is here...I kind of like the log >>> message "Concurrent root scanning took 0.000x ms". >>> >>> Bengt >>> >>>> >>>> Tony >>>> >>>> Tony Printezis wrote: >>>>> Hi Bengt, >>>>> >>>>> Thanks for looking at this so quickly! Inline. >>>>> >>>>> Bengt Rutisson wrote: >>>>>> >>>>>> Tony, >>>>>> >>>>>> Overall this looks really good. Thanks for fixing it. >>>>>> >>>>>> Some comments: >>>>>> >>>>>> First, a general question regarding naming and logging. We now >>>>>> talk about "snapshot" a lot. It is a pretty good name, but maybe >>>>>> it needs some more context to be understandable in the code and >>>>>> the GC log. I don't have any really good names, but maybe >>>>>> "survivor_snapshot" >>>>> >>>>> I'd rather not mention "survivors" given that we might add >>>>> non-survivor regions in the future. >>>>> >>>>>> or "initial_mark_snapshot"? >>>>> >>>>> I like "initial-mark snapshot" better. Having said that >>>>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions >>>>> are kinda long. 
:-) I'll abbreviate to CMIMSnapshotRegions and >>>>> _im_snapshot_regions if that's OK. >>>>> >>>>>> concurrentMark.inline.hpp >>>>>> >>>>>> if (hr == NULL) { >>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>> // Given that we're looking for a region that contains an object >>>>>> // header it's impossible to get back a HC region. >>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>>> } else { >>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>> } >>>>>> >>>>>> The first assert should probably hold even for regions that are >>>>>> passed in to grayRoot() right? So, maybe something like: >>>>>> >>>>>> if (hr == NULL) { >>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>> } else { >>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>> } >>>>>> // Given that we need a region that contains an object >>>>>> // header it's impossible for it to be a HC region. >>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>> >>>>> Good observation! I changed to the above. >>>>> >>>>>> concurrentMarkThread.cpp >>>>>> >>>>>> ConcurrentMarkThread::run() >>>>>> >>>>>> Why do we do the explicit time/date stamping? >>>>>> >>>>>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>>>>> gclog_or_tty->stamp(PrintGCTimeStamps); >>>>>> gclog_or_tty->print_cr("[GC >>>>>> concurrent-snapshot-scan-start]"); >>>>>> >>>>>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>>>>> information? >>>>> >>>>> Not quite sure what you mean with "is it not enough with the >>>>> normal ... information". Each log record needs either a GC time >>>>> stamp or a GC date stamp and we have to print either or both >>>>> depending on the two -XX parameters. Unfortunately, the logging >>>>> code has not been well abstracted and/or refactored so we have >>>>> this unfortunate replication throughout the GCs. >>>>> >>>>>> This is probably correct since I see this pattern in other >>>>>> places. But I would like to understand why we do it. 
>>>>>> >>>>>> >>>>>> g1CollectedHeap.cpp: >>>>>> >>>>>> G1CollectedHeap::do_collection() >>>>>> >>>>>> Is it worth logging how long we had to wait in >>>>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same way >>>>>> that we do in G1CollectedHeap::do_collection_pause_at_safepoint()? >>>>> >>>>> Currently, the GC log records for the evacuation pauses have a lot >>>>> of extra information when +PrintGCDetails is set and it was >>>>> reasonable to add an extra record with the wait time. And it's >>>>> more important to know how the wait for snapshot region scanning >>>>> affects evacuation pauses, which are more critical. The Full GC >>>>> log records are currently one line and I don't think we want to >>>>> extend them further (at least, not before we put a decent GC >>>>> logging framework in place). On the other hand, the snapshot >>>>> region scanning aborts when a marking cycle is aborted due to a >>>>> Full GC. So, this wait time should not be long. How about I add a >>>>> comment in the code saying that, when we introduce a more >>>>> extensible logging framework, we could add the wait time to the >>>>> Full GC log records? Something like: >>>>> >>>>> // Note: When we have a more flexible GC logging framework that >>>>> // allows us to add optional attributes to a GC log record we >>>>> // could consider timing and reporting how long we wait in the >>>>> // following two methods. >>>>> wait_while_free_regions_coming(); >>>>> // ... >>>>> _cm->snapshot_regions()->wait_until_scan_finished(); >>>>> >>>>> >>>>>> Finally, just some food for thought. Could this be generalized to >>>>>> more roots? I mean take a snapshot and scan it concurrently. >>>>> >>>>> By scanning the IM snapshot regions we say "these guys are all >>>>> roots, instead of scanning them during the GC we will scan them >>>>> concurrently". 
And we can do that for any object / region a) as >>>>> long as we know they will not move while they are being scanned >>>>> and b) because we have the pre-barrier. If any references on the >>>>> snapshot objects are updated, the pre-barrier ensures that their >>>>> values at the start of marking will be enqueued and processed. >>>>> >>>>> For external arbitrary roots we don't have a write barrier (and we >>>>> shouldn't as it'd be too expensive). So, we cannot do that for >>>>> non-object roots without a "pre-barrier"-type mechanism. >>>>> >>>>> Tony >>>>> >>>>> >>>>>> Bengt >>>>>> >>>>>> >>>>>> >>>>>> On 2012-01-18 00:48, Tony Printezis wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Can I have a couple of code reviews for this change that >>>>>>> re-enables the use of survivor regions during the initial-mark >>>>>>> pause? >>>>>>> >>>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>>>>> >>>>>>> From the CR: >>>>>>> >>>>>>> We could scan the survivors as we're copying them, however this >>>>>>> will require more work during the initial-mark GCs (and in >>>>>>> particular: special-case code in the fast path). >>>>>>> >>>>>>> A better approach is to let the concurrent marking threads scan >>>>>>> the survivors and mark everything reachable from them a) before >>>>>>> any more concurrent marking work is done (so that we can just >>>>>>> mark the objects, without needing to push them on a stack, and >>>>>>> let the "finger" algorithm discover them) and b) before the next >>>>>>> GC starts (since, if we copy them, we won't know which of the >>>>>>> new survivors are the ones we need to scan). >>>>>>> >>>>>>> This approach has the advantage that it does not require any >>>>>>> extra work during the initial-mark GCs and all the work is done >>>>>>> by the concurrent marking threads. However, it has the >>>>>>> disadvantage that the survivor scanning might hold up the next >>>>>>> GC. 
In most cases this should not be an issue as GCs take place >>>>>>> at a reasonably low rate. If it does become a problem we could >>>>>>> consider the following: >>>>>>> >>>>>>> - like when the GC locker is active, try to extend the eden to >>>>>>> give a bit more time to the marking threads to finish scanning >>>>>>> the survivors >>>>>>> - instead of waiting for the marking threads, a GC can take over >>>>>>> and finish up scanning the remaining survivors (typically, we >>>>>>> have more GC threads than marking threads, so the overhead will >>>>>>> be reduced) >>>>>>> - if we supported region pinning, we could pin all the regions >>>>>>> that were not scanned by the time the GC started so that the >>>>>>> marking threads can resume scanning them after the GC completes >>>>>>> >>>>>>> Implementation notes: >>>>>>> >>>>>>> I introduced the concept of a "snapshot regions" in the >>>>>>> ConcurrentMark which is a set of regions that need to be scanned >>>>>>> at the start of a concurrent cycle. Currently, these can only be >>>>>>> survivors but maybe we can use the same concept for something >>>>>>> else in the future. >>>>>>> >>>>>>> Tony >>>>>>> >>>>>>> >>>>>> >>>>> >>> From bengt.rutisson at oracle.com Fri Jan 20 09:21:47 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 20 Jan 2012 18:21:47 +0100 Subject: Review request (S): 7131791 G1: Asserts in nightly testing due to 6976060 In-Reply-To: <4F199ECC.1020809@oracle.com> References: <4F19911C.8040701@oracle.com> <8C7CE33B-1215-4828-836B-EC0860F5D5CF@oracle.com> <4F199A60.6070506@oracle.com> <4F1999FD.6020009@oracle.com> <4F199CA0.1090606@oracle.com> <4F199ECC.1020809@oracle.com> Message-ID: <4F19A2AB.3010300@oracle.com> Stefan and Tony, Thanks for the really fast reviews! My job is now in JPRT. I hope it makes it in before the nightlies tonight. Thanks again for all the help from both of you! (You both helped me much more today than what is visible in this email thread!) 
Bengt On 2012-01-20 18:05, Bengt Rutisson wrote: > > Here is an updated webrev with the added assert: > http://cr.openjdk.java.net/~brutisso/7131791/webrev.05 > > Bengt > > On 2012-01-20 17:56, Bengt Rutisson wrote: >> >> Stefan, are you OK with adding the assert that Tony suggested? >> >> Bengt >> >> On 2012-01-20 17:44, Stefan Karlsson wrote: >>> On 2012-01-20 17:46, Bengt Rutisson wrote: >>>> >>>> Stefan, >>>> >>>> Thanks for the prompt review! >>>> >>>> Comments inline. >>>> >>>> On 2012-01-20 17:21, Stefan Karlsson wrote: >>>>> On 20 jan 2012, at 17:06, Bengt Rutisson >>>>> > wrote: >>>>> >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Can I have a couple of quick reviews for this small change: >>>>>> http://cr.openjdk.java.net/~brutisso/7131791/webrev.02 >>>>>> >>>>>> >>>>>> This should hopefully fix the 500+ failures in the G1 nightlies. >>>>>> So, I would like to get it in before the nightlies tonight. >>>>>> >>>>>> The issue is that we call collect() which will trigger a >>>>>> collection without protecting the memory that we just allocated >>>>>> for a humongous object. The fix (thanks Tony for helping me >>>>>> out!!!) is to fake an object and create a handle to it before we >>>>>> call collect. >>>>> >>>>> 1067 Handle h((oop)result); >>>>> 1068 collect(GCCause::_g1_humongous_allocation); >>>>> 1069 } >>>>> 1070 return result; >>>>> 1071 } >>>>> Can we really have a handle to uninitialized memory? >>>> >>>> The memory is not uninitialized since I fake an object there with >>>> the call to CollectedHeap::fill_with_object(result, word_size, >>>> false); just before the code you have above. >>> >>> I missed that. >>> >>>> >>>>> Are you sure that the humongous object will not be moved by a full collection. You should probably return h() instead of result. >>>> >>>> Humongous objects will not be moved by G1 collections so I think we >>>> are ok. >>> >>> OK. But maybe we should be a bit defensive and return h() here. 
>>> >>> StefanK >>> >>>> >>>> Thanks for the prompt review! >>>> >>>> Bengt >>>>> >>>>> StefanK >>>>> >>>>>> >>>>>> Bengt >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120120/f3738c7d/attachment.html From bengt.rutisson at oracle.com Fri Jan 20 11:28:37 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 20 Jan 2012 19:28:37 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7131791: G1: Asserts in nightly testing due to 6976060 Message-ID: <20120120192842.9AD14470CD@hg.openjdk.java.net> Changeset: 57025542827f Author: brutisso Date: 2012-01-20 18:01 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/57025542827f 7131791: G1: Asserts in nightly testing due to 6976060 Summary: Create a handle and fake an object to make sure that we don't loose the memory we just allocated Reviewed-by: tonyp, stefank ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp From john.coomes at oracle.com Sat Jan 21 15:31:01 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 21 Jan 2012 23:31:01 +0000 Subject: hg: hsx/hotspot-gc: 2 new changesets Message-ID: <20120121233101.406E447107@hg.openjdk.java.net> Changeset: 7ad075c80995 Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/7ad075c80995 Added tag jdk8-b21 for changeset cc771d92284f ! .hgtags Changeset: 60d6f64a86b1 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/60d6f64a86b1 Added tag jdk8-b22 for changeset 7ad075c80995 ! 
.hgtags From john.coomes at oracle.com Sat Jan 21 15:31:06 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 21 Jan 2012 23:31:06 +0000 Subject: hg: hsx/hotspot-gc/corba: 2 new changesets Message-ID: <20120121233109.1957E47108@hg.openjdk.java.net> Changeset: a11d0062c445 Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/a11d0062c445 Added tag jdk8-b21 for changeset f157fc2a71a3 ! .hgtags Changeset: 5218eb256658 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/5218eb256658 Added tag jdk8-b22 for changeset a11d0062c445 ! .hgtags From john.coomes at oracle.com Sat Jan 21 15:31:14 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 21 Jan 2012 23:31:14 +0000 Subject: hg: hsx/hotspot-gc/jaxp: 2 new changesets Message-ID: <20120121233114.DFA2547109@hg.openjdk.java.net> Changeset: cf9d6ec44f89 Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/cf9d6ec44f89 Added tag jdk8-b21 for changeset d41eeadf5c13 ! .hgtags Changeset: 95102fd33418 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/95102fd33418 Added tag jdk8-b22 for changeset cf9d6ec44f89 ! .hgtags From john.coomes at oracle.com Sat Jan 21 15:31:20 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 21 Jan 2012 23:31:20 +0000 Subject: hg: hsx/hotspot-gc/jaxws: 4 new changesets Message-ID: <20120121233120.5FD944710A@hg.openjdk.java.net> Changeset: e67d51254533 Author: ohair Date: 2012-01-09 09:22 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/e67d51254533 7096063: /META-INF/mimetypes.default missing in jre\lib\resources.jar Reviewed-by: dholmes ! 
build-defs.xml Changeset: c266cab0e3ff Author: katleman Date: 2012-01-11 16:12 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/c266cab0e3ff Merge Changeset: 8d3df89b0f2d Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/8d3df89b0f2d Added tag jdk8-b21 for changeset c266cab0e3ff ! .hgtags Changeset: 25ce7a000487 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/25ce7a000487 Added tag jdk8-b22 for changeset 8d3df89b0f2d ! .hgtags From john.coomes at oracle.com Sat Jan 21 15:31:48 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Sat, 21 Jan 2012 23:31:48 +0000 Subject: hg: hsx/hotspot-gc/jdk: 23 new changesets Message-ID: <20120121233549.7410D47110@hg.openjdk.java.net> Changeset: 1c4fffa22930 Author: okutsu Date: 2011-12-21 17:09 +0900 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/1c4fffa22930 7122054: (tz) Windows-only: tzmappings needs update for KB2633952 Reviewed-by: peytoia ! src/windows/lib/tzmappings Changeset: b1814b3ea6d3 Author: michaelm Date: 2011-12-21 10:06 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/b1814b3ea6d3 7078386: NetworkInterface.getNetworkInterfaces() may return corrupted results on linux Reviewed-by: michaelm, alanb, chegar Contributed-by: brandon.passanisi at oracle.com ! src/solaris/native/java/net/NetworkInterface.c Changeset: a9dfdc523c2c Author: valeriep Date: 2011-12-21 14:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/a9dfdc523c2c 6839886: Array overrun in pkcs11 Summary: Fix the wrong value when dealing w/ month and day. Reviewed-by: mullan ! src/share/native/sun/security/pkcs11/wrapper/p11_convert.c Changeset: 11698dedbe79 Author: weijun Date: 2011-12-22 15:35 +0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/11698dedbe79 7122169: TcpTimeout fail for various reasons Reviewed-by: alanb ! 
test/sun/security/krb5/auto/TcpTimeout.java Changeset: 559e07ed1f56 Author: alanb Date: 2011-12-22 10:52 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/559e07ed1f56 7076310: (file) AclEntry.Builder setFlags throws IllegalArgumentException if set argument is empty Reviewed-by: alanb Contributed-by: stephen.flores at oracle.com ! src/share/classes/java/nio/file/attribute/AclEntry.java + test/java/nio/file/attribute/AclEntry/EmptySet.java Changeset: 3c1ab134db71 Author: dcubed Date: 2011-12-22 18:35 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3c1ab134db71 7121600: Instrumentation.redefineClasses() leaks class bytes Summary: Call JNI ReleaseByteArrayElements() on memory returned by JNI GetByteArrayElements(). Also push test for 7122253. Reviewed-by: acorn, poonam ! src/share/instrument/JPLISAgent.c + test/java/lang/instrument/BigClass.java + test/java/lang/instrument/MakeJAR4.sh + test/java/lang/instrument/RedefineBigClass.sh + test/java/lang/instrument/RedefineBigClassAgent.java + test/java/lang/instrument/RedefineBigClassApp.java + test/java/lang/instrument/RetransformBigClass.sh + test/java/lang/instrument/RetransformBigClassAgent.java + test/java/lang/instrument/RetransformBigClassApp.java Changeset: 437255d15e76 Author: lana Date: 2011-12-28 10:51 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/437255d15e76 Merge - src/share/classes/sun/awt/FocusingTextField.java - src/share/classes/sun/awt/HorizBagLayout.java - src/share/classes/sun/awt/OrientableFlowLayout.java - src/share/classes/sun/awt/VariableGridLayout.java - src/share/classes/sun/awt/VerticalBagLayout.java Changeset: 3a7ea63302f8 Author: smarks Date: 2011-12-29 16:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3a7ea63302f8 7122061: add -Xlint:all -Werror to warning-free build steps Reviewed-by: chegar, alanb, dholmes, ohair ! make/com/sun/demo/jvmti/hprof/Makefile ! make/com/sun/java/browser/net/Makefile ! make/com/sun/tools/Makefile ! 
make/com/sun/tools/attach/Makefile ! make/com/sun/tracing/Makefile ! make/com/sun/tracing/dtrace/Makefile ! make/java/instrument/Makefile ! make/java/rmi/Makefile ! make/java/text/base/Makefile ! make/java/text/bidi/Makefile ! make/java/util/Makefile ! make/javax/accessibility/Makefile ! make/javax/others/Makefile ! make/javax/security/Makefile ! make/jpda/tty/Makefile ! make/sun/launcher/Makefile ! make/sun/serialver/Makefile ! make/sun/text/Makefile ! make/sun/util/Makefile Changeset: 5aeefe0e5d8c Author: chegar Date: 2012-01-01 09:24 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/5aeefe0e5d8c 7125055: ContentHandler.getContent API changed in error Reviewed-by: alanb ! src/share/classes/java/net/ContentHandler.java ! src/share/classes/sun/net/www/content/image/gif.java ! src/share/classes/sun/net/www/content/image/jpeg.java ! src/share/classes/sun/net/www/content/image/png.java ! src/share/classes/sun/net/www/content/image/x_xbitmap.java ! src/share/classes/sun/net/www/content/image/x_xpixmap.java Changeset: 8952a5f494f9 Author: ksrini Date: 2012-01-03 08:27 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/8952a5f494f9 7123582: (launcher) display the -version and -XshowSettings Reviewed-by: alanb ! src/share/bin/java.c ! test/tools/launcher/Settings.java Changeset: 5e34726cb4bb Author: ksrini Date: 2012-01-03 08:33 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/5e34726cb4bb 7124443: (launcher) test DefaultsLocaleTest fails with Windows shells. Reviewed-by: darcy ! test/tools/launcher/DefaultLocaleTest.java - test/tools/launcher/DefaultLocaleTest.sh + test/tools/launcher/DefaultLocaleTestRun.java ! test/tools/launcher/TestHelper.java Changeset: 0194fe5ca404 Author: fparain Date: 2012-01-04 03:49 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/0194fe5ca404 7104647: Adding a diagnostic command framework Reviewed-by: mchung, dholmes ! make/common/Release.gmk ! make/java/management/mapfile-vers ! 
make/launchers/Makefile ! make/sun/tools/Makefile + src/linux/doc/man/jcmd.1 + src/share/classes/com/sun/management/DiagnosticCommandArgumentInfo.java + src/share/classes/com/sun/management/DiagnosticCommandInfo.java ! src/share/classes/com/sun/management/HotSpotDiagnosticMXBean.java ! src/share/classes/sun/management/HotSpotDiagnostic.java ! src/share/classes/sun/tools/attach/HotSpotVirtualMachine.java + src/share/classes/sun/tools/jcmd/Arguments.java + src/share/classes/sun/tools/jcmd/JCmd.java ! src/share/javavm/export/jmm.h ! src/share/native/sun/management/HotSpotDiagnostic.c + src/solaris/doc/sun/man/man1/jcmd.1 + test/com/sun/management/HotSpotDiagnosticMXBean/ExecuteDiagnosticCommand.java + test/com/sun/management/HotSpotDiagnosticMXBean/GetDiagnosticCommandInfo.java + test/com/sun/management/HotSpotDiagnosticMXBean/GetDiagnosticCommands.java ! test/sun/tools/common/CommonSetup.sh + test/sun/tools/jcmd/dcmd-script.txt + test/sun/tools/jcmd/help_help.out + test/sun/tools/jcmd/jcmd-Defaults.sh + test/sun/tools/jcmd/jcmd-f.sh + test/sun/tools/jcmd/jcmd-help-help.sh + test/sun/tools/jcmd/jcmd-help.sh + test/sun/tools/jcmd/jcmd-pid.sh + test/sun/tools/jcmd/jcmd_Output1.awk + test/sun/tools/jcmd/jcmd_pid_Output1.awk + test/sun/tools/jcmd/jcmd_pid_Output2.awk + test/sun/tools/jcmd/usage.out Changeset: 38a318502e19 Author: lana Date: 2012-01-04 10:57 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/38a318502e19 Merge ! make/common/Release.gmk - test/tools/launcher/DefaultLocaleTest.sh Changeset: 93ab1df09d11 Author: lana Date: 2012-01-09 19:12 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/93ab1df09d11 Merge - test/tools/launcher/DefaultLocaleTest.sh Changeset: ddb97d4fa83d Author: ohair Date: 2012-01-04 17:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/ddb97d4fa83d 7127104: Build issue with prtconf and zones, also using := to avoid extra execs Reviewed-by: katleman ! 
make/common/shared/Platform.gmk Changeset: 7c8c16f2c05e Author: ohair Date: 2012-01-09 09:18 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/7c8c16f2c05e 7128320: Fix freetype sanity check to make it more generic Reviewed-by: luchsh, katleman Contributed-by: Jonathan Lu ! make/common/Defs-linux.gmk ! make/common/Defs-solaris.gmk ! make/common/Defs-windows.gmk ! make/common/Demo.gmk ! make/tools/freetypecheck/Makefile Changeset: 664fa4fb0ee4 Author: katleman Date: 2012-01-11 16:13 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/664fa4fb0ee4 Merge Changeset: dda27c73d8db Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/dda27c73d8db Added tag jdk8-b21 for changeset 664fa4fb0ee4 ! .hgtags Changeset: 76bfd08d8cc5 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/76bfd08d8cc5 Added tag jdk8-b22 for changeset dda27c73d8db ! .hgtags Changeset: db189e2f3cdb Author: jrose Date: 2012-01-18 17:34 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/db189e2f3cdb 7117167: Misc warnings in java.lang.invoke and sun.invoke.* Reviewed-by: smarks ! src/share/classes/java/lang/invoke/AdapterMethodHandle.java ! src/share/classes/java/lang/invoke/MemberName.java ! src/share/classes/java/lang/invoke/MethodHandleImpl.java ! src/share/classes/java/lang/invoke/MethodHandleProxies.java ! src/share/classes/java/lang/invoke/MethodHandles.java ! src/share/classes/sun/invoke/util/ValueConversions.java ! src/share/classes/sun/invoke/util/Wrapper.java ! test/java/lang/invoke/CallSiteTest.java ! test/java/lang/invoke/ClassValueTest.java ! test/java/lang/invoke/InvokeGenericTest.java ! test/java/lang/invoke/JavaDocExamplesTest.java ! test/java/lang/invoke/MethodHandlesTest.java ! test/java/lang/invoke/MethodTypeTest.java ! test/java/lang/invoke/PermuteArgsTest.java ! test/java/lang/invoke/RicochetTest.java ! test/java/lang/invoke/ThrowExceptionsTest.java ! 
test/sun/invoke/util/ValueConversionsTest.java Changeset: 01014596ada1 Author: jrose Date: 2012-01-18 17:34 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/01014596ada1 7077803: java.lang.InternalError in java.lang.invoke.MethodHandleNatives.init Summary: Use correct access token for unreflecting MHs where setAccessible(true) Reviewed-by: never, twisti ! src/share/classes/java/lang/invoke/MethodHandles.java Changeset: 92d2cba30f08 Author: jrose Date: 2012-01-18 17:34 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/92d2cba30f08 7030453: JSR 292 ClassValue.get method is too slow Summary: Implement ClassValue cooperatively with Class like ThreadLocal with Thread. Reviewed-by: twisti, mduigou ! src/share/classes/java/lang/Class.java ! src/share/classes/java/lang/ClassValue.java ! test/java/lang/invoke/ClassValueTest.java Changeset: 81a2629aa2a2 Author: amurillo Date: 2012-01-20 14:31 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/81a2629aa2a2 Merge From ysr1729 at gmail.com Sun Jan 22 23:39:53 2012 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Sun, 22 Jan 2012 23:39:53 -0800 Subject: RFR(XS): 7129514: time warp warnings after 7117303 In-Reply-To: <4F171FC4.3020708@oracle.com> References: <4F170AA8.5070403@oracle.com> <4F171FC4.3020708@oracle.com> Message-ID: Looks good. (As in earlier review, I wish the duplication of comment and code could be minimized via consolidation into a suitable new method definition ...) - ramki On Wed, Jan 18, 2012 at 11:38 AM, John Cuthbertson < john.cuthbertson at oracle.com> wrote: > Hi Everyone, > > I forgot to include the webrev link: http://cr.openjdk.java.net/~johnc/7129514/webrev.0/ > > Thanks to Ramki for pointing this out. > > JohnC > > > On 1/18/2012 10:08 AM, John Cuthbertson wrote: > >> Hi Everyone, >> >> While making the changes for 7117303, I missed a few calls to >> os::javaTimeMillis() (specifically with updating the time since the last >> GC). 
As a consequence, we can still see the occasional time-warp warning. >> The issue is that os::javaTimeMillis() returns values that are not >> guaranteed to be monotonically non-decreasing and so they can go backwards. >> I've replaced these calls with an equivalent that uses os::javaTimeNanos(), >> which will return values that are monotonically non-decreasing if the >> underlying system time source supports such a mode. >> >> Many thanks to David Holmes for diagnosing the issue. >> >> Thanks, >> >> JohnC >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20120122/1da50cd8/attachment.html From bengt.rutisson at oracle.com Mon Jan 23 04:21:11 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 23 Jan 2012 13:21:11 +0100 Subject: Review request (S): 7132311 G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... Message-ID: <4F1D50B7.1070704@oracle.com> Hi all, Can I have one more review for this change, please? Stefan has already looked at it. http://cr.openjdk.java.net/~brutisso/7132311/webrev.02/ The idea is that we move the check for whether or not we should initiate a marking cycle to before we allocate a humongous object. This way we can ignore the issue with uninitialized memory. Thanks, Bengt From bengt.rutisson at oracle.com Mon Jan 23 05:30:59 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 23 Jan 2012 14:30:59 +0100 Subject: Review request (S): 7132311 G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... In-Reply-To: <4F1D50B7.1070704@oracle.com> References: <4F1D50B7.1070704@oracle.com> Message-ID: <4F1D6113.6090707@oracle.com> Forgot to mention something about testing: I was able to reproduce the assert using UTE and the Juggle12 test. It hits the assert every time I run it on my laptop. 
With the fix in the webrev the assert is gone. I also tested with the reproducer I had for "7131791: G1: Asserts in nightly testing due to 6976060" and it passes that too. Finally I also tried out the small test app that I wrote to see that we get concurrent collections. That works fine as well. No full GCs. Bengt On 2012-01-23 13:21, Bengt Rutisson wrote: > > Hi all, > > Can I have one more review for this change, please? Stefan has already > looked at it. > > http://cr.openjdk.java.net/~brutisso/7132311/webrev.02/ > > The idea is that we move the check for whether or not we should > initiate a marking cycle to before we allocate a humongous object. > This way we can ignore the issue with uninitialized memory. > > Thanks, > Bengt > From bengt.rutisson at oracle.com Mon Jan 23 10:06:09 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Mon, 23 Jan 2012 19:06:09 +0100 Subject: Review request (S): 7132311 G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... In-Reply-To: <4F1D7784.2080801@oracle.com> References: <4F1D50B7.1070704@oracle.com> <4F1D7784.2080801@oracle.com> Message-ID: <4F1DA191.80503@oracle.com> Updated webrev based on comments from Tony below: http://cr.openjdk.java.net/~brutisso/7132311/webrev.03/ Thanks Stefan and Tony for the quick reviews. As soon as JPRT is back online I will try to push this. Bengt On 2012-01-23 16:06, Tony Printezis wrote: > Bengt, > > g1CollectedHeap.cpp: > > 1128 if (result != NULL&& > g1_policy()->need_to_start_conc_mark("STW humongous allocation")) { > 1129 g1_policy()->set_initiate_conc_mark_if_possible(); > 1130 } > 1131 return result; > > Should you maybe leave this in (it should be benign if the cycle is > already in progress or is about to start)? 
Imagine the following > scenario: a hum allocation is attempted, the old gen + hum allocation > size is just under the threshold, the hum allocation fails > concurrently (not enough contiguous regions to satisfy it), we do a > young GC to satisfy it, at the end of the young GC the old gen > (including the newly-promoted objects) + hum allocation size is over > the threshold (and maybe by a large amount if we promoted a lot). I > know it's a bit of a stretch.... > > g1CollectorPolicy.cpp: > > 1142 if (_g1->mark_in_progress()) { > > (As discussed) Please change this to cmThread()->during_cycle(). > > 1149 size_t total_estimated_bytes = cur_used_bytes + alloc_word_size > * BytesPerWord > > What's the difference between BytesPerWord and HeapWordSize? I > generally use the latter. > > Also, thanks for updating the ergo_verbose entry points. However, > alloc_word_size is in words and you will print it as bytes (and it's > good to be consistent with the units anyway, given that everything > else is in bytes). Maybe just introduce a local field, > alloc_size_bytes, and use it in both the condition as well as the > ergo_verbose output? > > Nits: > > g1CollectedHeap.cpp: > > 1056 result = humongous_obj_allocate(word_size); > 1057 > 1058 if (result != NULL) { > 1059 return result; > 1060 } > > Can you delete line 1057? In the code the NULL test is generally on > the line after the assignment (at least, it should be in most of the > slow path allocation code). > > g1CollectorPolicy.hpp > > 802 bool need_to_start_conc_mark(const char* source, size_t > alloc_word_size); > > I'd give alloc_word_size a 0 default value to avoid passing 0 when a > size is not needed. > > > > On 1/23/2012 7:21 AM, Bengt Rutisson wrote: >> >> Hi all, >> >> Can I have one more review for this change, please? Stefan has >> already looked at it.
>> >> http://cr.openjdk.java.net/~brutisso/7132311/webrev.02/ >> >> The idea is that we move the check for whether or not we should >> initiate a marking cycle to before we allocate a humongous object. >> This way we can ignore the issue with uninitialized memory. >> >> Thanks, >> Bengt >> From tony.printezis at oracle.com Mon Jan 23 10:54:42 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Mon, 23 Jan 2012 13:54:42 -0500 Subject: Review request (S): 7132311 G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... In-Reply-To: <4F1DA191.80503@oracle.com> References: <4F1D50B7.1070704@oracle.com> <4F1D7784.2080801@oracle.com> <4F1DA191.80503@oracle.com> Message-ID: <4F1DACF2.4020705@oracle.com> Bengt, Thanks for fixing this and for taking into account my previous suggestions. The only very minor change I'd recommend is to replace the "alloc size" string in both ergo_verbose calls with "allocation request" given that this is what it is called in other places: ergo_verbose1(ErgoHeapSizing, "attempt heap expansion", ergo_format_reason("humongous allocation request failed") ergo_format_byte("allocation request"), word_size * HeapWordSize); So, it'd be nice if we were consistent. Apart from that: ship it! Also, I really liked Stefan's suggestion to do the GC before the allocation. It simplified the fix quite a lot... Tony On 01/23/2012 01:06 PM, Bengt Rutisson wrote: > > Updated webrev based on comments from Tony below: > http://cr.openjdk.java.net/~brutisso/7132311/webrev.03/ > > Thanks Stefan and Tony for the quick reviews. As soon as JPRT is back > online I will try to push this. 
> > Bengt > > > On 2012-01-23 16:06, Tony Printezis wrote: >> Bengt, >> >> g1CollectedHeap.cpp: >> >> 1128 if (result != NULL&& >> g1_policy()->need_to_start_conc_mark("STW humongous allocation")) { >> 1129 g1_policy()->set_initiate_conc_mark_if_possible(); >> 1130 } >> 1131 return result; >> >> Should you maybe leave this in (it should be benign if the cycle is >> already in progress or is about to start)? Imagine the following >> scenario: a hum allocation is attempted, the old gen + hum allocation >> size is just under the threshold, the hum allocation fails >> concurrently (not enough contiguous regions to satisfy it), we do a >> young GC to satisfy it, at the end of the young GC the old gen >> (including the newly-promoted objects) + hum allocation size is over >> the threshold (and maybe by a large amount if we promoted a lot). I >> know it's a bit of a stretch.... >> >> g1CollectorPolicy.cpp: >> >> 1142 if (_g1->mark_in_progress()) { >> >> (As discussed) Please change this to cmThread()->during_cycle(). >> >> 1149 size_t total_estimated_bytes = cur_used_bytes + >> alloc_word_size * BytesPerWord >> >> What's the different between BytesPerWord and HeapWordSize? I >> generally use the latter. >> >> Also, thanks for updating the ergo_verbose entry points. However, >> alloc_word_size is in words and you will print it as bytes (and it's >> good to be consistent with the units anyway, given that everything >> else is in bytes). Maybe just introduce a local field, >> alloc_size_bytes, and use it both the condition as well as the >> ergo_verbose output? >> >> Nits: >> >> g1CollectedHeap.cpp: >> >> 1056 result = humongous_obj_allocate(word_size); >> 1057 >> 1058 if (result != NULL) { >> 1059 return result; >> 1060 } >> >> Can you delete line 1057. In the code the NULL test is generally on >> the line after the assignment (at least, it should be in most of the >> slow path allocation code). 
>> >> g1CollectorPolicy.hpp >> >> 802 bool need_to_start_conc_mark(const char* source, size_t >> alloc_word_size); >> >> I'd give alloc_word_size a 0 default value to avoid passing 0 when a >> size is not needed. >> >> >> >> On 1/23/2012 7:21 AM, Bengt Rutisson wrote: >>> >>> Hi all, >>> >>> Can I have one more review for this change, please? Stefan has >>> already looked at it. >>> >>> http://cr.openjdk.java.net/~brutisso/7132311/webrev.02/ >>> >>> The idea is that we move the check for whether or not we should >>> initiate a marking cycle to before we allocate a humongous object. >>> This way we can ignore the issue with uninitialized memory. >>> >>> Thanks, >>> Bengt >>> > From john.coomes at oracle.com Mon Jan 23 14:54:20 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Mon, 23 Jan 2012 22:54:20 +0000 Subject: hg: hsx/hotspot-gc/langtools: 8 new changesets Message-ID: <20120123225440.CD38247143@hg.openjdk.java.net> Changeset: 116f68a5e677 Author: jjg Date: 2011-12-23 22:30 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/116f68a5e677 7124605: typos in javac comments Reviewed-by: ksrini ! test/tools/javac/generics/diamond/7046778/DiamondAndInnerClassTest.java ! test/tools/javac/generics/inference/7086601/T7086601b.java ! test/tools/javac/generics/rawOverride/7062745/GenericOverrideTest.java ! test/tools/javac/lambda/LambdaParserTest.java Changeset: 67512b631961 Author: lana Date: 2011-12-28 10:52 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/67512b631961 Merge Changeset: 7a836147b266 Author: jjg Date: 2012-01-03 11:37 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/7a836147b266 4881269: improve diagnostic for ill-formed tokens Reviewed-by: mcimadamore ! src/share/classes/com/sun/tools/javac/parser/JavaTokenizer.java ! 
src/share/classes/com/sun/tools/javac/resources/compiler.properties + test/tools/javac/diags/examples/IllegalDot.java + test/tools/javac/parser/T4881269.java + test/tools/javac/parser/T4881269.out Changeset: a07eef109532 Author: jjh Date: 2012-01-03 17:18 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/a07eef109532 7046929: tools/javac/api/T6397104.java fails Reviewed-by: jjg ! test/tools/javac/api/T6397104.java Changeset: 4e8aa6eca726 Author: lana Date: 2012-01-04 10:58 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/4e8aa6eca726 Merge Changeset: bcb21abf1c41 Author: lana Date: 2012-01-09 19:13 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/bcb21abf1c41 Merge Changeset: 390a7828ae18 Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/390a7828ae18 Added tag jdk8-b21 for changeset bcb21abf1c41 ! .hgtags Changeset: f6191bad139a Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/f6191bad139a Added tag jdk8-b22 for changeset 390a7828ae18 ! .hgtags From bengt.rutisson at oracle.com Mon Jan 23 16:20:42 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Tue, 24 Jan 2012 00:20:42 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7132311: G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... Message-ID: <20120124002046.F1CE047147@hg.openjdk.java.net> Changeset: 6a78aa6ac1ff Author: brutisso Date: 2012-01-23 20:36 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6a78aa6ac1ff 7132311: G1: assert((s == klass->oop_size(this)) || (Universe::heap()->is_gc_active() && ((is_typeArray()... Summary: Move the check for when to call collect() to before we do a humongous object allocation Reviewed-by: stefank, tonyp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! 
src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp From john.cuthbertson at oracle.com Tue Jan 24 10:53:18 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 24 Jan 2012 10:53:18 -0800 Subject: RFR(L): 6484965: G1: piggy-back liveness accounting phase on marking In-Reply-To: <4F0E9931.9070303@oracle.com> References: <4E8A40BE.9020800@oracle.com> <4EC2B317.3000006@oracle.com> <4ED38788.4010106@oracle.com> <4EF0DEF9.30306@oracle.com> <4EF1AF4E.80107@oracle.com> <4EF2127E.5050809@oracle.com> <4F0E9931.9070303@oracle.com> Message-ID: <4F1EFE1E.4050202@oracle.com> Hi Everyone, I have a new webrev for these changes that can be found at: http://cr.openjdk.java.net/~johnc/6484965/webrev.5/ This version includes changes based upon code review comments by Tony, including: * a helper routine to calculate the index in the card bitmap(s) for a given address * a clean up of the code that is used to set/clear bits in the card bitmap(s): for small ranges I use a simple loop which the compilers seem to be doing a reasonable job optimizing and for larger ranges I use [par_]set_range() with suitable parameters to set the bits in the range inclusively. The single-bit case is handled by the small range code, and I have not seen the OOB assertion failure during testing. * Moved the clearing/initialization of the liveness counting data structures from the initial mark pause to the ConcurrentMark constructor and wherever the Next marking bitmap is cleared (at the end of the pause or when the marking is aborted as a result of a full GC). At the end of the marking cycle, this clearing is concurrent. * Changing the type of _queue_num in G1ParScanThreadState to a uint. * Various cleanups, typos, and formatting changes. I found and fixed a small bug in the aggregation code that was exposed by a change based upon one of the comments by Tony. Testing: GC test suite with marking verification on and low marking thresholds (2% and 10%).
At Tony's request, I also ran with some prototype marking bitmap verification code. Thanks, JohnC On 01/12/12 00:26, John Cuthbertson wrote: > Hi Everyone, > > The latest incarnation of these changes can be found at: > http://cr.openjdk.java.net/~johnc/6484965/webrev.3/ > > The changes in this version include: > * Conditionally using a lock so that the output of the verification > closure executed by different threads does not interfere with each > other (suggested by Bengt). > * Merging up to the latest hotspot-gc tip (including Tony's marking > changes). This involved changing the evacuation failure code and > adding a suitable mark/count routine for use in > ConcurrentMark::grayRoot(). I also removed the counting changes from > code that has been made obsolete as a result of Tony's marking changes. > > Testing: a few runs of the GC test suite with low marking thresholds > (2 and 10%) with and without verification, and jprt. > > Thanks, > > JohnC > > On 12/21/2011 9:08 AM, John Cuthbertson wrote: >> Hi Bengt, >> >> That's a good observation. I guess it is possible but I haven't seen >> it in practice (though I was typically only using 4 threads when >> debugging a verification failure). It won't do any harm so I'll add >> the locking. >> >> Thanks, >> >> JohnC >> >> >> >> On 12/21/2011 2:05 AM, Bengt Rutisson wrote: >>> >>> Hi John, >>> >>> Thanks for updating your fix! Looks good. >>> >>> One question: >>> In concurrentMark.cpp it seems to me that the >>> VerifyLiveObjectDataHRClosure could get the same kind of messed up >>> output that Tony just fixed with 7123165 for the VerifyLiveClosure >>> in heapRegion.cpp. There are several workers simultaneously doing >>> the verification, right? Is it worth adding the same kind of locking >>> that Tony added? >>> >>> Bengt >>> >>> On 2011-12-20 20:16, John Cuthbertson wrote: >>>> Hi Bengt, >>>> >>>> As I mentioned earlier - thanks for the code review.
I've applied >>>> your suggestions, merged with the latest changeset in >>>> hsx/hotspot-gc/hotspot (resolving any conflicts), fixed the int <-> >>>> size_t issue you also mentioned, and retested using the GC test >>>> suite. A new webrev can be found at: >>>> http://cr.openjdk.java.net/~johnc/6484965/webrev.2/ >>>> >>>> Specific replies are inline. >>>> >>>> On 11/28/11 05:07, Bengt Rutisson wrote: >>>>> >>>>> John, >>>>> >>>>> A little late, but here are some comments on this webrev. I know >>>>> you have some more improvements to this change coming, but overall >>>>> I think it looks good. Most of my comments are just minor coding >>>>> style comments. >>>>> >>>>> Bengt >>>>> >>>>> concurrentMark.hpp >>>>> >>>>> Rename ConcurrentMark::clear() to ConcurrentMark::clear_mark() or >>>>> ConcurrentMark::unmark()? The comment you added is definitely >>>>> needed to understand what this method does. But it would be even >>>>> better if it was possible to get that from the method name itself. >>>> >>>> Done. >>>> >>>>> It seems like everywhere we use count_marked_bytes_for(int >>>>> worker_i) we almost directly use the array returned to index with >>>>> the heap region that we are interested in. How about wrapping all >>>>> of this in something like count_set_marked_bytes_for(int >>>>> worker_i, int hrs_index) and count_get_marked_bytes_for(int >>>>> worker_i, int hrs_index)? That way the data structure does not >>>>> have to be exposed outside ConcurrentMark. It would mean that >>>>> ConcurrentMark::count_region() would have to take a worker_i value >>>>> instead of a marked_bytes_array. >>>> >>>> I did not do this. I embed the marked_bytes array for a worker into >>>> the CMTask for that worker to save a de-reference. This was one of >>>> the requests from the original code walk-through. Avoiding the >>>> de-reference in the CMTask::do_marking_step() shaves a couple of >>>> points off the marking time.
I think your suggestion would >>>> reinstate the de-reference again and we would lose those few >>>> percentage points again. >>>> >>>>> If you don't agree with the suggestion above I would suggest to >>>>> change the name from count_marked_bytes_for() to >>>>> count_marked_bytes_array_for() since in every place that it is >>>>> being called the resulting value is stored in a local variable >>>>> called marked_bytes_array, which seems like a more informative >>>>> name to me. >>>> >>>> Done. I agree - the new name sounds better. >>>> >>>>> I think this comment: >>>>> >>>>> // As above - but we don't know the heap region containing the >>>>> // object and so have to supply it. >>>>> inline bool par_mark_and_count(oop obj, int worker_i); >>>>> >>>>> should be something like "we don't know the heap region containing >>>>> the object so we will have to look it up". >>>>> >>>>> Same thing here: >>>>> >>>>> // As above - but we don't have the heap region containing the >>>>> // object, so we have to supply it. >>>>> // Should *not* be called from parallel code. >>>>> inline bool mark_and_count(oop obj); >>>>> >>>>> >>>> >>>> Comments were changed to: >>>> >>>> >>>>> concurrentMark.cpp >>>>> >>>>> Since you are changing CalcLiveObjectsClosure::doHeapRegion() >>>>> anyway, could you please remove this unused code (1393-1397): >>>>> >>>>> /* >>>>> gclog_or_tty->print_cr("Setting bits from %d/%d.", >>>>> obj_card_num - _bottom_card_num, >>>>> obj_last_card_num - _bottom_card_num); >>>>> */ >>>>> >>>>> >>>> >>>> Done. >>>> >>>>> What about the destructor ConcurrentMark::~ConcurrentMark() ? I >>>>> remember Tony mentioning that it won't be called. Do you still >>>>> want to keep the code? >>>> >>>> I removed the entire destructor - I don't see it being called in >>>> the experiments I've run. 
>>>> >>>>> FinalCountDataUpdateClosure::set_bit_for_region() >>>>> Probably not worth it, but would it make sense to add information >>>>> in a startsHumongous HeapRegion to be able to give you the last >>>>> continuesHumongous region? Since we know this when we set the >>>>> regions up it seems like a waste to have to iterate over the >>>>> region list to find it. >>>> >>>> If you read the original comment - the original author did not want >>>> to make any assumptions about the internal field values of the >>>> HeapRegions spanned by a humongous object and so used the loop >>>> technique. I think you are correct and I now use the information in >>>> the startsHumongous region to find the index of the last >>>> continuesHumongous region spaned by the H-obj. >>>> >>>>> G1ParFinalCountTask >>>>> To me it is a bit surprising that we mix in the verify code inside >>>>> this closure. Would it be possible to extract this code out somehow? >>>> >>>> I did it this way to avoid another iteration over the heap regions. >>>> But it probably does make more sense to separate them and use >>>> another iteration to do the verify. Done. >>>> >>>>> Line 3378: "// Use fill_to_bytes". Is this something you plan on >>>>> doing? >>>> >>>> I removed the comment. I was thinking of doing this as >>>> fill_to_bytes is typically implemented using (a possibly >>>> specialized version of) memset. But it's probably not worth it in >>>> this case. >>>> >>>>> G1ParFinalCountTask::work() >>>>> Just for the record. I don't really like the way we have to set up >>>>> both a VerifyLiveObjectDataHRClosure and a Mux2HRClosure even >>>>> though we will only use them if we have VerifyDuringGC enabled. I >>>>> realize it is due to the scoping, but I still think it obstucts >>>>> the code flow and introduces unnecessary work. Unfortunately I >>>>> don't have a good suggestion for how to work around it. 
>>>>> >>>>> Since both VerifyLiveObjectDataHRClosure and a Mux2HRClosure are >>>>> StackObjs I assume it is not possible to get around the issue with >>>>> a ResourceMark. >>>> >>>> Now that the verification is performed in a separate iteration of >>>> the heap regions there's no need to create the >>>> VerifyLiveObjectDataHRClosure and Mux2HRClosure instances here. >>>> Done. I have also removed the now-redundant Mux2HRClosure. >>>> >>>> Hopefully the new webrev addresses these comments. >>>> >>>> Thanks again for looking. >>>> >>>> JohnC >>>> >>> >> > From tony.printezis at oracle.com Tue Jan 24 13:02:07 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 24 Jan 2012 16:02:07 -0500 Subject: CRR (XXS): 7132398: G1: java.lang.IllegalArgumentException: Invalid threshold: 9223372036854775807 > max (1073741824) Message-ID: <4F1F1C4F.9020406@oracle.com> Hi all, Can I have a couple of code reviews for this very small change (one line!): http://cr.openjdk.java.net/~tonyp/7132398/webrev.0/ It turns out my fix for 7078465 (G1: Don't use the undefined value (-1) for the G1 old memory pool max size) was incomplete. The max_size() method of the G1 old pool now returns the correct value. However, the max is also passed to the memory pool constructor and is somehow cached and I missed to update that. Tony From bengt.rutisson at oracle.com Tue Jan 24 13:31:43 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Tue, 24 Jan 2012 22:31:43 +0100 Subject: CRR (XXS): 7132398: G1: java.lang.IllegalArgumentException: Invalid threshold: 9223372036854775807 > max (1073741824) In-Reply-To: <4F1F1C4F.9020406@oracle.com> References: <4F1F1C4F.9020406@oracle.com> Message-ID: <175A95BA-5FF1-4DBB-94F0-250ED96DB226@oracle.com> Looks good! Bengt 24 jan 2012 kl. 
22:02 skrev Tony Printezis : > Hi all, > > Can I have a couple of code reviews for this very small change (one line!): > > http://cr.openjdk.java.net/~tonyp/7132398/webrev.0/ > > It turns out my fix for 7078465 (G1: Don't use the undefined value (-1) for the G1 old memory pool max size) was incomplete. The max_size() method of the G1 old pool now returns the correct value. However, the max is also passed to the memory pool constructor and is somehow cached and I missed to update that. > > Tony > From tony.printezis at oracle.com Tue Jan 24 14:39:58 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Tue, 24 Jan 2012 17:39:58 -0500 Subject: CRR (XXS): 7132398: G1: java.lang.IllegalArgumentException: Invalid threshold: 9223372036854775807 > max (1073741824) In-Reply-To: <4F1F1C4F.9020406@oracle.com> References: <4F1F1C4F.9020406@oracle.com> Message-ID: <4F1F333E.9020302@oracle.com> All set thanks to Bengt and John! Tony Tony Printezis wrote: > Hi all, > > Can I have a couple of code reviews for this very small change (one > line!): > > http://cr.openjdk.java.net/~tonyp/7132398/webrev.0/ > > It turns out my fix for 7078465 (G1: Don't use the undefined value > (-1) for the G1 old memory pool max size) was incomplete. The > max_size() method of the G1 old pool now returns the correct value. > However, the max is also passed to the memory pool constructor and is > somehow cached and I missed to update that. > > Tony > > From john.cuthbertson at oracle.com Tue Jan 24 16:41:11 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 24 Jan 2012 16:41:11 -0800 Subject: RFR(S): 7133038: G1: Some small profile based optimizations Message-ID: <4F1F4FA7.9080907@oracle.com> Hi There, Can I have a couple of volunteers review the changes for this CR? 
The webrev can be found at: http://cr.openjdk.java.net/~johnc/7133038/webrev.0/ Summary: While going through hardware profiles of various G1 workloads we were seeing some high data cache miss rates, and a high number of branches and branch mispredicts in some routines. These routines help to reduce those by adding prefetching and some minor code refactoring. Testing: GC test suite; jprt; specjbb2005 (to verify the profiling). Thanks, JohnC From tony.printezis at oracle.com Tue Jan 24 18:04:00 2012 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Wed, 25 Jan 2012 02:04:00 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7132398: G1: java.lang.IllegalArgumentException: Invalid threshold: 9223372036854775807 > max (1073741824) Message-ID: <20120125020404.87A1747180@hg.openjdk.java.net> Changeset: 877914d90c57 Author: tonyp Date: 2012-01-24 17:08 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/877914d90c57 7132398: G1: java.lang.IllegalArgumentException: Invalid threshold: 9223372036854775807 > max (1073741824) Summary: Was not passing the right old pool max to the memory pool constructor in the fix for 7078465. Reviewed-by: brutisso, johnc ! src/share/vm/services/g1MemoryPool.cpp From John.Coomes at oracle.com Tue Jan 24 21:26:07 2012 From: John.Coomes at oracle.com (John Coomes) Date: Tue, 24 Jan 2012 21:26:07 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS Message-ID: <20255.37487.600174.703384@oracle.com> I'd appreciate reviews of a simple change to disable AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled on the command line. 
http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive -John From john.cuthbertson at oracle.com Tue Jan 24 22:28:17 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 24 Jan 2012 22:28:17 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <20255.37487.600174.703384@oracle.com> References: <20255.37487.600174.703384@oracle.com> Message-ID: <4F1FA101.6010507@oracle.com> Hi John, Looks good to me. JohnC On 1/24/2012 9:26 PM, John Coomes wrote: > I'd appreciate reviews of a simple change to disable > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > on the command line. > > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > -John From bengt.rutisson at oracle.com Wed Jan 25 00:01:37 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 25 Jan 2012 09:01:37 +0100 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <20255.37487.600174.703384@oracle.com> References: <20255.37487.600174.703384@oracle.com> Message-ID: <4F1FB6E1.90306@oracle.com> Hi John, Looks good. One minor comment: I'd prefer the test: 1045 if (!FLAG_IS_DEFAULT(UseAdaptiveSizePolicy)) { to be: 1045 if (FLAG_IS_CMDLINE(UseAdaptiveSizePolicy)) { I think users are only interested in the warning if they actually had the switch on the command line. If hotspot turns on the flag ergonomically I think it is just confusing to customers to see the warning. And a nit: copyright year should be 2012 ;-) Finally, a question that is not directly related to your change now. But what is the plan for CMS and UseAdaptiveSizePolicy? Do we plan on fixing it or should we just remove it? If the latter is the case, is there a CR to remove it? With your change there are quite a few lines of code that are essentially dead now.
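On the FLAG_IS_DEFAULT versus FLAG_IS_CMDLINE point raised in the review above, the following is a minimal stand-alone sketch of why the two tests can disagree. It is not HotSpot code: FlagOrigin, BoolFlag, flag_is_default(), flag_is_cmdline() and should_warn() are hypothetical names invented for illustration, and the three-way origin is an assumed simplification of how a flag's value can come to be set.

```cpp
// Hypothetical, simplified model of where a VM flag's value came from.
// (Assumption for illustration only; not the real HotSpot flag machinery.)
enum class FlagOrigin { Default, CommandLine, Ergonomic };

struct BoolFlag {
  bool value;
  FlagOrigin origin;
};

// Stand-ins for the two tests discussed in the review.
inline bool flag_is_default(const BoolFlag& f) {
  return f.origin == FlagOrigin::Default;
}

inline bool flag_is_cmdline(const BoolFlag& f) {
  return f.origin == FlagOrigin::CommandLine;
}

// Warn only when the user explicitly put the switch on the command line.
// Testing !flag_is_default(f) instead would also fire for a flag that the
// VM itself turned on ergonomically.
inline bool should_warn(const BoolFlag& f) {
  return f.value && flag_is_cmdline(f);
}
```

With this model, a flag whose origin is Ergonomic is "not default" but also "not command line", which is the case where the two forms of the check give different answers.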
Bengt On 2012-01-25 06:26, John Coomes wrote: > I'd appreciate reviews of a simple change to disable > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > on the command line. > > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > -John From john.cuthbertson at oracle.com Wed Jan 25 01:12:30 2012 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Wed, 25 Jan 2012 09:12:30 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 6484965: G1: piggy-back liveness accounting phase on marking Message-ID: <20120125091235.00F804718B@hg.openjdk.java.net> Changeset: d30fa85f9994 Author: johnc Date: 2012-01-12 00:06 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d30fa85f9994 6484965: G1: piggy-back liveness accounting phase on marking Summary: Remove the separate counting phase of concurrent marking by tracking the amount of marked bytes and the cards spanned by marked objects in marking task/worker thread local data structures, which are updated as individual objects are marked. Reviewed-by: brutisso, tonyp ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/concurrentMark.inline.hpp ! src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp ! src/share/vm/gc_implementation/g1/concurrentMarkThread.hpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.hpp ! src/share/vm/gc_implementation/g1/g1EvacFailure.hpp ! src/share/vm/gc_implementation/g1/g1OopClosures.hpp ! 
src/share/vm/gc_implementation/g1/heapRegion.hpp From bengt.rutisson at oracle.com Wed Jan 25 06:02:22 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Wed, 25 Jan 2012 15:02:22 +0100 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause In-Reply-To: <4F19A0C3.1060200@oracle.com> References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com> <4F17DC5C.4050403@oracle.com> <4F1822D2.2060202@oracle.com> <4F182B65.3050701@oracle.com> <4F19A0C3.1060200@oracle.com> Message-ID: <4F200B6E.4060807@oracle.com> Tony, This looks good. One minor thing. Is it possible to rename CMRootRegions::reset()? From the name I would have expected it to kind of set its fields to NULL or something similar. But what it does is set up the CMRootRegions for the upcoming concurrent scanning. How about CMRootRegions::setup() or CMRootRegions::prepare() instead? Just a reminder (I guess you will remember to do this anyway): concurrentMark.inline.hpp // TODO: make sure we pass hr to par_mark_and_count() after merging // with John's changes. Also, about the prefetching in void ConcurrentMark::scanRootRegion(). Did you measure any performance impact of this? Why did you decide to include it? Bengt On 2012-01-20 18:13, Tony Printezis wrote: > Hi all, > > New webrev for this based on comments from John: > > http://cr.openjdk.java.net/~tonyp/7127706/webrev.3/ > > Now, the CMRootRegions::claim_next() method does not check the > ConcurrentMark::has_aborted() flag to know when to abort (and return > NULL) but, instead, a _has_aborted flag I added to the CMRootRegions > class. This is set to true at the start of a Full GC and before the > Full GC waits for the root region scan to finish. > > Additionally, I now call _root_regions.reset() from > CM::checkpointRootsInitialPost() instead of calling it explicitly from > do_collection_pause_at_safepoint().
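The claim/abort behaviour described above can be sketched roughly as follows. This is only an illustrative stand-alone model under stated assumptions, not the code in the webrev: Region, RootRegions, prepare_for_scan() and abort() are hypothetical names, and std::atomic is used here in place of whatever synchronization the real class relies on.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a heap region; only an id is needed here.
struct Region { int id; };

// Illustrative model of a set of root regions that worker threads claim
// one at a time, with a local abort flag instead of a global one.
class RootRegions {
  std::vector<Region*> _regions;
  std::atomic<std::size_t> _next{0};
  std::atomic<bool> _has_aborted{false};

public:
  // Set up for the upcoming scan. Assumed to run while no worker is
  // claiming regions (for example, inside a pause).
  void prepare_for_scan(const std::vector<Region*>& regions) {
    _regions = regions;
    _next.store(0);
    _has_aborted.store(false);
  }

  // Called by whoever wants the scan to stop early (a full collection in
  // the scenario described above).
  void abort() { _has_aborted.store(true); }

  // Each worker calls this repeatedly. Returns nullptr once the scan has
  // been aborted or every region has already been claimed.
  Region* claim_next() {
    if (_has_aborted.load()) {
      return nullptr;
    }
    std::size_t i = _next.fetch_add(1);
    return i < _regions.size() ? _regions[i] : nullptr;
  }
};
```

In this sketch a worker that has already claimed a region before abort() is called still finishes that region; only subsequent claims return nullptr, so a caller waiting for the scan to finish should not have to wait for more than the regions in flight.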
> > Tony > > On 01/19/2012 09:40 AM, Tony Printezis wrote: >> Bengt (and all), >> >> Updated webrev using "root regions" now: >> >> http://cr.openjdk.java.net/~tonyp/7127706/webrev.2/ >> >> Tony >> >> On 01/19/2012 09:04 AM, Tony Printezis wrote: >>> Bengt, >>> >>> Inline (again!) >>> >>> On 1/19/2012 4:03 AM, Bengt Rutisson wrote: >>>> >>>> Tony, >>>> >>>> On 2012-01-18 22:31, Tony Printezis wrote: >>>>> Bengt, >>>>> >>>>> Here's a webrev with the renaming: >>>>> >>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/ >>>>> >>>>> I have to say I'm not sure I really like the term "initial-mark / >>>>> IM snapshot regions". I'll try to come up with an alternative name >>>>> for them.... >>>> >>>> Looked quickly at the new webrev. >>>> >>>> I agree that IM-snapshot might not be optimal. Still I like the >>>> fact that it is not just "snapshot" since I think that can easily >>>> be confused with the SATB terms. >>> >>> Well, it's supposed to be. In SATB, anything that's reachable from >>> the "snapshot" at initial-mark time will be retained. Since we want >>> to avoid explicitly marking the survivors, we make them all >>> implicitly live and part of the "snapshot" which is why we have to >>> scan them (the same way we scan roots during the initial-mark >>> pause). So, "snapshot regions" is not unreasonable given that if we >>> were not using SATB we would not be able to do this. >>> >>>> It is of course part of the SATB snapshot, but not the whole thing. >>>> >>>> Just thinking aloud here, what about not using the word "snapshot" >>>> at all? How about "to_be_scanned_regions", "root_regions" or >>>> "concurrent_roots"? >>> >>> I like "root regions". Let's go with that (it's shorter too!). >>> >>> Tony >>> >>>> Really not sure what a good name is here...I kind of like the log >>>> message "Concurrent root scanning took 0.000x ms". 
>>>> >>>> Bengt >>>> >>>>> >>>>> Tony >>>>> >>>>> Tony Printezis wrote: >>>>>> Hi Bengt, >>>>>> >>>>>> Thanks for looking at this so quickly! Inline. >>>>>> >>>>>> Bengt Rutisson wrote: >>>>>>> >>>>>>> Tony, >>>>>>> >>>>>>> Overall this looks really good. Thanks for fixing it. >>>>>>> >>>>>>> Some comments: >>>>>>> >>>>>>> First, a general question regarding naming and logging. We now >>>>>>> talk about "snapshot" a lot. It is a pretty good name, but maybe >>>>>>> it needs some more context to be understandable in the code and >>>>>>> the GC log. I don't have any really good names, but maybe >>>>>>> "survivor_snapshot" >>>>>> >>>>>> I'd rather not mention "survivors" given that we might add >>>>>> non-survivor regions in the future. >>>>>> >>>>>>> or "initial_mark_snapshot"? >>>>>> >>>>>> I like "initial-mark snapshot" better. Having said that >>>>>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions >>>>>> are kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and >>>>>> _im_snapshot_regions if that's OK. >>>>>> >>>>>>> concurrentMark.inline.hpp >>>>>>> >>>>>>> if (hr == NULL) { >>>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>>> // Given that we're looking for a region that contains an >>>>>>> object >>>>>>> // header it's impossible to get back a HC region. >>>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>>>> } else { >>>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>>> } >>>>>>> >>>>>>> The first assert should probably hold even for regions that are >>>>>>> passed in to grayRoot() right? So, maybe something like: >>>>>>> >>>>>>> if (hr == NULL) { >>>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>>> } else { >>>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>>> } >>>>>>> // Given that we need a region that contains an object >>>>>>> // header it's impossible for it to be a HC region. >>>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>>> >>>>>> Good observation! I changed to the above. 
>>>>>> >>>>>>> concurrentMarkThread.cpp >>>>>>> >>>>>>> ConcurrentMarkThread::run() >>>>>>> >>>>>>> Why do we do the explicit time/date stamping? >>>>>>> >>>>>>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>>>>>> gclog_or_tty->stamp(PrintGCTimeStamps); >>>>>>> gclog_or_tty->print_cr("[GC >>>>>>> concurrent-snapshot-scan-start]"); >>>>>>> >>>>>>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>>>>>> information? >>>>>> >>>>>> Not quite sure what you mean with "is it not enough with the >>>>>> normal ... information". Each log record needs either a GC time >>>>>> stamp or a GC date stamp and we have to print either or both >>>>>> depending on the two -XX parameters. Unfortunately, the logging >>>>>> code has not been well abstracted and/or refactored so we have >>>>>> this unfortunate replication throughout the GCs. >>>>>> >>>>>>> This is probably correct since I see this pattern in other >>>>>>> places. But I would like to understand why we do it. >>>>>>> >>>>>>> >>>>>>> g1CollectedHeap.cpp: >>>>>>> >>>>>>> G1CollectedHeap::do_collection() >>>>>>> >>>>>>> Is it worth logging how long we had to wait in >>>>>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same >>>>>>> way that we do in >>>>>>> G1CollectedHeap::do_collection_pause_at_safepoint()? >>>>>> >>>>>> Currently, the GC log records for the evacuation pauses have a >>>>>> lot of extra information when +PrintGCDetails is set and it was >>>>>> reasonable to add an extra record with the wait time. And it's >>>>>> more important to know how the wait for snapshot region scanning >>>>>> affects evacuation pauses, which are more critical. The Full GC >>>>>> log records are currently one line and I don't think we want to >>>>>> extend them further (at least, not before we put a decent GC >>>>>> logging framework in place). On the other hand, the snapshot >>>>>> region scanning aborts when a marking cycle is aborted due to a >>>>>> Full GC. So, this wait time should not be long. 
How about I add a >>>>>> comment in the code saying that, when we introduce a more >>>>>> extensible logging framework, we could add the wait time to the >>>>>> Full GC log records? Something like: >>>>>> >>>>>> // Note: When we have a more flexible GC logging framework that >>>>>> // allows us to add optional attributes to a GC log record we >>>>>> // could consider timing and reporting how long we wait in the >>>>>> // following two methods. >>>>>> wait_while_free_regions_coming(); >>>>>> // ... >>>>>> _cm->snapshot_regions()->wait_until_scan_finished(); >>>>>> >>>>>> >>>>>>> Finally, just some food for thought. Could this be generalized >>>>>>> to more roots? I mean take a snapshot and scan it concurrently. >>>>>> >>>>>> By scanning the IM snapshot regions we say "these guys are all >>>>>> roots, instead of scanning them during the GC we will scan them >>>>>> concurrently". And we can do that for any object / region a) as >>>>>> long as we know they will not move while they are being scanned >>>>>> and b) because we have the pre-barrier. If any references on the >>>>>> snapshot objects are updated, the pre-barrier ensures that their >>>>>> values at the start of marking will be enqueued and processed. >>>>>> >>>>>> For external arbitrary roots we don't have a write barrier (and >>>>>> we shouldn't as it'd be too expensive). So, we cannot do that for >>>>>> non-object roots without a "pre-barrier"-type mechanism. >>>>>> >>>>>> Tony >>>>>> >>>>>> >>>>>>> Bengt >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2012-01-18 00:48, Tony Printezis wrote: >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Can I have a couple of code reviews for this change that >>>>>>>> re-enables the use of survivor regions during the initial-mark >>>>>>>> pause? 
>>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>>>>>> >>>>>>>> From the CR: >>>>>>>> >>>>>>>> We could scan the survivors as we're copying them, however this >>>>>>>> will require more work during the initial-mark GCs (and in >>>>>>>> particular: special-case code in the fast path). >>>>>>>> >>>>>>>> A better approach is to let the concurrent marking threads scan >>>>>>>> the survivors and mark everything reachable from them a) before >>>>>>>> any more concurrent marking work is done (so that we can just >>>>>>>> mark the objects, without needing to push them on a stack, and >>>>>>>> let the "finger" algorithm discover them) and b) before the >>>>>>>> next GC starts (since, if we copy them, we won't know which of >>>>>>>> the new survivors are the ones we need to scan). >>>>>>>> >>>>>>>> This approach has the advantage that it does not require any >>>>>>>> extra work during the initial-mark GCs and all the work is done >>>>>>>> by the concurrent marking threads. However, it has the >>>>>>>> disadvantage that the survivor scanning might hold up the next >>>>>>>> GC. In most cases this should not be an issue as GCs take place >>>>>>>> at a reasonably low rate. 
If it does become a problem we could >>>>>>>> consider the following: >>>>>>>> >>>>>>>> - like when the GC locker is active, try to extend the eden to >>>>>>>> give a bit more time to the marking threads to finish scanning >>>>>>>> the survivors >>>>>>>> - instead of waiting for the marking threads, a GC can take >>>>>>>> over and finish up scanning the remaining survivors (typically, >>>>>>>> we have more GC threads than marking threads, so the overhead >>>>>>>> will be reduced) >>>>>>>> - if we supported region pinning, we could pin all the regions >>>>>>>> that were not scanned by the time the GC started so that the >>>>>>>> marking threads can resume scanning them after the GC completes >>>>>>>> >>>>>>>> Implementation notes: >>>>>>>> >>>>>>>> I introduced the concept of a "snapshot regions" in the >>>>>>>> ConcurrentMark which is a set of regions that need to be >>>>>>>> scanned at the start of a concurrent cycle. Currently, these >>>>>>>> can only be survivors but maybe we can use the same concept for >>>>>>>> something else in the future. >>>>>>>> >>>>>>>> Tony >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> From jon.masamitsu at oracle.com Wed Jan 25 07:01:32 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jan 2012 07:01:32 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <20255.37487.600174.703384@oracle.com> References: <20255.37487.600174.703384@oracle.com> Message-ID: <4F20194C.7020902@oracle.com> What happens with SerialGC? On 1/24/2012 9:26 PM, John Coomes wrote: > I'd appreciate reviews of a simple change to disable > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > on the command line. 
> > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > -John From jon.masamitsu at oracle.com Wed Jan 25 07:11:46 2012 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Wed, 25 Jan 2012 07:11:46 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <4F1FB6E1.90306@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FB6E1.90306@oracle.com> Message-ID: <4F201BB2.6070406@oracle.com> On 1/25/2012 12:01 AM, Bengt Rutisson wrote: > > ... > > Finally, a question that is not directly related to your change now. > But what is the plan for CMS and UseAdaptiveSizePolicy? Do we plan on > fixing it or should we just remove it? If the latter is the case, is > there a CR to remove it? With your change there is quite a few lines > of code that are essentially dead now. If you do a CR to remove the UseAdaptiveSizePolicy for CMS, please do not remove the code from ParNew and SerialGC. I think client side GC ergo is in our future. I'd also suggest making the CR dependent on G1 replacing CMS. Just on the decision that we will be removing CMS, not waiting for the actual removal of the CMS code. Jon > > Bengt > > On 2012-01-25 06:26, John Coomes wrote: >> I'd appreciate reviews of a simple change to disable >> AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled >> on the command line.
>> >> http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive >> >> -John > From tony.printezis at oracle.com Wed Jan 25 09:11:15 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Wed, 25 Jan 2012 12:11:15 -0500 Subject: CRR (M): 7127706: G1: re-enable survivors during the initial-mark pause In-Reply-To: <4F200B6E.4060807@oracle.com> References: <4F1608C0.2040601@oracle.com> <4F16D5F7.4070605@oracle.com> <4F1730D6.1070004@oracle.com> <4F173A47.9040705@oracle.com> <4F17DC5C.4050403@oracle.com> <4F1822D2.2060202@oracle.com> <4F182B65.3050701@oracle.com> <4F19A0C3.1060200@oracle.com> <4F200B6E.4060807@oracle.com> Message-ID: <4F2037B3.9020302@oracle.com> Hi Bengt, Thanks for looking at this once more. Inline. On 01/25/2012 09:02 AM, Bengt Rutisson wrote: > > Tony, > > This looks good. > > One minor thing. Is it possible to rename CMRootRegions::reset() ? > From the name I would have expected it to kind of set its fields to > NULL or something similar. But what it does is set up the CMRootRegion > for the upcoming concurrent scanning. How about CMRootRegions::setup() > or CMRootRegions::prepare() instead? I changed it to prepare_for_scan() > Just a reminder (I guess you will remember to do this anyway): > > concurrentMark.inline.hpp > // TODO: make sure we pass hr to par_mark_and_count() after merging > // with John's changes. Yep, merged with John's changes and that comment is now gone. > Also, about the prefetching in void ConcurrentMark::scanRootRegion(). > Did you measure any performance impact of this? Why did you decide to > include it? I have to say I didn't do any performance measurements on this. But, typically, when iterating over a large part of the heap, prefetching pays off. 
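(Illustration only, not taken from the webrev.) The general idea of prefetching ahead while walking a large stretch of memory linearly can be sketched roughly as below. This is not the code in ConcurrentMark::scanRootRegion(); FakeObject, kPrefetchDistance and scan_region_with_prefetch are made-up names, the distance of 8 is an arbitrary guess, and __builtin_prefetch is the GCC/Clang builtin rather than HotSpot's own Prefetch wrappers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an object laid out in a heap region; not a HotSpot type.
struct FakeObject {
  std::size_t payload;
};

// How many elements ahead to prefetch. Arbitrary value, chosen for illustration only.
static const std::size_t kPrefetchDistance = 8;

// Walks the "region" front to back and sums the payloads. Before touching
// element i it asks the hardware to start loading element i + kPrefetchDistance,
// so that the data is (hopefully) already in cache when the loop gets there.
inline std::size_t scan_region_with_prefetch(const std::vector<FakeObject>& region) {
  assert(kPrefetchDistance > 0);
  std::size_t sum = 0;
  const std::size_t n = region.size();
  for (std::size_t i = 0; i < n; ++i) {
    if (i + kPrefetchDistance < n) {
#if defined(__GNUC__) || defined(__clang__)
      // Hint only: second argument 0 = read access, third argument 1 = low temporal locality.
      __builtin_prefetch(&region[i + kPrefetchDistance], 0, 1);
#endif
    }
    sum += region[i].payload;
  }
  return sum;
}
```

The prefetch is purely a hint, so the result is the same with or without it; whether it actually helps depends on the access pattern and the hardware, which is why it would need measuring.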
Tony > Bengt > > On 2012-01-20 18:13, Tony Printezis wrote: >> Hi all, >> >> New webrev for this based on comments from John: >> >> http://cr.openjdk.java.net/~tonyp/7127706/webrev.3/ >> >> Now, the CMRootRegions::claim_next() method does not check the >> ConcurrentMark::has_aborted() flag to know when to abort (and return >> NULL) but, instead, a _has_aborted flag I added to the CMRootRegions >> class. This is set to true at the start of a Full GC and before the >> Full GC waits for the root region scan to finish. >> >> Additionally, I now call _root_regions.reset() from >> CM::checkpointRootsInitialPost()instead of calling it explicitly from >> do_collection_pause_at_safepoint(). >> >> Tony >> >> On 01/19/2012 09:40 AM, Tony Printezis wrote: >>> Bengt (and all), >>> >>> Updated webrev using "root regions" now: >>> >>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.2/ >>> >>> Tony >>> >>> On 01/19/2012 09:04 AM, Tony Printezis wrote: >>>> Bengt, >>>> >>>> Inline (again!) >>>> >>>> On 1/19/2012 4:03 AM, Bengt Rutisson wrote: >>>>> >>>>> Tony, >>>>> >>>>> On 2012-01-18 22:31, Tony Printezis wrote: >>>>>> Bengt, >>>>>> >>>>>> Here's a webrev with the renaming: >>>>>> >>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.1/ >>>>>> >>>>>> I have to say I'm not sure I really like the term "initial-mark / >>>>>> IM snapshot regions". I'll try to come up with an alternative >>>>>> name for them.... >>>>> >>>>> Looked quickly at the new webrev. >>>>> >>>>> I agree that IM-snapshot might not be optimal. Still I like the >>>>> fact that it is not just "snapshot" since I think that can easily >>>>> be confused with the SATB terms. >>>> >>>> Well, it's supposed to be. In SATB, anything that's reachable from >>>> the "snapshot" at initial-mark time will be retained. 
Since we want >>>> to avoid explicitly marking the survivors, we make them all >>>> implicitly live and part of the "snapshot" which is why we have to >>>> scan them (the same way we scan roots during the initial-mark >>>> pause). So, "snapshot regions" is not unreasonable given that if we >>>> were not using SATB we would not be able to do this. >>>> >>>>> It is of course part of the SATB snapshot, but not the whole thing. >>>>> >>>>> Just thinking aloud here, what about not using the word "snapshot" >>>>> at all? How about "to_be_scanned_regions", "root_regions" or >>>>> "concurrent_roots"? >>>> >>>> I like "root regions". Let's go with that (it's shorter too!). >>>> >>>> Tony >>>> >>>>> Really not sure what a good name is here...I kind of like the log >>>>> message "Concurrent root scanning took 0.000x ms". >>>>> >>>>> Bengt >>>>> >>>>>> >>>>>> Tony >>>>>> >>>>>> Tony Printezis wrote: >>>>>>> Hi Bengt, >>>>>>> >>>>>>> Thanks for looking at this so quickly! Inline. >>>>>>> >>>>>>> Bengt Rutisson wrote: >>>>>>>> >>>>>>>> Tony, >>>>>>>> >>>>>>>> Overall this looks really good. Thanks for fixing it. >>>>>>>> >>>>>>>> Some comments: >>>>>>>> >>>>>>>> First, a general question regarding naming and logging. We now >>>>>>>> talk about "snapshot" a lot. It is a pretty good name, but >>>>>>>> maybe it needs some more context to be understandable in the >>>>>>>> code and the GC log. I don't have any really good names, but >>>>>>>> maybe "survivor_snapshot" >>>>>>> >>>>>>> I'd rather not mention "survivors" given that we might add >>>>>>> non-survivor regions in the future. >>>>>>> >>>>>>>> or "initial_mark_snapshot"? >>>>>>> >>>>>>> I like "initial-mark snapshot" better. Having said that >>>>>>> CMInitialMarkSnapshotRegions and _initial_mark_snapshot_regions >>>>>>> are kinda long. :-) I'll abbreviate to CMIMSnapshotRegions and >>>>>>> _im_snapshot_regions if that's OK. 
>>>>>>> >>>>>>>> concurrentMark.inline.hpp >>>>>>>> >>>>>>>> if (hr == NULL) { >>>>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>>>> // Given that we're looking for a region that contains an >>>>>>>> object >>>>>>>> // header it's impossible to get back a HC region. >>>>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>>>>> } else { >>>>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>>>> } >>>>>>>> >>>>>>>> The first assert should probably hold even for regions that are >>>>>>>> passed in to grayRoot() right? So, maybe something like: >>>>>>>> >>>>>>>> if (hr == NULL) { >>>>>>>> hr = _g1h->heap_region_containing_raw(addr); >>>>>>>> } else { >>>>>>>> assert(hr->is_in(addr), "pre-condition"); >>>>>>>> } >>>>>>>> // Given that we need a region that contains an object >>>>>>>> // header it's impossible for it to be a HC region. >>>>>>>> assert(!hr->continuesHumongous(), "sanity"); >>>>>>> >>>>>>> Good observation! I changed to the above. >>>>>>> >>>>>>>> concurrentMarkThread.cpp >>>>>>>> >>>>>>>> ConcurrentMarkThread::run() >>>>>>>> >>>>>>>> Why do we do the explicit time/date stamping? >>>>>>>> >>>>>>>> gclog_or_tty->date_stamp(PrintGCDateStamps); >>>>>>>> gclog_or_tty->stamp(PrintGCTimeStamps); >>>>>>>> gclog_or_tty->print_cr("[GC >>>>>>>> concurrent-snapshot-scan-start]"); >>>>>>>> >>>>>>>> why is it not enough with the normal -XX:+PrintGCTimeStamps >>>>>>>> information? >>>>>>> >>>>>>> Not quite sure what you mean with "is it not enough with the >>>>>>> normal ... information". Each log record needs either a GC time >>>>>>> stamp or a GC date stamp and we have to print either or both >>>>>>> depending on the two -XX parameters. Unfortunately, the logging >>>>>>> code has not been well abstracted and/or refactored so we have >>>>>>> this unfortunate replication throughout the GCs. >>>>>>> >>>>>>>> This is probably correct since I see this pattern in other >>>>>>>> places. But I would like to understand why we do it. 
>>>>>>>> >>>>>>>> >>>>>>>> g1CollectedHeap.cpp: >>>>>>>> >>>>>>>> G1CollectedHeap::do_collection() >>>>>>>> >>>>>>>> Is it worth logging how long we had to wait in >>>>>>>> _cm->snapshot_regions()->wait_until_scan_finished(), the same >>>>>>>> way that we do in >>>>>>>> G1CollectedHeap::do_collection_pause_at_safepoint()? >>>>>>> >>>>>>> Currently, the GC log records for the evacuation pauses have a >>>>>>> lot of extra information when +PrintGCDetails is set and it was >>>>>>> reasonable to add an extra record with the wait time. And it's >>>>>>> more important to know how the wait for snapshot region scanning >>>>>>> affects evacuation pauses, which are more critical. The Full GC >>>>>>> log records are currently one line and I don't think we want to >>>>>>> extend them further (at least, not before we put a decent GC >>>>>>> logging framework in place). On the other hand, the snapshot >>>>>>> region scanning aborts when a marking cycle is aborted due to a >>>>>>> Full GC. So, this wait time should not be long. How about I add >>>>>>> a comment in the code saying that, when we introduce a more >>>>>>> extensible logging framework, we could add the wait time to the >>>>>>> Full GC log records? Something like: >>>>>>> >>>>>>> // Note: When we have a more flexible GC logging framework that >>>>>>> // allows us to add optional attributes to a GC log record we >>>>>>> // could consider timing and reporting how long we wait in the >>>>>>> // following two methods. >>>>>>> wait_while_free_regions_coming(); >>>>>>> // ... >>>>>>> _cm->snapshot_regions()->wait_until_scan_finished(); >>>>>>> >>>>>>> >>>>>>>> Finally, just some food for thought. Could this be generalized >>>>>>>> to more roots? I mean take a snapshot and scan it concurrently. >>>>>>> >>>>>>> By scanning the IM snapshot regions we say "these guys are all >>>>>>> roots, instead of scanning them during the GC we will scan them >>>>>>> concurrently". 
And we can do that for any object / region a) as >>>>>>> long as we know they will not move while they are being scanned >>>>>>> and b) because we have the pre-barrier. If any references on the >>>>>>> snapshot objects are updated, the pre-barrier ensures that their >>>>>>> values at the start of marking will be enqueued and processed. >>>>>>> >>>>>>> For external arbitrary roots we don't have a write barrier (and >>>>>>> we shouldn't as it'd be too expensive). So, we cannot do that >>>>>>> for non-object roots without a "pre-barrier"-type mechanism. >>>>>>> >>>>>>> Tony >>>>>>> >>>>>>> >>>>>>>> Bengt >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On 2012-01-18 00:48, Tony Printezis wrote: >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Can I have a couple of code reviews for this change that >>>>>>>>> re-enables the use of survivor regions during the initial-mark >>>>>>>>> pause? >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~tonyp/7127706/webrev.0/ >>>>>>>>> >>>>>>>>> From the CR: >>>>>>>>> >>>>>>>>> We could scan the survivors as we're copying them, however >>>>>>>>> this will require more work during the initial-mark GCs (and >>>>>>>>> in particular: special-case code in the fast path). >>>>>>>>> >>>>>>>>> A better approach is to let the concurrent marking threads >>>>>>>>> scan the survivors and mark everything reachable from them a) >>>>>>>>> before any more concurrent marking work is done (so that we >>>>>>>>> can just mark the objects, without needing to push them on a >>>>>>>>> stack, and let the "finger" algorithm discover them) and b) >>>>>>>>> before the next GC starts (since, if we copy them, we won't >>>>>>>>> know which of the new survivors are the ones we need to scan). >>>>>>>>> >>>>>>>>> This approach has the advantage that it does not require any >>>>>>>>> extra work during the initial-mark GCs and all the work is >>>>>>>>> done by the concurrent marking threads. 
However, it has the >>>>>>>>> disadvantage that the survivor scanning might hold up the next >>>>>>>>> GC. In most cases this should not be an issue as GCs take >>>>>>>>> place at a reasonably low rate. If it does become a problem we >>>>>>>>> could consider the following: >>>>>>>>> >>>>>>>>> - like when the GC locker is active, try to extend the eden to >>>>>>>>> give a bit more time to the marking threads to finish scanning >>>>>>>>> the survivors >>>>>>>>> - instead of waiting for the marking threads, a GC can take >>>>>>>>> over and finish up scanning the remaining survivors >>>>>>>>> (typically, we have more GC threads than marking threads, so >>>>>>>>> the overhead will be reduced) >>>>>>>>> - if we supported region pinning, we could pin all the regions >>>>>>>>> that were not scanned by the time the GC started so that the >>>>>>>>> marking threads can resume scanning them after the GC completes >>>>>>>>> >>>>>>>>> Implementation notes: >>>>>>>>> >>>>>>>>> I introduced the concept of a "snapshot regions" in the >>>>>>>>> ConcurrentMark which is a set of regions that need to be >>>>>>>>> scanned at the start of a concurrent cycle. Currently, these >>>>>>>>> can only be survivors but maybe we can use the same concept >>>>>>>>> for something else in the future. >>>>>>>>> >>>>>>>>> Tony >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> > From tony.printezis at oracle.com Wed Jan 25 16:29:37 2012 From: tony.printezis at oracle.com (tony.printezis at oracle.com) Date: Thu, 26 Jan 2012 00:29:37 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7127706: G1: re-enable survivors during the initial-mark pause Message-ID: <20120126002948.71B2C471AE@hg.openjdk.java.net> Changeset: eff609af17d7 Author: tonyp Date: 2012-01-25 12:58 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/eff609af17d7 7127706: G1: re-enable survivors during the initial-mark pause Summary: Re-enable survivors during the initial-mark pause. 
Afterwards, the concurrent marking threads have to scan them and mark everything reachable from them. The next GC will have to wait for the survivors to be scanned. Reviewed-by: brutisso, johnc ! src/share/vm/gc_implementation/g1/concurrentMark.cpp ! src/share/vm/gc_implementation/g1/concurrentMark.hpp ! src/share/vm/gc_implementation/g1/concurrentMark.inline.hpp ! src/share/vm/gc_implementation/g1/concurrentMarkThread.cpp ! src/share/vm/gc_implementation/g1/g1CollectedHeap.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.cpp ! src/share/vm/gc_implementation/g1/g1CollectorPolicy.hpp ! src/share/vm/gc_implementation/g1/g1EvacFailure.hpp ! src/share/vm/gc_implementation/g1/g1OopClosures.hpp ! src/share/vm/gc_implementation/g1/g1OopClosures.inline.hpp ! src/share/vm/gc_implementation/g1/heapRegion.inline.hpp ! src/share/vm/runtime/mutexLocker.cpp ! src/share/vm/runtime/mutexLocker.hpp From John.Coomes at oracle.com Wed Jan 25 16:31:44 2012 From: John.Coomes at oracle.com (John Coomes) Date: Wed, 25 Jan 2012 16:31:44 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <4F1FA101.6010507@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FA101.6010507@oracle.com> Message-ID: <20256.40688.393980.209881@oracle.com> John Cuthbertson (john.cuthbertson at oracle.com) wrote: > Hi John, > > Looks good to me. Many thanks for looking at it. -John > On 1/24/2012 9:26 PM, John Coomes wrote: > > I'd appreciate reviews of a simple change to disable > > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > > on the command line. 
> > > > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > > > -John > From John.Coomes at oracle.com Wed Jan 25 16:32:08 2012 From: John.Coomes at oracle.com (John Coomes) Date: Wed, 25 Jan 2012 16:32:08 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <4F1FB6E1.90306@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FB6E1.90306@oracle.com> Message-ID: <20256.40712.465298.881929@oracle.com> Bengt Rutisson (bengt.rutisson at oracle.com) wrote: > > Hi John, > > Looks good. > > One minor comment: > > I'd prefer the test: > > 1045 if (!FLAG_IS_DEFAULT(UseAdaptiveSizePolicy)) { > > to be: > > 1045 if (FLAG_IS_CMDLINE(UseAdaptiveSizePolicy)) { > > I think users are only interested in the warning if they actually had > the switch on the command line. If hotspot turns on the flag > ergonomically I think it is just confusing to customers to see the warning. Thanks for the review. I'll make that change; it's more future-proof. > > And a nit: copyright year should be 2012 ;-) Will fix that too. > Finally, a question that is not directly related to your change now. But > what is the plan for CMS and UseAdaptiveSizePolicy? Do we plan on fixing > it or should we just remove it? If the latter is the case, is there a CR > to remove it? With your change there is quite a few lines of code that > are essentially dead now. I think this is a longer-term discussion, tied to the future of CMS. Good topic for a future meeting :-). -John > On 2012-01-25 06:26, John Coomes wrote: > > I'd appreciate reviews of a simple change to disable > > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > > on the command line. 
> > > > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > > > -John > From John.Coomes at oracle.com Wed Jan 25 17:44:51 2012 From: John.Coomes at oracle.com (John Coomes) Date: Wed, 25 Jan 2012 17:44:51 -0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <4F20194C.7020902@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F20194C.7020902@oracle.com> Message-ID: <20256.45075.702816.886922@oracle.com> Jon Masamitsu (jon.masamitsu at oracle.com) wrote: > What happens with SerialGC? Nothing changes for SerialGC. ParNew and CMS had existing code to disable AdaptiveSizePolicy, unless it was enabled on the command line. I simply changed those places to disable it unconditionally. -John > On 1/24/2012 9:26 PM, John Coomes wrote: > > I'd appreciate reviews of a simple change to disable > > AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled > > on the command line. > > > > http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive > > > > -John From bengt.rutisson at oracle.com Thu Jan 26 00:00:21 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jan 2012 09:00:21 +0100 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <20256.40712.465298.881929@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FB6E1.90306@oracle.com> <20256.40712.465298.881929@oracle.com> Message-ID: <4F210815.2020301@oracle.com> John, Inline. On 2012-01-26 01:32, John Coomes wrote: > Bengt Rutisson (bengt.rutisson at oracle.com) wrote: >> Hi John, >> >> Looks good. >> >> One minor comment: >> >> I'd prefer the test: >> >> 1045 if (!FLAG_IS_DEFAULT(UseAdaptiveSizePolicy)) { >> >> to be: >> >> 1045 if (FLAG_IS_CMDLINE(UseAdaptiveSizePolicy)) { >> >> I think users are only interested in the warning if they actually had >> the switch on the command line. 
If hotspot turns on the flag >> ergonomically I think it is just confusing to customers to see the warning. > Thanks for the review. I'll make that change; it's more future-proof. Great! >> And a nit: copyright year should be 2012 ;-) > Will fix that too. :-) > >> Finally, a question that is not directly related to your change now. But >> what is the plan for CMS and UseAdaptiveSizePolicy? Do we plan on fixing >> it or should we just remove it? If the latter is the case, is there a CR >> to remove it? With your change there is quite a few lines of code that >> are essentially dead now. > I think this is a longer-term discussion, tied to the future of CMS. > Good topic for a future meeting :-). I agree. We should include this topic when we discuss the GC technical roadmap. Ship it! Bengt > > -John > >> On 2012-01-25 06:26, John Coomes wrote: >>> I'd appreciate reviews of a simple change to disable >>> AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled >>> on the command line. >>> >>> http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive >>> >>> -John From bengt.rutisson at oracle.com Thu Jan 26 00:06:01 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jan 2012 09:06:01 +0100 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <4F201BB2.6070406@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FB6E1.90306@oracle.com> <4F201BB2.6070406@oracle.com> Message-ID: <4F210969.4050308@oracle.com> Jon, Thanks for your feedback! On 2012-01-25 16:11, Jon Masamitsu wrote: > > > On 1/25/2012 12:01 AM, Bengt Rutisson wrote: >> >> ... >> >> Finally, a question that is not directly related to your change now. >> But what is the plan for CMS and UseAdaptiveSizePolicy? Do we plan on >> fixing it or should we just remove it? If the latter is the case, is >> there a CR to remove it? With your change there is quite a few lines >> of code that are essentially dead now. 
> > If you do a CR to remove the UseAdaptiveSizePolicy for CMS, please do > not remove the code > from ParNew and SerialGC. I think client side GC ergo is in our > future. I'd also suggest > making the CR dependent on G1 replacing CMS. Just on the decision > that we will be > removing CMS, not waiting for the actual removal of the CMS code. Good point. My intent was not to file a CR, but to find out what the plans are. Let's do what John Coomes suggested and include this topic in our discussions around the future path for the GC code. Bengt > > Jon > >> >> Bengt >> >> On 2012-01-25 06:26, John Coomes wrote: >>> I'd appreciate reviews of a simple change to disable >>> AdaptiveSizePolicy with CMS and/or ParNew, even if it has been enabled >>> on the command line. >>> >>> http://cr.openjdk.java.net/~jcoomes/7112413-cms-adaptive >>> >>> -John >> From john.coomes at oracle.com Thu Jan 26 02:17:18 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Thu, 26 Jan 2012 10:17:18 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7112413: JVM Crash, possibly GC-related Message-ID: <20120126101721.51558471C8@hg.openjdk.java.net> Changeset: a5244e07b761 Author: jcoomes Date: 2012-01-25 21:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a5244e07b761 7112413: JVM Crash, possibly GC-related Summary: disable UseAdaptiveSizePolicy with the CMS and ParNew Reviewed-by: johnc, brutisso ! src/share/vm/runtime/arguments.cpp From tony.printezis at oracle.com Thu Jan 26 11:52:39 2012 From: tony.printezis at oracle.com (Tony Printezis) Date: Thu, 26 Jan 2012 14:52:39 -0500 Subject: CRR (S): 7129892: G1: explicit marking cycle initiation might fail to initiate a marking cycle Message-ID: <4F21AF07.4090401@oracle.com> Hi all, Can I please have a couple of code reviews for the following change?
http://cr.openjdk.java.net/~tonyp/7129892/webrev.1/ The issue is that a GC attempt that's supposed to explicitly start a concurrent marking cycle might be unsuccessful (as another GC might get scheduled first), which will prevent the cycle from starting. The idea is to retry such unsuccessful attempts. I also changed should_do_concurrent_full_gc() from an if-statement to a switch statement. I discussed it with Bengt (the last person to modify that method) and we both think that the switch statement is more readable. BTW, I did a couple of iterations of this fix to address the slightly different approach Bengt took in the cycle initiation after humongous allocation code (i.e., start the cycle before the allocation, not after). In the end the current version of retry_unsuccessful_concurrent_full_gc(), a new method I added, always returns true for all causes. I'm inclined to leave the switch in, even just for the comments per cause. I could be persuaded to replace it with a return true; statement though. Tony From john.cuthbertson at oracle.com Thu Jan 26 12:42:01 2012 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Thu, 26 Jan 2012 12:42:01 -0800 Subject: RFR(S): 7133038: G1: Some small profile based optimizations In-Reply-To: <4F1F4FA7.9080907@oracle.com> References: <4F1F4FA7.9080907@oracle.com> Message-ID: <4F21BA99.3040007@oracle.com> Hi All, Based on review feedback, I've stripped out the prefetching changes and specialized an additional closure. The prefetching changes will be deferred until further performance testing with the other collectors has been performed. The new webrev can be found at: http://cr.openjdk.java.net/~johnc/7133038/webrev.1/ Testing: GC test suite. Thanks, JohnC On 01/24/12 16:41, John Cuthbertson wrote: > Hi There, > > Can I have a couple of volunteers review the changes for this CR?
The > webrev can be found at: > http://cr.openjdk.java.net/~johnc/7133038/webrev.0/ > > Summary: > While going through hardware profiles of various G1 workloads we were > seeing some high data cache miss rates, and a high number of branches > and branch mispredicts in some routines. These changes help to reduce > those by adding prefetching and some minor code refactoring. > > Testing: GC test suite; jprt; specjbb2005 (to verify the profiling). > > Thanks, > > JohnC From bengt.rutisson at oracle.com Thu Jan 26 13:05:41 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Thu, 26 Jan 2012 22:05:41 +0100 Subject: RFR(S): 7133038: G1: Some small profile based optimizations In-Reply-To: <4F21BA99.3040007@oracle.com> References: <4F1F4FA7.9080907@oracle.com> <4F21BA99.3040007@oracle.com> Message-ID: <4F21C025.5010909@oracle.com> Hi John, Looks good! Ship it! Bengt On 2012-01-26 21:42, John Cuthbertson wrote: > Hi All, > > Based on review feedback, I've stripped out the prefetching > changes and specialized an additional closure. The prefetching changes > will be deferred until further performance testing with the other > collectors has been performed. The new webrev can be found at: > http://cr.openjdk.java.net/~johnc/7133038/webrev.1/ > > Testing: GC test suite. > > Thanks, > > JohnC > > On 01/24/12 16:41, John Cuthbertson wrote: >> Hi There, >> >> Can I have a couple of volunteers review the changes for this CR? The >> webrev can be found at: >> http://cr.openjdk.java.net/~johnc/7133038/webrev.0/ >> >> Summary: >> While going through hardware profiles of various G1 workloads we were >> seeing some high data cache miss rates, and a high number of branches >> and branch mispredicts in some routines. These changes help to >> reduce those by adding prefetching and some minor code refactoring. >> >> Testing: GC test suite; jprt; specjbb2005 (to verify the profiling).
>> >> Thanks, >> >> JohnC > From john.cuthbertson at oracle.com Thu Jan 26 17:02:51 2012 From: john.cuthbertson at oracle.com (john.cuthbertson at oracle.com) Date: Fri, 27 Jan 2012 01:02:51 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7133038: G1: Some small profile based optimizations Message-ID: <20120127010253.9D22E471E9@hg.openjdk.java.net> Changeset: b4ebad3520bb Author: johnc Date: 2012-01-26 14:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b4ebad3520bb 7133038: G1: Some small profile based optimizations Summary: Some minor profile based optimizations. Reduce the number of branches and branch mispredicts by removing some virtual calls, through closure specalization, and refactoring some conditional statements. Reviewed-by: brutisso, tonyp ! src/share/vm/gc_implementation/g1/g1OopClosures.hpp ! src/share/vm/gc_implementation/g1/g1OopClosures.inline.hpp ! src/share/vm/gc_implementation/g1/g1RemSet.cpp ! src/share/vm/gc_implementation/g1/g1RemSet.hpp ! src/share/vm/gc_implementation/g1/g1RemSet.inline.hpp ! src/share/vm/gc_implementation/g1/g1_specialized_oop_closures.hpp ! src/share/vm/gc_implementation/g1/heapRegion.cpp From john.coomes at oracle.com Thu Jan 26 20:47:47 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 04:47:47 +0000 Subject: hg: hsx/hotspot-gc: Added tag jdk8-b23 for changeset 60d6f64a86b1 Message-ID: <20120127044747.EA1974720E@hg.openjdk.java.net> Changeset: 1a5f1d6b98d6 Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/rev/1a5f1d6b98d6 Added tag jdk8-b23 for changeset 60d6f64a86b1 ! 
.hgtags From john.coomes at oracle.com Thu Jan 26 20:47:54 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 04:47:54 +0000 Subject: hg: hsx/hotspot-gc/corba: Added tag jdk8-b23 for changeset 5218eb256658 Message-ID: <20120127044756.99C6E4720F@hg.openjdk.java.net> Changeset: b98f0e6dddf9 Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/corba/rev/b98f0e6dddf9 Added tag jdk8-b23 for changeset 5218eb256658 ! .hgtags From john.coomes at oracle.com Thu Jan 26 20:48:03 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 04:48:03 +0000 Subject: hg: hsx/hotspot-gc/jaxp: Added tag jdk8-b23 for changeset 95102fd33418 Message-ID: <20120127044803.7A0E447210@hg.openjdk.java.net> Changeset: 7836655e2495 Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxp/rev/7836655e2495 Added tag jdk8-b23 for changeset 95102fd33418 ! .hgtags From john.coomes at oracle.com Thu Jan 26 20:48:15 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 04:48:15 +0000 Subject: hg: hsx/hotspot-gc/jaxws: Added tag jdk8-b23 for changeset 25ce7a000487 Message-ID: <20120127044815.866A447211@hg.openjdk.java.net> Changeset: e0d90803439b Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jaxws/rev/e0d90803439b Added tag jdk8-b23 for changeset 25ce7a000487 ! .hgtags From john.coomes at oracle.com Thu Jan 26 20:49:16 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 04:49:16 +0000 Subject: hg: hsx/hotspot-gc/jdk: 43 new changesets Message-ID: <20120127045700.1501E47217@hg.openjdk.java.net> Changeset: 44bd765c22f4 Author: prr Date: 2012-01-13 13:11 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/44bd765c22f4 7127827: JRE8: javaws fails to launch on oracle linux due to XRender Reviewed-by: bae, jgodinez ! 
src/solaris/classes/sun/java2d/xr/XRCompositeManager.java Changeset: b566004bcb1a Author: dbuck Date: 2012-01-16 11:52 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/b566004bcb1a 7083621: Add fontconfig file for OEL6 and rename RH/O EL 5 file so that it is picked up for all 5.x updates Reviewed-by: bae, prr ! make/sun/awt/Makefile Changeset: 397667460892 Author: lana Date: 2012-01-18 11:27 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/397667460892 Merge - test/tools/launcher/DefaultLocaleTest.sh Changeset: e0f94b9c53a8 Author: alexsch Date: 2012-01-10 15:46 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e0f94b9c53a8 7110815: closed/javax/swing/JSplitPane/4885629/bug4885629.java unstable on MacOS Reviewed-by: kizune + test/javax/swing/JSplitPane/4885629/bug4885629.java Changeset: 79d14e328670 Author: alexsch Date: 2012-01-10 17:11 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/79d14e328670 6505523: NullPointerException in BasicTreeUI when a node is removed by expansion listener Reviewed-by: rupashka ! src/share/classes/javax/swing/plaf/basic/BasicTreeUI.java + test/javax/swing/JTree/6505523/bug6505523.java Changeset: ce32a4e1be1d Author: alexsch Date: 2012-01-13 12:39 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/ce32a4e1be1d 7121765: closed/javax/swing/JTextArea/4697612/bug4697612.java fails on MacOS on Aqua L&F Reviewed-by: rupashka + test/javax/swing/JTextArea/4697612/bug4697612.java + test/javax/swing/JTextArea/4697612/bug4697612.txt Changeset: 59b8875949e1 Author: malenkov Date: 2012-01-16 18:28 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/59b8875949e1 7122740: PropertyDescriptor Performance Slow Reviewed-by: rupashka ! 
src/share/classes/com/sun/beans/TypeResolver.java Changeset: 3e9d35e6ee4f Author: denis Date: 2012-01-17 19:09 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/3e9d35e6ee4f 7110590: DnDMerlinQLTestsuite_DnDJTextArea test fails with an java.awt.dnd.InvalidDnDOperationException Reviewed-by: art ! src/share/classes/java/awt/AWTKeyStroke.java Changeset: 89bc9d08fe82 Author: anthony Date: 2012-01-18 19:09 +0400 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/89bc9d08fe82 7130662: GTK file dialog crashes with a NPE Summary: Guard adding a back slash to the directory name with an if (!= null) check Reviewed-by: anthony, art Contributed-by: Matt ! src/solaris/classes/sun/awt/X11/GtkFileDialogPeer.java Changeset: fe1278123fbb Author: lana Date: 2012-01-18 11:41 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/fe1278123fbb Merge - test/tools/launcher/DefaultLocaleTest.sh Changeset: 4d8b49a45cff Author: lana Date: 2012-01-18 20:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/4d8b49a45cff Merge Changeset: 400cc379adb5 Author: alanb Date: 2012-01-06 15:00 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/400cc379adb5 7127235: (fs) NPE in Files.walkFileTree if cached attributes are GC'ed Reviewed-by: forax, chegar ! src/share/classes/java/nio/file/FileTreeWalker.java Changeset: cdc128128044 Author: valeriep Date: 2012-01-05 18:18 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/cdc128128044 6414899: P11Digest should support cloning Summary: Enhanced the PKCS11 Digest implementation to support cloning Reviewed-by: vinnie ! make/sun/security/pkcs11/mapfile-vers ! src/share/classes/sun/security/pkcs11/P11Digest.java ! src/share/classes/sun/security/pkcs11/wrapper/PKCS11.java ! src/share/lib/security/sunpkcs11-solaris.cfg ! 
src/share/native/sun/security/pkcs11/wrapper/pkcs11wrapper.h + test/sun/security/pkcs11/MessageDigest/TestCloning.java Changeset: e6ef778c1df4 Author: valeriep Date: 2012-01-06 11:02 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e6ef778c1df4 Merge Changeset: 6720ae7b1448 Author: valeriep Date: 2012-01-06 16:06 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/6720ae7b1448 7033170: Cipher.getMaxAllowedKeyLength(String) throws NoSuchAlgorithmException Summary: Changed to always use full transformation as provider properties. Reviewed-by: mullan ! src/share/classes/sun/security/pkcs11/SunPKCS11.java ! test/javax/crypto/Cipher/GetMaxAllowed.java Changeset: 2050ff9dfc92 Author: darcy Date: 2012-01-06 18:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/2050ff9dfc92 7123649: Remove public modifier from Math.powerOfTwoF. Reviewed-by: smarks, alanb ! src/share/classes/java/lang/Math.java Changeset: 74c92c3e66ad Author: gadams Date: 2012-01-09 19:33 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/74c92c3e66ad 7030573: test/java/io/FileInputStream/LargeFileAvailable.java fails when there is insufficient disk space Reviewed-by: alanb ! test/java/io/FileInputStream/LargeFileAvailable.java Changeset: 858038d89fd5 Author: darcy Date: 2012-01-09 15:54 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/858038d89fd5 7128441: StrictMath performance improvement note shared with Math Reviewed-by: darcy Contributed-by: Martin Desruisseaux ! src/share/classes/java/lang/Math.java ! src/share/classes/java/lang/StrictMath.java Changeset: dd69d3695cee Author: darcy Date: 2012-01-09 20:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/dd69d3695cee 7128512: Javadoc typo in java.lang.invoke.MethodHandle Reviewed-by: mduigou ! 
src/share/classes/java/lang/invoke/MethodHandle.java Changeset: d72de8b3fe36 Author: chegar Date: 2012-01-10 10:57 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/d72de8b3fe36 7123415: Some cases of network interface indexes being read incorrectly Reviewed-by: chegar Contributed-by: brandon.passanisi at oracle.com ! src/solaris/native/java/net/net_util_md.c Changeset: bba276a6aa0d Author: chegar Date: 2012-01-10 12:48 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/bba276a6aa0d 7128584: Typo in sun.misc.VM's private directMemory field comment Reviewed-by: forax, chegar Contributed-by: Krystal Mok ! src/share/classes/sun/misc/VM.java Changeset: 49e64a8fc18f Author: darcy Date: 2012-01-10 17:12 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/49e64a8fc18f 7112008: Javadoc for j.l.Object.finalize() vs JLS 12.6 Finalization of Class Instances Reviewed-by: mduigou ! src/share/classes/java/lang/Object.java Changeset: 62dbcbe4c446 Author: darcy Date: 2012-01-10 17:46 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/62dbcbe4c446 7128931: Bad HTML escaping in java.lang.Throwable javadoc Reviewed-by: mduigou ! src/share/classes/java/lang/Throwable.java Changeset: 31a1fc60a895 Author: chegar Date: 2012-01-11 10:52 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/31a1fc60a895 7128648: HttpURLConnection.getHeaderFields should return an unmodifiable Map Reviewed-by: michaelm ! src/share/classes/sun/net/www/protocol/http/HttpURLConnection.java + test/java/net/HttpURLConnection/UnmodifiableMaps.java Changeset: 82144054d2d8 Author: alanb Date: 2012-01-11 13:07 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/82144054d2d8 7068856: (fs) Typo in Files.isSameFile() javadoc 7099208: (fs) Files.newBufferedReader has typo in javadoc Reviewed-by: forax ! src/share/classes/java/nio/file/Files.java ! 
src/share/classes/java/nio/file/Path.java Changeset: 96fe796fd242 Author: ksrini Date: 2012-01-11 08:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/96fe796fd242 7125442: jar application located in two bytes character named folder cannot be run with JRE 7 u1/u2 Reviewed-by: sherman, mchung, darcy ! src/share/bin/java.c + test/tools/launcher/I18NJarTest.java ! test/tools/launcher/TestHelper.java Changeset: 11e52d5ba64e Author: xuelei Date: 2012-01-12 03:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/11e52d5ba64e 7106773: 512 bits RSA key cannot work with SHA384 and SHA512 Reviewed-by: weijun ! src/share/classes/sun/security/pkcs11/P11Cipher.java ! src/share/classes/sun/security/pkcs11/P11Key.java ! src/share/classes/sun/security/pkcs11/P11RSACipher.java ! src/share/classes/sun/security/pkcs11/P11Signature.java ! src/share/classes/sun/security/ssl/ClientHandshaker.java ! src/share/classes/sun/security/ssl/ServerHandshaker.java ! src/share/classes/sun/security/ssl/SignatureAndHashAlgorithm.java ! src/share/classes/sun/security/util/DisabledAlgorithmConstraints.java + src/share/classes/sun/security/util/KeyLength.java + src/share/classes/sun/security/util/Length.java ! src/windows/classes/sun/security/mscapi/Key.java ! src/windows/classes/sun/security/mscapi/RSACipher.java ! src/windows/classes/sun/security/mscapi/RSASignature.java + test/sun/security/mscapi/ShortRSAKey1024.sh + test/sun/security/mscapi/ShortRSAKey512.sh + test/sun/security/mscapi/ShortRSAKey768.sh + test/sun/security/mscapi/ShortRSAKeyWithinTLS.java ! test/sun/security/pkcs11/KeyStore/ClientAuth.java ! test/sun/security/pkcs11/KeyStore/ClientAuth.sh ! 
test/sun/security/ssl/javax/net/ssl/SSLContextVersion.java + test/sun/security/ssl/javax/net/ssl/TLSv12/ShortRSAKey512.java Changeset: 38bf1e9b6979 Author: weijun Date: 2012-01-13 09:50 +0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/38bf1e9b6979 7090565: Move test/closed/javax/security/auth/x500/X500Principal/Parse.java to open tests Reviewed-by: mullan + test/javax/security/auth/x500/X500Principal/NameFormat.java Changeset: ef3b6736c074 Author: valeriep Date: 2012-01-12 16:04 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/ef3b6736c074 7088989: Improve the performance for T4 by utilizing the newly provided crypto APIs Summary: Added the OracleUcrypto provider for utilizing the Solaris ucrypto API. Reviewed-by: weijun ! make/com/oracle/Makefile + make/com/oracle/net/Makefile + make/com/oracle/nio/Makefile + make/com/oracle/security/ucrypto/FILES_c.gmk + make/com/oracle/security/ucrypto/Makefile + make/com/oracle/security/ucrypto/mapfile-vers + make/com/oracle/util/Makefile ! src/share/lib/security/java.security-solaris ! test/Makefile + test/com/oracle/security/ucrypto/TestAES.java + test/com/oracle/security/ucrypto/TestDigest.java + test/com/oracle/security/ucrypto/TestRSA.java + test/com/oracle/security/ucrypto/UcryptoTest.java ! test/java/security/Provider/DefaultPKCS11.java Changeset: a7ad2fcd7291 Author: valeriep Date: 2012-01-12 18:49 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/a7ad2fcd7291 Merge Changeset: 7e593aa6ad41 Author: littlee Date: 2012-01-13 13:20 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/7e593aa6ad41 7129029: (fs) Unix file system provider should be buildable on platforms that don't support O_NOFOLLOW Reviewed-by: alanb ! src/solaris/classes/sun/nio/fs/UnixChannelFactory.java ! src/solaris/classes/sun/nio/fs/UnixFileSystemProvider.java ! src/solaris/classes/sun/nio/fs/UnixNativeDispatcher.java ! src/solaris/classes/sun/nio/fs/UnixPath.java ! 
src/solaris/native/sun/nio/fs/genUnixConstants.c Changeset: e8e08d46cc37 Author: weijun Date: 2012-01-16 10:10 +0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e8e08d46cc37 7118809: rcache deadlock Reviewed-by: valeriep ! src/share/classes/sun/security/krb5/internal/rcache/CacheTable.java ! src/share/classes/sun/security/krb5/internal/rcache/ReplayCache.java ! test/sun/security/krb5/auto/Context.java + test/sun/security/krb5/auto/ReplayCache.java Changeset: d1b0bda3a3c7 Author: alanb Date: 2012-01-16 16:30 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/d1b0bda3a3c7 7130398: ProblemList.txt updates (1/2012) Reviewed-by: chegar ! test/ProblemList.txt Changeset: e8a143213c65 Author: chegar Date: 2012-01-16 18:05 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e8a143213c65 7129083: CookieManager does not store cookies if url is read before setting cookie manager Reviewed-by: michaelm ! src/share/classes/sun/net/www/http/HttpClient.java ! src/share/classes/sun/net/www/protocol/http/HttpURLConnection.java ! src/share/classes/sun/net/www/protocol/https/HttpsClient.java + test/sun/net/www/http/HttpClient/CookieHttpClientTest.java + test/sun/security/ssl/sun/net/www/protocol/https/HttpsURLConnection/CookieHttpsClientTest.java Changeset: 40d699d7f6a1 Author: chegar Date: 2012-01-17 14:10 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/40d699d7f6a1 6671616: TEST_BUG: java/io/File/BlockIsDirectory.java fails when /dev/dsk empty (sol) Reviewed-by: alanb ! test/ProblemList.txt - test/java/io/File/BlockIsDirectory.java Changeset: 2f096eb72520 Author: mchung Date: 2012-01-17 15:55 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/2f096eb72520 7117570: Warnings in sun.mangement.* and its subpackages Reviewed-by: mchung, dsamersoff Contributed-by: kurchi.subhra.hazra at oracle.com ! src/share/classes/sun/management/Agent.java ! src/share/classes/sun/management/ConnectorAddressLink.java ! 
src/share/classes/sun/management/Flag.java ! src/share/classes/sun/management/GarbageCollectionNotifInfoCompositeData.java ! src/share/classes/sun/management/GarbageCollectorImpl.java ! src/share/classes/sun/management/GcInfoBuilder.java ! src/share/classes/sun/management/GcInfoCompositeData.java ! src/share/classes/sun/management/HotSpotDiagnostic.java ! src/share/classes/sun/management/HotspotCompilation.java ! src/share/classes/sun/management/HotspotThread.java ! src/share/classes/sun/management/LazyCompositeData.java ! src/share/classes/sun/management/ManagementFactoryHelper.java ! src/share/classes/sun/management/MappedMXBeanType.java ! src/share/classes/sun/management/MonitorInfoCompositeData.java ! src/share/classes/sun/management/NotificationEmitterSupport.java ! src/share/classes/sun/management/RuntimeImpl.java ! src/share/classes/sun/management/ThreadInfoCompositeData.java ! src/share/classes/sun/management/counter/perf/PerfInstrumentation.java ! src/share/classes/sun/management/jmxremote/ConnectorBootstrap.java ! src/share/classes/sun/management/snmp/AdaptorBootstrap.java ! src/share/classes/sun/management/snmp/jvminstr/JVM_MANAGEMENT_MIB_IMPL.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemGCTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemManagerTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemMgrPoolRelTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemPoolTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemoryImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmMemoryMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmOSImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTBootClassPathEntryImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTBootClassPathTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTClassPathEntryImpl.java ! 
src/share/classes/sun/management/snmp/jvminstr/JvmRTClassPathTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTInputArgsEntryImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTInputArgsTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTLibraryPathEntryImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRTLibraryPathTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmRuntimeMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmThreadInstanceEntryImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmThreadInstanceTableMetaImpl.java ! src/share/classes/sun/management/snmp/jvminstr/JvmThreadingMetaImpl.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmClassesVerboseLevel.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmJITCompilerTimeMonitoring.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemManagerState.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemPoolCollectThreshdSupport.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemPoolState.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemPoolThreshdSupport.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemPoolType.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemoryGCCall.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmMemoryGCVerboseLevel.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmRTBootClassPathSupport.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmThreadContentionMonitoring.java ! src/share/classes/sun/management/snmp/jvmmib/EnumJvmThreadCpuTimeMonitoring.java ! src/share/classes/sun/management/snmp/jvmmib/JVM_MANAGEMENT_MIB.java ! src/share/classes/sun/management/snmp/jvmmib/JVM_MANAGEMENT_MIBOidTable.java ! src/share/classes/sun/management/snmp/jvmmib/JvmClassLoadingMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmCompilationMeta.java ! 
src/share/classes/sun/management/snmp/jvmmib/JvmMemGCEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemGCTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemManagerEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemManagerTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemMgrPoolRelEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemMgrPoolRelTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemPoolEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmMemPoolTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmOSMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTBootClassPathEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTBootClassPathTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTClassPathEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTClassPathTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTInputArgsEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTInputArgsTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTLibraryPathEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRTLibraryPathTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmRuntimeMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmThreadInstanceEntryMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmThreadInstanceTableMeta.java ! src/share/classes/sun/management/snmp/jvmmib/JvmThreadingMeta.java ! src/share/classes/sun/management/snmp/util/MibLogger.java ! src/share/classes/sun/management/snmp/util/SnmpListTableCache.java ! src/share/classes/sun/management/snmp/util/SnmpNamedListTableCache.java ! 
src/share/classes/sun/management/snmp/util/SnmpTableCache.java Changeset: b14e13237498 Author: lana Date: 2012-01-18 11:00 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/b14e13237498 Merge Changeset: e6614f361127 Author: lana Date: 2012-01-18 20:24 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/e6614f361127 Merge - test/java/io/File/BlockIsDirectory.java Changeset: 227fcf5d0bec Author: lana Date: 2012-01-24 13:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/227fcf5d0bec Merge - test/java/io/File/BlockIsDirectory.java Changeset: 954a1c535730 Author: amurillo Date: 2012-01-25 12:36 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/954a1c535730 Merge - test/java/io/File/BlockIsDirectory.java Changeset: d3b334e376d3 Author: mr Date: 2012-01-23 12:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/d3b334e376d3 7110396: Sound code fails to build with gcc 4.6 on multiarch Linux systems Reviewed-by: ohair ! make/javax/sound/jsoundalsa/Makefile Changeset: 54202e0148ec Author: katleman Date: 2012-01-25 13:54 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/54202e0148ec Merge Changeset: 34029a0c69bb Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/jdk/rev/34029a0c69bb Added tag jdk8-b23 for changeset 54202e0148ec ! .hgtags From stefan.karlsson at oracle.com Fri Jan 27 05:24:36 2012 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 27 Jan 2012 14:24:36 +0100 Subject: Review request (S): 7134655: Crash in reference processing when doing single-threaded remarking Message-ID: <4F22A594.9030504@oracle.com> http://cr.openjdk.java.net/~stefank/7134655/webrev.00/ 7134655: Crash in reference processing when doing single-threaded remarking Summary: Temporarily disabled multi-threaded reference discovery when entering a single-threaded remark phase. 
Reviewed-by: TBD1, TBD2 StefanK From bengt.rutisson at oracle.com Fri Jan 27 05:35:57 2012 From: bengt.rutisson at oracle.com (Bengt Rutisson) Date: Fri, 27 Jan 2012 14:35:57 +0100 Subject: Review request (S): 7134655: Crash in reference processing when doing single-threaded remarking In-Reply-To: <4F22A594.9030504@oracle.com> References: <4F22A594.9030504@oracle.com> Message-ID: <4F22A83D.1060400@oracle.com> On 2012-01-27 14:24, Stefan Karlsson wrote: > http://cr.openjdk.java.net/~stefank/7134655/webrev.00/ > > 7134655: Crash in reference processing when doing single-threaded > remarking > Summary: Temporarily disabled multi-threaded reference discovery when > entering a single-threaded remark phase. > Reviewed-by: TBD1, TBD2 > > StefanK Looks good! Copyright year... ;-) Bengt From bengt.rutisson at oracle.com Fri Jan 27 07:04:07 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Fri, 27 Jan 2012 15:04:07 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 44 new changesets Message-ID: <20120127150538.F261B47229@hg.openjdk.java.net> Changeset: 5da7201222d5 Author: kvn Date: 2012-01-07 10:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5da7201222d5 7110824: ctw/jarfiles/GUI3rdParty_jar/ob_mask_DateField crashes VM Summary: Change yank_if_dead() to recursive method to remove all dead inputs. Reviewed-by: never ! src/cpu/sparc/vm/sparc.ad ! src/share/vm/opto/chaitin.hpp ! src/share/vm/opto/postaloc.cpp Changeset: e9a5e0a812c8 Author: kvn Date: 2012-01-07 13:26 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e9a5e0a812c8 7125896: Eliminate nested locks Summary: Nested locks elimination done before lock nodes expansion by looking for outer locks of the same object. Reviewed-by: never, twisti ! src/cpu/sparc/vm/sparc.ad ! src/cpu/x86/vm/x86_32.ad ! src/cpu/x86/vm/x86_64.ad ! src/share/vm/ci/ciTypeFlow.cpp ! src/share/vm/ci/ciTypeFlow.hpp ! src/share/vm/opto/c2_globals.hpp ! src/share/vm/opto/callnode.cpp ! 
src/share/vm/opto/callnode.hpp ! src/share/vm/opto/escape.cpp ! src/share/vm/opto/locknode.cpp ! src/share/vm/opto/locknode.hpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/macro.hpp ! src/share/vm/opto/output.cpp ! src/share/vm/opto/parse1.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/deoptimization.cpp Changeset: 35acf8f0a2e4 Author: kvn Date: 2012-01-10 18:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/35acf8f0a2e4 7128352: assert(obj_node == obj) failed Summary: Compare uncasted object nodes. Reviewed-by: never ! src/share/vm/opto/callnode.cpp ! src/share/vm/opto/cfgnode.cpp ! src/share/vm/opto/library_call.cpp ! src/share/vm/opto/locknode.cpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/node.cpp ! src/share/vm/opto/node.hpp ! src/share/vm/opto/phaseX.hpp ! src/share/vm/opto/subnode.cpp ! test/compiler/7116216/StackOverflow.java Changeset: c8d8e124380c Author: kvn Date: 2012-01-12 12:28 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c8d8e124380c 7064302: JDK7 build 147 crashed after testing my java 6-compiled web app Summary: Don't split CMove node if it's control edge is different from split region. Reviewed-by: never ! src/share/vm/opto/loopnode.cpp ! src/share/vm/opto/loopnode.hpp ! src/share/vm/opto/loopopts.cpp Changeset: 31a5b9aad4bc Author: jrose Date: 2012-01-13 00:27 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/31a5b9aad4bc Merge ! src/share/vm/runtime/arguments.cpp Changeset: 5acd82522540 Author: brutisso Date: 2012-01-13 06:18 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5acd82522540 Merge Changeset: b0ff910edfc9 Author: kvn Date: 2012-01-12 14:45 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/b0ff910edfc9 7128355: assert(!nocreate) failed: Cannot build a phi for a block already parsed Summary: Do not common BoxLock nodes and avoid creating phis of boxes. Reviewed-by: never ! 
src/share/vm/opto/callnode.cpp ! src/share/vm/opto/locknode.cpp ! src/share/vm/opto/locknode.hpp ! src/share/vm/opto/macro.cpp ! src/share/vm/opto/parse1.cpp Changeset: f4d8930a45b9 Author: jrose Date: 2012-01-13 00:51 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f4d8930a45b9 Merge Changeset: 89d0a5d40008 Author: kvn Date: 2012-01-13 12:58 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/89d0a5d40008 7129618: assert(obj_node->eqv_uncast(obj),""); Summary: Relax verification and locks elimination checks for new implementation (EliminateNestedLocks). Reviewed-by: iveresov ! src/share/vm/opto/locknode.cpp ! src/share/vm/opto/macro.cpp Changeset: e504fd26c073 Author: kvn Date: 2012-01-13 14:21 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e504fd26c073 Merge Changeset: fe2c87649981 Author: katleman Date: 2011-12-29 15:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/fe2c87649981 Added tag jdk8-b19 for changeset 9232e0ecbc2c ! .hgtags Changeset: 9952d1c439d6 Author: katleman Date: 2012-01-05 08:42 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9952d1c439d6 Added tag jdk8-b20 for changeset fe2c87649981 ! .hgtags Changeset: ed621d125d02 Author: katleman Date: 2012-01-13 10:05 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/ed621d125d02 Added tag jdk8-b21 for changeset 9952d1c439d6 ! .hgtags Changeset: 513351373923 Author: amurillo Date: 2012-01-14 00:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/513351373923 Merge Changeset: 24727fb37561 Author: amurillo Date: 2012-01-14 00:47 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/24727fb37561 Added tag hs23-b10 for changeset 513351373923 ! .hgtags Changeset: 4e80db53c323 Author: amurillo Date: 2012-01-14 00:52 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4e80db53c323 7129512: new hotspot build - hs23-b11 Reviewed-by: jcoomes ! 
make/hotspot_version Changeset: 94ec88ca68e2 Author: phh Date: 2012-01-11 17:34 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/94ec88ca68e2 7115199: Add event tracing hooks and Java Flight Recorder infrastructure Summary: Added a nop tracing infrastructure, JFR makefile changes and other infrastructure used only by JFR. Reviewed-by: acorn, sspitsyn Contributed-by: markus.gronlund at oracle.com ! make/Makefile ! make/bsd/makefiles/vm.make ! make/defs.make ! make/linux/makefiles/vm.make ! make/solaris/makefiles/vm.make ! make/windows/build.bat ! make/windows/create_obj_files.sh ! make/windows/makefiles/projectcreator.make ! make/windows/makefiles/vm.make ! src/share/vm/classfile/symbolTable.cpp ! src/share/vm/classfile/symbolTable.hpp ! src/share/vm/classfile/systemDictionary.cpp ! src/share/vm/oops/klass.cpp ! src/share/vm/oops/klass.hpp ! src/share/vm/oops/methodKlass.cpp ! src/share/vm/oops/methodOop.hpp ! src/share/vm/prims/jni.cpp + src/share/vm/prims/jniExport.hpp ! src/share/vm/runtime/java.cpp ! src/share/vm/runtime/mutexLocker.cpp ! src/share/vm/runtime/mutexLocker.hpp ! src/share/vm/runtime/os.cpp ! src/share/vm/runtime/thread.cpp ! src/share/vm/runtime/thread.hpp ! src/share/vm/runtime/vm_operations.hpp + src/share/vm/trace/traceEventTypes.hpp + src/share/vm/trace/traceMacros.hpp + src/share/vm/trace/tracing.hpp ! src/share/vm/utilities/globalDefinitions.hpp Changeset: 4f3ce9284781 Author: phh Date: 2012-01-11 17:58 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/4f3ce9284781 Merge ! src/share/vm/oops/klass.cpp ! 
src/share/vm/oops/klass.hpp Changeset: f1cd52d6ce02 Author: kamg Date: 2012-01-17 10:16 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/f1cd52d6ce02 Merge Changeset: d7e3846464d0 Author: zgu Date: 2012-01-17 13:08 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/d7e3846464d0 7071311: Decoder enhancement Summary: Made decoder thread-safe Reviewed-by: coleenp, kamg - src/os/bsd/vm/decoder_bsd.cpp + src/os/bsd/vm/decoder_machO.cpp + src/os/bsd/vm/decoder_machO.hpp ! src/os/linux/vm/decoder_linux.cpp ! src/os/linux/vm/os_linux.cpp ! src/os/solaris/vm/decoder_solaris.cpp ! src/os/solaris/vm/os_solaris.cpp ! src/os/windows/vm/decoder_windows.cpp + src/os/windows/vm/decoder_windows.hpp ! src/os/windows/vm/os_windows.cpp ! src/share/vm/utilities/decoder.cpp ! src/share/vm/utilities/decoder.hpp + src/share/vm/utilities/decoder_elf.cpp + src/share/vm/utilities/decoder_elf.hpp ! src/share/vm/utilities/elfFile.cpp ! src/share/vm/utilities/elfFile.hpp ! src/share/vm/utilities/elfStringTable.cpp ! src/share/vm/utilities/elfStringTable.hpp ! src/share/vm/utilities/elfSymbolTable.cpp ! src/share/vm/utilities/elfSymbolTable.hpp ! src/share/vm/utilities/vmError.cpp Changeset: 6520f9861937 Author: kamg Date: 2012-01-17 21:25 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/6520f9861937 Merge Changeset: db18ca98d237 Author: zgu Date: 2012-01-18 11:45 -0500 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/db18ca98d237 7131050: fix for "7071311 Decoder enhancement" does not build on MacOS X Summary: Decoder API changes did not reflect in os_bsd Reviewed-by: kamg, dcubed ! src/os/bsd/vm/os_bsd.cpp Changeset: eaa9557116a2 Author: bdelsart Date: 2012-01-18 16:18 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/eaa9557116a2 7120448: Fix FP values for compiled frames in frame::describe Summary: fix for debug method frame::describe Reviewed-by: never, kvn ! src/cpu/sparc/vm/frame_sparc.inline.hpp ! 
src/cpu/x86/vm/frame_x86.cpp ! src/cpu/x86/vm/frame_x86.hpp ! src/cpu/zero/vm/frame_zero.inline.hpp ! src/share/vm/runtime/frame.cpp ! src/share/vm/runtime/frame.hpp Changeset: 15d394228cfa Author: jrose Date: 2012-01-19 13:00 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/15d394228cfa 7111138: delete the obsolete flag -XX:+UseRicochetFrames Reviewed-by: dholmes, bdelsart, kvn, twisti ! src/cpu/sparc/vm/methodHandles_sparc.cpp ! src/cpu/x86/vm/methodHandles_x86.cpp ! src/cpu/zero/vm/methodHandles_zero.hpp ! src/share/vm/prims/methodHandles.cpp ! src/share/vm/prims/methodHandles.hpp ! src/share/vm/runtime/globals.hpp ! src/share/vm/runtime/sharedRuntime.cpp Changeset: 898522ae3c32 Author: iveresov Date: 2012-01-19 10:56 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/898522ae3c32 7131288: COMPILE SKIPPED: deopt handler overflow (retry at different tier) Summary: Fix exception handler stub size, enable guarantees to check for the correct deopt and exception stub sizes in the future Reviewed-by: kvn, never, twisti ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.cpp ! src/cpu/sparc/vm/c1_LIRAssembler_sparc.hpp ! src/cpu/x86/vm/c1_LIRAssembler_x86.cpp Changeset: 469e0a46f2fe Author: jrose Date: 2012-01-19 17:20 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/469e0a46f2fe Merge Changeset: 50d9b7a0072c Author: jrose Date: 2012-01-19 18:35 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/50d9b7a0072c Merge Changeset: 338d438ee229 Author: katleman Date: 2012-01-20 13:08 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/338d438ee229 Added tag jdk8-b22 for changeset 24727fb37561 ! 
.hgtags Changeset: dcc292399a39 Author: amurillo Date: 2012-01-20 16:56 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/dcc292399a39 Merge - src/os/bsd/vm/decoder_bsd.cpp Changeset: e850d8e7ea54 Author: amurillo Date: 2012-01-20 16:56 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/e850d8e7ea54 Added tag hs23-b11 for changeset dcc292399a39 ! .hgtags Changeset: 5f3fcd591768 Author: amurillo Date: 2012-01-20 17:07 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5f3fcd591768 7131979: new hotspot build - hs23-b12 Reviewed-by: jcoomes ! make/hotspot_version Changeset: 53a127075045 Author: kvn Date: 2012-01-20 09:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/53a127075045 7131302: connode.cpp:205 Error: ShouldNotReachHere() Summary: Add Value() methods to short and byte Load nodes to truncate constants which does not fit. Reviewed-by: jrose ! src/share/vm/opto/memnode.cpp ! src/share/vm/opto/memnode.hpp Changeset: 9164b8236699 Author: iveresov Date: 2012-01-20 15:02 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/9164b8236699 7131028: Switch statement takes wrong path Summary: Pass correct type to branch in LIRGenerator::do_SwitchRanges() Reviewed-by: kvn, never ! src/share/vm/c1/c1_LIR.hpp ! src/share/vm/c1/c1_LIRGenerator.cpp Changeset: a81f60ddab06 Author: never Date: 2012-01-22 14:03 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/a81f60ddab06 7130676: Tiered: assert(bci == 0 || 0<= bci && bciis_loaded() Summary: handle not loaded array klass in Parse::do_checkcast(). Reviewed-by: kvn, never ! src/share/vm/opto/parseHelper.cpp Changeset: 5dbed2f542ff Author: bdelsart Date: 2012-01-26 16:49 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/5dbed2f542ff 7120468: SPARC/x86: use frame::describe to enhance trace_method_handle Summary: improvements of TraceMethodHandles for JSR292 Reviewed-by: never, twisti ! src/cpu/sparc/vm/frame_sparc.cpp ! 
src/cpu/sparc/vm/methodHandles_sparc.cpp ! src/cpu/sparc/vm/methodHandles_sparc.hpp ! src/cpu/x86/vm/frame_x86.cpp ! src/cpu/x86/vm/methodHandles_x86.cpp ! src/cpu/x86/vm/methodHandles_x86.hpp ! src/cpu/zero/vm/frame_zero.cpp ! src/share/vm/runtime/frame.cpp ! src/share/vm/runtime/frame.hpp Changeset: 20334ed5ed3c Author: iveresov Date: 2012-01-26 12:15 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/20334ed5ed3c 7131259: compile_method and CompilationPolicy::event shouldn't be declared TRAPS Summary: Make sure that CompilationPolicy::event() doesn't throw exceptions Reviewed-by: kvn, never ! src/share/vm/c1/c1_Runtime1.cpp ! src/share/vm/compiler/compileBroker.cpp ! src/share/vm/compiler/compileBroker.hpp ! src/share/vm/interpreter/interpreterRuntime.cpp ! src/share/vm/runtime/advancedThresholdPolicy.cpp ! src/share/vm/runtime/advancedThresholdPolicy.hpp ! src/share/vm/runtime/compilationPolicy.cpp ! src/share/vm/runtime/compilationPolicy.hpp ! src/share/vm/runtime/simpleThresholdPolicy.cpp ! src/share/vm/runtime/simpleThresholdPolicy.hpp ! src/share/vm/utilities/exceptions.hpp Changeset: 072384a61312 Author: jrose Date: 2012-01-26 19:39 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/072384a61312 Merge Changeset: 0a10d80352d5 Author: brutisso Date: 2012-01-27 09:04 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/0a10d80352d5 Merge - src/os/bsd/vm/decoder_bsd.cpp ! src/share/vm/runtime/arguments.cpp ! src/share/vm/runtime/mutexLocker.cpp ! 
src/share/vm/runtime/mutexLocker.hpp From John.Coomes at oracle.com Fri Jan 27 12:02:35 2012 From: John.Coomes at oracle.com (John Coomes) Date: Fri, 27 Jan 2012 12:02:35 -0800 Subject: Review request (S): 7134655: Crash in reference processing when doing single-threaded remarking In-Reply-To: <4F22A594.9030504@oracle.com> References: <4F22A594.9030504@oracle.com> Message-ID: <20259.731.978201.379124@oracle.com> Stefan Karlsson (stefan.karlsson at oracle.com) wrote: > http://cr.openjdk.java.net/~stefank/7134655/webrev.00/ > > 7134655: Crash in reference processing when doing single-threaded remarking > Summary: Temporarily disabled multi-threaded reference discovery when > entering a single-threaded remark phase. > Reviewed-by: TBD1, TBD2 s/TBD1/jcoomes/ (In other words, looks good to me.) -John From john.coomes at oracle.com Fri Jan 27 12:44:22 2012 From: john.coomes at oracle.com (john.coomes at oracle.com) Date: Fri, 27 Jan 2012 20:44:22 +0000 Subject: hg: hsx/hotspot-gc/langtools: 8 new changesets Message-ID: <20120127204443.2FF594723C@hg.openjdk.java.net> Changeset: 70d92518063e Author: mcimadamore Date: 2012-01-11 18:23 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/70d92518063e 7126754: Generics compilation failure casting List to List Summary: Problems with Types.rewriteQuantifiers not preserving variance Reviewed-by: jjg ! src/share/classes/com/sun/tools/javac/code/Types.java + test/tools/javac/cast/7126754/T7126754.java + test/tools/javac/cast/7126754/T7126754.out Changeset: 133744729455 Author: mcimadamore Date: 2012-01-12 15:28 +0000 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/133744729455 7123100: javac fails with java.lang.StackOverflowError Summary: Inference of under-constrained type-variables creates erroneous recursive wildcard types Reviewed-by: jjg ! 
src/share/classes/com/sun/tools/javac/comp/Infer.java + test/tools/javac/cast/7123100/T7123100a.java + test/tools/javac/cast/7123100/T7123100a.out + test/tools/javac/cast/7123100/T7123100b.java + test/tools/javac/cast/7123100/T7123100b.out + test/tools/javac/cast/7123100/T7123100c.java + test/tools/javac/cast/7123100/T7123100c.out + test/tools/javac/cast/7123100/T7123100d.java + test/tools/javac/cast/7123100/T7123100d.out Changeset: 1e2f4f4fb9f7 Author: jjh Date: 2012-01-17 17:14 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/1e2f4f4fb9f7 7127924: langtools regression tests sometimes fail en-masse on windows Reviewed-by: jjg ! test/tools/javac/diags/CheckExamples.java ! test/tools/javac/diags/MessageInfo.java ! test/tools/javac/diags/RunExamples.java Changeset: f00afa80f1f0 Author: lana Date: 2012-01-18 11:00 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/f00afa80f1f0 Merge Changeset: cf2496340fef Author: darcy Date: 2012-01-18 16:43 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/cf2496340fef 7130768: Clarify behavior of Element.getEnclosingElements in subtypes Reviewed-by: mcimadamore, jjg ! src/share/classes/javax/lang/model/element/Element.java ! src/share/classes/javax/lang/model/element/PackageElement.java ! src/share/classes/javax/lang/model/element/TypeElement.java Changeset: 99261fc7d95d Author: jjh Date: 2012-01-18 18:26 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/99261fc7d95d 7131308: Three regression tests fail due to bad fix for 7127924 Reviewed-by: jjg ! test/tools/javac/diags/CheckExamples.java ! test/tools/javac/diags/MessageInfo.java ! 
test/tools/javac/diags/RunExamples.java Changeset: 601ffcc6551d Author: lana Date: 2012-01-24 13:44 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/601ffcc6551d Merge Changeset: 6c9d21ca92c4 Author: katleman Date: 2012-01-26 18:23 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/langtools/rev/6c9d21ca92c4 Added tag jdk8-b23 for changeset 601ffcc6551d ! .hgtags From stefan.karlsson at oracle.com Sat Jan 28 03:40:43 2012 From: stefan.karlsson at oracle.com (stefan.karlsson at oracle.com) Date: Sat, 28 Jan 2012 11:40:43 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 2 new changesets Message-ID: <20120128114059.73B124724A@hg.openjdk.java.net> Changeset: be649fefcdc2 Author: stefank Date: 2012-01-27 14:14 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/be649fefcdc2 7134655: Crash in reference processing when doing single-threaded remarking Summary: Temporarily disabled multi-threaded reference discovery when entering a single-threaded remark phase. Reviewed-by: brutisso, tonyp, jmasa, jcoomes ! src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp Changeset: c03e06373b47 Author: stefank Date: 2012-01-28 01:15 -0800 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/c03e06373b47 Merge - src/os/bsd/vm/decoder_bsd.cpp From rednaxelafx at gmail.com Sun Jan 29 21:56:48 2012 From: rednaxelafx at gmail.com (Krystal Mok) Date: Mon, 30 Jan 2012 13:56:48 +0800 Subject: review request (XS) - 7112413: disable AdaptiveSizePolicy w/CMS In-Reply-To: <20256.40712.465298.881929@oracle.com> References: <20255.37487.600174.703384@oracle.com> <4F1FB6E1.90306@oracle.com> <20256.40712.465298.881929@oracle.com> Message-ID: Hi John, Comments inline: On Thu, Jan 26, 2012 at 8:32 AM, John Coomes wrote: > Bengt Rutisson (bengt.rutisson at oracle.com) wrote: > > > > Hi John, > > > > Looks good. 
> > > > One minor comment: > > > > I'd prefer the test: > > > > 1045 if (!FLAG_IS_DEFAULT(UseAdaptiveSizePolicy)) { > > > > to be: > > > > 1045 if (FLAG_IS_CMDLINE(UseAdaptiveSizePolicy)) { > > > > I think users are only interested in the warning if they actually had > > the switch on the command line. If hotspot turns on the flag > > ergonomically I think it is just confusing to customers to see the > warning. > > Thanks for the review. I'll make that change; it's more future-proof. > > Just a nitpick: the FLAG_IS_CMDLINE doesn't cover VM arguments that were set from a config file (.hotspotrc), whose origin would have been CONFIG_FILE. It's nicer if there's a FLAG_IS_USER_SET or something, that covers all cases where the argument might be set by a user instead of ergo. - Kris From bengt.rutisson at oracle.com Tue Jan 31 02:08:50 2012 From: bengt.rutisson at oracle.com (bengt.rutisson at oracle.com) Date: Tue, 31 Jan 2012 10:08:50 +0000 Subject: hg: hsx/hotspot-gc/hotspot: 7140909: Visual Studio project builds broken: need to define INCLUDE_TRACE Message-ID: <20120131100855.BC653472AA@hg.openjdk.java.net> Changeset: 2eeebe4b4213 Author: brutisso Date: 2012-01-30 15:21 +0100 URL: http://hg.openjdk.java.net/hsx/hotspot-gc/hotspot/rev/2eeebe4b4213 7140909: Visual Studio project builds broken: need to define INCLUDE_TRACE Summary: Add define of INCLUDE_TRACE Reviewed-by: sla, kamg ! src/share/tools/ProjectCreator/BuildConfig.java
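Editorial sketch for the 7112413 review thread (FLAG_IS_DEFAULT vs. FLAG_IS_CMDLINE vs. the suggested FLAG_IS_USER_SET): the three checks differ only in which flag origins they accept. The standalone C++ model below is an illustration, not HotSpot code. The FlagOrigin enum, the Flag struct and the three helper functions are hypothetical names made up for this sketch, and treating an environment-variable origin as "user set" is an assumption that the thread does not discuss.

```cpp
// Simplified, hypothetical model of where a VM flag's value came from.
// This is NOT the HotSpot definition; it only mirrors the origins named
// in the thread (default, command line, config file, ergonomics) plus
// one assumed extra origin (environment variable).
enum class FlagOrigin { Default, CommandLine, EnvironVar, ConfigFile, Ergonomic };

struct Flag {
  const char* name;
  bool        value;
  FlagOrigin  origin;
};

// Models the original test: anything that is not the built-in default.
// This is also true when the VM changed the flag ergonomically.
bool flag_is_default(const Flag& f) {
  return f.origin == FlagOrigin::Default;
}

// Models the test proposed in the review: only an explicit command-line
// switch counts. A value read from a config file is not covered.
bool flag_is_cmdline(const Flag& f) {
  return f.origin == FlagOrigin::CommandLine;
}

// Models the "FLAG_IS_USER_SET or something" idea from the thread: every
// origin that is under the user's control, as opposed to ergonomics.
// Including EnvironVar here is an assumption of this sketch.
bool flag_is_user_set(const Flag& f) {
  return f.origin == FlagOrigin::CommandLine ||
         f.origin == FlagOrigin::ConfigFile  ||
         f.origin == FlagOrigin::EnvironVar;
}
```

In this model a flag with origin Ergonomic passes the !flag_is_default test (so the warning would be printed even though the user never asked for the flag), while a flag with origin ConfigFile fails flag_is_cmdline (so no warning is printed even though the user did set it). Only the hypothetical flag_is_user_set separates the two cases the way the thread describes.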