RFR(M/L): 7176479: G1: JVM crashes on T5-8 system with 1.5 TB heap

John Cuthbertson john.cuthbertson at oracle.com
Wed Jan 30 19:43:24 UTC 2013

Hi Everyone,

Here's a new webrev based upon comments from Vitaly: 



On 1/15/2013 3:31 PM, John Cuthbertson wrote:
> Hi Everyone,
> Can I have a couple of people look over the changes for this CR - the 
> webrev can be found at: 
> http://cr.openjdk.java.net/~johnc/7176479/webrev.0/
> Background:
> The issue here was that we were encoding the card index into the card 
> counts table entries along with the GC number so that we could 
> determine if the count associated with a card was valid. We had a
> check to ensure that the maximum card index could be encoded in an
> int. With such a large heap, the maximum card index could not be
> encoded in an int, and so the check failed.
> The previous mechanism was an attempt to solve the problem of one 
> thread arriving late to the actual GC work. The thread in question was 
> being held up zeroing the card counts table at the start of the GC. 
> The card counts table is used to determine which cards are being 
> refined frequently. Once a card has been refined frequently enough, 
> further refinements of that card are delayed by placing the card into 
> a fixed size evicting table - the hot card cache. The card would then 
> be refined when it was evicted from the hot card cache or when the 
> cache was drained during the next GC.
> To solve the problem of zeroing we added an epoch (GC number) to the
> entries in the counts table and, to eliminate the increase in
> footprint, we made the counts table into a cache which would expand if
> needed. This approach had some negatives: we might have to refine two
> cards during a single refinement operation, and hashing the card and
> performing CAS operations increased the overhead of concurrent
> refinement. Also, expanding the counts table during a GC incurred a
> penalty.
> This approach also limited the heap size to just under 1TB - which the 
> systems team ran into.
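
To make the limit concrete: assuming 512-byte cards (the usual HotSpot card table granularity, which this mail does not state), a 1TB heap spans 2^31 cards, one more than a signed 32-bit int can hold, and a 1.5TB heap spans roughly 3.2 billion. A minimal C++ sketch of that arithmetic follows. The function names are made up for illustration, and this is not the actual check from the old code.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: 512-byte cards; the mail itself does not give the card size.
constexpr std::uint64_t kCardSizeBytes = 512;

// Number of cards needed to cover a heap of the given size in bytes.
constexpr std::uint64_t cards_for_heap(std::uint64_t heap_bytes) {
  return heap_bytes / kCardSizeBytes;
}

// Hypothetical stand-in for the old encoding check: does the card count
// fit in a signed 32-bit int?
constexpr bool card_count_fits_in_int(std::uint64_t heap_bytes) {
  return cards_for_heap(heap_bytes) <=
         static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// 1TB is exactly 2^40 bytes, so it needs 2^31 cards.
static_assert(cards_for_heap(1ULL << 40) == (1ULL << 31), "1TB spans 2^31 cards");
```

Under these assumptions card_count_fits_in_int(3ULL << 39), i.e. a 1.5TB heap, is false, while a heap one card short of 1TB still passes.
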
> The new approach effectively undoes the previous mechanism and 
> re-simplifies the card counts table.
> Summary of Changes:
> The hot card cache and card counts table have been moved from the 
> concurrent refinement code into their own files.
> The hot card cache can now exist independently of whether the counts 
> table exists. When there is no counts table, refining a card once adds
> it to the hot card cache, i.e. all cards are treated as 'hot'.
> The interface to the hot card cache has been simplified - a simple 
> query and a simple drain routine. This simplifies the calling code in 
> g1RemSet.cpp and results in at most a single card being refined for 
> every call to "refine_card" instead of possibly two. This should 
> reduce the overhead of concurrent refinement.
> The number of cards that the hot card cache can hold before cards 
> start getting evicted is controlled by the flag G1ConcRSLogCacheSize, 
> which is now a product flag. The default value is 10, giving a hot card 
> cache that can hold 1K cards.
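
For illustration only, a fixed-size evicting table with an insert that may hand back an evicted card, plus a drain, could look roughly like the sketch below. The class name, the ring-buffer replacement policy and the use of plain card indices are assumptions, not the webrev's code, and the sketch ignores concurrency entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a card; chosen only to keep the sketch small.
using CardIndex = std::uint64_t;
constexpr CardIndex kNoCard = ~CardIndex(0);

class HotCardCacheSketch {
 public:
  // log_size plays the role of G1ConcRSLogCacheSize: capacity is 2^log_size.
  explicit HotCardCacheSketch(unsigned log_size)
      : slots_(std::size_t(1) << log_size, kNoCard), next_(0) {}

  std::size_t capacity() const { return slots_.size(); }

  // Stores a hot card. Returns the card evicted to make room (which the
  // caller would then refine), or kNoCard if the slot was empty.
  CardIndex insert(CardIndex card) {
    assert(card != kNoCard);
    CardIndex evicted = slots_[next_];
    slots_[next_] = card;
    next_ = (next_ + 1) % slots_.size();
    return evicted;
  }

  // Empties the cache and returns every cached card, as a GC would do
  // before refining them.
  std::vector<CardIndex> drain() {
    std::vector<CardIndex> out;
    for (CardIndex& c : slots_) {
      if (c != kNoCard) {
        out.push_back(c);
        c = kNoCard;
      }
    }
    next_ = 0;
    return out;
  }

 private:
  std::vector<CardIndex> slots_;
  std::size_t next_;
};
```

With log_size = 10 the sketch holds 1024 cards, matching the 1K figure above. Each insert yields at most one evicted card, which is where the "at most a single card per refine_card call" property would come from.
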
> The card counts table has been greatly simplified. It is a simple 
> array of counts of how many times each card has been refined. The space for 
> the table is now allocated from virtual memory instead of C heap. The 
> space for the table is committed when the heap is initially committed 
> and spans the committed size of the heap. When the committed size 
> of the heap is expanded, the counts table is also expanded to cover 
> the newly expanded heap. If we fail to commit the memory for the 
> counts table, cards that map to the uncommitted space will be treated 
> as cold, i.e. they will be refined immediately. Having a simpler 
> counts table should also reduce the overhead of concurrent refinement 
> (there is no need to hash the card index and there are no CAS 
> operations). Having a simpler interface will allow us to change the 
> underlying data structure to an alternative that's perhaps more sparse 
> in the future.
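
A rough sketch of the simplified counts table, under stated assumptions: a std::vector stands in for the reserved and committed virtual memory, the hot threshold is a made-up constructor parameter, and all names are hypothetical. It only aims to show the direct indexing (no hashing, no CAS) and the "uncovered cards are cold" rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

class CardCountsSketch {
 public:
  // hot_threshold: hypothetical number of refinements after which a card
  // is considered hot.
  explicit CardCountsSketch(std::uint8_t hot_threshold)
      : hot_threshold_(hot_threshold) {
    assert(hot_threshold > 0);
  }

  // Grows the covered range when the committed heap grows. In this sketch
  // a card beyond the covered range models a failed or missing commit.
  void expand_to(std::size_t committed_cards) {
    if (committed_cards > counts_.size()) {
      counts_.resize(committed_cards, 0);
    }
  }

  // Records one refinement of the card and reports whether it is now hot.
  // Cards outside the covered range are always cold (refine immediately).
  bool add_card_count(std::size_t card) {
    if (card >= counts_.size()) {
      return false;
    }
    if (counts_[card] < hot_threshold_) {
      ++counts_[card];  // saturates at the threshold, so it cannot overflow
    }
    return counts_[card] >= hot_threshold_;
  }

  // Zeroes every count, as described for a full GC.
  void clear_all() {
    std::fill(counts_.begin(), counts_.end(), std::uint8_t(0));
  }

 private:
  std::uint8_t hot_threshold_;
  std::vector<std::uint8_t> counts_;
};
```

Because callers only see expand_to, add_card_count and clear_all, the array could later be swapped for a sparser structure without touching them.
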
> During an incremental GC we no longer zero the entire counts table. We 
> now zero the cards spanned by a region when the region is freed (i.e. 
> when we free the collection set at the end of a GC and when we free 
> regions at the end of a cleanup).  If a card was "hot" before a GC 
> then we will consider it hot after the GC and the first refinement 
> after the GC will insert the card into the hot card cache. 
> Furthermore, since we don't refine cards in young regions, we only 
> need to clear the counts associated with cards spanned by non-young 
> regions.
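
The per-region clearing could be sketched as below, again assuming 512-byte cards and a plain array of counts indexed by card. The function name and the byte-offset parameters are hypothetical; the point is only that freeing a region zeroes one contiguous slice of the table rather than the whole table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 512-byte cards; the mail itself does not give the card size.
constexpr std::size_t kCardSizeBytes = 512;

// Zeroes the counts for the cards spanned by one region.
// region_start and region_bytes are byte offsets/sizes relative to the
// heap base; both are expected to be card aligned.
void clear_region_counts(std::vector<std::uint8_t>& counts,
                         std::size_t region_start,
                         std::size_t region_bytes) {
  assert(region_start % kCardSizeBytes == 0);
  assert(region_bytes % kCardSizeBytes == 0);
  std::size_t first = region_start / kCardSizeBytes;
  // Clamp to the table so a partially covered region is handled safely.
  std::size_t last =
      std::min(first + region_bytes / kCardSizeBytes, counts.size());
  first = std::min(first, last);
  std::fill(counts.begin() + first, counts.begin() + last, std::uint8_t(0));
}
```

For a 1MB region this touches 2048 one-byte counts.
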
> During a full GC we still discard the entries in the hot card cache 
> and zero the counts for all the cards in the heap.
> Testing:
> GC Test suite with MaxTenuringThreshold=0 (to increase the amount of 
> refinement) and a low IHOP value (to force cleanups).
> SPECjbb2005 with a 1.5TB heap size and 256GB young size, 
> MaxTenuringThreshold=0 and a low IHOP value (1%). The systems team are 
> continuing to test with very large heaps.
