RFR(M): 7163191: G1: introduce a "heap spanning table" abstraction

Bengt Rutisson bengt.rutisson at oracle.com
Wed Feb 20 12:59:23 UTC 2013

Hi John,

Thanks for your answers! Some comments inline...

On 2/19/13 10:56 PM, John Cuthbertson wrote:
> Hi Bengt,
> Thanks for taking a look. I've applied your supplied patch and don't 
> have an issue with the changes contained therein but there is/was a 
> slight bug in HeapRegionSeq::at(). More detailed responses inline...

Hm. I don't find the comment about the HeapRegionSeq::at() bug further 
down. Did I miss it somehow?

> .
> On 2/15/2013 5:10 AM, Bengt Rutisson wrote:
>> Hi John,
>> Thanks for putting this together! I definitely think it is a good 
>> idea to break some of the logic out of HeapRegionSeq!
>> I have a couple of minor things that I would like to clean up. I 
>> figured the easiest way was to send you a patch. Please try out the 
>> attached patch and see what you think. It should be applied on top of 
>> the patch from your webrev.
>> I also have a couple of design questions:
>> The G1HeapSpanningTableBase::DefaultValue has this comment:
>>   // Currently, in all the uses of this class the default value of
>>   // array's elements is 0. If this changes in the future we can
>>   // allow the subclasses to parametarize it.
>> I don't think this will work unless we make the DefaultValue of type 
>> T and thus move it into G1HeapSpanningTable. The DefalutValue now is 
>> 0 which works fine. But when we memset we will write DefalutValue 
>> which I guess is int in the enum case (and char in my patch). But 
>> when we read we do:
>> array[index]
>> where array is T* array, so on 64 bit we probably read 64 bits but we 
>> only write 32 or less with DefaultValue. So, setting it to anything 
>> else than 0 probably won't work.
> OK. I kind of agree that this is awkward. What do you think of Thomas' 
> idea of having a virtual function default_value() that the subclasses 
> have to implement?

I think Thomas only answered to you and not the list (he did this before 
he had a usable Oracle email) so I actually don't know what Thomas 
suggested. But a virtual default_value() function would probably work. 
On the other hand if we remove the split between G1HeapSpanningTableBase 
and G1HeapSpanningTable, as suggested further down, I think we could 
just have the default value be of the template type.

>> How do you feel about changing the relationship between HeapRegionSeq 
>> and G1HeapSpanningTable a bit? Right now HeapRegionSeq inherits from 
>> G1HeapSpanningTable but I kind of think that makes the distinction 
>> between HeapRegionSeq and G1HeapSpanningTable hard to find. I think 
>> it would be clearer if HeapRegionSeq just had an instance of 
>> G1HeapSpanningTable and delegated to it.
>> That way HeapRegionSeq can be more focused on its responsibilities 
>> and G1HeapSpanningTable can be more self contained.
> Personally I don't have a strong opinion about this. I typically like 
> embedded objects and use them in the hot card cache changes. I guess 
> it comes down to what is a good way to model the relationship: "is_a" 
> or "contains". If we consider the fast cset arrays case then we either 
> have:
> class FastCSetTestArray: public HeapRegionSpanningTable<int> {
>   ...
> }
> or
> class FastCSetTestArray: public CHeapObj<mtGC> {
>   HeapRegionSpanningTable<int> _the_table;
> }

Yes, this is what I mean.

> It seems a bit more natural to me that the HeapRegionSeq is a heap 
> spanning table; the FastCSetTestArray is a heap spanning table.

It depends a bit on what methods you envision that FastCSetTestArray 
will have. I could imagine that the class would be called just 
FastCSetTest and have the method bool is_in_cset(). In that case I don't 
think it is natural to have a "is a" relationship.

> Either would work. If we do this then Thomas' idea of the 
> default_value() virtual method becomes irrelevant. Its in the eye of 
> the beholder.

Again, I'm not really sure what Thomas suggested. :)

>> Another thing, which is not only related to this change, is the use 
>> of initilize methods. It's a bit strange to me that we use empty 
>> constructors and then have to remember to call the initilize methods 
>> in the right order. I think it would be good to break the 
>> initialize-chain somewhere and go over to using constructors instead. 
>> If HeapRegionSeq would not inherit from G1HeapSpanningTable we could 
>> set the G1HeapSpanningTable instance in HeapRegionSeq up in its 
>> intialize method. That would allow us to use the constructor instead 
>> of another level of initialize methods for G1HeapSpanningTable.
> I'm not necessarily a fan but it does allow for deferred 
> initialization based upon the committed part of the heap so I can see 
> that it makes/made sense. Let me make sure I have what you're saying 
> correct. Are you saying we should have:
> class HeapRegionSeq {
>   HeapRegionSpanningTable<HeapRegion*>* _the_table;
>   ...
> };
> HeapRegionSeq::HeapRegionSeq(..): _the_table(NULL), ...
> HeapRegionSeq::initialize(...) {
>   _the_table = new HeapRegionSpanningTable<HeapRegion*>(...);
> }

Yes, this is what I meant.
> Again - no strong opinions other than Thomas' idea makes use of the 
> inheritance relationship.

Can't really comment on that... :)

>> Finally, there is a comment about why we have the division between 
>> G1HeapSpanningTableBase and G1HeapSpanningTable:
>> // ... The reason for the
>> // split is that we found that compilers like all the methods of a
>> // parametarized class to be in the .hpp file and this was the only
>> // way to put some of the methods in the .cpp file.
>> This is a downside of templates but the .cpp file now only contains 4 
>> relevant methods and they are not very large. Does this still 
>> motivate the extra complexity introduced by the split into two 
>> classes? Currently we also only have one instantation of 
>> G1HeapSpanningTable. Will there be more of other types than 
>> HeapRegion* or could we just drop the templates?
> I kind of agree the split is a bit superfluous. I'll undo the split. 
> If we see that there's a performance penalty it can always be reinstated.

Great. Thanks!

> Current other uses of the HeapRegionSpanningTable will be for the fast 
> cset test arrays (I think the template type parameter is a byte or a 
> short) which are currently used for quickly deciding whether a region 
> is in the collection set. I know Tony envisioned them for other 
> region-based queries. Using the heap spanning table abstraction to 
> implement the fast cset test arrays would enable extending them to 
> determine whether a region is in the collection set and old in a 
> fairly natural way.

OK, thanks for this background information. Do you have follow up 
patches for these changes too? I mean, do we think we will do them in 
the near future?


> Cheers,
> JohnC
>> Thanks,
>> Bengt
>> On 2/14/13 11:16 PM, John Cuthbertson wrote:
>>> Hi Everyone,
>>> Here's another change from Tony that I've been maintaining since he 
>>> left Oracle. Can I have a volunteer look over the changes? They look 
>>> good to me. The webrev can be found at: 
>>> http://cr.openjdk.java.net/~johnc/7163191/webrev.0/
>>> Summary:
>>> Currently in G1 there are several tables where a each element is 
>>> used to hold some information about a fixed size subdivision of the 
>>> heap. For example the HeapRegion table and the fast CSet test 
>>> tables. This change encapsulates the base properties of these tables 
>>> in a generic heap spanning table and uses the abstraction for the 
>>> heap region sequence table. A future change will extend this to the 
>>> fast cset test arrays.
>>> Testing consisted of the GC test suite with a low marking threshold 
>>> and heap verification. Testing did catch an assertion failure that I 
>>> have included a small fix for. As with the additional marking 
>>> validation changes, if no-one objects, I will push this change with 
>>> Tony as the author and myself as a reviewer.
>>> JohnC

More information about the hotspot-gc-dev mailing list