RFR: 8227226: Segmented array clearing for ZGC
thomas.schatzl at oracle.com
Thu Aug 1 09:43:52 UTC 2019
On 01.08.19 01:28, Per Liden wrote:
> Hi Thomas,
> On 7/31/19 7:59 PM, Thomas Schatzl wrote:
>> On 31.07.19 10:19, Per Liden wrote:
>>> I found some time to benchmark the "GC clears pages"-approach, and
>>> it's fairly clear that it's not paying off. So ditching that idea.
>>> However, I'm still looking for something that would not just do
>>> segmented clearing of arrays in large zpages. Letting oop arrays
>>> temporarily be typed arrays while it's being cleared could be an
>>> option. I did a prototype for that, which looks like this:
>>> There's at least one issue here, the code doing allocation sampling
>>> will see that we allocated long arrays instead of oop arrays, so the
>>> reporting there will be skewed. That can be addressed if we go down
>>> this path. The code is otherwise fairly simple and contained. Feel
>>> free to spot any issues.
>> That looks like a really neat way of doing this.
>> Looking over this there does not seem to be any real dependency on ZGC
>> code, so if you went this way, would it be possible to provide this
>> solution for all collectors?
> This is potentially dangerous for any GC doing concurrent oop_iterate(),
> as in that case the klass pointer must only be read once, with acquire
> An example in G1 where this would break is
> HeapRegion::do_oops_on_memregion_in_humongous(), and I'm thinking there
> are more cases.
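
To make the read-once requirement concrete, here is a minimal standalone C++ sketch. It is not HotSpot code: `ToyKlass`, `ToyObject`, `publish_as_oop_array` and `count_oops_to_visit` are hypothetical names, and `std::atomic` stands in for whatever ordered-access primitives the VM would actually use. It only shows the pattern: the allocator switches the type with a release store after clearing, and a concurrent iterator loads the type once with acquire and bases every later decision on that single value.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not HotSpot types.
enum class ToyKlass : std::uint8_t { LongArray, OopArray };

struct ToyObject {
    // Starts out typed as a primitive array while it is being cleared.
    std::atomic<ToyKlass> klass{ToyKlass::LongArray};
    std::size_t length = 0;
};

// Allocator side: called once clearing has finished. The release store
// orders the preceding zeroing writes before the type switch.
inline void publish_as_oop_array(ToyObject& obj) {
    obj.klass.store(ToyKlass::OopArray, std::memory_order_release);
}

// Concurrent iterator side: load the klass exactly once and never re-read
// it. Returns how many element slots the iterator would visit.
inline std::size_t count_oops_to_visit(const ToyObject& obj) {
    const ToyKlass k = obj.klass.load(std::memory_order_acquire);  // single read
    if (k != ToyKlass::OopArray) {
        return 0;  // still a primitive array: no references to follow
    }
    return obj.length;  // fully cleared oop array
}
```

Code that reads the type twice (once to pick a size, once to pick an iteration strategy) could see two different answers if the switch happens in between, which is the hazard described above.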
Point taken, you are completely right, I was not thinking it through.
However for humongous objects it might be sufficient to just zero
manually in a loop with basically the same safepoint polling loop while
the klass is still NULL (and make sure it is not done again later).
Of course, also making sure that these seemingly empty regions are not
reclaimed during the safepoint somehow in a different way. :)
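
A rough standalone sketch of that idea follows, under stated assumptions: `ToyArray`, `clear_in_segments` and the `poll` callback are hypothetical, the callback merely marks where a safepoint poll would go, and a null `klass` models the "not yet published" state. It is not how G1 or ZGC implement this.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical model of a large array whose klass is still unset (null)
// while its payload is being zeroed.
struct ToyArray {
    std::atomic<const void*> klass{nullptr};
    std::vector<std::uint64_t> payload;
};

// Zero the payload in segments of `segment_words` words and call `poll`
// after each segment (the place where a safepoint check would sit).
// The klass is published with a release store only after the last segment.
// Returns the number of polls performed.
inline std::size_t clear_in_segments(ToyArray& a,
                                     std::size_t segment_words,
                                     const std::function<void()>& poll,
                                     const void* final_klass) {
    assert(segment_words > 0);
    std::size_t polls = 0;
    const std::size_t n = a.payload.size();
    for (std::size_t start = 0; start < n; start += segment_words) {
        const std::size_t end = std::min(n, start + segment_words);
        std::fill(a.payload.begin() + start, a.payload.begin() + end,
                  std::uint64_t{0});
        poll();
        ++polls;
    }
    a.klass.store(final_klass, std::memory_order_release);
    return polls;
}
```

With 8-byte words, `segment_words = 512` corresponds to the 4K segment size mentioned later in this thread. The sketch does not address the harder part noted above, namely keeping the GC from reclaiming a region that still looks empty.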
> For example, when a half zeroed type array in young is
> promoted to old, and then we switch the klass pointer.
In G1 we are probably not so much worried about "large" objects in young
gen - while a 16M max size object takes some time to clear, handling only
the humongous objects would already help a lot, I believe.
Actually another approach could be the GC completing the zeroing in
parallel for young gen objects - at that time it does have all memory
bandwidth for itself. That would at least improve the situation unless
many threads do that at the same time (still, these objects may be 16M in
size).
Or just guaranteeing that such objects stay in survivor "zeroing"
regions during a gc (in case of evac failure, do the work in the pause).
Another option would be delaying refinement for cards in these regions,
if such objects exist after a gc, until the zeroing has completed (which
may not be enough due to memory visibility issues, but I just like that
idea right now :) ).
It is unclear if such large effort makes sense though, and probably
there are better options with a bit more thought :).
> I wouldn't be surprised if CMS has similar problems, but I haven't checked.
At this time I would not spend time on any new feature for CMS that is
not absolutely necessary.
> However, this would probably work fine for Serial and Parallel. On the
> other hand, depending on the performance impact, it's not completely
> obvious that you'd want it there.
> We could perhaps add this code to the shared ObjArrayAllocator, and
> introduce a CollectedHeap::supports_segmented_array_clearing() so that
> GCs can easily opt-in when they are ready to do so.
Not sure. It is probably worth looking into how this would work in the
other collectors in a different CR; I would keep it ZGC local for now.
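
For reference, the opt-in hook suggested in the quoted text could look roughly like the standalone sketch below. Only the method name `supports_segmented_array_clearing()` comes from the proposal; `ToyHeap`, `ToyZHeap` and `choose_clear_step` are hypothetical, and none of this exists in HotSpot as written here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a shared heap interface.
class ToyHeap {
public:
    virtual ~ToyHeap() = default;
    // Collectors opt in by overriding this; the default is "not supported".
    virtual bool supports_segmented_array_clearing() const { return false; }
};

// Hypothetical collector that has opted in.
class ToyZHeap : public ToyHeap {
public:
    bool supports_segmented_array_clearing() const override { return true; }
};

// Shared allocator decision: how many bytes to clear per step.
// Collectors that did not opt in clear the whole array in one go.
inline std::size_t choose_clear_step(const ToyHeap& heap,
                                     std::size_t array_bytes,
                                     std::size_t segment_bytes) {
    assert(segment_bytes > 0);
    if (!heap.supports_segmented_array_clearing() ||
        array_bytes <= segment_bytes) {
        return array_bytes;
    }
    return segment_bytes;
}
```

The default of `false` keeps behavior unchanged for every collector that has not been audited for concurrent klass reads.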
>> For other collectors slightly larger segment sizes might be sufficient
>> too to slightly favor performance.
>> Did you measure the impact on zeroing throughput of this?
> I haven't done any performance measurements of this yet. The current 4K
> segment size was just an educated guess, but it might not be the optimal
More information about the hotspot-gc-dev mailing list