Discussion: improve humongous objects handling for G1

Liang Mao maoliang.ml at alibaba-inc.com
Tue Jan 21 06:25:51 UTC 2020

Hi Thomas,

In fact we saw this issue with 8u. One thing I forgot to mention is that when
CPU usage is very high, nearly 100%, the concurrent mark gets very slow, so
to-space exhaustion happened. BTW, are there any improvements
for this point in JDK 11 or higher versions? I haven't noticed any so far. Increasing
G1ReservePercent could alleviate the problem but doesn't seem to be a complete solution.

Cancelling the concurrent mark cycle in the initial-mark pause seems like a delicate optimization
which can cover the cases where a lot of humongous regions have been reclaimed in
that pause. It can avoid an unnecessary cm cycle and also trigger cm earlier if needed.
We will take this into consideration. Thanks for the great idea:)

If there is a short-lived humongous object array which also references other
short-lived objects, the situation could be worse. If we increase G1HeapRegionSize,
some humongous objects become normal objects, the behavior is more like CMS, and
everything goes fine. I don't think we have to disallow humongous objects behaving
as normal ones. A newly allocated humongous object array can probably reference
objects in the young generation, and scanning the object array via the remembered set
could hardly be better than directly iterating the array during evacuation, because of
possible prefetching. We could have an alternative maximum survivor age for humongous
objects, maybe 5 or 8 at most, otherwise let eager reclaim handle it. A tradeoff can be
made to balance the pause time and the reclamation probability of short-lived objects.
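For reference, here is a minimal sketch of the classification rule behind the G1HeapRegionSize point above: G1 treats an allocation larger than half a heap region as humongous, so raising the region size turns some humongous objects into normal ones. The `isHumongous` helper name is illustrative, not a HotSpot API:

```java
// Sketch of G1's humongous classification: an object larger than half
// a heap region is allocated as a humongous object. The helper name is
// illustrative, not an actual HotSpot API.
public class HumongousCheck {
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    public static void main(String[] args) {
        long tenMb = 10L << 20;
        // With 16m regions a 10 MB array is humongous (threshold 8 MB)...
        System.out.println(isHumongous(tenMb, 16L << 20)); // true
        // ...but with -XX:G1HeapRegionSize=32m it becomes a normal object.
        System.out.println(isHumongous(tenMb, 32L << 20)); // false
    }
}
```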

So the enhanced solution could be:
1. Cancel the concurrent mark if it is not necessary.
2. Increase the reclamation probability of short-lived humongous objects.

An important reason behind this issue is that Java developers easily ask why CMS can
handle the application without a significant CPU usage increase (caused by concurrent mark)
but G1 cannot. Personally I believe G1 can do anything no worse than CMS:)
This proposal aims at the throughput gap compared to CMS. Combined with the barrier
optimization proposed by Man and Google, imho the gap could be reduced noticeably.


From:Thomas Schatzl <thomas.schatzl at oracle.com>
Send Time:2020 Jan. 20 (Mon.) 19:11
To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; Man Cao <manc at google.com>; hotspot-gc-dev <hotspot-gc-dev at openjdk.java.net>
Subject:Re: Discussion: improve humongous objects handling for G1

Hi Liang,

On 19.01.20 08:08, Liang Mao wrote:
> Hi Guys,
> We Alibaba have experienced the same problem as Man introduced.
> Some applications got frequent concurrent mark cycles and high
> cpu usage and even some to-space exhausted failures because of
> large amount of humongous object allocation even with
> G1HeapRegionSize=32m. But those applications worked fine
> with ParNew/CMS. We are working on some enhancements for better

Can you provide logs? (with gc+heap=debug,gc+humongous=debug)

> reclamation of humongous objects. Our first intention is to reduce
> the frequent concurrent cycles and possible to-space exhausted so
> the heap utility or arraylets are not taken into consideration yet.
> Our solution is more like a ParNew/CMS flow and will treat a
> humongous object as young or old.
> 1. Humongous object allocation in mutator will be considered into
> eden size and won't directly trigger concurrent mark cycle. That
> will avoid the possible to-space exhausted while concurrent mark
> is working and humongous allocations are "eating" the free regions.

(I am trying to imagine situations here where this would be a problem 
since I do not have a log)

That helps if G1 is already trying to do a marking cycle if the space is 
tight and already eating into the reserve that has explicitly been set 
aside for this case (G1ReservePercent - did you try increasing that for 
a workaround?). It does make young collections much more frequent than 
necessary otherwise.

Particularly if these humongous regions are eager-reclaimable. In these 
cases the humongous allocations would be "free", while with that policy 
they would cause a young gc.

The other issue, if these humongous allocations cause too many 
concurrent cycles, could be managed by looking into canceling the 
concurrent marking if that concurrent start gc freed lots and lots of 
humongous objects, e.g. getting way below the mark threshold again.

I did not think this through though, of course at some point you do need 
to start the concurrent mark.
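A rough sketch of that cancellation check, under the assumption that the decision is simply "did the pause bring occupancy back well below the marking (IHOP) threshold, with some margin". The class, method, and margin are illustrative, not existing HotSpot code:

```java
// Illustrative sketch (not HotSpot code): after a concurrent-start pause,
// cancel the queued marking cycle if eager reclaim of humongous regions
// dropped occupancy well below the marking threshold again.
public class MarkCancelPolicy {
    // ihopPercent corresponds to -XX:InitiatingHeapOccupancyPercent.
    static boolean shouldCancelMark(long usedAfterPauseBytes,
                                    long heapBytes,
                                    int ihopPercent,
                                    double safetyMargin) {
        long threshold = heapBytes / 100 * ihopPercent;
        // Only cancel if we are below the threshold by a clear margin;
        // otherwise the very next allocations would re-trigger marking.
        return usedAfterPauseBytes < (long) (threshold * (1.0 - safetyMargin));
    }

    public static void main(String[] args) {
        long heap = 8L << 30; // 8 GB heap, 45% IHOP, 20% margin
        // Pause reclaimed lots of humongous regions: 2 GB used -> cancel.
        System.out.println(shouldCancelMark(2L << 30, heap, 45, 0.2)); // true
        // Still close to the threshold: keep the marking cycle.
        System.out.println(shouldCancelMark(3758096384L, heap, 45, 0.2)); // false
    }
}
```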

Some (or most) of that heap pressure might have been caused by the 
internal fragmentation, so allowing allocation into the tail ends would 
very likely decrease that pressure too.
This would likely be the first thing I would be looking into if the logs 
indicate that.

> 2. Enhance the reclamation of short-live humongous object by
> covering object array that current eager reclaim only supports
> primitive type for now. This part looks same to JDK-8048180 and
> JDK-8073288 Thomas mentioned. The evacuation flow will iterate
> the humongous object array as a regular object if the humongous
> object is "young" which can be distinguished by the "age" field
> in markoop. >
> The patch is being tested. We will share it once it proves to
> work fine with our applications. I don't know if any similar
> approach has been already tried and any advices?

The problem with treating humongous reference arrays as young is that 
this heuristic significantly increases the garbage collection time if 
that object survives the collection.
I.e. the collector needs to iterate over all young objects, and while 
you do save the time to copy the object by in-place aging, scanning the 
references tends to take more time than copying.

In that "different regional collector" I referenced in the other email 
exactly this had been implemented with the above issues. That collector 
also had configurable regions down to 64k (well, basically even less, 
but anything below that was just for experimentation, and 64k had been 
very debatable too), so the humongous object problem had been a lot 
larger. It might not be the case with G1's "giant" humongous objects.

Treating them as old like they are now within G1 allows you to be a lot 
more selective about what you take in for garbage collection. Now the 
policy isn't particularly smart (just take humongous objects of a 
particular type with less than a low, fixed threshold of remembered set 
entries), but that could be improved.

I.e. G1 has a measure of how long scanning a remembered set entry 
approximately takes, so that could be made dependent on available time.
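That time-dependent selection could look roughly like this, a sketch under the assumption that a per-remembered-set-entry scan-cost estimate is available; the names are illustrative, not the existing G1 policy code:

```java
import java.util.List;

// Illustrative sketch (not existing G1 code): choose humongous candidates
// for eager reclaim by predicted remembered-set scan cost instead of a
// fixed entry-count threshold, stopping when the time budget is spent.
public class EagerReclaimBudget {
    // remsetEntries: remembered-set entry counts of humongous candidates.
    static int selectCandidates(List<Integer> remsetEntries,
                                double costPerEntryMs,
                                double budgetMs) {
        int selected = 0;
        double spentMs = 0.0;
        for (int entries : remsetEntries) {
            double predictedMs = entries * costPerEntryMs;
            if (spentMs + predictedMs > budgetMs) {
                break; // out of pause-time budget
            }
            spentMs += predictedMs;
            selected++;
        }
        return selected;
    }

    public static void main(String[] args) {
        // At 0.001 ms per entry and a 1 ms budget, the 100- and 400-entry
        // candidates fit (0.5 ms total) but the 800-entry one does not.
        System.out.println(selectCandidates(List.of(100, 400, 800), 0.001, 1.0)); // 2
    }
}
```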

