Proposal: MaxTenuringThreshold and age field size changes
email at nmichael.de
Wed Jun 4 23:26:57 PDT 2008
please see inline.
Y Srinivas Ramakrishna wrote:
> Hi Nick --
> thanks for sharing that experience and for the nice description!
> Looking at the PrintTenuringDistribution for your application
> with the old, 5-age bit JVM run with MTT=24 would probably be
> illuminating. By the way, would you expect the objects surviving
> 24 scavenges (in that original configuration) to live for a pretty long time?
> Do you know if your object age distribution has a long thin tail,
> and, if so, where it falls to 0? If it is the case that most applications
> have a long thin tail and that the population in that tail (age > MTT)
> is large, then NeverTenure is probably a bad idea.
Basically, all our scenarios work like this: Upon an incoming request,
we create some hundred kilobytes of objects. Most of those objects die
after this request has been processed, which usually takes something
between 10ms and 100ms. A few kilobytes of objects remain alive in the
Java heap until the session is terminated, which is signaled through a
further request. The session length may vary from a few seconds to
minutes or even hours, depending on the scenario. There are no
session-related objects dying "in between". So, assuming a fixed session
duration of 60 seconds for all requests, all objects
that are not just temporary and survive the first gc cycle will only die
after 60 seconds. With a gc interval of, let's say, 6 seconds, this
would be 10 cycles. MTT >= 10 would be sufficient to collect all those
objects in the young gen. As you can see, in this example there is no
"thin tail": We have the same amount of objects at each age 1-10.
Let's assume a session duration of 120 seconds. As you can easily see,
with the same gc pattern, all those objects would survive 20 gc cycles.
With a max MTT of 15, they would all tenure into old after 15 cycles.
It's quite obvious that for this scenario, it would be better to set
MTT=1, since we would avoid copying those objects 14 times before all of
them tenure anyway.
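As a rough sketch of the arithmetic in the two examples above (the
function names are made up for illustration, and the model is
deliberately simplified: a constant gc interval, one copy per scavenge,
and an object tenures once it has survived more scavenges than MTT):

```python
import math

def cycles_survived(session_seconds, gc_interval_seconds):
    """Young collections a session-scoped object lives through,
    assuming a fixed session length and a constant gc interval."""
    return math.ceil(session_seconds / gc_interval_seconds)

def copies_per_object(cycles_alive, mtt):
    """Simplified model: one copy per scavenge until the object
    either dies or reaches the tenuring threshold."""
    return min(cycles_alive, mtt)

def tenures(cycles_alive, mtt):
    """True if the object is still alive when it reaches the threshold."""
    return cycles_alive > mtt

GC_INTERVAL_S = 6  # the example interval used above

# (session duration in seconds, MTT) pairs from the text
for session_s, mtt in ((60, 10), (120, 15), (120, 1)):
    alive = cycles_survived(session_s, GC_INTERVAL_S)
    print(session_s, mtt, alive, copies_per_object(alive, mtt), tenures(alive, mtt))
```

In this model the 60 second sessions die in the young gen with MTT=10,
while the 120 second sessions tenure with both MTT=15 and MTT=1; the
only difference is 15 copies per object versus 1.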
Another way would be to use "never tenure" for 120 second sessions. This
would allow us to keep them in the young gen for 20 cycles, provided the
survivor spaces are large enough. But imagine a third scenario with a
session duration of 10 minutes. Such a scenario would definitely
overflow the survivor spaces with the "never tenure" policy.
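To illustrate why the longer session overflows, here is a
back-of-the-envelope estimate. All numbers (session rate, retained
bytes per session, survivor space size) are hypothetical and chosen
only for illustration; they are not measurements from our system. The
estimate only counts session-scoped objects and ignores temporary
objects that happen to survive a scavenge.

```python
KIB = 1024

def live_session_bytes(sessions_per_second, retained_bytes_per_session, session_seconds):
    """Steady-state bytes held by open sessions:
    arrival rate x retained size x session lifetime."""
    return sessions_per_second * retained_bytes_per_session * session_seconds

# Hypothetical values, for illustration only.
SESSIONS_PER_SECOND = 100
RETAINED_PER_SESSION = 4 * KIB      # "a few kilobytes" per session
SURVIVOR_BYTES = 64 * KIB * KIB     # one 64 MiB survivor space

for session_s in (120, 600):
    live = live_session_bytes(SESSIONS_PER_SECOND, RETAINED_PER_SESSION, session_s)
    print(session_s, live, live <= SURVIVOR_BYTES)
```

With these made-up numbers the 120 second sessions fit into the
survivor space, while the 10 minute sessions need several times its
size.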
We need a set of JVM settings that fits all scenarios. Assuming that
scenarios with mixed session durations run simultaneously, our aim is to
collect the objects for most scenarios in the young gen (before any of
them tenure), and accept that the objects of some scenarios (after a
reasonable amount of copying) tenure into old.
The question is where we draw that line... With our original gc
intervals, 24 cycles seemed to be a good trade-off. Now that we can
only set MTT<=15, MTT=15 with a stretched gc interval (obtained by
enlarging the eden) achieves the same.
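A small sketch of how much the interval has to be stretched, assuming
that the allocation rate is constant, so the gc interval grows in
proportion to the eden size. The 6 second interval is just the example
value used earlier, and the function name is made up for illustration.

```python
from fractions import Fraction

def stretch_factor(old_mtt, new_mtt):
    """Factor by which the gc interval (and, at a constant allocation
    rate, the eden size) must grow so that new_mtt scavenges span the
    same wall-clock time as old_mtt scavenges did before."""
    return Fraction(old_mtt, new_mtt)

factor = stretch_factor(24, 15)
print(factor)  # 8/5

old_interval_s = 6  # example interval, not a measured value
new_interval_s = old_interval_s * factor
print(float(new_interval_s))  # 9.6

# Both configurations keep objects in the young gen for the same time span.
assert 15 * new_interval_s == 24 * old_interval_s
```

So under this assumption the eden has to be about 1.6 times its
original size.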
> As you found, the basic rule is always that if a small transient overload
> causes survivor space overflow, that in turn can cause a much longer-term "ringing effect"
> because of "nepotism" (which begets more nepotism, ....,) the effects of
> which can last much longer than the initial transient that set it off.
> And, yes, NeverTenure will lead to overflow in long-tailed distributions
> unless your survivor spaces are large enough to accommodate the tail;
> which is just another way of saying, if the tail does not fit, you will
> have overflow which will cause nepotism and its ill-effects.
Unfortunately, long-tailed distributions are not the only cause of
survivor space overflow! This is what I meant by my sentence "most
surprisingly..." in my previous mail:
I've run a scenario where all objects died after 2 gc cycles. With
MTT=15, they were nicely collected after 2 cycles and the survivor
spaces were only something like 10% full.
The *same* scenario with MTT=24 ("never tenure") filled up the survivor
space to 100%, even causing tenuring of (live or dead, I don't know)
objects into the old gen. It's obvious that 90% of the objects filling
up the survivor spaces must have been dead already, just gc didn't
This doesn't always happen, but often (so it is reproducible). To make
it clearer: The same scenario with the same configuration (MTT=24)
doesn't necessarily fill up the survivor spaces to 100%. I've also had
runs where it only filled 20% of the survivor spaces. That's still
twice as much as with MTT=15 (which means that 50% of what is being
copied around is garbage), but not as bad as filling them up to 100%.
And it gets even better: During a 30 minute run, I've even seen a
change in gc behavior (without any change in the load): For the first 20
minutes, the target survivor space was always 100% full after a
collection. Then, all of a sudden, from the next collection on, it was
only 20% full for the remaining 10 minutes.
Tony has an explanation for this (Tony? You can probably explain this
better than me?). This is one of the main reasons why "never tenure"
works very poorly for us.
> Thanks again for sharing that experience and for the nice
> explanation of your experiments.
> -- ramki
>> So, as a conclusion: Yes, we are missing the 5th age bit. NeverTenure
>> works very badly for us. MTT=15 with our original eden size is too
>> but increasing the eden size allows us to get similar behavior with
>> MTT=15 as with the original configuration (MTT=24 and 5 age bits).
>> I think, Tony's suggestion to limit the configurable MTT to 2^n-1
>> n being the age bits) is a good solution. At least according to my
>> tests, this is much better than activating the "never tenure" policy
>> when the user is not aware of this.
>> I hope some of this may have been helpful for you. I will send some more
>> detailed results and logfiles directly to Tony.