Feedback on G1GC
bengt.rutisson at oracle.com
Tue Jan 12 14:18:14 UTC 2016
On 2015-12-20 23:43, charlie hunt wrote:
> Hi Fabian,
> I’m glad you brought this discussion to hotspot-gc-dev. This is a very
> good place to have this discussion.
> If others on hotspot-gc-dev need a bit more context of the thread of
> discussion, I’d be glad to provide it.
> The gist of the issue is whether G1 should reduce the size of eden
> space when MaxGCPauseMillis is exceeded.
> To pick up where this thread is going …
> If the workload is very reproducible, then is it unreasonable to ask
> for a configuration run that enables ParallelRefProcEnabled with the same
> set of command line options that were used in the first run? How about we
> exercise some good practices here and change one configuration setting
> at a time? And, let’s also ensure we have results that are
> reproducible. We have several unanswered questions between the first
> and second run, i.e. why did ref proc times drop so drastically, is it
> all due to ParallelRefProcEnabled? How could a forced larger Eden
> size allow Ref Proc times to be reduced? Is the workload producing
> repeatable results / behavior?
> Aside from the specifics just mentioned, I think the key thing to
> understand here is the school of thought behind shrinking the size of
> eden when GC pauses exceed MaxGCPauseMillis, and why it is not a good
> idea to grow the size of Eden in such a case. Perhaps one of the long
> time GC engineers would like to join the fun? ;-)
I haven't read the other email thread, but just thought I could answer
this one question quickly.
The reason for shrinking the young gen size when the pause time goal is
exceeded is simply that it is assumed that a larger young gen takes
longer to collect. Shrinking the young gen is then a natural response
when the pause times get too long.
However, this is a very simplistic view of the world. As noted here, it
can sometimes be much better to grow the young gen size instead of
shrinking it. Growing the young gen will make GCs happen less frequently
and may allow more objects to die. And in reality the collection time is
related to the number of live objects, not the young gen size itself.
The problem is that it is very hard to know if it is better to grow or
shrink the young gen in the general case.
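As a rough illustration of the point above, here is a toy model, not G1's
actual pause prediction. It assumes a constant allocation rate, exponentially
distributed object lifetimes, and a pause cost proportional to the live data
that must be copied. All constants and names are hypothetical, and the model
ignores reference processing, remembered set work and promotion.

```java
class YoungGenModel {
    // Hypothetical workload parameters, chosen only for illustration.
    static final double ALLOC_RATE_MB_PER_S = 1000.0; // allocation rate
    static final double MEAN_LIFETIME_S = 0.05;       // mean object lifetime
    static final double COPY_COST_MS_PER_MB = 1.0;    // cost to copy live data

    // Live data (MB) in eden at the moment eden fills up, assuming the
    // fraction of objects still alive at age a is exp(-a / MEAN_LIFETIME_S).
    static double liveMbAtGc(double edenMb) {
        double intervalS = edenMb / ALLOC_RATE_MB_PER_S;
        return ALLOC_RATE_MB_PER_S * MEAN_LIFETIME_S
                * (1.0 - Math.exp(-intervalS / MEAN_LIFETIME_S));
    }

    // Pause time (ms) under the assumption that only live data costs time.
    static double pauseMs(double edenMb) {
        return COPY_COST_MS_PER_MB * liveMbAtGc(edenMb);
    }

    // Fraction of wall-clock time spent in young GC pauses.
    static double gcOverhead(double edenMb) {
        double intervalMs = 1000.0 * edenMb / ALLOC_RATE_MB_PER_S;
        double pause = pauseMs(edenMb);
        return pause / (intervalMs + pause);
    }

    public static void main(String[] args) {
        for (double edenMb : new double[] {100, 300, 1000, 2000}) {
            System.out.printf("eden=%6.0f MB  pause=%5.1f ms  overhead=%5.1f%%%n",
                    edenMb, pauseMs(edenMb), 100.0 * gcOverhead(edenMb));
        }
    }
}
```

Under these assumptions the pause time levels off as eden grows, because the
amount of live data is bounded, while the share of time spent in GC keeps
falling. A workload where most objects survive would behave very differently,
which is why the general case is hard.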
> @Kirk: You mentioned, “reference processing times clearly dominated
> resulting in Eden being shrunk in a feeble attempt to meet the pause
> time goal”. Can you offer some better alternatives that G1 could
> apply adaptively to meet the pause time goal in
> the presence of high reference processing times, and for bonus points,
> could you file those enhancements in JIRA so they can be further
> evaluated and vetted?
>> On Dec 20, 2015, at 1:48 PM, Fabian Lange
>> <fabian.lange at codecentric.de> wrote:
>> Hi Kirk,
>> I know that it is questioned also on the other list, where I will
>> continue to discuss potential better settings, but I can tell you
>> that the workload is really reproducible, as this system measures its
>> data ingress and the rate was close to equal. Data egress was
>> radically different.
>> My main concern here on hotspot-gc-dev is that the defaults produced
>> a bad result. Plus I have the feeling the GC optimizes in the wrong
>> direction (shrinking eden instead of increasing eden).
>> I will come back to this list once we have manually figured out good settings.
>> On Sun, Dec 20, 2015 at 7:38 PM, kirk at kodewerk.com wrote:
>> Hi Fabian,
>> I don’t think the experimentation with your app is over. I don’t
>> think the differences between the two runs can easily be
>> dismissed as the result of changing the values of a few flags. In
>> the first relatively short run, reference processing times
>> clearly dominated resulting in Eden being shrunk in a feeble
>> attempt to meet the pause time goal. I don’t think that the
>> shrinkage in reference processing time can be solely
>> attributed to turning on parallel reference processing. It seems
>> as if something else changed. At any rate, I believe you should
>> relax the minimum Eden size from 25%. I have posted a number of
>> charts which anyone should be able to see @
>>> On Dec 20, 2015, at 1:27 PM, Fabian Lange
>>> <fabian.lange at codecentric.de> wrote:
>>> (originally posted on adoption-discuss)
>>> For a while now I have been recommending and using G1GC for JDK 8.
>>> This week I was looking at an application which should be the
>>> ideal candidate.
>>> It was given 4GB RAM, has a steady memory usage of about 1-2GB
>>> and during its work it generates only garbage. It reads data
>>> from sockets, deserializes it, manipulates it, serializes it and
>>> writes it out to sockets. It is processing 100k to 500k of such
>>> requests per second.
>>> With the default G1 settings the machine was very loaded. The
>>> collection times were pretty long. It even ran out of memory a
>>> few times because the GC could not catch up.
>>> When looking at the logs I was surprised to see extremely small
>>> eden/young sizes. The old gen was really big (like 3.5GB, but
>>> mostly empty) while G1 was churning on 300MB young.
>>> I raised the question on
>>> where Charlie Hunt was so kind to explain the reasons behind the
>>> behaviour. It either did not make sense to me, or I did not
>>> understand the explanation.
>>> What I did is what I always did regardless of the collector: I
>>> increased young space, knowing it contains mostly garbage.
>>> The overall behaviour of the JVM was much improved by that.
>>> I found it irritating that, according to Charlie, the main
>>> reason for the small eden is the Pause Time Limit. Because the GC
>>> was not meeting its goal, it reduced eden, while I observed
>>> better results doing the opposite.
>>> I also enabled -XX:+ParallelRefProcEnabled.
>>> Logs are available from the above discussion, but I can send
>>> them in separate mail if desired.
>>> As far as I can tell the ergonomics are not working for me, and
>>> the changes I need to make are counterintuitive. From other
>>> discussions I learned that quite a few people observed better
>>> overall performance after raising the pause time restriction.
>>> Is there public information as to why the current defaults are as
>>> they are? How would feedback on these defaults work?
>>> Best regards,