GC and HeapSize questions

Clemens Eisserer linuxhippy at gmail.com
Fri Sep 28 12:16:43 UTC 2007

Hi Jerry,

> With java we have the benefit of the garbage collector.  And there is
> some overhead that the GC has when you have a very large heap that is
> close to fully allocated.  The question is how much is this overhead and
> would it be worth the extra effort of coding some caching into your java
> application.  Or would it be better to just allocate a really large heap
> and let java and the operating system manage the paging for you.  My
> guess is that it would be hard for the developer to beat the OS and Java
> GC so it would be better to use a large amount of heap and let java gc
> take care of it for you, especially now that we have all of this cool
> generational stuff in the GC.
Well, the "overhead" a GC causes is really hard to classify, because
a lot of this overhead you would also see in a C program that uses
malloc/free, and in some areas a GC can even improve performance
(e.g. better cache locality). Furthermore, the amount of overhead
always depends on the application that is running and on its memory
requirements.

I don't understand what you mean by "caching" :-/

However, there are problems with large heaps that are partly swapped
out. A full GC touches a lot of memory, much of which has usually been
paged out, which means long pauses while waiting for data from I/O.
Running the concurrent mark & sweep collector may help with this.

> The below is a very primitive test program that tries to measure the
> overhead that large heaps add to the GC.  On a windows laptop with a 1.5
> gig heap it appeared to add around 30% overhead to the GC.  Does this
> sound right?  Are there things that can be done to tune the GC to make
> it behave better in these cases?  And is there any work being done to
> handle very large memory based java applications?
Sorry, but your benchmark is seriously flawed. Of course, if all your
work consists of allocating objects and adding them to some lists, you
see a lot of GC overhead, because allocating objects is the only thing
you do ... so if you stress the GC, and only the GC, a lot of time will
be spent in it ;)
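
To put a number on that, here is a rough sketch (not your benchmark,
and not a proper one either) that compares wall-clock time with the GC
time the JVM itself reports through the standard
java.lang.management.GarbageCollectorMXBean API. The class name, the
method names, the 1 KB object size and the loop counts are all made up
for illustration. Note that for a concurrent collector the reported
time is not the same thing as pause time, so treat the result as a
hint only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class and method names are illustrative only.
final class GcOverheadSketch {

    /** Sums the accumulated collection time (ms) over all collectors the JVM reports. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 means the collector does not report a time
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /**
     * Allocation-only workload: creates 1 KB arrays (size chosen arbitrarily)
     * and keeps a bounded number of them reachable.
     * Returns how many arrays are still referenced at the end.
     */
    public static int allocateOnly(int count) {
        List<byte[]> live = new ArrayList<byte[]>();
        for (int i = 0; i < count; i++) {
            live.add(new byte[1024]);
            if (live.size() > 10000) {
                live.clear(); // drop the references so the arrays become garbage
            }
        }
        return live.size();
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        allocateOnly(2000000); // does nothing except stress the allocator and the GC

        long wallMillis = (System.nanoTime() - start) / 1000000L;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.println("wall time: " + wallMillis
                + " ms, reported GC time: " + gcMillis + " ms");
    }
}
```

If you replace the body of allocateOnly with something closer to what
the real application does per object, the GC share of the total
should tell you far more than the allocation-only number.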

So, after all, what is the problem the BPEL team experiences? Do they
experience long pauses at full GC, slow allocation, or paging?
What does running with GC logging turned on say?

Good luck, lg Clemens

More information about the hotspot-gc-dev mailing list