Breadth-first vs. Depth-first
martinrb at google.com
Mon Jun 9 10:16:25 PDT 2008
Thanks for the reply.
I continue to see parallels between code hotspots and data hotspots.
In both cases there are few opportunities to rectify previous bad
decisions. There is no way to declare that the initialization phase of
the application is over, and the "money phase" is starting.
Identifying code hotspots pays off better, because code inlining
may give rise to further optimizations, unlike data colocation.
I imagine that data colocation will increase in value over time,
with memory access becoming a more important part of
I wonder why they y'all never seem to use a technique from the
AOT compiler world - provide a profiling mode to allow users
to run the app in for a "typical" run, then use that profiling
information in future runs.
Martin (definitely *not* a hotspot engineer)
On Mon, Jun 9, 2008 at 8:51 AM, Tony Printezis <tony.printezis at sun.com> wrote:
> I'm personally skeptical on whether we should pursue this further. There are
> a couple of reasons why.
> Such profiling could be expensive and I'm not convinced that we will be able
> to reclaim back all the extra cost (the benefit we got from depth-first was
> very dependent on the architecture; some architectures got a lot, some got
> peanuts; the latter might be penalized, instead of benefit, due to the extra
> Additionally, the benefit from depth-first really depends on the GC used.
> For the GCs in HotSpot, the only opportunity we have to re-arrange objects
> is when new objects are copied from the eden to the survivor spaces and from
> the survivor spaces to the old generation. After that, you're stuck with the
> order in which objects moved to the old generation, as the old generation
> typically doesn't re-order objects (UseParallelGC and UseParallelOldGC do
> sliding compaction and they maintain object order, CMS doesn't typically do
> compaction; in fact, due to its in-place de-allocation policy, CMS cannot
> even guarantee that two objects that are allocated one after the other, will
> actually be located in the same order in the old gen).
> As an aside, Garbage-First:
> might have the edge over the other GCs in that respect, as it does copying
> (not sliding) GC in the old generation too and it might have more
> opportunities to re-order objects.
> Now, think about the following (very realistic!) scenario. An application
> has a non-trivial initialization phase and a lot of objects created during
> it will survive the lifetime of that application. After the initialization
> phase, the application goes into a steady-state phase (this is the "request
> serving" phase). Now, you would think that the initialization and
> steady-state phases will have quite different access patterns (you see where
> this is going, aren't you?). So, the objects created during the
> initialization phase will be copied according to the initialization access
> pattern, which might or might not be what you want during the steady-state
> phase. I think this is an example that shows that, in some (many?) cases,
> when the decision would be made on which objects to cluster together, it
> would be made prematurely.
> Now, do I think depth-first was a bad idea? Nope. I think it was a good
> idea, not because depth-first is really good, but instead because
> breadth-first was pathological in many many cases.
> Regarding, the GC times vs. mutator speed trade-off, I was actually
> expecting depth-first to have exactly this effect: slower GCs, but faster
> mutator threads. Interestingly, in most tests I ran, GCs didn't slow down
> much and, in some cases, they actually speeded up. I think this could have
> been due to the new order also benefiting the GC too, as well as the
> mutators threads.
> My two cents,
> Tony, HS GC Group
> Martin Buchholz wrote:
>> JavaOne 2008 technical session PDFs are now available
>> I enjoyed the discussion of breadh-first vs. depth-first GC copying
>> in the JavaOne talk
>> I'd like to suggest that this is a fruitful avenue for investigation,
>> if hotspot engineers go further than simply switching whole-hog
>> to depth-first.
>> Profiling can find reference fields that are rarely dereferenced,
>> and also ones that are likely to be dereferenced soon after the object
>> itself is accessed.
>> The former can be sequestered by the next copying GC into
>> an unloved-object ghetto, while the latter can be stored contiguously
>> with its parent object. Tricky to implement, and will certainly
>> make GC slower, but might make the runtime of the mutator win big.
>> And should be easier to implement than true object inlining.
>> This ought to be very similar to the moving of dead or cold code
>> fragments inside a method away from the hot code, that hotspot
>> already does.
> | Tony Printezis, Staff Engineer | Sun Microsystems Inc. |
> | | MS BUR02-311 |
> | e-mail: tony.printezis at sun.com | 35 Network Drive |
> | office: +1 781 442 0998 (x20998) | Burlington, MA01803-0902, USA |
> e-mail client: Thunderbird (Solaris)
More information about the hotspot-gc-dev