Parallel GC and array object layout: way off the base and laid out in reverse?
aleksey.shipilev at oracle.com
Wed Sep 4 07:05:16 PDT 2013
On 09/04/2013 05:49 PM, Thomas Schatzl wrote:
> Imo it's not clear whether there is a big difference, as future
> access order would be important here. Preferential access may go in
> either direction or completely independent of the array (if the
> program accesses lots of unrelated objects for each array element
> anyway). In this particular case, modern hw prefetchers also work
> well in the reverse direction.
I'd like to fix this particular parallel GC behavior because:
a) Depending on HW, you may or may not get the same performance
walking memory backwards; in particular, think about non-x86 embedded
scenarios where you don't have the luxury of advanced memory prefetchers;
b) Even if you *do* have a good memory prefetcher at your
disposal, accessing the first element will entail two memory accesses,
because the first element is rather far off the base; keeping the first
element closer to the base may have the effect of having the first element
right there on the same cache line;
c) The Parallel GC layout is inconsistent with the layouts other GCs
produce, which can cause surprising performance differences vs. other
collectors; I don't like surprising behaviors, and think we should
minimize them where possible.
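
To make point (b) concrete, here is a minimal sketch of the cache-line arithmetic. The 64-byte line size and the addresses are assumptions picked for illustration, and `sameCacheLine` is a hypothetical helper, not anything in HotSpot:

```java
public class CacheLineSketch {
    // Assumed line size; real hardware varies.
    static final long LINE_SIZE = 64;

    /** True if both (non-negative) addresses fall on the same cache line. */
    static boolean sameCacheLine(long a, long b) {
        return a / LINE_SIZE == b / LINE_SIZE;
    }

    public static void main(String[] args) {
        long arrayBase = 0x1000;                 // hypothetical address of the array
        long nearFirstElem = arrayBase + 48;     // first element copied right after a small array
        long farFirstElem = arrayBase + 0x1000;  // first element copied far off the base

        System.out.println(sameCacheLine(arrayBase, nearFirstElem)); // true
        System.out.println(sameCacheLine(arrayBase, farFirstElem));  // false
    }
}
```

In the first case, touching the array has already pulled in the line holding the first element; in the second, it takes a separate access.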
All in all, this looks like a simple implementation issue. Are the
objects recorded on a stack somewhere, so they get LIFO-ed? If so, this seems easy to fix.
Do you want me to file the RFE?
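
To illustrate the LIFO guess, here is a toy model, not the actual Parallel GC code. It assumes the collector discovers the array's elements in index order and puts them on a pending list, then copies them in whatever order the list hands them back; `copyOrder` and the integer stand-ins for objects are made up for the example:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class LifoLayoutSketch {
    /**
     * Returns the order in which the elements of an n-element reference
     * array would be copied (and so laid out) by a toy collector.
     * Element i is represented by the integer i.
     */
    static List<Integer> copyOrder(int n, boolean lifo) {
        Deque<Integer> pending = new ArrayDeque<>();
        // Scanning the array discovers elements in index order 0..n-1.
        for (int i = 0; i < n; i++) {
            pending.addLast(i);
        }
        List<Integer> layout = new ArrayList<>();
        while (!pending.isEmpty()) {
            // A stack hands back the most recently discovered element first;
            // a queue hands back the oldest one first.
            layout.add(lifo ? pending.removeLast() : pending.removeFirst());
        }
        return layout;
    }

    public static void main(String[] args) {
        System.out.println("LIFO: " + copyOrder(4, true));   // LIFO: [3, 2, 1, 0]
        System.out.println("FIFO: " + copyOrder(4, false));  // FIFO: [0, 1, 2, 3]
    }
}
```

With the stack, the last element is copied first and the first element lands farthest away, which matches the reversed layout in the subject line; with the queue, the elements come out in index order.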
> At the moment, access information is not gathered anywhere in the VM afaik.
> Even if the information were available and somehow used it is not clear
> whether the effort spent on gathering and applying this information
> amortizes itself later.
I'm not talking about dynamic layout policies here. I only want the
static one to be as coherent and intuitive as possible.
> Maybe there are good studies on current hardware on realistic loads
> about that somewhere?
I haven't come across any production-grade dynamic layout schemes.