Review Request: UseNUMAInterleaving

Igor Veresov igor.veresov at oracle.com
Wed Aug 17 18:56:25 PDT 2011


 Tom, I've tried to repeat your experiments and here's what I've got: 

SPECjbb2005 on Linux on a 4-socket Nehalem machine, 25G heap, 15G young gen.
I did 8 runs of 80 warehouses on 80 hardware threads and took the peak result.

* base  481280
* usenuma  661524 (+37%)
* numactl -i all 551175 (+14%)
* usenuma and numa_local to numa_global hack 539724 (+12%)

So, I'd say the NUMA-aware allocator gives roughly +20-23% on top of interleaving with this benchmark (661524 vs. 551175 and 539724).
But either way, interleaving is substantially better than the base case.
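
For reference, the relative gains can be recomputed directly from the peak scores listed above. This is only a minimal Python sketch: the helper name gain_pct and the dictionary keys are illustrative and not part of any benchmark tooling. It reports ratios of scores, which is one way to read "on top of interleaving"; subtracting the percentages relative to base gives slightly different figures.

```python
# Peak SPECjbb2005 scores copied from the list above.
scores = {
    "base": 481280,
    "usenuma": 661524,
    "numactl -i all": 551175,
    "usenuma + local->global hack": 539724,
}


def gain_pct(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Each configuration compared against the base run.
for name, score in scores.items():
    print(f"{name:30s} {gain_pct(score, scores['base']):+6.1f}% vs. base")

# The NUMA-aware allocator compared against the two interleaving-only runs.
for name in ("numactl -i all", "usenuma + local->global hack"):
    print(f"usenuma vs. {name}: {gain_pct(scores['usenuma'], scores[name]):+.1f}%")
```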

igor

On Wednesday, August 17, 2011 at 4:50 PM, Deneau, Tom wrote:

> Igor --
> 
> Comments inline below...
> 
> > -----Original Message-----
> > From: Igor Veresov [mailto:igor.veresov at oracle.com]
> > Sent: Wednesday, August 17, 2011 6:38 PM
> > To: Deneau, Tom
> > Cc: hotspot-gc-dev at openjdk.java.net
> > Subject: Re: Review Request: UseNUMAInterleaving
> > 
> > On Tuesday, August 16, 2011 at 1:14 PM, Deneau, Tom wrote:
> > > Igor --
> > > 
> > > I am back from vacation, starting to address your comments...
> > > Regarding your comment #1 below.
> > > 
> > > You mention "future planned NUMA-aware implementations of GCs". How
> > > do these future planned NUMA-aware implementations of GCs differ from
> > > today's NUMA-aware GCs? My understanding of the current GCs' use of
> > > NUMA is that they support numa_global (interleaved) and numa_local
> > > (memory pinned to one numa node).
> > > 
> > > In the currently released JVMs on Windows OSes, neither numa_local nor
> > > numa_global is implemented. The implementation I proposed in the
> > > patch maps both numa_local and numa_global requests to numa_global (on
> > > Windows). The reasons for this were:
> > > 
> > >  * it was very difficult (if not impossible) to implement the JVM's
> > >  current numa_local semantics on Windows
> > It's hard to realize numa_local semantics if you want to minimize the
> > number of memory segments per lgroup. If you're prepared (like in your
> > patch) to have hundreds of thousands of segments, this is not a problem
> > and it's quite easily implementable. The only problem there would be that
> > such a huge number of segments will penalize page fault handling a lot.
> > > * in the benchmarks we measured, the extra performance that was
> > >  left on the table by doing only numa_global and not doing
> > >  numa_local was only a few percent.
> > Hm, I have trouble believing that. How did you get such results? What
> > were the experiments?
> 
> As I recall, I took the Linux implementation of UseNUMA and forced all the numa_make_local
> calls to just call numa_make_global, and then measured the difference between this and regular UseNUMA
> on jbb2005.
> 
> > 
> > > Are you saying that in the future numa_local will be supported on
> > > Windows, and that even then it might still be advantageous to have a
> > > flag (UseNUMAInterleaving) which instead maps all the regions to
> > > numa_global? Should this flag be available on all OSes?
> > Basically yes. And like Ramki said it would be nice to support that on
> > other OSes, so that we could at least get interleaving for the collectors
> > that do not explicitly support NUMA. I guess I didn't do that before
> > because the functionality is equivalent to just saying for example on
> > Linux "numactl -i all java <flags>", but since you can't do that on
> > Windows (as far as I can see) we could support this flag on unixes as
> > well. Which is fairly easy to do, you just have to call
> > os::numa_make_global() for a freshly reserved region.
> 
> Ah, I had originally thought that this could also be done by just mapping
> numa_make_local to numa_make_global if the UseNUMAInterleaving flag is set.
> But I think I see your point, that you would also want the interleaving when
> you're using a non-numa-aware collector.
> 
> 
> > igor
> > > -- Tom
> > > 
> > > > -----Original Message-----
> > > > From: Igor Veresov [mailto:igor.veresov at oracle.com]
> > > > Sent: Monday, August 08, 2011 1:43 PM
> > > > > To: hotspot-gc-dev at openjdk.java.net; Deneau, Tom
> > > > Subject: Re: Review Request: UseNUMAInterleaving
> > > > 
> > > > Hi, Tom!
> > > > 
> > > > Sorry it took me so long to get to that.
> > > > 
> > > > > 1. I don't think the new version of flag usage is prudent. The reason I
> > > > > proposed to introduce a new flag for interleaving is that it would make
> > > > > life easier in the future when proper NUMA-aware implementations of
> > > > > GCs are added (G1 would be the most probable candidate). I would propose
> > > > > to still have the UseNUMAInterleaving flag.
> > > > 
> > > > The usage would be as follows:
> > > > > - If UseNUMA is specified on Windows, that would turn on UseNUMAInterleaving
> > > > > (for the time being; that behavior would change in the future).
> > > > > - If UseNUMAInterleaving is specified on the command line, you just do
> > > > > the interleaving. If you don't add this flag now, you'll have to do that
> > > > > anyway as soon as NUMA-aware GCs start supporting Windows.
> > > > 
> > > > 
> > > > igor
> > > > 
> > > > 
> > > > 
> > > > On 5/26/11 4:37 PM, Deneau, Tom wrote:
> > > > > > I have incorporated the change suggested by Paul Hohensee to just use
> > > > > > the existing UseNUMA flag rather than introduce a new flag. Please let me
> > > > > > know when you think this will be able to be checked in...
> > > > > 
> > > > > The new webrev is at
> > > > > http://cr.openjdk.java.net/~tdeneau/UseNUMAInterleaving/webrev.02/
> > > > > 
> > > > > -- Tom Deneau, AMD



