RFR(XL): 8220310: Implementation: NUMA-Aware Memory Allocation for G1, Mutator (1/3)

Stefan Johansson stefan.johansson at oracle.com
Fri Oct 4 12:23:40 UTC 2019

Hi Sangheon,

First of all, thanks for this updated version incorporating a lot of our 
comments. I think we are getting closer to the goal, but I still have 
some more comments :)

On 2019-10-01 18:43, sangheon.kim at oracle.com wrote:
> Hi Kim and others,
> This webrev.2 is simplified a bit more after changing the 'heap 
> expansion' approach.
> Previously the heap could expand with a preferred numa id, which means 
> contiguous heap regions with the same numa id could exist, but the 
> current version assumes evenly split heap regions. I.e. on a 4 numa 
> node system, heap regions will be 012301230123, so if we know the 
> address or heap region index, we know the preferred numa id.
> Much of the code supporting the previous style of expansion was removed.
> ...
> webrev:
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2/
> http://cr.openjdk.java.net/~sangheki/8220310/webrev.2.inc

   31 #include "gc/g1/g1NUMA.hpp"
I don't see why this include is needed, but you might want to include 
gc/g1/g1MemoryNodeManager.hpp instead.

1518   _mem_node_mgr(G1MemoryNodeManager::create()),

I saw your response to Kim regarding G1Allocator needing to be 
initialized and I get that, but have you looked at moving the creation 
of G1Allocator to initialize() as well? I think its first use is 
actually below:
1802   _mem_node_mgr->set_page_size(page_size);
1851   _allocator->init_mutator_alloc_regions();

I might be missing some other place where it gets called, but I think it 
should be safe to create both the node manager and the allocator early 
in initialize().

28 #include "gc/g1/g1MemoryNodeManager.hpp"

Remove this include.

326                range(0, 100)

Remove the backslash and add back the removed line to leave the file 
unchanged.

  142   if (hr != NULL) {
  143     assert(hr->next() == NULL, "Single region should not have next");
  144     assert(is_available(hr->hrm_index()), "Must be committed");
  146     verify_actual_node_index(hr->bottom(), hr->node_index());
  147   }

I don't think this is a good place to do the verification; we allocate 
the free region while holding a lock and I think we should avoid doing a 
system call there. I would rather see this done during a safepoint, 
using a closure that iterates the heap and verifies all regions.

I also think it would be nice to have two levels of output: one line 
for each region on trace level, and a summary on debug level, 
something like:
NUMA Node 1: expected=25, actual=23
NUMA Node 2: expected=25, actual=27

What do you (and others) think about that?
  216 static void print_node_id_of_regions(uint start, uint num_regions){
  217   LogTarget(Trace, gc, heap, numa) lt;

I understand that it might make the test a bit more complicated, but 
have you thought about instead adding the node index to the heap 
printing done when <gc, heap, region> is enabled on trace level?
  235 static void set_heapregion_node_index(HeapRegion* hr) {

I don't think we should special-case when AlwaysPreTouch is on; instead 
always just call hr->set_node_index(preferred_index) directly in 
make_regions_available. The reason is that I think it will make the NUMA 
support harder to understand and explain, and it can potentially also 
hide problems with a system's configuration. It might also actually be 
worse than using the preferred id, because the OS might decide to move 
the pages back to the preferred node right after we checked this (not 
sure it will happen, but in theory).

Another problem with this code is the call to:
verify_actual_node_index(hr->bottom(), node_index)

This function will only return the "actual" node index if logging for 
<gc, heap, numa, verification> is enabled on debug level.

  346  bool HeapRegionManager::is_on_preferred_index(uint region_index, 
uint preferred_node_index) {
  347    uint region_node_index = 
  349   return region_node_index == preferred_node_index ||
  350          preferred_node_index == G1MemoryNodeManager::AnyNodeIndex;

I guess adding the AnyNodeIndex case here is because in this patch 
nobody is expanding on a preferred node, right? To me this is just 
another argument for not changing the expand code at all in this patch. 
I know I suggested adding expand_on_preferred_node(), but I should have 
been clearer about when I think we should add it.

   56   // Returns memory node ids
   57   virtual const int* node_ids() const;

Doesn't seem to be used, remove.

  67   LINUX_ONLY(if (UseNUMA) {
  79     delete numa;
  80   })

A bit confusing with a multi-line LINUX_ONLY, I would prefer to hide 
this in a private helper, something like:
   if (UseNUMA) {

   if (_inst == NULL) {
     _inst = new G1MemoryNodeManager();

Not really happy about this either, but we can look at simplifying the 
NUMA initialization as a follow up.

   87   // Returns numa id of the given numa index.
   88   inline int numa_id_of_index(uint numa_index) const;

Currently unused, either remove or make use of it when calling 
   94   // Returns current active numa ids.
   95   const int* numa_ids() const { return _numa_ids; }

Only used by memory manager above, which in turn is unused, remove.

   55 // Request the given memory to locate on preferred node.
   56 // There are 2 things to consider.
   57 // First, size comparison for G1HeapRegionSize and page size.
   62 // Examples of 4 numa ids with non-preferred numa id.

What do you think about this instead:
// Request to spread the given memory evenly across the available NUMA
// nodes. Which node to request for a given address is given by the
// region size and the page size. Below are two examples:

I would also like a "NUMA node" row for each example showing which numa 
node the pages and regions end up on.


> Testing: hs-tier1 ~ 5 +-UseNUMA
> Thanks,
> Sangheon

More information about the hotspot-runtime-dev mailing list