RFR: 8243961: ForceNUMA and only one available NUMA node fails assertion on Windows

Thomas Schatzl thomas.schatzl at oracle.com
Mon May 4 09:43:59 UTC 2020


Hi,

On 02.05.20 11:59, Kim Barrett wrote:
>> On Apr 30, 2020, at 6:24 AM, Thomas Schatzl
>> <thomas.schatzl at oracle.com> wrote:
>> 
>> Hi,
>> 
>> On 29.04.20 14:28, Kim Barrett wrote:
>>> Please review this Windows-specific change to the initialization
>>> of UseNUMAInterleaving. […] CR: 
>>> https://bugs.openjdk.java.net/browse/JDK-8243961 Webrev: 
>>> https://cr.openjdk.java.net/~kbarrett/8243961/open.00/ Testing: 
>>> mach5 tier1-3 mach5 testing on Windows with explict +UseG1GC
>>> +UseNUMA +ForceNUMA and each of
>>> +/-/(default)UseNUMAInterleaving. TestUseNUMAInterleaving fails
>>> for some of those test configurations on non-NUMA hardware,
>>> because the test expects UseNUMAInterleaving to be enabled if
>>> UseNUMA is enabled.  As the test failure can only happen with
>>> +ForceNUMA, that configuration doesn't seem to otherwise lead to 
>>> other problems, +ForceNUMA doesn't seem to be a tested
>>> configuration, and ForceNUMA is going away (JDK-8243628), we'll
>>> not worry about that failure.
>> 
>> Change is okay, but what do you think about always disabling
>> UseNUMA regardless of ForceNUMA on Windows instead?
>> 
>> There does not seem to be a code path in the windows OS files that
>> acts on UseNUMA. For NUMA related methods it returns default
>> values, so any path in the GCs will just do nothing or act on
>> useless default values from the os layer and do weird things.
>> 
>> UseNUMAInterleaving is completely separate from UseNUMA
>> functionality anyway (although "UseNUMA" is misleading at least to
>> me, but that's another discussion).
>> 
>> There is the problem then that probably many people enable UseNUMA
>> to get UseNUMAInterleaving, but that could be handled in a windows
>> specific way too. I.e. when -XX:+UseNUMA and -XX:+/-ForceNUMA is
>> set, disable UseNUMA and enable UseNUMAInterleaving.
>> 
>> That sounds much easier to understand to me than the current
>> change.
>> 
>> Some collectors (Z, Shenandoah) will set UseNUMA, but this
>> evaluation is before os::init_2 so above idea would do the right
>> thing as neither have any special UseNUMA path. I.e. both actually
>> seem to want UseNUMAInterleaving. Shenandoah explicitly says so, so
>> actually changing them to enable UseNUMAInterleaving instead of
>> UseNUMA would be good too imo.
>> 
>> On non-Windows there will be no difference to now.
>> 
>> Thomas
> 
> [Added ppc-aix-port-dev, as there is a small AIX change here.]
> 
[...]
> What I've ended up doing is moving the conditional enabling of 
> UseNUMAInterleaving into the platform-specific code, where there's 
> enough information to consistently get it right. I've also made all 
> the os variants be explicit about UseNUMA and UseNUMAInterleaving; 
> those that have no support for either now unconditionally set them 
> false.  (This happens to keep TestUseNUMAInterleaving working on
> those platforms.)
> 
> New webrev: https://cr.openjdk.java.net/~kbarrett/8243961/open.01/
> 
> (No incremental; the code change to os_windows.cpp is the same, but 
> the rationale and commentary are entirely different.)

os_linux.cpp: s/explicilty/explictly (pre-existing, may ignore)

There is an inconsistency in the use of FLAG_SET_ERGO vs. directly 
setting flag values when the implementation does not support NUMA. E.g. 
on bsd FLAG_SET_ERGO is used to disable UseNUMA*, on Windows it is 
directly set to false when not setting ForceNUMA.

Since ForceNUMA is "soon" going away it's not a big issue for me though.

> 
> Testing: mach5 tier1-3, normally, with -XX:+UseNUMA -XX:+ForceNUMA
> added.
> 
> TestUseNUMAInterleaving still fails on Windows with +UseNUMA 
> +ForceNUMA on a single-node machine, because +UseNUMAInterleaving
> gets turned off, which is not what the test expects.

One option could be to add a @requires vm.opt.ForceNUMA != true to avoid 
unnecessary failures.

> 
> Manually verified that on Linux -XX:+UseNUMAInterleaving alone gets 
> overridden off when libnuma_init is patched to fail.
> 
> 

Thanks,
   Thomas


More information about the hotspot-gc-dev mailing list