RFR: JDK-8216528: test/jdk/java/rmi/transport/runtimeThreadInheritanceLeak/RuntimeThreadInheritanceLeak.java failing with Xcomp

Roger Riggs Roger.Riggs at oracle.com
Mon Jan 28 15:39:00 UTC 2019


The simplest fix for this failing test is to add a call to 
reachabilityFence to prevent
the loader and loaderRef from going out of scope early.  It maintains 
getting debug
output on a failure.

Offlist, Jie and I explored some alternate ways to write the test and 
settled on this one.

Please review:

diff --git 
@@ -113,11 +113,13 @@ public class RuntimeThreadInheritanceLea
              Reference dequeued = refQueue.remove(TIMEOUT);
              if (dequeued == null) {
-                    "TEST FAILED: loader not deteced weakly reachable");
+                    "TEST FAILED: loader not detected weakly reachable");
                  throw new RuntimeException(
                      "TEST FAILED: loader not detected weakly reachable");
+            Reference.reachabilityFence(loader);
+            Reference.reachabilityFence(loaderRef);

                  "TEST PASSED: loader detected weakly reachable");

Thanks, Roger

On 01/11/2019 07:25 PM, Jie Fu wrote:
> Thanks David and Roger.
> On 2019年01月12日 06:52, David Holmes wrote:
>> Hi Roger,
>> On 12/01/2019 2:22 am, Roger Riggs wrote:
>>> Hi,
>>> The proposed patch changes the test in a way that is unintended.
>>> Adding the infinite loop of gc() and sleep, will change the timeout 
>>> behavior
>>> from the existing timeout of TIMEOUT to the jtreg default timeout of 
>>> the whole test.
>> Partially true. If the new loop gets stuck then yes the jtreg default 
>> timeout will apply - I don't see that is necessarily a bad thing. The 
>> existing timeout only applies to the refQueue.remove operation 
>> itself, you don't know how much time was spent before you got there, 
>> nor how much will be spent after in the dumpThreads() calls - so the 
>> jtreg timeout can still come into affect.
>>> Further, it renders the check at lines 114-120 irrelevant since 
>>> loaderRef.get()
>>> will have returned null and the ref will have been enqueued by then.
>> I wouldn't say irrelevant as it double-checks the interaction between 
>> the ref.get() and the queue.remove() - the result of one should imply 
>> the result of the other, but if enqueuing had a bug ....
>>> While it is true that calling gc() only once is unreliable, a better 
>>> fix is to
>>> put the code from 108-120 in a loop with a fixed number of durations
>> That would also work - say 5 loops and reduce TIMEOUT to 4000.
>>> and add Reachability.reachabilityFence(loaderRef) to ensure the ref 
>>> is not ignored.
>> Adding ReachabilityFence, alone, may solve the observed problem given 
>> one gc() seems to be working in practice (and because we don't 
>> actually have the leaked loaders anymore because those threads 
>> (sun.misc.GC threads) don't exist anymore).
>> Cheers,
>> David
>>> Regards, Roger

More information about the core-libs-dev mailing list