RFR: 8055232 (ref) Exceptions while processing Reference pending list

Peter Levart peter.levart at gmail.com
Fri Sep 19 07:35:00 UTC 2014

On 09/19/2014 09:06 AM, David Holmes wrote:
> Hi Peter,
> Two comments:
> 1. Doing the Object.wait() violates the thou-shalt-not-allocate rule, 
> due to the potential for creating the InterruptedException. But there 
> is no way to avoid that.

It depends on when the InterruptedException is allocated. While 
Object.wait()-ing, the lock is not held. If, while waiting, the 
interruption is detected, it could be that InterruptedException is 
allocated before the lock is attempted to be re-gained. This can easily 
be verified - stay tuned...

> 2. I don't see how the instanceof can still result in OOME if we have 
> pre-loaded and initialized the relevant classes.

I don't see either. So you're more for a variant that we move instanceof 
check outside the synchronized block to where it was before and wait for 
stress tests to show if we're right? This might be a cleaner way to 
final solution...

> So I'm not sure what it is that needs fixing now ?? Note that if the 
> OOME is thrown while holding the lock it means we haven't deadlocked 
> with the GC.

This might be an indication that there's not any allocation taking place 
in instanceof check any more (wait is different, since it's releasing 
lock temporarily). If it was taking place, the stress tests that 
detected OOME before, would detect deadlock now, right?

I understand that this is more of a clean-up attempt than fixing any 
real issue.

Regards, Peter

> David
> On 19/09/2014 4:43 PM, Peter Levart wrote:
>> Hi,
>> This story has a long tail. It started with:
>>      https://bugs.openjdk.java.net/browse/JDK-7038914
>> Some stress tests triggered OOME in ReferenceHandler thread which would
>> die. The first attempt at fixing this was the following discussion:
>> https://www.mail-archive.com/core-libs-dev%40openjdk.java.net/msg16250.html 
>> Which resulted in patch:
>>      http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/0b8dab7fec54
>> This assumed that ReferenceHandler thread doing Object.wait() could be
>> interrupted (by stress test - normal application don't do that) and
>> failed to allocate InterruptedException object. A jtreg test was
>> designed which triggered that situation and a fix would catch OOME and
>> ignore it.
>> But the stress tests (the same or some other, I don't know) apparently
>> were not entirely happy with this fix. The following discussion
>> describes this:
>> https://www.mail-archive.com/core-libs-dev%40openjdk.java.net/msg23596.html 
>> The other "unprotected" point at which OOME could be thrown and was
>> later confirmed by debugging is (r instanceof Cleaner) test. The
>> assumption was that it could trigger Cleaner class loading at 1st
>> execution which would cause OOME to be thrown. The fix that finally
>> silenced stress tests was the following:
>>      http://hg.openjdk.java.net/jdk9/dev/jdk/rev/d04102f69d46
>> This part of code (the j.l.r.Reference class and its members) has
>> undergone further changes afterwards for fixing another bug:
>>      https://bugs.openjdk.java.net/browse/JDK-6857566
>> With following patch:
>>      http://hg.openjdk.java.net/jdk9/jdk9/jdk/rev/9934d34ed3c0
>> But this did not change the code paths - just factored-out the content
>> of the loop into a separate method that could be used from outside.
>> All well until kim.barrett at oracle.com noticed that the 2nd fix
>> introduced a potentially illegal situation. There is a
>> j.l.r.Reference.Lock inner class and a singleton object assigned to
>> static field in j.l.Reference class with the following notice:
>>      /* Object used to synchronize with the garbage collector. The
>> collector
>>       * must acquire this lock at the beginning of each collection
>> cycle.  It is
>>       * therefore critical that any code holding this lock complete as
>> quickly
>>       * as possible, allocate no new objects, and avoid calling user 
>> code.
>>       */
>>      static private class Lock { }
>>      private static Lock lock = new Lock();
>> The conflicting part is "allocate no new objects". Catching OOME inside
>> a synchronized block holding this lock implies that new objects could be
>> allocated. I have a feeling that the 2nd fix prevented that by
>> pre-loading the Cleaner class at Reference class initialization time.
>> But because it was hard to reproduce the situation where OOME was thrown
>> from (r instanceof Cleaner) check, we nevertheless handled this
>> hypothetical situation. Perhaps it would be better that we didn't and
>> just see if OOME returned after just adding the pre-loading of Cleaner
>> class...
>> So here we are, at an attempt to clean this up:
>>      https://bugs.openjdk.java.net/browse/JDK-8055232
>> We can move (r instanceof Cleaner) check outside of synchronized block
>> to where it was before the 2nd fix and wait what stress tests will show.
>> Another possibility is to move the instanceof check outside of
>> synchronized block, but handle the hypothetical OOME by re-linking the
>> unlinked reference back into the pending chain:
>> http://cr.openjdk.java.net/~plevart/jdk9-dev/ReferenceHandlerExceptions/webrev.01/ 
>> What would you suggest?
>> Regards, Peter

More information about the core-libs-dev mailing list