RFR(M,v3): JDK-8059036 : Implement Diagnostic Commands for heap and finalizerinfo

Peter Levart peter.levart at gmail.com
Sat May 16 12:48:06 UTC 2015

On 05/16/2015 02:38 PM, Peter Levart wrote:
> On 05/16/2015 09:35 AM, Dmitry Samersoff wrote:
>> Derek,
>> Personally, I'm for volatile over explicit fence too.
>> So I'll change the code if Peter don't have objections.
> There are no objections. There's one unneeded volatile store to .next 
> field then in enqueue(), but it is followed by another volatile store, 
> so there shouldn't be overhead on Intel and SPARC at least.
> Regards, Peter

If you make .next field volatile, then perhaps you could also cache it's 
value in reallyPoll() into a local variable, so that it is not read 
twice. Like this:

     private Reference<? extends T> reallyPoll() {       /* Must hold 
lock */
         Reference<? extends T> r = head;
         if (r != null) {
             Reference rn = r.next; // HERE !!!
             head = (rn == r) ?
                 null :
                 rn; // Unchecked due to the next field having a raw 
type in Reference
             r.queue = NULL;
             r.next = r;
             if (r instanceof FinalReference) {
             return r;
         return null;


>> -Dmitry
>> On 2015-05-16 01:59, Derek White wrote:
>>> Hi Dmitry, Peter,
>>> I should read my email more frequently - I missed Dmitry's last webrev
>>> email.
>>> My comments on a preference for volatile over Unsafe are not a strong
>>> objection to using Unsafe fences. I just don't have enough experience
>>> with Unsafe fences yet to agree that this use is correct.
>>> While looking over this Reference code I found 3-6 really simple things
>>> that need cleaning up that have been there for years, so I'm not a big
>>> cheerleader for more complicated things here :-)
>>>   - Derek
>>> On 5/15/15 6:46 PM, Derek White wrote:
>>>> Hi Peter,
>>>> Some comments below...
>>>> On 5/14/15 3:50 AM, Peter Levart wrote:
>>>>> Hi Derek,
>>>>> On 05/13/2015 10:34 PM, Derek White wrote:
>>>>>> Hi Peter,
>>>>>> I don't have smoking gun evidence that your change introduces a bug,
>>>>>> but I have some concerns...
>>>>> I did have a concern too, ...
>>>>>> On 5/12/15 6:05 PM, Peter Levart wrote:
>>>>>>> Hi Dmitry,
>>>>>>> You iterate the queue then, not the unfinalized list. That's more
>>>>>>> logical.
>>>>>>> Holding the queue's lock may pause reference handler and finalizer
>>>>>>> threads for the entire time of iteration. This can blow up the
>>>>>>> application. Suppose one wants to diagnose the application because
>>>>>>> he suspects that finalizer thread hardly keeps up with production
>>>>>>> of finalizable instances. This can happen if the allocation rate of
>>>>>>> finalizable objects is very high and/or finalize() methods are not
>>>>>>> coded to be fast enough. Suppose the queue of Finalizer(s) builds
>>>>>>> up gradually and is already holding several million objects. Now
>>>>>>> you initiate the diagnostic command to print the queue. The
>>>>>>> iteration over and grouping/counting of several millions of
>>>>>>> Finalizer(s) takes some time. This blocks finalizer thread and
>>>>>>> reference handler thread. But does not block the threads that
>>>>>>> allocate finalizable objects. By the time the iteration is over,
>>>>>>> the JVM blows up and application slows down to a halt doing GCs
>>>>>>> most of the time, getting OOMEs, etc...
>>>>>>> It is possible to iterate the elements of the queue for diagnostic
>>>>>>> purposes without holding a lock. The modification required to do
>>>>>>> that is the following (havent tested this, but I think it should work):
>>>>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.01/
>>>>>> One issue to watch out for is the garbage collectors inspect the
>>>>>> Reference.next field from C code directly (to distinguish active vs.
>>>>>> pending, iterate over oops, etc.). You can search hotspot for
>>>>>> java_lang_ref_Reference::next*, java_lang_ref_Reference::set_next*.
>>>>>> Your change makes "inactive" References superficially look like
>>>>>> "enqueued" References. The GC code is rather subtle and I haven't
>>>>>> yet seen a case where it would get confused by this change, but
>>>>>> there could easily be something like that lurking in the GC code.
>>>>> ...but then I thought that GC can in no way treat a Reference
>>>>> differently whether it is still enqueued in a ReferenceQueue or
>>>>> already dequeued (inactive) - responsibility is already on the Java
>>>>> side. Currently the definition of Reference.next is this:
>>>>>      /* When active:   NULL
>>>>>       *     pending:   this
>>>>>       *    Enqueued:   next reference in queue (or this if last)
>>>>>       *    Inactive:   this
>>>>>       */
>>>>>      @SuppressWarnings("rawtypes")
>>>>>      Reference next;
>>>>> We see that, unless GC inspects all ReferenceQueue instances and
>>>>> scans their lists too, the state of a Reference that is enqueued as
>>>>> last in list is indistinguishable from the state of inactive
>>>>> Reference. So I deduced that this distinction (enqueued/inactive)
>>>>> can't be established solely on the value of .next field ( == this or
>>>>> != this)...
>>>>> But I should inspect the GC code too to build a better understanding
>>>>> of that part of the story...
>>>>> Anyway. It turns out that there is already enough state in Reference
>>>>> to destinguish between it being enqueued as last in list and already
>>>>> dequeued (inactive) - the additional state is Reference.queue that
>>>>> transitions from ReferenceQueue.ENQUEUED -> ReferenceQueue.NULL in
>>>>> ReferenceQueue.reallyPoll. We need to just make sure the two fields
>>>>> (r.next and r.queue) are assigned and read in correct order. This can
>>>>> be achieved either by making Reference.next a volatile field or by a
>>>>> couple of explicit fences:
>>>>> http://cr.openjdk.java.net/~plevart/misc/Finalizer.printFinalizationQueue/webrev.02/
>>>>> The assignment of r.queue to ReferenceQueue.ENQUEUED already happens
>>>>> before linking the reference into the queue's head in
>>>>> ReferenceQueue.enqueue():
>>>>>              r.queue = ENQUEUED;
>>>>>              r.next = (head == null) ? r : head;
>>>>>              head = r;
>>>>> Both stores are volatile.
>>>> This sounds a bit better. I don't have a good handle on the pros and
>>>> cons of a volatile field over explicit fences via Unsafe in this case.
>>>> Kim might jump in soon.
>>>> I'd like to suggest an entirely less-clever solution. Since the
>>>> problem is that forEach() might see inconsistent values for the queue
>>>> and next fields of a Reference, but we don't want to grab a lock
>>>> walking the whole queue, we could instead grab the lock as we look at
>>>> each Reference (and do the "back-up" trick if we were interfered
>>>> with). Of course this is slower than going lock-free, but it's not any
>>>> more overhead than the ReferenceHandler thread or FinalizerThreads incur.
>>>> The only benefit of this solution over the others is that it is less
>>>> clever, and I'm not sure how much cleverness this problem deserves.
>>>> Given that, and ranking the solutions as "lock" < (clever) "volatile"
>>>> < "fence", I'd be happier with the "volatile" or "lock" solution if
>>>> that is sufficient.
>>>>   - Derek
>>>>>>> I also suggest the addition to the ReferenceQueue to be contained
>>>>>>> (package-private) and as generic as possible. That's why I suggest
>>>>>>> forEach iteration method with no concrete logic.
>>>>>>> It would be possible to encapsulate the entire logic into a special
>>>>>>> package-private class (say java.lang.ref.DiagnosticCommands) with
>>>>>>> static method(s) (printFinalizationQueue)... You could just expose
>>>>>>> a package-private forEach static method from Finalizer and code the
>>>>>>> rest in DiagnosticCommands.
>>>>>> That's good for encapsulation. But I'm concerned that if "forEach"
>>>>>> got exposed beyond careful use in DiagnosticCommands, and the
>>>>>> Referents were leaked back into the heap, then we could get
>>>>>> unexpected object resurrection after finalization. This isn't a bug
>>>>>> on it's own - any finalizer might resurrect its object if not
>>>>>> careful, but ordinarily /only/ the finalizer could resurrect the
>>>>>> object. This could invalidate application invariants?
>>>>> Well, it all stays in the confines of package-private API - internal
>>>>> to JDK. Any future additional use should be reviewed carefully.
>>>>> Comments on the forEach() method should warn about that.
>>>>>> I agree that using a lock over the ReferenceQueue iteration opens up
>>>>>> /another/ case of diagnostics affecting application behavior. And
>>>>>> there are pathological scenarios where this gets severe. This is
>>>>>> unfortunate but not uncommon. There is enough complication here that
>>>>>> you should be sure that the fix for diagnostics performance doesn't
>>>>>> introduce subtle bugs to the system in general. A change in this
>>>>>> area should get a thorough review from both the runtime and GC sides.
>>>>> Is the webrev.02 proposed above more acceptable in that respect? It
>>>>> does not introduce any logical changes to existing code.
>>>>>> Better yet, the reference handling code in GC and runtime should
>>>>>> probably be thrown out and re-written, but that's another issue :-)
>>>>> I may have some proposals in that direction. Stay tuned.
>>>>> Regards, Peter
>>>>>> - Derek
>>>>>>> Regards, Peter
>>>>>>> On 05/12/2015 07:10 PM, Dmitry Samersoff wrote:
>>>>>>>> Everybody,
>>>>>>>> Updated version:
>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.03/
>>>>>>>> Now it iterates over queue and output result sorted by number of instances.
>>>>>>>> -Dmitry
>>>>>>>> On 2015-05-07 00:51, Derek White wrote:
>>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>> Lots of good comments here.
>>>>>>>>> On the topic of what list should be printed out, I think we should focus
>>>>>>>>> on objects waiting to be finalized - e.g. the contents of the
>>>>>>>>> ReferenceQueue. It's more of a pain to walk the ReferenceQueue, but you
>>>>>>>>> could add a summerizeQueue(TreeMap<String, Integer>) method, or a
>>>>>>>>> general iterator and lambda.
>>>>>>>>> A histogram of objects with finalize methods might also be interesting,
>>>>>>>>> but you can get most of the same information from a heap histogram
>>>>>>>>> (ClassHistogramDCmd, and search for count of Finalizer instances).
>>>>>>>>> BTW, by only locking the ReferenceQueue, we should only be blocking the
>>>>>>>>> FinalizerThread from making progress (and I think there's a GC thread
>>>>>>>>> that runs after GC that sorts found References objects that need
>>>>>>>>> processing into their respective ReferenceQueues). But locking the
>>>>>>>>> "unfinalized" list blocks initializing any object with a finalize() method.
>>>>>>>>> The sorting suggestion is a nice touch.
>>>>>>>>>   - Derek White, GC team
>>>>>>>>> On 5/5/15 10:40 AM, Peter Levart wrote:
>>>>>>>>>> Hi Dmitry, Staffan,
>>>>>>>>>> On 05/05/2015 12:38 PM, Staffan Larsen wrote:
>>>>>>>>>>> Dmitry,
>>>>>>>>>>> I think this should be reviewed on hotspot-gc and core-libs-dev as
>>>>>>>>>>> well considering the changes to Finalizer.
>>>>>>>>>>> I’m a little worried about the potentially very large String that is
>>>>>>>>>>> returned from printFinalizationQueue(). A possible different approach
>>>>>>>>>>> would be to write printFinalizationQueue() in C++ inside Hotspot.
>>>>>>>>>>> That would allow for outputting one line at a time. The output would
>>>>>>>>>>> still be saved in memory (since the stream is buffered), but at least
>>>>>>>>>>> the data is only saved once in memory, then. It would make the code a
>>>>>>>>>>> bit harder to write, so its a question of how scared we are of
>>>>>>>>>>> running out of memory.
>>>>>>>>>> If the output is just a histogram of # of instances per class name,
>>>>>>>>>> then it should not be too large, as there are not many different
>>>>>>>>>> classes overriding finalize() method (I counted 72 overriddings of
>>>>>>>>>> finalize() method in the whole jdk/src tree).
>>>>>>>>>> I'm more concerned about the fact that while traversing the list, a
>>>>>>>>>> lock is held, which might impact prompt finalization(). Is it
>>>>>>>>>> acceptable for diagnostic output to impact performance and/or
>>>>>>>>>> interfere with synchronization?
>>>>>>>>>> It might be possible to scan the list without holding a lock for
>>>>>>>>>> diagnostic purposes if Finalizer.remove() didn't set the removed
>>>>>>>>>> Finalizer.next pointer to point back to itself:
>>>>>>>>>>      private void remove() {
>>>>>>>>>>          synchronized (lock) {
>>>>>>>>>>              if (unfinalized == this) {
>>>>>>>>>>                  if (this.next != null) {
>>>>>>>>>>                      unfinalized = this.next;
>>>>>>>>>>                  } else {
>>>>>>>>>>                      unfinalized = this.prev;
>>>>>>>>>>                  }
>>>>>>>>>>              }
>>>>>>>>>>              if (this.next != null) {
>>>>>>>>>>                  this.next.prev = this.prev;
>>>>>>>>>>              }
>>>>>>>>>>              if (this.prev != null) {
>>>>>>>>>>                  this.prev.next = this.next;
>>>>>>>>>>              }
>>>>>>>>>>              // this.next = this; must not be set so that we can
>>>>>>>>>> traverse the list unsynchronized
>>>>>>>>>>              this.prev = this;   /* Indicates that this has been
>>>>>>>>>> finalized */
>>>>>>>>>>          }
>>>>>>>>>>      }
>>>>>>>>>> For detecting whether a Finalizer is already processed, the 'prev'
>>>>>>>>>> pointer could be used instead:
>>>>>>>>>>      private boolean hasBeenFinalized() {
>>>>>>>>>>          return (prev == this);
>>>>>>>>>>      }
>>>>>>>>>> Also, to make sure a data race dereferencing 'unfinalized' in
>>>>>>>>>> unsynchronized printFinalizationQueue() would get you a fully
>>>>>>>>>> initialized Finalizer instance (in particular the next pointer), you
>>>>>>>>>> would have to make the 'unfinalized' field volatile or insert an
>>>>>>>>>> Unsafe.storeFence() before assigning to 'unfinalized':
>>>>>>>>>>      private void add() {
>>>>>>>>>>          synchronized (lock) {
>>>>>>>>>>              if (unfinalized != null) {
>>>>>>>>>>                  this.next = unfinalized;
>>>>>>>>>>                  unfinalized.prev = this;
>>>>>>>>>>              }
>>>>>>>>>>              // make sure a data race dereferencing 'unfinalized'
>>>>>>>>>>              // in printFinalizationQueue() does see the Finalizer
>>>>>>>>>>              // instance fully initialized
>>>>>>>>>>              // (in particular the 'next' pointer)
>>>>>>>>>>              U.storeFence();
>>>>>>>>>>              unfinalized = this;
>>>>>>>>>>          }
>>>>>>>>>>      }
>>>>>>>>>> By doing these modifications, I think you can remove
>>>>>>>>>> synchronized(lock) {} from printFinalizationQueue().
>>>>>>>>>>> I see you are traversing the ‘unfinalized’ list in Finalizer, whereas
>>>>>>>>>>> the old SA code for ‘-finalizerinfo' traverses the ‘queue’ list. I am
>>>>>>>>>>> not sure of the difference, perhaps someone from the GC team can help
>>>>>>>>>>> out?
>>>>>>>>>> The 'queue' is a ReferenceQueue of Finalizer (FinalReference)
>>>>>>>>>> instances pending processing by finalizer thread because their
>>>>>>>>>> referents are eligible for finalization (they are not reachable any
>>>>>>>>>> more). The 'unfinalized' is a doubly-linked list of all Finalizer
>>>>>>>>>> instances for which their referents have not been finalized yet
>>>>>>>>>> (including those that are still reachable and alive). The later serves
>>>>>>>>>> two purposes:
>>>>>>>>>> - it keeps Finalizer instances reachable until they are processed
>>>>>>>>>> - it is a source of unfinalized instances for
>>>>>>>>>> running-finalizers-on-exit if requested by
>>>>>>>>>> System.runFinalizersOnExit(true);
>>>>>>>>>> So it really depends on what one would like to see. Showing the queue
>>>>>>>>>> may be interesting if one wants to see how the finalizer thread is
>>>>>>>>>> coping with processing the finalize() invocations. Showing unfinalized
>>>>>>>>>> list may be interesting if one wants to know how many live +
>>>>>>>>>> finalization pending instances are there on the heap that override
>>>>>>>>>> finalize() method.
>>>>>>>>>> Regards, Peter
>>>>>>>>>>> For the output, it would be a nice touch to sort it on the number of
>>>>>>>>>>> references for each type. Perhaps outputting it more like a table
>>>>>>>>>>> (see the old code again) would also make it easier to digest.
>>>>>>>>>>> In diagnosticCommand.hpp, the new commands should override
>>>>>>>>>>> permission() and limit access:
>>>>>>>>>>>     static const JavaPermission permission() {
>>>>>>>>>>>       JavaPermission p = {"java.lang.management.ManagementPermission",
>>>>>>>>>>>                           "monitor", NULL};
>>>>>>>>>>>       return p;
>>>>>>>>>>>     }
>>>>>>>>>>> The two tests don’t validate the output in any way. Would it be
>>>>>>>>>>> possible to add validation? Perhaps hard to make sure an object is on
>>>>>>>>>>> the finalizer queue… Hmm.
>>>>>>>>>>> Thanks,
>>>>>>>>>>> /Staffan
>>>>>>>>>>>> On 5 maj 2015, at 12:01, Dmitry Samersoff
>>>>>>>>>>>> <dmitry.samersoff at oracle.com>  wrote:
>>>>>>>>>>>> Everyone,
>>>>>>>>>>>> Please review the fix:
>>>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8059036/webrev.01/
>>>>>>>>>>>> heap dcmd outputs the same information as SIGBREAK
>>>>>>>>>>>> finalizerinfo dcmd outputs a list of all classes in finalization queue
>>>>>>>>>>>> with count
>>>>>>>>>>>> -Dmitry
>>>>>>>>>>>> -- 
>>>>>>>>>>>> Dmitry Samersoff
>>>>>>>>>>>> Oracle Java development team, Saint Petersburg, Russia
>>>>>>>>>>>> * I would love to change the world, but they won't give me the sources.

More information about the core-libs-dev mailing list