(Preliminary) RFC 7038914: VM could throw uncaught OOME in ReferenceHandler thread

Thomas Schatzl thomas.schatzl at oracle.com
Thu May 2 11:30:26 UTC 2013


On Tue, 2013-04-30 at 16:44 +0100, Alan Bateman wrote:
> On 30/04/2013 15:57, Thomas Schatzl wrote:
> > Hi all,
> >
> >    the webrev at http://cr.openjdk.java.net/~tschatzl/7038914/webrev/
> > presents a first stab at the CR "7038914: VM could throw uncaught OOME
> > in ReferenceHandler thread".
> >
> > The problem is that under very heavy memory pressure, the reference
> > handler throws an exception with the message "Exception:
> > java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in
> > thread "Reference Handler"".
> >
> > The change improves handling of out-of-memory conditions in the
> > ReferenceHandler thread. Instead of crashing the thread, and then
> > disabling reference processing, it catches this exception and continues.
> It's surprising to hear that the Reference Handler thread failed with
> OOME. I wouldn't expect anything in this code path to throw OOME, except
> maybe in the fast path for sun.misc.Cleaner, but that will abort the VM if
> it fails. The enqueue method that you override in the test to provoke this
> is package-private, so it's unlikely that the test or whatever
> resulted in this bug report is doing that.

The test is just that: a somewhat artificial way to reproduce the
problem reliably.

I tried some of the example programs listed below thousands of times,
sometimes without any issue. The developer previously working on that
also had severe problems reproducing it.

> So I'm not against this proposed change, rather I'm just trying to
> understand how it happened. Is there instrumentation involved by any
> chance? Is the OOME something other than "java heap" or do we know?

No instrumentation that I know of, but a whole set of weak reference
related nightly UTE tests fail at different times (I would suspect
nightly testing does not do any instrumentation). Here is a list with
exactly these errors:


Apart from these failures, the more serious problem seems to be that the
reference handler thread silently dies, which means that weak reference
processing is effectively disabled after such an error.

A VM abort, like the one for Cleaner processing, would also be much
preferable to the current situation.
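
To illustrate the difference between a handler thread that dies on the
first OOME and one that keeps going, here is a minimal sketch. It is not
the code from the webrev: the class and method names are hypothetical, a
plain queue of Runnables stands in for the pending reference list, and
the OOME is simulated by throwing it explicitly rather than by actually
exhausting the heap.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a work loop that survives OutOfMemoryError.
class ResilientHandlerSketch {

    // Runs every pending task. An OutOfMemoryError thrown by one task is
    // caught so that the loop carries on with the next task instead of
    // terminating. Returns how many tasks failed with OOME.
    static int processAll(Queue<Runnable> pending) {
        int recovered = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            try {
                task.run();
            } catch (OutOfMemoryError e) {
                // Deliberately do nothing that allocates here (no logging,
                // no string building): memory is assumed to be exhausted.
                recovered++;
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        AtomicInteger done = new AtomicInteger();
        Queue<Runnable> pending = new ArrayDeque<>();
        pending.add(done::incrementAndGet);
        // Simulated failure; a real OOME would come from an allocation.
        pending.add(() -> { throw new OutOfMemoryError("simulated"); });
        pending.add(done::incrementAndGet);

        int recovered = processAll(pending);
        System.out.println("completed=" + done.get() + " recovered=" + recovered);
    }
}
```

Without the catch block the first OOME would propagate out of the loop
and end the thread, so the task queued after the failing one would never
run. That corresponds to the silent death described above.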

