RFR: 8062036: ConcurrentMarkThread::slt may be invoked before ConcurrentMarkThread::makeSurrogateLockerThread causing intermittent crashes

Kim Barrett kim.barrett at oracle.com
Sun Nov 2 04:12:25 UTC 2014

Please review this fix for a nightly test failure:



I'll also need a sponsor for this.

The failing test is run with the -XX:+ScavengeALot option.  That leads
to collections during VM initialization, and one of them might need
the surrogate locker thread (SLT) before that thread has been created.

The proximate cause of the failure is that -XX:+ScavengeALot (or
-XX:+FullGCALot) triggers such problematic collections.  This is
being addressed by suppressing "gc alot" until VM initialization is
complete: it is now conditional on Threads::is_vm_complete() rather
than the previously used is_init_completed().  The latter function was
never really the proper predicate for this decision, but happened to
work for collectors that don't use a Java thread as part of their
implementation.  It has never been adequate for G1 or CMS, which
(sometimes) involve the SLT (Java) thread.
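
To illustrate the difference between the two predicates, here is a
toy model.  It is not HotSpot code: ToyVM and both helper functions
are hypothetical, and it assumes (as the description above implies)
that there is a window during startup in which is_init_completed()
would already be true while the SLT does not exist yet.

```cpp
#include <cassert>

// Toy stand-ins for VM state. None of these names exist in HotSpot.
struct ToyVM {
    bool init_completed = false;  // models is_init_completed()
    bool vm_complete    = false;  // models Threads::is_vm_complete()
    bool slt_created    = false;  // models the SLT having been created
    bool gc_alot        = true;   // models -XX:+ScavengeALot / -XX:+FullGCALot
};

// Old gating (assumed shape): stress collections are allowed as soon
// as the "init completed" flag is set.
bool gc_alot_allowed_old(const ToyVM& vm) {
    return vm.gc_alot && vm.init_completed;
}

// New gating (assumed shape): stress collections wait until the VM
// is fully up, by which point the SLT exists.
bool gc_alot_allowed_new(const ToyVM& vm) {
    return vm.gc_alot && vm.vm_complete;
}

// True if a stress collection could run while the SLT is missing.
bool hazard(bool allowed, const ToyVM& vm) {
    return allowed && !vm.slt_created;
}
```

In the window where init_completed is set but the SLT has not been
created, the old predicate lets a stress collection through and the
new one does not.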

However, this doesn't address the possibility of a collection with
some other cause occurring before SLT creation in VM initialization,
and failing because it requires the SLT.  This might happen if, for
example, the initial memory configuration options are overly
restrictive.  There are limited options to deal with this situation.

* In some cases it might be possible to report OOME, but there are no
application threads running yet that might do anything useful with it.
It's also not clear this should be treated as an OOME; there may be
lots of available memory, if only the collector could actually run.

* Creating the SLT on demand isn't a reliable solution; such a
collection could occur before it is possible to create and run a Java
thread. (The SLT is created soon after Java thread creation is
possible, but there is a period between when the heap supports
allocation (which might trigger GC) and the point where Java thread
creation is allowed.)

* Instead, we're changing how the "SLT needed before it has been
created" situation is reported: it now calls
vm_exit_during_initialization() with a message about what happened,
instead of the previous segfault.
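
The third option can be sketched roughly as follows.  This is a
self-contained toy, not the actual change: ToySurrogateLocker,
ToyCollector and toy_exit_during_initialization() are hypothetical,
the message text is invented, and the toy throws an exception where
the real vm_exit_during_initialization() would terminate the VM, so
that the behavior can be observed in a test.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the SLT; it just counts requests.
struct ToySurrogateLocker {
    int requests = 0;
    void handle_request() { ++requests; }
};

// Hypothetical stand-in for vm_exit_during_initialization().
// The real function exits the VM; this toy throws instead.
[[noreturn]] void toy_exit_during_initialization(const std::string& msg) {
    throw std::runtime_error(msg);
}

struct ToyCollector {
    ToySurrogateLocker* slt = nullptr;  // not yet created during early startup

    void request_slt() {
        if (slt == nullptr) {
            // Previously this path would dereference a null pointer.
            // Now it reports what happened and stops.
            toy_exit_during_initialization(
                "GC required the SLT before it was created");
        }
        slt->handle_request();
    }
};
```

The check is cheap and only matters on the early-startup path; once
the SLT exists, requests go through as before.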
