RFR(XS): 7143858: G1: Back to back young GCs with the second GC having a minimally sized eden
john.cuthbertson at oracle.com
Mon May 21 16:12:36 PDT 2012
Can I have a couple of volunteers review the fix for this CR? The webrev
can be found at: http://cr.openjdk.java.net/~johnc/7143858/webrev.0/
In some recent G1 logs we have seen some evacuation pauses that looked
premature, i.e. the eden occupancy was much less than the target
capacity - usually only one or two regions. These premature pauses
always (almost immediately) followed a normal evacuation pause with
little or no application activity in between. Bengt's recent changes to
display the GC cause indicated that the premature pauses were always
GCLocker Initiated GCs.
What seemed to have been happening was that as the last thread left a JNI
critical region, but before it scheduled the GCLocker Initiated GC,
another thread would attempt an allocation. The allocating thread would
see that GCLocker was no longer active and successfully schedule the
evacuation pause. As part of this GC operation, a mutator alloc region
would get allocated and the object allocation request would be
satisfied. Afterwards, the thread initiating the GCLocker GC would
schedule its GC, by which time the eden occupancy would be fairly
minimal.
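To make the interleaving concrete, here is a toy single-threaded replay
in Java. It is not HotSpot code: the class, the field names, the
10-region eden target and the "Allocation-triggered pause" label are all
made up for illustration. It only replays the three steps described
above in the order they appear to happen.

```java
import java.util.ArrayList;
import java.util.List;

// Toy replay of the suspected interleaving. All names and numbers here
// are hypothetical; this does not mirror the real G1 data structures.
final class GcLockerRaceReplay {
    static final int EDEN_TARGET_REGIONS = 10; // assumed eden target

    private int edenRegions = EDEN_TARGET_REGIONS; // eden starts out full
    private int threadsInCritical = 1;             // one thread in a JNI critical region
    private boolean gcLockerGcPending = true;      // a GC is owed once GCLocker clears
    private final List<String> pauses = new ArrayList<>();

    // Record the eden occupancy at the start of the pause, then empty eden.
    private void evacuationPause(String cause) {
        pauses.add(cause + ": eden=" + edenRegions + "/" + EDEN_TARGET_REGIONS);
        edenRegions = 0;
    }

    static List<String> replay() {
        GcLockerRaceReplay heap = new GcLockerRaceReplay();

        // Step 1: the last thread leaves the critical region but has not
        // yet scheduled the GCLocker Initiated GC.
        heap.threadsInCritical--;

        // Step 2: another thread fails to allocate, sees GCLocker inactive,
        // and schedules a normal pause. A new mutator alloc region is then
        // allocated to satisfy its request.
        if (heap.edenRegions >= EDEN_TARGET_REGIONS && heap.threadsInCritical == 0) {
            heap.evacuationPause("Allocation-triggered pause");
            heap.edenRegions++;
        }

        // Step 3: the thread from step 1 finally schedules its GC, with
        // eden holding only the single region from step 2.
        if (heap.gcLockerGcPending) {
            heap.evacuationPause("GCLocker Initiated GC");
            heap.gcLockerGcPending = false;
        }
        return heap.pauses;
    }

    public static void main(String[] args) {
        replay().forEach(System.out::println);
    }
}
```

The second recorded pause starts with one eden region out of ten, which
is the shape of the premature pauses in the logs.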
Inserting a 500ms sleep just before scheduling the GCLocker initiated GC
reproduced the problem with Dacapo fairly frequently.
The solution implemented in this webrev is to stall the allocating
thread until the GCLocker Initiated GC is performed and then retry the
allocation.
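The stall-and-retry idea can be sketched in Java roughly as follows.
This is only an illustration of the approach, not the code in the
webrev (the real change is in the HotSpot C++ allocation slow path).
The class, its fields, the 10-region eden and the 50ms delay are
assumptions made for the example.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model: an allocator that finds eden full while a
// GCLocker-initiated GC is still owed waits for that GC instead of
// scheduling a pause of its own, then retries the allocation.
final class StallAndRetrySketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition gcLockerGcDone = lock.newCondition();
    private final int edenTarget;
    private int edenUsed;
    private boolean gcLockerGcPending = true; // a GCLocker GC is owed
    private int pauses = 0;
    private int minEdenAtPause = Integer.MAX_VALUE;

    StallAndRetrySketch(int edenTarget) {
        this.edenTarget = edenTarget;
        this.edenUsed = edenTarget; // eden starts out full
    }

    // Caller must hold the lock. Records the occupancy, then empties eden.
    private void collect() {
        pauses++;
        minEdenAtPause = Math.min(minEdenAtPause, edenUsed);
        edenUsed = 0;
    }

    // Called by the thread that was last to leave the critical region.
    void runGcLockerGc() {
        lock.lock();
        try {
            collect();
            gcLockerGcPending = false;
            gcLockerGcDone.signalAll(); // release any stalled allocators
        } finally {
            lock.unlock();
        }
    }

    int allocateRegion() throws InterruptedException {
        lock.lock();
        try {
            while (true) {
                if (edenUsed < edenTarget) {
                    return ++edenUsed;      // allocation succeeds
                }
                if (gcLockerGcPending) {
                    gcLockerGcDone.await(); // stall, then loop to retry
                } else {
                    collect();              // ordinary allocation-failure pause
                }
            }
        } finally {
            lock.unlock();
        }
    }

    // Returns {pauses, smallest eden occupancy at a pause, final eden use}.
    static int[] demo() {
        StallAndRetrySketch heap = new StallAndRetrySketch(10);
        Thread allocator = new Thread(() -> {
            try {
                heap.allocateRegion();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            allocator.start();
            Thread.sleep(50); // widen the window, like the sleep used for testing
            heap.runGcLockerGc();
            allocator.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new int[] { heap.pauses, heap.minEdenAtPause, heap.edenUsed };
    }
}
```

In this model demo() ends with a single pause that starts with eden at
its target size, and the stalled request is satisfied by the retry. The
await sits inside the retry loop, so a spurious wakeup just re-checks
the state.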
Testing: GC Test Suite and Kitchensink with the additional sleep call,
verifying that no pauses occurred during the sleep; GC Test Suite and
jprt with the additional sleep.
I have been unable to reproduce the issue with the other Hotspot
collectors but I can't see why they wouldn't be vulnerable. I changed
the G1 slow case allocation code to match the other collectors and still
saw the issue.