RFR (XS): 7114303: G1: assert(_g1->mark_in_progress()) failed: shouldn't be here otherwise
john.cuthbertson at oracle.com
Wed Nov 23 11:37:58 PST 2011
Hi Bengt, Tony,
Thanks for the review. Good point about
The maximum number of buffers that will be processed by this code is
the # of filled SATB buffers + # of Java threads + 1. Each Java thread
has an embedded SATB pointer queue. In some threads, however, these
could be empty. Likewise with the shared SATB pointer queue (used by
Obtaining the number of completed SATB buffers that the drain closure is
applied to is fairly straightforward but do we count empty buffers?
I'm not sure how important the number of buffers processed is currently
- given that it's currently serial code. It would be more important if
the code were parallel as then we could see the distribution. I guess
the number of buffers processed could tell us if marking is not keeping
up with the amount of mutation but we get that indirectly from the time.
Anwyay I've removed the dead code. New webrev here:
On 11/23/11 02:26, Tony Printezis wrote:
> That methods looks indeed not to be called, which means that
> _last_satb_drain_processed_buffers is never set (I can't find any
> assignments to it) and the reported SATB processed buffers
> print_stats(2, "Processed Buffers",
> will probably always be 0. So, if we think that this output is
> important, we should make sure we actually
> record_satb_drain_processed_buffers() it with the appropriate number.
> If we don't, we should also remove _last_satb_drain_processed_buffers
> and related code.
> On 11/23/2011 02:44 AM, Bengt Rutisson wrote:
>> Hi John,
>> Looks good to me too.
>> It seems like the method "void
>> record_satb_drain_processed_buffers(int processed_buffers)" in
>> g1CollectorPolicy.hpp is dead code. I found it since it contains the
>> same assert that your CR reported.
>> Any chance of removing that method as well as part of your change?
>> On 2011-11-22 21:45, John Cuthbertson wrote:
>>> Hi Everyone,
>>> Can I have a couple of volunteers review this small change? The
>>> webrev can be found at:
>>> There is a race between the concurrent mark thread updating the mark
>>> in progress flag and it being read by the VM thread during an
>>> evacuation pause. This is because, at the time the concurrent mark
>>> thread updates the flag, it is not part of the suspendible thread
>>> set and so is not blocked. The solution is to have the concurrent
>>> mark thread join the STS just before updating the flag.
>>> The race was confirmed by inserting a small delay immediately before
>>> calling G1CollectorPolicy::record_satb_drain_time(). The delay
>>> increased failure rate to 1/5 for a 20ms delay and 4/5 for a 50ms
>>> Testing: a few thousand iterations of the failing test case with a
>>> JVM that included both fix and delay.
More information about the hotspot-gc-dev