CRR (XS): 7098085: G1: partially-young GCs not initiated under certain circumstances
Y. S. Ramakrishna
y.s.ramakrishna at oracle.com
Thu Oct 6 10:29:29 PDT 2011
On 10/06/11 09:46, Tony Printezis wrote:
> I have to eat humble pie. I got two (infrequent failures) during testing
OK, i'll share the pie :-) I thought (but didn't check) that
we clear both marking bit maps at the end of the full gc,
and the yielded/aborted marking threads will not create any
new marks. But perhaps they are not checking for abortion when
they emerge from an arbitrary yield? (You can tell that I have
not looked sufficiently at the G1 abort mechanism yet; i'll try
and look at the sources for that when i get some time.)
> overnight that I'm 99.9% sure are caused by the fact that I stopped
> clearing the bitmap in the CM::abort() method. As I said, I've backed
> out of those changes in concurrentMark.cpp, here's the latest webrev
> (same as before but without the changes to the CM::abort() method):
> I'll carry on testing this for the next day or two.
> Tony Printezis wrote:
>> Thanks for taking a look at it! See inline.
>> On 10/5/2011 7:48 PM, Ramki Ramakrishna wrote:
>>> Hi Tony --
>>> The changes look good.
>>> On a side note, was the removal of the bit map clearing a different
>>> bug fix,
>>> or is it somehow related to the issue described in the CR?
>>> I can see that it is unnecessary and may be even wasteful,
>>> but it seemed to me to be unrelated to the issue described in the CR,
>>> unless I am missing something here.
>> Agreed. I was looking at the abort() method to see what it did and I
>> noticed that it was also clearing the bitmap. It's indeed wasteful
>> and, in fact, it does not achieve what the comment says it should
>> achieve. abort() is only called at the start of a Full GC but, in
>> order to do the Full GC safepoint, the concurrent marking workers
>> should have noticed that a safepoint was initiated and should have all
>> yielded (otherwise the safepoint would have to wait for them). And,
>> the concurrent marking thread will actually clear the next bitmap as
>> the last thing it does before waiting for the next cycle to start.
>> But, yeah, point taken: I'll back out of the concurrentMark.cpp
>> changes and do them as a separate CR.
>>> As a related (but otherwise orthogonal) matter, it appears as though
>>> the idiom:-
>>> might gain from the usual constructor/destructor idiom used
>>> elsewhere for code that wants to synchronize with other
>>> parts of the system (in this case safepoints). For example:
>>> SynchronizeWithSafepoints ss(); // or appropriate other name
>> Yes, I do like this idea very much.
>>>  you will have to invent a suitable other name, perhaps
>>> for when you do the reverse, i.e. perform an action outside of the
>>> _sts, as happens
>>> when you are doing those synch barriers amongst the concurrent marking
>>> threads in case of overflow and restart.
>> I can't think of a good name off-hand but I'll think about it for a
>> bit... (suggestions are welcome!)
>>> The benefit is that the extra scopes help clearly demarcate
>>> the code that wants to be in the "critical section", if you
>>> will, wrt safepoints that will want to synchronize wrt these
>>> (As far as I can tell all invocations of join/leave or stsJoin/stsLeave
>>> are paired and can become block-structured.)
>> They should definitely be paired (like lock/unlock pairs). Otherwise,
>> we're in trouble. :-)
>>> The c'tor/d'tor idiom change is of course orthogonal to yr changes
>>> and would likely want to be a separate CR, if you agree that this
>>> may be another clean-up worth doing.
>> Absolutely. So, I take an action item to open two hs23 CRs: one to
>> avoid clearing the marking bitmap and one for the ctor/dtor idiom to
>> join / leave the STS and vice versa.
>>> Otherwise looks good.
>>> -- ramki
>>> On 10/5/2011 2:21 PM, Tony Printezis wrote:
>>>> I'd like to get a couple of people to have a look at the following
>>>> The code changes are very minor, most of the added lines are in fact
>>>> detailed comments on the changes.
>>>> I'm going to do testing on two machines overnight, as this is quite
>>>> tricky code (even though it doesn't look it). But, I thought I'd
>>>> open the changes for code review early as the change is small and
>>>> some early feedback, if possible, would be greatly appreciated.
>>>> PS Correct url? Check. Text actually applies to the CR in the
>>>> subject? Check.
More information about the hotspot-gc-dev