RFR: JDK-8061259: ParNew promotion failed is serialized on a lock
jon.masamitsu at oracle.com
Thu Nov 13 04:01:00 UTC 2014
On 11/10/2014 3:04 PM, Kim Barrett wrote:
> On Nov 6, 2014, at 8:36 PM, Jungwoo Ha <jwha at google.com> wrote:
>> These are really important for performance. If we remove line 1358,
>> the pause time with the given example goes up by 2x.
> I wasn't suggesting eliminating the check at line 1358 altogether;
> rather, I was suggesting moving it into
> expand_and_ensure_spooling_space(), right before the mutex locking
> that begins that function.
> There's another reason to question the check at line 1358 though, and
> suggest it should be moved closer to (or inside of)
> expand_and_ensure_spooling_space(). It looks like the two places that
> are calling set_promotion_failure() are doing so for different
> reasons. The check at line 1358 will short circuit all further
> par_promote() calls in the collection cycle, even those for which
> allocation would have succeeded if not for that check. For example, in
> the old code, failure to promote some largish object will not
> necessarily block later promotion attempts for other, smaller objects.
When any promotion failure occurs the concurrent CMS collections is
abandoned and a full collection is done instead. There is some
cleanup that is done to get the heap back into a state where the
full collection can be used.
> The proposed change modifies that behavior. Since the calling code
> seems to be going to some effort to deal with par_promote() returning
> NULL (indicating promotion failure) and keep going, it's not clear to
> me whether this behavioral change is desirable. And if it *is*
> desirable, it seems like a better place to do the short circuiting
> might be higher up the call chain, and possibly simplifying the
> handling of par_promote() returning NULL. Note that I've not yet
> investigated where such a higher point in the call chain might be.
> OTOH, it seems plausible to me that keeping the test at line 1358 as
> is (e.g. don't move it, which assumes we're ok with the behavioral
> change discussed above) makes the test that was at 1381 (and which I
> suggested moving into expand_and_par_allocate()) pretty much
> redundant. It would be interesting to know whether that check at 1381
> fires more than extremely rarely. Of course, this alternative would be
> contrary to my desire to have the racy flag access located as close as
> possible to the lock and recheck code.
More information about the hotspot-gc-dev