RFR (S) 8078438: Interpreter should support conditional card marks (UseCondCardMark)

Thomas Schatzl thomas.schatzl at oracle.com
Wed Apr 29 08:28:30 UTC 2015

Hi Andrew,

On Tue, 2015-04-28 at 17:23 +0100, Andrew Haley wrote:
> Hi,
> On 04/28/2015 04:54 PM, Thomas Schatzl wrote:
> > 
> > I dug a little through the CMS code, and I think the preclean code
> > mentioned is actually something like this:
> > 
> > {
> > MutexLocker x(some-lock); // In the CMSTokenSync constructor
> > }
> > 
> > MemRegion range = find next range of dirty cards by scanning the card
> > table
> > 
> > write 0x1 to range
> > 
> > {
> > MutexLocker x(some-lock); // In the CMSTokenSync destructor
> > }
> > 
> > inspect all object references in range x
> > 
> > See CMSCollector::preclean_card_table().
> > 
> > I.e. the implicit barriers (the CAS'es) executed by acquiring the
> > mutexes in the preclean thread provide the necessary synchronization
> > (actually between reading the card and inspecting the memory there are a
> > few of them).
> > 
> > I am not sure if this was the intention of using these MutexLockers (I
> > doubt that) but it seems sufficient.
> Yes, I think it does.  Well, kinda-sorta.  Monitor::IUnlock() does a

Good to hear.

I was more thinking of the Atomic::cmpxchg_ptr() in Monitor::ILock()
(in ::TryFast()) of the next mutex locking attempt.

Either explanation is fine.

> storeload barrier which isn't really enough, but storeload is
> implemented as a full barrier on every processor I've come across so
> it'll do.  It's not a nice thing to depend on, though.  But if there

We can add an RFE to add an extra explicit load barrier with a comment
(or at least a comment) between the two phases of the precleaning. I do
not think there is a performance concern there about an extra barrier,
given the code.
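
To make the idea concrete, here is a rough sketch of the two precleaning phases with an explicit fence between them. It is a simplified model, not the HotSpot code: Range, claim_next_dirty_range() and preclean_step() are made-up names, the card values are assumptions, and a sequentially consistent std::atomic_thread_fence stands in for whatever barrier the RFE would actually pick.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed card values for this model only.
constexpr std::uint8_t kClean      = 0xff;
constexpr std::uint8_t kDirty      = 0x00;
constexpr std::uint8_t kPrecleaned = 0x01;  // the "write 0x1 to range" step

// Half-open card index range [begin, end). Hypothetical helper type.
struct Range {
    std::size_t begin;
    std::size_t end;
};

// Phase 1: find the next run of dirty cards at or after `from` and
// overwrite each card in that run with the precleaned value.
inline Range claim_next_dirty_range(std::vector<std::atomic<std::uint8_t>>& cards,
                                    std::size_t from) {
    std::size_t b = from;
    while (b < cards.size() &&
           cards[b].load(std::memory_order_relaxed) != kDirty) {
        ++b;
    }
    std::size_t e = b;
    while (e < cards.size() &&
           cards[e].load(std::memory_order_relaxed) == kDirty) {
        cards[e].store(kPrecleaned, std::memory_order_relaxed);
        ++e;
    }
    return Range{b, e};
}

// One precleaning step: phase 1, explicit fence, then phase 2.
template <typename ScanFn>
inline Range preclean_step(std::vector<std::atomic<std::uint8_t>>& cards,
                           std::size_t from, ScanFn scan) {
    Range r = claim_next_dirty_range(cards, from);

    // Explicit barrier between the two phases, so the ordering no longer
    // depends on barriers implied by nearby lock/unlock operations.
    std::atomic_thread_fence(std::memory_order_seq_cst);

    // Phase 2: inspect the object references covered by the range.
    if (r.begin < r.end) {
        scan(r);
    }
    return r;
}
```

The only point of the sketch is where the fence sits: after the cards have been overwritten and before the memory they cover is inspected.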

> is another lock/unlock after the unlock in the CMSTokenSync
> destructor then we're golden even on those processors I've never come
> across.

There is another CMSTokenSync, and grabbing of two additional locks. See
CMSTokenSyncWithLocks before the iteration (call to
CompactibleFreeListSpace::object_iterate_careful_m()) in the mentioned

> So we could make conditional card marking the default and use STLRB
> (or DMB ST?  Might be better).  I really don't want to emit
> unnecessary store barriers.

The only condition that is necessary is that the card table store only
becomes visible after the reference store. I do not know how this
translates to the ARMv8 instruction set in an optimal way.
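
In C++ memory model terms, that ordering requirement looks roughly like the sketch below. This is only an illustration: CardTable and store_ref_with_cond_card_mark() are made-up names, the card values and table size are assumptions, and the code shows only the "card store after reference store" condition. It does not address whether a collector needs a further barrier before the card load in the conditional check.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed card values for this model only.
constexpr std::uint8_t kCleanCard = 0xff;
constexpr std::uint8_t kDirtyCard = 0x00;

// Hypothetical, tiny stand-in for a card table.
struct CardTable {
    std::atomic<std::uint8_t> cards[64];
    CardTable() {
        for (auto& c : cards) {
            c.store(kCleanCard, std::memory_order_relaxed);
        }
    }
};

// Store a reference into `field`, then mark its card dirty unless the
// card already reads as dirty (the conditional card mark).
inline void store_ref_with_cond_card_mark(std::atomic<void*>& field,
                                          void* new_val,
                                          CardTable& ct,
                                          std::size_t card_index) {
    // The reference store itself.
    field.store(new_val, std::memory_order_relaxed);

    // Conditional part: skip the card store if the card is already dirty.
    if (ct.cards[card_index].load(std::memory_order_relaxed) != kDirtyCard) {
        // Release store: a thread that observes the dirty card with an
        // acquire load also observes the reference store above.
        ct.cards[card_index].store(kDirtyCard, std::memory_order_release);
    }
}
```

How a compiler lowers the release store on ARMv8 (a store-release instruction versus a separate barrier followed by a plain store) is exactly the open question above; the sketch only pins down the ordering that has to hold.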

> One other thing: do other GCs need this releasing store?

For all but G1 it is sufficient that the memory contents are consistent
at the safepoint. The safepointing makes sure that this is the case.
They do not do any concurrent operations.

G1 issues the correct barriers as far as we know (StoreLoad/LoadStore
pair), so if these are emitted appropriately in the compiler and the
runtime, it should be fine.

