Single byte Atomic::cmpxchg implementation

Erik Österlund erik.osterlund at
Thu Sep 11 11:30:25 UTC 2014

On 11 Sep 2014, at 03:25, David Holmes <david.holmes at> wrote:
> The Atomic operations must provide full bi-directional fence semantics, so a full sync on entry is required in my opinion. I agree that the combination of bne+isync would suffice on the exit path.

I see no reason for the atomic operations to provide more than full acquire and release semantics (hence sequential consistency), in addition to the atomic update itself.
For that, I see no reason why the write barrier on entry needs to be a full sync rather than an lwsync. The XNU kernel implementation also uses lwsync for release semantics and isync for acquire.
Why would this be different for us? From the XNU kernel (note the choice of fences, which is the one I argue for):

compare_and_swap32_on64b:           // bool OSAtomicCompareAndSwapBarrier32( int32_t old, int32_t new, int32_t *value);
        lwsync                      // write barrier, NOP'd on a UP
1:
        lwarx   r7,0,r5
        cmplw   r7,r3
        bne--   2f
        stwcx.  r4,0,r5
        bne--   1b
        isync                       // read barrier, NOP'd on a UP
        li      r3,1
        blr
2:
        li      r8,-8               // on 970, must release reservation
        li      r3,0                // return failure
        stwcx.  r4,r8,r1            // store into red zone to release
        blr
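
To make the fence placement easier to compare, here is a minimal sketch of the same shape in portable C++11. It is only an illustration and not HotSpot's Atomic::cmpxchg: the name cas32_barrier is hypothetical, and how a compiler lowers these fences on PowerPC (for example whether the acquire fence becomes isync or lwsync) is outside what the sketch controls.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper for illustration only. Returns true if *dest held
// `expected` and was replaced by `desired`, false otherwise.
inline bool cas32_barrier(std::atomic<std::int32_t>* dest,
                          std::int32_t expected,
                          std::int32_t desired) {
    // Release fence on entry: earlier loads and stores may not be reordered
    // past the store performed below (the role lwsync plays in the XNU code).
    std::atomic_thread_fence(std::memory_order_release);

    // The atomic update itself is relaxed, like the bare lwarx/stwcx. loop;
    // all ordering comes from the surrounding fences.
    bool swapped = dest->compare_exchange_strong(
        expected, desired,
        std::memory_order_relaxed, std::memory_order_relaxed);

    if (swapped) {
        // Acquire fence on the success path only: later accesses may not be
        // hoisted above the load of *dest (the role of bne + isync).
        std::atomic_thread_fence(std::memory_order_acquire);
    }
    return swapped;
}
```

This gives acquire and release ordering only. The conservative alternative David describes would correspond to replacing the entry fence with std::atomic_thread_fence(std::memory_order_seq_cst), which is a full two-way fence.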

> But this is a complex area, involving hardware that doesn't always follow the rules, so conservatism is understandable.

As for hardware that does not follow the rules, I don't know what to do about that. Can we confirm that there is specific hardware whose fences do not behave according to the specification?
In my opinion, it becomes very difficult to cater for incorrect hardware implementations.

> But this needs to be taken up with the PPC64 folk who did this port.

I agree, it would be very helpful to hear the perspective of the ones who wrote our implementation.


More information about the hotspot-dev mailing list