RFR(M): 8154736: enhancement of cmpxchg and copy_to_survivor for ppc64
david.holmes at oracle.com
Sat Nov 5 17:48:29 UTC 2016
Sorry for the delayed response. I think I should start a new thread to
discuss Atomic::r-m-w memory semantics.
On 1/11/2016 7:44 PM, Andrew Haley wrote:
> On 31/10/16 21:30, David Holmes wrote:
>> On 31/10/2016 7:32 PM, Andrew Haley wrote:
>>> On 30/10/16 21:26, David Holmes wrote:
>>>> On 31/10/2016 4:36 AM, Andrew Haley wrote:
>>>>> And, while we're on the subject, is memory_order_conservative actually
>>>>> defined anywhere?
>>>> No. It was chosen to represent the current status quo that the Atomic::
>>>> ops should all be (by default) full bi-directional fences.
>>> Does that mean that a CAS is actually stronger than a load acquire
>>> followed by a store release? And that a CAS is a release fence even
>>> when it fails and no store happens?
>> Yes. Yes.
>> // All of the atomic operations that imply a read-modify-write
>> // action guarantee a two-way memory barrier across that
>> // operation. Historically these semantics reflect the strength
>> // of atomic operations that are provided on SPARC/X86. We assume
>> // that strength is necessary unless we can prove that a weaker
>> // form is sufficiently safe.
> Mmmm, but that doesn't say anything about a CAS that fails. But fair
> enough, I accept your interpretation.
>> But there is some contention as to whether the actual implementations
>> obey this completely.
> Linux/AArch64 uses GCC's __sync_val_compare_and_swap, which is
> specified as a
> "full barrier". That is, no memory operand is moved across the
> operation, either forward or backward. Further, instructions are
> issued as necessary to prevent the processor from speculating loads
> across the operation and from queuing stores after the operation.
> ... which reads the same as the language you quoted above, but looking
> at the assembly code I'm sure that it's really no stronger than a seq
> cst load followed by a seq cst store.
> I guess maybe I could give up fighting this and implement all AArch64
> CAS sequences as
> CAS(seq_cst); full fence
> or, even more extremely,
> full fence; CAS(relaxed); full fence
> but it all seems unreasonably heavyweight.
>>> And that a conservative load is a *store* barrier?
>> Not sure what you mean. Atomic::load is not a r-m-w action so not
>> expected to be a two-way memory barrier.