[aarch64-port-dev ] RFR: aarch64: minor improvements of atomic operations
felix.yang at huawei.com
Tue Nov 12 12:02:34 UTC 2019
> -----Original Message-----
> From: Andrew Dinn [mailto:adinn at redhat.com]
> Sent: Tuesday, November 12, 2019 5:42 PM
> To: Andrew Haley <aph at redhat.com>; Yangfei (Felix)
> <felix.yang at huawei.com>; aarch64-port-dev at openjdk.java.net
> Cc: hotspot-dev at openjdk.java.net
> Subject: Re: [aarch64-port-dev ] RFR: aarch64: minor improvements of atomic
> On 12/11/2019 09:25, Andrew Haley wrote:
> > On 11/12/19 8:37 AM, Yangfei (Felix) wrote:
> >> This has been discussed somewhere before:
> >> https://patchwork.kernel.org/patch/3575821/
> >> Let's keep the current status for safe.
> > Yes.
> > It's been interesting to see the progress of this patch. I don't think
> > it's the first time that someone has been tempted to change this code
> > to make it "more efficient".
> > I wonder if we could perhaps add a comment to that code so that it
> > doesn't happen again. I'm not sure exactly what the patch should say
> > beyond "do not touch". Perhaps something along the lines of "Do not
> > touch this code unless you have at least Black Belt, 4th Dan in memory
> > ordering." :-)
> > More seriously, maybe simply "Note that memory_order_conservative
> > requires a full barrier after atomic stores. See
> > https://patchwork.kernel.org/patch/3575821/"
> Yes, that would be a help. It's particularly easy to get confused here because
> we happily omit the ordering of an stlr store wrt subsequent stores when the
> strl is implementing a Java volatile write or a Java cmpxchg.
> So, it might be worth adding a rider that implementing the full
> memory_order_conservative semantics is necessary because VM code relies
> on the strong ordering wrt writes that the cmpxchg is required to provide.
I also suggest we implement these functions with inline assembly here.
For Atomic::PlatformXchg, we may issue two consecutive full memory barriers with the current status.
I used GCC 7.3.0 to compile the following function:
$ cat test.c
long foo(long add_value, long volatile* dest, long exchange_value)
long val = __sync_lock_test_and_set(dest, exchange_value);
$ cat test.s
.type foo, %function
ldxr x0, [x1]
stxr w3, x2, [x1]
cbnz w3, .L2
dmb ish < ========
dmb ish < ========
.size foo, .-foo
.ident "GCC: (GNU) 7.3.0"
.section .note.GNU-stack,"", at progbits
More information about the hotspot-dev