AARCH64: 8139041: Redundant DMB instructions (CORRECTED)

Andrew Haley aph at redhat.com
Wed Oct 7 11:45:40 UTC 2015

In many places we issue redundant memory barriers.

For example, in (C2-compiled) java.util.concurrent.ConcurrentHashMap$Node::<init> we see:

  0x000003ffa85d73d0: dmb ish
  0x000003ffa85d73d4: mov x11, x4
  0x000003ffa85d73d8: lsr xmethod, x1, #9
  0x000003ffa85d73dc: str w11, [x1,#20]
  0x000003ffa85d73e0: dmb ishst
  0x000003ffa85d73e4: strb wzr, [x10,x12,lsl #0]
 ;; membar_volatile
  0x000003ffa85d73e8: dmb ish ;*putfield val
                                                ; - java.util.concurrent.ConcurrentHashMap$Node::<init>@16 (line 629)

 ;; membar_release
  0x000003ffa85d73ec: dmb ish
  0x000003ffa85d73f0: mov x11, x5
  0x000003ffa85d73f4: lsr xmethod, x1, #9
  0x000003ffa85d73f8: str w11, [x1,#24]
  0x000003ffa85d73fc: dmb ishst
  0x000003ffa85d7400: strb wzr, [x10,x12,lsl #0]
 ;; membar_volatile
  0x000003ffa85d7404: dmb ish ;*putfield next
                                                ; - java.util.concurrent.ConcurrentHashMap$Node::<init>@22 (line 630)

We see essentially the same effect in C1-compiled code.

I'm sure that it is possible to write (or modify) a C2 pass to remove
these.  However, the ideal graph structure around the barriers is
complex, getting it right would be hard, and it would add compilation
time.

There is a much simpler way: remove adjacent barriers in
MacroAssembler.  Thanks to the way that the AArch64 ISA is designed,
barriers can be merged simply by ORing them together.  Of course, this
technique works for C1 and C2, and it adds essentially nothing to the
compilation time.
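
To make the idea concrete, here is a minimal standalone sketch of the
merging, not the actual patch.  Every name in it (TinyAssembler,
membar, bind_label, last_barrier) is hypothetical and is not the
HotSpot MacroAssembler or CodeBuffer API.  It assumes only the
architectural DMB encoding, where the barrier option lives in the CRm
field (bits 11:8), so that ishld (0x9) ORed with ishst (0xA) gives
ish (0xB).  Resetting the state when a label is bound is my reading of
why labels have to be tracked: a branch target between two barriers
must still reach a barrier of its own.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch; none of these names come from HotSpot.
// CRm field values for DMB in the inner-shareable domain.
enum Barrier : uint32_t {
  ISHLD = 0x9,  // dmb ishld
  ISHST = 0xA,  // dmb ishst
  ISH   = 0xB,  // dmb ish
};

constexpr uint32_t DMB_BASE = 0xD50330BF;  // DMB encoding with CRm = 0

struct TinyAssembler {
  std::vector<uint32_t> code;
  long last_barrier = -1;  // index of a trailing DMB, or -1 if none is mergeable

  // Emit any non-barrier instruction; a later barrier is no longer adjacent.
  void emit(uint32_t insn) {
    code.push_back(insn);
    last_barrier = -1;
  }

  // Binding a label here makes this point a possible branch target,
  // so the next barrier must not be folded into an earlier one.
  void bind_label() { last_barrier = -1; }

  // Emit a barrier, or OR it into the immediately preceding one.
  void membar(Barrier kind) {
    assert(kind == ISHLD || kind == ISHST || kind == ISH);
    if (last_barrier >= 0) {
      code[last_barrier] |= static_cast<uint32_t>(kind) << 8;
      return;
    }
    code.push_back(DMB_BASE | (static_cast<uint32_t>(kind) << 8));
    last_barrier = static_cast<long>(code.size()) - 1;
  }
};
```

With this sketch, membar(ISHLD) followed directly by membar(ISHST)
leaves a single word, 0xD5033BBF, which is dmb ish.  In the listing
above, the dmb ish for membar_volatile and the dmb ish for
membar_release that follows it would collapse into one instruction in
the same way.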


One thing which may be controversial is that I've added a field to
CodeBuffer to keep track of barriers and labels.  I had to do this
because when we're compiling there is (AFAICS) essentially nowhere
else to keep the state.



