[aarch64-port-dev] Using LoadAcquire & StoreRelease instructions

Lindenmaier, Goetz goetz.lindenmaier at sap.com
Mon Nov 24 14:42:39 UTC 2014

Hi Andrew, 

Yes, you are right on both points.
The first is needed only for IA64, because there we leave out the
store fence after object initialization.

The second, in graphKit, is superfluous.  We figured that out, too,
not long ago.

Thanks for improving this,
  Martin & Goetz.

-----Original Message-----
From: Andrew Haley [mailto:aph at redhat.com] 
Sent: Monday, 24 November 2014 15:08
To: Lindenmaier, Goetz; hotspot-dev Source Developers; aarch64-port-dev at openjdk.java.net
Subject: Using LoadAcquire & StoreRelease instructions

Just to recap: when we have LoadAcquire & StoreRelease instructions we
don't also need fences.  We respect MemNode::unordered in the C2
aarch64.ad and generate LoadAcquire & StoreRelease.  I have changed
HotSpot in a few places so that we can disable the separate fences.
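
To illustrate the distinction in portable terms, here is a minimal C++
sketch (not HotSpot code; Channel, publish_*, consume_* and round_trip
are hypothetical names).  Variant A puts the ordering on the store and
load themselves, which compilers for AArch64 typically map to
store-release and load-acquire instructions.  Variant B uses relaxed
accesses plus separate fences.  Either variant is sufficient on its own
for this message-passing pattern, so combining both is redundant.

```cpp
#include <atomic>
#include <thread>

// Hypothetical types and names for illustration only; not HotSpot code.
struct Channel {
    int payload = 0;                 // plain (non-atomic) data
    std::atomic<bool> ready{false};  // flag that publishes the payload
};

// Variant A: the ordering is attached to the store and the load.
inline void publish_release(Channel& c, int v) {
    c.payload = v;
    c.ready.store(true, std::memory_order_release);
}

inline int consume_acquire(Channel& c) {
    while (!c.ready.load(std::memory_order_acquire)) { /* spin */ }
    return c.payload;
}

// Variant B: relaxed accesses, ordering supplied by standalone fences.
inline void publish_fenced(Channel& c, int v) {
    c.payload = v;
    std::atomic_thread_fence(std::memory_order_release);
    c.ready.store(true, std::memory_order_relaxed);
}

inline int consume_fenced(Channel& c) {
    while (!c.ready.load(std::memory_order_relaxed)) { /* spin */ }
    std::atomic_thread_fence(std::memory_order_acquire);
    return c.payload;
}

// One producer/consumer round trip using the chosen variant.
inline int round_trip(bool use_fences, int v) {
    Channel c;
    int seen = 0;
    std::thread consumer([&] {
        seen = use_fences ? consume_fenced(c) : consume_acquire(c);
    });
    std::thread producer([&] {
        if (use_fences) {
            publish_fenced(c, v);
        } else {
            publish_release(c, v);
        }
    });
    producer.join();
    consumer.join();
    return seen;
}
```

Both variants make the payload written before the flag visible to the
thread that observes the flag; they differ only in where the ordering
is expressed.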

However, there are two places in the HotSpot code base where I've had
to conditionalize on AArch64 because a store is marked as a release
where we don't need it to be.

The first is a store to a non-volatile OOP field, which I think you
said was for IA-64, because IA-64 does not have a store fence at the
end of object initialization.  I understand that argument and it makes
sense, but can we make this IA64_ONLY, or is it wanted for other
architectures as well?

--- old/src/share/vm/opto/memnode.hpp	2014-11-21 12:09:22.766963837 -0500
+++ new/src/share/vm/opto/memnode.hpp	2014-11-21 12:09:22.546983320 -0500
@@ -503,6 +503,10 @@
   // Conservatively release stores of object references in order to
   // ensure visibility of object initialization.
   static inline MemOrd release_if_reference(const BasicType t) {
+    // AArch64 doesn't need a release store here because object
+    // initialization contains the necessary barriers.
+    AARCH64_ONLY(return unordered);
     const MemOrd mo = (t == T_ARRAY ||
                        t == T_ADDRESS || // Might be the address of an object reference (`boxing').
                        t == T_OBJECT) ? release : unordered;

The second is for release stores into the card table, which I believe
are not needed on any architecture.  (G1 is irrelevant here: it has
its own card table code, with the fences it needs.)  If updating the
card table really does need to be a release store then we must insert
a fence here for every architecture.  However, I don't think we do,
and the release here has a significant performance impact.

--- old/src/share/vm/opto/graphKit.cpp	2014-11-21 12:09:20.017207376 -0500
+++ new/src/share/vm/opto/graphKit.cpp	2014-11-21 12:09:19.787227745 -0500
@@ -3813,7 +3813,8 @@

   // Smash zero into card
   if( !UseConcMarkSweepGC ) {
-    __ store(__ ctrl(), card_adr, zero, bt, adr_type, MemNode::release);
+    __ store(__ ctrl(), card_adr, zero, bt, adr_type,
+             NOT_AARCH64(MemNode::release) AARCH64_ONLY(MemNode::unordered));
   } else {
     // Specialized path for CM store barrier
     __ storeCM(__ ctrl(), card_adr, zero, oop_store, adr_idx, bt, adr_type);
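
For readers unfamiliar with card marking, here is a simplified sketch
of what the store above does.  This is not HotSpot's implementation:
CardTable, the 512-byte card size and the clean value are assumptions
made for illustration; only the "dirty is zero" convention is taken
from the code above.  The two mark_* functions differ only in the
memory order requested for the card store, which is the point under
discussion.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified card table; not HotSpot's implementation.
constexpr unsigned kCardShift = 9;        // assumption: 512-byte cards
constexpr std::uint8_t kDirty = 0;        // "smash zero into card"
constexpr std::uint8_t kClean = 0xff;     // assumption for this sketch

class CardTable {
public:
    explicit CardTable(std::size_t heap_bytes)
        : cards_((heap_bytes >> kCardShift) + 1) {
        for (auto& c : cards_) {
            c.store(kClean, std::memory_order_relaxed);
        }
    }

    // Card store with no ordering constraint (the "unordered" case).
    void mark_unordered(std::size_t heap_offset) {
        cards_[heap_offset >> kCardShift].store(kDirty,
                                                std::memory_order_relaxed);
    }

    // Card store with release ordering (the case being questioned).
    void mark_release(std::size_t heap_offset) {
        cards_[heap_offset >> kCardShift].store(kDirty,
                                                std::memory_order_release);
    }

    bool is_dirty(std::size_t heap_offset) const {
        return cards_[heap_offset >> kCardShift]
                   .load(std::memory_order_relaxed) == kDirty;
    }

private:
    std::vector<std::atomic<std::uint8_t>> cards_;
};
```

Either function leaves the same value in the card; the release variant
additionally orders earlier memory accesses before the card store,
which is where the extra cost comes from.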

So, while I am perfectly happy to just disable these for AArch64, it
would be better for all concerned to have a resolution of this.

