erik.osterlund at lnu.se
Mon Jan 12 14:18:51 UTC 2015
Thank you for the feedback! :)
On 10/01/15 11:11, David Holmes wrote:
> acquire/release semantics are problematic because there is no single
> common accepted meaning. Even talking about acquire() and release() as
> independent operations is somewhat meaningless - it is only when they
> are paired, and fixed in place that they start to give meaning to things.
I agree it is a bit vague. The description in the comments about
preventing all memory accesses from floating up and down is the one I
recognize and it is conceptually good. But then it is later defined in a
table as acquire = Load(Store|Load) and release = (Store|Load)Store. The
reason is understandable - even though the description of
acquire/release is quite general, they are only intended and designed to
be used in certain contexts like release store and load acquire (with
variants). Outside such similar contexts, it would not give the
This gets more confusing when you go on reading that fence is
conceptually acquire and release. It is true on the conceptual level
using the conceptual descriptions of acquire/release, but the actual
guarantees of release + acquire is StoreStore, LoadLoad and LoadStore
according to the tables below, while fence() also provides StoreLoad
constraints. I don't know about you but I find that a bit misleading.
I mean, it might be quite obvious for anyone using it to only use
acquire/release in the intended contexts where it makes sense and then
there is no trouble, but I guess it doesn't hurt to add some comments
making it explicit to avoid any confusion. We don't want confusion I think.
> load_acquire/store_release don't necessarily map cleanly to
> acquire()/release() as currently described in orderAccess.hpp; nor do
> they necessarily map cleanly to the same named operations on eg ARMv8;
> nor to similar named entities in C++11. As Paul stated these were
> introduced to map to the ia64 architecture - which I believe had
> somewhat different semantics.
They do map cleanly after I change acquire() and release() to compiler
barriers on TSO platforms (and fencing on PPC). Except some exceptions
like load acquire on PPC that was specialized for speed using twi-isync
and x86 on windows that don't need the stronger barriers since volatile
is more accurate anyway.
> This whole area is a minefield because there are different concepts of
> what "acquire" and "release" may mean - so you need to start with
> detailed definitions. Personally, and I know I am not alone, I abhor the
> acquire/release terminology and usage, I much prefer the unambiguous
> storeload, storestore etc. :)
Yes I think we need a better concrete guarantee/contract. The current
conceptual description of acquire/release is nice I think and useful
also for understanding when to use it, but I want to make it explicit
that release is only used in conjunction with writes and acquire only in
conjunction with loads (and swapping instructions etc), and therefore do
not give the same guarantees as described as the more general conceptual
The concrete guarantee users can rely on is acquire = Load(Store|Load)
and release = (Store|Load)Store, and that's consistent with the
implementation and reasonable intended usage. Hence, it's fine to use
them in the way acquire/release is described now, iff these barriers
suffice to guarantee it, otherwise acquire/release should not be used in
the given context. Does that make sense?
> The current code often expresses things in a semantically incorrect way
> eg that:
> store_release(addr, val) == store(addr, val); release();
> but in practice the end code provides what is required (often providing
> a stronger barrier because that's all that is available on a platform).
I'm not sure I get what you mean by this being semantically incorrect. I
get it's suboptimal and might be slower triggering compiler barriers
that won't be necessary when joined, but semantically surely it should
be the same, right?
> And I agree with Karen - tackle correctness first then refactoring. IT
> will be too hard to track correctness otherwise. Personally, given I
> work in this code quite a bit, I like to see the complete set of
> definitions for a platform in one place. The duplication is something I
> can live with when it is mostly one-liners.
Okay no problem. I built the general variant first (because it was
easier since I don't have access to other platforms except mine haha).
But I fully understand yours and Karen's motivation so I will get back
to you when I manage to do it the code duplicated way so it's easier to
follow the evolution.
Oh and speaking of which, I find it very strange that store_fence and
release_store_fence is duplicated on x86 because we don't want to cast
away the volatile. This whole non-volatile thing seems a bit off to me -
surely all calls to OrderedAccess should be expected to be volatile,
nothing else would make any sense? And if so, why is store_fence the one
exception that is non-volatile while none of the others are? Unless
anyone knows there is a good reason for this, I'd like to make it
consistent with the rest of the code and make it volatile as well (and
hence get rid of some code duplication while at it). Permission granted? ;)
Thank you David for a nice discussion about this! :)
More information about the hotspot-dev