david.holmes at oracle.com
Tue Jan 13 02:37:36 UTC 2015
Quick partial response ...
On 13/01/2015 12:18 AM, Erik Österlund wrote:
> Hi David,
> Thank you for the feedback! :)
> On 10/01/15 11:11, David Holmes wrote:
>> acquire/release semantics are problematic because there is no single
>> common accepted meaning. Even talking about acquire() and release() as
>> independent operations is somewhat meaningless - it is only when they
>> are paired, and fixed in place that they start to give meaning to things.
> I agree it is a bit vague. The description in the comments about
> preventing all memory accesses from floating up and down is the one I
> recognize and it is conceptually good. But then it is later defined in a
> table as acquire = Load(Store|Load) and release = (Store|Load)Store.
The table is not a definition, it is an example of how the
acquire/release semantics can be implemented on the different architectures:
is sufficient to implement acquire, but not exactly equivalent -
acquire/release semantics can not be expressed exactly in terms of
loadload|loadStore etc, as acquire/release define one-way barriers
whereas loadload etc are bi-directional.
> reason is understandable - even though the description of
> acquire/release is quite general, they are only intended and designed to
> be used in certain contexts like release store and load acquire (with
> variants). Outside such similar contexts, it would not give the
> guarantees described.
> This gets more confusing when you go on reading that fence is
> conceptually acquire and release. It is true on the conceptual level
> using the conceptual descriptions of acquire/release, but the actual
> guarantees of release + acquire is StoreStore, LoadLoad and LoadStore
> according to the tables below, while fence() also provides StoreLoad
> constraints. I don't know about you but I find that a bit misleading.
Yes conceptually fence is "back-to-back acquire and release" but that
doesn't mean fence can necessarily be implemented by concatenating the
implementations of acquire and release on any given platform. The
problem is the "back-to-back" part as you need something to stop loads
from moving down past the acquire, and stores from moving up before the
release. Hence to implement fence you need all 4 membars and you don't
need the "dummy" load or store.
> I mean, it might be quite obvious for anyone using it to only use
> acquire/release in the intended contexts where it makes sense and then
> there is no trouble, but I guess it doesn't hurt to add some comments
> making it explicit to avoid any confusion. We don't want confusion I think.
>> load_acquire/store_release don't necessarily map cleanly to
>> acquire()/release() as currently described in orderAccess.hpp; nor do
>> they necessarily map cleanly to the same named operations on eg ARMv8;
>> nor to similar named entities in C++11. As Paul stated these were
>> introduced to map to the ia64 architecture - which I believe had
>> somewhat different semantics.
> They do map cleanly after I change acquire() and release() to compiler
> barriers on TSO platforms (and fencing on PPC). Except some exceptions
> like load acquire on PPC that was specialized for speed using twi-isync
> and x86 on windows that don't need the stronger barriers since volatile
> is more accurate anyway.
>> This whole area is a minefield because there are different concepts of
>> what "acquire" and "release" may mean - so you need to start with
>> detailed definitions. Personally, and I know I am not alone, I abhor the
>> acquire/release terminology and usage, I much prefer the unambiguous
>> storeload, storestore etc. :)
> Yes I think we need a better concrete guarantee/contract. The current
> conceptual description of acquire/release is nice I think and useful
> also for understanding when to use it, but I want to make it explicit
> that release is only used in conjunction with writes and acquire only in
> conjunction with loads (and swapping instructions etc), and therefore do
> not give the same guarantees as described as the more general conceptual
> description promises.
Or put another way there is no acquire()/release() only
> The concrete guarantee users can rely on is acquire = Load(Store|Load)
> and release = (Store|Load)Store, and that's consistent with the
Nope as per above that is not correct.
> implementation and reasonable intended usage. Hence, it's fine to use
> them in the way acquire/release is described now, iff these barriers
> suffice to guarantee it, otherwise acquire/release should not be used in
> the given context. Does that make sense?
>> The current code often expresses things in a semantically incorrect way
>> eg that:
>> store_release(addr, val) == store(addr, val); release();
>> but in practice the end code provides what is required (often providing
>> a stronger barrier because that's all that is available on a platform).
> I'm not sure I get what you mean by this being semantically incorrect. I
> get it's suboptimal and might be slower triggering compiler barriers
> that won't be necessary when joined, but semantically surely it should
> be the same, right?
Sorry I got the API backwards. What we often see is:
release_store(addr, val) == release(); store(addr,val);
but semantically that is wrong because release() allows stores to move
ahead of it so the RHS could be executed as:
store(addr, val); release()
which != release_store (which binds the store tightly to the 'release'
part). But on a given platform release() might be implemented in such a
way that the store can't move ahead of it - so the code is right, but
the way it is written is potentially incorrect given the abstract
specifications. Though another way to look at this, given these are all
contained in per-platform files, is that when you see something like:
release_store(addr, val) == release(); store(addr,val);
it isn't a claim of general equivalence, but a statement that is
accurate for the implementation on the current platform. (Though I'd
still prefer not to see it written this way - and that is what I want to
fix under https://bugs.openjdk.java.net/browse/JDK-7143664 )
BTW this is another point of confusion when we have an API like
release_store() but instructions like st.rel - which side of the barrier
is the store on, and does it matter?
>> And I agree with Karen - tackle correctness first then refactoring. IT
>> will be too hard to track correctness otherwise. Personally, given I
>> work in this code quite a bit, I like to see the complete set of
>> definitions for a platform in one place. The duplication is something I
>> can live with when it is mostly one-liners.
> Okay no problem. I built the general variant first (because it was
> easier since I don't have access to other platforms except mine haha).
> But I fully understand yours and Karen's motivation so I will get back
> to you when I manage to do it the code duplicated way so it's easier to
> follow the evolution.
> Oh and speaking of which, I find it very strange that store_fence and
> release_store_fence is duplicated on x86 because we don't want to cast
> away the volatile. This whole non-volatile thing seems a bit off to me -
Very long, depressing history of issues with volatiles and casting with
different compilers. So what you see is the result of what needed to be
done to "make things work".
> surely all calls to OrderedAccess should be expected to be volatile,
I agree, and I've suggested this in the past too, but was told that
making variables volatile incurs too much performance penalty in some
cases, so we only cast them to volatile when needed for these ops.
> nothing else would make any sense? And if so, why is store_fence the one
> exception that is non-volatile while none of the others are? Unless
> anyone knows there is a good reason for this, I'd like to make it
> consistent with the rest of the code and make it volatile as well (and
> hence get rid of some code duplication while at it). Permission granted? ;)
That case would need to be examined in detail. :)
> Thank you David for a nice discussion about this! :)
I'd like to say it is fun, but it really isn't :) This stuff gives me
More information about the hotspot-dev