david.holmes at oracle.com
Sat Jan 10 10:11:40 UTC 2015
Some very quick comments as there was a related meta-discussion on this
very recently (core-libs-dev) in relation to some descriptions of the
Unsafe memory barrier operations.
storestore, loadload etc are just as important (more so in my opinion)
acquire/release semantics are problematic because there is no single
common accepted meaning. Even talking about acquire() and release() as
independent operations is somewhat meaningless - it is only when they
are paired, and fixed in place that they start to give meaning to things.
load_acquire/store_release don't necessarily map cleanly to
acquire()/release() as currently described in orderAccess.hpp; nor do
they necessarily map cleanly to the same named operations on eg ARMv8;
nor to similar named entities in C++11. As Paul stated these were
introduced to map to the ia64 architecture - which I believe had
somewhat different semantics.
This whole area is a minefield because there are different concepts of
what "acquire" and "release" may mean - so you need to start with
detailed definitions. Personally, and I know I am not alone, I abhor the
acquire/release terminology and usage, I much prefer the unambiguous
storeload, storestore etc. :)
The current code often expresses things in a semantically incorrect way
store_release(addr, val) == store(addr, val); release();
but in practice the end code provides what is required (often providing
a stronger barrier because that's all that is available on a platform).
I also agree that these primitives have to manage both compiler
reorderings and hardware reorderings.
And I agree with Karen - tackle correctness first then refactoring. IT
will be too hard to track correctness otherwise. Personally, given I
work in this code quite a bit, I like to see the complete set of
definitions for a platform in one place. The duplication is something I
can live with when it is mostly one-liners.
On 10/01/2015 7:17 AM, Erik Österlund wrote:
> [Hoping to fix nesting of comments removed by server so you can actually
> see who said what in previous post :-|]
> Hi Karen,
> Thanks for the reply!
> On 09 Jan 2015, at 15:18, Karen Kinnear
> <karen.kinnear at oracle.com<mailto:karen.kinnear at oracle.com>> wrote:
>> I am delighted to have you looking at this code in detail to make sure it is accurate and maintainable.
>> In theory I like refactoring. In practice I hope there is a very clear way to specify the per-platform assumptions and to make it very obvious what the end results are
>> for a given platform and API.
> For each platform you only have to specify two things:
> 1. What is needed for acquire, release and fence for memory accesses,
> for that platform (5 LoC per platform). The pattern of when to use them
> (semantics) is generalized in shared code.
> 2. To provide any specializations for optimization purposes when
> applicable (certain TSO platforms that can join the access and semantics
> to a single inline asm instruction). Note that this is optional, and
> only done to get faster accesses. Many platforms do not do it at all.
> I think it’s very intuitive! Will provide webrev if this seems worthwhile.
>> Having code reviewed (and obviously missed things) earlier rounds of OrderAccess - I just wanted to note that we never thought that C++ volatile would provide
>> acquire/release semantics. volatile has always been just a compiler directive, and yes volatile-volatile only. Acquire/release are instructions to the hardware - not
>> what I would call compiler barriers.
> Acquire/release are semantics, not necessarily bound to neither
> compiler/hardware. It gives a certain contract to the user that expects
> these acquire/release semantics. Compiler barriers and fences is merely
> a means of providing these semantics, but the user should not have to
> care about such details.
> If what you are saying is that it was intended only as acquire/release
> for hardware but not constrain compiler reorderings to comply with the
> semantics, then the contract is broken (does in fact not promise acquire
> release, only on hardware level but compilers could do whatever with
> non-volatiles which is equally as bad).
> A store release should synchronize with an acquire load to the same
> location, like in this example:
> T1: store x1
> T1: release store x2
> T2: load acquire x2
> T2: load x1
> The semantics should guarantee that if load acquire x2 happens after
> store release x2 in total order, then it is guaranteed that store x1
> will be observed by the load x1, and not some old value that was there
> before store x1. This contract does not currently hold since the normal
> non-volatile memory accesses may be reordered with volatile memory
> accesses; the stores and loads may be reordered in this example by the
> compiler at will.
> Therefore acquire release semantics are not guaranteed in general, but
> only w.r.t. volatiles. This is neither clearly documented nor used in
> such a way.
> OrderedAccess is used in a way that assumes acquire/release semantics.
> But for certain TSO platforms, the contract only provides volatile
> semantics, not acquire release semantics. On other platforms, it
> properly provides acquire release semantics, which is expected.
> It only makes sense to by contract promise acquire/release semantics
> (both for compiler and hardware reorderings) for uses of these methods,
> and consistently do so on all platforms.
>> Personally I would do the correctness fixes first - and make sure they are really well tested and carefully code reviewed studying the cookbook and memory model.
>> I have not been following the more recent java memory model evolution - I believe Aleksey Snipilev has - if you have references to any updated advice
>> from Doug Lea - that would be very helpful for your reviewers to study. I know the last time I reviewed this I had to make my own small charts and walk it
>> through based on per-platform guarantees.
> There is nothing wrong with the Java memory model. The only problem is
> that OrderAccess promises semantics that are only guaranteed on some
>> I would find it helpful if you are aware of specific bugs - if you could list the actual problems and call those out? It is very hard to get this right and
>> hard to review - so if you were to have targeted reviews for specific fixes we could give the changes the depth of study that they need.
> In for instance https://bugs.openjdk.java.net/browse/JDK-8061964 this
> was observed and caused trouble for G1 card refinement that went boom.
> In the discussion I pointed out the issue of relying on volatiles as a
> cause for the crash, and it was fixed, but not consistently for all
> platforms, but only for gcc on linux where the reordering was observed
> in the generated asm output. But it is equally as dangerous on other
> (TSO) platforms (except windows where the compiler actually explicitly
> promises not to reorder - their volatiles follow acquire/release in
> compiler reorderings). I think hoping for for compilers to not reorder
> without any guarantees, waiting for things to go boom where it has
> already been observed on other platforms seems dangerous. Better safe
> than sorry. :)
> One different solution is to redocument the contract clearly as being
> acquire/release w.r.t. volatile accesses, which is currently the actual
> contract provided by the implementation.
> But since bugs were already discovered (which is very understandable
> because no other atomic library has such semantics that I know of) and
> its code is still unchanged, it seems dangerous to do so without
> re-evaluating the use of OrderedAccess in hotspot, which seems a bit
> painful. Also such a revised contract would be a disaster when you need
> to copy loads of data and then store release for instance. This would
> force the copying (of potentially lots of data) to either be done using
> volatile accesses rather than in an accelerated fashion (e.g. SSE
> extensions, write combining), or add compiler barriers to such copying
> to give the release store actual release store semantics. This would
> also require fixing G1 where a bug was observed to not rely on the
> intuitive and expected contract, but on an unintuitive and provably
> error prone contract. Ugh!!
> The only solution that seems to make sense to me is to fix the semantics
> of OrderedAccess (that were already changed on Linux) to be what is
> expected, and that is acquire/release w.r.t. other accesses (volatile or
>> Doing the refactoring afterwards allows us the easier exercise of ensuring that the refactoring produces the same code that we already have studied
>> in depth and believe to be correct.
>> So as you refactor please do find a place to clarify the per-platform assumptions so if we realize our assumptions are wrong we will re-review the code.
> I was hoping to do both at the same time, since there are many platforms
> to be changed, and making them over and over for every type of access on
> every platform to be changed is a bit tedious (and error prone) compared
> to doing it once per platform, reusing the shared pattern (to be
> reviewed once instead of once per platform) and review the choice of
> barriers once for each platform instead of for every memory access on
> every platform. Of course I don’t mind if we choose to do things in the
> other order though if that is preferred. :)
> If you compare the change in semantics I propose, without thinking about
> the refactoring, just compare x86 on bsd and linux, you see that on
> linux a compiler barrier is correctly used, but not on bsd. This is the
> change that needs to be added to all TSO platforms except linux and
> windows to ensure correctness.
> thank you for offering and for asking for opinions :-)
> Thank you for the good feedback! It looks like this change seems wanted,
> so I’ll go ahead and prepare a webrev with the proposed changes. :)
More information about the hotspot-dev