[concurrency-interest] RFR: 8065804: JEP 171: Clarifications/corrections for fence intrinsics
davidcholmes at aapt.net.au
Tue Nov 25 23:41:35 UTC 2014
Stephan Diestelhorst writes:
> On Tuesday, 25 November 2014, 11:15:36, Hans Boehm wrote:
> > I'm no hardware architect, but fundamentally it seems to me that
> > load x
> > acquire_fence
> > imposes a much more stringent constraint than
> > load_acquire x
> > Consider the case in which the load from x is an L1 hit, but a preceding
> > load (from say y) is a long-latency miss. If we enforce ordering by just
> > waiting for completion of prior operation, the former has to wait for the
> > load from y to complete; while the latter doesn't. I find it hard to
> > believe that this doesn't leave an appreciable amount of performance on the
> > table, at least for some interesting microarchitectures.
> I agree, Hans, that this is a reasonable assumption. Load_acquire x
> does allow roach motel, whereas the acquire fence does not.
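To make the two forms under discussion concrete, here is a minimal Java sketch. It assumes the java.lang.invoke.VarHandle API (Java 9 and later), which postdates this thread; JEP 171 itself exposes the fences through sun.misc.Unsafe, where the fence in the first form would be loadFence(). The class name, the fields x and y, and the method names are hypothetical and chosen only to mirror the example above.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; x and y mirror the variables in the discussion.
class AcquireForms {
    int x;
    int y;

    // VarHandle for field x, used for the acquire-load form.
    private static final VarHandle X;
    static {
        try {
            X = MethodHandles.lookup().findVarHandle(AcquireForms.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Form 1: "load x; acquire_fence".
    // The fence orders ALL loads before it (including the load of y)
    // ahead of any loads and stores that follow it.
    int loadThenFence() {
        int ry = y;                // earlier load, possibly a long-latency miss
        int rx = x;                // plain load, possibly an L1 hit
        VarHandle.acquireFence();
        return rx + ry;
    }

    // Form 2: "load_acquire x".
    // Only this load of x is ordered ahead of later loads and stores;
    // the earlier load of y is not constrained by it.
    int loadAcquire() {
        int ry = y;                          // earlier load, not ordered by the acquire below
        int rx = (int) X.getAcquire(this);   // acquire load of x
        return rx + ry;
    }
}
```

Both methods return the same value in a single-threaded run; the difference lies only in which reorderings each form permits. In the second form, later accesses may not move above the load of x, but the earlier load of y is left free to complete late. The first form forbids that for every load preceding the fence.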
> > In addition, for better or worse, fencing requirements on at least
> > Power are actually driven as much by store atomicity issues, as by
> > the ordering issues discussed in the cookbook. This was not
> > understood in 2005, and unfortunately doesn't seem to be amenable to
> > the kind of straightforward explanation as in Doug's cookbook.
> Coming from a strongly ordered architecture to a weakly ordered one
> myself, I also needed some mental adjustment about store (multi-copy)
> atomicity. I can imagine others will be unaware of this difference,
> too, even in 2014.
Sorry, I'm missing the connection between fences and multi-copy atomicity.