[concurrency-interest] RFR: 8065804: JEP 171: Clarifications/corrections for fence intrinsics

Hans Boehm boehm at acm.org
Thu Jan 1 07:38:50 UTC 2015

If we look at using purely store fences and purely load fences in the
"initialized flag" example as in this discussion, I think it's worth
distinguishing two possible scenarios:

1) We guarantee some form of dependency-based ordering, as most real
computer architectures do.  This probably invalidates the example from my
committee paper that's under discussion here.  The problem is, as always,
that we don't know how to make this precise at the programming language
level.  It's the compiler's job to break certain dependencies, like the
dependency of the store to x on the load of y in x = 0 * y.  Many people
are thinking about this problem, both to deal with "out-of-thin-air" issues
correctly in various memory models, and to design a version of C++'s
memory_order_consume that's more usable.  If we had a way to guarantee some
well-defined notion of dependency-based ordering, then at least some of the
examples here would need to be revisited.

2) We don't guarantee that dependencies imply any sort of ordering.  Then I
think the weird example under discussion here stands.  There is officially
nothing to prevent the load of x.a in thread 1 from being reordered with
the store to x_init.
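
To make the x = 0 * y case from scenario 1 concrete, here is a minimal Java sketch. The class and method names are hypothetical, and I am not claiming that any particular compiler or JIT performs this rewrite, only that it is a legal one for int arithmetic. Once the multiplication is folded away, no load of y remains for the hardware to order the store after.

```java
public class BrokenDependency {
    static int x;
    static int y;

    // As written in the source: the stored value syntactically depends on the load of y.
    static void asWritten() {
        x = 0 * y;
    }

    // What an optimizer may legally emit for int arithmetic: the load of y is gone,
    // so there is no dependency left at the machine level.
    static void afterFolding() {
        x = 0;
    }

    // Returns true when both versions leave the same value in x for the given y.
    public static boolean sameResult(int yValue) {
        y = yValue;
        x = -1;
        asWritten();
        int a = x;

        x = -1;
        afterFolding();
        int b = x;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameResult(7));
    }
}
```

Both versions are indistinguishable to a single thread, which is exactly why a language-level dependency guarantee is hard to state.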

But there may actually be better examples as to why the store-store
ordering in the initializing thread is not always enough.  Consider:

Thread 1:
x.a = 1;
if (x.a != 1) world_is_broken = true;
StoreStore fence;
x_init = true;
if (world_is_broken) die();

Thread 2:
if (x_init) {
    full fence;
    x.a++;
}

I think there is nothing to prevent the read of x.a in Thread 1 from seeing
the incremented value, at least if (1) the compiler promotes
world_is_broken to a register, and  (2) at the assembly level the store to
x_init is not dependent on the load of x.a.  (1) seems quite plausible, and
(2) seems very reasonable if the architecture has a conditional move
instruction or the like.  (For Itanium, (2) holds even for the naive

This is not a particularly likely scenario, but I have no idea how one would
concoct programming rules that would guarantee to prevent this kind of
weirdness.  The first two statements of Thread 1 might appear inside an
"initialize a" library routine that knows nothing about concurrency.
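
For reference, a rough Java rendering of the two threads above, as a sketch under stated assumptions rather than a definitive version. It uses the static fence methods on java.lang.invoke.VarHandle (Java 9 and later, so newer than this thread) as stand-ins for the "StoreStore fence" and "full fence" in the pseudocode. The class name, the `died` flag that replaces die(), and the `runOnce` helper are hypothetical. The fields are deliberately plain, so the program is racy by design, and on typical hardware a run will not exhibit the problematic outcome.

```java
import java.lang.invoke.VarHandle;

public class InitFlagSketch {
    static class X { int a; }

    // Plain (non-volatile) fields on purpose: this mirrors the racy pseudocode.
    static final X x = new X();
    static boolean xInit;
    static boolean worldIsBroken;
    static boolean died; // hypothetical stand-in for calling die()

    static void thread1() {
        x.a = 1;
        if (x.a != 1) worldIsBroken = true;
        // Orders the earlier stores before the later store; it does not
        // constrain the load of x.a on the previous line.
        VarHandle.storeStoreFence();
        xInit = true;
        if (worldIsBroken) died = true;
    }

    static void thread2() {
        if (xInit) {
            VarHandle.fullFence();
            x.a++;
        }
    }

    // Runs both threads once and returns the final value of x.a.
    public static int runOnce() {
        x.a = 0;
        xInit = false;
        worldIsBroken = false;
        died = false;
        Thread t1 = new Thread(InitFlagSketch::thread1);
        Thread t2 = new Thread(InitFlagSketch::thread2);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return x.a;
    }

    public static void main(String[] args) {
        System.out.println("x.a = " + runOnce() + ", died = " + died);
    }
}
```

The final x.a is 1 or 2 depending on whether Thread 2 observed x_init. The outcome at issue is `died` becoming true, which requires the load of x.a in thread1 to be satisfied after Thread 2's increment.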


On Wed, Dec 17, 2014 at 10:54 AM, Martin Buchholz <martinrb at google.com>
wrote:

> On Wed, Dec 17, 2014 at 1:28 AM, Peter Levart <peter.levart at gmail.com>
> wrote:
> > On 12/17/2014 03:28 AM, David Holmes wrote:
> >>
> >> On 17/12/2014 10:06 AM, Martin Buchholz wrote:
> >> Hans allows for the nonsensical, in my view, possibility that the load of
> >> x.a can happen after the x_init=true store and yet somehow be subject to the
> >> ++ and the ensuing store that has to come before the x_init = true.
> >
> > Perhaps, he is speaking about why it is dangerous to replace BOTH release
> > with just store-store AND acquire with just load-load?
> I'm pretty sure he's talking about weakening EITHER.
> """Clearly, and unsurprisingly, it is unsafe to replace the
> load_acquire with a version that restricts only load ordering in this
> case. That would allow the store to x in thread 2 to become visible
> before the initialization of x by thread 1 is complete, possibly
> losing the update, or corrupting the state of x during initialization.
> More interestingly, it is also generally unsafe to restrict the
> release ordering constraint in thread 1 to only stores."""
> (What's "clear and unsurprising" to Hans may not be to the rest of us)
