RFR(M): 8080289: Intermediate writes in a loop not eliminated by optimizer
john.r.rose at oracle.com
Wed Jun 17 21:27:36 UTC 2015
> On Jun 17, 2015, at 1:23 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> Nope, that's an oversimplified understanding. One place where the JMM will bite you is with publication of object state via final fields. Normal stores used to initialize a structure which is published via final-field semantics must be ordered to take place before the object is published. We don't (and perhaps can't) track object publication events, nor their relation to stores into newly-reachable subgraphs. Instead, we have fences that gently but firmly ensure that data (from normal stores, even to non-final fields and array elements!) is posted to memory before any store which could be a publishing store for that data.
> Not sure what's oversimplified —
I probably misread you, then.
> you're describing a JMM semantic for final fields, which I'd expect to be modeled as barriers in the IR, just like volatile writes would be modeled as barriers, preventing removal or reordering of them. I appreciate that it can be troublesome to track this information, but that only means compiler will have to play more conservative and there may be some optimization opportunities lost. I'd think the pattern would look like:
> obj = allocZerodMemory(); // obj has final fields
> obj.ctor(); // arbitrarily long/complex CFG
> _someRef = obj;
> I'd expect redundant stores to be removed as part of ctor() CFG without violating the storestore barrier. But, I do understand the complexity/trickiness of getting this right.
You are correct. The StoreStore approximates the point at which the object is first published to other threads. All normal stores above the StoreStore can be issued in any order (as far as this fence is concerned) but must settle before the object is published. Presumably it is published shortly after the StoreStore, and the StoreStore could be sunk until that point, if we wanted to do this, or even eliminated if the object never gets published. Also, stores provably unrelated to (unreachable from) the published object could drop below the StoreStore. We don't attempt to make this distinction. None of these trains of thought affects the basic assertion that (if fences are absent) normal stores can be reordered.
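For concreteness, here is a minimal Java sketch of the pattern under discussion. The class names Box and Publisher are hypothetical, and the explicit VarHandle.storeStoreFence() call (a Java 9+ API, not available when this thread was written) is only a stand-in, assumed for illustration, for the StoreStore that the compiler places at the end of a constructor that writes final fields. Stores above the fence may be merged or reordered among themselves; none may sink below it.

```java
import java.lang.invoke.VarHandle;

// Hypothetical example class: one final field plus ordinary state.
final class Box {
    final int[] data;  // final field: its initializing store gets final-field semantics
    int count;         // normal field, still written before the fence

    Box(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = 0;      // intermediate store: redundant, may be eliminated
            a[i] = i * 2;  // the value that must be visible once published
        }
        data = a;
        count = n;
    }
}

// Hypothetical publisher: writes the reference into a plain (racy) static field.
final class Publisher {
    static Box shared;

    static Box publish(int n) {
        Box b = new Box(n);
        // Stand-in for the constructor-exit StoreStore: every store above
        // this point must settle before the publishing store below.
        VarHandle.storeStoreFence();
        shared = b;  // publishing store
        return b;
    }
}
```

Removing the a[i] = 0 store changes nothing that another thread could observe through shared, since both stores to a[i] sit above the fence.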
If we wish to remove that StoreStore (for some reason) we would either need a more precise set of fences (or HB edges), or else we would have to hold back on aggressive store reordering. This is what makes me think we may discover a missing fence, once we start letting those little stores swarm around each other.
What makes me more nervous about this is the clear fact that non-TSO platforms (PPC, Itanium) have to tweak their fences in various ad hoc ways to avoid breaking user code. See, for example, Parse::do_exits. If we make our thread-local orderings more non-TSO-ish, we might run into the same subtle issues that the PPC port wrestles with. By "subtle" I partly mean "relating to unstated user expectations even if not supported by the JMM", and I also mean "hard to detect, characterize, and fix".