From dl at cs.oswego.edu Sun Nov 2 12:36:19 2014
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 02 Nov 2014 07:36:19 -0500
Subject: [jmm-dev] Another way to punt OOTA
In-Reply-To: <20141101144717.GA7558@linux.vnet.ibm.com>
References: <54539A7E.8030606@cs.oswego.edu>
 <20141101144717.GA7558@linux.vnet.ibm.com>
Message-ID: <54562543.7070605@cs.oswego.edu>

On 11/01/2014 10:47 AM, Paul E. McKenney wrote:
> On Fri, Oct 31, 2014 at 10:19:42AM -0400, Doug Lea wrote:
>> On 10/28/2014 09:39 PM, Hans Boehm wrote:
>>> Here's another conceivable approach. This sounds somewhat crazy based on
>>> our prior assumptions, but it may only be mildly crazy, so here goes.
>>
>> I think some of the details are crazy, but I agree with most of
>> the broad points:
>>
>> First, odds are that we will "solve" OOTA and related speculation
>> constraints, but in a somewhat disappointing way: most likely by
>> defining a form of "dep" such that (rf U dep) is acyclic, but where
>> dep may sometimes be undecidable. Or any of several variations with
>> similar impact. Paul et al's N4216
>> (http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2014/n4216.html)
>> presents a meta-litmus test along these lines. With Alan, Peter,
>> Viktor, and others looking at this from various angles, I'm still
>> optimistic about discovering a solution straddled by C11 (allowing
>> OOTA) and JSR133 (disallowing valid transformations). So it still
>> seems best to imagine that we arrive (out-of-thin-air!) at a core
>> model that suffices, even if in some cases it includes caveats
>> along the lines of "read r could be 42 only if P==NP".
>
> IMHO, the undecidable piece seems a bit overblown. If you show me
> an OOTA litmus test that is undecidable, I bet I can transform it into
> a message-passing litmus test that is also undecidable. ;-)

We'd like to say: No program encounters OOTA, as one of the base
safety guarantees mentioned below (for Java, but related concerns
arise for C/C++).
Maybe we cannot quite say this, but instead something close that is
equivalent for all practical purposes.

Your N4216 is helpful in making concrete Hans's long-standing concerns
about defining "dependency": Not only do concurrent semantics become
intertwined with the base specifications of every operation available
in a language, they also encounter computational undecidability
problems when trying to establish simple properties.

But at this point, I don't see how to avoid these problems: We are
pretty sure we need (rf U dep) to be acyclic to arrive at anomaly-free
models. (Or, as Hans suggested, punting, which no one else seems to
like.)

So, I agree that OOTA (and related speculations) concerns are
overblown in that they don't impact any practical programs, but I
still don't know how to make further progress on core memory models
without somehow addressing them.

-Doug

>
> Thanx, Paul
>
>>> - The argument for providing well-defined behavior for data races in Java
>>> was based either on the original model of Java, or on the desire to
>>> unconditionally maintain secondary properties, like memory safety, even in
>>> the presence of data races.
>>
>> Yes. No-OOTA is only one part of the minimal guarantees of even
>> arbitrarily racy Java programs. This set of predicates forms the Java
>> analog of C/C++ undefinedness. Informally, we (in conjunction with
>> other part of the JLS) need to ensure that nothing anyone can write in
>> Java proper causes a JVM to crash (with disclaimers about JNI etc that
>> from a security point of view arbitrarily widen the trusted computing
>> base. BTW, this is the main reason we need to stop making people use
>> Unsafe to get the effects of enhanced-volatiles.) As I sketched out
>> last January (mostly on the pre-jmm9 list), we ought to solidify
>> meanings of memory safety and security -- these are at least as
>> important as guaranteeing SC for DRF programs.
>> And from there, we need
>> simple characterizations of at least release->acquire and create->use
>> (constructors). But I don't think we can simply characterize all the
>> intermediate points. It seems impossible to routinely distinguish
>> subtly correct vs subtly incorrect code. For example you need to know
>> that the String hash function is idempotent, and that if hash is zero,
>> it intentionally recomputes, to see that racy lazy initialization of
>> String.hashCode is OK.
>>
>>
>>> - We're essentially already making the preceding dubious assumption. We
>>> have all sorts of library APIs that don't specify how they behave in the
>>> presence of races, but only say that you shouldn't do that. We're somehow
>>> supposed to conclude that the heap can't be corrupted even in the presence
>>> of racy library calls, because of the (unstated?) assumption that libraries
>>> should behave as though they were implemented in Java, and Java programs
>>> preserve e.g. memory safety.
>>
>> To emphasize: I agree.
>>
>>>
>>> - Switch to a much more C++-like (or Java library-like) model in which data
>>> races have something like "undefined behavior". Exactly how to model that
>>> is an open question. Ordinary loads can only see stores that happen-before
>>> them, but racing loads trigger "undefined behavior". "Undefined behavior"
>>> should be defined to allow reporting an error and produce any type correct
>>> answer for the racing load.
>>
>> And to emphasize: I disagree in the general case because I don't
>> know how to do it. However, I don't think there is anything stopping
>> JVM+core-libraries conspiring to throw exceptions in particular
>> kinds of races that are never correct. Or for add-on tools to
>> dynamically or statically detect them.
>>
>>>
>>> - [Probably challenging, others understand the Java constraints better than
>>> I] Introduce a mechanism for (nearly) unordered memory-order-relaxed like
>>> racing loads.
>>> Require current racing accesses to use that mechanism
>>> instead. (Open question: coherence)
>>
>> I have yet to see a compelling example/case for guaranteeing coherence
>> for relaxed accesses to volatiles (i.e., in Java 9+, the
>> enhanced-volatile mechanism for a relaxed read.)
>>
>> -Doug
>>
>

From Stephan.Diestelhorst at arm.com Wed Nov 5 11:44:32 2014
From: Stephan.Diestelhorst at arm.com (Stephan Diestelhorst)
Date: Wed, 5 Nov 2014 11:44:32 +0000
Subject: [jmm-dev] Another way to punt OOTA
In-Reply-To: <54539A7E.8030606@cs.oswego.edu>
References: <54539A7E.8030606@cs.oswego.edu>
Message-ID:

On 31.10.2014 14:19, "Doug Lea"
wrote:

>On 10/28/2014 09:39 PM, Hans Boehm wrote:
>> Here's another conceivable approach. This sounds somewhat crazy based on
>> our prior assumptions, but it may only be mildly crazy, so here goes.
>
>I think some of the details are crazy, but I agree with most of
>the broad points:
>
>First, odds are that we will "solve" OOTA and related speculation
>constraints, but in a somewhat disappointing way: most likely by
>defining a form of "dep" such that (rf U dep) is acyclic, but where
>dep may sometimes be undecidable.

I keep thinking (feeling, really) that in most cases where programmers
would want high performance, it is relatively easy to make this
dependency chain obvious (i.e. not obfuscated through functions not
consuming their arguments etc.). In all other cases, it should not be
too costly to force a reasonable serialisation primitive (fence-like)
into the right spots? This seems to be the take from the N4216
document, if I understand it correctly.

Stephan

-- IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended
recipient, please notify the sender immediately and do not disclose the
contents to any other person, use it for any purpose, or store or copy
the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ,
Registered in England & Wales, Company No: 2548782

From martinrb at google.com Wed Nov 19 19:12:00 2014
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 19 Nov 2014 11:12:00 -0800
Subject: [jmm-dev] Will the real memory read barrier please stand up?
Message-ID:

I'm struggling to educate myself about memory barriers, and I
discovered both Paul McKenney's awesome book and those newfangled
Unsafe fences in OpenJDK.
Paul's book says

"""A read barrier is a partial ordering on loads only; it is not
required to have any effect on stores."""

Meanwhile, Unsafe.loadFence doc says,

"""Ensures lack of reordering of loads before the fence with loads or
stores after the fence."""

Spelunking in the hotspot sources, Unsafe.loadFence is implemented via
LoadFenceNode, which says:

""""Acquire" - no following ref can move before (but earlier refs can
follow, like an early Load stalled in cache)"""

So ... that's 3 different definitions?! Further, the doc for
LoadFenceNode seems non-sensical to me - if following refs (load or
store) can cross the barrier backwards, how is that different in
effect from allowing preceding refs to cross the barrier forwards?
Both seem to allow any reordering at all?! Should that have read
"""but earlier Stores (perhaps stalled in cache) can follow""" ?

From boehm at acm.org Thu Nov 20 00:38:24 2014
From: boehm at acm.org (Hans Boehm)
Date: Wed, 19 Nov 2014 16:38:24 -0800
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To:
References:
Message-ID:

I agree that the third definition is nonsensical unless a LoadFenceNode is
attached to a specific load. That seems implausible given the rest of your
description.

There doesn't seem to be any consistency about the use of definition 1 or
2. I think a definition 1 "read barrier" has very limited utility, but
some architectures provide it. ARMv8's load barrier is type 2, but ARM's
store barrier is the analog of type 1, i.e. it orders only stores.

Hans

From martinrb at google.com Thu Nov 20 03:27:43 2014
From: martinrb at google.com (Martin Buchholz)
Date: Wed, 19 Nov 2014 19:27:43 -0800
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To:
References:
Message-ID:

Maybe enlightenment can be found at preshing.com
http://preshing.com/20130922/acquire-and-release-fences/

Is Unsafe.loadFence precisely C11
atomic_thread_fence(memory_order_acquire);

Is Unsafe.storeFence precisely C11
atomic_thread_fence(memory_order_release);

If so, it would be nice if it said so!

From aph at redhat.com Thu Nov 20 09:33:52 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 20 Nov 2014 09:33:52 +0000
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To:
References:
Message-ID: <546DB580.9010602@redhat.com>

On 20/11/14 00:38, Hans Boehm wrote:
> There doesn't seem to be any consistency about the use of definition 1 or
> 2. I think a definition 1 "read barrier" has very limited utility, but
> some architectures provide it. ARMv8's load barrier is type 2, but ARM's
> store barrier is the analog of type 1, i.e. it orders only stores.

Are you saying that ARM's DMB LD is LoadLoad|LoadStore ? Doug's
JSR-133 Cookbook suggests otherwise.

Andrew.

From aph at redhat.com Thu Nov 20 10:25:17 2014
From: aph at redhat.com (Andrew Haley)
Date: Thu, 20 Nov 2014 10:25:17 +0000
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To:
References: <546DB580.9010602@redhat.com>
Message-ID: <546DC18D.7020103@redhat.com>

On 20/11/14 10:12, Joe Bowbeer wrote:
> As far as I can tell, the cookbook hasn't been updated yet for ARMv8. It
> looks like the following might be helpful in updating the recipes.
>
> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

Yes, it looks like the DMB LD is now stronger, if that page is right.

It'd be nice if someone could point me at the language in the
ARMv8 spec that this derives from.

Andrew.

From dl at cs.oswego.edu Sun Nov 23 23:58:50 2014
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 23 Nov 2014 18:58:50 -0500
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To:
References:
Message-ID: <547274BA.3060201@cs.oswego.edu>

On 11/19/2014 02:12 PM, Martin Buchholz wrote:
> I'm struggling to educate myself about memory barriers,..

Catching up after being diverted with other things... here are
a few minor footnotes to other replies:

Terminology is not very standard. To maximize clarity, I use
"fence" (vs "barrier") to reduce confusion vs garbage-collection
barriers. And "load" (vs "read") to reduce confusion vs IO.
Similarly for "store" vs "write". And when applicable, Sparc-ese
beforeAfter designations: LoadLoad, LoadStore, StoreStore,
StoreLoad and their combinations. (Although there can be
fence types that don't nicely fall into any of these categories.)

>
> """A read barrier is a partial ordering on loads only; it is not
> required to have any effect on stores."""
> Meanwhile, Unsafe.loadFence doc says,
>
> """Ensures lack of reordering of loads before the fence with loads or
> stores after the fence."""

This is a [LoadLoad|LoadStore] fence, also known as
a "load acquire" fence, which is sometimes (including
inside hotspot) just called a "load fence" because it is the
only kind of Load* fence that is ever wanted.
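
The [LoadLoad|LoadStore] ordering described above can be illustrated
with a small message-passing example. This is only a sketch under
stated assumptions: it uses the static fence methods on
java.lang.invoke.VarHandle (Java 9+) as stand-ins for the Unsafe
fences discussed in this thread, and the class, field, and method
names are hypothetical.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical illustration class; not part of the JDK or of this thread.
final class AcquireFenceSketch {
    private int data;   // plain payload field
    private int ready;  // flag, accessed only through the READY handle

    private static final VarHandle READY;
    static {
        try {
            READY = MethodHandles.lookup()
                    .findVarHandle(AcquireFenceSketch.class, "ready", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Writer side: the payload store must not move below the flag store.
    void publish(int value) {
        data = value;
        VarHandle.releaseFence();     // [LoadStore|StoreStore]
        READY.setOpaque(this, 1);
    }

    // Reader side: the payload load must not move above the flag load.
    int awaitAndRead() {
        while ((int) READY.getOpaque(this) == 0) {
            Thread.onSpinWait();
        }
        VarHandle.acquireFence();     // [LoadLoad|LoadStore]
        return data;
    }

    // Runs one writer thread against the calling thread as reader.
    static int runOnce() {
        AcquireFenceSketch box = new AcquireFenceSketch();
        Thread writer = new Thread(() -> box.publish(42));
        writer.start();
        int seen = box.awaitAndRead();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

The flag is read with getOpaque so that the spin loop actually
re-reads memory; the acquire fence after the loop is what keeps the
later load of data from being ordered before the flag load.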
The hotspot intrinsic name was chosen to match this existing internal
convention, at the expense of external confusion. (It's not the only
such case with intrinsics.)

>
> Spelunking in the hotspot sources, Unsafe.loadFence is implemented via
> LoadFenceNode, which says:
>
> [...confusing things...]

I think most of the internal docs/comments remain from
pre-JDK5. It might be a good idea to update them.

-Doug

From dl at cs.oswego.edu Mon Nov 24 00:09:54 2014
From: dl at cs.oswego.edu (Doug Lea)
Date: Sun, 23 Nov 2014 19:09:54 -0500
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To: <546DC18D.7020103@redhat.com>
References: <546DB580.9010602@redhat.com> <546DC18D.7020103@redhat.com>
Message-ID: <54727752.5060504@cs.oswego.edu>

On 11/20/2014 05:25 AM, Andrew Haley wrote:
> On 20/11/14 10:12, Joe Bowbeer wrote:
>> As far as I can tell, the cookbook hasn't been updated yet for ARMv8.

Mainly out of conservatism given the history of revising ARM and POWER
entries multiple times over the years, sometimes based on incomplete or
wrong information. It seems worth waiting for multiple validations,
including experience from the ARMv8 hotspot port. This is cruel to Andrew
(sorry!) but still better than alternatives.

>> looks like the following might be helpful in updating the recipes.
>>
>> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
>
> Yes, it looks like the DMB LD is now stronger, if that page is right.
>
> It'd be nice if someone could point me at the language in the
> ARMv8 spec that this derives from.

I'm hoping that someone from ARM answers this.

-Doug

From Peter.Sewell at cl.cam.ac.uk Mon Nov 24 00:17:07 2014
From: Peter.Sewell at cl.cam.ac.uk (Peter Sewell)
Date: Mon, 24 Nov 2014 00:17:07 +0000
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To: <54727752.5060504@cs.oswego.edu>
References: <546DB580.9010602@redhat.com> <546DC18D.7020103@redhat.com>
 <54727752.5060504@cs.oswego.edu>
Message-ID:

On 24 November 2014 at 00:09, Doug Lea
wrote:
> On 11/20/2014 05:25 AM, Andrew Haley wrote:
>>
>> On 20/11/14 10:12, Joe Bowbeer wrote:
>>>
>>> As far as I can tell, the cookbook hasn't been updated yet for ARMv8.
>
>
> Mainly out of conservatism given the history of revising ARM and POWER
> entries multiple times over the years, sometimes based on incomplete or
> wrong information. It seems worth waiting for multiple validations,
> including experience from the ARMv8 hotspot port. This is cruel to Andrew
> (sorry!) but still better than alternatives.
>
>>> looks like the following might be helpful in updating the recipes.
>>>
>>> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
>>
>>
>> Yes, it looks like the DMB LD is now stronger, if that page is right.
>>
>> It'd be nice if someone could point me at the language in the
>> ARMv8 spec that this derives from.
>
>
> I'm hoping that someone from ARM answers this.

That table came from ARM, though not via the ARM ARM, so it certainly
matched their intent.

We are working on modelling the v8 primitives, but I couldn't say we
have a definitive story (backed up by extensive testing and model
comparison) yet.

Peter

> -Doug
>

From aph at redhat.com Mon Nov 24 09:33:23 2014
From: aph at redhat.com (Andrew Haley)
Date: Mon, 24 Nov 2014 09:33:23 +0000
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To: <54727752.5060504@cs.oswego.edu>
References: <546DB580.9010602@redhat.com> <546DC18D.7020103@redhat.com>
 <54727752.5060504@cs.oswego.edu>
Message-ID: <5472FB63.2020804@redhat.com>

On 24/11/14 00:09, Doug Lea wrote:
> On 11/20/2014 05:25 AM, Andrew Haley wrote:
>> On 20/11/14 10:12, Joe Bowbeer wrote:
>>> As far as I can tell, the cookbook hasn't been updated yet for ARMv8.
>
> Mainly out of conservatism given the history of revising ARM and POWER
> entries multiple times over the years, sometimes based on incomplete or
> wrong information.
> It seems worth waiting for multiple validations,
> including experience from the ARMv8 hotspot port. This is cruel to Andrew
> (sorry!) but still better than alternatives.
>
>>> looks like the following might be helpful in updating the recipes.
>>>
>>> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
>>
>> Yes, it looks like the DMB LD is now stronger, if that page is right.
>>
>> It'd be nice if someone could point me at the language in the
>> ARMv8 spec that this derives from.
>
> I'm hoping that someone from ARM answers this.

I have communicated with ARM and (although it is implied in the
ARMv8 Architecture Reference Manual in a roundabout way) they have
raised a ticket for clarification in a future update.

They agree that Load-Load|Load-Store is correct.

In the meantime, there is a table of memory barriers on Page 105 of
the ARMv8 Instruction Set Overview which states this explicitly.

Andrew.

From stephan.diestelhorst at arm.com Mon Nov 24 15:15:26 2014
From: stephan.diestelhorst at arm.com (Stephan Diestelhorst)
Date: Mon, 24 Nov 2014 15:15:26 +0000
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To: <5472FB63.2020804@redhat.com>
References: <54727752.5060504@cs.oswego.edu> <5472FB63.2020804@redhat.com>
Message-ID: <8017617.Z7IkjnHZrM@mymac-ubuntu>

On Monday 24 November 2014 09:33:23 Andrew Haley wrote:
> On 24/11/14 00:09, Doug Lea wrote:
> > On 11/20/2014 05:25 AM, Andrew Haley wrote:
> >> On 20/11/14 10:12, Joe Bowbeer wrote:
> >>> As far as I can tell, the cookbook hasn't been updated yet for ARMv8.
> >
> > Mainly out of conservatism given the history of revising ARM and
> > POWER entries multiple times over the years, sometimes based on
> > incomplete or wrong information. It seems worth waiting for multiple
> > validations, including experience from the ARMv8 hotspot port. This
> > is cruel to Andrew (sorry!) but still better than alternatives.
> >
> >>> looks like the following might be helpful in updating the recipes.
> >>>
> >>> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
> >>
> >> Yes, it looks like the DMB LD is now stronger, if that page is right.
> >>
> >> It'd be nice if someone could point me at the language in the
> >> ARMv8 spec that this derives from.
[..]
> > I'm hoping that someone from ARM answers this.

The usual thing about the group constructions applies with added type
restrictions.

_Shareability and access limitations on the data barrier operations_

The DMB and DSB instructions can each take an optional limitation
argument that specifies:
* The shareability domain over which the instruction must operate.
  This is one of:
  - Full system.
  - Outer Shareable.
  - Inner Shareable.
  - Non-shareable.
* The accesses for which the instruction operates. This is one of:
  - Read and write accesses in Group A and Group B.
  - Write accesses only in Group A and Group B.
  - Read access only in Group A and read and write accesses in Group B.
    Note: This is occasionally referred to as a Load-Load/Store barrier.

> I have communicated with ARM and (although it is implied in the
> ARMv8 Architecture Reference Manual in a roundabout way) they have
> raised a ticket for clarification in a future update.

Not sure if this is already what you had in mind or whether this is the
"roundabout way", please let me know. The only issue I have had with
this is that the clarification for the restricted type happens pretty
late. Until then one (I!) too assumed it would only be loads (due to the
name / mnemonic type being "LD").

> They agree that Load-Load|Load-Store is correct.

Yes.

> In the meantime, there is a table of memory barriers on Page 105 of
> the ARMv8 Instruction Set Overview which states this explicitly.

Indeed.
--
Sincerely,
  Stephan

Stephan Diestelhorst
Staff Engineer, ARM R&D Systems
+44 (0)1223 405662

From stephan.diestelhorst at arm.com Mon Nov 24 15:21:42 2014
From: stephan.diestelhorst at arm.com (Stephan Diestelhorst)
Date: Mon, 24 Nov 2014 15:21:42 +0000
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To: <547274BA.3060201@cs.oswego.edu>
References: <547274BA.3060201@cs.oswego.edu>
Message-ID: <3211150.DiDWp5U5A7@mymac-ubuntu>

On Sunday 23 November 2014 23:58:50 Doug Lea wrote:
> This is a [LoadLoad|LoadStore] fence, also known as
> a "load acquire" fence, which is sometimes (including
> inside hotspot) just called a "load fence" because it is the
> only kind of Load* fence that is ever wanted.

Doug, you are the second person to say that. Is that just common wisdom
or am I missing a fundamental reason, here? Or are all the other
constructs strong enough not to require a plain LoadLoad? Not really,
since a simple, non-pointer consumer could still have their flag / data
loads reordered. Or is it rather because every consumer will eventually
produce data (that should be ordered behind the data read if it were
derived from it though)?

--
Thanks,
  Stephan

Stephan Diestelhorst
Staff Engineer
ARM R&D Systems
+44 (0)1223 405662

From dl at cs.oswego.edu Mon Nov 24 16:00:53 2014
From: dl at cs.oswego.edu (Doug Lea)
Date: Mon, 24 Nov 2014 11:00:53 -0500
Subject: [jmm-dev] Will the real memory read barrier please stand up?
In-Reply-To: <3211150.DiDWp5U5A7@mymac-ubuntu>
References: <547274BA.3060201@cs.oswego.edu> <3211150.DiDWp5U5A7@mymac-ubuntu>
Message-ID: <54735635.9010707@cs.oswego.edu>

[Trimming off unneeded CCs, but since Aleksey's attempt to trim
lists didn't work, keeping both.]

On 11/24/2014 10:21 AM, Stephan Diestelhorst wrote:
> On Sunday 23 November 2014 23:58:50 Doug Lea wrote:
>> This is a [LoadLoad|LoadStore] fence, also known as
>> a "load acquire" fence, which is sometimes (including
>> inside hotspot) just called a "load fence" because it is the
>> only kind of Load* fence that is ever wanted.
>
> Doug, you are the second person to say that. Is that just common wisdom

I guess you could call it common wisdom: I think that no current
language-level models have any cases of mappings requiring issue
of only one or the other, and no current processors have a fence
that provides LoadLoad but not LoadStore. (Although maybe some
pseudo-fences have this effect?)

-Doug

From aph at redhat.com Mon Nov 24 16:35:50 2014
From: aph at redhat.com (Andrew Haley)
Date: Mon, 24 Nov 2014 16:35:50 +0000
Subject: [jmm-dev] [concurrency-interest] Will the real memory read barrier please stand up?
In-Reply-To: <8017617.Z7IkjnHZrM@mymac-ubuntu>
References: <54727752.5060504@cs.oswego.edu> <5472FB63.2020804@redhat.com>
 <8017617.Z7IkjnHZrM@mymac-ubuntu>
Message-ID: <54735E66.6070106@redhat.com>

On 11/24/2014 03:15 PM, Stephan Diestelhorst wrote:
>> I have communicated with ARM and (although it is implied in the
>> ARMv8 Architecture Reference Manual in a roundabout way) they have
>> raised a ticket for clarification in a future update.
>
> Not sure if this is already what you had in mind or whether this is the
> "roundabout way", please let me know. The only issue I have had with
> this is that the clarification for the restricted type happens pretty
> late. Until then one (I!) too assumed it would only be loads (due to the
> name / mnemonic type being "LD").

Quite. Two things:

Firstly, that section needs to say that the first is DMB, the second
is DMB ST, and the third is DMB LD. It may be "obvious", but you still
have to say it.

And the page which defines DMB has to say it too, or at least have a
direct link to the page where the barrier option names are defined.

Andrew.
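
The correspondence asked for in the first point can be written down as
a small lookup table. This is a sketch only: it assumes the three
access-type options quoted earlier in the thread map, in order, to
DMB, DMB ST and DMB LD as stated above, and it expresses them in the
Sparc-style LoadLoad/LoadStore/StoreStore/StoreLoad terms used
elsewhere in the thread. The enum and method names are hypothetical,
and the shareability-domain qualifiers are left out.

```java
import java.util.EnumSet;
import java.util.Set;

// Sparc-style before/after ordering categories, as named earlier in the thread.
enum FenceOrdering { LOAD_LOAD, LOAD_STORE, STORE_STORE, STORE_LOAD }

// Hypothetical lookup table; shareability-domain qualifiers are omitted.
enum DmbOption {
    // Read and write accesses in Group A and Group B.
    DMB("DMB", EnumSet.allOf(FenceOrdering.class)),
    // Write accesses only in Group A and Group B.
    DMB_ST("DMB ST", EnumSet.of(FenceOrdering.STORE_STORE)),
    // Read access only in Group A, read and write accesses in Group B.
    DMB_LD("DMB LD", EnumSet.of(FenceOrdering.LOAD_LOAD, FenceOrdering.LOAD_STORE));

    private final String mnemonic;
    private final Set<FenceOrdering> orderings;

    DmbOption(String mnemonic, Set<FenceOrdering> orderings) {
        this.mnemonic = mnemonic;
        this.orderings = orderings;
    }

    String mnemonic() {
        return mnemonic;
    }

    // True if this barrier option enforces the given before/after ordering.
    boolean orders(FenceOrdering ordering) {
        return orderings.contains(ordering);
    }
}
```

Under this table, DmbOption.DMB_LD.orders(FenceOrdering.LOAD_STORE) is
true, which is exactly the property that the "LD" mnemonic does not
make obvious.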