RFR: 8079315: UseCondCardMark broken in conjunction with CMS precleaning
erik.osterlund at lnu.se
Mon May 11 16:05:48 UTC 2015
A compiler barrier I guess. ;)
In my original proposal, the storestore would still be there so nothing would really change regarding compiler reordering.
The card value load and write are dependent, so the compiler wouldn’t reorder them, and the storestore would already contain the necessary compiler barrier.
But yeah, even that storestore can probably be removed too and replaced by a compiler barrier if that makes us happier. :)
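
For illustration, here is a minimal C++ sketch of the two barrier choices mentioned above. It is not HotSpot's actual barrier code: the function names, the card constants and their values are hypothetical, and a release fence stands in for storestore because portable C++ has no pure StoreStore fence. The second variant keeps only a compiler barrier, so it would be correct only if something else (such as the serialization in the original proposal) supplies the hardware ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed card encodings, chosen for this sketch only.
constexpr std::uint8_t kDirtyCard = 0x00;
constexpr std::uint8_t kCleanCard = 0xff;

// Conditional card mark: only write the card if it is not already dirty.
// The store depends on the value just loaded, so the compiler cannot
// reorder the card load and the card write with each other.
inline void cond_card_mark(std::atomic<std::uint8_t>& card) {
    if (card.load(std::memory_order_relaxed) != kDirtyCard) {
        card.store(kDirtyCard, std::memory_order_relaxed);
    }
}

// Variant 1: reference store, then a hardware fence, then the card mark.
// The release fence is at least as strong as StoreStore and also prevents
// the compiler from moving the reference store below the card write.
inline void store_ref_storestore(std::atomic<void*>& field, void* value,
                                 std::atomic<std::uint8_t>& card) {
    field.store(value, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    cond_card_mark(card);
}

// Variant 2: same shape, but only a compiler barrier. No fence instruction
// is emitted; hardware ordering is assumed to come from elsewhere.
inline void store_ref_compiler_barrier(std::atomic<void*>& field, void* value,
                                       std::atomic<std::uint8_t>& card) {
    field.store(value, std::memory_order_relaxed);
    std::atomic_signal_fence(std::memory_order_seq_cst);
    cond_card_mark(card);
}
```

Both variants leave the card dirty after the reference store; they differ only in whether a hardware fence sits between the two writes.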
On 11 May 2015, at 14:49, Vitaly Davidovich <vitalyd at gmail.com> wrote:
What would prevent compiler based reordering in your suggestions?
On May 11, 2015 7:33 AM, "Erik Österlund" <erik.osterlund at lnu.se> wrote:
> On 11 May 2015, at 11:58, Andrew Haley <aph at redhat.com> wrote:
> On 05/11/2015 11:40 AM, Erik Österlund wrote:
>> I have heard statements like this, that such a mechanism would not
>> work on RMO, but never got an explanation of why it would work only
>> on TSO. Could you please elaborate? I studied some kernel sources for
>> a bunch of architectures and kernels, and as far as I can see it
>> looks all good for RMO too.
> Dave Dice himself told me that the algorithm is not in general safe
> for non-TSO. Perhaps, though, it is safe in this particular case. Of
> course, I may be misunderstanding him. I'm not sure of his reasoning
> but perhaps we should include him in this discussion.
I see. It would be interesting to hear his reasoning, because it is not clear to me.
> From my point of view, I can't see a strong argument for doing this on
> AArch64. StoreLoad barriers are not fantastically expensive there so
> it may not be worth going to such extremes. The cost of a StoreLoad
> barrier doesn't seem to be so much more than the StoreStore that we
> have to have anyway.
Yeah, about performance: I’m not sure when it’s worth removing these fences, or on what hardware.
In this case though, if it makes us any happier, I think we could probably get rid of the storestore barrier too:
The latent reference store is forced to serialize anyway after the dirty card value write is observable and the card is about to be cleaned. So the potential consistency violation, where the card looks dirty and the cleaning thread then reads a stale reference value, could not happen with my approach even without storestore hardware protection. I didn’t give it too much thought, but off the top of my head I can’t see any problems. If we want to get rid of the storestore too, I can give it some more thought.
But you know much better than me if these fences are problematic or not. :)