RFR: 8006971: More missing barriers in taskqueues for RMO architectures

Srinivas Ramakrishna ysr1729 at gmail.com
Mon Aug 5 23:17:23 PDT 2013

The task queue stuff, from what I vaguely recall, was based on ABP as David C noted.


On Aug 5, 2013, at 23:02, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:

> I rubber-stamp it :)
> Reviewed.
> But you should use original changeset header (bug id 8012144, Summary: , Contributed-by: ) from hsx24 to track the same changes in both releases. You could use hg export/import since changes are the same.
> We can keep 8006971 open for more general solution later with added comment that we used 8012144 changes in 7u40 and jdk8.
> Thanks,
> Vladimir
> On 8/5/13 9:04 PM, David Holmes wrote:
>> Hi David,
>> On 6/08/2013 12:14 PM, David Chase wrote:
>>> I'm not a reviewer, but if that's ABP queues (a comment pointing to the relevant literature would sure be nice, no
>>> matter what it is; this stuff is toxically hard) I am in theory qualified to eyeball it, and I can perhaps nab some
>>> people from SSRG upstairs.  (It's late and I am still jetlagged, so I am not looking hard tonight, but it sure looks
>>> like ABP).
>>> If it's not based on ABP queues, say so, and I will look more cautiously and with less authority.
>> I believe, based on other correspondence, it is based on Chase-Lev work-stealing queue - but I'm not familiar with the
>> ABP queues. I don't know who actually wrote this code but I suspect they are no longer working on the JDK.
>> There is quite a bit of history with this one. The TaskQueue has been around for a while and works beautifully (as far
>> as we know :) ) on TSO systems. The SAP folk doing the PPC64 port encountered some issues and proposed some fixes. Long
>> story short we settled on this particular patch (a full fence() on the non-TSO systems) for 7u40 as it appeared to fix
>> the known problems (even if potentially a heavier barrier than needed - hence the conditional compilation to not affect
>> TSO systems).
>> I considered this the "wrong fix" as to me the need for the fence() indicated a missing barrier elsewhere - plus the
>> algorithm should be written without ifdefs - and I hoped we would resolve that before fixing it in JDK 8. But we haven't
>> been able to resolve that and we need this fix in place ASAP - hence the fallback to using the same form as was accepted
>> for 7u40.
>> So basically this is a forward port of an existing fix and we just need to rubber-stamp it. But one of our original
>> stampers is no longer with us so ...
>> Thanks,
>> David H.
>>> David
>>> On 2013-08-05, at 9:01 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>> Hi Vlad
>>>> On 6/08/2013 12:43 AM, Vladimir Danushevsky wrote:
>>>>> On Aug 3, 2013, at 6:53 AM, David Holmes wrote:
>>>>>> On 2/08/2013 11:57 PM, Vladimir Danushevsky wrote:
>>>>>>> The issue of missing memory barriers in the GC taskqueue code was first flagged here:
>>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008848.html
>>>>>>> JDK7u40 fix for the same issue is located here:
>>>>>>> http://cr.openjdk.java.net/~dholmes/8012144/webrev.hs24/
>>>>>>> Initially we planned to port same solution to JDK8 however after reviewing the algorithm more we've started
>>>>>>> questioning a need for a full fence in between 'age' and 'bottom' elements. Since the intent is to keep 'bottom'
>>>>>>> memory reference from being executed before 'age' would a LoadLoad barrier (which in many cases is a cheaper
>>>>>>> solution) be sufficient? If so, the webrev below could possible be an adequate solution.
>>>>>>> http://cr.openjdk.java.net/~vladidan/8006971/webrev.00/
>>>>>>> We have tested both cases (fence and LL) on a hexa-core Power5 box running several test suites that currently
>>>>>>> expose the problem. The results are positive.
>>>>>> The loadload() should not be in any ifdef. The loadload() is part of the algorithmic correctness. The loadload()
>>>>>> will become a no-op on any platform that does not need to do anything special to preserve the ordering.
>>>>> As I understand LL is not an issue on platforms that are excluded from emitting the barrier in the provided patch.
>>>>> However I went to read further discussion at
>>>>> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008848.html
>>>>> and seems the concern is the Store is not guaranteed to propagate to all observers if read before the Writer's side
>>>>> 'sync'. I speculate that might not be an issue on PowerPC implementations with L1 cache snooping though, but even if
>>>>> this a case there is no robust way to detect that in runtime.
>>>> That sounds like the fence() is missing on the writer side.
>>>>> But that is likely not an issue on ARM (not sure about IA64, as it was listed in the very first webrev from Goetz)
>>>>> therefore we might inject OrderAccess::fence() for PPC (both 32- and 64-bit) and OrderAccess::loadload() for ARM
>>>>> (again, need info on IA64).
>>>>> That being said , for simplicity we can go with fence() for ARM case too since current ARMv7 implementations do not
>>>>> imply a separate barrier instruction for Loads.
>>>>> So in other words use same patch as in JDK7u40:
>>>>> http://cr.openjdk.java.net/~vladidan/8006971/webrev.01
>>>> I think this is masking the real problem with the algorithm but unless we get some true expertise involved here I
>>>> don't see how we will resolve this. And time is now critical.
>>>> So I will accept the 7u version of the fix under the same rationale as for 7u. We do need this to be fixed. Reviewed.
>>>> I hope this is not the end of it though as this data structure needs to be revisited in my opinion.
>>>> We need a second reviewer.
>>>> Thanks,
>>>> David
>>>>> Thanks,
>>>>> Vlad
>>>>>> Thanks,
>>>>>> David
>>>>>>> As a side note -
>>>>>>> perhaps it is possible to eliminate age/bottom potential reordering by loading both simultaneously through an
>>>>>>> Atomic class method. That would require though some structural changes to the layout of TaskQueueSuper class to
>>>>>>> align both fields together and ensure proper integer alignment (depending on 32/64-bit port), therefore this
>>>>>>> solution is less practical for the short term.
>>>>>>> Thanks,
>>>>>>> Vlad

More information about the hotspot-dev mailing list