RFR(S): 8147844: new method j.l.Runtime.onSpinWait() and the corresponding x86 hotspot intrinsic

Vitaly Davidovich vitalyd at gmail.com
Wed Jan 27 05:35:37 UTC 2016

On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:

> On Jan 26, 2016, at 8:08 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
> You would but subsequent volatile load could move before the pause.  If
> you unroll the loop, you could (theoretically) end up with all loads moved
> before the pause but all appearing ordered with respect to each other, eg:
> cmp addr, 0 // from iteration 1
> je label
> cmp addr, 0 // from iteration 2
> je label
> ...
> pause
> What prevents that if pause is not a compiler member?
> I think volatile loads explicitly depend on control. If the pause node
> consumes and produces control it all should be in a rigid control chain.
> Other regular loads (that don’t have control dependencies) would still be
> free to move around.

Is this to avoid out of thin air values? That is, suppose you have:

if (some condition)
    read volatile (or regular)

Regular load can be scheduled before the if and result used if control
reaches there.  For volatile, load cannot be scheduled above the if since
value can be bogus at that point?

Is it safe for compiler to assume that something else anchors loads around
the pause?

That aside, given the intended usage, I'm not sure what other regular loads
would be there.  The usage is a tight spin loop waiting for exit condition
to be met.  Although I suppose if compiler sees regular loads after the
loop exits successfully, perhaps scheduling them before the loop can be
beneficial.  Is that what you have in mind?
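
To make the intended usage concrete, here is a minimal Java sketch of that kind of loop. It is only an illustration: the class, field and method names are made up, and it calls Thread.onSpinWait(), the form in which this hint later shipped in JDK 9, because the Runtime.onSpinWait() proposed in this thread is not available in a released JDK. The poll is a volatile load, and the only regular load is the one that follows a successful loop exit.

```java
// Hypothetical sketch; names are illustrative, not taken from the webrev.
class SpinWaitSketch {
    // Volatile so that every poll in the spin loop is a real, ordered load.
    private volatile boolean ready;
    // Plain field, published through the volatile write to 'ready'.
    private int payload;

    void publish(int value) {
        payload = value; // regular store
        ready = true;    // volatile store makes 'payload' visible to the spinner
    }

    int awaitAndRead() {
        while (!ready) {         // volatile load: the exit condition being polled
            Thread.onSpinWait(); // spin hint; assumed to map to 'pause' on x86
        }
        return payload;          // regular load after the loop exits
    }

    public static void main(String[] args) {
        SpinWaitSketch s = new SpinWaitSketch();
        new Thread(() -> s.publish(42)).start();
        System.out.println(s.awaitAndRead()); // prints 42
    }
}
```

The question above is then whether anything other than the hint itself keeps the next poll of 'ready' from being scheduled ahead of it.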

> igor
> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com> wrote:
>> Wouldn’t you use a volatile load for the memory location you’re polling?
>> igor
>> On Jan 26, 2016, at 6:15 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>> Subsequent loads at this point will likely be polls of same memory
>> location that just failed a test, and the author inserted a pause.  It's
>> unlikely that the memory changed that quickly and scheduling the next load
>> before the pause is equivalent to two loads back to back essentially, which
>> wouldn't make sense given the intended usage.  There's also the risk that
>> the compiler would move enough of those load+test pairs before the pause
>> and fill up the speculative pipeline with them; that pipeline will need to
>> be flushed once the spin exits since those load instructions likely
>> speculated incorrectly.  And here we're basically describing the reason for
>> putting pause there in the first place :).
>> On Tuesday, January 26, 2016, Igor Veresov <igor.veresov at oracle.com>
>> wrote:
>>> So, why does the new node have a memory effect? That would seem to
>>> prevent any movement of the subsequent loads in your loop, right? If that’s
>>> intentional, I wonder why that is.
>>> igor
>>> On Jan 26, 2016, at 2:59 AM, Ivan Krylov <ivan at azulsystems.com> wrote:
>>> Hello,
>>> Some of you may have seen a few e-mails on the core-libs alias about a
>>> proposed “spin wait hint”. The JEP is forming up nicely at
>>> https://bugs.openjdk.java.net/browse/JDK-8147832. There seems to be a
>>> consensus on the API side. It is now in a draft state and I hope this JEP
>>> will get targeted for java 9 shortly.  The upcoming API changes can be seen
>>> at the webrev:
>>> http://cr.openjdk.java.net/~ikrylov/8147844.jdk.00/
>>> At this time I would like to ask for a review of the hs-comp changes.
>>> The plan is to push changes into class libraries and hotspot synchronously but
>>> that may happen after the JEP gets targeted.
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8147844
>>> Webrev: http://cr.openjdk.java.net/~ikrylov/8147844.hs.00/
>>> The idea of the fix is pretty simple: hotspot replaces a call to
>>> java.lang.Runtime.onSpinWait() with an intrinsic that is effectively a
>>> 'pause' instruction on x86.  This intrinsic is guarded by the
>>> -XX:±UseOnSpinWaitIntrinsic flag. For non-x86 platforms there is
>>> verification code that makes sure the flag is off; the VM will just execute the
>>> empty method java.lang.Runtime.onSpinWait() – effectively a no-op.
>>> According to [1], the 'pause' instruction is functional since SSE2, but
>>> even on CPUs prior to SSE2 the 'pause' instruction is a no-op and hence
>>> harmless, so there seems to be no need to add guarding code for older
>>> generations of Intel CPUs.
>>> The proposed patch includes a simple regression test that simply makes
>>> sure that method java.lang.Runtime.onSpinWait() gets intrinsified.  There
>>> are several other producer-consumer-like performance tests ready that the
>>> authors of this JEP would be happy to make available under JEP-230 but I am
>>> uncertain about the process.
>>> Thanks,
>>> Ivan
>>> [1]  -
>>> https://software.intel.com/en-us/articles/benefitting-power-and-performance-sleep-loops
>> --
>> Sent from my phone

More information about the hotspot-compiler-dev mailing list