RFR(m): 8214271: Fast primitive to wake many threads

David Holmes david.holmes at oracle.com
Wed Dec 19 04:51:01 UTC 2018

Hi Robbin,

Note this was focusing on the design of the API to ensure the semantics 
are clear and well-defined so that future users of the API know how to 
use it correctly and safely. It also then allows a review of the 
implementation to check back to the spec to see if it behaves as 
intended and checks all necessary constraints That's why I wanted it to 
be as complete as possible (spurious wakeups allowed? No - then lets 
document that) and now allow implementation specific choices unless 

I'll get to updated webrevs before I finish for the year on Friday.


On 17/12/2018 8:39 pm, Robbin Ehn wrote:
> Hi David,
> On 11/26/18 7:12 AM, David Holmes wrote:
>> Hi Robbin,
>> On 24/11/2018 2:55 am, Robbin Ehn wrote:
>>> Forgot RFR in subject.
>> Yep and now you have two different review threads happening in 
>> parallel unfortunately :(
>> - src/hotspot/share/utilities/waitBarrier.hpp
>> I'm studying just the WaitBarrierType API. Is this inherently tied to 
>> safepoint usage or intended as a general synchronization tool? As a 
>> general tool the API does not have clear semantics on how it should be 
>> used:
> There is two potential uses-cases in G1, which might require some changes.
>> - How do you communicate the current tag between the arming thread and 
>> the waiting threads? There would seem to be inherent races between 
>> arm(tag) and wait(tag) unless access to the tag itself is synchronized 
>> via another mechanism.
> As seen in gtest:
> - arm tag
> - publish tag
>> - What happens if wake() is called before disarm()? Should it be 
>> disallowed? (For other readers disarm() and wake() are distinct 
>> operations so that they fit better into the existing safepoint 
>> protocol which disarms the safepoint polling page at one spot and 
>> wakes blocked threads at another.)
> Yes, wake() have no effect if you have not disarmed.
> There are asserts checking this.
>> - Should there be constraints that the same thread must 
>> arm/disarm/wake? It doesn't really make sense to allow these 
>> operations to happen in arbitrary order from multiple threads.
> Maybe, *thinking* if there is nice way to get the arming/waking thread 
> into an
> argument.
>> The semantics for re-arming with the same tag should be clearly set 
>> out not "implementation-defined". This should probably be a usage 
>> error IMHO - but it comes back to how the tag is expected to be used.
> If you have a way to detect that all threads have left the waitbarrier, 
> there is
> no problem with re-using the same tag. So yes, re-arming before that 
> with the
> same tag should be an error. But since the waitbarrier can't tell the 
> difference
> and do not know which tag it was last armed for I have no assert.
>> As Andrew mentioned there needs to be documentation regarding spurious 
>> wakeups or other "interruptions" at the API level. And I assume a 
>> blocked wait() only ever returns in response to a wake().
> Yes, if it's disarmed before. Since these are handled internally there 
> are no
> spurious wakeups, not sure why we want to document that, except inside the
> implementation. E.g. semaphore.hpp does not document that it do not have 
> it.
>> Nothwithstanding clarification of the above may I suggest the 
>> following rewrite of the API documentation for further clarity:
>> /* Platform independent WaitBarrier API.
>>     An armed WaitBarrier prevents threads from advancing until the
>>     barrier is disarmed and the waiting threads woken. The barrier is
>>     armed by setting a non-zero value - the tag.
>>     Expected Usage:
>>       - Arming thread:
>>           tag = ...;  // non-zero value
>>           barrier.arm(tag);
>>           <work>
>>           barrier.disarm();
>>           barrier.wake();
>>         - After arm(tag) returns any thread calling wait(tag) will block
>>         - After disarm() returns any subsequent calls to wait(tag) 
>> will not block
>>         - After wake() returns all blocked threads are unblocked and 
>> eligible to execute again
>>         - After calling disarm() and wake() the barrier is ready to be 
>> re-armed with a new tag
>>      - Waiting threads
>>          wait(tag); // don't execute following code unless 'safe'
>>          <work>
>>        - A call to wait(tag) will block if the barrier is armed with 
>> the value 'tag'; else it will return immediately.
>>        - A blocked thread is eligible to execute again once the 
>> barrier is disarmed and wake() has been called.
>>   A primary goal of the WaitBarrier implementation is to wake all waiting
>>   threads as fast, and as concurrently, as possible.
>> */
> Updated, added <publish tag> after barrier.arm(tag);
>> Looking at the "implementation" in this file I'm unclear on the way 
>> the Linux specialization is being handled here. Why do we need the 
>> template? Can't this just be done the same way we do Semaphore?
> Because the gtest can test all implementations available on that platform.
> Meaning I can test both implementation locally and compare performance 
> etc..
> I'll post one update after I sorted out the other review parts also.
> Thanks, Robbin
>> More to follow.
>> Thanks,
>> David
>> -----
>>> /Robbin
>>> On 2018-11-23 17:51, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>> When a safepoint is ended we need a way to get back to 100% 
>>>> utilization as fast
>>>> as possible. 100% utilization means no idle cpu in the system if 
>>>> there is a
>>>> JavaThread that could be executed. The traditional ways to wake 
>>>> many, e.g.
>>>> semaphore, pthread_cond, is not implemented with a single syscall 
>>>> instead they
>>>> typical do one syscall per thread to wake.
>>>> This change-set contains that primitive, the WaitBarrier, and a 
>>>> gtest for it.
>>>> No actual users, which is in coming patches.
>>>> The WaitBarrier solves by doing a cooperative semaphore posting, 
>>>> threads woken
>>>> will also post. On Linux we can instead directly use a futex and 
>>>> with one
>>>> syscall wake all. Depending on how many threads and cpus the 
>>>> performance vary,
>>>> but a good utilization of the machine, just on the edge of 
>>>> saturated, the time to reach 100% utilization is around 3 times 
>>>> faster with the WaitBarrier (where futex is faster than semaphore).
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rehn/8214271/webrev/
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8214271
>>>> Passes 100 iterations of gtest on our platforms, both fastdebug and 
>>>> release.
>>>> And have been stable when used in safepoints (t1-8) (coming patches).
>>>> Thanks, Robbin

More information about the hotspot-dev mailing list