RFR(XS) for PeriodicTask_lock cleanup (8072439)

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Feb 18 16:28:37 UTC 2015


Thanks for the fast review! Replies embedded below...

On 2/17/15 5:53 PM, David Holmes wrote:
> Hi Dan,
> A few comments, mainly on the comments :)

Apropos since the changeset is mostly comments...

> On 18/02/2015 8:42 AM, Daniel D. Daugherty wrote:
>> On 2/17/15 3:22 PM, Carsten Varming wrote:
>>> Dear Daniel,
>>> Looks good to me.
>> Thanks for the fast review.
>>> The line: "OrderAccess::fence();  // ensure WatcherThread sees update
>>> in main loop" seems unnecessary as the lock acts as a memory barrier.
>> Yes, I keep looking at that line from the original work on
>> JDK-7127792 and wonder why it's there... I'll chase that down
>> with the original folks...
> Only needed when using lock-free access, so can be removed now.

Yes, that will be the likely outcome.

However, the original code added by JDK-7127792 wasn't lock-free so
I still want to chase down the original thinking.

> java.cpp:
> Typo: likelyhood -> likelihood


> ---
> task.cpp
>  81 int PeriodicTask::time_to_wait() {
>  82   assert(Thread::current()->is_Watcher_thread(), "must be 
> WatcherThread");
> This is currently true but not sure it has to be true. If we are 
> assuming/constraining a subset of the task API to only be callable by 
> the WatcherThread then perhaps we should document that in task.hpp ? 
> And as the WatcherThread is a friend, none of those methods need to be 
> public - ie execute_if_pending and time_to_wait (realtime_tick is 
> already private).

I was too focused on adding asserts that enforced how it works today
and not on how it might be used down the road. There's no reason to
restrict the caller of time_to_wait() to the WatcherThread. I've
deleted the assert on line 82 and I added a comment to task.hpp:

   // Requires the PeriodicTask_lock.
   static int time_to_wait();

By leaving time_to_wait() public, that allows code other than the
WatcherThread to use it perhaps for debugging...

> 112 void PeriodicTask::enroll() {
>  113   // Follow normal safepoint aware lock enter protocol if the 
> caller does
>  114   // not already own the PeriodicTask_lock. Otherwise, we don't 
> try to
>  115   // enter it again to avoid deadlocking with a safepoint that 
> started
>  116   // after the initial enter and before this enter.
>  117   //
>  118   MutexLockerEx ml(PeriodicTask_lock->owned_by_self() ? NULL
>  119                                                       : 
> PeriodicTask_lock);
> The deadlock is not with a safepoint interaction but the normal 
> self-deadlock that would occur trying to lock() a Monitor/Mutex that 
> you already own. (Monitors/Mutexes do not support recursive locking - 
> in contrast to ObjectMonitors.) Ditto for disenroll.

I have been working on Java Monitors for waaaayyyy too long!

In the header comment for src/share/vm/runtime/mutex.cpp:

//   Native Monitors do *not* support nesting or recursion but ...

> The original comment is the correct one with respect to acquiring with 
> a safepoint check if not already locked:
> /* XXX could be called from a JavaThread, so we have to check for
>  * safepoint when taking the lock to avoid deadlocking */

I'm still not fond of that comment. Perhaps:

   // Follow normal safepoint aware lock enter protocol if the caller does
   // not already own the PeriodicTask_lock. Otherwise, we don't try to
   // enter it again because VM internal Mutexes do not support recursion.

I'm trying to make it clear that we're following safepoint aware protocol
even though we're using MutexLockerEx. I also wanted a clear statement
about why we skipped the enter if we already owned the lock. I don't
think that we could be called from a JavaThread to be the important part.

> ---
> thread.cpp:
> 1198 int WatcherThread::sleep() const {
> 1199   // The WatcherThread does not honor the safepoint protocol for
> 1200   // PeriodicTask_lock in order to provide more consistent task
> 1201   // execution timing.
> Not sure that is really the correct characterisation. The WT doesn't 
> need to honor the safepoint protocol because it isn't a JavaThread and 
> so won't block even if a safepoint is in progress. So checking for a 
> safepoint is simply a waste of time.

Much better way of putting it. Thanks! How about:

   // The WatcherThread is not a JavaThread so we do not honor the
   // safepoint protocol for the PeriodicTask_lock.

> 3565     MutexLocker ml(PeriodicTask_lock);
> Why did you allow for a safepoint check here? It would only be 
> necessary if a JavaThread might concurrently acquire the 
> PeriodicTask_lock and a safepoint is initiated - during VM 
> initialization. Not completely impossible I guess but have never heard 
> of this occurring, so would require some other startup change to 
> trigger it.

There's no good reason to use MutexLockerEx here and I didn't
want to write a comment explaining why we were using one.
Earlier code in Threads::create_vm() uses MutexLocker instead
of MutexLockerEx.

Thanks again for the very thorough review!


> Thanks,
> David
> -----
>> Dan
>>> Carsten
>>> On Tue, Feb 17, 2015 at 4:44 PM, Daniel D. Daugherty
>>> <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>> 
>>> wrote:
>>>     Greetings,
>>>     My fix for the following bug:
>>>         JDK-8047720 Xprof hangs on Solaris
>>>     that was pushed to JDK9 last June needs to be cleaned up.
>>>     Thanks to Alex Garthwaite (agarthwaite at twitter.com
>>>     <mailto:agarthwaite at twitter.com>) and Carsten
>>>     Varming (varming at gmail.com <mailto:varming at gmail.com>) for
>>>     reporting the mess that I made
>>>     in WatcherThread::stop() and for suggesting fixes.
>>>     This code review is for a general cleanup pass on PeriodicTask_lock
>>>     and some of the surrounding code. This is a targeted review in that
>>>     I would like to hear from three groups of people:
>>>     1) The author and reviewers for:
>>>        JDK-7127792 Add the ability to change an existing PeriodicTask's
>>>                    execution interval
>>>        Rickard, David H, and Markus G.
>>>     2) The reviewers for:
>>>        JDK-8047720 Xprof hangs on Solaris
>>>        Markus G and Coleen
>>>     3) Alex and Carsten
>>>     Here's the webrev URL:
>>> http://cr.openjdk.java.net/~dcubed/8072439-webrev/0-for_jdk9_hs_rt/ 
>>> <http://cr.openjdk.java.net/%7Edcubed/8072439-webrev/0-for_jdk9_hs_rt/>
>>>     I've attached the original RFR for JDK-8047720 that explains
>>>     the original deadlock that was being fixed. Similar testing
>>>     will be done with this fix.
>>>     Dan

More information about the hotspot-runtime-dev mailing list