RFR(XL): 8167108 - SMR and JavaThread Lifecycle

Daniel D. Daugherty daniel.daugherty at oracle.com
Tue Nov 21 14:46:10 UTC 2017


Hi Robin W!

Welcome to the Thread-SMR party!!

On 11/20/17 4:48 AM, Robin Westberg wrote:
> Hi Dan,
>
> Overall I must say this looks very nice, can’t wait to use it! (Also a disclaimer: not a reviewer, and have not looked at the gc or jmvti specific changes nor the tests (yet)).

Code reviews are always welcome (OpenJDK role does not matter).


> Didn’t spot any real issues, just a few small efficiency related things:
>
> src/hotspot/share/runtime/biasedLocking.cpp
>
> 217     for (JavaThread* cur_thread = Threads::first(); cur_thread != NULL; cur_thread = cur_thread->next()) {
> 218       if (cur_thread == biased_thread) {
> 219         thread_is_alive = true;
>
> This loop could be replaced with:
>
> ThreadsListHandle tlh;
> thread_is_alive = tlh.includes(biased_thread);

Nice catch! Fixed with your proposed change.

The careful reader will notice that in revoke_bias() we are not
creating a ThreadsListHandle that is in scope for the entire
function and we are doing cached monitor info walks without an
obvious ThreadsListHandle in place.

revoke_bias() is called at a safepoint from most locations. The
one call that is not made at a safepoint is from
BiasedLocking::revoke_and_rebias() and it is made by a JavaThread
that is revoking a bias on itself which is also safe.

We should add an assertion to revoke_bias() that looks something
like this:

   assert(requesting_thread == Thread::current() ||
          SafepointSynchronize::is_at_safepoint(),
          "must be operating on self or at a safepoint.");

but we'll do that in a separate follow up bug since that will
require biased locking focused testing.


> src/hotspot/share/runtime/memprofiler.cpp
>
> 55-56 grabs the Threads_lock, shouldn’t be needed anymore.

Nice catch! Deleting lines 55-56.


> src/hotspot/share/runtime/thread.inline.hpp
>
> 198   TerminatedTypes l_terminated = (TerminatedTypes)
> 199       OrderAccess::load_acquire((volatile jint *) &_terminated);
> 200   return check_is_terminated(_terminated);
>
> The variable used in the return statement was intended to be l_terminated I guess?

Ouch! Looks like JavaThread::is_exiting() is right, but
JavaThread::is_terminated() has been inefficient all this
time. Fixed.


> The following are more minor suggestions / comments:
>
> src/hotspot/share/runtime/thread.cpp
>
> 3432 // operations over all threads.  It is protected by its own Mutex
> 3433 // lock, which is also used in other contexts to protect thread
>
> Should this comment perhaps be revised to mention SMR?

It definitely needs some updating... Here's a stab at it:

// The Threads class links together all active threads, and provides
// operations over all threads. It is protected by the Threads_lock,
// which is also used in other global contexts like safepointing.
// ThreadsListHandles are used to safely perform operations on one
// or more threads without the risk of the thread exiting during the
// operation.
//
// Note: The Threads_lock is currently more widely used than we
// would like. We are actively migrating Threads_lock uses to other
// mechanisms in order to reduce Threads_lock contention.

> 4745   hash_table_size--;
> 4746   hash_table_size |= hash_table_size >> 1;
> ...
>
> This calculation is repeated around line 4922 as well, perhaps put it in a function?

The hash_table_size parameter is currently unused. We were using
a different hash table before that allowed the size to be set.
Unfortunately, that hash table didn't support being freed so we
switched to ResourceHashtable.

We have a project note to come back and update the underlying
hash table to work with dynamic table sizes, but Erik hasn't
had the cycles to do it yet.


> 4828   // If someone set a handshake on us just as we entered exit path, we simple cancel it.
>
> Perhaps something like “.. entered the exit path, we simply cancel it.”

I went with this:

   if (ThreadLocalHandshakes) {
     // The thread is about to be deleted so cancel any handshake.
     thread->cancel_handshake();
   }


> src/hotspot/share/runtime/thread.hpp
>
> 1179   bool on_thread_list() { return _on_thread_list; }
> 1180   void set_on_thread_list() { _on_thread_list = true; }
> 1181
> 1182   // thread has called JavaThread::exit() or is terminated
> 1183   bool is_exiting() const;
> 1184   // thread is terminated (no longer on the threads list); we compare
> 1185   // against the two non-terminated values so that a freed JavaThread
> 1186   // will also be considered terminated.
> 1187   bool check_is_terminated(TerminatedTypes l_terminated) const {
> 1188     return l_terminated != _not_terminated && l_terminated != _thread_exiting;
> 1189   }
> 1190   bool is_terminated();
>
> Use of const is inconsistent here, it’s added to 1183, should it perhaps be added to 1179 and 1190 as well then?

Fixed.


> src/hotspot/share/runtime/vm_operations.hpp
>
> 406   DeadlockCycle* result()               { return _deadlocks; };
> 407   VMOp_Type type() const                { return VMOp_FindDeadlocks; }
>
> Whitespace only change that seems spurious.

Agreed. Restored to the original.

Thanks for the review!

Dan


>
> Best regards,
> Robin
>
>> On 19 Nov 2017, at 02:49, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
>>
>> Greetings,
>>
>> Testing of the last round of changes revealed a hang in one of the new
>> TLH tests. Robbin has fixed the hang, updated the existing TLH test, and
>> added another TLH test for good measure.
>>
>> Here is the updated full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-full/
>>
>> Here is the updated delta webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-09-delta/
>>
>> Dan ran the bits thru the usual Mach5 tier[1-5] testing and there are
>> no unexpected failures.
>>
>> We welcome comments, suggestions and feedback.
>>
>> Dan, Erik and Robbin
>>
>>
>> On 11/15/17 3:32 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Robbin rebased the project last night/this morning to merge with Thread
>>> Local Handshakes (TLH) and also picked up additional changesets up thru:
>>>
>>>> Changeset: fa736014cf28
>>>> Author:    cjplummer
>>>> Date:      2017-11-14 18:08 -0800
>>>> URL:http://hg.openjdk.java.net/jdk/hs/rev/fa736014cf28
>>>>
>>>> 8191049: Add alternate version of pns() that is callable from within hotspot source
>>>> Summary: added pns2() to debug.cpp
>>>> Reviewed-by: stuefe, gthornbr
>>> This is the first time we've rebased the project to bits that are this
>>> fresh (< 12 hours old at rebase time). We've done this because we think
>>> we're done with this project and are in the final review-change-retest
>>> cycle(s)... In other words, we're not planning on making any more major
>>> changes for this project... :-)
>>>
>>> *** Time for code reviewers to chime in on this thread! ***
>>>
>>> Here is the updated full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-full/
>>>
>>> Here's is the delta webrev from the 2017.11.10 rebase:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-08-delta/
>>>
>>> Dan has submitted the bits for the usual Mach5 tier[1-5] testing
>>> (and the new baseline also)...
>>>
>>> We're expecting this round to be a little noisier than usual because
>>> we did not rebase on a PIT snapshot so the new baseline has not been
>>> through Jesper's usual care-and-feeding of integration_blockers, etc.
>>>
>>> We welcome comments, suggestions and feedback.
>>>
>>> Dan, Erik and Robbin
>>>
>>>
>>> On 11/14/17 4:48 PM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I rebased the project to the 2017.11.10 jdk/hs PIT snapshot.
>>>> (Yes, we're playing chase-the-repo...)
>>>>
>>>> Here is the updated full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-07-full/
>>>>
>>>> Unlike the previous rebase, there were no changes required to the
>>>> open code to get the rebased bits to build so there is no delta
>>>> webrev.
>>>>
>>>> These bits have been run through JPRT and I've submitted the usual
>>>> Mach5 tier[1-5] test run...
>>>>
>>>> We welcome comments, suggestions and feedback.
>>>>
>>>> Dan, Erik and Robbin
>>>>
>>>>
>>>> On 11/13/17 12:30 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I rebased the project to the 2017.10.26 jdk10/hs PIT snapshot.
>>>>>
>>>>> Here are the updated webrevs:
>>>>>
>>>>> Here's the mq comment for the change:
>>>>>
>>>>>    Rebase to 2017.10.25 PIT snapshot.
>>>>>
>>>>> Here is the full webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-full/
>>>>>
>>>>> And here is the delta webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-06-delta/
>>>>>
>>>>> I ran the above bits throught Mach5 tier[1-5] testing over the holiday
>>>>> weekend. Didn't see any issues in a quick look. Going to take a closer
>>>>> look today.
>>>>>
>>>>> We welcome comments, suggestions and feedback.
>>>>>
>>>>> Dan, Erik and Robbin
>>>>>
>>>>>
>>>>> On 11/8/17 1:05 PM, Daniel D. Daugherty wrote:
>>>>>> Greetings,
>>>>>>
>>>>>> Resolving one of the code review comments (from both Stefan K and Coleen)
>>>>>> on jdk10-04-full required quite a few changes so it is being done as a
>>>>>> standalone patch and corresponding webrevs:
>>>>>>
>>>>>> Here's the mq comment for the change:
>>>>>>
>>>>>>    stefank, coleenp CR - refactor most JavaThreadIterator usage to use
>>>>>>      JavaThreadIteratorWithHandle.
>>>>>>
>>>>>> Here is the full webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-full/
>>>>>>
>>>>>> And here is the delta webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-05-delta/
>>>>>>
>>>>>> We welcome comments, suggestions and feedback.
>>>>>>
>>>>>> Dan, Erik and Robbin
>>>>>>
>>>>>>
>>>>>> On 10/9/17 3:41 PM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>>
>>>>>>> We have a (eXtra Large) fix for the following bug:
>>>>>>>
>>>>>>> 8167108 inconsistent handling of SR_lock can lead to crashes
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8167108
>>>>>>>
>>>>>>> This fix adds a Safe Memory Reclamation (SMR) mechanism based on
>>>>>>> Hazard Pointers to manage JavaThread lifecycle.
>>>>>>>
>>>>>>> Here's a PDF for the internal wiki that we've been using to describe
>>>>>>> and track the work on this project:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/SMR_and_JavaThread_Lifecycle-JDK10-04.pdf
>>>>>>>
>>>>>>> Dan has noticed that the indenting is wrong in some of the code quotes
>>>>>>> in the PDF that are not present in the internal wiki. We don't have a
>>>>>>> solution for that problem yet.
>>>>>>>
>>>>>>> Here's the webrev for current JDK10 version of this fix:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dcubed/8167108-webrev/jdk10-04-full
>>>>>>>
>>>>>>> This fix has been run through many rounds of JPRT and Mach5 tier[2-5]
>>>>>>> testing, additional stress testing on Dan's Solaris X64 server, and
>>>>>>> additional testing on Erik and Robbin's machines.
>>>>>>>
>>>>>>> We welcome comments, suggestions and feedback.
>>>>>>>
>>>>>>> Daniel Daugherty
>>>>>>> Erik Osterlund
>>>>>>> Robbin Ehn
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>



More information about the hotspot-gc-dev mailing list