RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count

David Holmes david.holmes at oracle.com
Wed Oct 17 02:33:55 UTC 2018

Hi Dean,

Thanks for tackling this.

I'm still struggling to fully grasp why we need both the PerfCounters 
and the regular counters. I get that we have to decrement the live 
counts before ensure_join() has allowed Thread.join() to return, to 
ensure that if we then check the number of threads it has dropped by 
one. But I don't understand why that means we need to manage the thread 
count in two parts. Particularly as now you don't use the PerfCounter to 
return the live count, so it makes me wonder what role the PerfCounter 
is playing as it is temporarily inconsistent with the reported live 
count? Is it simply that we can't atomically decrement the PerfCounters, 
and we can't require the Threads_lock when decrementing?

On the code itself one thing I find irksome is that we already have a 
daemon flag in remove_thread but decrement_thread_counts doesn't get 
passed that and so always re-examines the thread to see if it is a 
daemon. I know only one of the code paths has the daemon flag, so 
perhaps we should hoist the daemon check up into JavaThread::exit and 
then pass a daemon parameter to decrement_thread_counts.

There's also some ambiguity (existing) in relation to the existence of 
the jt->threadObj() and its affect on whether the thread is considered a 
daemon or not. Your check has to mirror the existing checks - the 
threadObj may be NULL at removal if it was a failed attach, but the 
addition considers a JNI-attached thread to be non-daemon. Overall this 
seems buggy to me but as long as your check follows the same rules that 
is okay. In that regard JavaThread::exit should never encounter a NULL 

Minor nit: I suggest moving the two occurrences of the comment

// Do not count VM internal or JVMTI agent threads

inside is_hidden_thread so that we don't have to remember to update the 
comments if the definition changes.


On 17/10/2018 9:43 AM, dean.long at oracle.com wrote:
> New webrev:
> http://cr.openjdk.java.net/~dlong/8021335/webrev.4/
> dl
> On 10/16/18 11:59 AM, dean.long at oracle.com wrote:
>> On 10/16/18 10:28 AM, JC Beyler wrote:
>>> Hi Dean,
>>> I noticed a typo here :
>>> "are atomically" is in the comment but should no longer be there I 
>>> believe; the sentence probably should be:
>>> // These 2 counters are like the above thread counts, but are
>> Fixed!
>>> Any way we could move this test into a helper method and both 
>>> add_thread/current_thread_exiting could use the same "ignore" thread 
>>> code so that we ensure the two never get out of sync?
>>> +  if (jt->is_hidden_from_external_view() ||
>>> +      jt->is_jvmti_agent_thread()) {
>>> +    return;
>>> +  }
>> Good idea.  Fixed.
>>> (Also by curiosity, add_thread has an assert on the Threads_lock but 
>>> not thread_exiting?)
>> Right, I followed the existing pattern where current_thread_exiting() 
>> only uses the atomic counters, so it doesn't need Threads_lock.
>> dl
>>> Thanks,
>>> Jc
>>> On Tue, Oct 16, 2018 at 9:52 AM <dean.long at oracle.com 
>>> <mailto:dean.long at oracle.com>> wrote:
>>>     https://bugs.openjdk.java.net/browse/JDK-8021335
>>>     http://cr.openjdk.java.net/~dlong/8021335/webrev.3/
>>>     <http://cr.openjdk.java.net/%7Edlong/8021335/webrev.3/>
>>>     This change attempts to fix an old and intermittent problem with
>>>     ThreadMXBean thread counts.
>>>     Changes have 3 main parts:
>>>     1. Make sure hidden threads don't affect counts (even when exiting)
>>>     2. Fix race reading counts using atomic counters
>>>     3. Remove some workarounds in test (because they hide bugs)
>>>     dl
>>> -- 
>>> Thanks,
>>> Jc

More information about the serviceability-dev mailing list