review for 7129164: JNI Get/ReleasePrimitiveArrayCritical doesn't scale
david.holmes at oracle.com
Sun Jan 29 18:26:09 PST 2012
On 30/01/2012 10:30 AM, Tom Rodriguez wrote:
> On Jan 29, 2012, at 3:53 PM, David Holmes wrote:
>> On 28/01/2012 5:25 AM, Vladimir Kozlov wrote:
>>> I change alias to hotspot-dev since it affects everyone.
>> Thanks Vladimir!
>> Overall I find it difficult to follow the protocol that applies here - in particular, how safepoints can and cannot appear between the different methods, and the way in which needs_gc() might change value between the enter and exit of a critical region. Without that understanding it is hard to evaluate the correctness of the new approach.
> I tried to put in enough asserts to ensure that it was clear what invariants are in force when updating the state. Safepoints can appear wherever they normally can appear, while locking and waiting on mutexes. _needs_gc can only change from false to true during a safepoint. For the most part everything operates exactly as it used to once _needs_gc is set, and the only trick is computing the correct value of _jni_active_count when setting _needs_gc to true. The debug code that still performs the atomics attempts to verify that the computed value and actual value are in sync. Is there any more that I can explain about how it operates?
It's my ignorance of the overall operation that makes evaluating this
change difficult for me. But thanks to your side-band emails I now have
a much clearer understanding of things - thanks. And the protocol seems safe.
>> 72 _wait_begin = os::javaTimeMillis();
>> 135 os::javaTimeMillis() - _wait_begin,
>> Are you sure you want to use the non-monotonic javaTimeMillis() here? Use of this was recently changed elsewhere in the GC code.
> It's only printing code, but I'll switch it to use tty->timestamp().milliseconds() which will make it match our internal time stamps, which seems like a useful property. Is that ok?
The recent GC changes switched to using javaTimeNanos() converted to
millis - see 7117303 and 7129514. The TimeStamp is based on
elapsedCounter, which is still a time-of-day source on some platforms
and so is not monotonic.
> I've updated the webrev.
>>> In GC_locker::check_active_before_gc() move the wait_begin initialization
>>> under the Print* check, since it is used only under such checks. Could the
>>> method check_active_before_gc() be called from different threads
>>> concurrently? Does it need a lock? Note that the other methods take a lock.
>>> Tom Rodriguez wrote:
>>>> 200 lines changed: 126 ins; 33 del; 41 mod; 3958 unchg
>>>> 7129164: JNI Get/ReleasePrimitiveArrayCritical doesn't scale
>>>> The machinery for GC_locker which supports GetPrimitiveArrayCritical
>>>> maintains a count of the number of threads that currently have raw
>>>> pointers exposed. This is used to allow existing critical sections to
>>>> drain before creating new ones so that a GC can occur. Currently
>>>> these counts are updated using atomic operations all the time, even if a
>>>> GC isn't being requested. This creates a scalability problem when a lot of
>>>> threads are hammering atomic operations on the jni_lock_count. The
>>>> count only needs to be valid when checking if a critical section is
>>>> currently active and when draining the existing sections. The fix is
>>>> to make the count be computed as part of the safepointing machinery
>>>> and to only decrement the counters when _needs_gc is true. In debug
>>>> mode the old count is maintained and validated against the lazily
>>>> computed count.
>>>> On a microbenchmark that stresses GetPrimitiveArrayCritical with many
>>>> threads and relatively short critical sections it can more than double
>>>> the throughput. This also slightly speeds up normal
>>>> GetPrimitiveArrayCritical calls, though the gain is mostly not
>>>> measurable in larger-scale programs.
>>>> Tested with a microbenchmark that stresses GetPrimitiveArrayCritical and
>>>> the crypto benchmarks of specjvm2008 on x86 and sparc. I also ran the
>>>> java.io regression tests from the JDK.