RFR 8230594: Allow direct handshakes without VMThread intervention
patricio.chilano.mateo at oracle.com
Mon Jan 13 16:25:32 UTC 2020
The following patch adds the ability to execute direct handshakes
between JavaThreads without VMThread intervention, and enables this
functionality for biased locking revocations.
The current handshake mechanism, which uses the VMThread to handshake
either one JavaThread or all of them, is still the default unless you
specify otherwise when calling Handshake::execute(). To avoid adding
overhead to this VMThread path (especially the one that handshakes all
JavaThreads), I added a new HandshakeOperation pointer in the
HandshakeState class, _operation_direct, which is used for direct
handshakes only and whose access is serialized between JavaThreads by a
semaphore. Thus, only one direct handshake will be allowed on a given
handshakee at any given time, and upon completion the semaphore will be
signaled to allow the next handshaker, if any, to proceed. In this way
the old _operation can still be used exclusively by the VMThread without
the need for synchronization to access it. The handshakee will now check
whether either _operation or _operation_direct is set when checking for
a pending handshake and will try to execute both in
HandshakeState::process_self_inner(). The execution of the handshake’s
ThreadClosure, whether for a direct handshake or not, is still
protected by a semaphore, which I renamed to _processing_sem.
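
To make the two-slot design above easier to follow, here is a minimal
standalone sketch in standard C++. It is not the HotSpot code: the names
SimpleSemaphore, OperationModel, HandshakeStateModel, _turn_sem and
run_direct_handshake_demo are made up for illustration, and the model
ignores the case where the handshaker executes the closure on behalf of
a blocked handshakee. It only shows a turn semaphore serializing access
to _operation_direct, a separate VMThread slot that needs no such
serialization, and the target checking both slots.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Small counting semaphore; a stand-in for HotSpot's Semaphore class.
class SimpleSemaphore {
 public:
  explicit SimpleSemaphore(int initial) : _count(initial) {}
  void wait() {
    std::unique_lock<std::mutex> lock(_mutex);
    _cv.wait(lock, [this] { return _count > 0; });
    --_count;
  }
  void signal() {
    { std::lock_guard<std::mutex> lock(_mutex); ++_count; }
    _cv.notify_one();
  }
 private:
  std::mutex _mutex;
  std::condition_variable _cv;
  int _count;
};

// Hypothetical operation: the closure to run on the target plus a completion flag.
struct OperationModel {
  std::function<void()> closure;
  std::atomic<bool> done{false};
};

// Simplified per-thread handshake state with one slot per path.
class HandshakeStateModel {
 public:
  // VMThread path: a single writer, so the slot needs no turn-taking.
  void set_vm_operation(OperationModel* op) { _operation.store(op); }

  // Direct path: handshakers take turns on the _operation_direct slot.
  void execute_direct(OperationModel* op) {
    assert(op != nullptr);
    _turn_sem.wait();                 // one direct handshake at a time
    _operation_direct.store(op);
    while (!op->done.load()) std::this_thread::yield();
    _turn_sem.signal();               // let the next handshaker, if any, proceed
  }

  bool has_operation() const {
    return _operation.load() != nullptr || _operation_direct.load() != nullptr;
  }

  // Target thread: try both slots when polling for a pending handshake.
  void process_self() {
    process_slot(_operation);
    process_slot(_operation_direct);
  }

 private:
  void process_slot(std::atomic<OperationModel*>& slot) {
    OperationModel* op = slot.load();
    if (op == nullptr) return;
    _processing_sem.wait();           // guards execution of the closure
    op->closure();
    slot.store(nullptr);              // clear the slot before publishing completion
    _processing_sem.signal();
    op->done.store(true);             // last access to *op by the target
  }

  std::atomic<OperationModel*> _operation{nullptr};
  std::atomic<OperationModel*> _operation_direct{nullptr};
  SimpleSemaphore _turn_sem{1};
  SimpleSemaphore _processing_sem{1};
};

// Several handshakers issue direct handshakes against one target thread.
int run_direct_handshake_demo(int handshakers, int rounds) {
  HandshakeStateModel state;
  std::atomic<bool> stop{false};
  int executed = 0;                   // only touched by closures run on the target
  std::thread target([&] {
    while (!stop.load()) { state.process_self(); std::this_thread::yield(); }
  });
  std::vector<std::thread> requesters;
  for (int i = 0; i < handshakers; ++i) {
    requesters.emplace_back([&] {
      for (int r = 0; r < rounds; ++r) {
        OperationModel op;
        op.closure = [&executed] { ++executed; };
        state.execute_direct(&op);
      }
    });
  }
  for (auto& t : requesters) t.join();
  stop.store(true);
  target.join();
  return executed;
}
```

In this model, calling run_direct_handshake_demo(4, 50) has four
handshakers each perform 50 direct handshakes on the same target, and
every closure runs exactly once on the target thread.
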
I converted the semaphore _done in HandshakeOperation to be just an
atomic counter because of bug
https://sourceware.org/bugzilla/show_bug.cgi?id=12674 (which I actually
hit once!). Since the semaphore could not be static anymore due to
possibly having more than one HandshakeOperation at a time, the
handshakee could try to access the nwaiters field of an already
destroyed semaphore when signaling it. In any case nobody was waiting on
that semaphore (we were not using kernel functionality), so just using
an atomic counter seems more appropriate.
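
As a rough illustration of the counter-based completion tracking, the
sketch below uses a hypothetical PendingCounter class in standard C++
(it is not the class from the patch). Each target performs a single
atomic decrement as its last access to the object, and the handshaker
polls until the count reaches zero, so nothing ever blocks in the kernel
on the object and there is no semaphore internal state left to touch
after the final signal.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical completion tracker: counts targets that still have to finish.
class PendingCounter {
 public:
  explicit PendingCounter(int targets) : _pending(targets) {
    assert(targets >= 0);
  }
  // Called once by each target after running the operation.
  // This single atomic decrement is the target's last access to the object.
  void mark_done() { _pending.fetch_sub(1, std::memory_order_release); }

  bool is_completed() const {
    return _pending.load(std::memory_order_acquire) == 0;
  }
  // Called by the handshaker; it polls instead of blocking on a semaphore.
  void wait_until_completed() const {
    while (!is_completed()) std::this_thread::yield();
  }
 private:
  std::atomic<int> _pending;
};

// The counter lives on the handshaker's stack, like a per-operation object.
int run_completion_demo(int targets) {
  PendingCounter done(targets);
  std::atomic<int> executed{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < targets; ++i) {
    threads.emplace_back([&] {
      executed.fetch_add(1, std::memory_order_relaxed);  // the "operation"
      done.mark_done();
    });
  }
  done.wait_until_completed();
  // The acquire load that observed zero makes every target's work visible here.
  int seen = executed.load(std::memory_order_relaxed);
  for (auto& t : threads) t.join();
  return seen;
}
```
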
In order to avoid issues due to disarming a JavaThread that should still
be armed for a handshake or safepoint, each JavaThread will now always
disarm its own polling page.
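
The self-disarm rule can be sketched with a small model. This is an
assumption-laden illustration, not the SafepointMechanism code: PollModel,
request() and poll() are hypothetical names, and the pending requests are
reduced to a counter. The point is that requesters only ever arm the
poll, while the owning thread disarms it and then re-checks for work that
arrived in the meantime, re-arming itself if needed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical model of one thread's poll state.
struct PollModel {
  std::atomic<bool> armed{false};
  std::atomic<int> pending{0};   // outstanding handshake/safepoint requests

  // Any requesting thread: publish the request, then arm. Never disarms.
  void request() {
    pending.fetch_add(1);
    armed.store(true);
  }

  // Owning thread only: handle pending work and disarm its own poll.
  // Returns the number of requests handled by this call.
  int poll() {
    if (!armed.load()) return 0;
    int handled = pending.exchange(0);
    assert(handled >= 0);
    armed.store(false);                 // only the owner disarms
    if (pending.load() != 0) {
      armed.store(true);                // a request raced in: stay armed
    }
    return handled;
  }
};
```

Because a requester increments pending before arming, and the owner
re-reads pending after disarming, a request that races with the disarm
leaves the poll armed for the next check in this model.
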
I also added a new test, HandshakeDirectTest.java, which tries to stress
the use of direct handshakes with revocations.
In terms of performance, I measured no difference in the execution time
of a single handshake. As expected, the difference shows up when several
handshakes are executed at the same time. For example, on Linux running
on an Intel Xeon 8167M CPU, the test HandshakeDirectTest.java (which
executes 50000 handshakes between 32 threads) runs in around 340ms
using direct handshakes and in around 5.6 seconds without them. For a
modified version of that test that only executes 128 handshakes between
the 32 threads and avoids any suspend-resume, the test takes around 12ms
with direct handshakes and 19ms without them.
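
For a quick sense of scale, the ratios implied by these timings can be
computed as below. The helper names are made up for illustration, and the
per-handshake figure simply divides total test time by the number of
handshakes, so it also includes test setup and is only a rough upper
bound.

```cpp
#include <cassert>

// Ratio of the VMThread-based time to the direct-handshake time.
double speedup(double baseline_ms, double direct_ms) {
  assert(direct_ms > 0.0);
  return baseline_ms / direct_ms;
}

// Total test time spread over all handshakes, in microseconds.
double per_handshake_us(double total_ms, int handshakes) {
  assert(handshakes > 0);
  return total_ms * 1000.0 / handshakes;
}
```

With the numbers above, speedup(5600, 340) is roughly 16.5 for the
50000-handshake test and speedup(19, 12) is roughly 1.6 for the
128-handshake variant; per_handshake_us gives about 6.8 and 112
microseconds per handshake for the full test with and without direct
handshakes, respectively.
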
Tested with mach5, tiers1-6.