RFR(S): 8039458: Ensure consistency of Monitor/Mutex lock acquisitions in relation to safepoint protocol
markus.gronlund at oracle.com
Wed Apr 9 12:04:03 UTC 2014
Kindly asking for reviews for the following bug fix/enhancement
VM code acquire Monitor/Mutex locks by respecting the correct lock order, which is enforced in debug builds (asserts). This is great.
However, you could be taking locks in the correct order and everything might work just fine during development and testing (even product) but depending on runtime circumstances (load/concurrency), the way the lock(s) was acquired, i.e. the lock acquisition mode, can lead to very hard to debug problems (hangs/deadlocks). This is primarily related to the lock acquisition mode option "Mutex::_no_safepoint_check_flag" by which you inform your intent to respect or not-respect the safepoint protocol during lock acquisition/ownership.
If a lock has already been acquired by a thread by passing the Mutex::_no_safepoint_check_flag to circumvent the safepoint protocol, the thread must *not* be allowed to attempt taking yet another lock on which it *might* block ( a lock which was not acquired by passing Mutex::_no_safepoint_check_flag in its acquisition operation), as such a lock would be participating in the safepoint protocol.
Today, doing this mistake will work just fine _if the lock is currently uncontended_ which it normally is in development/testing situations. However, once the lock then becomes contended (highly volatile pressured/concurrent product deployment) you might get the worst kind of issues to debug (hangs/deadlocks).
Naturally and as always, you have to very careful when you are taking locks using the Mutex::_no_safepoint_check_flag -however it can be very hard to determine what other code you can safely call once you are the owner of a Mutex::_no_safepoint_check_flag:ed lock - and today there is nothing but the developers attention to details that will find/help stay clear of this. I think we can do better here in enforcing some invariants in code for preempting potential deadlock prone situations.
Add debug assertions that enforce correct usages of the Mutex::_no_safepoint_check_flag when taking multiple locks.
This suggestion leverages much of the existing code targeting enforcement of lock ordering- this is a simple additive change to also add invariant checking on "lock acquisition mode".
Do note this primarily targets Java Threads running in the VM.
Also included code comment:
* Ensure consistency of Monitor/Mutex lock acquisitions
* for Java Threads running inside the VM.
* If a thread has already acquired lock(s) using
* "Mutex::_no_safepoint_check_flag" (effectively going outside the
* safepoint protocol), the thread should be disallowed to acquire any
* additional lock which _is_ participating in the safepoint protocol.
* If a "safepoint protocol aware" lock is contended, and the thread entering
* is unable to "fast acquire" the lock using cas/try_spin,
* it will need to block/park. Blocking on a contended lock involves a
* state transition and a potential SafepointSynchronize::block() call.
* Transitioning to a blocked state still holding "Mutex::_no_safepoint_check_flag"
* acquired locks is allowed, but is *very* deadlock prone.
* The effect of allowing this state to happen without checking is subtle
* and hard to debug since a problem might only appear under heavy load and
* only in situations with increased concurrency levels (product builds).
Artificial code changes to MutexLocker and Mutex::_no_safepoint_check_flag on certain locks for detecting potential deadlock situations.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the hotspot-runtime-dev