RFR (S): 8182703: Correct G1 barrier queue lock orderings
erik.osterlund at oracle.com
Mon Jul 3 12:53:58 UTC 2017
Thank you for the review!
Regarding the use of + x in the current enum system for lock rankings, I
agree that it is not a brilliant system and you feel a bit sad when your
lock rank is "leaf+2". However, sometimes I feel like abstracting
numbers with names can become confusing as well - even misleading. Like
for example how leaf is no longer a leaf and how it is questionable
whether special is really all that special any longer.
When I thought about possible name for access + 0 and access + 1, I was
thinking something in the lines of "perhaps access_inner/outer or
access_leaf/nonleaf", but then that might get confusing if suddenly
access will need 3 ranks for some reason and we get an "access_special"
rank or something.
Perhaps a different solution than enum names would be nice long-term for
lock ranks and deadlock detection, but I believe that might be outside
of the current scope for this change.
On 2017-07-03 13:57, Mikael Gerdin wrote:
> Hi Erik,
> On 2017-06-26 15:34, Erik Österlund wrote:
>> Webrev: http://cr.openjdk.java.net/~eosterlund/8182703/webrev.02/
> I think this change makes sense and I agree with your reasoning below.
> I'm leaning towards suggesting creating a named enum value for
> "access+1" to begin a move towards getting rid of adding and
> subtracting values from enums in this code. I don't have a good name
> for it, though.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8182703
>> The G1 barrier queues have very awkward lock orderings for the
>> following reasons:
>> 1) These queues may queue up things when performing a reference write
>> or resolving a jweak (intentionally or just happened to be jweak,
>> even though it looks like a jobject), which can happen in a lot of
>> places in the code. We resolve JNIHandles while holding special locks
>> in many places. We perform reference writes also in many places. Now
>> the unsuspecting hotspot developer might think that it is okay to
>> resolve a JNIHandle or perform a reference write while possibly
>> holding a special lock. But no. In some cases, object writes have
>> been moved out of locks and replaced with lock-free CAS, only to
>> dodge the G1 write barrier locks. I don't think the G1 lock ordering
>> issues should shape the shared code rather than the other way around.
>> 2) There is an issue that the shared queue locks have a "special"
>> rank, which is below the lock ranks used by the cbl monitor and free
>> list monitor. This leads to an issue when these locks have to be
>> taken while holding the shared queue locks. The current solution is
>> to drop the shared queue locks temporarily, introducing nasty data
>> races. These races are guarded, but the whole race seems very
>> I argue that if the G1 write barrier queue locks were simply set
>> appropriately in the first place by analyzing what ranks they should
>> have, none of the above issues would exist. Therefore I propose this
>> new ordering.
>> Specifically, I recognize that locks required for performing memory
>> accesses and resolving JNIHandles are more special than the "special"
>> rank. Therefore, this change introduces a new lock ordering category
>> called "access", which is to be used by barriers required to perform
>> memory accesses. In other words, by recognizing the rank is more
>> special than "special", we can remove "special" code to walk around
>> making its rank more "special". That seems desirable to me. The
>> access locks need to comply to the same constraints as the special
>> locks: they may not perform safepoint checks.
>> The old lock ranks were:
>> SATB_Q_FL_lock: special
>> SATB_Q_CBL_mon: leaf - 1
>> Shared_SATB_Q_lock: leaf - 1
>> DirtyCardQ_FL_lock: special
>> DirtyCardQ_CBL_mon: leaf - 1
>> Shared_DirtyCardQ_lock: leaf - 1
>> The new lock ranks are:
>> SATB_Q_FL_lock: access (special - 2)
>> SATB_Q_CBL_mon: access (special - 2)
>> Shared_SATB_Q_lock: access + 1 (special - 1)
>> DirtyCardQ_FL_lock: access (special - 2)
>> DirtyCardQ_CBL_mon: access (special - 2)
>> Shared_DirtyCardQ_lock: access + 1 (special - 1)
>> Each PtrQueue and PtrQueueSet group, SATB or DirtyCardQ have the same
>> group of locks. The free list lock, the completed buffer list monitor
>> and the shared queue lock.
>> 1) The free list lock and completed buffer list monitors (members of
>> PtrQueueSet) are disjoint. We never hold both of them at the same time.
>> Rationale: The free list lock is only used from
>> PtrQueueSet::allocate_buffer, PtrQueueSet::deallocate_buffer and
>> PtrQueueSet::reduce_free_list, and no callsite from there can be
>> expanded where the cbl monitor is acquired. So therefore it is
>> impossible to acquire the cbl monitor while holding the free list
>> lock. The opposite case of acquiring the free list lock while holding
>> the cbl monitor is also not possible; only the following places
>> acquire the cbl monitor: PtrQueueSet::enqueue_complete_buffer,
>> PtrQueueSet::notify_if_necessary, FreeIdSet::claim_par_id,
>> FreeIdSet::release_par_id, DirtyCardQueueSet::get_completed_buffer,
>> SATBMarkQueueSet::apply_closure_to_completed_buffer and
>> SATBMarkQueueSet::abandon_partial_marking. Again, neither of these
>> paths where the cbl monitor is held can expand callsites to a place
>> where the free list locks are held. Therefore it holds that the cbl
>> monitor can not be held while the free list lock is held, and the
>> free list lock can not be held while the cbl monitor is held.
>> Therefore they are held disjointly.
>> 2) We might hold the shared queue locks before acquiring the
>> completed buffer list monitor. (today we drop the shared queue lock
>> then and reacquire it later as a hack as already described)
>> 3) We do not acquire a shared queue lock while holding the free list
>> lock or completed buffer list monitor, as there is no reference from
>> a PtrQueueSet to its shared queue, so those code paths do not know
>> how to reach the shared PtrQueue to acquire its lock. The derived
>> classes are exceptions but they never use the shared queue lock while
>> holding the completed buffer list monitor or free list lock.
>> DirtyCardQueueSet uses the shared queue for concatenating logs (in a
>> safepoint without holding those locks). The SATBMarkQueueSet uses the
>> shared queue for filtering the buffers, fiddling with activeness,
>> printing and resetting, all without grabbing any locks.
>> 4) We do not acquire any other lock (above event) while holding the
>> free list lock or completed buffer list monitors. This was discovered
>> by manually expanding the call graphs from where these two locks are
>> Derived constraints:
>> a) Because of observation 1, the free list lock and completed buffer
>> list monitors can have the same rank.
>> b) Because of observations 1 and 2, the shared queue lock ought to
>> have a rank higher than the ranks of the free list lock and the
>> completed buffer list monitors (not the case today).
>> c) Because of of observation 3 and 2, the free list lock and
>> completed buffer list monitors ought to have a rank lower than the
>> rank of the shared queue lock.
>> d) Because of observation 4 (and constraints a-c), all the barrier
>> locks should be below the "special" rank without violating any
>> existing ranks.
>> The proposed new lock ranks conform to the constraints derived from
>> my observations. It is worth noting that the potential relationship
>> that could break (and why they do not) are:
>> 1) If a lock is acquired from within the barriers that does not
>> involve the shared queue lock, the free list lock or the completed
>> buffer list monitor, we have now inverted their relationship as that
>> other lock would probably have a rank higher than or equal to
>> "special". But due to observation 4, there are no such cases.
>> 2) The relationship between the shared queue lock and the completed
>> buffer list monitor has been changed so both can be held at the same
>> time if the shared queue lock is acquired first (which it is). This
>> is arguably the way it should have been from the first place, and the
>> old solution had ugly hacks where we would drop the shared queue lock
>> to not run into the lock order assert (and only not to run into the
>> lock order assert, i.e. not to avoid potential deadlock) to ensure
>> the locks are not held at the same time. That code has now been
>> removed, so that the shared queue lock is still held when enqueueing
>> completed buffers (no dodgy dropping and reclaiming), and the code
>> for handling the races due to multiple concurrent enqueuers has also
>> been removed and replaced with an assertion that there simply should
>> not be multiple concurrent enqueuers. Since the shared queue lock is
>> now held throughout the whole operation, there will be no concurrent
>> 3) The completed buffer list monitor used to have a higher rank than
>> the free list lock. Now they have the same. Therefore, they could
>> previously allow them to be held at the same time if the cbl monitor
>> was acquired first. However, as discussed, there is no such case, and
>> they ought to have the same rank not to confuse their true
>> disjointness. If anyone insists we do not break this relationship
>> despite the true disjointness, I could consent to adding another
>> access lock rank, like this:
>> http://cr.openjdk.java.net/~eosterlund/8182703/webrev.01/ but I think
>> it seems better to have the same rank since they are actually truly
>> disjoint and should remain disjoint.
>> I do recognize that long term, we *might* want a lock-free solution
>> or something (not saying we do or do not). But until then, the ranks
>> ought to be corrected so that they do not cause these problems
>> causing everyone to bash their head against the awkward G1 lock ranks
>> throughout the code and make hacks around it.
>> Testing: JPRT with hotspot all and lots of local testing.
More information about the hotspot-gc-dev