RFR: 8165808: Add release barriers when allocating objects with concurrent collection
david.holmes at oracle.com
Mon Sep 12 23:22:38 UTC 2016
On 13/09/2016 4:08 AM, Kim Barrett wrote:
>> On Sep 11, 2016, at 9:38 PM, David Holmes <david.holmes at oracle.com> wrote:
>> Hi Kim,
>> On 11/09/2016 8:59 AM, Kim Barrett wrote:
>>> Please review this change to add release barriers when allocating
>>> objects that may be visible to a concurrent collector.
>>> This change only addresses the allocation side, where the
>>> initialization occurs; other subtasks of JDK-8160369 will address the
>>> (mostly GC-specific) readers that need to accomodate in-progress
>>> As part of this change, eliminated post_allocation_install_obj_klass,
>>> which consisted of a call to [release_]set_klass + assertions that are
>>> redundant with those in that called function.
>> Based on experiences with RMO architectures (ref JDK-8033920 - sorry it is a non-open bug report) I would expect the use of release_set_klass to be unconditional, not dependent on the nature of the GC. Publication of the object reference also has to be safe for other mutator threads. I'm not convinced by the brief argument in JDK-8033920 that thread-state transitions will always ensure visibility and ordering.
> There were two different issues discussed in (non-open) JDK-8033920.
> (1) Ensure setting the klass happens after some other parts of the
> object initialization, to ensure (some) concurrent GCs can parse the
> (2) Ensure the klass is visible to other (mutator) threads if the
> object is accessible to those threads, e.g. if the object escapes from
> the allocating thread.
> JDK-8160369 is only about (1), and this change (for JDK-8165808) is
> part of fixing (1). I'm not trying to address (2) here, even assuming
> there are problems in that area. (I don't know that there *are*
I have my doubts about treating these two issues separately, but will
not press the issue. For the general mutator path the devil is in the
detail, and as Derek indicated if there is a problem there the solution
would likely need the matching load-acquire.
> problems; I haven't explored that question carefully. I did find
> storestore barriers in the appropriate places in the bytecode
> interpreter, but I don't know how the reader side works, nor have I
> examined relevant parts of the compilers.)
FWIW the presence of storestore barriers require further examination as
their use tends to reflect analysis that didn't account for the reader
side. But that's a different issue again.
> For (1), only concurrent GCs are affected. A purely STW GC can rely
> on safepointing to ensure the ordering. (Safepointing also immunizes
> STW GCs from (2).)
> The parsability constraint for CMS and G1 is a consequence of
> concurrent processing of dirty card information (CMS precleaning and
> G1 concurrent refinement), which needs to be able to find object
> boundaries in the regions covered by marked cards. Hence, for both
> CMS and G1, only allocations directly in the old generation need
> special care. And for G1, only humongous objects are directly
> allocated in the old-gen.
> [I don't know enough about Shenandoah to know whether it has any
> similar requirements.]
> This is the rationale for the conditionalization on INCLUDE_ALL_GCS.
> It is also the rationale for only doing the release_set_klass in the
> collectedHeap.inline.hpp code paths, and not the TLAB / Eden
> allocation paths.
> It would be possible to add runtime conditionalization, but that could
> quickly add more cost than just executing the barrier unconditionally.
> Especially on TSO platforms :-) And these are slow-path allocations,
> used only when TLAB / Eden allocation is disallowed or fails, so an
> unnecessary release barrier shouldn't have much overall impact even on
> non-TSO platforms. Hence keeping it simple.
More information about the hotspot-dev