8227226: Segmented array clearing for ZGC

Sciampacone, Ryan sci at amazon.com
Wed Aug 7 01:05:11 UTC 2019

Although least intrusive, it goes back to some of the earlier complaints about using false in the constructor for do_zero.  It also makes a fair number of assumptions (and goes against the hierarchies intent) on initialization logic to hide in finish().  That said, I agree that is fairly clean - and definitely addresses the missed cases of the earlier webrev.

2 things,

1. Isn't the substitute_oop_array_klass() check too narrow?  It will only detect types Object[], and not any other type of reference array (such as String[]) ?  I believe there's a bug here (correct me if I'm wrong).
2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) dependency.  I realize what code base this is in but it would be properly defensive.

What does the reporting look like in this case?  Is the long[] type reported accepted?  I'm wondering if this depletes some of the simplicity.

On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of per.liden at oracle.com> wrote:

    Did some micro-benchmarking (on a Xeon E5-2630) with various segment 
    sizes between 4K and 512K, and 64K seems to offer a good trade-off. For 
    a 1G array, the allocation time increases by ~1%, but in exchange the 
    worst case TTSP drops from ~280ms to ~0.6ms.
    Updated webrev using 64K:
    On 8/2/19 11:11 AM, Per Liden wrote:
    > Hi Erik,
    > On 8/1/19 5:56 PM, Erik Osterlund wrote:
    >> Hi Per,
    >> I like that this approach is unintrusive, does its thing at the right 
    >> abstraction layer, and also handles medium sized arrays.
    > It even handles small arrays (i.e. arrays in small zpages) ;)
    >> Looks good.
    > Thanks! I'll test various segment sizes and see how that affects 
    > performance and TTSP.
    > cheers,
    > Per
    >> Thanks,
    >> /Erik
    >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
    >>> Here's an updated webrev that should be complete, i.e. fixes the 
    >>> issues related to allocation sampling/reporting that I mentioned. 
    >>> This patch makes MemAllocator::finish() virtual, so that we can do 
    >>> our thing and install the correct klass pointer before the Allocation 
    >>> destructor executes. This seems to be the least intrusive why of 
    >>> doing this.
    >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
    >>> This passed function testing, but proper benchmarking remains to be 
    >>> done.
    >>> cheers,
    >>> Per
    >>>> On 7/31/19 7:19 PM, Per Liden wrote:
    >>>> Hi,
    >>>> I found some time to benchmark the "GC clears pages"-approach, and 
    >>>> it's fairly clear that it's not paying off. So ditching that idea.
    >>>> However, I'm still looking for something that would not just do 
    >>>> segmented clearing of arrays in large zpages. Letting oop arrays 
    >>>> temporarily be typed arrays while it's being cleared could be an 
    >>>> option. I did a prototype for that, which looks like this:
    >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
    >>>> There's at least one issue here, the code doing allocation sampling 
    >>>> will see that we allocated long arrays instead of oop arrays, so the 
    >>>> reporting there will be skewed. That can be addressed if we go down 
    >>>> this path. The code is otherwise fairly simple and contained. Feel 
    >>>> free to spot any issues.
    >>>> cheers,
    >>>> Per
    >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
    >>>>> Hi Ryan & Erik,
    >>>>> I had a look at this and started exploring a slightly different 
    >>>>> approach. Instead doing segmented clearing in the allocation path, 
    >>>>> we can have the concurrent GC thread clear pages when they are 
    >>>>> reclaimed and not do any clearing in the allocation path at all.
    >>>>> That would look like this:
    >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
    >>>>> (I've had to temporarily comment out three lines of assert/debug 
    >>>>> code to make this work)
    >>>>> The relocation set selection phase will now be tasked with some 
    >>>>> potentially expensive clearing work, so we'll want to make that 
    >>>>> part parallel also.
    >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
    >>>>> Moving this work from Java threads onto the concurrent GC threads 
    >>>>> means we will potentially prolong the RelocationSetSelection and 
    >>>>> Relocation phases. That might be a trade-off worth doing. In 
    >>>>> return, we get:
    >>>>> * Faster array allocations, as there's now less work done in the 
    >>>>> allocation path.
    >>>>> * This benefits all arrays, not just those allocated in large pages.
    >>>>> * No need to consider/tune a "chunk size".
    >>>>> * I also tend think we'll end up with slightly less complex code, 
    >>>>> that is a bit easier to reason about. Can be debated of course.
    >>>>> This approach might also "survive" longer, because the YC scheme 
    >>>>> we've been loosely thinking about currently requires newly 
    >>>>> allocated pages to be cleared anyway. It's of course too early to 
    >>>>> tell if that requirement will stand in the end, but it's possible 
    >>>>> anyway.
    >>>>> I'll need to do some more testing and benchmarking to make sure 
    >>>>> there's no regression or bugs here. The commented out debug code 
    >>>>> also needs to be addressed of course.
    >>>>> Comments? Other ideas?
    >>>>> cheers,
    >>>>> Per
    >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
    >>>>>> Somehow I lost the RFR off the front and started a new thread.
    >>>>>> Now that we're both off vacation I'd like to revisit this.  Can 
    >>>>>> you take a look?
    >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, 
    >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
    >>>>>> sci at amazon.com> wrote:
    >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
    >>>>>>       This shifts away from abusing the constructor do_zero magic 
    >>>>>> in exchange for virtualizing mem_clear() and specializing for the 
    >>>>>> Z version.  It does create a change in mem_clear in that it 
    >>>>>> returns an updated version of mem.  It does create change outside 
    >>>>>> of the Z code however it does feel cleaner.
    >>>>>>       I didn't make a change to PinAllocating - looking at it, the 
    >>>>>> safety of keeping it constructor / destructor based still seemed 
    >>>>>> appropriate to me.  If the objection is to using the sequence 
    >>>>>> numbers to pin (and instead using handles to update) - this to me 
    >>>>>> seems less error prone.  I had originally discussed handles with 
    >>>>>> Stefan but the proposal came down to this which looks much cleaner.
    >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of 
    >>>>>> Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on 
    >>>>>> behalf of sci at amazon.com> wrote:
    >>>>>>           1) Yes this was a conscious decision.  There was 
    >>>>>> discussion on determining the optimal point for breakup but given 
    >>>>>> the existing sizes this seemed sufficient.  This doesn't preclude 
    >>>>>> the ability to go down that path if its deemed absolutely 
    >>>>>> necessary.  The path for more complex decisions is now available.
    >>>>>>           2) Agree
    >>>>>>           3) I'm not clear here.  Do you mean effectively going 
    >>>>>> direct to ZHeap and bypassing the single function PinAllocating?  
    >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
    >>>>>>           4) Agree
    >>>>>>           5) I initially had the exact same reaction but I played 
    >>>>>> around with a few other versions (including breaking up 
    >>>>>> initialization points between header and body to get the desired 
    >>>>>> results) and this ended up looking correct.  I'll try mixing in 
    >>>>>> the mem clearer function again (fresh start) to see if it looks 
    >>>>>> any better.
    >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> 
    >>>>>> wrote:
    >>>>>>               Hi Ryan,
    >>>>>>               A few general comments:
    >>>>>>               1) It looks like this still only work for large pages?
    >>>>>>               2) The log_info stuff should be removed.
    >>>>>>               3) I'm not a huge fan of single-use utilities like 
    >>>>>> PinAllocating, at
    >>>>>>               least not when, IMO, the alternative is more 
    >>>>>> straight forward and less code.
    >>>>>>               4) Please make locals const when possible.
    >>>>>>               5) Duplicating _do_zero looks odd. Injecting a "mem 
    >>>>>> clearer", similar to
    >>>>>>               what Stefans original patch did, seems worth exploring.
    >>>>>>               cheers,
    >>>>>>               /Per
    >>>>>>               (Btw, I'm on vacation so I might not be 
    >>>>>> super-responsive to emails)
    >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
    >>>>>>               > Hi Ryan,
    >>>>>>               >
    >>>>>>               > This looks good in general. Just some stylistic 
    >>>>>> things...
    >>>>>>               >
    >>>>>>               > 1) In the ZGC project we like the letter 'Z' so 
    >>>>>> much that we put it in
    >>>>>>               > front of everything we possibly can, including all 
    >>>>>> class names.
    >>>>>>               > 2) We also explicitly state things are private 
    >>>>>> even though it's
    >>>>>>               > bleedingly obvious.
    >>>>>>               >
    >>>>>>               > So:
    >>>>>>               >
    >>>>>>               > 39 class PinAllocating {
    >>>>>>               > 40 HeapWord* _mem;
    >>>>>>               > 41 public: -> 39 class ZPinAllocating { 40 
    >>>>>> private: 41 HeapWord* _mem;
    >>>>>>               >    42
    >>>>>>               >   41 public: I can be your sponsor and push this 
    >>>>>> change for you. I don't
    >>>>>>               > think there is a need for another webrev for my 
    >>>>>> small stylistic remarks,
    >>>>>>               > so I can just fix that before pushing this for 
    >>>>>> you. On that note, I'll
    >>>>>>               > add me and StefanK to the contributed-by section 
    >>>>>> as we all worked out
    >>>>>>               > the right solution to this problem 
    >>>>>> collaboratively. I have run through
    >>>>>>               > mach5 tier1-5, and found no issues with this 
    >>>>>> patch. Thanks, /Erik
    >>>>>>               >
    >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
    >>>>>>               >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
    >>>>>>               >> https://bugs.openjdk.java.net/browse/JDK-8227226
    >>>>>>               >>
    >>>>>>               >> This patch introduces safe point checks into 
    >>>>>> array clearing during
    >>>>>>               >> allocation for ZGC.  The patch isolates the 
    >>>>>> changes to ZGC as (in
    >>>>>>               >> particular with the more modern collectors) the 
    >>>>>> approach to
    >>>>>>               >> incrementalizing or respecting safe point checks 
    >>>>>> is going to be
    >>>>>>               >> different.
    >>>>>>               >>
    >>>>>>               >> The approach is to keep the region holding the 
    >>>>>> array in the allocating
    >>>>>>               >> state (pin logic) while updating the color to the 
    >>>>>> array after checks.
    >>>>>>               >>
    >>>>>>               >> Can I get a review?  Thanks.
    >>>>>>               >>
    >>>>>>               >> Ryan
    >>>>>>               >

More information about the hotspot-gc-dev mailing list