RFR: JDK-8086056: ParNew: auto-tune ParGCCardsPerStrideChunk

Bengt Rutisson bengt.rutisson at oracle.com
Fri Jun 26 06:39:21 UTC 2015


Hi Tony,

On 2015-06-25 15:29, Tony Printezis wrote:
> Bengt,
>
> Thanks for looking at this again. See inline.
>
> On June 25, 2015 at 4:06:40 AM, Bengt Rutisson 
> (bengt.rutisson at oracle.com <mailto:bengt.rutisson at oracle.com>) wrote:
>
>>
>> Hi Tony,
>>
>> No worries about the delay, I'm also quite busy with other things 
>> here. ;)
>>
>> On 2015-06-24 20:41, Tony Printezis wrote:
>>> New webrev that covers all the suggestions:
>>>
>>> http://cr.openjdk.java.net/~tonyp/8086056/webrev.1/ 
>>> <http://cr.openjdk.java.net/%7Etonyp/8086056/webrev.1/>
>>
>> Thanks for providing a new webrev!
>>
>> Some comments:
>>
>> parNewGeneration.cpp
>>
>> The flag validation code on lines 886-912 can be replaced by the 
>> recently added support for command line verification. See JEP-245:
>>
>> JEP 245: Validate JVM Command-Line Flag Arguments
>> http://openjdk.java.net/jeps/245
>> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5bbf25472731
>>
>> You can simply specify a constraint function in the definition of the 
>> flags in globals.hpp.
>
>
> Thanks for pointing me to this. Given the flags are manageable, do the 
> constraints fire when the flags are updated dynamically? If they do, 
> what happens if the contrain fails?
>


Well, this is very new to me too. :)

The JEP says:

"The range and constraints checks are done every time a flag changes, as 
well as late in the JVM initialization routine (i.e., in init_globals() 
after stubRoutines_init2()) at the time when all flags have their final 
values set. We will continue to check the manageable flags as long as 
the JVM runs."

I haven't actively been using this myself, but I assume that the 
constraints will be checked every time a managaeble flag is changed and 
that the one trying to change it will get an error reported back if they 
try to change it in an invalid way.

I'm copying Gerard on this email. He has built this support and can 
surely answer the details.

>
>>
>> I have a very hard time following all the type conversions going on 
>> in adjust_cards_per_stride().
>>
>> The capacity flags are size_t, the DynamicParGCStrides[Min/Max]Size 
>> flags are uintx, we then convert the result through intptr_t,
>
>
> (to my defense!) I had to convert to intptr_t to use the log2_intptr() 
> function. And I assume the stride size values will be relatively small 
> (i.e., maybe several K). So, casting to intptr_t should be OK. But I 
> do like to contrain things as much as possible. Should we add an 
> artificial upper bound just in case?
>
>
>> int and uintx to eventually assign back to ParGCCardsPerStrideChunk 
>> as intx.
>
>
> So, I was going for the capacities being size_t and the stride sizes 
> being uintx. ParGCCardsPerStrideChunk is intx and I thought that uintx 
> would be more appropriate as it cannot be negative. I suggested to 
> also make ParGCCardsPerStrideChunk uintx but was I encouraged against 
> it. To keep the calculations simpler I’ll cast everything to size_t 
> locally and use log2_long() instead of log2_intrptr().
>
> What do I do with the types of the args? Let’s go for consistency: if 
> you guys want me to leave ParGCCardsPerStrideChunk as intx, I’ll leave 
> everything as intx. My preference would be to change everything to uintx.
>
>
>> The value of ParGCCardsPerStrideChunk is then used in the code to 
>> offset pointers of the jbyte* type. How can we be sure that this 
>> works correctly?
>
>
> I’d feel better if it was uintx. I’ll also check what happens if the 
> arg is not aligned properly.
>

This all sounds good. I need to see the new webrev to comment in detail, 
but in general I just think we should try to get rid of as much casting 
as we possibly can.

I wouldn't have minded changing ParGCCardsPerStrideChunk to uintx or 
size_t. But I read your comment to Thomas. I think it is fine to leave 
that for another change if you prefer.

>
>>
>> globalDefinitions.hpp
>>
>> I don't think that globalDefinitions.hpp is the right place to put 
>> linearly_interpolate_bounded(). I would prefer to have it as a local 
>> method in parNewGeneration.cpp.Or if you think it is actually going 
>> to be used more widely I think it should go somewhere in the 
>> utilities/ directory.
>
>
> Well, it is a general-purpose method, if you want to do such an 
> interpolation. Whether it will be needed by anything else, I don’t 
> know. If you don’t think I should package it up in this 
> general-purpose way, I’ll just go back to what I had before and not 
> factor it out, as the ParNewGeneration class would not be the right 
> place for it either.
>

Given that you can remove all the flag checking code I don't think I 
mind inining the logic inside 
ParNewGeneration::adjust_cards_per_stride() again.

>
>> Does it really make sense to use different types for X and Y in 
>> linearly_interpolate_bounded()?
>
>
> I think it does. The two ranges are separate and don’t have to be of 
> the same type. In this case, you can have the strides size as, say, 
> ushort and the capacity as size_t and it’d work. Note that the ratio 
> is calculated based on one range and then applied to the other range 
> so I think the calculation should be safe for separate types.
>

Ok.

>
>> I would think that one type is enough. Also, it does not look safe to 
>> just cast the type to double for a general purpose template method.
>
>
> Why (and how else would you do this)? This should be safe for numetic 
> types, which X and Y will be.
>


Right, as long as they are primitive numeric types it will work. And I 
guess that's most likely what you'd use. The template code itself would 
otherwise work with a class for for example BitIntegers if it had 
overloaded the comparison and numeric operators. But that's a bit far 
fetched I guess. ;)

Anyway, if you inline this code again you will just be casting a size_t 
to a double which should be safe. So, the discussion is kind of moot.

Finally, just a heads up. I will be on vacation for the coming three 
weeks. I doubt that I will be able to review any new version during that 
time. If you get reviews from others I'm perfectly fine with this being 
pushed. Depending on how different the changes that you push are 
compared to the versions I've reviewed I'll leave it up to you to decide 
whether you want to list me as a reviewer or not. I'm fine either way. 
It's not a problem to leave me out of the reviewer list if you are unsure.

Thanks,
Bengt

> Tony
>
>
>>
>> Thanks,
>> Bengt
>>
>>>
>>> (and I also found a version of hg that works OK with webrev; thanks 
>>> to Jon Masa for the help)
>>>
>>> I’m not really happy with the way I deal with the 32-bit case. 
>>> Better suggestions are welcome. Also, I’m not attached to what the 
>>> cmd line args are called. Better suggestions on that are welcome too.
>>>
>>> Tony
>>>
>>> On June 23, 2015 at 12:19:12 PM, Tony Printezis 
>>> (tprintezis at twitter.com <mailto:tprintezis at twitter.com>) wrote:
>>>
>>>> Hi Bengt,
>>>>
>>>> I’ll try factoring out the interpolation and tell me what you 
>>>> think. Sorry, I’m a bit behind on this as I’ve been busy with a 
>>>> couple of more urgent tasks hee. I’ll get this done shortly.
>>>>
>>>> Tony
>>>>
>>>> On June 22, 2015 at 3:36:10 AM, Bengt Rutisson 
>>>> (bengt.rutisson at oracle.com <mailto:bengt.rutisson at oracle.com>) wrote:
>>>>
>>>>>
>>>>> Hi Tony,
>>>>>
>>>>> I agree with your comments. Thanks for fixing this!
>>>>>
>>>>> One answer at the end of this message...
>>>>>
>>>>> On 2015-06-18 15:50, Tony Printezis wrote:
>>>>>> Hi Bengt,
>>>>>>
>>>>>> Thanks for looking at it. Inline.
>>>>>>
>>>>>> On June 18, 2015 at 3:02:45 AM, Bengt Rutisson 
>>>>>> (bengt.rutisson at oracle.com <mailto:bengt.rutisson at oracle.com>) wrote:
>>>>>>
>>>>>>>
>>>>>>> Hi Tony,
>>>>>>>
>>>>>>> NIce to hear from you!
>>>>>>>
>>>>>>>
>>>>>>> On 18/06/15 00:30, Tony Printezis wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> A small patch for your consideration:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~tonyp/8086056/webrev.0/ 
>>>>>>>> <http://cr.openjdk.java.net/%7Etonyp/8086056/webrev.0/>
>>>>>>>
>>>>>>> A couple of style comments:
>>>>>>>
>>>>>>>  891   const size_t MinOldGenCapacity =      G;
>>>>>>>  892   const size_t MaxOldGenCapacity = 16 * G;
>>>>>>>  893 assert(MinOldGenCapacity <= MaxOldGenCapacity, "sanity");
>>>>>>>
>>>>>>> Local variables are normmaly named using lower case and underscore.
>>>>>>
>>>>>>
>>>>>> Yeah, I of course usually do that. But as they are constants I 
>>>>>> thought I’d make them look like constants. :-)
>>>>>>
>>>>>>
>>>>>>> So, min_old_gen_capacity and max_old_gen_capacity. Same for the 
>>>>>>> rest of the local variables in 
>>>>>>> ParNewGeneration::adjust_cards_per_stride().
>>>>>>>
>>>>>>> Having said that, I think it would be good to turn some of these 
>>>>>>> constants into flags (which means that the naming should be as 
>>>>>>> it is now ;) ).
>>>>>>
>>>>>>
>>>>>> Hang on! Are you and Masa encourating me to add 5 (!!!) more cmd 
>>>>>> line args to HotSpot? ;-)
>>>>>>
>>>>>>
>>>>>>> Especially since you say that you haven't done much performance 
>>>>>>> testing of these values.
>>>>>>
>>>>>>
>>>>>> To be clear: I did performance testing with chunks sizes in the 
>>>>>> 256-8k range and with three old gen sizes (2g, 10g, and 20g) and 
>>>>>> auto-tuning comes pretty close to optimal; I have numbers on the 
>>>>>> JIRA. But, yes, that was only with a couple of (synthetic) tests 
>>>>>> and only on one architecture (Linux/x64). If someone there could 
>>>>>> try this on a couple more apps and maybe on sparc, as we don’t 
>>>>>> happen to have any sparc boxes lying around :-), it’d be helpful.
>>>>>>
>>>>>> But, yes, your suggestion of making the chunk size / old gen 
>>>>>> capacity limits into cmd line args definitely makes sense and I 
>>>>>> had also considered that myself. But I had decided against it as 
>>>>>> I thought you wouldn’t want more cmd line args. :-) How about:
>>>>>>
>>>>>> +UseDynamicParGCStrides
>>>>>>
>>>>>> DynamicParGCStrides{Min,Max}OldGenCapacity
>>>>>>
>>>>>> DynamicParGCStrides{Min,Max}Size
>>>>>>
>>>>>> I should also make them all manageable.
>>>>>>
>>>>>>
>>>>>>> I would prefer that we could adjust them at runtime. Best would 
>>>>>>> of course be if you could take the time to do the performance 
>>>>>>> testing.
>>>>>>
>>>>>>
>>>>>> So, as I said earlier, I did. The numbers are on the JIRA. Is 
>>>>>> there anything else I could maybe run? Maybe larger old gen than 
>>>>>> 20g? (but that’s reaching the memory limits of my workstation, FWIW)
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I realize that the assert on line 893 might have been useful if 
>>>>>>> you were playing around with the numbers when you were 
>>>>>>> experimenting with the patch. But now it looks mostly like line 
>>>>>>> noise to me. Same with the assert on line 898.
>>>>>>
>>>>>>
>>>>>> Actually, it as basically stating the invariant for those 
>>>>>> parameters (in case someone wants to play around with them). But 
>>>>>> if we make the constants into cmd line args this will go away...
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I would also consider factoring the actual computation out into 
>>>>>>> a separate helper method.
>>>>>>
>>>>>>
>>>>>> You mean the interpolation? Like:
>>>>>>
>>>>>> (base_min, base_max, value_min, value_max, value) -> result in 
>>>>>> range [base_min, base_max]?
>>>>>>
>>>>>
>>>>> Yes, that's what I meant. But if you, as suggested above, get rid 
>>>>> of most (all?) of the constant definitions in 
>>>>> adjust_cards_per_stride() then I guess it is mostly only the 
>>>>> interpolation left. So, in that case it may not be necessary to 
>>>>> factor it out. I would have to see a new webrev to have more comments.
>>>>>
>>>>> Thanks,
>>>>> Bengt
>>>>>
>>>>>> Tony
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> I also second Jon's comment on that it would be good with a more 
>>>>>>> explicit way of turning this feature off.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Bengt
>>>>>>>
>>>>>>>>
>>>>>>>> (BTW, for some reason some of the webrev output is a bit messed 
>>>>>>>> up. Not sure why, maybe some hg incompatibility I guess. The 
>>>>>>>> diffs look OK though. I also attached the patch to this e-mail.)
>>>>>>>>
>>>>>>>> There’s a bit of info on the JIRA on the rationale for the patch:
>>>>>>>>
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8086056
>>>>>>>>
>>>>>>>> The min / max values for the old gen capacity 
>>>>>>>> and ParGCCardsPerStrideChunk were chosen empircally after 
>>>>>>>> running a few (mostly synthetic tests) on Linux x64. If someone 
>>>>>>>> has the cycles to do a more extensive performance study, I’d be 
>>>>>>>> happy to revise them accordingly.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Tony
>>>>>>>>
>>>>>>>> -----
>>>>>>>>
>>>>>>>> Tony Printezis | JVM/GC Engineer / VM Team | Twitter
>>>>>>>>
>>>>>>>> @TonyPrintezis
>>>>>>>> tprintezis at twitter.com <mailto:tprintezis at twitter.com>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----
>>>>>>
>>>>>> Tony Printezis | JVM/GC Engineer / VM Team | Twitter
>>>>>>
>>>>>> @TonyPrintezis
>>>>>> tprintezis at twitter.com <mailto:tprintezis at twitter.com>
>>>>>>
>>>>>
>>>> -----
>>>>
>>>> Tony Printezis | JVM/GC Engineer / VM Team | Twitter
>>>>
>>>> @TonyPrintezis
>>>> tprintezis at twitter.com <mailto:tprintezis at twitter.com>
>>>>
>>> -----
>>>
>>> Tony Printezis | JVM/GC Engineer / VM Team | Twitter
>>>
>>> @TonyPrintezis
>>> tprintezis at twitter.com <mailto:tprintezis at twitter.com>
>>>
>>
>
>
>
>
>
> -----
>
> Tony Printezis | JVM/GC Engineer / VM Team | Twitter
>
> @TonyPrintezis
> tprintezis at twitter.com <mailto:tprintezis at twitter.com>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/attachments/20150626/bb8a5b3c/attachment-0001.html>


More information about the hotspot-gc-dev mailing list