RFR (S): 8067341: Modify PLAB sizing algorithm to waste less
eric.caspole at oracle.com
Tue Aug 25 18:56:03 UTC 2015
Could you massage this great explanation into those comments? Especially
the part about 1 vs N threads makes it more clear to me.
On 8/25/2015 2:24 PM, Thomas Schatzl wrote:
> On Tue, 2015-08-25 at 13:43 -0400, Eric Caspole wrote:
>> Hi Thomas,
>> I like the new simpler math but I found the comment kind of confusing -
>> 61 // E.g. assume that if we recently used 100 words and a
>> TargetPLABWastePct of 10.
>> 62 // If we had one thread, we could waste up to 10 words to meet
>> that percentage.
>> 63 // Given that we also assume that that buffer is typically
>> half-full, the new
>> 64 // desired PLAB size is 20 words.
>> So you mean this GC cycle used 100 words of 1 thread's PLAB when we are
>> doing this calculation?
> This is the total amount of allocation. The amount of allocation and
> (total) waste G1 can spend to meet the threshold is ideally the same
> whether you use one or a hundred threads.
> So, if there were one thread to copy the same amount of objects, it
> could waste the mentioned 10 words, so we set PLAB size to that size
> because that's the maximum that can ever be wasted (barring only having
> objects larger than 10; and actually in that case it would not waste
> anything because 10 divides 100 evenly, but think 101 :).
> [And expanding that, if you have n threads, each of them can waste <max
> plab size for one thread> / n in a rough approximation. The number of
> threads only comes into play later when actually trying to retrieve the
> desired plab size]
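The arithmetic in the quoted comment and the 1 vs N threads explanation can be sketched as follows. This is only an illustration, not the code under review: the function names desired_plab_words and per_thread_plab_words are hypothetical, and the even split across threads is the rough approximation described above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not the HotSpot implementation: the "optimal" PLAB
// size in words for the last GC, from the total words used by all threads
// and the waste target in percent (TargetPLABWastePct in the comment).
// Assumes total_used_words * target_waste_pct does not overflow size_t.
inline std::size_t desired_plab_words(std::size_t total_used_words,
                                      unsigned target_waste_pct) {
  assert(target_waste_pct <= 100);
  // Total waste a single thread could spend and still meet the target.
  std::size_t waste_budget = total_used_words * target_waste_pct / 100;
  // Assuming a retired buffer is typically half-full, a buffer of twice
  // the budget wastes about the budget on average.
  return 2 * waste_budget;
}

// Rough approximation of the per-thread share when num_threads threads
// copy the same amount of objects.
inline std::size_t per_thread_plab_words(std::size_t desired_words,
                                         unsigned num_threads) {
  assert(num_threads > 0);
  return desired_words / num_threads;
}
```

With 100 words used and a target of 10 percent, desired_plab_words gives 20 words, matching the comment; split across 4 threads, per_thread_plab_words gives 5 words each.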
>> Does the previous PLAB size matter or only the
>> amount actually used?
> This only calculates the "optimal" PLAB size for the current/last GC.
> That value is fed into the usual exponential decaying average to remove
> allocation/waste spikes to guess the next PLAB size.
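A minimal sketch of that smoothing step, assuming a simple exponentially decaying average. The struct name DecayingAverage and the weight value are assumptions for illustration; the actual averaging class and weight used by G1 are not shown in this thread.

```cpp
#include <cassert>

// Hypothetical smoothing helper: keeps an exponentially decaying average
// of the per-GC "optimal" PLAB sizes so that a single allocation or waste
// spike does not dominate the next PLAB size.
struct DecayingAverage {
  double weight;    // share given to the newest sample, in (0, 1]
  double average;   // current smoothed value
  bool has_sample;  // false until the first sample arrives

  explicit DecayingAverage(double w)
      : weight(w), average(0.0), has_sample(false) {
    assert(w > 0.0 && w <= 1.0);
  }

  // Feed the value computed for the current/last GC.
  void sample(double value) {
    if (!has_sample) {
      average = value;  // the first sample seeds the average
      has_sample = true;
    } else {
      average = (1.0 - weight) * average + weight * value;
    }
  }
};
```

For example, with a weight of 0.5, samples of 20 and then 100 words give an average of 60, so a one-off spike moves the next guess only part of the way.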
>> 66 // We account region end waste fully to PLAB allocation (in
>> the calculation of
>> 67 // what we consider as "used" below). This is not completely
>> fair, but is a
>> 68 // conservative assumption because PLABs may be sized flexibly
>> while we cannot
>> 69 // adjust inline allocations.
>> I know this comment was there before but how do the direct allocations
>> affect the PLAB size? Because an object that does not fit immediately
>> causes the current PLAB to be discarded? So that increases the waste
>> but it shows up in used()?
> Direct allocations affect PLAB size in how much waste is generated. This
> potentially decreases PLAB size (hence "conservative").
> The main problem here is region end waste. At the moment we still throw
> away regions that cannot satisfy the current allocation, potentially
> generating lots of that kind of waste.
> G1 generates almost no region end waste when allocating new PLABs due to
> JDK-8067336 at this time (consider that typically the minimum size
> requested is much smaller than what is available). Direct allocation is
> not "flexible", so it can still cause lots of region end waste.
> There is one more larger functionality change (JDK-8067433) that tries
> to minimize that region end waste (quite successfully in my
> measurements) by keeping around regions that still have space left for
> allocating PLABs, and retrying.
> I still need to do some refactoring and splitting of that change into
> multiple changes for review as it is a bit large, so I decided to ask
> for reviews for this one earlier than planned.
> Everything else is just some logging, cleaning up or tweaking.
> There are certainly better methods, but in the real world (tm) it already
> works much better than previously.
> There is always the option to disable automatic PLAB resizing (and you can
> still try to set the "optimal" value for a particular application given
> the PLAB statistics, which are now always calculated).
>>> jprt, perf benchmarks, tested internally in many applications for more
>>> than a year