RFR: 8133051: Concurrent refinement threads may be activated and deactivated at random

Jon Masamitsu jon.masamitsu at oracle.com
Fri Apr 15 06:26:45 UTC 2016

On 4/14/2016 2:38 PM, Kim Barrett wrote:
>> On Apr 14, 2016, at 3:46 PM, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
>> Kim,
>> http://cr.openjdk.java.net/~kbarrett/8133051/webrev.01/src/share/vm/gc/g1/concurrentG1Refine.cpp.frames.html
>> 296 static size_t calc_new_green_zone(size_t green,
>> 297 double update_rs_time,
>> 298 size_t update_rs_processed_buffers,
>> 299 double goal_ms) {
>> 300 // Adjust green zone based on whether we're meeting the time goal.
>> 301 // Limit to max_green_zone.
>> 302 const double inc_k = 1.1, dec_k = 0.9;
>> 303 if (update_rs_time > goal_ms) {
>> 304 if (green > 0) {
>> 305 green = static_cast<size_t>(green * dec_k);
>> 306 }
>> If you're not achieving the goal and green is 0, it stays
>> stuck at 0.  If update_rs_time is too long, don't we want
>> more concurrent refinement?
> The green_zone value is the target number of buffers pending at the
> start of the GC pause.  The assumption is that the yellow and red
> zones are configured so that we're (usually) close to that target.  We
> then adjust the green_zone value based on whether update_rs met its
> time goal.  There may be additional buffers added between pause start
> and update_rs (partial buffers from Java threads, remset entries
> enqueued as part of eager reclaim of humongous objects setup, ...).
> The current green_zone update calculation doesn't take into account
> how successful concurrent refinement was in achieving the green_zone
> target, and the impact that has on update_rs time.  It also only uses
> the sign bit of the error between actual vs goal update_rs time.  And
> it uses the error between the actual and the *next* goal time.  And
> the goal time is calculated using the last hot card cache update time
> rather than a predicted next hot card cache update time.  And so on.
> So yes, there are *many* factors the current green_zone update
> calculation isn't taking into account, some of which may be
> important.  For this change set, I'm not trying to address this issue.
> For the specific question of what happens when the green_zone value
> gets driven to (near) zero, yes, it could stay stuck at zero.  If it
> does so, the present yellow and red zone calculations (which are based
> on the green_zone value) will produce minimum values so that the
> refinement thread activity will be maximized, subject to actual buffer
> availability.  And if that isn't keeping up then mutator-invoked
> refinement will also get involved.  One of the things this change set
> does do is reduce wasted refinement thread activation when green_zone
> is small, where rounding errors could result in more threads being
> activated than there were buffers to process.  That's the purpose of
> the minimum threshold step / minimum yellow zone size.
> So in answer to your question
>    If update_rs_time is too long, don't we want more concurrent
>    refinement?
> Yes, and the current control laws result in that.

Since it sounds like there is more work to do in this area, I won't 
dwell on the
current implementation but thank you for the explanation.  I had not 
that minimal values for the yellow zone would produce larger concurrent
refinement activity.

>> 101 if (worker_i == 0) {
>> 102 // Potentially activate worker 0 more aggressively, to keep
>> 103 // available buffers near green_zone value. When yellow_size is
>> 104 // large we don't want to allow a full step to accumulate before
>> 105 // doing any processing, as that might lead to significantly more
>> 106 // than green_zone buffers to be processed by update_rs.
>> 107 step = MIN2(step, ParallelGCThreads / 2.0);
>> 108 }
>> Line 107 says the more GC threads I have, the bigger the step I need to
>> start the first one.  Is that right?
> One thing to remember is that ParallelGCThreads is the number of
> parallel worker threads available to do work during a pause, e.g. the
> pause-time update_rs phase has this many threads available to do the
> work.  This number is only very weakly connected to the number of
> concurrent refinement threads (the present default for the latter
> happens to be ParallelGCThreads, but that could change).
> The idea behind
> 107 step = MIN2(step, ParallelGCThreads / 2.0);
> is that if step is large, we want to more aggressively activate the
> so-called "primary" thread (which is responsible for keeping us close
> to the green_zone number of buffers when there aren't so many that
> other threads are needed) than we would if we based its activation
> only on the normal step value.  That way, once we get close to the
> green_zone value, we stay close.
> We want some hysteresis in the activation of the primary thread.  It's
> more expensive to have it wake up, process one buffer, and resume
> waiting N times than to wake up, process N buffers, and resume
> waiting, so we want to wait until there are some number N buffers over
> the green_zone available for processing before waking up the primary
> thread.
> But we don't want N to be so large that if we have approached that
> overage when a pause occurs, that it will push us significantly over
> the update_rs budget.  We base N on ParallelGCThreads because any
> overage will be distributed among the parallel worker threads at
> update_rs time.
> So long as the configuration is not so bad that green_zone is driven
> to zero and we still can't achieve the update_rs goal, the feedback
> loop for green_zone update effectively takes whatever overage we
> permit into account; larger permitted overage applies more back
> pressure on the long-term green_zone value.
> So a primary thread activation step based on a multiple of
> ParallelGCThreads applies a roughly constant (based on that
> multiplier) back pressure on the green_zone, while providing some
> desired hysteresis in the primary thread's activation.
> The specific factor of 1/2 was plucked out of the air as a seemingly
> plausible value.

I'll have to study this a little more but I kinda get the idea.

Your changes look fine.


More information about the hotspot-gc-dev mailing list