RFR: 8133051: Concurrent refinement threads may be activated and deactivated at random

Kim Barrett kim.barrett at oracle.com
Thu Apr 14 02:54:47 UTC 2016

> On Apr 13, 2016, at 7:46 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:

Thomas, thanks for looking at this.

> - (pre-existing) in the options that were changed, maybe spell out "rem
> set" and "rset" for better understanding.

I *think* what you are referring to here is the update_rs_xxx
parameters?  That abbreviation seems to be fairly common usage in the
code base, occurring about 30 times, while update_rset occurs fewer
than 10.  In fact, the phase time tag is "G1GCPhaseTimes::UpdateRS".
I agree something other than "update_rs" might be clearer, but would
rather see any such change being done as a separate nomenclature

> - this is a change of existing behavior: the values for the global
> flags used are never written back, not even after initialization. It
> would be nice to do so. That is kind of useful without using adaptive
> concurrent refinement. The logs won't give you that information in this
> case.

I don't think writing back the values into the global options is
useful, and in fact think it would be (and was, for me) confusing,
since the "real" values are always in the ConcurrentG1Refine object's
data members.  The usage model here has always been that the global
options were input parameters used to determine the initial values
that would actually be used.  Not modifying the global options makes
that more apparent (at least to me).

> Anyway, a log message printing the initial values would be really nice
> regardless of whether adaptiveness is turned on.

I think the global option values can be obtained as part of
-XX:+PrintFlagsFinal.  I don't think it's worth duplicating that
information.  The initial zone values that are calculated from the
(defaulted) CLA's *are* logged.

However, the computed minimum yellow zone size was not being logged
but should be.  Fixing that ended up making the two uses of
log_refinement_zones different, so that helper has been dropped.

> - is there a reason why this change needs to particularly put the
> Thresholds struct into a namespace? Please remove the namespace usage
> declaration.

I think we should be using anonymous namespaces to limit the scope of
file-local utility functions.  But I don't think this review is the
place for that discussion, so I'm changing Thresholds to be a typedef
for Pair<size_t, size_t>.

> - the log message prints the thresholds, not the zones (ranges). This
> might lead to misunderstandings.
> The same with variable and parameter names. (this is mostly pre 
> -existing) 

I think the nomenclature in the log messages matches the nomenclature
and semantics for the corresponding command line options.  The term
"threshold" as used here refers to (de)activation buffer count values
for the threads.

I agree the chosen terminology is somewhat confusing (especially that
green and yellow refer to ranges below the value, while red refers to
the range above the value), but I don't want to mess with that as part
of this change.  Especially since future improvements to the control
of the threads might render some of this obsolete.

> - the defines should not use local variables with leading underscore.
> While there are not particular rules for it, variables with leading
> underscore are reserved for members.
> It initially got me confused with these defines being used in the
> context of some object, and referencing that object's members.

I was trying to make them stand out and be obviously local to these
macros, but I conceed the point about underscores and member names.
The ugly prefixes based on the macro name are sufficient anyway.
Leading underscores removed.

> Please also consider #undef'ing these local defines.

Why?  They are intended to be used anywhere in the file.  Really, they
ought to be file-scoped functions, but I was recently reminded that
such assert functions interract poorly with debugging Windows failures.

> - in concurrentG1Refine.cpp:154, the assignment of "size = green * 2"
> seems quite random, just as the previous factor of 3.
> Wouldn't it be better to just use min_size, or do nothing here because
> the following value clamping will fix it up anyway?

The old computation was (3 * green), the new is (2 * green) + green.
That is, I'm retaining the old computation (when the CLA is
defaulted), but doing it in a different way.  And then applying the
new range clamping.  Twice green can certainly be larger than
min_size.  [Note that I'm not really defending some of these
calculations; I expect some of them to change substantially as part of
addressing JDK-8137022, and for now am trying to minimize those sorts
of changes.]

> - the "update_rs_processed_buffers" parameter in various methods should
> imho be a size_t, not a double. There does not seem to be a reason to
> use a double. (pre-existing)

I was annoyed / confused by the use of double here, but had assumed it
was to match the value from the caller.  Now that I've actually gone
and looked, you are absolutely right, this should be size_t all the
way through, to match the type actually provided by the phase_time.
Thanks for noticing that.

> - I kind of dislike even requiring an unused parameter for a method in
> static methods like in calc_new_yellow/red_zone, but I kind of see the
> point for consistency's sake. Let's see what others think.

Those unused parameters are to some extent a failure on my part to pay
attention to the TODO notes (see below).  Since these are completely
internal and not exposed API functions, future changes to their
parameters are easy, so I'm removing the unused parameters, to be
reinstated later if/when they are actually needed.

> In any case, I would prefer if the TODO's were removed. Most likely
> they are related to improvements you have in your queue anyway,
> otherwise I would prefer to store TODO's in the bug tracker.

The TODO's were intended to serve two purposes.  They remind reviewers
that the existing code is something of a placeholder, merely
attempting to retain something like the pre-existing calculations.
They also reminded me to not mess with these calculations as part of
this change set, even though I think some of them are rather strange,
because doing so will make this change set bigger (possibly much
bigger) and make the review process harder.  I'll remove them now.

> - for consistency: all methods but calc_new_*_zone() parameters that
> contain the thresholds have the _zone suffix, but not those:

I *think* I was consistent in dropping the _zone suffix when in
functions whose name makes it clear they are about zones, such as
calc_{new,init}_*_zone.  Counter-examples?

> - concurrentG1Refine.cpp:174ff: indentation of the parameter list does
> not conform to any other in the same file.

I suspect that was because the line width was getting excessive during
development of these changes.  It looks like some later changes have
made that not such a problem, so back to the more typical style.

> - the * 2 in the calc_new_yellow_zone() seems to be the same factor as
> in calc_init_yellow_zone(). Unless this is changed soon, I would prefer
> if it were factored out somehow.

I intend to completely redo calc_new_yellow_zone in the near future,
so I'd rather not spend time on something like that.

> - concurrentRefine.cpp:360+362: instead of the C-style cast, maybe a
> static_cast should be used here too.

I intend to completely redo the chunk of code related to updating the
dcqs in the near future, so tried to avoid making unnecessary changes,
such as touching line 362.

> - in addition to the actual new thresholds, it might be interesting to
> get information about the values passed to G1ConcurrentRefine::adjust()
> in a trace message.

Agreed.  Added.

> Looks good overall though.

In addition to changes to address your comments, I'm also addressing
Jon's log augmenting suggestion, and fixing these:

 324   return yellow + (yellow - green);

Missing MIN2 with max_red_zone.

 300     if (green > 0) {
 301       green = MIN2(static_cast<size_t>(green * dec_k), green - 1);
 302     }

The MIN2 is unnecessary here; this can be just

  green = static_cast<size_t>(green * dec_k);

(which matches the pre-change code).


New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8133051/webrev.01/
incr: http://cr.openjdk.java.net/~kbarrett/8133051/webrev.01.inc/

Local testing looks good.
Currently running a Nightly G1 batch.

More information about the hotspot-gc-dev mailing list