RFR (M): 8201172: Parallelize Remset Tracking Update Before Rebuild phase

Thomas Schatzl thomas.schatzl at oracle.com
Wed Apr 11 11:42:59 UTC 2018


Hi Stefan,

On Tue, 2018-04-10 at 14:29 +0200, Stefan Johansson wrote:
> Hi Thomas,
> 
> On 2018-04-09 13:16, Thomas Schatzl wrote:
> > Hi all,
> > 
> >    can I have reviews for this straightforward change that improves
> > the
> > "Remset Tracking Update Before Rebuild" phase of the Remark pause?
> > 
> > Performance of this phase has not been a big issue, with ~3000
> > regions
> > you get around 1ms of time taken, but now it's even faster and
> > hopefully if run with 30k regions, there are no nasty suprises (or
> > not
> > as nasty at least).
> > 
> > Determining the number of threads to use has been done by doing a
> > few
> > measurements and tests and look at which point there does not seem
> > to
> > be any more scaling (i.e. the noise seems higher than the
> > improvements), and the pause time much smaller than a millisecond.
> > 
> > CR:
> > https://bugs.openjdk.java.net/browse/JDK-8201172
> > Webrev:
> > http://cr.openjdk.java.net/~tschatzl/8201172/webrev/
> 
> Looks good, just a few minor things:
> src/hotspot/share/gc/g1/g1ConcurrentMark.cpp
> 1108     G1UpdateRemSetTrackingBeforeRebuild cl(_g1h, _cm, &_cl);
> 1109     _g1h->heap_region_par_iterate_from_worker_offset(&cl, 
> &_hrclaimer, worker_id);
> 1110     Atomic::add(cl.num_selected_for_rebuild(), 
> &_num_selected_for_rebuild);
> 
> Using the same names for all closures and and having 
> num_selected_for_rebuild in both the task and the closure, makes
> this code a bit hard to follow. I suggest calling the task member 
> _total_selected_for_rebuild and also have the getter name match the 
> member for the closure.
> 
> For the closures I'm fine with the members being named _cl, but it
> would help if the instance above were name something else, like
> rebuild.

Done.

http://cr.openjdk.java.net/~tschatzl/8201172/webrev.0_to_1 (diff)
http://cr.openjdk.java.net/~tschatzl/8201172/webrev.1 (full)

This change contains one difference that I actually forgot to fix in
the last webrev - my notes tell me that the "best" value for
RegionsPerThread is 384, not 512. I forgot to change this after fixing
the ceil() operation in line 1174. I.e. originally I had
align_size_up() there, but it requires a power-of-2 alignment value,
that of course asserted when using 384. I fixed the align_size_up, but
did not change the RegionsPerThread value. Sorry.

Thanks,
  Thomas


More information about the hotspot-gc-dev mailing list