RFR (M): 8201172: Parallelize Remset Tracking Update Before Rebuild phase
sangheon.kim at oracle.com
Tue Apr 17 22:17:06 UTC 2018
On 04/11/2018 04:42 AM, Thomas Schatzl wrote:
> Hi Stefan,
> On Tue, 2018-04-10 at 14:29 +0200, Stefan Johansson wrote:
>> Hi Thomas,
>> On 2018-04-09 13:16, Thomas Schatzl wrote:
>>> Hi all,
>>> can I have reviews for this straightforward change that improves
>>> "Remset Tracking Update Before Rebuild" phase of the Remark pause?
>>> Performance of this phase has not been a big issue, with ~3000
>>> you get around 1ms of time taken, but now it's even faster and
>>> hopefully if run with 30k regions, there are no nasty suprises (or
>>> as nasty at least).
>>> Determining the number of threads to use has been done by doing a
>>> measurements and tests and look at which point there does not seem
>>> be any more scaling (i.e. the noise seems higher than the
>>> improvements), and the pause time much smaller than a millisecond.
>> Looks good, just a few minor things:
>> 1108 G1UpdateRemSetTrackingBeforeRebuild cl(_g1h, _cm, &_cl);
>> 1109 _g1h->heap_region_par_iterate_from_worker_offset(&cl,
>> &_hrclaimer, worker_id);
>> 1110 Atomic::add(cl.num_selected_for_rebuild(),
>> Using the same names for all closures and and having
>> num_selected_for_rebuild in both the task and the closure, makes
>> this code a bit hard to follow. I suggest calling the task member
>> _total_selected_for_rebuild and also have the getter name match the
>> member for the closure.
>> For the closures I'm fine with the members being named _cl, but it
>> would help if the instance above were name something else, like
> http://cr.openjdk.java.net/~tschatzl/8201172/webrev.0_to_1 (diff)
> http://cr.openjdk.java.net/~tschatzl/8201172/webrev.1 (full)
Webrev.1 looks good.
I only have minor nits.
1011 G1ConcurrentMark* _cm;
- _cm member seems not really necessary for
G1UpdateRemSetTrackingBeforeRebuildTask as it only passes to
G1UpdateRemSetTrackingBeforeRebuild and we can get it from G1CollectedHeap.
1013 uint _total_selected_for_rebuild;
> This change contains one difference that I actually forgot to fix in
> the last webrev - my notes tell me that the "best" value for
> RegionsPerThread is 384, not 512. I forgot to change this after fixing
> the ceil() operation in line 1174. I.e. originally I had
> align_size_up() there, but it requires a power-of-2 alignment value,
> that of course asserted when using 384. I fixed the align_size_up, but
> did not change the RegionsPerThread value. Sorry.
More information about the hotspot-gc-dev