RFR (M): JDK-8068394: Trace event for concurrent GC phases

sangheon.kim sangheon.kim at oracle.com
Tue Nov 24 18:48:46 UTC 2015


Hi Bengt,

Thank you so much for reviewing this patch!

On 11/24/2015 05:12 AM, Bengt Rutisson wrote:
>
> Hi Sangheon,
>
> Sorry for the very late reply to this review request.
>
> On 2015-10-22 01:40, sangheon.kim wrote:
>> Hi all,
>>
>> Can I get some reviews for this change of adding a trace event for 
>> concurrent phases on CMS and G1?
>>
>> Currently we only measure pause times for major phases.
>> But I want to add 'concurrent mark' and 'concurrent sweep' phases for 
>> CMS and 'concurrent mark' phase for G1.
>> To achieve this, I had to change ConcurrentGCTimer and related classes.
>>
>> This patch includes:
>> 1) A concurrent phase can be measured only from ConcurrentGCTimer and 
>> measuring an overlap between concurrent and pause is not allowed as 
>> currently we don't have that use case. And TimePartitions class(or 
>> related classes) will be simpler than an overlap allowed case.
>> 2) I removed PausePhase and ConcurrentPhase which are derived from 
>> GCPhase because I wanted to avoid heap allocation when adding to 
>> GrowableArray. Instead introduced 'type' member variable at GCPhase.
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8068394
>> Webrev: http://cr.openjdk.java.net/~sangheki/8068394/webrev.00/
>
> Nice work! It is great to get some timing information for the 
> concurrent phases.
Thanks!

>
> A few questions/comments:
>
> CMS.
>
> You have added timing events for the concurrent phases mark and sweep, 
> but not for other concurrent phases (preclean, abortable_preclean and 
> reset_concurrent). I think that if you moved your calls to 
> _gc_timer_cm->register_gc_concurrent_start() and 
> _gc_timer_cm->register_gc_concurrent_end() into the constructor and 
> destructor of CMSPhaseAccounting you would automatically get timing 
> events for all concurrent phases.
I also considered about this idea but I was not clear whether it is good 
measurement especially for 'sweep'.
We are using CMSPhaseAccounting for 5 cases.
I think for 'mark' and 'preclean' they are okay. (there are non-product 
functions call before CMSPhaseAccounting. But they are okay).
For 'abortable-preclean' and 'reset', they are good to have that.
But when it comes to 'sweep', there are timer and CMSTokenSyncWithLocks 
related codes and I was not sure about these.
If you think they are also okay, I will change as you suggested.

>
>
> G1.
>
> I think the use of _concurrent_marking_from_roots is unfortunate. It 
> would be much cleaner if 
> ConcurrentMark::register_mark_from_roots_phase_end() called 
> register_gc_concurrent_end() directly. 
I wanted to avoid introducing the new flag but the reason of it was, as 
you know, to handle 'abort'.
When 'abort' happens we do need to end all timers before calling 
'GCTracer::report_gc_end()'.
And at this point we need a way to know whether concurrent timer is 
started or not.

> I realize that this would change the timing for when a concurrent 
> marking is aborted. The whole time for a full GC (or even several full 
> GCs) would be included in the concurrent marking phase. But from a 
> code perspective that is what happens, so maybe that is the correct 
> time to report? Also, I think the logging is reported that way so if 
> we want to make it easy to match the timing events with the logging we 
> might want to use about the same scope for timing.
>
> Why do we only measure the concurrent mark phase for G1? 
This is good question. :)
I was comparing CMS and G1 for major concurrent phases.
And the candidates were 'concurrent mark/sweep (CMS)' and 'concurrent 
mark(G1)'.

> Wouldn't it be good to measure all concurrent phases?
Okay, Stefan Karlsson also pointed to have them as well so I filed a 
separate CR for this.
https://bugs.openjdk.java.net/browse/JDK-8143951

>
> gcTraceSend.cpp
>
>  389   void visit(GCPhase* phase) {
>  390     switch (phase->type()) {
>  391       case GCPhase::PausePhaseType:
>  392         assert(phase->level() < PhasesStack::PHASE_LEVELS, "Need 
> more event types for PausePhase");
>  393
>  394         switch (phase->level()) {
>  395           case 0: send_phase<EventGCPhasePause>(phase); break;
>  396           case 1: send_phase<EventGCPhasePauseLevel1>(phase); break;
>  397           case 2: send_phase<EventGCPhasePauseLevel2>(phase); break;
>  398           case 3: send_phase<EventGCPhasePauseLevel3>(phase); break;
>  399           default: /* Ignore sending this phase */ break;
>  400         }
>  401         break;
>  402
>  403       case GCPhase::ConcurrentPhaseType:
>  404         assert(phase->level() < 1, "There's only one level for 
> ConcurrentPhase");
>  405
>  406         switch (phase->level()) {
>  407           case 0: send_phase<EventGCPhaseConcurrent>(phase); break;
>  408           default: /* Ignore sending this phase */ break;
>  409         }
>  410         break;
>  411     }
>  412   }
>  413 };
>
> Since there are only two values for GCPhase::PhaseType it seems a bit 
> odd to use a switch statement. I think I would prefer to factor out 
> the code for the different types a bit too. So, maybe something like:
>
> void visit(GCPhase* phase) {
>   if (phase->type() == GCPhase::PausePhaseType) {
>     visit_pause(phase);
>   } else {
>     assert(phase->type() == GCPhase::ConcurrentPhaseType, "");
>     visit_concurrent(phase);
>   }
> }
This seems better. I will fix.

Thanks,
Sangheon


>
> Thanks,
> Bengt
>
>> Testing: JPRT
>>
>> Thanks,
>> Sangheon
>



More information about the hotspot-gc-dev mailing list