RFR (M): 8013895 G1: G1SummarizeRSetStats output on Linux needs improvement

Thomas Schatzl thomas.schatzl at oracle.com
Mon May 13 21:25:34 UTC 2013

Hi all,

  I'd like to ask you for reviews for this change.

It improves the output of G1SummarizeRSetStats on all platforms.

The previous output of G1SummarizeRSetStats has the following issues as shown in the example output from the CR:

 Concurrent RS processed -1761455529 cards 
  Of 9224262 completed buffers: 
      9224262 (100.0%) by conc RS threads. 
            0 ( 0.0%) by mutator threads. 
  Conc RS threads times(s) 
          0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 
  Total heap region rem set sizes = 1120947K. Max = 4047K. 
  Static structures = 600K, free_lists = 5312K. 
    717820015 occupied cards represented. 
    Max size region = 378:(O)[0x00000004fa000000,0x00000004fb000000,0x00000004fb000000], size = 4048K, occupied = 37303K. 
    Did 209966 coarsenings. 

"Concurrent RS processed" uses the wrong data type, overflowing easily.
"Conc RS threads times" are always zero except on Solaris
"Conc RS threads times" reports the young gen sampling thread as remembered set thread.
The data for everything except for the remembered set sizes (beginning
at "Total heap region..." is always cumulative, which is hard to

This change includes the following modifications:
- use a larger data type for the number of concurrent RS processed
- division by zeros when calculating percentages for the completed
buffers when the number of completed buffers is still zero
- previously printed the summary only for young gen gcs, now also
printing the summary at full gcs
- split out the young gen sampling thread from the conc RS threads times
- fixed the missing space between "times" and "(s)" in the same line
- changed slightly how the G1SummarizeRSetStats
G1SummarizeRSetStatsPeriod flags work. If G1SummarizeRSetStats is true,
always print a cumulative summary at VM exit; if
G1SummarizeRSetStatsPeriod is > 0 (default), print a cumulative summary
for the last period GCs; i.e. if G1SummarizeRSetStatsPeriod is 1, print
the summary for the most recent GC.
- fix gathering of thread times:
   - on linux supporting getrusage(RUSAGE_THREAD, ...), use this
information, otherwise fallback to the elapsed time as suggested in the
   - on Windows, use the GetThreadTimes() system API call, fallback to
elapsed time
   - on BSD, use the elapsed time counter as I do not know an API call.
(getrusage() does not seem to support RUSAGE_THREAD there)

Here is an example output for printing the summary for a different
application on a different machine:

 Cumulative RS summary

 Concurrent RS processed 1504863 cards
  Of 7194 completed buffers:
         7194 (100.0%) by concurrent RS threads.
            0 (  0.0%) by mutator threads.
  Concurrent RS threads times (s)
  Concurrent sampling threads times (s)
  Total heap region rem set sizes = 119K.  Max = 4K.
  Static structures = 7K, free_lists = 209K.
    23253 occupied cards represented.
    Max size region =
0:(O)[0x00000000c2200000,0x00000000c22fffe0,0x00000000c2300000], size =
5K, occupied = 5K.
    Did 0 coarsenings.

- the "Cumulative RS summary" header is only for the final printout at
the end of the GC
- separated out "Concurrent RS threads times" and "Concurrent sampling
threads times"

The remembered set information is always for the current situation and
never cumulative in any way as before. There does not seem to be a
useful way to make it cumulative...

Guide to read the patch
- I moved the remembered set summary code to a separate class
G1RemSetSummary, and added new files (g1RemSetSummary.?pp) for it.
- the main change has been about managing the periodicity, as G1 only
ever stores the accumulated data. For this reason, G1RemSet keeps a
local instance of G1RemSetSummary with the values of the last printout.
When it is time to create a new periodic output, a new temporary
G1RemSetSummary is created and filled with the current values from the
various sources, and the last values subtracted from it.
- the various os_* files contain the changes to return thread timing
- changes in G1ConcurrentRefine are mostly about some helper functions
to distinguish between actual worker and the RS sampling thread.
- g1CollectedHeap.cpp contains the change that potentially prints the
summary at every gc, not only at young gc
- the test case verifies how many times the remset summary is actually
printed for a few configurations of G1SummarizeRSetStats and




jprt, test case, manual testing of getrusage(RUSAGE_THREAD, ) and
fallback on Linux x64, manual testing of GetThreadTimes() on Windows 64
bit, no special testing on BSD/OS X 10.7 except noticing that
getrusage(RUSAGE_THREAD) is not available there.

Thanks in advance,

More information about the hotspot-gc-dev mailing list