Timing Differences Between JFR and GC Logs

Ana Marsh t-maana at microsoft.com
Fri Aug 14 22:09:23 UTC 2020


Hi everyone,

My name is Ana Marsh and I am interning at Microsoft on the Java Engineering Team (under Martijn Verburg) and this summer I have been diving into JFR in the JVM and analyzing how it gathers information on garbage collection events. I have noticed over the past two months of digging that the start time of a garbage collection event returned by JFR varies widely from the start time of the same event given with Unified Logging. I decided to explore this further by looking at the specifics of OpenJDK code and found some interesting points to highlight.


  1.  Logging decides to log the time of events as it prints and not necessarily as the events happen. That being said, it appears that JFR collects the time directly at the beginning of an event making it appear as a better marker of the start time of an event.


  1.  Summary:
Garbage collection timing is measured in Ticks which holds values from two different clocks when using an x86 machine. These clocks are called "os"  (ElapsedCounterSource) and "rdtsc" (FastUnorderedElapsedCounterSource) and one value of Ticks will hold a relevant time for each clock. When a GC event begins (specifically Parallel GC because that is what I have been looking at) it registers the start in Ticks by calling now(). This is what now() looks like:
jdk/src/hotspot/share/utilities/ticks.cpp
CompositeElapsedCounterSource::Type CompositeElapsedCounterSource::now() {
  CompositeTime ct;
  ct.val1 = ElapsedCounterSource::now();
#if defined(X86) && !defined(ZERO)
  static bool initialized = false;
  static bool valid_rdtsc = false;
  if (!initialized) {
    valid_rdtsc = Rdtsc::initialize();
    initialized = true;
  }
  if (valid_rdtsc) {
    ct.val2 = Rdtsc::elapsed_counter();
  }
#endif
  return ct;
}

ElapsedCounterSource::Type ElapsedCounterSource::now() {
  return os::elapsed_counter();
}

For JFR files, the start time is set to the RDTSC value while in GC logs it uses the OS clock as it prints out the log. This is obviously a further point of concern for timing differences, so I wanted to see how much of a contrast there was in the times these clocks reported.

Experimentation with Results:
To find the differences between the two I added this to the code:
jdk/src/hotspot/share/utilities/ticks.cpp
CompositeElapsedCounterSource::Type CompositeElapsedCounterSource::now() {
  CompositeTime ct;
  ct.val1 = ElapsedCounterSource::now();
#if defined(X86) && !defined(ZERO)
  static bool initialized = false;
  static bool valid_rdtsc = false;
  if (!initialized) {
    valid_rdtsc = Rdtsc::initialize();
    initialized = true;
  }
  if (valid_rdtsc) {
    ct.val2 = Rdtsc::elapsed_counter();
  }
#endif
  std::cout << "I got ticks!!\n";
  std::cout << "OS time:      ";
  std::cout << ElapsedCounterSource::seconds(ct.val1) << " seconds\n";
  std::cout<< "RDTSC time:   ";
  std::cout << FastUnorderedElapsedCounterSource::seconds(ct.val2) <<  " seconds\n\n";
  return ct;
}

I ran this with a test application that forces garbage collection events and I also had a JFR file and GC log being recorded during the run as well. For reference I am running on a Windows 10 machine but using WSL with Ubuntu.

Here are a few outputs from the beginning of the application:
I got ticks!!
OS time:      2.0739 seconds
RDTSC time:   1.16455 seconds

I got ticks!!
OS time:      2.07512 seconds
RDTSC time:   1.16591 seconds

I got ticks!!
OS time:      2.07549 seconds
RDTSC time:   1.16631 seconds

Here are a few outputs from ~40 seconds into the application:
I got ticks!!
OS time:      43.0537 seconds
RDTSC time:   46.717 seconds

I got ticks!!
OS time:      43.0562 seconds
RDTSC time:   46.7198 seconds

I got ticks!!
OS time:      43.0562 seconds
RDTSC time:   46.7198 seconds

Here are a few outputs from ~200 seconds into the application:
I got ticks!!
OS time:      191.654 seconds
RDTSC time:   211.898 seconds

I got ticks!!
OS time:      191.654 seconds
RDTSC time:   211.898 seconds

I got ticks!!
OS time:      191.654 seconds
RDTSC time:   211.898 seconds


Observations on Results:
Both ElapsedCounterSource (OS) and FastUnorderedElapsedCounterSource (RDTSC) have their own way of determining seconds so there could be some issues in there as well. Additionally in this example the clocks are not being set instantaneously causing some numbers to be off, however at the beginning of my test the OS clock ends up reporting a later time even though it is called first. Over time though RDTSC seems to begin reporting later times. Overall these clocks get far very off, in degrees of seconds which seems big in the clock world.

Any insight or thoughts on this matter would be helpful. Thank you!

Best,
Ana Marsh



More information about the hotspot-gc-dev mailing list