RFR for bug JDK-8004807: java/util/Timer/Args.java failing intermittently in HS testing
martinrb at google.com
Wed Jun 4 19:25:58 UTC 2014
Tests for Timer are inherently timing (!) dependent.
It's reasonable for tests to assume that:
- reasonable events like creating a thread and executing a simple task
should complete in less than, say, 2500 ms.
- the system clock will not change by a significant amount (> 1 sec) during the
test. Yes, that means Timer tests are likely to fail during a daylight
saving time switchover; we can live with that. (We could even try to fix
that by detecting deviations between clock time and elapsed time, but
it's probably not worth it.)
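For what it's worth, detecting a deviation between clock time and elapsed time could look roughly like the sketch below. It compares progress of System.currentTimeMillis() against System.nanoTime() from a common starting point. The class name ClockJumpDetector, its methods, and the 1000 ms tolerance are made up for illustration and are not part of the test under review.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: flags a wall-clock adjustment that happened during a test. */
public class ClockJumpDetector {
    // Both clocks are sampled once, back to back, when the detector is created.
    private final long startWallMs = System.currentTimeMillis();
    private final long startNanos = System.nanoTime();

    /** Wall-clock progress minus monotonic progress, in milliseconds. */
    public long driftMillis() {
        long wallElapsedMs = System.currentTimeMillis() - startWallMs;
        long monoElapsedMs =
            TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
        return wallElapsedMs - monoElapsedMs;
    }

    /** True if the two clocks disagree by more than toleranceMs. */
    public boolean clockJumped(long toleranceMs) {
        return Math.abs(driftMillis()) > toleranceMs;
    }

    public static void main(String[] args) throws InterruptedException {
        ClockJumpDetector detector = new ClockJumpDetector();
        Thread.sleep(200); // stand-in for the body of a timing-sensitive test
        System.out.println("drift ms: " + detector.driftMillis()
            + ", jumped: " + detector.clockJumped(1000)); // tolerance is an assumption
    }
}
```

A test could create the detector first and, on failure, consult clockJumped(...) to decide whether the failure is attributable to a clock change rather than to Timer.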
Can you detect any real-world unreliability in my latest version of the
test, not counting daylight saving time switch?
I continue to resist your efforts to "fix" the test by removing chances for
the SUT code to go wrong.
On Tue, Jun 3, 2014 at 11:28 PM, Eric Wang <yiming.wang at oracle.com> wrote:
> Hi Martin,
> Thanks for the explanation. Now I understand why you set DELAY_MS to
> 100 seconds. It is true that this prevents failure on a slow host; however, I
> still have some concerns.
> Because the test schedules tasks at a time in the past, all
> 13 tasks should be executed immediately and finish within a short time.
> If the elapsed-time limit is set to 50 s (DELAY_MS/2), the
> timer has plenty of time to finish the tasks, so I wonder whether that test
> point is lost.
> Back to the original test: I think this is a test stabilization
> issue, because the original test assumes that the timer will be cancelled
> within 1 second, before the 14th task is called. This assumption may not
> hold, for 2 reasons:
> 1. The test may be executed in jtreg concurrent mode on a slow host.
> 2. The system clock of a virtual machine may not be accurate (it may run
> faster than the physical clock).
> To support this point, I changed the test as attached to print the
> execution times, to see whether the timer behaves as the API
> documentation describes. The result is as expected.
> The unrepeated task executed immediately: 
> The repeated task executed immediately and then repeated every 1 second:
> [1401855509337, 1401855510337, 1401855511338]
> The fixed-rate task executed immediately and caught up on the missed executions:
> [1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338,
> 1401855509338, 1401855509338, 1401855509338, 1401855509338, 1401855509338,
> 1401855509338, 1401855509836, 1401855510836]
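The catch-up behavior in the log above can be reproduced in isolation with a sketch like the following. It is not Args.java and not Eric's attached variant; the class name FixedRateCatchUp, the helper run(...), and the chosen period and timeout are assumptions for illustration. It schedules a fixed-rate task whose first execution time lies several periods in the past and records a timestamp per execution.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of Timer.scheduleAtFixedRate with a start time in the past. */
public class FixedRateCatchUp {

    /** Returns the execution timestamps recorded until the catch-up finished or timed out. */
    static List<Long> run(long periodMs, int missedPeriods, long timeoutMs)
            throws InterruptedException {
        // One execution for the start time itself plus one per missed period.
        int expected = missedPeriods + 1;
        List<Long> stamps = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(expected);
        Timer timer = new Timer(true); // daemon thread, so the JVM can exit
        Date past = new Date(System.currentTimeMillis() - periodMs * missedPeriods);
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() {
                stamps.add(System.currentTimeMillis());
                done.countDown();
            }
        }, past, periodMs);
        try {
            done.await(timeoutMs, TimeUnit.MILLISECONDS);
        } finally {
            timer.cancel();
        }
        synchronized (stamps) {
            return new ArrayList<>(stamps);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Period of 60 s, start 10 periods in the past, wait at most 10 s.
        List<Long> stamps = run(60_000, 10, 10_000);
        System.out.println(stamps.size() + " executions: " + stamps);
    }
}
```

With a 60 second period and a 10 second wait, the executions recorded here can only come from the catch-up, not from the regular schedule.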
> On 2014/6/4 9:16, Martin Buchholz wrote:
> On Tue, Jun 3, 2014 at 6:12 PM, Eric Wang <yiming.wang at oracle.com> wrote:
>> Hi Martin,
>> A sleep(1000) is not enough to reproduce the failure, because it is much
>> shorter than the period DELAY_MS (10*1000) of the repeated task created by
>> "scheduleAtFixedRate(t, counter(y3), past, DELAY_MS)".
>> Try sleep(DELAY_MS); then the failure can be reproduced.
> Well sure, then the task is rescheduled, so I expect it to fail in this
> case.
> But in my version I had set DELAY_MS to 100 seconds. The point of
> extending the DELAY_MS is to prevent flaky failure on a slow machine.
> Again, how do we know that this test hasn't found a Timer bug?
> I still can't reproduce it.