<div dir="ltr">Hi Matthias,<div><br></div><div>Looks good to me :)</div><div>Jc</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Jul 31, 2019 at 7:01 AM Baesken, Matthias <<a href="mailto:matthias.baesken@sap.com">matthias.baesken@sap.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
Hi upload works again, now with webrev :<br>
<br>
<a href="http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/" rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.2/</a><br>
<br>
Best regards, Matthias<br>
<br>
<br>
> -----Original Message-----<br>
> From: Baesken, Matthias<br>
> Sent: Mittwoch, 31. Juli 2019 14:05<br>
> To: 'David Holmes' <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>>; Jean Christophe Beyler<br>
> <<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a>><br>
> Cc: <a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a>; serviceability-dev <serviceability-<br>
> <a href="mailto:dev@openjdk.java.net" target="_blank">dev@openjdk.java.net</a>><br>
> Subject: RE: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on fast<br>
> Linux machines with Total safepoint time 0 ms<br>
> <br>
> Hello, here is a version following the latest proposal of JC .<br>
> <br>
> Unfortunately attached as patch, sorry for that - the uploads / pushes<br>
> currently do not work from here .<br>
> <br>
> Best regards, Matthias<br>
> <br>
> <br>
> > -----Original Message-----<br>
> > From: David Holmes <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>><br>
> > Sent: Mittwoch, 31. Juli 2019 05:04<br>
> > To: Jean Christophe Beyler <<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a>><br>
> > Cc: Baesken, Matthias <<a href="mailto:matthias.baesken@sap.com" target="_blank">matthias.baesken@sap.com</a>>; hotspot-<br>
> > <a href="mailto:dev@openjdk.java.net" target="_blank">dev@openjdk.java.net</a>; serviceability-dev <serviceability-<br>
> > <a href="mailto:dev@openjdk.java.net" target="_blank">dev@openjdk.java.net</a>><br>
> > Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java fails on<br>
> fast<br>
> > Linux machines with Total safepoint time 0 ms<br>
> ><br>
> > On 31/07/2019 9:08 am, Jean Christophe Beyler wrote:<br>
> > > FWIW, I would have done something like what David was suggesting, just<br>
> > > slightly tweaked:<br>
> > ><br>
> > > public static long executeThreadDumps() {<br>
> > >Â Â long value;<br>
> > >Â Â long initial_value = mbean.getTotalSafepointTime();<br>
> > >Â Â do {<br>
> > >Â Â Â Â Thread.getAllStackTraces();<br>
> > >Â Â Â Â value = mbean.getTotalSafepointTime();<br>
> > >Â Â } while (value == initial_value);<br>
> > >Â Â return value;<br>
> > > }<br>
> > ><br>
> > > This ensures that the value is a new value as opposed to the current<br>
> > > value and if something goes wrong, as David said, it will timeout; which<br>
> > > is ok.<br>
> ><br>
> > Works for me.<br>
> ><br>
> > > But I come back to not really understanding why we are doing this at<br>
> > > this point of relaxing (just get a new value of safepoint time).<br>
> > > Because, if we accept timeouts now as a failure here, then really the<br>
> > > whole test becomes:<br>
> > ><br>
> > > executeThreadDumps();<br>
> > > executeThreadDumps();<br>
> > ><br>
> > > Since the first call will return when value > 0 and the second call will<br>
> > > return when value2 > value (I still wonder why we want to ensure it<br>
> > > works twice...).<br>
> ><br>
> > The test is trying to sanity check that we are actually recording the<br>
> > time used by safepoints. So first check is that we can get a non-zero<br>
> > value; second check is we get a greater non-zero value. It's just a<br>
> > sanity test to try and catch if something gets unexpectedly broken in<br>
> > the time tracking code.<br>
> ><br>
> > > So both failures and even testing for it is kind of redundant, once you<br>
> > > have a do/while until a change?<br>
> ><br>
> > Yes - the problem with the tests that try to check internal VM behaviour<br>
> > is that we have no specified way to do something, in this case execute<br>
> > safepoints, that relates to internal VM behaviour, so we have to do<br>
> > something we know will currently work even if not specified to do so -<br>
> > e.g. dumping all thread stacks uses a global safepoint. The second<br>
> > problem is that the timer granularity is so coarse that we then have to<br>
> > guess how many times we need to do that something before seeing a<br>
> > change. To make the test robust we can keep doing stuff until we see a<br>
> > change and so the only way that will fail is if the overall timeout of<br>
> > the test kicks in. Or we can try and second guess how long it should<br>
> > take by introducing our own internal timeout - either directly or by<br>
> > limiting the number of loops in this case. That has its own problems and<br>
> > in general we have tried to reduce internal test timeouts (by removing<br>
> > them) and let overall timeouts take charge.<br>
> ><br>
> > No ideal solution. And this has already consumed way too much of<br>
> > everyone's time.<br>
> ><br>
> > Cheers,<br>
> > David<br>
> ><br>
> > > Thanks,<br>
> > > Jc<br>
> > ><br>
> > ><br>
> > > On Tue, Jul 30, 2019 at 2:35 PM David Holmes <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a><br>
> > > <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>>> wrote:<br>
> > ><br>
> > >Â Â Â On 30/07/2019 10:39 pm, Baesken, Matthias wrote:<br>
> > >   > Hi David,  "put that whole code (the while loop) in a helper<br>
> > >   method."  was JC's idea, and I like the idea .<br>
> > ><br>
> > >Â Â Â Regardless I think the way you are using NUM_THREAD_DUMPS is<br>
> really<br>
> > >Â Â Â confusing. As an all-caps static you'd expect it to be a constant.<br>
> > ><br>
> > >Â Â Â Thanks,<br>
> > >Â Â Â David<br>
> > ><br>
> > >Â Â Â > Let's see what others think .<br>
> > >Â Â Â ><br>
> > >Â Â Â >><br>
> > >Â Â Â >> Overall tests like this are not very useful, yet very fragile.<br>
> > >Â Â Â >><br>
> > >Â Â Â ><br>
> > >   > I am also fine with putting the test on the exclude list.<br>
> > >Â Â Â ><br>
> > >Â Â Â > Best regards, Matthias<br>
> > >Â Â Â ><br>
> > >Â Â Â ><br>
> > >Â Â Â >> -----Original Message-----<br>
> > >Â Â Â >> From: David Holmes <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>>><br>
> > >Â Â Â >> Sent: Dienstag, 30. Juli 2019 14:12<br>
> > >Â Â Â >> To: Baesken, Matthias <<a href="mailto:matthias.baesken@sap.com" target="_blank">matthias.baesken@sap.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:matthias.baesken@sap.com" target="_blank">matthias.baesken@sap.com</a>>>; Jean Christophe<br>
> > >Â Â Â >> Beyler <<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a> <mailto:<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a>>><br>
> > >Â Â Â >> Cc: <a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a><br>
> > >Â Â Â <mailto:<a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a>>; serviceability-dev<br>
> > >Â Â Â <serviceability-<br>
> > >Â Â Â >> <a href="mailto:dev@openjdk.java.net" target="_blank">dev@openjdk.java.net</a> <mailto:<a href="mailto:dev@openjdk.java.net" target="_blank">dev@openjdk.java.net</a>>><br>
> > >Â Â Â >> Subject: Re: RFR: [XS] 8228658: test GetTotalSafepointTime.java<br>
> > >Â Â Â fails on fast<br>
> > >Â Â Â >> Linux machines with Total safepoint time 0 ms<br>
> > >Â Â Â >><br>
> > >Â Â Â >> Hi Matthias,<br>
> > >Â Â Â >><br>
> > >Â Â Â >> On 30/07/2019 9:25 pm, Baesken, Matthias wrote:<br>
> > >   >>> Hello JC / David,  here is a second webrev :<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> <a href="http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/" rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.1/</a><br>
> > >Â Â Â >>><br>
> > >   >>> It moves  the thread dump execution into a method<br>
> > >   >>> executeThreadDumps(long)    , and also adds while loops<br>
> > >Â Â Â (but with a<br>
> > >   >>> limitation for the number of thread dumps, really don’t<br>
> > >   >>> want to cause timeouts etc.).   I removed a check for<br>
> > >Â Â Â >>> MAX_VALUE_FOR_PASSÂ Â because we cannot go over<br>
> > Long.MAX_VALUE .<br>
> > >Â Â Â >><br>
> > >Â Â Â >> I don't think executeThreadDumps is worth factoring out like out.<br>
> > >Â Â Â >><br>
> > >Â Â Â >> The handling of NUM_THREAD_DUMPS is a bit confusing. I'd rather<br>
> it<br>
> > >Â Â Â >> remains a constant 100, and then you set a simple loop iteration<br>
> > >Â Â Â count<br>
> > >Â Â Â >> limit. Further with the proposed code when you get here:<br>
> > >Â Â Â >><br>
> > >Â Â Â >>Â Â 85Â Â Â Â Â NUM_THREAD_DUMPS = NUM_THREAD_DUMPS * 2;<br>
> > >Â Â Â >><br>
> > >Â Â Â >> you don't even know what value you may be starting with.<br>
> > >Â Â Â >><br>
> > >Â Â Â >> But I was thinking of simply:<br>
> > >Â Â Â >><br>
> > >Â Â Â >> long value = 0;<br>
> > >Â Â Â >> do {<br>
> > >Â Â Â >>Â Â Â Â Thread.getAllStackTraces();<br>
> > >Â Â Â >>Â Â Â Â value = mbean.getTotalSafepointTime();<br>
> > >Â Â Â >> } while (value == 0);<br>
> > >Â Â Â >><br>
> > >Â Â Â >> We'd only hit a timeout if something is completely broken -<br>
> > >Â Â Â which is fine.<br>
> > >Â Â Â >><br>
> > >Â Â Â >> Overall tests like this are not very useful, yet very fragile.<br>
> > >Â Â Â >><br>
> > >Â Â Â >> Thanks,<br>
> > >Â Â Â >> David<br>
> > >Â Â Â >><br>
> > >   >>> Hope you like this version  better.<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Best regards, Matthias<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> *From:*Jean Christophe Beyler <<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:jcbeyler@google.com" target="_blank">jcbeyler@google.com</a>>><br>
> > >Â Â Â >>> *Sent:* Dienstag, 30. Juli 2019 05:39<br>
> > >Â Â Â >>> *To:* David Holmes <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>>><br>
> > >Â Â Â >>> *Cc:* Baesken, Matthias <<a href="mailto:matthias.baesken@sap.com" target="_blank">matthias.baesken@sap.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:matthias.baesken@sap.com" target="_blank">matthias.baesken@sap.com</a>>>;<br>
> > >Â Â Â >>> <a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a><br>
> > >Â Â Â <mailto:<a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a>>; serviceability-dev<br>
> > >Â Â Â >>> <<a href="mailto:serviceability-dev@openjdk.java.net" target="_blank">serviceability-dev@openjdk.java.net</a><br>
> > >Â Â Â <mailto:<a href="mailto:serviceability-dev@openjdk.java.net" target="_blank">serviceability-dev@openjdk.java.net</a>>><br>
> > >Â Â Â >>> *Subject:* Re: RFR: [XS] 8228658: test<br>
> > >Â Â Â GetTotalSafepointTime.java fails<br>
> > >Â Â Â >>> on fast Linux machines with Total safepoint time 0 ms<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Hi Matthias,<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> I wonder if you should not do what David is suggesting and then<br>
> > >Â Â Â put that<br>
> > >Â Â Â >>> whole code (the while loop) in a helper method. Below you have a<br>
> > >Â Â Â >>> calculation again using value2 (which I wonder what the added<br>
> > >Â Â Â value of<br>
> > >Â Â Â >>> it is though) but anyway, that value2 could also be 0 at some<br>
> > >Â Â Â point, no?<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> So would it not be best to just refactor the getAllStackTraces and<br>
> > >Â Â Â >>> calculate safepoint time in a helper method for both value / value2<br>
> > >Â Â Â >>> variables?<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Thanks,<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Jc<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> On Mon, Jul 29, 2019 at 7:50 PM David Holmes<br>
> > >Â Â Â <<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a> <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>><br>
> > >Â Â Â >>> <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a><br>
> > >Â Â Â <mailto:<a href="mailto:david.holmes@oracle.com" target="_blank">david.holmes@oracle.com</a>>>> wrote:<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â Hi Matthias,<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â On 29/07/2019 8:20 pm, Baesken, Matthias wrote:<br>
> > >Â Â Â >>>Â Â Â Â > Hello , please review this small test fix .<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > The test<br>
> > >Â Â Â >>><br>
> > >Â Â Â >><br>
> ><br>
> test/jdk/sun/management/HotspotRuntimeMBean/GetTotalSafepointTime.<br>
> > >Â Â Â >> java<br>
> > >Â Â Â >>>Â Â Â fails sometimes on fast Linux machines with this error<br>
> > >Â Â Â message :<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > java.lang.RuntimeException: Total safepoint time<br>
> > >Â Â Â illegal value: 0<br>
> > >Â Â Â >>>Â Â Â ms (MIN = 1; MAX = 9223372036854775807)<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > looks like the total safepoint time is too low<br>
> > >Â Â Â currently on these<br>
> > >Â Â Â >>>Â Â Â machines, it is < 1 ms.<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > There might be several ways to handle this :<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >   >>>    >  *  Change the test in a way that it might generate<br>
> > >Â Â Â nigher<br>
> > >Â Â Â >>>Â Â Â safepoint times<br>
> > >   >>>    >  *  Allow safepoint time == 0 ms<br>
> > >Â Â Â >>>Â Â Â Â >Â Â *Â Â Offer an additional interface that gives<br>
> > >Â Â Â safepoint times<br>
> > >Â Â Â >>>Â Â Â with finer granularity ( currently the HS has safepoint<br>
> > >Â Â Â time values<br>
> > >   >>>   in ns , see jdk/src/hotspot/share/runtime/safepoint.cpp<br>
> > >Â Â Â >>>Â Â Â Â Â SafepointTracing::end<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > But it is converted on ms in this code<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > 114jlong RuntimeService::safepoint_time_ms() {<br>
> > >Â Â Â >>>Â Â Â Â > 115Â return UsePerfData ?<br>
> > >Â Â Â >>>Â Â Â Â > 116<br>
> > >Â Â Â >>><br>
> > >Â Â Â Management::ticks_to_ms(_safepoint_time_ticks->get_value()) : -1;<br>
> > >Â Â Â >>>Â Â Â Â > 117}<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > 064jlong Management::ticks_to_ms(jlong ticks) {<br>
> > >Â Â Â >>>Â Â Â Â > 2065Â assert(os::elapsed_frequency() > 0, "Must be<br>
> > >Â Â Â non-zero");<br>
> > >Â Â Â >>>Â Â Â Â > 2066Â return (jlong)(((double)ticks /<br>
> > >Â Â Â >>>Â Â Â (double)os::elapsed_frequency())<br>
> > >Â Â Â >>>Â Â Â Â > 2067Â Â Â Â Â Â Â Â Â * (double)1000.0);<br>
> > >Â Â Â >>>Â Â Â Â > 2068}<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >   >>>    > Currently I go for the first attempt (and try to generate<br>
> > >Â Â Â >>>Â Â Â higher safepoint times in my patch) .<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â Yes that's probably best. Coarse-grained timing on very<br>
> > >Â Â Â fast machines<br>
> > >Â Â Â >>>Â Â Â was bound to eventually lead to problems.<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â But perhaps a more future-proof approach is to just add a<br>
> > >Â Â Â do-while loop<br>
> > >Â Â Â >>>Â Â Â around the stack dumps and only exit when we have a non-zero<br>
> > >Â Â Â >> safepoint<br>
> > >Â Â Â >>>Â Â Â time?<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â Thanks,<br>
> > >Â Â Â >>>Â Â Â David<br>
> > >Â Â Â >>>Â Â Â -----<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>>Â Â Â Â > Bug/webrev :<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > <a href="https://bugs.openjdk.java.net/browse/JDK-8228658" rel="noreferrer" target="_blank">https://bugs.openjdk.java.net/browse/JDK-8228658</a><br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > <a href="http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/" rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~mbaesken/webrevs/8228658.0/</a><br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>>Â Â Â Â > Thanks, Matthias<br>
> > >Â Â Â >>>Â Â Â Â ><br>
> > >Â Â Â >>><br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> --<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Thanks,<br>
> > >Â Â Â >>><br>
> > >Â Â Â >>> Jc<br>
> > >Â Â Â >>><br>
> > ><br>
> > ><br>
> > ><br>
> > > --<br>
> > ><br>
> > > Thanks,<br>
> > > Jc<br>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div>Thanks,<div>Jc</div></div></div>