RFR(s) (JDK10): 8166944: Hanging Error Reporting steps may lead to torn error logs.
thomas.stuefe at gmail.com
Fri Feb 3 12:46:41 UTC 2017
Thanks again for your review, and sorry for forgetting parts of your input.
Will prepare a new webrev and incorporate yours and Davids input.
Kind Regards, Thomas
On Fri, Feb 3, 2017 at 12:11 AM, Chris Plummer <chris.plummer at oracle.com>
> Hi Thomas,
> Mostly looks good. A few minor comments:
> Most files need 2017 copyright updates.
> set_to_now() and get_timestamp() should have names that tie them together
> better. Maybe set_step_start_time() and get_step_start_time(). Also, they
> should probably be methods of VMError and the statics should all be fields.
> In the first review I had mentioned that you only are allowing for one
> step to hang, and still be able to continue with subsequent steps:
> 1509 // A step times out after half of the total timeout. Steps are
> mostly fast unless they
> 1510 // hang for some reason, so this simple rule allows for one
> hanging step and still
> 1511 // leaves time enough for the rest of the steps to finish.
> 1512 const jlong end = step_start_time_l + ErrorLogTimeout * 1000 / 2;
> You mentioned that for your SAP version you divide by 4, allowing for up
> to 3 steps to hang. Is there a reason not to do this for the open version?
> In fact it looks like TestUnresponsiveErrorHandler is relying on that being
> the case, since it will hang 3 times. Related to this,
> TimeoutInErrorHandlingTest.java is only checking for one instance of step
> timeout pattern. Probably it should check for 3 instances, and also one
> instance of the overall timeout pattern:
> 1272 st->print_cr("------ Timout during error reporting after "
> INT64_FORMAT "ms. ------",
> On 2/2/17 3:59 AM, Thomas Stüfe wrote:
>> Dear all,
>> please review (or re-review) this enhancement. It was already reviewed
>> extensively for JDK9 in Oct 16, but pushed back to JDK10. Now that JDK10
>> repo is open, I would like to close this issue.
>> In short, this enhancement lets the hotspot cancel hanging error
>> reporting steps (when writing the hs-err file), in order to allow
>> subsequent error reporting steps to run. In our JVM variant, this has been
>> active for a long time and has proven to make error reporting more stable
>> in the face of deadlocks or file system stalls.
>> Error-Reporting/jdk10-webrev.02/webrev/ <http://cr.openjdk.java.net/%7
>> Old Diskussion for JDK9:
>> The webrev is unchanged from version 2 discussed in october, just got
>> rebased atop of JDK10/hs.
>> Thank you,
>> Kind Regards, Thomas
More information about the hotspot-runtime-dev