RFR(S): JDK-8146697 : VM crashes in test Test7005594
frederic.parain at oracle.com
Fri Aug 12 13:07:31 UTC 2016
Thank you Coleen.
On 08/11/2016 04:39 PM, Coleen Phillimore wrote:
> Yes, I think this fix is good for robustness, given the difficulty and
> time needed to find the root cause.
> On 8/8/16 10:55 AM, Frederic Parain wrote:
>> Please review this small fix for JDK-8146697
>> Summary: The JVM sometimes tries to re-enable the Reserved Stack Area
>> while it is currently not disabled, leading to the following assertion
>> share/vm/runtime/thread.cpp:2551 assert(_stack_guard_state !=
>> stack_guard_enabled) failed: already enabled
>> This problem occurred while running different tests including tests
>> where stack overflows are unlikely. It is rare and very hard to
>> reproduce. At the beginning of the investigation, I've been able to
>> reproduce it three times out of 1,000+ runs of a metaspace stress test
>> (the fact that it is a metaspace test doesn't matter). But once I've
>> instrumented the JVM, the bug didn't show up again, even after 30,000+ runs.
>> So, I've investigated it with the limited material I had. The failures
>> always occurred on x86/32bits platforms.
>> Given that some failures occurred on tests where stack overflows are
>> unlikely (no recursive calls, small call stack), and that all failures
>> occurred in interpreted Java code, my guess is that the issue is in the
>> test performed on interpreted method exit to determine if the Reserved
>> Stack Area should be enabled or not.
>> The test on method exit compares the SP of the caller frame to an
>> activation SP address stored in the JavaThread object when the Reserved
>> Stack Area has been disabled. Without a reproducible test case, I've not
>> been able to find what the issue was between the two values (de-opt,
>> OSR, other?). So, I've slightly changed the test to make it more robust
>> against the situation causing the assertion failure. Now the test checks
>> the status of the guard pages, and if no guard pages have been disabled,
>> the method exits normally. This means there's always only one test on
>> interpreted method exit if the Reserved Stack Area has not been used, so
>> there is no performance difference in most cases. If this first test detects
>> that guard pages have been disabled, then the previous test (caller SP
>> vs activation SP) is performed, to determine if this is the place where
>> the Reserved Stack Area should be re-enabled or not.
>> Even if the root cause of the bug is still unknown, the fix should make
>> the code more robust and prevent unnecessary re-enabling of the Reserved
>> Stack Area.
>> Thank you,
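
The two-step exit check described in the quoted message can be sketched as a small C++ model. This is only an illustration under stated assumptions, not the HotSpot change itself: ThreadModel, GuardState, disable_reserved_area, enable_reserved_area and on_interpreted_method_exit are hypothetical names, and the direction of the SP comparison assumes a stack that grows toward lower addresses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the state discussed in the message.
// None of these names are taken from the HotSpot sources.
enum class GuardState { Enabled, ReservedDisabled };

struct ThreadModel {
    GuardState guard_state = GuardState::Enabled;
    // Only meaningful while the reserved area is disabled.
    std::uintptr_t activation_sp = 0;

    // Record the activation SP at the moment the reserved area is disabled.
    void disable_reserved_area(std::uintptr_t sp) {
        guard_state = GuardState::ReservedDisabled;
        activation_sp = sp;
    }

    void enable_reserved_area() {
        // Models the assertion quoted in the summary: re-enabling an
        // area that is already enabled is treated as a bug.
        assert(guard_state != GuardState::Enabled && "already enabled");
        guard_state = GuardState::Enabled;
    }
};

// Returns true if the reserved area was re-enabled on this method exit.
bool on_interpreted_method_exit(ThreadModel& t, std::uintptr_t caller_sp) {
    // New first test: if nothing has been disabled, exit normally.
    if (t.guard_state == GuardState::Enabled) {
        return false;
    }
    // Previous test, now only reached when the area is disabled:
    // caller SP vs. activation SP (comparison direction is an assumption).
    if (caller_sp <= t.activation_sp) {
        return false;
    }
    t.enable_reserved_area();
    return true;
}
```

In this model, calling on_interpreted_method_exit on a thread whose guard state is still Enabled returns immediately, so a stale or unexpected activation_sp value can no longer reach enable_reserved_area and trip the assertion.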