8065585: Change ShouldNotReachHere() to never return
mikael.gerdin at oracle.com
Fri Apr 17 11:49:13 UTC 2015
On 2015-04-16 15:32, Stefan Karlsson wrote:
> On 2015-04-16 14:33, David Holmes wrote:
>> Hi Stefan,
>> trimming ...
>> On 16/04/2015 10:07 PM, Stefan Karlsson wrote:
>>> On 2015-04-16 04:23, David Holmes wrote:
>>>> Second, more important question: have you examined how this attribute
>>>> affects the ability to walk the stack? We have already seen issues on
>>>> some platforms where library functions, like abort(), have the
>>>> noreturn attribute and as a result the call is optimized in a way that
>>>> prevents the stack from being walked - see eg:
>>>> though this:
>>>> suggests that problem may have been addressed by the libc folk. But it
>>>> still raises the question as to how our own noreturn functions will be
>>>> handled and how they will affect stacktrace generation in hs_err logs
>>>> or via gdb.
>>> I added a call to fatal(...) in the GC code. I get correct stacktraces
>>> in gdb, but the stacktraces in the hs_err files are broken with
>>> fastdebug and product builds:
>> Which platforms?
> On Linux x86 and x86_64.
>>> Stack: [0x00007f12518d2000,0x00007f12519d3000], sp=0x00007f12519d0eb0,
>>> free space=1019k
>>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code,
>>> C=native code)
>>> V [libjvm.so+0x11db44a] VMError::report_and_die()+0x1ba
>>> V [libjvm.so+0x7efb80] report_vm_error(char const*, int, char const*,
>>> char const*)+0x90
>>> V [libjvm.so+0x7efc49] report_vm_error_noreturn(char const*, int, char
>>> const*, char const*)+0x9
>>> V [libjvm.so+0x7efc63]
>>> V [libjvm.so+0xfd7937]
>>> V [libjvm.so+0xfeec51]
>> So what is the plan: try to get hs_err working again? Or file this
>> under "well it seemed like a good idea"? ;-)
> I'm leaning towards "seemed like a good idea", unless someone has an
> easy fix for these problems.
I've been looking a bit at this. It's not the stack trace per se that is
broken, but the decoding of the function names is not working for some
of the callers of the noreturn functions.
I tried this with report_fatal using -XX:ErrorHandlerTest=5 and got the
following disassembly:
0x7fb71ccd98d0 <report_fatal>: push %rbp
0x7fb71ccd98d1 <report_fatal+1>: mov %rdx,%rcx
0x7fb71ccd98d4 <report_fatal+4>: lea 0x9b4b34(%rip),%rdx
0x7fb71ccd98db <report_fatal+11>: mov %rsp,%rbp
0x7fb71ccd98de <report_fatal+14>: callq 0x7fb71ccd98c0
0x7fb71ccd98e3: data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
So the report_fatal frame has ...98e3 as its return address, but that is
actually outside the function and this causes dladdr() to return NULL in
dli_saddr and dli_sname.
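
To make the off-by-one concrete, here is a minimal sketch of a range-based symbol lookup. It is a simplified model and not dladdr()'s actual implementation: the Symbol struct, resolve() and example_table() are hypothetical names, and the 0x13 byte size assumes the report_fatal symbol ends right after the callq shown above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified symbol-table entry: a name plus the half-open
// address range [start, start + size) that the symbol covers.
struct Symbol {
    std::string name;
    std::uint64_t start;
    std::uint64_t size;
};

// Returns the name of the symbol whose range contains pc, or an empty
// string when no symbol covers that address.
std::string resolve(const std::vector<Symbol>& table, std::uint64_t pc) {
    for (const Symbol& s : table) {
        if (pc >= s.start && pc - s.start < s.size) {
            return s.name;
        }
    }
    return "";
}

// Start address taken from the listing above. The size (0x13) is an
// assumption: the callq at ...98de is 5 bytes long, so the function is
// taken to end at ...98e3.
std::vector<Symbol> example_table() {
    return { {"report_fatal", 0x7fb71ccd98d0ULL, 0x13} };
}
```

In this model the address of the callq itself (...98de) resolves to report_fatal, while the return address (...98e3) equals start + size and therefore matches nothing.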
The JVM then attempts to decode using Decoder::decode but I wasn't able
to follow that code to understand why that fails.
The same appears to happen for the caller of report_fatal
(controlled_crash in my case), but I can't explain why dladdr
returns NULL values there.
After these two functions the rest of the stack trace appears to be
fine.
One approach could be to attempt to inject a "nop" at the end of
functions which call a "noreturn" function. This would hopefully make
the instruction after the call to the noreturn function part of the
caller and would make symbol decoding work.
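
The intended effect of the padding can be sketched with the same kind of range check. This is only an illustration under assumptions: covers() and the constants are hypothetical names, the as-built size assumes the symbol ends right after the callq, and the padded size assumes a one-byte nop that the toolchain counts as part of the function. It does not show how to get a compiler to emit such a nop.

```cpp
#include <cstdint>

// True when pc falls inside the half-open range [start, start + size).
bool covers(std::uint64_t start, std::uint64_t size, std::uint64_t pc) {
    return pc >= start && pc - start < size;
}

// Addresses taken from the report_fatal listing.
const std::uint64_t kStart = 0x7fb71ccd98d0ULL;       // function entry
const std::uint64_t kReturnAddr = 0x7fb71ccd98e3ULL;  // address after the callq

// Assumed symbol sizes, in bytes.
const std::uint64_t kSizeAsBuilt = 0x13;  // function ends with the callq
const std::uint64_t kSizeWithNop = 0x14;  // one-byte nop appended after it
```

With kSizeAsBuilt the return address falls just outside the function; with kSizeWithNop it lands on the nop and is attributed to the caller.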