RFR(M): 8133749, 8133747, 8133740: NMT detail stack trace cleanup

Dmitry Samersoff dmitry.samersoff at oracle.com
Thu Aug 18 16:25:05 UTC 2016

PS: (hit send too early, sorry!)

3. nativeCallStack.cpp

43 #if (defined(_NMT_NOINLINE_) && defined(BSD) && defined(_LP64))

Is it OS X specific behavior of clang specific one? Should we change
define to __APPLE__ or __clang__, not BSD ?


On 2016-08-02 23:31, Chris Plummer wrote:
> Hello,
> Please review the following:
> webrev:
> http://cr.openjdk.java.net/~cjplummer/8133749-8133747-8133740/webrev-01/webrev.hotspot/
> Bugs fixed:
> JDK-8133749: os::current_frame() is not returning the proper frame on
> ARM and solaris-x64
> https://bugs.openjdk.java.net/browse/JDK-8133749
> JDK-8133747: NMT includes an extra stack frame due to assumption NMT is
> making on tail calls being used
> https://bugs.openjdk.java.net/browse/JDK-8133747
> JDK-8133740: NMT for Linux/x86/x64 and bsd/x64 slowdebug builds includes
> NativeCallStack::NativeCallStack() frame in backtrace
> https://bugs.openjdk.java.net/browse/JDK-8133740
> The above bugs all result in the NMT detail stack traces including extra
> frames in the stack traces. Certain frames are suppose to be skipped,
> but sometimes are not. The frames that show up are:
>   NativeCallStack::NativeCallStack
>   os::get_native_stack
> These are both methods used to generate the stack trace, and therefore
> should not be included it.  However, under some (most) circumstances,
> they were.
> Also, there was no test to make sure that any NMT detail output is
> generated, or that it is correct. I've added one with this webrev. Of
> the 27 possible builds (9 platforms * 3 build flavors), only 9 of the 27
> initially passed this new test. They were the product and fastdebug
> builds for solaris-sparc, bsd-x64, and linux-x64; and the slowdebug
> builds for solaris-x64, windows-x86, and windows-x64. All the rest
> failed. They now all pass with my fixes in place.
> Here's a summary of the changes:
> src/os/posix/vm/os_posix.cpp
> src/os/windows/vm/os_windows.cpp
> JDK-8133747 fixes: There was some frame skipping logic here which was
> sort of correct, but was misplace. There are no extra frames being added
> in os::get_native_stack() due to lack of inlining or lack of a tail
> call, so no need for toSkip++ here. The logic has been moved to
> NativeCallStack::NativeCallStack, which is where the tail call is
> (sometimes) made, and also corrected (see nativeCallStack.cpp below).
> src/share/vm/utilities/nativeCallStack.cpp
> JDK-8133747 fixes: The frame skipping logic that was moved here assumed
> that NativeCallStack::NativeCallStack would not appear in the call stack
> (due to a tail call be using to call os::get_native_stack) except in
> slow debug builds. However, some platforms also don't use a tail call
> even when optimized. From what I can tell that is the case for 32-bit
> platforms and for windows.
> src/os_cpu/bsd_x86/vm/os_bsd_x86.cpp
> src/os_cpu/windows_x86/vm/os_windows_x86.cpp
> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
> JDK-8133740 fixes: When _get_previous_fp is not inlined, we need to skip
> one extra frame
> src/os_cpu/solaris_x86/vm/os_solaris_x86.cpp
> JDK-8133749 fixes: os:current_frame() was not consistent with other
> platforms and needs to skip one more frame. This means it returns the
> frame for the caller's caller. So when called by os:get_native_stack(),
> it returns the frame for whoever called os::get_native_stack(). Although
> not intuitive, this is what os:get_native_stack() expects. Probably a
> method rename and/or a behavior change is justified here, but I would
> prefer to do that with a followup CR if anyone has a good suggestion on
> what to do.
> test/runtime/NMT/CheckForProperDetailStackTrace.java
> This is the new NTM detail test. It checks for frames that shouldn't be
> present and validates at least one stack trace is what is expected.
> I verified that the above test now passes on all supported platforms,
> and also did a full jprt "-testset hotpot" run. I plan on doing some RBT
> testing with NMT detail enabled before committing.
> Regarding the community contributed ports that Oracle does not support,
> I didn't make any changes there, but it looks like some of these bugs do
> exist. Notably:
> -linux-aarch64: Looks like it suffers from JDK-8133740. The changes done
> to the
>  os_linux_x86.cp should also be applied here.
> -linux-ppc: Hard to say for sure since the implementation of
> os::current_frame is
>  different than others, but it looks to me like it suffers from both
> JDK-8133749
>  and JDK-8133740.
> -aix-ppc: Looks to be the same implementation as linux-ppc, so would
> need the
>  same changes.
> These ports may also be suffering from JDK-8133747, but that fix is in
> shared code (nativeCallStack.cpp). My changes there will need some
> tweaking for these ports they don't use a tail call to call
> os::get_native_stack().
> If the maintainers of these ports could send me some NMT detail output,
> I can advise better on what changes are needed. Then you can implement
> and test them, and then send them back to me and I'll include them with
> my changes. What I need is the following command run on product and
> slowdebug builds. Initially run without any of my changes applied. If
> needed I may followup with a request that they be run with the changes
> applied:
> bin/java -XX:+UnlockDiagnosticVMOptions -XX:NativeMemoryTracking=detail
> -XX:+PrintNMTStatistics -version
> thanks,
> Chris

Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.

More information about the hotspot-runtime-dev mailing list