RFR(10): 8172020: Internal Error (cpu/arm/vm/frame_arm.cpp:571): assert(obj == __null || Universe::heap()->is_in(obj)) failed: sanity check #
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Tue Feb 21 23:29:45 UTC 2017
It looks good to me.
But let me mediate on the fix a little bit more.
Consider it reviewed if I'll not come back in 24 hours. :)
Thank you a lot for this great investigation and fix!
On 2/21/17 11:31, Chris Plummer wrote:
> Please review the following:
> [Note there are s390, ppc, and aarch64 changes in this bug fix that I
> will need to have someone verify for me or I won't be pushing them.]
> This bug turns up with a closed test so the CR is marked confidential.
> I'll try to provided the necessary details here. Sorry the explanation
> is lengthy, but it is a complex bug (with a relatively simple fix).
> If you are on the s390, ppc, or aarch64 teams and just want to help
> verify the fix, I would say it's not necessary to understand the
> details of the bug. What's most important is you test the changes.
> Unfortunately you won't be able to reproduce the issue since the test
> is closed, but you should at least verify the fix builds and
> introduces no new issues. I did test the open aarch64 port since I
> have the hardware, and someone was kind enough to provide me with
> binaries both without the patch (which reproduced the bug) and with
> the patch (which passed). No other aarch64 testing was done.
> Now on to an explanation of the bug and the fix:
> The test is testing the JDI/JVMTI forceEarlyReturn support. After the
> debugger side spawns the debugee process and the debugee thread is
> created, a vm.suspend() is issued to suspend all threads on the
> debugee side. The debugger side then issues a JDI forceEarlyReturn,
> which will cause the debugee thread to exit the currently executing
> method. Although the intent is to force exit of the run() method, the
> run() method does make some method calls to some very small methods
> with "inline" in their name, such as inlineMethodReturningObject(). If
> forceEarlyReturn is done while in one of them, JDI throws
> InvalidTargetException, which the test correctly handles and will pass
> if that happens.
> If the run() method is not yet compiled, what *normally* happens to
> force the thread to suspend is to swap in the interpreter safepoint
> dispatch table. This will result in a call_VM out to the interpreter
> to handle the safepointing. When later the thread is resumed (after
> initiating the forceEarlyReturn), there is code on the return side of
> call_VM to make sure the earlyret is handled immediately before any
> other bytecodes are exectuted. Thus the method that was executing at
> the time of the forceEarlyReturn should be the same as the method when
> the earlyret is handled. In fact it should be the same bytetcode(bci).
> Like I said, this is what *normally* happens, but is not the way the
> bug is exposed.
> The bug turns up when the run() method is not yet compiled, and when
> the suspend is attempted we are in one of the small "inline" methods,
> and it is compiled. In this case thread suspension is achieved in a
> different manner. When a compiled method executes its return code, it
> first pops its frame, and then the " poll_return" safepoint polling
> instruction is executed. This allows the thread to be suspended. We
> are technically in the run() method at this point since the frame of
> the "inline" method has been popped. Therefore the forceEarlyReturn
> is allowed. When the thread is resumed, the return from the "inline"
> method finally happens, which puts us in code generated by
> TemplateInterpreterGenerator::generate_return_entry_for(). However,
> there are no checks for an earlyret here, so we start executing
> bytecodes until we finally hit an earlyret check. This happens when
> the next invokespecial is done. There is a call_VM after the frame is
> pushed, and the call_VM always does an earlyret check. The problem is
> now the topmost frame (in the case where it asserts) is
> inlineMethodReturningObject(). Since it returns an object type, the
> earlyret code expects there to be an oop in the earlyret slot on the
> stack. But there isn't because the forceEarlyReturn was done on the
> void run() method. So the earlyret code fetches garbage, does an
> validity test on it, and gets the assert you see in the bug title.
> So what is needed here is an earlyret check after the compiled
> "inline" method returns, and before any new bytecodes are executed.
> Since TemplateInterpreterGenerator::generate_return_entry_for() is
> always executed when the compiled "inline" method returns, the check
> was added here. For completeness, the popframe check was also added.
> The header file changes are just to make these methods public so they
> can be called.
> I tested the failed test at least 100 times on each supported
> platform. The assert usually showed up at least 1 in 50 times. I also
> ran a large set of tests on all supported platforms, including
> everything we do for nightly testing except for some of the longest
> running tests.
More information about the hotspot-runtime-dev