RFR(S): 8059334: nsk/jvmti/scenarios/hotswap/HS201/hs201t001 fails with exit code 0 after timeout
chris.plummer at oracle.com
Sat Nov 4 00:25:20 UTC 2017
Please review the following:
The CR is closed, so I'll try to explain the issue here. The very short
explanation is that the JVMTI test was enabling SINGLE STEP and doing a
PopFrame, but the test method managed to get compiled and started
executing compiled after the thread was put in "interp only" mode (which
should never happen) and before the PopFrame was processed. The cause is
a lack of a check for "interp only" mode in the OSR related compilation
The test is testing JVMTI PopFrame support. The test thread has a small
method that sits in a tight loop. It will never exit. The main thread
enables SINGLE STEP on the test thread, and then does a PopFrame on the
test thread to force it out of the looping method. When the test failed
due to a time out, I noticed it was still stuck in the small method,
even though a PopFrame had been requested. Further, I noticed the method
was compiled, so there was no chance the method would ever detect that
it should do a PopFrame. Since "interp only" mode for SINGLE STEP had
been enabled, the method should not be running compiled, so clearly
something went wrong that allowed it to compile and execute.
When SINGLE STEP is requested, JVMTI will deopt the topmost method
(actually the top 2), put the thread in "interp only" mode, and then has
checks to make sure the thread continues to execute interpreted. To
avoid compilation when a back branch tries to trigger one, there is a
check for "interp only" mode in SimpleThresholdPolicy::event(). If the
thread is in "interp only" mode, it will prevent compilation.
SimpleThresholdPolicy::event() is called (indirectly) by
InterpreterRuntime::frequency_counter_overflow(), which is called from
the interpreter when the back branch threshold is reached.
After some debugging I noticed when the test timeout happens, "interp
only" mode is not yet enabled when
InterpreterRuntime::frequency_counter_overflow() is called, but is
enabled by the time InterpreterRuntime::frequency_counter_overflow() has
done the lookup of the nm. So there is a race here allowing the thread
to begin execution in a compiled method even though "interp only" mode
is enabled. I think the reason is because we safepoint during the
compilation, and this allows a SINGLE STEP request to be processed,
which enables "interp only" mode.
I should add that initially I only saw this bug with -Xcomp, but
eventually realized it was caused by disabling BackgroundCompilation.
That makes it much more likely that a SINGLE STEP request will come in
and be processed during the call to
InterpreterRuntime::frequency_counter_overflow() (because it will block
until the compilation completes).
I believe for the fix it is enough just to add an "interp only" mode
check in InterpreterRuntime::frequency_counter_overflow() after the nm
lookup, and set it nm to NULL if we are now in "interp only" mode. If we
are not in "interp only" mode at this point (and start executing the
compiled method) it should not be possible to enter "interp only" mode
until we reach a safepoint at some later time, and at that point the
method will be properly deopt so it can execute interpreted.
More information about the hotspot-compiler-dev