RFR(M): 8244383: jhsdb/HeapDumpTestWithActiveProcess.java fails with "AssertionFailure: illegal bci"

Chris Plummer chris.plummer at oracle.com
Thu Jun 25 20:52:10 UTC 2020

Ping #2. I still need one more reviewer (Thanks for the review, Dan). I 
updated the webrev based on Dan's comments:


I can still make the simplification mentioned below if necessary.



On 6/23/20 11:29 AM, Chris Plummer wrote:
> Ping!
> If this fix is too complicated, there is a simplification I can make, 
> but at the cost of abandoning some attempts to determine the current 
> frame when this error condition pops up. At the start of 
> validateInterpreterFrame() it attempts to verify that the frame is 
> valid by verifying that frame->method and frame->bcp are valid. This 
> part is pretty simple. The complicated part is everything that follows 
> if the verification fails. It attempts to error correct the situation 
> by looking at various register contents and stack contents. I could 
> just abandon this complicated code and return false if frame->method 
> and frame->bcp don't check out. Upon return, the caller's code would 
> be simplified to:
>             if (validateInterpreterFrame(sp, fp, pc)) {
>               return true; // We're done. setValues() has been called 
> for valid interpreter frame.
>             } else {
>               return checkLastJavaSP();
>             }
> So there's still a chance we can determine a valid current frame if 
> "last java frame" has been setup. However, if not setup we would not 
> be able to. This is where the complicated code in 
> validateInterpreterFrame() is useful because it can usually determine 
> the current frame, even if "last java frame" is not setup, but it's 
> rare enough that we run into this situation that I think failing to 
> get the current frame is ok.
> So if I can get a couple promises for reviews if I make this change, 
> I'll go ahead and do it and send out a new RFR.
> thanks,
> Chris
> On 6/18/20 5:54 PM, Chris Plummer wrote:
>> [I've added runtime-dev to this SA review since understanding 
>> interpreter invokes (code generated by 
>> TemplateInterpreterGenerator::generate_normal_entry()) and stack 
>> walking is probably more important than understanding SA.]
>> Hello,
>> Please help review the following:
>> https://bugs.openjdk.java.net/browse/JDK-8244383
>> http://cr.openjdk.java.net/~cjplummer/8244383/webrev.00/index.html
>> The crux of the bug is when doing stack walking the topmost frame is 
>> in an inconsistent state because we are in the middle of pushing a 
>> new interpreter frame. Basically we are executing code generated by 
>> TemplateInterpreterGenerator::generate_normal_entry(). Since the PC 
>> register is in this code, SA assumes the topmost frame is an 
>> interpreter frame.
>> The first issue with this interpreter frame assumption is if we 
>> haven't actually pushed the frame yet, then the current frame is the 
>> caller's frame, and could be compiled. But since SA thinks it's 
>> interpreted, later on it tries to convert the frame->bcp to a BCI, 
>> but frame->bcp is only valid for interpreter frames. Thus the 
>> "illegal BCI" failures. If the previous frame happened to be 
>> interpreted, then the existing SA code works fine.
>> The other state of frame pushing that was problematic was when the 
>> new frame had been pushed, but frame->method and frame->bcp were not 
>> setup yet. This also would lead to "illegal BCI" later on because 
>> garbage would be stored in these locations.
>> Fixing the above problems requires trying to determine the state of 
>> the frame push through a series of checks, and then adapting what is 
>> considered to be the current frame based on the outcome of the 
>> checks. The first things checked is that frame->method is valid (we 
>> can successfully instantiate a wrapper for the Method* without 
>> failure) and that frame->bcp is within the method. If both these pass 
>> then we can use the frame as-is.
>> If the above checks fail, then we try to determine whether the issue 
>> is that the frame is not yet pushed and the current frame is actually 
>> compiled, or the frame has been pushed but not yet initialized. This 
>> is done by first getting the return address from the stack or RAX 
>> (it's location depends on how far along we are in the entry code) and 
>> comparing this to what is stored in frame->return_addr. If they are 
>> the same, then we have pushed the frame but not yet initialized it. 
>> In this case we use the previous frame (senderSP() and senderFP()) as 
>> the current frame since the current frame is not yet initialized. If 
>> the return address check fails, then we assume the new frame is not 
>> yet pushed, and and treat the current frame as compiled, even though 
>> PC points into the interpreter (we replace PC with RAX in this case).
>> Comments in the code pretty well explain all the above, so it is 
>> probably easier to follow the logic in the code along with the 
>> comments rather than apply my above description to the code.
>> I should add that it's very rare that we ever get into this special 
>> error handling code. This bug was very hard to reproduce initially. I 
>> was only able to make progress with reproducing and debugging by 
>> inserting delay loops in various spots in the code generated by 
>> TemplateInterpreterGenerator::generate_normal_entry(). By doing this 
>> I was able to reproduce the issue quite easily and hit all the logic 
>> in the new code I've added.
>> The fix is basically entirely contained within 
>> AMD64CurrentFrameGuess.java. The rest of the changes are minor:
>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java 
>> -Main fix for CR
>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java 
>> -Added getInterpreterFrameBCP(), which is now needed by 
>> AMD64CurrentFrameGuess.java
>> -I also simplified some code by using the existing 
>> getInterpreterFrameMethod()
>>  rather than replicating inline what it does.
>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java 
>> -I noticed the windows version of this code had some extra checks 
>> that were missing
>>  from the bsd version. I then looked at the linux version, but it had 
>> been heavily modified
>>  a short while back to leverage DWARF info to determine frames. So I 
>> looked at the previous
>>  rev and it too had these extra checks. I decided to add them to the 
>> BSD port. I'm not sure
>>  if it helps at all, but it certainly doesn't seem to do any harm.
>> thanks,
>> Chris

More information about the hotspot-runtime-dev mailing list