RFR(M): 8244383: jhsdb/HeapDumpTestWithActiveProcess.java fails with "AssertionFailure: illegal bci"

Daniil Titov daniil.x.titov at oracle.com
Wed Jul 1 05:24:46 UTC 2020

Hi Chris,

The fix,  in general,  looks good to me. 

Some small comments for src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java:
   1) Imports at lines 30 and 35 are not used and could be removed.

        30 import sun.jvm.hotspot.interpreter.*;
        35 import sun.jvm.hotspot.utilities.*;

  2) Local variable "end" defined at line 189 is not used.
      189     Address end = sp.addOffsetTo(regionInBytesToSearch);

No new webrev is required.


On 6/25/20, 1:55 PM, "serviceability-dev on behalf of Chris Plummer" <serviceability-dev-retn at openjdk.java.net on behalf of chris.plummer at oracle.com> wrote:

    Ping #2. I still need one more reviewer (Thanks for the review, Dan). I 
    updated the webrev based on Dan's comments:


    I can still make the simplification mentioned below if necessary.



    On 6/23/20 11:29 AM, Chris Plummer wrote:
    > Ping!
    > If this fix is too complicated, there is a simplification I can make, 
    > but at the cost of abandoning some attempts to determine the current 
    > frame when this error condition pops up. At the start of 
    > validateInterpreterFrame() it attempts to verify that the frame is 
    > valid by verifying that frame->method and frame->bcp are valid. This 
    > part is pretty simple. The complicated part is everything that follows 
    > if the verification fails. It attempts to error correct the situation 
    > by looking at various register contents and stack contents. I could 
    > just abandon this complicated code and return false if frame->method 
    > and frame->bcp don't check out. Upon return, the caller's code would 
    > be simplified to:
    >             if (validateInterpreterFrame(sp, fp, pc)) {
    >               return true; // We're done. setValues() has been called 
    > for valid interpreter frame.
    >             } else {
    >               return checkLastJavaSP();
    >             }
    > So there's still a chance we can determine a valid current frame if 
    > "last java frame" has been setup. However, if not setup we would not 
    > be able to. This is where the complicated code in 
    > validateInterpreterFrame() is useful because it can usually determine 
    > the current frame, even if "last java frame" is not setup, but it's 
    > rare enough that we run into this situation that I think failing to 
    > get the current frame is ok.
    > So if I can get a couple promises for reviews if I make this change, 
    > I'll go ahead and do it and send out a new RFR.
    > thanks,
    > Chris
    > On 6/18/20 5:54 PM, Chris Plummer wrote:
    >> [I've added runtime-dev to this SA review since understanding 
    >> interpreter invokes (code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry()) and stack 
    >> walking is probably more important than understanding SA.]
    >> Hello,
    >> Please help review the following:
    >> https://bugs.openjdk.java.net/browse/JDK-8244383
    >> http://cr.openjdk.java.net/~cjplummer/8244383/webrev.00/index.html
    >> The crux of the bug is when doing stack walking the topmost frame is 
    >> in an inconsistent state because we are in the middle of pushing a 
    >> new interpreter frame. Basically we are executing code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry(). Since the PC 
    >> register is in this code, SA assumes the topmost frame is an 
    >> interpreter frame.
    >> The first issue with this interpreter frame assumption is if we 
    >> haven't actually pushed the frame yet, then the current frame is the 
    >> caller's frame, and could be compiled. But since SA thinks it's 
    >> interpreted, later on it tries to convert the frame->bcp to a BCI, 
    >> but frame->bcp is only valid for interpreter frames. Thus the 
    >> "illegal BCI" failures. If the previous frame happened to be 
    >> interpreted, then the existing SA code works fine.
    >> The other state of frame pushing that was problematic was when the 
    >> new frame had been pushed, but frame->method and frame->bcp were not 
    >> setup yet. This also would lead to "illegal BCI" later on because 
    >> garbage would be stored in these locations.
    >> Fixing the above problems requires trying to determine the state of 
    >> the frame push through a series of checks, and then adapting what is 
    >> considered to be the current frame based on the outcome of the 
    >> checks. The first things checked is that frame->method is valid (we 
    >> can successfully instantiate a wrapper for the Method* without 
    >> failure) and that frame->bcp is within the method. If both these pass 
    >> then we can use the frame as-is.
    >> If the above checks fail, then we try to determine whether the issue 
    >> is that the frame is not yet pushed and the current frame is actually 
    >> compiled, or the frame has been pushed but not yet initialized. This 
    >> is done by first getting the return address from the stack or RAX 
    >> (it's location depends on how far along we are in the entry code) and 
    >> comparing this to what is stored in frame->return_addr. If they are 
    >> the same, then we have pushed the frame but not yet initialized it. 
    >> In this case we use the previous frame (senderSP() and senderFP()) as 
    >> the current frame since the current frame is not yet initialized. If 
    >> the return address check fails, then we assume the new frame is not 
    >> yet pushed, and and treat the current frame as compiled, even though 
    >> PC points into the interpreter (we replace PC with RAX in this case).
    >> Comments in the code pretty well explain all the above, so it is 
    >> probably easier to follow the logic in the code along with the 
    >> comments rather than apply my above description to the code.
    >> I should add that it's very rare that we ever get into this special 
    >> error handling code. This bug was very hard to reproduce initially. I 
    >> was only able to make progress with reproducing and debugging by 
    >> inserting delay loops in various spots in the code generated by 
    >> TemplateInterpreterGenerator::generate_normal_entry(). By doing this 
    >> I was able to reproduce the issue quite easily and hit all the logic 
    >> in the new code I've added.
    >> The fix is basically entirely contained within 
    >> AMD64CurrentFrameGuess.java. The rest of the changes are minor:
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/amd64/AMD64CurrentFrameGuess.java 
    >> -Main fix for CR
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/x86/X86Frame.java 
    >> -Added getInterpreterFrameBCP(), which is now needed by 
    >> AMD64CurrentFrameGuess.java
    >> -I also simplified some code by using the existing 
    >> getInterpreterFrameMethod()
    >>  rather than replicating inline what it does.
    >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/bsd/amd64/BsdAMD64CFrame.java 
    >> -I noticed the windows version of this code had some extra checks 
    >> that were missing
    >>  from the bsd version. I then looked at the linux version, but it had 
    >> been heavily modified
    >>  a short while back to leverage DWARF info to determine frames. So I 
    >> looked at the previous
    >>  rev and it too had these extra checks. I decided to add them to the 
    >> BSD port. I'm not sure
    >>  if it helps at all, but it certainly doesn't seem to do any harm.
    >> thanks,
    >> Chris

More information about the hotspot-runtime-dev mailing list