A hotspot patch for stack profiling (frame pointer)

Bertrand Delsart bertrand.delsart at oracle.com
Mon Feb 16 16:08:12 UTC 2015

On 13/02/2015 23:26, Brendan Gregg wrote:
> Thanks Bertrand and John for explaining the invokedynamic issue, and
> Vladimir for filing the bug.
> I'll reply here (I don't have a JBS account; I would like one!).
> The profilers I'm using (Linux perf, and Solaris DTrace) can already
> handle a broken RBP, and we see this all the time when profiling
> OracleJDK today (eg, as a flame graph:
> http://www.slideshare.net/brendangregg/netflix-from-clouds-to-roots/66).
> Including an option (eg, -XX:+NoOmitFramePointer, or
> -XX:+ReduceOmitFramePointer, or -XX:+MoreFramePointer) which improved
> RBP profiling (like my patch) would have great value for us. I'm fine
> with a profiler not working 100% of the time, provided we understand
> that there is an error margin, and why (Bertrand and John's
> descriptions), when interpreting the profiles. Any of these options
> could also be improved as follow-on changes, if and when needed.

As long as it is clear that RBP can be misleading and that trying to fix 
that would be an RFE, not a bug, I have no objection.

I'll let official Reviewers see whether a command line option is 
necessary (e.g. whether there are concerns about possible performance 
regressions when RBP cannot be used by the register allocator).
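
To make the "RBP can be misleading" point concrete, here is a toy model of
a frame-pointer walk. It is only a sketch under stated assumptions: it is
not HotSpot, perf, or DTrace code, the names build_stack and walk are made
up for illustration, and function names stand in for return addresses so
that no symbol lookup is needed.

```python
WORD = 8  # bytes per stack slot on x86-64


def build_stack(frames):
    """Simulate nested calls, outermost first (hypothetical helper).

    Each entry is (name, mode), where mode is one of:
      "frame"   - standard prologue: push rbp; mov rbp, rsp
      "omit"    - no frame record, RBP left untouched
      "clobber" - RBP reused as a general-purpose register
    Returns (memory, rbp, name of the innermost function).
    """
    mem = {}
    sp = 0x8000
    rbp = 0          # 0 marks the end of the frame-pointer chain
    caller = None    # the outermost frame has no caller
    for name, mode in frames:
        sp -= WORD
        mem[sp] = caller          # "call" pushes the return address
        if mode == "frame":
            sp -= WORD
            mem[sp] = rbp         # push rbp
            rbp = sp              # mov rbp, rsp
        elif mode == "clobber":
            rbp = 0xDEAD          # some unrelated value, not a stack address
        sp -= 2 * WORD            # room for locals (contents irrelevant here)
        caller = name
    return mem, rbp, caller


def walk(mem, rbp, current):
    """Follow saved-RBP links the way a frame-pointer-based profiler would."""
    trace = [current]
    while rbp != 0:
        if rbp not in mem or rbp + WORD not in mem:
            trace.append("<broken>")   # RBP does not point at a frame record
            break
        ret = mem[rbp + WORD]          # return address sits above saved RBP
        if ret is None:
            break                      # reached the outermost frame
        trace.append(ret)
        rbp = mem[rbp]                 # step to the caller's frame record
    return trace


if __name__ == "__main__":
    for mode in ("frame", "omit", "clobber"):
        calls = [("main", "frame"), ("a", "frame"), ("b", mode), ("c", "frame")]
        print(mode, walk(*build_stack(calls)))
```

In this model, with every frame keeping RBP the walk yields c, b, a, main.
If b merely omits its frame record, the walk yields c, b, main: a is
silently dropped, which is the misleading case. If b uses RBP as a scratch
register, the walk stops after b because the saved value is not a frame
address.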



> I haven't had a chance yet to prototype more (eg, option processing).
> There's also work happening in Linux (two projects on lkml this week,
> one by Stephane Eranian and another by Carl Love) for improving Java JIT
> symbol support in perf_events. I think there will be more demand for
> system stack walking, as perf gets more symbol translation options.
> Brendan
> On Thu, Jan 15, 2015 at 9:50 AM, Vladimir Kozlov
> <vladimir.kozlov at oracle.com> wrote:
>     Thank you, Bertrand and John
>     I added this conversation to the bug report.
>     Thanks,
>     Vladimir
>     On 1/15/15 3:13 AM, Bertrand Delsart wrote:
>         On 14/01/2015 20:12, John Rose wrote:
>             On Jan 14, 2015, at 6:42 AM, Bertrand Delsart
>             <bertrand.delsart at oracle.com> wrote:
>                 I would not prevent the JITs from using RBP as long as the
>                 changeset is not sufficient to guarantee the profiling will
>                 work... and IMHO solving the JSR292 issue will be much more
>                 intrusive (impacting HotSpot stack walking code).
>             Here are some thoughts on that.
>             SPARC uses L7 (L7_mh_SP_save) for the same purpose of method
>             handle support as x86 uses RBP (rbp_mh_SP_save).  So there's
>             not a hard requirement for x86 to take over RBP.
>             (Deep background:  This purpose, in method handle support, is
>             to allow an adapter to make changes to the caller's SP.  The
>             adapter is the initial callee from the caller, but may change
>             argument shape, and tail-calls the ultimate callee.  Because
>             it is a tail-call, the original caller must have a spot where
>             his original SP can be preserved.  The preservation works
>             because the original caller knows he is calling a MH.invoke
>             method, which requires the extra argument preservation.  The
>             repertoire of argument shape changes is quite small, actually;
>             it is not a very general mechanism since the LF machinery was
>             put in.  Perhaps the whole thing could be removed somehow, by
>             finding alternative techniques for the few remaining changes.
>             OTOH, this SP-restoring mechanism may be helpful in doing a
>             more general tail-call mechanism, and perhaps in managing
>             int/comp mode changes more cleanly, so I'd like us to keep it.
>             And document it better.)
>             Any register or stack slot will do for this purpose, as long
>             as (i) its value can be recovered after the MH.invoke call
>             returns to the caller, and (ii) its value can be dug up
>             somehow during stack walking.  There are only a couple of
>             places where stack walking code needs to sample the value, so
>             they should be adjustable.
>             Both x86 and SPARC use registers which are callee-save (or
>             "non-volatile across calls") which satisfy properties (i) and
>             (ii).  A standard stack slot (addressed based on caller's RBP)
>             would probably also satisfy those properties.
>             A variably-positioned stack slot would also work, which would
>             require registering the position in each CodeBlob.  That's
>             unpleasant extra detail, but it would align somewhat with the
>             current logic which allows each CodeBlob (nmethod, actually)
>             to advertise which call sites need the special processing
>             (see the function is_method_handle_return(caller_pc)).
>             I recommend reserving a dead word of space in every stack
>             frame that makes MH.invoke calls, at a fixed position relative
>             to that frame's RBP.
>             — John
>         I perfectly agree that it is doable (and with your proposed
>         approach).
>         I just wanted to be sure people were aware that the RFE is more
>         complex than what the current changeset may suggest. We are not
>         just talking about reviewing and integrating a complete changeset
>         contributed by the community. There is more work needed, either by
>         the community or by Oracle. This will require changes at least in
>         C1 and C2 call sequences, in the stack walking, in the creation
>         and sizing of compiled frames...
>         Regards,
>         Bertrand.

Bertrand Delsart,                     Grenoble Engineering Center
Oracle,         180 av. de l'Europe,          ZIRST de Montbonnot
38330 Montbonnot Saint Martin,                             FRANCE
bertrand.delsart at oracle.com             Phone : +33 4 76 18 81 23


More information about the hotspot-compiler-dev mailing list