request for review (L): 7121756 Improve C1 inlining policy by using profiling at call sites

Krystal Mok rednaxelafx at
Thu Dec 15 07:28:42 PST 2011

Hi Roland,

Interesting. Now C1's getting more of the heavier gears :-)

A few things I'd like to ask:

1. How does this change interact with tiered mode?
I see that in a tiered build, C1ProfileInlining is set to false in
arguments.cpp. So this set of changes is only meant for a Client VM, right?

2. Is this change mainly targeted at embedded builds?
In a desktop/server scenario, I had a feeling that the Client VM was going
away, replaced by a unified tiered VM in the future. Hence the question.

3. Are there any plans to do a late-inlining phase for C1?
I tried to make C1 able to inline at more callsites by adding
Phi::exact_type(), but failed [1]. The main reason for failing is that
before the whole HIR graph is built, the CFG isn't stable yet, and querying
Phi::exact_type() during graph building just won't work. But if there's
more compilation budget to spend, say we're in a hot method, an optional
inlining phase after the HIR graph is built might be profitable. I'd like
to see more comments on this point.
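
To illustrate point 3, here is a toy sketch of why an exact-type query on a
phi can be premature while the graph is still being built. It is not C1 code:
ToyPhi, Answer and this exact_type() function are hypothetical names invented
for the example, and the assumption is simply that a phi's exact type is only
known once every incoming operand (including loop back edges) has been seen.

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy model only; these are NOT HotSpot's real classes.
struct ToyPhi {
    // One slot per predecessor block. An empty slot stands for a predecessor
    // that has not been parsed yet (for example a loop back edge).
    std::vector<std::optional<std::string>> operand_types;
};

enum class Answer { Exact, NotExact, TooEarly };

struct ExactTypeResult {
    Answer answer;
    std::string type;  // meaningful only when answer == Answer::Exact
};

// Returns the common exact type of all operands, if it can be known yet.
ExactTypeResult exact_type(const ToyPhi& phi) {
    if (phi.operand_types.empty()) {
        return {Answer::TooEarly, ""};
    }
    // Any missing operand means the CFG is still changing around this phi.
    for (const auto& t : phi.operand_types) {
        if (!t) {
            return {Answer::TooEarly, ""};
        }
    }
    const std::string& first = *phi.operand_types.front();
    for (const auto& t : phi.operand_types) {
        if (*t != first) {
            return {Answer::NotExact, ""};
        }
    }
    return {Answer::Exact, first};
}
```

In this model a query made during graph building keeps hitting TooEarly for
loop-header phis, whereas a pass that runs after the graph is complete would
see every operand and could get a definite answer.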

Kris Mok


On Thu, Dec 15, 2011 at 10:52 PM, Roland Westrelin <
roland.westrelin at> wrote:

> Implements profile based inlining in C1.
> Execution of a method starts interpreted as usual. A method transitions
> from interpreted to compiled in the usual way as well. When the method is
> compiled, the compiler identifies a number of call sites that are
> candidates for profiling and further inlining. At those call sites, the
> method is compiled so that a per call site counter is incremented and
> tested for overflow when the call site is used. On first call site
> resolution, a timestamp is also recorded. The count and timestamp are used
> to compute a frequency. A frequency higher than a high water mark detects a
> hot call site. A hot call site triggers a recompilation of the caller
> method in which the callee is inlined. A frequency higher than a low water
> mark detects a warm call site. Otherwise the call site is cold. Recompiling
> with the extra inlining won't bring a performance advantage for a warm or
> cold call site. But keeping the profiling on at a warm call site is
> detrimental so it is dropped. At a cold call site profiling can be kept
> enabled to trigger later recompilation if the call site becomes hot.
> To perform profiling, the compiler identifies the candidate call sites and
> generates a stub similar to the static call stub in the nmethod's stub
> area. The profile call stub performs the following steps:
> 1- load mdo pointer in register
> 2- increment counter for call site
> 3- branch to runtime if counter crosses threshold
> 4- jump to callee
> On call site resolution, for a call to a compiled method, the jump (4-
> above) is patched with the resolved call site info (to continue to callee's
> code or transition stub) then the call site is patched to point to the
> profile call stub. Profiling can be later fully disabled for the call site
> (if the call site is polymorphic or if the compilation policy finds it's
> better to not profile the call site anymore) by reresolving the call.
> The compiler also uses profile data to inline a frequent virtual method if
> profile data suggests a single receiver class. State changes of inline
> caches associated with call sites (performed in the runtime) are used to
> collect receiver class data. Correctness during execution is enforced with
> a compiled guard and a deoptimization can be triggered.
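
The hot/warm/cold policy described in the quoted message can be sketched
roughly as follows. This is only an illustration of the description above,
not the code from the patch: every name here (CallSiteProfile,
profile_stub_tick, classify, decide) and the idea of passing the water marks
and threshold as plain parameters are assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical per-call-site data; the real layout in the MDO may differ.
struct CallSiteProfile {
    std::uint64_t count = 0;          // bumped each time the call site is used
    double first_resolved_sec = 0.0;  // timestamp taken at first resolution
};

enum class CallSiteHeat { Cold, Warm, Hot };
enum class Action { RecompileWithInlining, StopProfiling, KeepProfiling };

// Models steps 2 and 3 of the profile call stub: increment the counter and
// report whether the (assumed) threshold was reached, i.e. whether the stub
// would branch to the runtime.
bool profile_stub_tick(CallSiteProfile& p, std::uint64_t threshold) {
    ++p.count;
    return p.count >= threshold;
}

// Calls per second since the call site was first resolved.
double frequency(const CallSiteProfile& p, double now_sec) {
    const double elapsed = now_sec - p.first_resolved_sec;
    if (elapsed <= 0.0) {
        return 0.0;  // no time has passed, so no meaningful rate yet
    }
    return static_cast<double>(p.count) / elapsed;
}

// Above the high water mark: hot. Above the low water mark: warm. Else cold.
CallSiteHeat classify(const CallSiteProfile& p, double now_sec,
                      double low_mark, double high_mark) {
    const double f = frequency(p, now_sec);
    if (f > high_mark) return CallSiteHeat::Hot;
    if (f > low_mark) return CallSiteHeat::Warm;
    return CallSiteHeat::Cold;
}

// Maps the classification to the action the message describes.
Action decide(CallSiteHeat heat) {
    switch (heat) {
        case CallSiteHeat::Hot:  return Action::RecompileWithInlining;
        case CallSiteHeat::Warm: return Action::StopProfiling;
        case CallSiteHeat::Cold: return Action::KeepProfiling;
    }
    return Action::KeepProfiling;  // unreachable; keeps compilers quiet
}
```

The asymmetry is the interesting part: a warm site pays the counter overhead
often without ever justifying a recompile, so profiling is dropped there,
while a cold site rarely executes the stub and can cheaply keep it in case
the site heats up later.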

More information about the hotspot-compiler-dev mailing list