RFR(L): 8005071: Incremental inlining for JSR 292
roland.westrelin at oracle.com
Mon Dec 17 09:18:56 PST 2012
Thanks for reviewing this.
> Roland my main concern is GraphKit::kill_dead_call_outputs(). We avoiding such dead node elimination during paring since a graph may not complete and dominate information is not accurate. Can you bailout the round of later inlining if you detect such case and let PhaseRemoveUseless do cleanup for you?
Let's say we have 2 calls in a method:
A() dominates B(). During parsing, we build a complete graph with CallNodes for A() and B(). Let's say we decide to inline A() after parsing so we build a subgraph with A()'s inputs as inputs to parsing and then we replace B()'s outputs with the resulting new subgraph.
Now let's say A()'s outputs are dead and we try to inline B(). It depends on A()'s outputs but they are dead and often C2 will see an inconsistent state when it builds the subgraph for B(): control not top so it proceeds with inlining, but a MergedMem with some slices set to top or if B() uses the return from A(), top as an input. These lead to crashes.
That doesn't happen with parse-time inlining because A()'s subgraph is built inside the main graph, all results from A()'s inlining are propagated forward to B() by the gvn, the input state for B() is consistent and we don't try to inlining B() because this is a dead path.
I don't think PhaseRemoveUseless helps in this case. What we would need in the case of post parse inlining would be to inline A(), apply an igvn to propagate the results from the inlining forward to B(), then inline B(). But even an igvn is not sufficient if for instance, B() is inside a loop.
That's why I use GraphKit::kill_dead_call_outputs(). I don't understand why it's dangerous since it operates in well defined conditions. In the example above, we have a complete graph with a DirectCall to A(). We are done with parsing for A() and the subgraph for A() uses A()'s inputs but it is not yet connected to the rest of the graph by it's outputs. We want to find calls that A()'s control outputs dominate and we work on a complete graph, the subgraph for A() doesn't really matter.
> The same about Compile::look_for_dead_calls(). Why you need this check if you do PhaseRemoveUseless?
With CTW, I found some rare cases, where an igvn pass makes some path dead but similar to what I describe above the state at a call that we want to inline is not consistent.
More information about the hotspot-compiler-dev