Tailcalls in the jvm
arnold.schwaighofer at gmail.com
Wed Feb 13 14:29:12 PST 2008
On Feb 12, 2008 10:44 AM, John Rose <John.Rose at sun.com> wrote:
> These things happen. I think we'll make the Da Vinci Machine happen
> 0. I find that paper is obscure, since I don't often move in circles
> where the term "contractum" pops up in normal conversation.
> But there are circles within circles, I see, since Clements & Felleisen
> fall back on named lambdas instead of the mysterium tremendum
> of the Y combinator favored by Fournet & Gordon. :-)
> 1. If protection domains are few, then we could, as you suggest,
> put a single-word bitmap in every stack frame.
> The idea to deoptimize and reformat on the 33rd, 65th, etc.
> protection domains is clever. I suppose if you were to
> reserve a word in every stack frame, you could also
> make it an indirection to an expandable array.
> But I'd rather not go down this road.
Okay. I think Christian already told me that stack frame modifications
would be frowned upon. :)
> In HotSpot we try to keep optimized stack frames as "native" as
> If there is a need, say, to track GC roots, we try not to compromise
> code quality
> or stack frame layout but instead add metadata on the side.
> This is a powerful technique. It doesn't work for PD accumulation,
> since that is a (slightly) dynamic property of stack frames.
> I'd hate to invent stack frame customization machinery (complex,
> and with unknown overheads), just for this one use case.
> 3. Since we're defining our own VM here, let's see what we can do
> by refusing to handle the hard cases. One too-radical idea: Throw
> a linkage error if less-privileged code tail-calls more privileged code.
> This probably won't work well in the presence of method handles
> (or even virtual/interface calls) since the byte compiler cannot
> statically determine the identity of a callee and refrain from
> requesting a bad tail call. A scheme example would be:
> (define (revlist listfn a b) (listfn b a))
> Suppose revlist is in an applet, and listfn is bound to
> the actual a system routine 'list'.
> Then the draconian rule will cause the code to fail.
> This rule is well-motivated, since the listfn might also
> be bound to for-each, which could cause a closure
> b to be invoked in an privilege context elevated
> beyond revlist's capabilities.
yeah, i also wanted to do something along those lines until i realized
that the callee might be unknown (interface/virtual call) to the
> 4. OK, that was a bad start. Let's try again.
> Make a less draconian rule, one that allows
> the call but keeps more of the stack trace around.
> When revlist tail-calls list, the system can keep
> the stack frame around (or keep an adapter frame)
> to represent the PD of revlist.
> I call this a "defective tail call", because the compiler
> requested a tail call, but the runtime pushed a stack
> anyway. The application cannot detect this, unless
> it performs an unbounded recursion using only tail calls
> some of which are defective; eventually it will get
> a stack overflow from the frames pushed by the
> defective calls.
> Apparently Fournet and Gordon (whom I have not
> read) allow their engine to defeat tailcalls, because
> C&F say they make the frame take-down operation
> Basically, I propose the JVM should defeat a tail
> call when it detects a cross-PD call. It should
> refrain from taking down the frame, or (better)
> replace the frame with a holder for the PD.
That is probably the path i'll take. But there is the problem of
recognizing cross-pd calls for dynamic call targets
(virt./interface/method handle calls) as you suggest above.
A solution could be to add another function entry point like the one
for monomorphic calls. Let's call it unverified tail call entry point.
It checks that the current function's PD and the caller's function PD
are the same and jumps to the normal entry point if they match (fast
path). If they are not the appropriate action is taken dealing with
the situation (proceed as non-tail call, setup adapter frame,throw
Callee targets which are statically known to be in the same PD (a
potential optimization could be a PD analysis akin to the class
hierarchy analysis) would use the normal entry point (verified or
unverified entry point) while the others would go to the unverified
tail call entry point.
Pro: fast also for the dynamic callee case(if there is a efficient way
to compare the two PDs - maybe caller passes its PD in a register) in
the matching PD case
Contra: might be more complex then the solution below
Another solution would be to call always call a jvm runtime function
(more overhead, always a slow path) to perform the tail call in cases
where the call target is dynamic. I think this is what the .net
runtime does/used to do? (JIT_TailCall,
Pro: less complex
Contra: slower for dynamic call target case
What do you think?
> 5. Or we could use little adapter frames to represent
> PD information.
> This is the engineer's equivalent to C&F's continuation
> markers, which summarize annotations derived from
> Have a cross-PD tail call push an adapter (not the original
> frame) which the JVM creates specially for the purpose
> of temporarily defeating (potentially) cross-PD tail calls.
> But, keep a counter in them, and when too many pile up,
> crush them down into a single adapter, with a composite
> PD marker. The "crush" operation can compress an
> unlimited number of frames, since the PD set the JVM
> needs can be unordered, and PDs are few.
Yes that is a cool idea. Note to myself: Always think of adapter
frames. they seem to be the solution to many problems.
> (Oddly enough, I wouldn't need to produce an asymptotic space
> proof over a multi-kinded combinatorial algebra in order to
> code this up. It's a funny world.)
> Shall we do the simple "defeatist" thing first, and reserve
> the adapter frames for a second effort?
Yes i think that is the way to go. A least my lazy left big toe tells me so :).
More information about the mlvm-dev