Truffle and mlvm
chris.seaton at oracle.com
Sun Aug 31 22:27:44 UTC 2014
Why JRuby+Truffle doesn't use the existing JRuby core classes:
There's no fundamental reason why we couldn't have implemented JRuby+Truffle using the existing JRuby classes instead of writing our own. Perhaps we should have also tried just replacing the AST and using the existing core classes. The reason I didn't do that up-front is that back then I was experimenting, wasn't sure how things would need to work, and didn't want to have to deal with the (now necessary) complexity of the full JRuby system at that stage.
The reason I'm not using the JRuby core classes now is that as Truffle has removed performance bottlenecks in other areas, such as method dispatch and frame access, the core classes are now ripe for further optimisation. For example, we are using a sophisticated storage strategy system for collection classes that removes the need for boxing and is very amenable to further optimisations such as scalar replacement and loop unrolling. JRuby can do this without Truffle, but it wouldn't make much difference as it has bottlenecks in other areas. New core classes aren't a requirement - they're just the next step to get further performance, and are a benefit of the partial evaluation.
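To make the storage strategy idea concrete, here is a minimal sketch in plain Java. It is not the JRuby+Truffle implementation: the class and method names (`StrategyArray`, `generalise`, `isUnboxed`) are hypothetical and chosen only for illustration. The array keeps its elements in an unboxed `int[]` for as long as every element is an int, and switches once to a generic `Object[]` when anything else is stored.

```java
import java.util.Arrays;

// Hypothetical sketch: these names are illustrative and are not taken from JRuby+Truffle.
final class StorageStrategySketch {

    /** Growable array that keeps elements in an unboxed int[] until a non-int arrives. */
    static final class StrategyArray {
        private int[] ints = new int[4]; // used while every element is an int
        private Object[] objects;        // non-null once the array has been generalised
        private int size;

        void add(Object value) {
            if (objects == null && value instanceof Integer) {
                if (size == ints.length) {
                    ints = Arrays.copyOf(ints, size * 2);
                }
                ints[size++] = (Integer) value; // stored unboxed
                return;
            }
            if (objects == null) {
                generalise();
            }
            if (size == objects.length) {
                objects = Arrays.copyOf(objects, size * 2);
            }
            objects[size++] = value;
        }

        /** One-way switch from the int strategy to the generic Object strategy. */
        private void generalise() {
            objects = new Object[ints.length];
            for (int i = 0; i < size; i++) {
                objects[i] = ints[i]; // boxing happens only here, once per element
            }
            ints = null;
        }

        Object get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return objects == null ? Integer.valueOf(ints[index]) : objects[index];
        }

        boolean isUnboxed() {
            return objects == null;
        }

        int size() {
            return size;
        }
    }

    public static void main(String[] args) {
        StrategyArray array = new StrategyArray();
        for (int i = 0; i < 10; i++) {
            array.add(i);
        }
        System.out.println(array.isUnboxed());
        array.add("a string");
        System.out.println(array.isUnboxed());
    }
}
```

In this sketch `get` still boxes on the way out because its return type is `Object`. A real implementation would presumably pair the strategy with specialised read paths so that the int never needs to be boxed at all.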
I think Truffle is more of an incremental approach than people realise, which is very understandable as our approach with Ruby has not been incremental.
Charles is right that the AST in Ruby is not simple any more - but you could have a much simpler set of nodes and still get good performance. Add a few more specialisations and you get better performance - that's incremental. You can use your existing core classes and get good performance. Add things like storage strategies and specialisations in new core classes and you get better performance - again incremental.
So JRuby+Truffle is a good demonstrator of Truffle, but due to how it grew from a small project separate from JRuby it may not be the perfect demonstrator of how to incrementally transition an existing implementation to Truffle. Maybe we need to take another language, maybe something like Golo or Groovy, and see how far we can get by very gradually converting it to Truffle, without the goal of the best absolute performance for benchmarks from day one - which is what I have been focusing on.
If anyone wants to think about converting an existing system to use Truffle, I can give you all the help you need to get started and pass on all my experiences so far. Don’t hesitate to get in touch with me.
On 30 Aug 2014, at 20:21, Charles Oliver Nutter <headius at headius.com> wrote:
> Removing all context, so it's clear this is just my opinions and thoughts...
> As most of you know, we've opened up our codebase and incorporated the
> graciously-donated RubyTruffle directly into JRuby. It's available on
> JRuby master and we are planning to ship Truffle support with JRuby
> 9000, our next major version (due out in the next couple months).
> At the same time, we have been developing our own next-gen IR-based
> compiler, which will run unmodified on any JVM (with or without
> invokedynamic, though I still have to implement the "without" side).
> Why are we doing this when Truffle shows such promise?
> I'll try to enumerate the benefits and problems of Truffle here.
> * Benefits of using Truffle
> 1. Simpler implementation.
> From day 1, the most obvious benefit of Truffle is that you just have
> to write an AST interpreter. Anyone who has implemented a programming
> language can do this easily. This specific benefit doesn't help us
> implement JRuby, since we already have an AST interpreter, but it did
> make Chris Seaton's job easier building RubyTruffle initially. This
> also means a Truffle-based language is more approachable than one with
> a complicated compiler pipeline of its own.
> 2. Better communication with the JIT.
> Truffle, via Graal, has potential to pass much more information on to
> the JIT. Things like type shape, escaped references, frame access,
> type specialization, and so on can be communicated directly, rather
> than hoping and praying they'll be inferred by the shape of bytecodes.
> This is probably the largest benefit; much of my time optimizing JRuby
> has been spent trying to "trick" C2 into doing the right thing, since
> I don't have a direct way to communicate intent.
> The peak performance numbers for Truffle-based languages have been
> extremely impressive. If it's possible to get those numbers reasonably
> quickly and with predictable steady-state behavior in large,
> heterogeneous codebases, this is definitely the quickest path (on any
> runtime) to a high-performance language implementation.
> 3. OSS and pure Java
> Truffle and Graal are just OpenJDK projects under OpenJDK licenses,
> and anyone can build, hack, or distribute them. In addition, both
> Truffle and Graal are 100% Java, so for the first time a plain old
> Java developer can see (and manipulate) exactly how the JIT works
> without getting lost in a sea of plus plus.
> * Problems with Truffle
> I want to emphasize that regardless of its warts, we love Truffle and
> Graal and we see great potential here. But we need a dose of reality
> once in a while, too.
> 1. AST is not enough.
> In order to make that AST fly, you can't just implement a dumb generic
> interpreter. You need to know about (and generously annotate your AST
> for) many advanced compiler optimization techniques:
> A. Type specialization plus guarded fallbacks: Truffle will NOT
> specialize your code for you. You must provide every specialized path
> in your AST nodes as well as annotating "slow path", "transfer to
> interpreter", etc.
> B. Frame access and reification: In order to have cross-call access to
> frames or to squash frames created for multiple inlined calls, you
> must use Truffle's representation of a frame. This means loads/stores
> within your AST must be done against a Truffle object, not against an
> arbitrary object of your own creation.
> C. Method invocation and inlining: Up until fairly recently, if you
> wanted to inline methods you had to essentially build your own call
> site logic, profiling, deopt paths within your Truffle AST. When I did
> a little hacking on RubyTruffle around OSS time (December/January) it
> did *no* inlining of Ruby-to-Ruby calls. I hacked in inlining using
> existing classes and managed to get it to work, but I was doing all
> the plumbing myself. I know this has improved in the Truffle codebase
> since then, but I have my concerns about production readiness when the
> inlining call site parts of Truffle were just recently added and are
> still in flux.
> And there are plenty of other cases. Building a basic language for
> Truffle is pretty easy (I did a micro-language in about two hours at
> JVMLS last year), but building a high-performance language for Truffle
> still takes a fair investment of effort and working knowledge of
> dynamic compiler optimizations.
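As an illustration of the "type specialization plus guarded fallbacks" point above, here is a minimal self-rewriting node in plain Java. It deliberately does not use the Truffle API: `AddSite`, `UninitialisedAdd`, `IntAdd` and `GenericAdd` are hypothetical names, and the "deoptimisation" is simply a node replacing itself with a more general one when its guard fails.

```java
// Hypothetical sketch of a self-specialising AST node; this is not the Truffle API.
final class SpecialisationSketch {

    abstract static class Node {
        abstract Object execute(Object left, Object right);
    }

    /** Holds the current implementation of one "+" in the program and counts rewrites. */
    static final class AddSite {
        Node current = new UninitialisedAdd(this);
        int rewrites;

        Object execute(Object left, Object right) {
            return current.execute(left, right);
        }

        void replace(Node next) {
            current = next;
            rewrites++;
        }
    }

    /** First execution: look at the operands and pick a specialisation. */
    static final class UninitialisedAdd extends Node {
        private final AddSite site;

        UninitialisedAdd(AddSite site) {
            this.site = site;
        }

        @Override
        Object execute(Object left, Object right) {
            Node next = (left instanceof Integer && right instanceof Integer)
                    ? new IntAdd(site)
                    : new GenericAdd();
            site.replace(next);
            return next.execute(left, right);
        }
    }

    /** Fast path for int + int, protected by a guard (overflow is ignored in this sketch). */
    static final class IntAdd extends Node {
        private final AddSite site;

        IntAdd(AddSite site) {
            this.site = site;
        }

        @Override
        Object execute(Object left, Object right) {
            if (left instanceof Integer && right instanceof Integer) {
                return (Integer) left + (Integer) right;
            }
            // Guard failed: fall back to the generic node and stay there.
            Node generic = new GenericAdd();
            site.replace(generic);
            return generic.execute(left, right);
        }
    }

    /** Slow path that handles any operands. */
    static final class GenericAdd extends Node {
        @Override
        Object execute(Object left, Object right) {
            if (left instanceof Number && right instanceof Number) {
                return ((Number) left).doubleValue() + ((Number) right).doubleValue();
            }
            return String.valueOf(left) + right;
        }
    }

    public static void main(String[] args) {
        AddSite site = new AddSite();
        System.out.println(site.execute(1, 2));
        System.out.println(site.execute(1.5, 2));
        System.out.println(site.rewrites);
    }
}
```

Every specialised path and every fallback here had to be written by hand, which is the cost being described. The sketch also shows why the tree keeps mutating during early execution: each site is rewritten at least once before it settles.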
> 2. Long startup and warmup times.
> As Thomas pointed out in the other thread, because Truffle and Graal
> are normally run as plain Java libraries, they can actually aggravate
> startup time issues. Now, not only would all of JRuby have to warm up,
> but the eventual native code JIT has to warm up too. This is not
> surprising, really. It is possible to mitigate this by doing some form
> of AOT against Graal, but for every case I have seen the Truffle/Graal
> approach makes startup time much, much worse compared to just running
> atop JVM.
> Warmup time is also worsened significantly.
> The AST you create for Truffle must be heavily mutated while running
> in order to produce a specialized version of that AST. This must
> happen before the AST is eventually fed into Graal, which means you
> have a self-modifying interpreter spinning AST objects like mad while
> executing the early phases of your application. Compare to a dumb
> interpreter as in JRuby's old AST, where interpreting the AST produces
> no additional objects other than those necessary for execution of the
> code.
> The Truffle approach itself adds overhead too. Until optimized, the
> fully-reified frame objects, specialization markup (which triggers AST
> rewriting), deoptimization guards, and so on are all done manually
> against heap-level data structures. This is in addition to the
> JVM-level overhead of executing an AST (native frame-per-node, boxing
> and type-widening, poor inlining profile).
> Some amount of AOT *might* be applicable here, but the benefit of
> Truffle and Graal is lost in the AOT case if we're not getting
> real-world profile information. The Substrate VM has been brought up to
> aid startup and warmup too...but that direction produces a
> closed-world executable based on optimizing all code up front...not
> exactly what we're looking for in a general-purpose language runtime.
> 3. Limited concurrency
> The RubyTruffle runtime currently has to execute code under the
> watchful eye of a global lock. Yes, you read that right...RubyTruffle
> is single-threaded right now.
> I would like to know if there's a deeper reason for this, but the
> obvious shallow reason is that you can't have multiple threads
> executing at the same time if they're making thread-unsafe mutations
> to the executing code. This is similar to the major stumbling block
> for e.g. Pypy, which rewrites currently-executing assembly
> instructions at deopt/reopt safe points.
> I believe once the code has transitioned to native, you can execute
> that safely across threads...but this is opaque to your Truffle-based
> language, and it's unclear how you'd manage re-acquiring some sort of
> lock when transferring back to the interpreter.
> The fact that concurrency has so far been hand-waved (or so it seems
> to me from the outside) scares the living hell out of me, especially
> when there's talk about rolling this stuff into Java 9.
> Obviously some of this could be mitigated with an immutable AST
> structure or other thread-friendly tree-transformation algorithm, but
> making the Truffle AST thread-safe may also make it even more
> object-heavy during interpretation, aggravating startup time further.
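One way to picture the "immutable AST" mitigation mentioned above is the sketch below, in plain Java rather than Truffle. All names (`Root`, `fold`, `Const`, `Arg`, `Add`) are hypothetical. Nodes have only final fields, a rewrite builds a new tree instead of mutating the old one, and the new tree is published with a compare-and-set so that concurrent readers always see a complete tree.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of thread-friendly tree rewriting; this is not how Truffle works.
final class ImmutableAstSketch {

    /** Nodes are immutable, so a published tree can be executed without locks. */
    interface Node {
        int execute(int arg);
    }

    static final class Const implements Node {
        final int value;

        Const(int value) {
            this.value = value;
        }

        @Override
        public int execute(int arg) {
            return value;
        }
    }

    static final class Arg implements Node {
        @Override
        public int execute(int arg) {
            return arg;
        }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;

        Add(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public int execute(int arg) {
            return left.execute(arg) + right.execute(arg);
        }
    }

    /** Folds Add(Const, Const) into a Const; returns the same object if nothing changed. */
    static Node fold(Node node) {
        if (!(node instanceof Add)) {
            return node;
        }
        Add add = (Add) node;
        Node left = fold(add.left);
        Node right = fold(add.right);
        if (left instanceof Const && right instanceof Const) {
            return new Const(((Const) left).value + ((Const) right).value);
        }
        return (left == add.left && right == add.right) ? add : new Add(left, right);
    }

    static final class Root {
        private final AtomicReference<Node> body;

        Root(Node body) {
            this.body = new AtomicReference<>(body);
        }

        int execute(int arg) {
            return body.get().execute(arg);
        }

        /** Publishes a rewritten tree; returns false if nothing changed or another thread won. */
        boolean rewrite() {
            Node old = body.get();
            Node next = fold(old);
            return next != old && body.compareAndSet(old, next);
        }
    }

    public static void main(String[] args) {
        Root root = new Root(new Add(new Arg(), new Add(new Const(2), new Const(3))));
        System.out.println(root.execute(1));
        System.out.println(root.rewrite());
        System.out.println(root.execute(1));
    }
}
```

The trade-off raised above is visible even at this size: every rewrite allocates a fresh spine of nodes from the changed node up to the root, where an in-place mutation would have allocated one.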
> 4. Limited availability
> This is the chicken-and-egg issue. Truffle is just a library, so we
> can ignore that for the moment (given any JVM, you can run a Truffle
> language).
> Graal is required for Truffle to perform well at all. The Truffle
> interpreter is without a doubt the slowest interpreter we've ever had
> for JRuby, and that's saying something (there could be startup/warmup
> effects in play here too). In order for us to go 100% Truffle, we'd
> need a Graal VM. That limits us to either pre-release or hand-made
> builds of Graal/OpenJDK. Even if Graal somehow did get into Java 9,
> we'd still have legions of users on 8, 7, ... even 6 in some cases,
> though we're probably leaving them behind with JRuby 9000. Ignoring
> other platforms (non-OpenJDK, Android) and assuming Graal in Java 9,
> I'd conservatively estimate JRuby could still not go 100% Truffle
> until 2017 or later.
> And it gets worse. Graal will probably never exist on other JVMs.
> Graal will probably never exist in an Android VM. Graal may not even
> be available in other non-Oracle OpenJDK derivatives for a very long
> time. We have users on dozens of different platform/JVM combinations,
> so there's really no practical way for us to abandon our JVM bytecode
> runtimes in the near future.
> Now of course if Graal became essential to users, it would be
> available in more places. We recognize the potential of Truffle and
> Graal, which is why we've been thrilled to work with Oracle on a
> RubyTruffle that's part of JRuby. We also recognize that the
> Truffle/Graal approach has some very compelling features for our
> users, and that our users may often be comfortable running custom
> JVMs. We're allowing all flowers to bloom and our users will pick the
> ones that work for them.
> 5. Unclear benefits for real-world applications
> There have been many published microbenchmarks for Truffle-based
> languages, but very few benchmarks of real-world applications
> performing significantly better than custom-made VMs (JS versus V8).
> There have been practically no studies of a Truffle-based language
> running a large application for a long period of time...and by long I
> mean server-scale.
> Chris Seaton has pushed this forward recently for Ruby, getting
> general-purpose, numeric-heavy libraries to run and optimize very well
> (a png library and a psd library). Going deeper requires having more
> of the language's standard libraries to be available, and I believe
> this is where Chris has spent much of his time (RubyTruffle currently
> requires mostly-custom versions of JRuby's core classes...versions
> that Truffle can recognize, specialize, and escape-analyze away).
> * Conclusion
> I again want to emphasize that we think Truffle and Graal are really
> awesome technology. I spent years with my nose smooshed against the
> glass, watching the Pypy guys add optimizations I wanted and make good
> on their promise of "just implement an interpreter...we'll do the
> rest". Finally we have what I wanted: a Pypy for JVM (in Truffle) and
> an LLVM for JVM (in Graal). These are exciting times indeed.
> But reality steps in. There's a long road ahead.
> I think we need to separate the questions about Truffle from questions
> about Graal. Truffle is ultimately just a library that uses Graal.
> Graal is promising JIT technology. Graal is simpler than C2 and may be
> able to match or beat its performance. Graal provides a better way to
> communicate intent to the JIT. These facts are not in question.
> However, Graal is not (other than when used as the JVM's JIT) a JVM.
> Targeting Graal directly acts against the promise of a standard,
> platform-and-VM-agnostic bytecode -- and that's the promise that
> brought most of us here. Graal is not yet ready to replace C2, which
> would mean adding to the size and complexity of Java 9. And Graal is
> almost completely untested in large production settings.
> I personally would love to see Graal get into a Java release soon as
> an experimental feature, but Java 9 seems ambitious by any standard.
> It *might* be possible/reasonable to include Graal as experimental in
> 9. Java 10 is certainly feasible for experimental, and may be feasible
> for product. But even if Graal got into mainstream OpenJDK and Java,
> there's a very long adoption tail ahead.
> I'd like to hear more from folks on the Graal and Truffle teams. Prove
> me wrong :-)
> - Charlie