Type Hierarchies and Guards in Truffle Languages

christian.humer at gmail.com
Fri Jan 8 12:42:09 UTC 2016


>Hello Stefan.
>>  Hope such posts help.
>Your post could be even more useful if included in the Javadoc. Do you 
>think <h3>Choosing Your Guards Wisely</h3> at 
>would be a better place for a slightly reworded version of this text? If so, 
>feel free to turn it into a patch.
I don't think Specialization#guards is the right place. I think 
Specialization#guards should focus on semantics and minimal examples 
instead of big examples and their performance implications like this.
I think it would be a better fit in 

>In any case, thanks for sharing your experience.
>### Monday 04 of January 2016, 12:20:00 @ Stefan Marr ###
>>  Hi:
>>  In an effort to post a little more here on the list and discuss 
>>  relevant questions for Truffle language implementers, I wanted to 
>>  report on some changes in SOMns.
>>  Over the last couple of days, I refactored the main message dispatch 
>>  in SOMns. As in Self and Newspeak, all interactions with objects are
>>  message sends. Thus, field accesses as well as method invocation are
>>  essentially the same. This means that message sending is a key to 
>>  performance.
>>  In my previous design, I structured the dispatch chain in a way that, 
>>  I thought, would reduce the necessary runtime checks.
>>  My ‘naive’ design essentially distinguished two different cases.
>>  The first case was when the receiver was a standard Java object, for 
>>  instance a boxed primitive such as a long or a double, or another Java 
>>  object that is used directly. The second case was objects from my own 
>>  hierarchy of Smalltalk objects.
>>  The hierarchy is a little more involved. It includes an abstract 
>>  class, a class for objects that have a Smalltalk class 
>>  (`SObjectWithClass`), a class for objects without fields, and a class 
>>  for objects with fields, which is again subclassed by classes for 
>>  mutable and immutable objects. There are still a few more details to 
>>  it, but I think you get the idea.
>>  So, with that, I thought, let’s structure the dispatch chain like 
>>  this, starting with a message send node as its root:
>>  MsgSend
>>   -> JavaRcvr -> JavaRcvr -> CheckIsSOMObject
>>        -> SOMRcvr -> SOMRcvr -> UninitializedSOMRcvr
>>       \-> UninitializedJavaRcvr
>>  This represents a dispatch chain for a message send site that has 
>>  seen four different receivers: two primitive types and two Smalltalk 
>>  types. This could be the case, for instance, for the polymorphic ‘+’ 
>>  message.
>>  The main idea was to split the chain in two parts so that I avoid 
>>  checking for the SOM object more than once, and can then just cast the 
>>  receiver to `SObjectWithClass` in the second part of the chain to be 
>>  able to read the Smalltalk class from it.
>>  Now it turns out, this is not the best idea.
>>  The main problem is that `SObjectWithClass` is not a leaf class in my
>>  hierarchy. This means that, at runtime, the check, i.e., the guard for
>>  `SObjectWithClass`, is pretty expensive. When I looked at the 
>>  compilation in IGV, I saw many `instanceof` checks that could not be 
>>  removed and resulted in a runtime traversal of the class hierarchy to 
>>  confirm that a concrete class was indeed a subclass of 
>>  `SObjectWithClass`.
>>  In order to avoid these expensive checks, I refactored the dispatch 
>>  nodes to extract the guard into its own node [1] that does only the 
>>  minimal amount of work for each specific case. It only ever checks for 
>>  the leaf class of my hierarchy that is expected for a specific 
>>  receiver.
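A minimal sketch of the difference between the two kinds of guard, in plain Java without the Truffle DSL. Apart from `SObjectWithClass`, every class and method name here is a hypothetical stand-in for the hierarchy described above, not the actual SOMns code. The point is only that a guard on a final leaf class can be expressed as a single class-identity comparison, while a guard on a non-leaf class is a subtype test.

```java
public class LeafGuardSketch {
    // Hypothetical stand-ins for the hierarchy described in the post.
    static abstract class SAbstractObject {}
    static abstract class SObjectWithClass extends SAbstractObject {}
    static final class SObjectWithoutFields extends SObjectWithClass {}
    static abstract class SObject extends SObjectWithClass {}
    static final class SMutableObject extends SObject {}
    static final class SImmutableObject extends SObject {}

    // Non-leaf guard: a subtype test. The concrete class of rcvr may be
    // any subclass, so the check cannot be a plain identity comparison.
    static boolean isSomObject(Object rcvr) {
        return rcvr instanceof SObjectWithClass;
    }

    // Leaf guard: compares the receiver's exact class with the one
    // expected at this send site.
    static boolean isExactly(Object rcvr, Class<?> expected) {
        return rcvr != null && rcvr.getClass() == expected;
    }

    public static void main(String[] args) {
        Object rcvr = new SMutableObject();
        System.out.println(isSomObject(rcvr));
        System.out.println(isExactly(rcvr, SMutableObject.class));
        System.out.println(isExactly(rcvr, SImmutableObject.class));
    }
}
```

Whether a compiler removes or simplifies either check depends on what it can prove about the receiver, so the graph in IGV remains the thing to look at.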
>>  This also means that the new dispatch chain is not separated into 
>>  parts anymore, as it was before. Instead, the nodes are simply added 
>>  in the order in which the different receiver types are observed over 
>>  time.
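To illustrate a chain that simply grows in observation order, here is a small plain-Java sketch. It is an assumption-laden model, not the SOMns implementation: `DispatchChainSketch`, `Entry`, and `lookup` are hypothetical names, a list stands in for the linked dispatch nodes, and receivers are assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class DispatchChainSketch {
    // One cached case: an exact-class guard plus the target to invoke.
    static final class Entry {
        final Class<?> expected;
        final Function<Object, Object> target;

        Entry(Class<?> expected, Function<Object, Object> target) {
            this.expected = expected;
            this.target = target;
        }
    }

    // Entries are kept in the order in which receiver classes were seen.
    private final List<Entry> chain = new ArrayList<>();

    Object send(Object rcvr) {
        for (Entry e : chain) {
            if (rcvr.getClass() == e.expected) { // exact-class guard
                return e.target.apply(rcvr);
            }
        }
        // "Uninitialized" case: look up a target and append a new entry.
        Entry added = new Entry(rcvr.getClass(), lookup(rcvr.getClass()));
        chain.add(added);
        return added.target.apply(rcvr);
    }

    // Hypothetical method lookup; a real one would consult the class.
    static Function<Object, Object> lookup(Class<?> cls) {
        return rcvr -> cls.getSimpleName() + ":" + rcvr;
    }

    List<Class<?>> observedOrder() {
        List<Class<?>> order = new ArrayList<>();
        for (Entry e : chain) {
            order.add(e.expected);
        }
        return order;
    }

    public static void main(String[] args) {
        DispatchChainSketch site = new DispatchChainSketch();
        site.send(1L);
        site.send("a");
        site.send(2.0);
        System.out.println(site.observedOrder());
    }
}
```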
>>  Overall, the performance impact is rather large. On the Richards
>>  benchmark I saw a gain of 10%, and on DeltaBlue about 20% [3]. 
>>  Unfortunately, the refactoring [3] also changed a few other details 
>>  besides the changes related to `instanceof` and casts. It also made 
>>  the guards for objects with fields depend on the object layout instead 
>>  of the class, which avoids having multiple guards for essentially the 
>>  same constraint further down the road.
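A rough sketch of what a layout-based guard could look like, again in plain Java. All names (`ObjectLayout`, `SObj`, `CachedFieldRead`) are hypothetical and the layout is reduced to a field count; the sketch only shows how one identity check on the layout can cover both "is this the expected kind of object" and "is the field at the cached index".

```java
public class LayoutGuardSketch {
    // Hypothetical layout descriptor shared by objects of the same shape.
    static final class ObjectLayout {
        final int numFields;

        ObjectLayout(int numFields) {
            this.numFields = numFields;
        }
    }

    // Hypothetical object with fields; final, so it is a leaf class.
    static final class SObj {
        final ObjectLayout layout;
        final Object[] fields;

        SObj(ObjectLayout layout) {
            this.layout = layout;
            this.fields = new Object[layout.numFields];
        }
    }

    // A cached field read guarded by layout identity rather than by class.
    static final class CachedFieldRead {
        final ObjectLayout expected;
        final int index;

        CachedFieldRead(ObjectLayout expected, int index) {
            this.expected = expected;
            this.index = index;
        }

        boolean guard(Object rcvr) {
            return rcvr instanceof SObj && ((SObj) rcvr).layout == expected;
        }

        // Only valid after guard(rcvr) returned true.
        Object read(Object rcvr) {
            return ((SObj) rcvr).fields[index];
        }
    }

    public static void main(String[] args) {
        ObjectLayout layout = new ObjectLayout(2);
        SObj obj = new SObj(layout);
        obj.fields[1] = "value";
        CachedFieldRead read = new CachedFieldRead(layout, 1);
        if (read.guard(obj)) {
            System.out.println(read.read(obj));
        }
    }
}
```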
>>  So, the main take-away here is that the choice of guard types can 
>>  have a major performance impact. I also had a couple of other 
>>  nodes that were using non-leaf classes, for instance like this:
>>  `@Specialization public Object doSOMObject(SObjectWithClass rcvr)`
>>  This looks inconspicuous at first, but fixing those and a few others 
>>  resulted in overall runtime reduction on multiple benchmarks between 
>>  and 30%.
>>  A good way to find these issues is to check in IGV whether `instanceof` 
>>  or checked-cast snippets are inlined and not completely removed. Often they 
>>  are already visible in the list of phases when the snippets are 
>>  Another way to identify them is to use the Graal option
>>  `-Dgraal.option.TraceTrufflePerformanceWarnings=true` (I guess that 
>>  would be `-G:+TraceTrufflePerformanceWarnings` when mx is used). The 
>>  output lists the specific non-leaf node checks that have been found in 
>>  the graph. Not all of them are critical, because they can be removed 
>>  by later phases. To check that, you can use the id of the node from 
>>  the output and search for it in the corresponding IGV graph, using for 
>>  instance `id=3235` in the search field.
>>  Hope such posts help.
>>  Best regards
>>  Stefan
>>  [1]
>>  ch/DispatchGuard.java [2]
>>  760 [3]

More information about the graal-dev mailing list