Variants/case classes/algebraic data types/sums/oh my!

Remi Forax forax at univ-mlv.fr
Sat Jun 11 11:18:40 UTC 2016


In my opinion, you can breakdown the support of ADT into 3 parts,
- definition of a case class
- definition of a closed type
- pattern matching

The first two parts are not very interesting, at least not from the VM point of view, the pattern matching part is in my opinion, the one that is fun. 
Also note that there are two different semantics for the pattern matching, you have the Scala way, were each case is tested linearly so if a test do a side effect, a further test can see the effect (this semantics is also called predicate dispatching [1]) or you have the "switch on class" semantics where you jump to the right action.

At runtime, the best way to implement the later semantics nowadays is to implement a kind of dynamic visitor (a hashmap of lambda),
here is a way to implement it in Java [2].

Now, what can be fun in the context of valhalla is to consider the way to define the closed class 'hierarchy' as a kind of specialization,
as John said, switching on an integer or an enum value seems the most promising. So an ADT can be seen as a specialization over an integer.

interface Expr<int K> {
}
value class Value implements Expr<0> {
  final int value;
}
value class Add implements Expr<1> {
  final Expr<?> left;
  final Expr<?> right;
}

static <int K> int eval(Expr<K> expr) {
  switch(K) {
     case 0:
       return ((Value)expr).value;
     case 1:
       Add add = (Add)expr;
       return eval(add.left) + eval(add.right);
     default:
       throw new AssertionError();
  }
}

because K is reified, there will be two methods eval() at runtime, one specialized for K = 0 and an other for K = 1,
which means that the switch will be constant fold (the real 'switch' is done when selecting the specialization).

Obviously, one can come with a nice syntax sugar on top of that to avoid users having to number things (enum are better here because they are an indirection on an integer which is better for separate compilation) and insert the cast automatically.

cheers,
Rémi

 
[1] https://homes.cs.washington.edu/~mernst/pubs/dispatching-ecoop98-abstract.html
[2] https://github.com/forax/design-pattern-reloaded/blob/master/src/main/java/visitor/visitor6.java


----- Mail original -----
> De: "John Rose" <john.r.rose at oracle.com>
> À: "org openjdk" <org.openjdk at io7m.com>
> Cc: valhalla-dev at openjdk.java.net
> Envoyé: Samedi 11 Juin 2016 04:17:35
> Objet: Re: Variants/case classes/algebraic data types/sums/oh my!
> 
> On Jun 10, 2016, at 12:58 PM, org.openjdk at io7m.com wrote:
> > 
> > On 2016-06-07T14:05:47 +0000
> > <org.openjdk at io7m.com> wrote:
> >> 
> >> I recently wrote a very rough and low-detail proposal in a GitHub
> >> comment for a project that is looking to implement yet another
> >> generator for pseudo algebraic data types in Java:
> >> 
> > 
> > Tough crowd. Is noone interested in this?
> > 
> > To be clear, I was looking to try to start implementing something
> > myself, and was hoping for discussion on what exactly should be
> > implemented. I've got a fairly clear idea of the semantics of what I'm
> > looking for, but suspected that there'd be overlap with the existing
> > valhalla work (hence the post to this list).
> > 
> > Is there a different list I should be approaching first?
> 
> We are concentrating on stuff which is probably a prerequisite to doing a
> good job with ADTs.
> 
> I like your review (in adt-jvm) of existing attempts to do them on the JVM.
> 
> To me as a JVM geek the most interesting paragraph was this:
> 
> > The library implementations of algebraic types in Java all suffer from the
> > same weaknesses due to lack of support two concepts in Java: The ability
> > to make a set of classes closed, and the ability to efficiently select one
> > class from a closed set in O(1) time and without allocating memory.
> 
> That gives me an idea about how the JVM enhancements we are talking about
> might play out when ADTs are added.
> 
> For example, the proposed nestmate mechanism can be used to create small
> sealed class hierarchies.  You make the related classes nestmates, and mark
> their constructors private.  We can also do things to seal interfaces, if we
> need to.  It's all a question of effort, priority, and resource.
> 
> The O(1) thing is tricky to evaluate.  I think the folks who use enums as
> kind-tags are on the right path there, since enums allow both separate
> compilation and efficient (O(1)) switching.
> 
> Elsewhere (note 4) you mention heap allocation as a design constraint, and
> it's a big one for Java API designs, and even bigger for language feature
> design.  The problem of allocation is one reason we are working on value
> types first, because successful value types will allow programmers and
> library designers to say what they mean with values and not worry about GC
> load.
> 
> In general, Valhalla is about reducing the number of pointers and objects, in
> favor of richer APIs involving primitives and inlined structures (values).
> 
> I think a successful ADT mechanism for Java is a long way off, but I'm glad
> you and others are thinking about it.  Please do think about how it will
> look on top of Valhalla.
> 
> Also, do keep thinking about specific, low-level, generic JVM mechanisms that
> could support ADTs.  By "low-level" I mean defined in terms of JVM
> operations, not language features (of any language), and by "generic" I mean
> JVM facilities that are logically complete and consistent without reference
> to a particular application (such as ADTs).  For example, sealed JVM types
> should be definable without reference to ADTs, yet be useful for ADTs and
> several other use cases.
> 
> I hope this helps!
> 
> — John


More information about the valhalla-dev mailing list