[patterns] Destructuring without matching?

John Rose john.r.rose at oracle.com
Mon Jul 10 00:19:42 UTC 2017

On Jul 8, 2017, at 7:22 PM, Tagir Valeev <amaembo at gmail.com> wrote:
> Yes, I was thinking about separate operator / syntax construct like 'let'
> in Remi example. I don't propose changing 'matching' behavior: when type is
> specified, it should be consistent with 'instanceof' which doesn't match
> null.

Whether a match is "soft" or "hard", that is "conditional" or "unconditional",
is one of the many design parameters we are working with.  The two cases
correspond, roughly, to "instanceof" vs. "checkcast" or "unbox" operations.

The most productive case to work with is the "soft" one, because we can
always ask the user to handle failure explicitly.  This lets us off the hook
from specifying a one-size-fits-all failure behavior.

Annoyingly, failure can come from a surprise NPE, as with (int)(Integer)null.
Even more annoyingly, our legacy "hard" destructuring operations have
a mixed response to nulls:  checkcast allows them, while unbox does not.
And a source-level cast can be either, or a mix of both, so go figure!
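That mixed legacy behavior is observable in plain Java today. A minimal sketch (class and method names are mine, purely illustrative):

```java
// Demonstrates the mixed null handling of legacy "hard" destructuring:
// a reference cast (checkcast) lets null slide through, while unboxing
// throws NullPointerException.
public class CastNullDemo {
    static boolean checkcastAcceptsNull() {
        Object o = null;
        Integer boxed = (Integer) o;   // checkcast: null passes quietly
        return boxed == null;
    }

    static boolean unboxRejectsNull() {
        try {
            Object o = null;
            int i = (int) (Integer) o; // unbox: the "surprise NPE"
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("checkcast accepts null: " + checkcastAcceptsNull());
        System.out.println("unbox rejects null:     " + unboxRejectsNull());
    }
}
```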

This is one more reason to stay on the "soft" side of the design as long
as we can, since there are usually reasonable grounds for claiming that
null is simply a value that doesn't match some pattern.  And "instanceof"
provides a helpful precedent for failing nulls from type-narrowing matches,
without throwing NPE or letting the nulls "leak" into the case logic.
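The "instanceof" precedent in a nutshell (class name illustrative):

```java
// instanceof is "soft" on null: the test is simply false, with no NPE
// and no null leaking past the check into the guarded code.
public class InstanceofNullDemo {
    static boolean matches(Object o) {
        return o instanceof String;
    }

    public static void main(String[] args) {
        System.out.println(matches("hello")); // true
        System.out.println(matches(null));    // false, not NPE
    }
}
```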

(IMO it's an important goal that match expressions provide a useful
sugar for the old instanceof/checkcast idiom, which requires users to
mention the same type twice.  This is another reason to lean on
"instanceof" as a precedent for matching against a type.)
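The idiom in question, with its double mention of the type (names illustrative):

```java
// The classic instanceof/checkcast idiom: the type CharSequence must be
// written twice, once to test and once to cast. Pattern matching aims
// to sugar away exactly this repetition.
public class IdiomDemo {
    static int lengthOrZero(Object o) {
        if (o instanceof CharSequence) {        // mention #1: the test
            CharSequence cs = (CharSequence) o; // mention #2: the cast
            return cs.length();
        }
        return 0;                               // nulls fall out here too
    }

    public static void main(String[] args) {
        System.out.println(lengthOrZero("abcd")); // 4
        System.out.println(lengthOrZero(42));     // 0
    }
}
```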

Languages which support "hard" matches (Haskell, e.g.) allow you to
write lambdas whose formal parameters deconstruct an actual argument.
Such lambdas are not total, and so must have permission to throw some
analog of ClassCastException or WrongMethodTypeException.
It might seem tempting to do these, but it doesn't scale unless you
can *also* provide a way to overload multiple lambda bodies, to
give the user control over multiple deconstruction cases.

So in the end I think an expression-switch, with a programmable
default, is the best way to handle "hard" matches on the fly.
Something like:

   foo( x -> exprswitch(x) {
      case Foo(var y, var z) -> lambdaBodyHere(y, z);
      case _ || null -> throw…; });

I don't see any particular reason to sugar such a thing down further
into a special purpose multi-body lambda syntax.

(Maybe the "throw…" needs some sugar, along the lines of NPE,
which is in some sense sugar for the code "if (r == null) throw NPE()"
before a dereference of "r".)

So, after this walk around the issues, it seems to me that all we
have to design are "soft" matching constructs, that are sweet enough
to let users build "hard" constructs on top of them.

— John

P.S. There is another legacy source of NPEs, which arises from "switch"
on a reference type (boxed int, enum, or string).  If you hand a null to such
a switch, you get an NPE.  This is a "hard" behavior in an otherwise "soft"
construct, since failing to match a case in a switch simply drops you to the
default label, or out the bottom.  Basically, the "default" label is "hard" with
respect to nulls, although it is permissive to every other value.  That's fine
as a heuristic for catching uninitialized references, etc., but it gets in the
way if you *expect* nulls as a possible value.  A generalized switch needs
to cover cases which *do expect* nulls as possible values, or it's one of
those awful 99% generalizations that lure you in just to cut you on the
1% sharp edge.
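The legacy behavior, runnable as-is (names illustrative):

```java
// A legacy switch on a reference type is "hard" with respect to null:
// handing it null throws NPE before any label is considered, even though
// every other unmatched value falls quietly to the default label.
public class SwitchNullDemo {
    static String classify(String s) {
        switch (s) {                 // null never reaches a label
            case "yes": return "affirmative";
            case "no":  return "negative";
            default:    return "other";
        }
    }

    static boolean nullThrows() {
        try {
            classify(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("yes"));   // affirmative
        System.out.println(classify("maybe")); // other
        System.out.println("null throws NPE: " + nullThrows());
    }
}
```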

So this would be a very bad precedent to follow in new design; we must
give a retroactive account (a "ret-con", a fandom term) of why it is that
switch has this behavior.  I think the right answer there is to say that switch
has a legacy behavior in which there is an implicit case added to every
switch (at least legacy ones) which looks like this:  "case null: throw NPE".
This lets us give the user a lifeline to escape the NPE:  If the user provides
an explicit case that matches a null, we can simply say that this case takes
precedence over the implicit one, rendering it dead code.

So if you really want to skip the NPE, you replace "default:" with the double
label "case null: default:".  Not pretty, but it gives continuity between past
and future semantics.

There's a fraught issue about how to deal with the degrees of freedom
between "default", "case _", "case var x", made much worse by the fact
that "default" by itself might cause legacy NPEs.  We are likely to disallow
bare "case _" since it is a confusing synonym for "default", but then we
also need a good way to defend against the NPEs, hence "case _ || null:",
which I wrote above as notation for a "softer default".  We need a story here.

I suppose it's likely that we can require totality on enhanced switches,
giving warnings if the ranges of the various cases don't cover the domain
of the switch value.  (NPE will likely get shuffled under the rug by such
logic, for compatibility and to keep nulls out of people's hair.)  If a switch,
especially an expr-switch, lacks a default, and provably doesn't cover
some (non-null) switch domain value, then we should issue a warning
or error, to help the user complete the logic.  To convert such a switch
to a "hard" switch, which "knows and throws" its incompleteness, we
will probably want to require an explicit "default" with a throw.  That's
the "throw…" above in the multi-body lambda.  I think requiring users
to opt in to "hard" switches is a good idea, since assuming switches
are "soft" will allow us to check totality.  But, for the opt-in to "hard"
switches, we might want to make it easy to make the throw happen.
Perhaps a bare "throw;", in a switch, could be made sugar for throwing a
"SwitchRangeException" or some such.
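The opt-in shape can be sketched with a switch expression whose totality is completed by an explicit throwing default; the exception type here is a stand-in I chose, since no "SwitchRangeException" sugar exists:

```java
// A "hard" switch opted into explicitly: the user "knows and throws"
// the incompleteness via a default, rather than relying on an implicit
// failure mode. The exception choice is illustrative.
public class HardSwitchDemo {
    static String name(int digit) {
        return switch (digit) {
            case 0 -> "zero";
            case 1 -> "one";
            case 2 -> "two";
            default -> throw new IllegalArgumentException(
                "no case for " + digit); // the explicit "throw…"
        };
    }

    static boolean outOfRangeThrows() {
        try {
            name(7);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(name(1)); // one
        System.out.println("out of range throws: " + outOfRangeThrows());
    }
}
```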