Two new draft pattern matching JEPs
guy.steele at oracle.com
Fri Mar 5 22:41:30 UTC 2021
> On Mar 3, 2021, at 7:50 AM, Remi Forax <forax at univ-mlv.fr> wrote:
> ----- Original Message -----
>> From: "Gavin Bierman" <gavin.bierman at oracle.com>
>> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>> Sent: Thursday, February 18, 2021 13:33:20
>> Subject: Two new draft pattern matching JEPs
>> Dear all,
>> - Pattern Matching for switch: https://bugs.openjdk.java.net/browse/JDK-8213076
>> We split them up to try to keep the complexity down, but we might decide to
>> merge them into a single JEP. Let me know what you think.
> I think that we have gotten a little in over our heads with the idea of replacing the switch guard with the guard pattern + conditional-and pattern.
> The draft offers little explanation of why we should do that, apart from it being more powerful, which is true, but that is not how to evaluate a feature.
> Here, it doesn't seem we are trying to fix a broken feature or adapt an existing feature to Java. It's just more powerful, but with a lot of drawbacks, see below.
> My main concern is with mixing the deconstructing pattern with the guard + and pattern; those two (two and a half) don't mix well.
> For starters, at a high level, the idea is to mix patterns and expressions (guards are boolean expressions), but at the same time, we have discussed several times not allowing constants inside patterns, to make a clear distinction between patterns and expressions. We have an inconsistency here.
> The traditional approach for guards cleanly separates the pattern part from the expression part
> case Rectangle(Point x, Point y) if x > 0 && y > 0
> which makes far more sense IMO.
> The current proposal allows
> case Rectangle(Point x & true(x > 0), Point y & true(y > 0))
> which is IMO far less readable because the clean separation between the patterns and the expressions is missing.
As I have already indicated in another email, if Rectangle had six or eight components rather than just two, for some purposes it might be more readable to have the constraint for each component listed next to its binding, rather than making the reader compare a long list of bindings to a long list of constraints.
> There is also a mismatch in terms of evaluation: an expression is evaluated from left to right, but for a pattern, you have bindings, and bindings are all populated at the same time by a deconstructor. This may cause issues; for example, this is legal in terms of execution
> case Rectangle(Point x & true(x > 0 && y > 0), Point y)
> because at the point where the pattern true(...) is evaluated, the Rectangle has already been destructured. Obviously, we can ban this kind of pattern to try to preserve left-to-right evaluation, but it will still leak in a debugger: you have access to the value of 'y' before the expression inside true() is evaluated.
I would like to question your assertion that "bindings are all populated at the same time by a deconstructor."
Is this really necessarily true? I would have thought that the job of the deconstructor is to provide the values for the bindings, and that in principle the values are then kept in anonymous variables or locations while the subpatterns are processed, one by one, from left to right. Consider a more complex pattern:
case Rectangle(Point(int x1, int y1), Point(int x2, int y2))
I would expect that the deconstructor for Rectangle does not fill in all four variables x1, y1, x2, y2 at once; rather, it just supplies two values that are points, and then the first point value is matched against pattern Point(int x1, int y1), and only then is the second point value matched against pattern Point(int x2, int y2).
Now this example is not exactly analogous to your original, because we have not provided explicit variables for this purpose. I believe that in an earlier version of the design one would write
case Rectangle(Point(int x1, int y1), Point(int x2, int y2))
But perhaps in the current proposal one must write one of the following:
case Rectangle(Point(int x1, int y1) & var p1, Point(int x2, int y2) & var p2)
case Rectangle(var p1 & Point(int x1, int y1), var p2 & Point(int x2, int y2))
In all of these cases, my argument is still the same: the simplest model is that the deconstructor for Rectangle just supplies two values that are points, and then the first point value is matched against the first subpattern, and only then is the second point value matched against the second subpattern. As a result, p2 and x2 and y2 do not yet have bindings or values while the first subpattern is being matched.
A compiler would likely optimize common special cases to effectively implement all-at-once population of bindings when it would be impossible to detect any difference in behavior. But I don’t think all-at-once population is the right theoretical model.
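To make the left-to-right model concrete, here is a minimal hand-written sketch in plain Java of how matching Rectangle(Point(int x1, int y1), Point(int x2, int y2)) could proceed under that model. It is only an illustration, not what a compiler would actually emit: the classes Point and Rectangle, their field names, and the match method with its trace are all assumptions invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the Point and Rectangle types used in the thread.
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

final class Rectangle {
    final Point upperLeft, lowerRight;
    Rectangle(Point upperLeft, Point lowerRight) {
        this.upperLeft = upperLeft;
        this.lowerRight = lowerRight;
    }
}

class LeftToRightSketch {
    // Models matching Rectangle(Point(int x1, int y1), Point(int x2, int y2)).
    // The returned trace records the order in which values become available.
    static List<String> match(Object o) {
        List<String> trace = new ArrayList<>();
        if (!(o instanceof Rectangle)) return trace;   // outer pattern does not match
        Rectangle r = (Rectangle) o;

        // Step 1: the "deconstructor" only supplies two component values,
        // held in anonymous temporaries. No named binding exists yet.
        Object tmp1 = r.upperLeft;
        Object tmp2 = r.lowerRight;
        trace.add("components extracted");

        // Step 2: match the first subpattern; only now do x1 and y1 get values.
        if (!(tmp1 instanceof Point)) return trace;
        Point first = (Point) tmp1;
        int x1 = first.x, y1 = first.y;
        trace.add("x1=" + x1 + ", y1=" + y1);

        // Step 3: match the second subpattern; x2 and y2 are bound last.
        if (!(tmp2 instanceof Point)) return trace;
        Point second = (Point) tmp2;
        int x2 = second.x, y2 = second.y;
        trace.add("x2=" + x2 + ", y2=" + y2);
        return trace;
    }

    public static void main(String[] args) {
        Object shape = new Rectangle(new Point(1, 2), new Point(3, 4));
        match(shape).forEach(System.out::println);
    }
}
```

In this sketch a constraint attached to the first subpattern would run between steps 2 and 3, where x2 and y2 are not yet in scope, which is the behavior the sequential model predicts.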
> In terms of syntax, the parentheses '(' and ')' are currently used to define/destructure the inside, either by a deconstructor or by a named pattern, but in both cases the idea is that they describe the inside. Here, true() and false() don't follow that idea; they are an escape mode to switch from the pattern world into the expression world.
> At least we can use different characters for that.
> Also in terms of syntax, introducing '&' between patterns overloads the operator '&' with yet another meaning; my students already have trouble distinguishing between & and && in expressions.
> As I already said earlier, and this is also said in the Python design document, we don't really need an explicit 'and' operator in between patterns because there is already an implicit 'and' between the sub-patterns of a deconstructing pattern.
As I have already indicated in another email, I agree with you here; I very much share your concerns about the overloading of a single symbol `&` (or whatever spelling we give it) to mean two or three different things within patterns, not to mention its existing uses in expression contexts.
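For reference, the existing expression-context meanings that a pattern-level `&` would sit alongside can be shown in a few lines of Java. This is only a small sketch of standard expression semantics; the class and method names are made up for the example, and nothing here uses the proposed pattern syntax.

```java
class AmpersandSketch {
    static int calls = 0;

    // Counts how many times it is evaluated, so the difference is observable.
    static boolean positive(int v) {
        calls++;
        return v > 0;
    }

    // '&' on booleans always evaluates both operands.
    static int callsWithAnd(int x, int y) {
        calls = 0;
        boolean unused = positive(x) & positive(y);
        return calls;
    }

    // '&&' skips the right operand when the left one is false.
    static int callsWithAndAnd(int x, int y) {
        calls = 0;
        boolean unused = positive(x) && positive(y);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithAnd(-1, 5));    // both operands evaluated
        System.out.println(callsWithAndAnd(-1, 5)); // right operand skipped
        System.out.println(6 & 3);                  // '&' on ints is bitwise AND
    }
}
```

So `&` already means both non-short-circuit logical AND and bitwise AND in expressions, before any pattern-level meaning is added.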
> Finally, there is also an issue with the lack of familiarity. When we designed lambdas, we took great care to have a syntax similar to the C#, JS, and Scala syntax. The concept of guards is well known; to introduce a competing feature in terms of syntax and semantics, the bar has to be set very high, because we are forcing people to learn a Java-specific syntax not seen in any other mainstream language*.
> For me, the cons far outweigh the pro(s) here, but perhaps I've missed something?
>> Draft language specs are under way - I will announce those as soon as they are
>> Comments welcome as always!
> * Java does not invent things; it merely steals ideas from others and makes them its own in a coherent way