Yield as contextual keyword

Brian Goetz brian.goetz at oracle.com
Fri May 24 14:34:33 UTC 2019

var yield = 5;

yield is lexed as an identifier, so this is a valid variable declaration

  var res = switch(yield) {

yield is lexed as an identifier, so this is a valid switch operand

     default -> yield + yield;

RHS of a single-consequence case is expression | block statement | 
throw.  We're not in a YieldStatement production, so this is an expression.

Like lambdas, we can think of this as shorthand for

     default -> {
         yield yield + yield;

which parses as a yield statement whose operand is the expression
yield + yield.
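
To make the two readings concrete, here is a minimal sketch that wraps both forms in methods. It assumes Java 14 or later, where switch expressions are a standard feature; the class and method names (YieldAsIdentifier, arrowForm, blockForm) are made up for illustration and do not come from the thread.

```java
// Hypothetical demo class; assumes Java 14 or later.
final class YieldAsIdentifier {

    // Arrow form: the right-hand side is parsed as an expression,
    // so both occurrences of `yield` are the local variable.
    static int arrowForm() {
        var yield = 5;
        return switch (yield) {
            default -> yield + yield;
        };
    }

    // Block form: the first `yield` begins a yield statement and
    // the remaining `yield + yield` is its operand expression.
    static int blockForm() {
        var yield = 5;
        return switch (yield) {
            default -> {
                yield yield + yield;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(arrowForm()); // prints 10
        System.out.println(blockForm()); // prints 10
    }
}
```

Both methods produce the binary-plus result, which matches the reading that the arrow form is shorthand for the block form.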

> // are we
> returning result of binary plus (10) or yielding result of unary plus
> (5)? Seems the first one, yet confusing.
> Tagir.
> On Fri, May 24, 2019 at 4:30 AM Dan Smith <daniel.smith at oracle.com> wrote:
>>> On May 22, 2019, at 9:45 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>> The “compromise” strategy is like the smart strategy, except that it trades fixed lookahead for missing a few more method invocation cases.  Here, we look at the tokens that follow the identifier yield, and use those to determine whether to classify yield as a keyword or identifier.  (We’d choose identifier if the next token is an assignment op (=, +=, etc.), left-bracket, dot, or one of a few others, plus a few two-token sequences (e.g., ++ followed by a semicolon), which makes this lookahead(2).)
>>> The compromise strategy misses some cases we could parse unambiguously, but also offers a simpler user model: always qualify invocations of methods called yield when used as expression statements.  And it offers the better lookup behavior, which will make life easier for IDEs.
>> There's still some space for different design choices within the compromise strategy: what happens to names in contexts *other than* the start of a statement?
>> I think it's really helpful to split the question into three parts: variable names, type names, and method names.
>> 1) Variable names: we've established that, with a fixed lookahead, every legal use of the variable name 'yield' can be properly interpreted. Great.
>> 2) Type names: 'yield' might be used as the name of a class, type of a method parameter, type of a field, array component type, type of a 'final' local variable, etc. Or we can prohibit it entirely as a type name.
>> We went through this when designing 'var', and settled on the more restrictive position: you can't declare classes/interfaces/type vars or make reference to types with name 'var', regardless of context. That way, there's no risk of confusion between subtly different programs—wherever you see 'var' used as a type, you know it can only mean the keyword.
>> I think it's best to treat 'yield' like 'var' in this case.
>> 3) Method names: 'yield(' at the start of a statement means YieldStatement, but what about other contexts in which method invocations can appear?
>> Example:
>> var v = switch (x) {
>>      case 1 -> yield(x); // method call?
>>      default -> { yield(x); } // no-op, produces x (oops!)
>> };
>> Fortunately, the different normal-completion behavior of a method call and a yield statement will probably catch most errors of this form—when I type the braces above, I'll probably also try adding a statement after the attempted 'yield' call, and the compiler will complain that the statement is unreachable. But it's all very subtle (not to mention painful for IDEs).
>> Taking inspiration from the treatment of type names, my preference here is to make a blanket restriction that's easy to visualize: an *unqualified* method invocation must not use the name 'yield'. Context is irrelevant. The workaround is always to add a qualifier.
>> (If, in the future, we introduce local methods or something similar that can't be qualified, we should not allow such methods to be named 'yield'.)
>> ---
>> Are people generally good with my preferred restrictions, or do you think it's better to be more permissive?
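
As an illustration of the qualifier workaround for method names discussed in the quoted message, the sketch below puts a qualified call and a yield statement side by side. It assumes Java 14 or later; QualifiedYieldCall, its yield(int) method, and describe are hypothetical names chosen for this example only.

```java
// Hypothetical demo class; assumes Java 14 or later.
final class QualifiedYieldCall {

    // A method that happens to be named `yield` (illustrative only).
    static int yield(int value) {
        return value * 100;
    }

    static int describe(int x) {
        return switch (x) {
            // Qualified with the class name, so this can only be a method invocation.
            case 1 -> QualifiedYieldCall.yield(x);
            // No qualifier and no parentheses: a yield statement that produces x.
            default -> {
                yield x;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(1)); // prints 100
        System.out.println(describe(2)); // prints 2
    }
}
```

With the qualifier always present on method calls, a reader can tell the two cases apart without knowing the surrounding context.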
