Switch translation

Brian Goetz brian.goetz at oracle.com
Fri Apr 6 17:58:24 UTC 2018

>>     int index=-1;
>>     switch (s.hashCode()) {
>>         case 12345: if (!s.equals("Hello")) break; index = 1; break;
>>         case 6789: if (!s.equals("World")) break; index = 0; break;
>>         default: index = -1;
>>     }
>>     switch (index) {
>>         case 0: ...
>>         case 1: ...
>>         default: ...
>>     }
>> If there are hash collisions between the strings, the first switch 
>> must try all possible matching strings.
> I see why you use this structure, because it fits a general paradigm 
> of first mapping to an integer.

Or, "used", since this is the historical strategy which we're tossing 
over for the indy-based one.
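
For readers who want to see the collision case concretely, here is a minimal hand-written sketch of that historical two-switch shape. It is not javac's actual output; the class and method names are made up for illustration, and the strings "Aa" and "BB" are chosen only because they share the hash code 2112.

```java
// Hypothetical illustration of the two-switch lowering; not compiler output.
class StringSwitchLowering {
    // Hand-lowered form of:
    //   switch (s) { case "Aa": ...; case "BB": ...; default: ... }
    static String classify(String s) {
        int index = -1;
        // First switch: map the string to a small integer via its hash code.
        switch (s.hashCode()) {
            case 2112: // "Aa" and "BB" collide, so try each candidate in turn
                if (s.equals("Aa")) index = 0;
                else if (s.equals("BB")) index = 1;
                break;
            default:
                break;
        }
        // Second switch: an ordinary int switch carrying the original case bodies.
        switch (index) {
            case 0: return "first";
            case 1: return "second";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("Aa")); // prints "first"
        System.out.println(classify("BB")); // prints "second"
        System.out.println(classify("Cc")); // prints "other"
    }
}
```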

> I now suggest that a post-optimization might then turn this into:
>   SUCCESS: {
>     DEFAULT: {
>       switch (s.hashCode()) {
>         case 12345: if (s.equals("Hello")) { stmts1; break SUCCESS; } 
> else if (s.equals("Goodbye")) { stmts3; break SUCCESS; } else break 

Yes; the thing that pushed us to this translation was fallthrough and 
other weird control flow; by lowering the string-switch to an 
int-switch, the control structure is preserved, so any complex control 
flow comes along "for free" by existing int-switch translation.  Of 
course, it's not free; we pay with a pre-switch. (When we added strings 
in switch, it was part of "Project Coin", whose mandate was "small 
features", so it was preferable at the time to choose a simpler but less 
efficient desugaring.)
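
To make the "for free" part concrete, here is a small sketch of a string switch with fallthrough, lowered by hand in the same style. The names are hypothetical and this is only an approximation of what the compiler emits; the point is that the second switch keeps the original case bodies in order, so fallthrough needs no special handling.

```java
// Hypothetical illustration; approximates the historical desugaring by hand.
class FallthroughLowering {
    // Hand-lowered form of:
    //   switch (s) {
    //       case "a": out.append("A");          // falls through
    //       case "b": out.append("B"); break;
    //       default:  out.append("?");
    //   }
    static String run(String s) {
        StringBuilder out = new StringBuilder();
        int index = -1;
        // Pre-switch: "a".hashCode() is 97 and "b".hashCode() is 98.
        switch (s.hashCode()) {
            case 97: if (s.equals("a")) index = 0; break;
            case 98: if (s.equals("b")) index = 1; break;
            default: break;
        }
        // The int switch has the same shape as the source switch,
        // so the fallthrough from case 0 into case 1 is preserved as-is.
        switch (index) {
            case 0: out.append("A"); // falls through
            case 1: out.append("B"); break;
            default: out.append("?");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("a")); // prints "AB"
        System.out.println(run("b")); // prints "B"
        System.out.println(run("c")); // prints "?"
    }
}
```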

>> #### Switches on enums
>> Switches on `enum` constants exploit the fact that enums have 
>> (usually dense) integral ordinal values. Unfortunately, because an 
>> ordinal value can change between compilation time and runtime, we 
>> cannot rely on this mapping directly, but instead need to do an extra 
>> layer of mapping.  Given a switch like:
>>     switch(color) {
>>         case RED: ...
>>         case GREEN: ...
>>     }
>> The compiler numbers the cases starting at 1 (as with string switch), 
>> and creates a synthetic class that maps the runtime values of the 
>> enum ordinals to the statically numbered cases:
> Inconsistency: in the string example above, you actually numbered the 
> cases 0 and 1, not 1 and 2.

The old way, where the compiler generated the transform table (Java 5 
and later), used 1-origin numbering, for the reason you surmise.  The 
new, indy-based translation uses 0-origin, like the String example.

> Presumably for this example the chosen integers start with 1 rather 
> than 0, so that if any element of the array is not explicitly 
> initialized by Outer$0, its default 0 value will not be confused with 
> an actual enum value.  This subtle point should be mentioned explicitly.

Yes, that's exactly why the historical approach did it this way. The new 
way (which is uniform with other indy-based switch types) takes care of 
this by pre-filling the array, at linkage time, with the index that 
indicates "default".  From SwitchBootstraps::enumSwitch:

             ordinalMap = new int[enumClass.getEnumConstants().length];
             Arrays.fill(ordinalMap, enumNames.length);

             for (int i=0; i<enumNames.length; i++) {
                 try {
                     ordinalMap[Enum.valueOf(enumClass, enumNames[i]).ordinal()] = i;
                 }
                 catch (Exception e) {
                     // allow non-existent labels, but never match them
                 }
             }

The catch-and-continue is so that if an enum constant disappears, we 
don't fail linkage of the classifier site; its arm just becomes dead 
code (which is fine, since the enum constant no longer exists).
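
Here is a self-contained sketch of the same idea that can be run outside the bootstrap. The helper name buildOrdinalMap and the Color enum are invented for the example and are not part of SwitchBootstraps; the sketch also catches only IllegalArgumentException, which is what Enum.valueOf throws for an unknown name.

```java
import java.util.Arrays;

// Hypothetical illustration; buildOrdinalMap is not a real SwitchBootstraps method.
class EnumOrdinalMapSketch {
    enum Color { RED, GREEN, BLUE }

    // Maps each runtime ordinal to the index of its case label,
    // or to enumNames.length when no label names that constant.
    static <E extends Enum<E>> int[] buildOrdinalMap(Class<E> enumClass,
                                                     String... enumNames) {
        int[] ordinalMap = new int[enumClass.getEnumConstants().length];
        // Pre-fill with the "default" index so unlisted constants never match.
        Arrays.fill(ordinalMap, enumNames.length);
        for (int i = 0; i < enumNames.length; i++) {
            try {
                ordinalMap[Enum.valueOf(enumClass, enumNames[i]).ordinal()] = i;
            } catch (IllegalArgumentException e) {
                // Label names a constant that does not exist at runtime:
                // allow it, but leave nothing pointing at index i.
            }
        }
        return ordinalMap;
    }

    public static void main(String[] args) {
        // "PURPLE" plays the role of a constant removed after compilation.
        int[] map = buildOrdinalMap(Color.class, "GREEN", "PURPLE", "RED");
        System.out.println(Arrays.toString(map)); // prints "[2, 0, 3]"
    }
}
```

In this run index 1 (the PURPLE arm) is never produced, and BLUE maps to 3, the default index.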

>> #### Explicit continue
>> An alternative to exposing guards is to expose an explicit `continue` 
>> statement in switch, which would have the effect of "keep matching at 
>> the next case."  Then guards could be expressed imperatively as:
>>     case P:
>>         if (!guard)
>>             continue;
>>         ...
>>         break;
>>     case Q: …
> A nice idea, but careful: it is already meaningful to write:
> while (…) { switch (…) { case 1: … case 2: if (foo) continue; … } }
> and expect the `continue` to start a new iteration of the `while` 
> loop.  Indeed, this fact was already exploited above under “### Guards”.
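
A small runnable sketch of that existing behavior, with a made-up class and method name: the `continue` inside the switch skips the rest of the loop body and starts the next iteration of the enclosing `for` loop.

```java
// Hypothetical illustration of today's meaning of `continue` inside a switch.
class ContinueInSwitch {
    // Sums the integers in [0, n) that are not multiples of three.
    static int sumSkippingMultiplesOfThree(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            switch (i % 3) {
                case 0:
                    continue; // targets the enclosing for loop, not the switch
                default:
                    break;    // leaves the switch and falls to the statement below
            }
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 1 + 2 + 4 + 5 = 12
        System.out.println(sumSkippingMultiplesOfThree(7)); // prints "12"
    }
}
```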

Yes.  One of the downsides of exposing `continue` is that currently the 
(switch, continue) entry in my table from "Disallowing break label and 
continue label inside expression switch" has a P instead of an X, 
meaning that continue is currently allowed in a switch if there's an 
enclosing continue-able context.  So this could be disambiguated as you 
say with "continue switch", or with requiring a label in some or all 
