Raw string literals and Unicode escapes

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Mon Feb 26 22:57:37 UTC 2018

Of course - delimiters are not part of the string length - I see now why
you can have (in theory) an unbounded prefix/suffix.

Personally, I don't find the argument "because you can have
unlimited-length identifiers" a great fit. From a lexer writer's
perspective, I can see why it is used as a precedent - after all, an
identifier is a token whose size is unbounded. But I find it hard to
ignore that the roles played by identifiers and delimiters in the
grammar are quite different.
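
To make the lexer-side view concrete, here is a minimal sketch of how a
scanner might match an opening run of backticks against a closing run of
the same length. It assumes the rule under discussion (a raw string is
closed only by a backtick run exactly as long as the opening run, and
runs of any other length are ordinary content); the class and method
names are hypothetical and not taken from javac.

```java
import java.util.Optional;

class RawStringScanner {
    /** Length of the run of backticks that starts at index i (0 if none). */
    static int runLength(String src, int i) {
        int j = i;
        while (j < src.length() && src.charAt(j) == '`') {
            j++;
        }
        return j - i;
    }

    /**
     * Scans a raw string literal that starts at index 0 of src.
     * Returns the body, or empty if src does not start with a backtick
     * or no closing run of matching length is found.
     */
    static Optional<String> scan(String src) {
        int open = runLength(src, 0);   // opening delimiter, any length
        if (open == 0) {
            return Optional.empty();
        }
        int i = open;
        while (i < src.length()) {
            if (src.charAt(i) == '`') {
                int run = runLength(src, i);
                if (run == open) {
                    // Matching run: everything in between is the body.
                    return Optional.of(src.substring(open, i));
                }
                i += run;               // different length: plain content
            } else {
                i++;
            }
        }
        return Optional.empty();        // unterminated literal
    }
}
```

Nothing in the loop depends on an upper bound for the opening run; a
limit would have to be an extra check on top of this logic. Under this
sketch an input made only of backticks is read as one long opening run,
so it never closes.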

At least there were other cases where we struck a different trade-off
between expressiveness and practicality - see Project Coin's use of
repeated underscores in binary literals (subsequently banned):

private static final int BOND =

(Example courtesy of Joshua Bloch)


On 26/02/18 21:54, Jim Laskey wrote:
> Why introduce an artificial limit? Identifiers don’t have a limit.
> JLS 3.8, Identifiers: "An identifier is an *unlimited-length
> sequence* of Java letters and Java digits, the first of which must be
> a Java letter."
> — Jim
>> On Feb 26, 2018, at 5:29 PM, Maurizio Cimadamore
>> <maurizio.cimadamore at oracle.com> wrote:
>> On 26/02/18 20:17, John Rose wrote:
>>> Any *finite choice* of end-quotes has the same problem, with
>>> a non-zero probability that decreases (but does not vanish)
>>> with the number of available end-quotes.  The only way to
>>> break out of the box is to allow the user an unlimited range
>>> of successively "stronger" end-quotes (i.e., less likely ones).
>> In reality there is a 'finite' upper bound for this length, which is
>> given by 2^16 / 2 = 2^15. That's the maximum delimiter size you
>> could encode in a Java String which you can also symmetrically close
>> - and it's an edge case, as it will contain the empty string.
>> So, yes, on paper, I agree with the argument; in practice, I guess
>> I'd be more in favor of limiting the number of repetitions - I
>> wouldn't like to open the door to puzzlers:
>> `````````````````````````````````````````````````````````````````````````hello`````````````````````````````````````````````````````````````````````````
>> (which might leave some ASCII art lovers a bit unhappy :-))
>> I think limiting to 8 or some other reasonably small number will
>> probably reduce the clash probability enough? And, even if it's not
>> enough, I guess we'd still be left with the question of whether a
>> long (possibly unbounded?) escaping sequence is something we'd like
>> to see in Java.
>> Maurizio
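
The trade-off between a capped and an uncapped delimiter length in the
quoted exchange can be illustrated with a small sketch. It assumes the
same matching rule as above (a delimiter of length n is safe if the body
contains no maximal backtick run of exactly n) and ignores bodies that
begin or end with a backtick, which would merge with the delimiter. The
class name, method names, and the cap parameter are hypothetical.

```java
import java.util.HashSet;
import java.util.OptionalInt;
import java.util.Set;

class DelimiterChooser {
    /** Lengths of all maximal backtick runs that occur in body. */
    static Set<Integer> runLengths(String body) {
        Set<Integer> lengths = new HashSet<>();
        int i = 0;
        while (i < body.length()) {
            if (body.charAt(i) == '`') {
                int j = i;
                while (j < body.length() && body.charAt(j) == '`') {
                    j++;
                }
                lengths.add(j - i);
                i = j;
            } else {
                i++;
            }
        }
        return lengths;
    }

    /**
     * Smallest delimiter length in 1..maxLen that does not clash with a
     * run in body, or empty if every permitted length is already used.
     */
    static OptionalInt choose(String body, int maxLen) {
        Set<Integer> used = runLengths(body);
        for (int n = 1; n <= maxLen; n++) {
            if (!used.contains(n)) {
                return OptionalInt.of(n);
            }
        }
        return OptionalInt.empty();
    }
}
```

With a cap of 8, a body that happens to contain runs of every length
from 1 to 8 has no usable delimiter, whereas a larger cap would let the
writer pick 9. Such a body is contrived, which is the practical side of
the argument; the sketch only shows that the clash is possible for any
finite cap.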

More information about the amber-spec-observers mailing list