Raw string literals and Unicode escapes

Guy Steele guy.steele at oracle.com
Tue Feb 27 19:56:57 UTC 2018

> On Feb 27, 2018, at 2:48 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> So after this length instead of having the probability to see a character to be virtually 1, you have the opposite effect, because programming languages (a human construct) are very regular in the set of chars they use. So you do not need a repetition of a character to avoid a statistical effect that does not occur. Being able to choose the escape character is enough.
> The problem is not that it's enough, it's that it is too much. Having nine ways to say the same thing is too many; having infinitely many (e.g., nonces) is worse.  Having used the "pick your delimiter" approach taken by Perl, I find that you are *still* often bitten by the inability to find a good delimiter for embedding a snippet of a program written in a language similar to the outer language.  And it surely makes code less readable, because many more things can be interpreted as quotes.

I agree with the comments that in practice many raw strings are much more likely to be some sort of code rather than relatively random strings.

That said, here is a perfectly plausible bit of Java code:

	final String uppercase = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
	final String lowercase = "abcdefghijklmnopqrstuvwxyz";
	final String enclosers = "(){}[]";
	final String punctuation = "`~!@#$%^&*_+-=|\\:\";'<>,.?/";

I can quote it easily using `` and ``.  But it’s at least a little less easy, as John has argued, to quote it using the “raw|…|” convention: there is no character on my keyboard that does not occur in the code to be quoted, so I have to go in and muck with the middle of the string.  That makes the embedded code harder to read: in order to interpret it, one must be mindful that the syntactic requirements of the containing language may have intruded (requiring doubling or escaping of certain characters, for example).
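The "no free delimiter" claim can be checked mechanically.  What follows is only a rough sketch, not part of any proposal: the class and method names are made up for illustration, and it assumes that only the 32 ASCII punctuation characters are plausible delimiters (the four declarations contain no digits, for instance, but a digit would make a poor quote mark).  The snippet has to be written here as an ordinary Java literal, so the escaping it needs is itself an instance of the outer language intruding.

```java
import java.util.ArrayList;
import java.util.List;

final class DelimiterCheck {
    // Assumption: candidate delimiters are the 32 ASCII punctuation characters.
    static final String ASCII_PUNCTUATION = "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";

    // The four declarations from the message, escaped as an ordinary Java literal.
    static final String SNIPPET =
          "\tfinal String uppercase = \"ABCDEFGHIJKLMNOPQRSTUVWXYZ\";\n"
        + "\tfinal String lowercase = \"abcdefghijklmnopqrstuvwxyz\";\n"
        + "\tfinal String enclosers = \"(){}[]\";\n"
        + "\tfinal String punctuation = \"`~!@#$%^&*_+-=|\\\\:\\\";'<>,.?/\";\n";

    /** Returns the candidate delimiters that never occur in the given code. */
    static List<Character> unusedPunctuation(String code) {
        List<Character> unused = new ArrayList<>();
        for (char c : ASCII_PUNCTUATION.toCharArray()) {
            if (code.indexOf(c) < 0) {
                unused.add(c);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        // An empty list means no single punctuation character is free to serve as a delimiter.
        System.out.println("Unused punctuation: " + unusedPunctuation(SNIPPET));
    }
}
```

For this snippet the list comes back empty, so any single-character delimiter chosen from that set would force a change somewhere in the middle of the quoted code.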

The nice thing about

	``<body tag="foo">He said `<i>what</i>'?</body>``

is that I can be _completely_ sure that the syntax of Java has not intruded _at all_ into the middle of the HTML syntax, so that’s one less thing to worry about while puzzling over the HTML.  This becomes even more important if the raw-string syntax is nested:

	System.out.println("\t" + ```final String htmlSnippet = "\t" + ``<body tag="foo">He said `<i>what</i>'?</body>`` + "\n";``` + "\n");
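The delimiter lengths in the two examples above follow a simple pattern: each level uses one more backtick than the longest run of backticks inside what it quotes.  Here is a minimal sketch of that rule in present-day Java, using ordinary string literals; the rule is my reading of the examples rather than the wording of any specification, and the names BacktickFence, fenceFor and quote are hypothetical.

```java
final class BacktickFence {
    /** Length of the longest run of consecutive backticks in the content. */
    static int longestBacktickRun(String content) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < content.length(); i++) {
            if (content.charAt(i) == '`') {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0;
            }
        }
        return longest;
    }

    /** Assumed rule: the delimiter is one backtick longer than any run inside the content. */
    static String fenceFor(String content) {
        int length = longestBacktickRun(content) + 1;
        StringBuilder fence = new StringBuilder();
        for (int i = 0; i < length; i++) {
            fence.append('`');
        }
        return fence.toString();
    }

    /** Wraps the content, unchanged, between two copies of its delimiter. */
    static String quote(String content) {
        String fence = fenceFor(content);
        return fence + content + fence;
    }

    public static void main(String[] args) {
        String html = "<body tag=\"foo\">He said `<i>what</i>'?</body>";
        // Inner level: the HTML contains a lone backtick, so the delimiter has two.
        String inner = "final String htmlSnippet = \"\\t\" + " + quote(html) + " + \"\\n\";";
        // Outer level: the inner code contains a run of two, so the delimiter has three.
        System.out.println(quote(inner));
    }
}
```

Because quote never alters the content, the quoted text at every level of nesting stays byte-for-byte identical to the original; only the delimiter grows.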

