Raw string literals and Unicode escapes

John Rose john.r.rose at oracle.com
Sat Feb 24 06:34:35 UTC 2018

On Feb 23, 2018, at 1:00 PM, Brian Goetz <Brian.Goetz at Oracle.COM> wrote:
>> However, since the JEP's goal is to allow copy-paste of arbitrary text without interpretation, I think the RawSP trick of assigning meaning to whitespace is out of place. To most people, the raw string literal:
>>   ` and `
>> denotes a perfectly good five-character string that will probably be inserted between two other strings. Explaining that, no, it's really a three-character string will not be popular.
> +100.  The RawSP trick is clever, but too much so.  There are ample simpler approaches for beginning/ending with BT:
>     String s = BACKTICK + `a raw string` + BACKTICK;
>     String s = `` `a raw string` ``.trim();
> These move the cognitive load on the user to the corner case, rather than landing it on the general case.

Note that the "trim" trick moves the problem elsewhere:  It can remove
more than just the one extra space, so the string "`xxx " needs a different
technique.  A bunch of only-partially-applicable tricks like that is also a kind
of cognitive load, isn't it?
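
To make the over-stripping concrete, here is a minimal sketch. It uses ordinary string literals to stand in for the body of a padded raw literal, since the backtick syntax under discussion is only a proposal. The helper stripOneGuard is hypothetical (not a JDK method) and just illustrates a "remove at most one guard space per side" rule.

```java
public class GuardTrim {
    // Hypothetical helper, not part of the JDK: removes at most one leading
    // and at most one trailing space, leaving any further whitespace alone.
    static String stripOneGuard(String s) {
        int start = s.startsWith(" ") ? 1 : 0;
        int end = s.length();
        if (end > start && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // Intended value is backtick, "xxx", one space. The body below adds
        // one guard space on each side, as the padded raw literal would.
        String body = " `xxx  ";

        // trim() removes all leading and trailing whitespace, so the
        // intended trailing space is lost along with the guard.
        System.out.println("[" + body.trim() + "]");

        // Stripping a single guard per side keeps the intended space.
        System.out.println("[" + stripOneGuard(body) + "]");
    }
}
```

Whether a library method with this exact behavior is worth having is the open question; the point is only that trim() and "strip one guard" give different answers on this input.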

Here's one I also kind of like:  If the string has no embedded newlines,
then do ``|`a raw string`|``.trimMargins().

The + operator is a more robust solution, although it requires
parentheses also if used with a postfix method of any sort.
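
A small sketch of the parenthesization point, again with ordinary string literals standing in for the raw literal. The BACKTICK constant mirrors the one in Brian's example; the class and method names are made up for illustration.

```java
public class ConcatPrecedence {
    static final String BACKTICK = "`";

    // Without parentheses the method call binds only to the last operand,
    // so toUpperCase() is applied to BACKTICK alone.
    static String withoutParens() {
        return BACKTICK + "abc" + BACKTICK.toUpperCase();
    }

    // With parentheses the method call applies to the whole concatenation.
    static String withParens() {
        return (BACKTICK + "abc" + BACKTICK).toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(withoutParens());
        System.out.println(withParens());
    }
}
```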

Maybe better is trimLines, where a newline is the "guard" to
be stripped, but of which at most only one is stripped.

I suppose reasonable people might differ on whether a fixed aperiodic
quote (like "``` " or "```|") is more surprising than a grab bag of methods
for fixing edge effects.

But, I do agree that libraries can fix such edge effects.

And, I am very happy that, in lengthening the opening and
closing quotes, we are making it possible to paste an arbitrary
sequence of unicode without having to hunt around inside
the sequence to find stuff that needs extra quoting, as is
the case with today's strings.

The thing we are discussing here, the need to give special
handling to leading and trailing backticks, is (crucially)
an edge effect (only at the two ends of the string) and
not a bulk effect (something that needs attention throughout
the string).  That means we have won on the key issue,
and are just disagreeing about how to collect our winnings.

(Yes you do have to look at the string bulk, but only to
choose a "strong enough fence" to enclose that bulk.
And then adjust the fence to handle edge effects from
backticks.  The amount of escaping is O(1) not O(N).)
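
As a rough sketch of that two-step process: one scan of the bulk finds the longest run of backticks, and a fence one backtick longer is then strong enough. This assumes the proposed rule that a literal opened by n backticks is closed only by a run of exactly n backticks. The class and method names are hypothetical, and the result is a sufficient fence length, not necessarily the shortest one.

```java
public class FenceChooser {
    // Length of the longest run of consecutive backticks in the text.
    static int longestBacktickRun(String s) {
        int longest = 0;
        int current = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '`') {
                current++;
                longest = Math.max(longest, current);
            } else {
                current = 0;
            }
        }
        return longest;
    }

    // A fence one backtick longer than any run in the bulk cannot be
    // matched by anything inside the bulk.
    static int fenceLength(String s) {
        return longestBacktickRun(s) + 1;
    }

    // The edge effect: a backtick at either end of the text would merge
    // with the fence, so those two positions need separate handling.
    static boolean needsEdgeHandling(String s) {
        return s.startsWith("`") || s.endsWith("`");
    }

    public static void main(String[] args) {
        String text = "a `` b ` c";
        System.out.println(fenceLength(text));
        System.out.println(needsEdgeHandling(text));
    }
}
```

The scan reads the whole string, but the only output is one fence length and one yes/no answer about the two ends; nothing inside the bulk is rewritten.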

— John

More information about the amber-spec-observers mailing list