[raw-string] indentation stripping

Brian Goetz brian.goetz at oracle.com
Sat Apr 28 18:12:05 UTC 2018

This thread accidentally got started on the wrong list, so I'm bringing it back here.  The following messages are hereby read into the record (and hence can be considered to be under the proper terms of use for a specification list).

http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003034.html (Jim #1)
http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003035.html (John)
http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003051.html (Kevin #1)
http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003052.html (Jim #2)
http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003053.html (Kevin #2)
http://mail.openjdk.java.net/pipermail/amber-dev/2018-April/003054.html (Jim #3)

A summary follows.  

The key point being discussed is that the “raw means raw” interpretation for multi-line strings is likely to be at odds with how users actually plan to use the feature.  Users will pad the code with incidental indentation to make it line up nicely with the enclosing Java code, IDEs may well adjust said incidental indentation as the code is maintained, and this is a reasonable thing to encourage.  Kevin’s data from the Google codebase backs up this supposition.  Our design already admits this to some degree: for multi-line strings, we don’t really believe the source file when it uses platform-specific line terminators.  So we’re trying to distill how to distinguish “incidental” indentation from intended indentation in multi-line strings.  (More generally: the feedback we’ve gotten is that while raw strings are the right design center for single-line strings, when it comes to snippets that span lines, users care more about multi-line-ness than raw-ness.)
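
To make the line-terminator remark concrete, here is a minimal sketch of that kind of normalization, written against ordinary strings.  The class and method names are hypothetical, and mapping both CRLF and lone CR to LF is an assumption for illustration, not the specified behavior.

```java
public class LineTerminators {
    // Hypothetical helper: map CRLF and lone CR to LF, so the same snippet
    // yields the same string regardless of the platform it was saved on.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String savedOnWindows = "SELECT *\r\nFROM t\r\n";
        String savedOnUnix = "SELECT *\nFROM t\n";
        // Compare the two snippets after normalization.
        System.out.println(normalize(savedOnWindows).equals(normalize(savedOnUnix)));
    }
}
```

The point of the analogy is that the compiler is already not treating the source bytes as sacred here, so treating incidental indentation the same way would not be a new kind of departure from “raw”.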

 - Most multi-line strings will be code snippets of some sort (JSON, XML, SQL, Java, etc.);
 - Most developers will want to use incidental indentation to have code snippets indent “sensibly” relative to neighboring Java code, but said incidental indentation is not part of the snippet.

Jim’s #1 offers a catalog of ways in which users might craft multi-line string literals to fit cleanly into their source code, identifying which indentation is incidental and which is essential.
To the goals, I’d add:
 - In addition to it being _possible_ to render the desired result, it should be straightforward for users to _predict_ the result of indentation stripping.

Kevin adds: it would be useful if we could draw a “rectangle” that excludes all incidental indentation and includes all intended indentation.  

Tabs are a confounding issue; since there is no standard interpretation for how many spaces correspond to a tab, in the general case no trimming algorithm will do well with mixed spaces and tabs.  However, in the well-behaved case where lines begin with tab* space*, a common prefix can be stripped.  
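
As a rough illustration of the common-prefix idea, the sketch below strips the longest whitespace prefix shared by all non-blank lines.  It is not the algorithm under discussion; the class and method names are hypothetical, and the choices to ignore blank lines when computing the prefix and to emit them as empty lines are assumptions.

```java
public class CommonPrefixStrip {
    // Leading run of tabs and spaces on a line.
    static String leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            i++;
        }
        return line.substring(0, i);
    }

    // Longest common prefix of two strings, compared character by character,
    // so a tab never matches a space.
    static String commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return a.substring(0, i);
    }

    // Hypothetical stripping routine; assumes lines are separated by "\n".
    static String strip(String text) {
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        String prefix = null;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // assumption: blank lines do not constrain the prefix
            }
            String ws = leadingWhitespace(line);
            prefix = (prefix == null) ? ws : commonPrefix(prefix, ws);
        }
        if (prefix == null) {
            prefix = "";
        }
        StringBuilder out = new StringBuilder();
        for (int k = 0; k < lines.length; k++) {
            if (k > 0) {
                out.append('\n');
            }
            String line = lines[k];
            // Every non-blank line starts with the prefix by construction;
            // blank lines are emitted as empty (assumption).
            out.append(line.trim().isEmpty() ? "" : line.substring(prefix.length()));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Well-behaved case: every line begins with tab* space*.
        System.out.print(strip("\t\t  <a>\n\t\t    <b/>\n\t\t  </a>\n"));
        // Mixed case: one line uses a tab, the other only spaces.
        System.out.print(strip("\t  a\n    b\n"));
    }
}
```

Because the prefix is compared character by character, the sketch never has to decide how wide a tab is.  The cost is that a snippet indented with a tab on one line and spaces on another has an empty common prefix and comes back unchanged, which is the mixed-whitespace limitation described above.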

There’s some reason to believe that calling .stripIndent() will be so common that it should be the default, rather than requiring users to invoke it every time.

Now back to discussion.
