String concatenation tweaks

Alex Buckley alex.buckley at
Thu Mar 12 01:13:51 UTC 2015

On 3/11/2015 2:01 PM, Louis Wasserman wrote:
> So for example, "foo" + myInt + myString + "bar" + myObj would be
> compiled to the equivalent of
> int myIntTmp = myInt;
> String myStringTmp = String.valueOf(myString); // defend against null
> String myObjTmp = String.valueOf(String.valueOf(myObj));
>       // defend against evil toString implementations returning null
> return new StringBuilder(
>       17 // length of "foo" (3) + max length of myInt (11) + length of "bar" (3)
>       + myStringTmp.length()
>       + myObjTmp.length())
>     .append("foo")
>     .append(myIntTmp)
>     .append(myStringTmp)
>     .append("bar")
>     .append(myObjTmp)
>     .toString();
> As far as language constraints go, the JLS is (apparently deliberately)
> vague about how string concatenation is implemented.  "An implementation
> may choose to perform conversion and concatenation in one step to avoid
> creating and then discarding an intermediate String object. To increase
> the performance of repeated string concatenation, a Java compiler may
> use the StringBuffer class or a similar technique to reduce the number
> of intermediate String objects that are created by evaluation of an
> expression."  We see no reason this approach would not qualify as a
> "similar technique."

The key property of the string concatenation operator is
left-associativity. Later subexpressions must not be evaluated until
earlier subexpressions have been successfully evaluated AND
concatenated. Consider this expression:

   "foo" + m() + n()

which JLS8 15.18 specifies to mean:

   ("foo" + m()) + n()

We know from JLS8 15.6 that if m() throws, then "foo" + m() throws, and
n() will never be evaluated.
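
A minimal sketch of that rule, for illustration only. The class name
LeftAssocDemo, the call log, and the choice of IllegalStateException are
assumptions made for this example; only the shape "foo" + m() + n() comes
from the discussion above.

```java
import java.util.ArrayList;
import java.util.List;

public class LeftAssocDemo {
    // Records which operand methods were actually invoked, in order.
    static final List<String> calls = new ArrayList<>();

    // Hypothetical m(): always completes abruptly.
    static String m() {
        calls.add("m");
        throw new IllegalStateException("m() failed");
    }

    // Hypothetical n(): would succeed if it were ever called.
    static String n() {
        calls.add("n");
        return "n";
    }

    // Evaluates "foo" + m() + n() and returns the observed call order.
    static List<String> run() {
        calls.clear();
        try {
            String s = "foo" + m() + n();
            calls.add("result:" + s);
        } catch (IllegalStateException e) {
            calls.add("caught");
        }
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [m, caught]
    }
}
```

n() never appears in the log, because the abrupt completion of m()
completes the whole expression abruptly before the right-hand operand is
reached.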

Happily, your translation doesn't appear to catch and swallow exceptions 
when eagerly evaluating each subexpression in turn, so I believe you 
won't evaluate n() if m() already threw.

Unhappily, a call to append(..) can in general fail with 
OutOfMemoryError. (I'm not talking about asynchronous exceptions in 
general, but rather the sense that append(..) manipulates the heap so an 
OOME is at least plausible.) In the OpenJDK implementation, if 
blah.append(m()) fails with OOME, then n() hasn't been evaluated yet -- 
that's "real" left-associativity. In the proposed implementation, it's 
possible that more memory is available when evaluating m() and n() 
upfront than at the time of an append call, so n() is evaluated even if 
append(<<tmp result of m()>>) fails -- that's not left-associative.
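
The difference in observable order can be sketched as follows. This is a
simulation under stated assumptions, not the behavior of the real
StringBuilder: FailingSink is a hypothetical stand-in whose second append
is made to throw OutOfMemoryError on purpose, and AppendOrderDemo, m() and
n() are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendOrderDemo {
    // Shared log of operand evaluations and the simulated failure.
    static final List<String> log = new ArrayList<>();

    // Hypothetical stand-in for StringBuilder: the second append fails.
    static final class FailingSink {
        private final StringBuilder sb = new StringBuilder();
        private int appends = 0;

        FailingSink append(String s) {
            if (++appends == 2) {
                log.add("append-failed");
                throw new OutOfMemoryError("simulated failure");
            }
            sb.append(s);
            return this;
        }
    }

    static String m() { log.add("m"); return "M"; }
    static String n() { log.add("n"); return "N"; }

    // Interleaved style: each operand is evaluated just before its append.
    static List<String> interleaved() {
        log.clear();
        try {
            new FailingSink().append("foo").append(m()).append(n());
        } catch (OutOfMemoryError e) {
            // Simulated error; swallowed so the log can be inspected.
        }
        return new ArrayList<>(log);
    }

    // Eager style: all operands are evaluated into temporaries first.
    static List<String> eager() {
        log.clear();
        try {
            String mTmp = m();
            String nTmp = n();
            new FailingSink().append("foo").append(mTmp).append(nTmp);
        } catch (OutOfMemoryError e) {
            // Simulated error; swallowed so the log can be inspected.
        }
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(interleaved()); // prints [m, append-failed]
        System.out.println(eager());       // prints [m, n, append-failed]
    }
}
```

In the interleaved version n() is never called once the append of m()'s
result fails; in the eager version n() has already run by then, which is
the ordering difference at issue.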

Perhaps you can set my mind at ease that append(..) can't fail with OOME?

