Gap Buffer based AbstractStringBuilder implementation

Jesús Viñuales serverperformance at
Sun Nov 22 22:01:45 UTC 2009

Osvaldo Doederlein wrote:
> Em 22/11/2009 05:55, Thomas Hawtin escreveu:
>> There is a security issue there. When multiple threads are involved, 
>> it is possible (though not necessarily easy) to create a mutable String 
>> if the backing char[] is shared.
>> Tom Hawtin
> That's true. But there's apparently a simple solution:
>     public String toStringShared() {
>         // createShared() is a package-protected helper/ctor
>         String ret = String.createShared(value, 0, count);
>         // Reset value, so evil user can't abuse the buffer to change the String.
>         value = EMPTY;
>         count = 0;
>         return ret;
>     }
>     private static final char[] EMPTY = new char[0];
> This solution should be safe, without need of escape/alias analysis, 
> because StringBuilder and StringBuffer don't have any methods that 
> return a new mutable object that shares the same char[]. The only API 
> that aliases the buffer is subSequence(), but this returns a 
> CharSequence which is a read-only object.
> A+
> Osvaldo

I don't agree. That solution isn't safe, because the methods involved aren't
synchronized (in StringBuilder), nor do you have any guarantee under the Java
memory model that your changes to the value and count variables are visible
to other threads ... unless they are volatile. And if you have to set more
than one variable (value and count) atomically, the volatile approach doesn't
help you. The String may also appear to mutate if one thread calls toString()
while another is between the read of shared and the insert/append/delete
operation or, even worse, is executing the operation itself.
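
The interleaving described above can be replayed deterministically in a
single thread. What follows is only a sketch under stated assumptions:
SharedString is a hypothetical stand-in for a String built by the proposed
String.createShared() (which does not exist in the JDK), and the "second
thread" is simulated by a local variable that loaded the value field before
the reset.

```java
import java.util.Arrays;

final class SharedBufferRaceSketch {
    /** Hypothetical stand-in for a String that shares its char[] instead of copying it. */
    static final class SharedString {
        private final char[] chars;
        private final int length;

        SharedString(char[] chars, int length) {
            this.chars = chars;
            this.length = length;
        }

        @Override
        public String toString() {
            return new String(chars, 0, length);
        }
    }

    /** Minimal unsynchronized builder using the quoted toStringShared() logic. */
    static final class Builder {
        private static final char[] EMPTY = new char[0];
        char[] value = new char[16];
        int count;

        void append(char c) {
            if (count == value.length) {
                value = Arrays.copyOf(value, Math.max(16, value.length * 2));
            }
            value[count++] = c;
        }

        SharedString toStringShared() {
            SharedString ret = new SharedString(value, count);
            value = EMPTY; // reset, as in the quoted proposal
            count = 0;
            return ret;
        }
    }

    static String demo() {
        Builder b = new Builder();
        b.append('a');
        b.append('b');

        // "Thread B" is inside a mutating method and has already loaded the field.
        char[] seenByWriter = b.value;

        // "Thread A" shares the array and resets the builder.
        SharedString s = b.toStringShared();
        String before = s.toString();

        // "Thread B" resumes and finishes its write through the stale reference.
        seenByWriter[0] = 'X';
        String after = s.toString();

        return before + " -> " + after;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "ab -> Xb"
    }
}
```

Resetting value to EMPTY protects only against writers that read the field
after the reset; a writer that already holds the old array reference still
changes the supposedly immutable result.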

I'm pretty sure that the only solution is a copy-on-write approach based on
a volatile boolean flag, and not a never-copy one as Andrew said. (I remember
that the GNU Classpath implementation even addressed the "unused space
consumption problem": its toString method evaluated how much unused space the
buffer had and, if the underlying char[] was too big, made a copy instead of
sharing it.)
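
A minimal sketch of that copy-on-write idea, assuming a hypothetical
read-only SharedChars class in place of a String that shares its char[], and
a made-up MAX_WASTED threshold for the unused-space check. It is not the GNU
Classpath or Sun code. It only shows the single-threaded mechanics; the
volatile flag by itself does not make the check-then-write in append()
atomic.

```java
import java.util.Arrays;

final class CowBufferSketch {
    /** Hypothetical read-only result that may share the builder's array. */
    static final class SharedChars {
        private final char[] chars;
        private final int length;

        SharedChars(char[] chars, int length) {
            this.chars = chars;
            this.length = length;
        }

        @Override
        public String toString() {
            return new String(chars, 0, length);
        }
    }

    /** Assumed threshold: share only if at most this many slots are unused. */
    static final int MAX_WASTED = 16;

    private char[] value;
    private int count;
    private volatile boolean shared; // true once the array has been handed out

    CowBufferSketch(int capacity) {
        value = new char[Math.max(1, capacity)];
    }

    CowBufferSketch append(char c) {
        boolean full = (count == value.length);
        if (shared || full) {
            // Copy on write: never modify an array that a result may be reading.
            int newLength = full ? value.length * 2 : value.length;
            value = Arrays.copyOf(value, newLength);
            shared = false;
        }
        value[count++] = c;
        return this;
    }

    SharedChars toShared() {
        if (value.length - count > MAX_WASTED) {
            // Too much unused space: hand out a trimmed copy instead of sharing.
            return new SharedChars(Arrays.copyOf(value, count), count);
        }
        shared = true;
        return new SharedChars(value, count);
    }

    boolean sharesArrayWith(SharedChars s) {
        return s.chars == value;
    }

    String contents() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        CowBufferSketch b = new CowBufferSketch(16).append('h').append('i');
        SharedChars s = b.toShared();
        System.out.println(b.sharesArrayWith(s)); // true: no copy made yet
        b.append('!');                            // triggers the copy
        System.out.println(s + " / " + b.contents()); // hi / hi!
    }
}
```

With a large, mostly empty buffer (for example new CowBufferSketch(1024)
holding two chars), toShared() in this sketch returns a trimmed copy and the
builder keeps its own array.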

Anyway, read the evaluation section carefully. Reintroducing the
copy-on-write approach was tested by Sun in 2006, and "It was discovered
that the reintroduction of the sharing code caused a reproducible regression
on the order of 4% in SPECjbb2005 scores", presumably because of its impact
on the GC or something similar. If you look at the prototype description, it
is perfect: using a volatile flag, testing whether to share or to copy the
char[] in the toString method, etc.

I tried different approaches last year, and even posted one of them in this
forum (as you can see in the archives), but with no luck.

My guess is that this kind of COW optimization is a job for HotSpot via
escape analysis... or, at the end of the chain, for the CPU's MMU.

