String Deduplication in JEP192

Kirk Pepperdine kirk at
Fri Feb 28 23:25:43 PST 2014


I’ve been looking at the JEP and have been wondering 1) where did you get the statistics from, and 2) is this really going to be a big win? If the weak generational hypothesis holds, will you not be spending more time deduplicating soon-to-be garbage, which could place more load on the GC threads?


On Mar 1, 2014, at 12:08 AM, Bernd Eckenfels <bernd-2014 at> wrote:

> Hello,
> not sure what the proper process is, but I notice that Dalibor
> retweeted a JEP link to JEP192 - String Deduplication in G1.
> The most obvious thing I noticed is that the JEP goes into detail to
> describe how a String object is constructed out of a hashcode and a char
> array. But it somehow totally misses the offset+count fields (substrings).
> One could say it is not in the scope of the JEP to be so detailed, but then
> the other details of the String object should be removed as well.
> What I also somewhat miss is a detailed description of how this is
> integrated with the GC. I mean, there are some interactions around the
> topic of aging, dereferencing and atomic replacement, but most of the
> JEP deals with functionality outside the GC.
> It looks a bit like it will suffer from scalability problems similar
> to those of the already existing string pool. Maybe it would be better to
> re-design the string pool in a way that solves both problems with less
> work for the GC phases. This could go so far as to even have a (new)
> string intern API which could be used by things like XML parsers or
> network decoders - which are typically a source of lots of string
> duplication in apps.
> (And I am not sure if this should be so G1-specific; after all, the
> adoption rate of G1 is still lower than it could be.)
> Regards,
> Bernd
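
To make the idea of an application-level alternative concrete, here is a minimal sketch of the kind of canonicalizing cache a parser or network decoder could call explicitly. It is not the mechanism described in JEP 192 and not an existing JDK API: the class name StringCanonicalizer and its methods are hypothetical, chosen only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical application-level cache; NOT part of the JDK and NOT the
// GC-integrated deduplication that JEP 192 proposes.
final class StringCanonicalizer {
    // Maps each distinct string value to the one instance handed back to callers.
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns one canonical instance per distinct string value.
    String canonicalize(String s) {
        // putIfAbsent returns the previously stored instance, or null if s was stored.
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    // Number of distinct string values currently held.
    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        StringCanonicalizer c = new StringCanonicalizer();
        // new String(...) forces two distinct objects with equal contents.
        String a = c.canonicalize(new String("element"));
        String b = c.canonicalize(new String("element"));
        System.out.println(a == b);   // prints "true": both refer to one instance
        System.out.println(c.size()); // prints "1"
    }
}
```

Note that this sketch holds strong references, so pooled strings are never collected while the cache is reachable. A real design would need weak references or a size bound, and that is exactly where the interaction with the GC comes back in.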

More information about the hotspot-gc-dev mailing list