RFR(L): 8029075 - String deduplication in G1

Per Liden per.liden at oracle.com
Mon Mar 10 12:36:13 UTC 2014

Hi John & Christian,

On 03/07/2014 01:40 AM, John Rose wrote:
> On Mar 6, 2014, at 1:04 PM, Christian Thalinger
> <christian.thalinger at oracle.com>
> wrote:
>> Another way would be to use some kind of annotation to mark references
>> and teach the GCs about that.
> Here's how the story looks to me:
> Define a new annotation @WeakReferenceField.
> Take Alexey's code for @Contended and duplicate with adjustments for
> @WeakReferenceField.  Define another field bit and store it in metadata.
> Look for the bit in the JVM wherever it currently asks about
> java_lang_ref_Reference::referent_offset.
> (This is the hard part, especially for G1, but it is traded against
> adding more GC-sensitive C++ code into the JVM, for a benefit of narrow
> scope.)

Interesting idea. And the equivalent of the Reference.discovered field 
would be injected after each weak-annotated field?

> Don't bother about queueing; it's optional for WeakReference.

In cases where we keep lots of weak references in things like hashmaps I 
can easily imagine that we'd want some kind of notification when 
references become stale so that the map can be pruned. Injecting another 
field after the weak-field to keep track of the queue would of course be 
an option in that case.
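
For comparison, the queue-based pruning that such maps rely on today can be
sketched with the existing java.lang.ref API. This is only an illustration,
not code from the patch: WeakValueCache and its Entry class are hypothetical
names, and the sketch assumes single-threaded use.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical weak-valued cache; illustrative only, not thread-safe. */
class WeakValueCache<K, V> {
    /** Weak reference that remembers its key so the stale entry can be found. */
    static final class Entry<K, V> extends WeakReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);   // register with the queue for notification
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    V get(K key) {
        expungeStale();
        Entry<K, V> e = map.get(key);
        // e.get() is null if the referent was cleared but not yet enqueued.
        return e == null ? null : e.get();
    }

    void put(K key, V value) {
        expungeStale();
        map.put(key, new Entry<>(key, value, queue));
    }

    int size() {
        expungeStale();
        return map.size();
    }

    /** Drain the queue and drop entries whose referents have been cleared. */
    @SuppressWarnings("unchecked")
    private void expungeStale() {
        Entry<K, V> e;
        while ((e = (Entry<K, V>) queue.poll()) != null) {
            // Remove only if the map still holds this exact reference;
            // the key may have been re-mapped to a fresh value since.
            map.remove(e.key, e);
        }
    }
}
```

Each entry here costs a separate WeakReference object in addition to the map
node, which is the overhead the annotation idea is trying to avoid.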

> The goal is that the following classes work about the same, except for
> heap pressure:
>> class A1 {
>>   private WeakReference<String> ref1 = new WeakReference<>(something);
>>   private WeakReference<String> ref2 = new WeakReference<>(somethingElse);
>> }
>> class A2 {
>>   private @WeakReferenceField String ref1 = something;
>>   private @WeakReferenceField String ref2 = somethingElse;
>> }
> Put the annotation in sun.misc or some other internal place.  Use it
> only for private JVM-coupled things like string dedup (no JCK or general
> testing).
> Put whatever practical limits you want on the implementation.  Let the
> next user (there will be a next user) refine it further.

Even if the memory overhead isn't completely removed here, it's probably 
starting to approach a level that's good enough even for heavy users of 
weak refs.

> Enjoy coding and debugging multi-thread GC-safe code in Java.
> I hope this helps!

The other part of the story, orthogonal to the memory overhead, is 
making reference processing more efficient once the referent becomes 
unreachable. I've seen many cases where heavy use of weak references 
(registered or not) overloads the reference handler thread. To improve 
that situation, I'd like to see the reference handler thread removed 
completely, and instead let the GC enqueue references directly onto the 
reference queues.
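
As a rough illustration of the bottleneck, the work funnelled through that
one thread can be modelled as a loop that walks the references handed over by
the GC and enqueues each one. This is a toy model, not HotSpot code:
HandlerModel and drainPending are hypothetical names, and the real pending
list is internal to the JVM rather than a java.util.List.

```java
import java.lang.ref.Reference;
import java.util.List;

/** Toy model of a single handler thread's work loop; not HotSpot code. */
class HandlerModel {
    /**
     * Enqueue each pending reference on the queue it was registered with.
     * Returns the number of references that were actually enqueued.
     */
    static int drainPending(List<? extends Reference<?>> pending) {
        int enqueued = 0;
        for (Reference<?> ref : pending) {
            // enqueue() returns false when no queue was registered
            // or the reference has already been enqueued.
            if (ref.enqueue()) {
                enqueued++;
            }
        }
        return enqueued;
    }
}
```

All of this work is serialized on one thread, whatever the number of GC
worker threads that discovered the references.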


> — John
> P.S. Connection with value types:  Eventually, we could add value types
> like java.lang.ref.WeakReferenceValue which would provide a standard
> form for this.  Or we could standardize the annotation, perhaps.
> P.P.S. For a little more on value types, see
> https://blogs.oracle.com/jrose/entry/value_types_and_struct_tearing

More information about the hotspot-dev mailing list