RFR 8005698 : Handle Frequent HashMap Collisions with Balanced Trees

Brent Christian brent.christian at oracle.com
Thu May 23 19:02:07 UTC 2013

Thank you for having a look, Doug.

Comments inline...

On 5/23/13 5:20 AM, Doug Lea wrote:
> * Given that balanced trees are not used in WeakHashMap
> or Hashtable, I wonder why the TreeNode etc classes are
> not just nested inside HashMap (usable from subclass LinkedHashMap).
> And if so, it would be simpler, faster, and less code-intensive
> to declare TreeNode as a subclass of the internal Entry class.
> Plus some further gymnastics to overcome the unfortunate
> decision to make LinkedHashMap a subclass of HashMap.
> Had you considered this?

I definitely did.  I wanted a cleaner class hierarchy for 
Entry/TreeNode, but LinkedHashMap makes it difficult, since its Entries 
need extra before/after references.

I couldn't work out a TreeNode class that could be used by both HashMap 
and LinkedHashMap, since LHM would need a TreeNode derived from 
LinkedHashMap.Entry, not HashMap.Entry.

So instead I used composition, where TreeNode has-a [Linked]HashMap.Entry.

> * The comparableClassFor code turns out to be a bottleneck
> (and is bypassed in the particular case of String)
> in tests of the analogous code in ConcurrentHashMap, mainly
> because the only available reflection methods allocate little
> objects while checking for the form "class C implements Comparable<C>"
> At some point, someone might want to implement a JDK-internal
> streamlined form.


> * Notice that if trees were ever to be used in WeakHashMap,
> the nodes would surely need to be declared as  WeakReference
> subclasses, as are the current Entry nodes.

Full-disclosure: I did have a working version of WeakHashMap with 
balanced trees.  Here again, composition helped: TreeNode has-a 
WeakReference instead of being-a WeakReference.  But guaranteeing that 
TreeBin was protected from encountering null/cleared keys took some 
contortions.  In the end, the extra complexity wasn't worth it, so it's 
not included.

> The whole thought
> of doing this though is suspect (so I'm happy to see it
> not being done)

Glad to hear that taking it out was the right move. :)


> On 05/22/13 16:50, Brent Christian wrote:
>> I have updated the changes as follows:
>> * TreeBin.createEntryForNode() + the anonymous TreeBin subclass in
>> [Linked]HashMap have been replaced by a TreeBin.map reference back to the
>> containing map, enabling TreeBin to just call newEntry() directly.
>> * TreeNode.entry was made final
>> * Added a top-level comment in HashMap giving a brief overview of how
>> balanced
>> trees are used
>> * Updated the TreeBinSplitBackToEntries test for TREE_THRESHOLD=16
>> * Test code was updated to not need to be on the bootclasspath.
>> LaunchOnBootClass.java was removed from the changes (and would not
>> have been
>> needed anyway, due to the recent jtreg update)
>> The new webrev is here:
>> http://cr.openjdk.java.net/~bchristi/8005698/webrev.01/
>> Thanks,
>> -Brent
>> On 5/13/13 9:48 AM, Brent Christian wrote:
>>> Hi,
>>> Please review my changes for 8005698 : Handle Frequent HashMap
>>> Collisions with Balanced Trees.
>>> For HashMap and LinkedHashMap, we would like to duplicate the technique
>>> (and some of the code) that the latest ConcurrentHashMap[1] uses for
>>> handling hash collisions: use of balanced trees to store map entries in
>>> hash bins containing many collisions.
>>> Webrev is here:
>>> http://cr.openjdk.java.net/~bchristi/8005698/webrev.00/
>>> Some high-level details copied from the JEP[2]:
>>> Earlier work in this area in JDK 8, namely the alternative
>>> string-hashing implementation[3], improved collision performance for
>>> string-valued keys only, and it did so at the cost of adding a new
>>> (private) field to every String instance.
>>> The changes proposed here will improve collision performance for any key
>>> type that implements Comparable. The alternative string-hashing
>>> mechanism, including the private hash32 field added to the String class,
>>> can then be removed.
>>> The principal idea is that once the number of items in a hash bucket
>>> grows beyond a certain threshold, that bucket will switch from using a
>>> linked list of entries to a balanced tree.
>>> We will not implement this for the Hashtable class - some legacy code
>>> that uses it is known to depend upon iteration order. Instead, Hashtable
>>> will be reverted to its state prior to the introduction of the
>>> alternative string-hashing implementation, and will maintain its
>>> historical iteration order.
>>> We also will not implement this technique in WeakHashMap. An attempt was
>>> made, but the complexity of having to account for weak keys resulted in
>>> an unacceptable drop in microbenchmark performance. WeakHashMap will
>>> also be reverted to its prior state.
>>> Thanks!
>>> -Brent
>>> 1.
>>> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ConcurrentHashMap.java?view=log
>>> 2. http://openjdk.java.net/jeps/180
>>> 3. http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/43bd5ee0205e

More information about the core-libs-dev mailing list