HashMap - Hybrid approach
alex14n at gmail.com
Thu Jun 11 10:05:06 PDT 2009
On Thu, Jun 11, 2009 at 6:53 PM, Doug Lea <dl at cs.oswego.edu> wrote:
> * Lazily create all arrays
Done. (In the Scala version it was like that from the very beginning.)
> * Separate the front and back of index array. (Note that
> you don't have any locality here to lose by doing so,
> since you jump to index h+hashlen.)
Initially there were 2 different arrays, 'firstIndex' and 'nextIndex',
but an extra array means one extra pointer in the HashMap object
plus an extra array object header, so overall 2 arrays = more memory.
Also, the second array's contents are now tied to the elements:
it's used to store/check the deleted list, etc., so lazy
allocation wouldn't be trivial.
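To make the layout under discussion concrete, here is a rough sketch of
a single index array holding both bucket heads and next-links (class and
field names, the fixed size, and the omission of resize/remove/update are
my assumptions, not the actual patch):

```java
// Sketch: index[0..hashLen) holds each bucket's chain head,
// index[hashLen..2*hashLen) holds the "next" link per entry,
// so one array replaces separate firstIndex/nextIndex arrays.
class CombinedIndexSketch {
    final int hashLen = 8;                    // power-of-two table size (assumption)
    final int[] index = new int[hashLen * 2]; // one array instead of two
    final Object[] elements = new Object[hashLen * 2]; // key/value interleaved
    int size;

    CombinedIndexSketch() {
        java.util.Arrays.fill(index, -1);     // -1 marks empty / end of chain
    }

    void put(Object key, Object value) {      // no resize, remove, or update here
        int h = key.hashCode() & (hashLen - 1);
        int i = size++;
        elements[2 * i] = key;
        elements[2 * i + 1] = value;
        index[hashLen + i] = index[h];        // link new entry to old chain head
        index[h] = i;                         // new entry becomes the head
    }

    Object get(Object key) {
        int h = key.hashCode() & (hashLen - 1);
        for (int i = index[h]; i >= 0; i = index[hashLen + i])
            if (key.equals(elements[2 * i]))
                return elements[2 * i + 1];
        return null;
    }
}
```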
> * Use default initial capacity of 4,
> And further, the performance differences are very small across
> these load factors in a few tests I ran. This suggests a
> different sort of tactic for helping with indirections and
> prefetches. Suppose that index[h] was usually just h; that is,
> with the key in elements[2*h]. (This only works in current
> design if loadFactor==1, but can work with smaller values if
> elements array is sized at hashlen.) Under this scheme, you can
> optimistically read this element at the same time as the
I did a little test of this and saw no performance gain;
I even posted a link to the sources right after changing the subject
from 'fast' to 'hybrid'.
Do you have any idea why prefetching of the second array may not happen?
> index. To make this work, you'd need a bitmap of available slots
> rather than a freelist, adding C/32 space overhead, plus some
> heuristic juggling around inside put and resize. It might be
> worth an experiment or two. Doing this requires yet further
> work/thought about how sizing varies with loadFactor
There is no need for an extra bitmap - it can be stored alongside the hashcodes.
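One way to store the free-slot bitmap "alongside the hashcodes" is to
reserve one bit of a cached per-slot hash as the free flag; this is only
my illustration of the idea, not the actual code:

```java
// Sketch: if the map already caches each entry's hash in an int array,
// a reserved bit (here the sign bit) can mark free slots, so no
// separate bitmap allocation is needed.
class HashSlotFlags {
    static final int FREE = 0x80000000;      // sign bit set = "slot is free"
    final int[] hashes;

    HashSlotFlags(int capacity) {
        hashes = new int[capacity];
        java.util.Arrays.fill(hashes, FREE); // all slots start out free
    }

    boolean isFree(int slot) {
        return hashes[slot] == FREE;
    }

    void occupy(int slot, int hash) {
        hashes[slot] = hash & ~FREE;         // store hash with the flag bit cleared
    }

    void release(int slot) {
        hashes[slot] = FREE;
    }
}
```

The cost is losing one bit of cached hash per slot, which only affects
how often full key comparisons are needed, not correctness.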
> The exact values depend of course on hash quality, insertion
> order, etc. But this also reflects fact that the collisions array
> (or back half of current index array) is typically less than 30%
> occupied. This suggests shrinking its default size and letting
> it independently resize, using the same sort of approach as
> with main hash: start out with say 25% size, by letting sets of
> 4 indices share an overflow slot, and expand as needed. Or
> maybe start with 50%, which will still be enough footprint savings
> to bother.
As we already discussed, this approach has a greater footprint
since we always need to allocate 3*(hashtable size) words PLUS
the overflow. It would only make sense if the prefetching trick
with the same array index for hash and key/value worked.
But it did not in my experiment, and I don't know why.
Maybe I'll test it on different computers later,
or you'll help with some advice.
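For reference, the "same array index for hash and key/value" lookup being
discussed might look roughly like this (class name, sizing, the linear
free-slot scan, and the lack of resize/remove are my assumptions; the real
experiment's code may differ):

```java
// Sketch: elements is sized at hashLen, and each entry is preferably
// stored at its own "home" slot so that index[h] == h; get() then reads
// elements[2*h] speculatively together with index[h], which is the
// prefetch-friendly access pattern under discussion.
class OptimisticSlotSketch {
    final int hashLen = 8;                    // power-of-two size (assumption)
    final int[] index = new int[hashLen * 2]; // heads + next-links
    final Object[] elements = new Object[hashLen * 2];
    final boolean[] used = new boolean[hashLen]; // stand-in for the real bitmap

    OptimisticSlotSketch() {
        java.util.Arrays.fill(index, -1);
    }

    void put(Object key, Object value) {      // no resize, remove, or update
        int h = key.hashCode() & (hashLen - 1);
        int slot = used[h] ? freeSlot() : h;  // prefer the home slot
        used[slot] = true;
        elements[2 * slot] = key;
        elements[2 * slot + 1] = value;
        index[hashLen + slot] = index[h];     // chain as in the current design
        index[h] = slot;
    }

    private int freeSlot() {                  // naive scan, just for the sketch
        for (int i = 0; i < hashLen; i++)
            if (!used[i]) return i;
        throw new IllegalStateException("full (no resize in this sketch)");
    }

    Object get(Object key) {
        int h = key.hashCode() & (hashLen - 1);
        Object k = elements[2 * h];           // speculative read of the home slot
        int i = index[h];
        if (i == h && key.equals(k))          // speculation paid off
            return elements[2 * h + 1];
        for (; i >= 0; i = index[hashLen + i]) // fall back to the chain walk
            if (key.equals(elements[2 * i]))
                return elements[2 * i + 1];
        return null;
    }
}
```

Whether the two reads in get() actually overlap in hardware depends on the
JIT and the CPU's memory-level parallelism, which may be why the measured
gain was absent.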