Faster HashMap implementation
Ulf.Zibis at gmx.de
Thu Jun 4 10:50:47 UTC 2009
Am 04.06.2009 01:21, Alex Yakovlev schrieb:
> While playing with scala programming language I've come to a HashMap
> that appears to be faster than current java.util.HashMap, also with a
> lower memory footprint.
> Alex Miller advised me to post about it in this maillist.
> I've put current implementation (both scala and java) and some
> benchmark results at github:
> The current java version implements most of the external HashMap behaviour
> including serialization, but does not throw ConcurrentModificationException
> and does not implement some internal methods that other classes may require.
> The main trick is storing all data in 2 arrays without any HashEntry
> objects. One is an array of Objects holding keys and values; the other is
> a complex array of indices.
> Index array has 2 parts. First part contains hashcode-to-index mappings,
> second part is index-to-index mapping for elements with the same hashcode.
> The remaining bits are used for:
> 1. A 'deleted' flag to mark empty spots after key removal
> (the other bits of such entries are used as deleted-linked-list indices).
> 2. An 'end-of-list' flag, so the last element of a chain can be recognized
> without accessing a next index.
> 3. Some bits of the hashcode, compared against the hashcode of the key
> being looked up.
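The post does not give the exact bit assignments; as one possible packing (the constants, field widths, and names below are my assumptions, not the author's), the flags and cached hashcode bits could share a single int like this:

```java
// Hypothetical bit layout for one int in the index array:
//   bits 0..19  : index of the next entry in the collision chain
//   bits 20..29 : ten bits of the key's hashcode, for cheap mismatch tests
//   bit 30      : end-of-list flag
//   bit 31      : deleted flag
final class IndexBits {
    static final int NEXT_MASK   = (1 << 20) - 1;
    static final int HASH_SHIFT  = 20;
    static final int HASH_MASK   = ((1 << 10) - 1) << HASH_SHIFT;
    static final int END_OF_LIST = 1 << 30;
    static final int DELETED     = 1 << 31;

    // Combine a next-index, hashcode bits, and the end-of-list flag.
    static int pack(int next, int hash, boolean endOfList) {
        int v = (next & NEXT_MASK) | ((hash << HASH_SHIFT) & HASH_MASK);
        return endOfList ? v | END_OF_LIST : v;
    }
    static int next(int v)          { return v & NEXT_MASK; }
    static boolean sameHashBits(int v, int hash) {
        return (v & HASH_MASK) == ((hash << HASH_SHIFT) & HASH_MASK);
    }
    static boolean isEnd(int v)     { return (v & END_OF_LIST) != 0; }
    static boolean isDeleted(int v) { return (v & DELETED) != 0; }
}
```

A mismatch on the cached hashcode bits rules out a key without ever touching the key/value array, which is what makes the single-read fast path below possible.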
> Finding an existing value for a key has approximately the same speed as
> the current HashMap implementation. It is:
> 1. Computing the element's hashcode,
> 2. Accessing an array to get the index/address for this hashcode,
> 3a. The current HashMap accesses a HashEntry object with
> key/hashcode/value/next fields,
> 3b. In my implementation the hashcode and next pointer are packed into
> index array bits, and key and value are stored in adjacent cells of the
> second array.
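To make the steps above concrete, here is a toy two-array map along the lines described: head pointers in the first half of the index array, chain pointers in the second half, key/value pairs adjacent in the object array. Fixed capacity, no deletion, resize, or hashcode-bit caching; all names are my guesses, not the github sources:

```java
import java.util.Arrays;

// Toy illustration only: index[0..CAP) holds bucket heads, index[CAP..2*CAP)
// holds next-pointers per slot, and pairs[] stores key at 2*i, value at 2*i+1.
final class TwoArrayMap {
    private static final int CAP = 16;          // power of two, fixed here
    private final int[] index = new int[2 * CAP];
    private final Object[] pairs = new Object[2 * CAP];
    private int size = 0;

    TwoArrayMap() { Arrays.fill(index, -1); }   // -1 marks "no entry"

    void put(Object key, Object value) {        // at most CAP entries here
        int bucket = key.hashCode() & (CAP - 1);
        for (int i = index[bucket]; i != -1; i = index[CAP + i]) {
            if (key.equals(pairs[2 * i])) { pairs[2 * i + 1] = value; return; }
        }
        int slot = size++;                      // sequential, no allocation
        pairs[2 * slot] = key;
        pairs[2 * slot + 1] = value;            // value adjacent to its key
        index[CAP + slot] = index[bucket];      // link into the bucket chain
        index[bucket] = slot;
    }

    Object get(Object key) {
        int bucket = key.hashCode() & (CAP - 1);
        for (int i = index[bucket]; i != -1; i = index[CAP + i]) {
            if (key.equals(pairs[2 * i])) return pairs[2 * i + 1];
        }
        return null;                            // chain exhausted: key absent
    }
}
```

Because a key and its value sit in adjacent cells, reading the key typically pulls the value into the same cache line.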
> When looking for a missing key my implementation can be twice as fast in
> some situations, since it needs only one random memory access (the index
> array cell). If that cell is empty, or its hashcode bits differ from the
> hashcode bits of the key being looked up and its `end-of-list` bit is set,
> that is enough to stop searching. The current HashMap needs 2 random
> memory reads: (1) the array cell with the address and (2) the HashEntry
> at that address.
> Insertion of a new key does not require any memory allocation (no new
> HashEntry object), since all arrays are pre-allocated - except for the
> resize operation. Resize consists of Arrays.copyOf to a twice-as-large
> array and almost sequential rehashing: reading one array and writing each
> entry back to either the same index or the index with `1` in the highest
> bit. When adding several key/values they are sequentially written to the
> key/values array, and sequential memory reads are faster and more
> cache-friendly.
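The "same index or `1` in the highest bit" rule falls out of power-of-two capacities: when the table doubles, one extra hashcode bit decides each entry's new bucket. A small sketch of that split (hypothetical helper, not from the posted code):

```java
// With power-of-two capacities, doubling the table means every entry in
// bucket b moves to either b or b | oldCapacity - decided by the single
// hashcode bit that the larger mask newly exposes.
final class Rehash {
    static int rehashedBucket(int hash, int oldCapacity) {
        int bucket = hash & (oldCapacity - 1);   // bucket in the old table
        return (hash & oldCapacity) == 0
                ? bucket                         // bit clear: stays put
                : bucket | oldCapacity;          // bit set: moves up
    }
}
```

Since the decision needs no recomputation of chains, the rehash pass reads the old array once, in order, which keeps the resize largely sequential.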
> Having 2 arrays instead of a lot of HashEntry objects is more
> garbage-collector friendly and has a significantly lower memory footprint.
> Iteration over elements is a sequential array read, which is faster.
> To clone such a HashMap we only need to clone 2 arrays, which is also much
> faster.
> Element order is preserved even after resize; only when some key is
> removed does it leave an empty 'hole' that will be filled by the next key
> insertion. Thus iteration order is more predictable, and I think it's
> possible not to throw ConcurrentModificationException in most situations.
> The iterator needs only 1 parameter - the array index - and even resize
> does not break iteration order.
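Iteration with a single array-index cursor might look like this sketch (names are mine; deleted 'holes' are shown as null keys):

```java
import java.util.function.BiConsumer;

// Sketch: iteration over the two-array layout is one sequential scan with
// a single cursor. A removed key would leave a null 'hole' in the pairs
// array, which the scan simply skips.
final class PairScan {
    static void forEachPair(Object[] pairs, BiConsumer<Object, Object> action) {
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            if (pairs[i] != null) {               // skip deleted holes
                action.accept(pairs[i], pairs[i + 1]);
            }
        }
    }
}
```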
> But there can be problems. For example, if we iterated over some key that
> was deleted afterwards and then inserted again at a higher array index, it
> could eventually come up a second time in the iterator. But if we save the
> non-empty index present at iteration start, and monitor insert/delete
> operations (possible holes) in indices we have not iterated over yet, we
> will not need to throw ConcurrentModificationException in most cases.
> I've done some benchmarks that support my assumptions:
> * Reading an existing key-value mapping has the same speed,
> * Adding a new mapping and looking for a missing key is significantly faster.
> But there could be problems in my testing strategy, or the benchmark could
> be somehow biased, so it's better to retest it with some independent
> benchmarks.
> Hope it will help to make java collections faster, since such an approach
> (preallocated arrays instead of ...Entry objects) can be used to speed up
> other collections. In my scala version both HashMap and HashSet are
> implemented this way. It also stores primitive types like int in primitive
> arrays, which saves a lot of memory. But on the one hand scala has
> built-in boxed arrays, and on the other hand such array boxing is slow.
> It also has to store keys and values in different arrays, which adds an
> extra random memory read and is thus slower - when keys and values are in
> adjacent cells of the same array, the value will usually be pre-fetched
> when the key is read.
> Alex Yakovlev