Bounds Check Elimination with Fast-Range

Florian Weimer fweimer at
Mon Nov 18 13:10:27 UTC 2019

* August Nagro:

> The fast-range[1] algorithm is used to map well-distributed hash
> functions to a range of size N. It is ~4x faster than using integer
> modulo, and does not require the table to be a power of two. It is
> used by libraries like Tensorflow and the StockFish chess engine.
> The idea is that, given (int) hash h and (int) size N, then ((long) h)
> * N) >>> 32 is a good mapping.

I looked at this in the past weeks in a different context, and I don't
think this would work because we have:

jshell> Integer.hashCode(0)
$1 ==> 0

jshell> Integer.hashCode(1)
$2 ==> 1

jshell> Integer.hashCode(2)
$3 ==> 2

jshell> "a".hashCode()
$4 ==> 97

jshell> "b".hashCode()
$5 ==> 98

Under the allegedly good mapping, those all map to bucket zero even for
really large arrays, which is not acceptable.  The multiplication
shortcut only works for hash functions which behave in certain ways.
Something FNV-style for strings is probably okay, but most Java
hashCode() implementations likely are not.

For non-power-of-two bucket counts, one could try to pre-compute the
reciprocal as explained in Hacker's Delight and in these posts:


(I need to write to the author and have some of the math fixed, but I
think the general direction is solid.)

For an internal hash table, it is possible to use primes which are
convenient for the saturating increment algorithm because the choice of
bucket count is an implementation detail to some extent.  (It is not in
my case, so it would need data-dependent branches, which is kind of

Not discussed on the quoted pages is a generalization which uses

  hashCode - bucketCount * (int) Long.multiplyHigh(hashCode + 1L, magic)

as the bucket number.  That works for any table size that is not a power
of two, but requires a fast multiplier to get the upper half of a 64x64


More information about the hotspot-dev mailing list