[9] Review request for 8140503: JavaFX "Pair" Hash Code Collisions

Jim Graham james.graham at oracle.com
Tue Nov 3 22:05:24 UTC 2015

Absolutely, this report uncovered an unrelated NPE bug which is helpful, 
and apparently different constants might affect the collision 
probabilities (which were already very unlikely to begin with), but the 
bug report was definitely filed primarily as a misunderstanding.

One thing to note - perhaps small numbers are more likely to be 
encountered so the larger the prime, the less likelihood of getting a 
collision.  I wouldn't recommend using too large of a prime since the 
probability is already so low even with 13, and larger primes may 
increase the size of the byte code and compiled code since larger 
constants take more complicated instructions to deal with.  Josh Bloch 
recommended a base of 17 and a multiplier of 37 in Chapter 3 of 
Effective Java, but he also admitted that the numbers were arbitrary 
except for the multiplier needing to be an odd prime.

One reason 13 might be too low depends on the probability that Pair is 
used in hash cases with lots of very small numbers.  Adding in a base 
prime and using a little bit larger prime multiplier quickly turns lots 
of small numbers into massively distributed hashes - even with 31 or 17/37.

I kind of feel like using the existing utility is overkill for just 2 
objects since it requires constructing an array object to use it.  I'd 
be just as happy (perhaps even more so) if we just upgraded the code to 
check key for null and use a little larger prime, but keep it inlined...


On 11/3/15 1:43 PM, Alexander Kouznetsov wrote:
> Moreover, the following two sentences:
> "However, this is an incorrect way to compute a hash code of two values."
> "This can lead to hard-to-find bugs anywhere that instances of Pair are
> used in a data structure like a HashSet or HashTable."
> seem to indicate misunderstanding of what hashcode is and how it is to
> be used.
> Best regards,
> Alexander Kouznetsov
> (408) 276-0387
> On 3 ноя 2015 13:42, Alexander Kouznetsov wrote:
>> After the fix, you should expect another incident report of
>> Objects.hash(1, 0) == Objects.hash(0, 31)
>> always true :-)
>> I'd rather file another bug on key == null causing NPE and closing
>> this one as incomplete or not an issue.
>> Best regards,
>> Alexander Kouznetsov
>> (408) 276-0387
>> On 3 ноя 2015 12:07, Vadim Pakhnushev wrote:
>>> Hmm, yeah, the actual difference is in the prime number only (that is
>>> changing the algorithm only doesn't improve anything), so the only
>>> remaining reason to fix this is that Objects.hash guards against null
>>> values (and I forgot to mention it in the review).
>>> The key in Pair could actually be null and in this case hashCode will
>>> throw NPE.
>>> Vadim
>>> On 03.11.2015 23:01, Vadim Pakhnushev wrote:
>>>> Well, not exactly... Previously it was 13*hash(a) + hash(b) and now
>>>> it's 31*(31 + hash(a)) + hash(b).
>>>> And apparently it improves the quality somehow. I did a test with
>>>> 100^4 combinations and collision probability dropped by the factor
>>>> of 3 from 0.065% to 0.022%.
>>>> Not really impressive, but still, and it uses well-defined utility
>>>> method.
>>>> Yeah, I know it's not really a bug since you don't want to rely on
>>>> the hashCode at all...
>>>> Thanks,
>>>> Vadim
>>>> On 03.11.2015 22:35, Jim Graham wrote:
>>>>> All this does is change the prime constant used to produce the hash
>>>>> value.
>>>>> Objects.hash(a, b) uses 31*hash(a) + hash(b) instead of the
>>>>> 13*hash(a) + hash(b) that the embedded implementation uses.
>>>>> I don't really think this is a bug.  The fact that Integer objects
>>>>> make it easy to reverse engineer and compute collisions of any
>>>>> reasonable hash combination computation don't mean that the
>>>>> technique has a bug, it just means that the submitter can read the
>>>>> code and think of a counter-example.
>>>>> If there are practical problems being caused for some particular
>>>>> and popular use case by the use of this particular constant "13",
>>>>> then we need to understand those issues and come up with a more
>>>>> comprehensive solution than to simply hand off to another mechanism
>>>>> which uses the same procedure with a different prime constant...
>>>>>             ...jim
>>>>> On 11/3/15 3:06 AM, Vadim Pakhnushev wrote:
>>>>>> Hi Chien,
>>>>>> Could you please review the fix:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8140503
>>>>>> http://cr.openjdk.java.net/~vadim/8140503/webrev.00/
>>>>>> Thanks,
>>>>>> Vadim

More information about the openjfx-dev mailing list