John Hendrikx hjohn at
Thu Oct 1 00:21:18 PDT 2009

It will depend a lot on the relative speed of equals vs hashcode.  
Objects with fast hashcode functions but slow equals functions will 
perform relatively good in Sets, while the other way around will favor 
Lists.  That said, the difference between the two functions would have 
to be huge to favor a List for any relevant amount of elements.

Contains in Set, calls hashcode once and equals atleast once.  With a 
decent hash function and the default load factor it is rare that equals 
is called a second time for the contains operation.

Contains in List will call equals size / 2 times on average.


Gene Ray wrote:
> Reinier--
> I benchmarked this, and appear to have arrived at
> different results than you.  I used lists/sets of 10-character strings
> of lowercase letters, which is close a lot of use-cases I've seen, and
> ran 10^7 trials for each implementation (I used HashSet/ArrayList). 
> For 20 elements, set was a little over 3x faster (891 ms vs 2906 ms);
> for 7 elements set was still close to twice as fast (872/1550); for 2
> elements set was still slightly faster (859/969).   List was indeed
> faster with only one element (906/844), but at this point, I personally
> would just be using the .equals method instead.   Varying the string
> length (I tried 5, 20, and 50) increased or decreased overall times
> slightly, but did not affect relative performance.  I would be curious
> to learn what you benchmarked against.
> The above was done with
>  jdk1.6.0_12-b04, according to java -version. 
> Perfomance
> is not the only concern.  "Sets" in java represent the mathematical
> notion of a set; lists do not, and pretending that the latter is
> equivalent to the former is a logical error regardless of common practice.
>> NB: Gene, you're trying to argue that a _literal_ set is
>> going to make  
>> some sort of speed difference compared to a a_literal_
>> list. That  
>> notion is frankly ridiculous. You also lost track of the
>> first rule of  
>> discussing speed: Test it first. So, go ahead. Benchmark a
>> bunch  
>> of .contains() calls on a list and a set with the same
>> items in it,  
>> both with say, 20 items in them (we are talking about
>> literals, after  
>> all. I don't think anyone is seriously considering sticking
>> hundreds  
>> of lines of code in a
>  .java file to store constant data!)
>> -  on my own  
>> machine Lists are less than a factor of 2 slower. For about
>> 7 items  
>> and down, lists are in fact faster. Also, just in case
>> someone IS  
>> tempted to write such a large collections literal: For such
>> a large  
>> literal, the added burden of wrapping the literal in a
>> "new  
>> HashSet<>()" or sticking a ".toSet();" at the end
>> seems trivial.
>> I think the 'set literals are rare' seem to have it, so I
>> repeat my  
>> plea to the set literal fans: Give us some proof they
>> aren't rare.  
>> Given that set literals cause so much pain, the burden is
>> clearly on  
>> the supporters of a set literal to prove why we need them.
>>    --Reinier Zwitserloot
>> On 2009/29/09, at
>  21:07, Gene Ray wrote:
>>> "Rare"?
>>> In my experience, Sets are not rare in well-written
>> code; they're  
>>> only rare in code where for whatever reason the
>> developer has  
>>> refused to use them, and instead expends effort and
>> CPU time  
>>> iterating through an array or ArrayList to achieve the
>> equivalent  
>>> functionality.  Encouraging this sort of behavior
>> further by  
>>> including only Lists in the new syntax is not a good
>> plan.

More information about the coin-dev mailing list