list literal gotcha and suggestion
reinier at zwitserloot.com
Thu Oct 1 06:25:56 PDT 2009
How is a mere constant factor of 3 relevant when the set/list contains
so few entries? Writing code with performance in mind before you
actually run into a performance problem is bad, but designing the
language with this in mind is undoubtedly worse. In the extremely rare
cases where this speed issue is relevant, the burden of converting
that list to a set remains a drop in the bucket compared to the other
hoops one would normally have to jump through in order to solve a
performance problem. I have absolutely no idea why 'speed' is being
put on the table here as an argument in favour of set literals. I
suggest we forget entirely about the speed issue and focus instead on
getting to some solutions.
For example, instead of stubbornly denying that Set literals are less
important than List literals, can you, as an avid set user, shed some
light on how acceptable some of the alternative proposals are, such as:
[1, 2, 3].toSet();
new Set[1, 2, 3]; // keep in mind that without the 'new', this syntax
                  // is pretty much impossible to integrate into the
                  // existing Java grammar, due to looking like an
                  // array dereference operation.
new HashSet<>([1, 2, 3]);
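For reference, none of the three forms above is existing Java syntax. A minimal sketch of the closest equivalents that compile without any list literal follows. It assumes a Java version that accepts @SafeVarargs; setOf is a hypothetical helper standing in for the proposed .toSet(), not an existing library method.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetAlternatives {
    // Hypothetical helper standing in for the proposed [1, 2, 3].toSet().
    @SafeVarargs
    public static <T> Set<T> setOf(T... items) {
        return new HashSet<T>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        // Closest existing analogue of: new HashSet<>([1, 2, 3])
        Set<Integer> wrapped = new HashSet<Integer>(Arrays.asList(1, 2, 3));

        // Analogue of: [1, 2, 3].toSet()
        Set<Integer> viaHelper = setOf(1, 2, 3);

        // Both routes build the same set.
        System.out.println(wrapped.equals(viaHelper)); // prints "true"
    }
}
```

Either way the extra wrapping costs a single call around the element list, which is the overhead being weighed against a dedicated set literal.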
NB: I tested on a core2duo mac, version 1.6.0_15, and I used
sets/lists of Integer instead of String.
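The kind of test being compared in this thread can be sketched roughly as below. This is a naive timing loop, not the code either of us actually ran: the class and method names are made up for illustration, the half-hit/half-miss probe mix is an assumption, and a loop like this is sensitive to JIT warm-up, so treat any numbers it prints as indicative at best.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class ContainsBenchmark {
    // Runs `trials` contains() calls, cycling through `probes`,
    // and returns the elapsed wall-clock time in milliseconds.
    public static long timeContains(Collection<Integer> c, int[] probes, int trials) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < trials; i++) {
            if (c.contains(probes[i % probes.length])) {
                hits++;
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        // Use 'hits' so the loop body cannot be discarded as dead code.
        if (hits < 0) {
            throw new IllegalStateException("unreachable");
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        int trials = 10000000; // 10^7, as in the quoted message
        for (int size : new int[] {1, 2, 7, 20}) {
            List<Integer> list = new ArrayList<Integer>();
            for (int i = 0; i < size; i++) {
                list.add(i);
            }
            Collection<Integer> set = new HashSet<Integer>(list);

            // Assumption: probe with an even mix of present and absent values.
            int[] probes = new int[size * 2];
            for (int i = 0; i < probes.length; i++) {
                probes[i] = i;
            }

            // Short warm-up pass before the timed runs.
            timeContains(list, probes, trials / 10);
            timeContains(set, probes, trials / 10);

            long listMs = timeContains(list, probes, trials);
            long setMs = timeContains(set, probes, trials);
            System.out.println(size + " elements: list " + listMs
                    + " ms, set " + setMs + " ms");
        }
    }
}
```

Swapping Integer for 10-character strings, or changing the hit/miss ratio, is the sort of variation that could plausibly account for two people measuring different crossover points.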
NB2: I wholeheartedly agree with Stephen Colebourne's sentiment: I
don't believe adding set literals is somehow going to make java joe
see the light and start using sets a lot more.
On 2009/30/09, at 16:02, Gene Ray wrote:
> I benchmarked this, and appear to have arrived at different results
> than you. I used lists/sets of 10-character strings of lowercase
> letters, which is close to a lot of use-cases I've seen, and ran 10^7
> trials for each implementation (I used HashSet/ArrayList). For 20
> elements, set was a little over 3x faster (891 ms vs 2906 ms); for 7
> elements set was still close to twice as fast (872/1550); for 2
> elements set was still slightly faster (859/969). List was indeed
> faster with only one element (906/844), but at this point, I
> personally would just be using the .equals method instead. Varying
> the string length (I tried 5, 20, and 50) increased or decreased
> overall times slightly, but did not affect relative performance. I
> would be curious to learn what you benchmarked against.
> The above was done with jdk1.6.0_12-b04, according to java -version.
> Performance is not the only concern. "Sets" in java represent the
> mathematical notion of a set; lists do not, and pretending that the
> latter is equivalent to the former is a logical error regardless of
> common practice.
> > NB: Gene, you're trying to argue that a _literal_ set is going to make
> > some sort of speed difference compared to a _literal_ list. That
> > notion is frankly ridiculous. You also lost track of the first rule of
> > discussing speed: Test it first. So, go ahead. Benchmark a bunch
> > of .contains() calls on a list and a set with the same items in it,
> > both with say, 20 items in them (we are talking about literals, after
> > all. I don't think anyone is seriously considering sticking hundreds
> > of lines of code in a .java file to store constant data!) - on my own
> > machine Lists are less than a factor of 2 slower. For about 7 items
> > and down, lists are in fact faster. Also, just in case someone IS
> > tempted to write such a large collections literal: For such a large
> > literal, the added burden of wrapping the literal in a "new
> > HashSet<>()" or sticking a ".toSet();" at the end seems trivial.
> > I think the 'set literals are rare' camp seems to have it, so I repeat
> > my plea to the set literal fans: Give us some proof they aren't rare.
> > Given that set literals cause so much pain, the burden is clearly on
> > the supporters of a set literal to prove why we need them.
> > --Reinier Zwitserloot
> > On 2009/29/09, at 21:07, Gene Ray wrote:
> > > "Rare"?
> > >
> > > In my experience, Sets are not rare in well-written code; they're
> > > only rare in code where for whatever reason the developer has
> > > refused to use them, and instead expends effort and CPU time
> > > iterating through an array or ArrayList to achieve the equivalent
> > > functionality. Encouraging this sort of behavior further by
> > > including only Lists in the new syntax is not a good plan.
> > >