Impact of code difference in Collection#contains() worth improving?
john.r.rose at oracle.com
Fri Sep 5 23:25:24 UTC 2014
On Aug 30, 2014, at 7:17 AM, Ulf Zibis <Ulf.Zibis at cosoco.de> wrote:
> Am 30.08.2014 um 01:33 schrieb John Rose:
>> On Aug 29, 2014, at 1:05 PM, Ulf Zibis <Ulf.Zibis at cosoco.de <mailto:Ulf.Zibis at cosoco.de>> wrote:
>>> Thanks for explaining this, but a very little nit: the immediate (I.e. -1) uses additional 32/64 bits in code which must be loaded from memory and wastes space in CPU cache or am I wrong? This could be saved with >= 0.
>> I have to say you're more wrong than right about this. Optimizers routinely change the form of constants. For example, a constant 0 will often show up as something like "xor eax,eax", not a 32-bit literal zero that loads from somewhere in memory. A comparison of the form "x > -1" will be freely changed to "x >= 0" and back again; the latter form may (or may not, depending on chip version) transform to "test eax", with no "-1" or "0" in sight.
> 1. Thanks for the hint about "x > -1" ===> "x >= 0". But I'm afraid this would apply on the "x != -1" case we are discussing here.
The equality comparison would be less likely to transform to a non-equality comparison. But it is still possible in principle, if the JIT could prove that values "x < -1" are impossible.
> 2. Are you really sure this optimization is always implemented, as following bug is still open:
> JDK-6984886 : Transform comparisons against odd border to even border <http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6984886>
Well, I'm pretty sure that no optimizations is "always" implemented; there are usually corner cases which prevent optimizations. There is always a possibility of deoptimization to the interpreter, for example. And there are lots of nice-to-have optimizations that don't make a difference to real applications. Much of compiler development is driven by observing actual speed bottlenecks and removing them.
But the bug reference is useful; thanks! I added a comment to show where the optimization occurs now—disappointingly little I grant you—and where to start looking to improve it. The bug is rated P5 (lowest rating) probably because the reporter said "it should be faster"; "wouldn't this be nice" is a far weaker argument than "I'm spending too much on hardware because this loop is slow".
My comments about the unpredictability of optimizers still stand. The most robust way to manage such problems is, first confirm it is a real performance problem (not a micro-nano thing), and second get the optimizer to make all equivalent inputs produce good code. The tiny place where this optimization is done in HotSpot suggests that that was a place where the transform actually mattered the most.
More information about the core-libs-dev