RFR 8199843 : Optimize Integer/Long.highestOneBit()
paul.sandoz at oracle.com
Mon Mar 26 21:42:35 UTC 2018
> On Mar 26, 2018, at 2:26 PM, Ivan Gerasimov <ivan.gerasimov at oracle.com> wrote:
> Thank you Andrew for looking into this!
> On 3/24/18 4:13 AM, Andrew Haley wrote:
>> On 03/20/2018 05:20 PM, Ivan Gerasimov wrote:
>>> I tried to run it, but the numbers are non-distinguishable for non-zero arguments.
>>> And my variant performs slightly better with zero argument.
>>> So, I think it's reasonable to keep the variant with the ternary operator.
>> I am suspicious of this argument. Did you look at the generated code?
>> I get
>> cbnz w10, 0x000003ffa8202384
>> mov w0, wzr
>> for the zero test and
>> cbz w10, 0x000003ffa81d228c
>> clz w11, w10
>> orr w10, wzr, #0x80000000
>> lsr w0, w10, w11 ;*iushr
>> for 42.
>> The branch at the start of both versions goes to a deoptimize trap.
>> We don't want deoptimize traps if we can avoid them, so the branchless
>> version is better IMO.
> This looks persuasive, so let's go ahead with the branchless variant!
In my experience with nano-benchmarks like this, it's often informative to look at the generated code.
This is even easier now that JMH supports a dtrace ASM profiler on the Mac:
(YMMV: I ran into some issues on the Mac whereby the spawned dtrace stopped logging output and needed to be poked with a signal to get going again and dump the rest of its output. I have not had time to investigate in detail and report back on this.)
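
For readers following along without the webrev, here is a minimal sketch of the two shapes being compared. The class and method names are hypothetical, and the bodies are my reading of "ternary" versus "branchless" from the discussion above, not necessarily the exact code under review. Both are checked against Integer.highestOneBit() in main.

```java
// Hypothetical illustration only; names and bodies are assumptions,
// not the code from the webrev.
final class HighestOneBitSketch {

    // Ternary variant: test for zero explicitly, otherwise shift the
    // sign bit right by the number of leading zeros.
    static int ternary(int i) {
        return i == 0 ? 0 : Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i);
    }

    // Branchless variant: for i == 0 the leading-zero count is 32, which
    // Java masks to a shift of 0, so the shift yields MIN_VALUE; the final
    // AND with i (zero) then produces 0 without any zero test.
    static int branchless(int i) {
        return i & (Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i));
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 42, -1, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int v : samples) {
            // Print all three results side by side for comparison.
            System.out.printf("%d: ternary=%d branchless=%d jdk=%d%n",
                    v, ternary(v), branchless(v), Integer.highestOneBit(v));
        }
    }
}
```

Both variants should return the same values; the difference Andrew points at only shows up in the generated code, where an explicit zero test can end up as a branch to a deoptimize trap.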