Math trig intrinsics and compiler options
gustav.trede at gmail.com
Thu Aug 6 09:31:05 PDT 2009
2009/8/6 Joseph D. Darcy <Joe.Darcy at sun.com>
> gustav trede wrote:
>> 2009/7/16 Christian Thalinger <Christian.Thalinger at sun.com>
>> Azeem Jiva wrote:
>> > Joe,
>> > Gustav sent me an email asking for help with the
>> intrinsification of
>> > the trig functions and a suggestion I gave him was to not call
>> > fsin/fcos/ftan since those instructions are microcoded on Intel/AMD
>> > hardware and very slow. Slower than the call to
>> > sharedRuntimeTrig.cpp, and in all cases it's best to stay away from
>> > the hardware instructions.
>> I just did some micro-benchmarking on an Intel Core2 Duo and in the
>> range of [0,2pi) inlining the hardware instructions is slightly faster
>> (about 2.5%). Limiting the range to [0,pi/4) (means no runtime calls)
>> hardware instructions are 1.5x faster.
>> I think we should keep the current approach.
>> -- Christian
>> Neither the Linux nor the Windows platform has compiler optimizations
>> enabled; only Solaris does. It seems that when this was evaluated many
>> years ago, no other platform had working compilers.
>> That fact alone is likely to make the fsin/fcos path faster than the C
>> version for the +-PI/4 range on those platforms.
>> It is some work to check the current status of the different
>> platforms/compilers, i.e. whether they still produce bad code with
>> optimizations enabled or not;
>> it is, however, reasonable to expect the compilers to improve over the years.
> The code from the non-Sun C compilers is not "bad" per se, it is just bad
> in not implementing the desired semantics of the FDLIBM code, which is very
> sensitive to optimizations legal in C which defeat the purpose of the code.
> The Sun C compiler can be sufficiently attuned to such floating-point needs
> under optimization, the other C compilers were not and I suspect still are
> My preferred long-term approach is to port the FDLIBM C code to Java, which
> I've wanted to do for a while, but has never bubbled to the top of my to-do
>> Regarding the proposed patch (sharedRuntimeTrig.cpp used for the entire
>> input range, without external rounding):
>> I compared against 3 input/output pairs that have leaked from the JCK, and
>> against the current Math implementation for many input/output pairs, and I
>> did not manage to detect any differences.
> What is many? There are on the order of 2^64 inputs to check!
Yes, I'm fully aware that it's hardly practical to check all values;
that's why I asked for help with conformance testing.
I had hoped that someone might know whether the rounding really is needed
for the Math trig functions or is only there for convenience.
Answers claiming that the 1 ulp (cos, sin) fdlibm code produces "arbitrary
bad" values, as motivation for the rounding, do not give a credible
Anyhow, I will drop this issue now.
Thanks for your time.
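
For reference, the kind of sampled comparison discussed above could be
sketched roughly as follows. This is only an illustration, not the test used
for the patch: the class name TrigUlpCheck, its helper methods, the seed and
the sample count are all made up here, and StrictMath is used as a stand-in
reference implementation. A random sample like this can only fail to find
differences; it says nothing about the inputs it did not visit.

```java
import java.util.Random;

// Hypothetical helper (not part of HotSpot or the JCK): samples inputs and
// reports the largest distance, in units of adjacent doubles, between
// Math.sin/cos and StrictMath.sin/cos.
final class TrigUlpCheck {

    // Map a double onto a long so that numeric order matches integer order.
    // Adjacent doubles then differ by exactly 1, and -0.0 and +0.0 both map to 0.
    static long orderedBits(double d) {
        long bits = Double.doubleToLongBits(d);
        return bits < 0 ? Long.MIN_VALUE - bits : bits;
    }

    // Number of representable doubles between a and b. Intended for finite
    // values of moderate magnitude (such as sin/cos results in [-1, 1]);
    // the subtraction can overflow for values near +-Double.MAX_VALUE.
    static long ulpDistance(double a, double b) {
        return Math.abs(orderedBits(a) - orderedBits(b));
    }

    // Sample 'samples' inputs uniformly from [lo, hi) and return the largest
    // observed distance between the Math and StrictMath results.
    static long maxUlpDiff(long seed, int samples, double lo, double hi) {
        Random rnd = new Random(seed);
        long max = 0;
        for (int i = 0; i < samples; i++) {
            double x = lo + (hi - lo) * rnd.nextDouble();
            max = Math.max(max, ulpDistance(Math.sin(x), StrictMath.sin(x)));
            max = Math.max(max, ulpDistance(Math.cos(x), StrictMath.cos(x)));
        }
        return max;
    }

    public static void main(String[] args) {
        // Seed and sample count are arbitrary choices for this sketch.
        long worst = maxUlpDiff(42L, 1_000_000, 0.0, 2 * Math.PI);
        System.out.println("largest observed difference (in ulps): " + worst);
    }
}
```

The result depends on the JVM and on whether the intrinsic path is taken, so
no expected output is given here.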