RFR: 8210416: [linux] Poor StrictMath performance due to non-optimized compilation
sgehwolf at redhat.com
Mon Sep 10 13:26:40 UTC 2018
On Mon, 2018-09-10 at 10:15 -0300, Gustavo Romero wrote:
> Hi Severin,
> On 09/10/2018 06:27 AM, Severin Gehwolf wrote:
> > On Mon, 2018-09-10 at 10:05 +0100, Andrew Haley wrote:
> > > On 09/05/2018 02:12 PM, Severin Gehwolf wrote:
> > > > Is there a good
> > > > reason to not use -O3 -ffp-contract=off everywhere?
> > >
> > > Is there a good reason to use -O3 rather than -O2?
> > Not sure. I was following what JDK-8170153 did, which was using
> > OPTIMIZATION := HIGH corresponding to -O3. cc'ing Gustavo. Gustavo,
> > would you know why HIGH was chosen over LOW?
> I don't remember exactly, but at least for ppc64 I discussed that a bit with
> the toolchain folks (also regarding the precision issue, etc) and they never
> said anything against using -O3. Unfortunately it was a long time ago, so I
> don't remember the exact numbers on ppc64 for -O2, i.e. whether it was
> worse and I selected -O3 for that reason.
> > > -O3 can bloat the
> > > code which can increase cache pressure, which is not always noticeable
> > > in benchmarks but hurts real-world programs. Unless benchmarks are
> > > significantly better at -O3, -O2 is a good default choice.
> > OK, thanks! I'll re-test and change to LOW (-O2) if it gives similar
> > results.
> That's interesting. Andrew, you mean bloat in the sense of final code size
> (for instance, due to unrolling), right?
> BTW (I just remembered this), on RISC the lack of optimization hurts way more
> than on CISC, so I recall being puzzled that turning on optimization on
> x86_64 did not change the picture much, in contrast to the conspicuous gains
> on ppc64.
> It took me some time to understand that the optimization flag was the culprit
> (luckily a much simpler cause), because I first tried to profile and optimize
> the fdlibm code (after extracting it from the JVM for detailed analysis) and
> only after hitting a dead end did I turn to simpler causes.
> Are you checking the difference between -O2 and -O3 only on x86_64?
So far yes. I'll see if I can get access to a ppc64 machine to check
there as well. The plan is to run (some) TCK on a patched x86_64 build
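
For readers wondering why -ffp-contract=off comes up alongside the optimization level: with contraction enabled, a C compiler may fuse a multiply and an add into a single FMA instruction on targets that have one (ppc64, for instance), which rounds once instead of twice and can change the last bits of a result. The sketch below is only an illustration in Java, not code from the patch or from fdlibm; the class name FpContractDemo and the chosen operands are assumptions made for the example. It uses Math.fma (Java 9+) to stand in for what a contracted expression would compute.

```java
public class FpContractDemo {
    // Separate multiply and add: the product is rounded to double
    // before the addition, so two roundings take place.
    static double unfused(double a, double b, double c) {
        return a * b + c;
    }

    // Fused multiply-add: a * b + c is evaluated as if exactly and
    // rounded once. This models what a contracting compiler may emit.
    static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        // Hypothetical operands: the exact product is 1 - 2^-60, which is
        // not representable as a double and rounds to 1.0.
        double a = 1.0 + 0x1.0p-30;
        double b = 1.0 - 0x1.0p-30;
        double c = -1.0;

        System.out.println("unfused: " + unfused(a, b, c));
        System.out.println("fused:   " + fused(a, b, c));
    }
}
```

The two results differ because the unfused form loses the 2^-60 term when the product is rounded, while the fused form keeps it. A difference of this kind is what would break the bit-for-bit reproducibility that StrictMath promises, independent of whether -O2 or -O3 is chosen.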