Request for tracking down C1 optimizations: handwritten cartesian product similar to flatmap/map performance!
biboudis at gmail.com
Thu May 29 18:39:36 UTC 2014
Of course, C2, my bad.
Regarding cache misses, I still have to examine whether loop interchange
happens and, more generally, what x86 code is emitted, but thanks for
pointing that out.
On Thu, May 29, 2014 at 9:03 PM, Andrew Haley <aph at redhat.com> wrote:
> On 05/29/2014 06:55 PM, Aggelos Biboudis wrote:
> > I would like to ask you something regarding C1 compilation (VM options:
> > -Xms769m -Xmx769m -XX:-TieredCompilation)
> That's C2 compilation.
> > of a Cartesian product stream
> > operation with the new stream API.
> > I have two versions of this computation, one handwritten and one with
> > flatmap/map. It is remarkable that these two have similar performance, so
> > I would like to trace back the JIT compilation decisions (apart from
> > inlining), and more specifically whether escape analysis has any effect.
> Are you quite sure your numbers aren't dominated by cache misses? Your
> data is about 40 Megabytes and it's being accessed sequentially.
> > I've tested the code above with -XX:-DoEscapeAnalysis and I've got the
> > execution times; however, I would like to confirm what happens.
> > Regarding inlining, only by noticing the result of PrintInlining we
> > conclude that cartSeq inlines all the nested forEachRemaining operations
> > (of of, flatmap, map), but is that the only optimization?
> Not if this really is C2, no. There are many optimization passes,
> and several will be effective for this code.
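
For readers without the original benchmark at hand, here is a minimal
sketch of the two shapes being compared. It is not the code from this
thread: the class name CartesianSketch, the method names cartStream and
cartLoop, and the choice of summing x * y over long arrays are all
assumptions made for illustration.

```java
import java.util.stream.LongStream;

// Hypothetical sketch; names and the x * y reduction are assumptions,
// not the benchmark discussed in this thread.
public class CartesianSketch {

    // Stream version: for each x, map every y to x * y, flatten, then sum.
    static long cartStream(long[] xs, long[] ys) {
        return LongStream.of(xs)
                .flatMap(x -> LongStream.of(ys).map(y -> x * y))
                .sum();
    }

    // Handwritten version: the same computation as two nested loops.
    static long cartLoop(long[] xs, long[] ys) {
        long sum = 0;
        for (long x : xs) {
            for (long y : ys) {
                sum += x * y;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] xs = LongStream.rangeClosed(1, 1_000).toArray();
        long[] ys = LongStream.rangeClosed(1, 10).toArray();
        // Both versions should agree on the result.
        System.out.println(cartStream(xs, ys) == cartLoop(xs, ys));
    }
}
```

To look at the compiler decisions asked about above, one option is to run
such a class with -XX:-TieredCompilation plus -XX:+UnlockDiagnosticVMOptions
-XX:+PrintInlining, and then once more with -XX:-DoEscapeAnalysis added, and
compare the two runs. A one-shot main like this is only a starting point;
the methods need to be called often enough to be compiled at all.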