RFR (M): 6443505: Ideal() function for CmpLTMask
vladimir.kozlov at oracle.com
Thu Mar 28 16:37:42 PDT 2013
The "pipeline" info is not used in code generation, and it is outdated anyway.
So use what other similar instructions use.
I think we should use only the branch variant on all x86 (32- and 64-bit). We
save a register (tmp), and that is MUCH MUCH more important for performance
(fewer spills to the stack). And you need only 3 instructions for cadd_cmpLTMask.
And please remove the unneeded spaces near parentheses, at least in new code.
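
For reference, the two shapes under discussion can be sketched at the Java
level. This is an illustration only: the class and method names are made up,
and the expressions reflect one reading of the cadd_cmpLTMask and
and-CmpLTMask patterns (a subtract followed by a conditional add, and a masked
select), not code taken from the change under review.

```java
// Hypothetical illustration; names are not from the HotSpot sources.
final class CmpLTMaskSketch {
    // Mask form: materialize (p < q) as -1/0, then AND it with y.
    // The mask has to live somewhere, which is where a temporary comes in.
    static int caddMask(int p, int q, int y) {
        int mask = (p < q) ? -1 : 0;   // the value CmpLTMask produces
        return (p - q) + (mask & y);
    }

    // Branch form: same result, but no mask value is ever built.
    static int caddBranch(int p, int q, int y) {
        int r = p - q;
        if (p < q) {
            r += y;
        }
        return r;
    }

    // and-CmpLTMask on its own: equivalent to (p < q) ? y : 0.
    static int andMask(int p, int q, int y) {
        return ((p < q) ? -1 : 0) & y;
    }
}
```

Both cadd forms agree for all int inputs, including wrap-around in p - q,
so the choice between them is purely about the generated code.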
On 3/27/13 6:09 PM, David Chase wrote:
> (as stated)
> C2 needs a special case like CmpLTMask (turn result of a comparison into a -1/0 mask) for the comparison of a difference with zero.
> (as actually observed)
> 1) The improvement in that case is minimal and (very) difficult to get to trigger, because CSE interferes and extracts the p-q from the pattern.
> 2) The original code generation for caddCmpLTMask on some platforms was actually wrong (used carry bit for a signed comparison)
> 3) The original Ideal pattern matching, because of a typo/thinko, accidentally failed to apply in the symmetric case
> 4) Code generation for CmpLTMask on all platforms omitted the somewhat relevant case of and-CmpLTMask; if the very specific pattern failed to apply, then it would fall back to the laborious calculation of an actual mask, when much more compact code could apply.
> Repair wrong code generation.
> Write additional pattern for (and (cmpltmask p q) y)
> Fixed the typo so the pattern fires more often.
> Wrote a new test to definitely exercise the two patterns in question.
> Verified that "new test" would fail running with unfixed compiler.
> Verified (observing assembly output) that the new patterns matched on x86_32-cmov, x86_32+cmov, x86_64, and Sparc
> (except that I could not get the and-cmpltmask pattern to fire on Sparc.
> Bit of a shame we lack a cumulative coverage tool wired into jtreg so we could easily know if it ever fired at all).
> Benchmarked change on x86_64 (saw little or no performance difference on the whole benchmark)
> JPRT on compiler tests (clean runs thwarted by irrelevant failures, but it was always the same 2 or 3 borked tests.)
> JPRT on just the new test (clean run)
> not 100% sure on the pattern costs.
> not 100% sure on the choice of "pipeline".
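
Item 2 in the quoted list (carry bit used for a signed comparison) can be
illustrated with a small Java sketch. It assumes the flag in question behaves
like a borrow out of "p - q", that is, an unsigned less-than; the class and
method names are hypothetical and the code only models the arithmetic, not
the actual instruction sequence.

```java
// Hypothetical illustration; names are not from the HotSpot sources.
final class SignedVsCarry {
    // Signed less-than mask: the semantics CmpLTMask is supposed to have.
    static int signedMask(int p, int q) {
        return (p < q) ? -1 : 0;
    }

    // Mask derived from a borrow/carry out of p - q, modeled here as an
    // unsigned comparison (assumption stated above).
    static int carryMask(int p, int q) {
        return (Integer.compareUnsigned(p, q) < 0) ? -1 : 0;
    }
}
```

The two masks agree when p and q have the same sign and differ when the signs
are mixed, for example p = -1, q = 1, which is why a test has to include
negative operands to catch this kind of error.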
More information about the hotspot-compiler-dev