RFR: JDK-8213473: Prevent transformation of LoadB->AndI->CmpI pattern to facilitate testb instruction matching
rkennke at redhat.com
Thu Nov 8 12:52:00 UTC 2018
I haven't (yet) come up with an actual micro-benchmark, and haven't yet
run specjvm or such, but I've written a program that triggers the
matching and shows the assembly.
Executed like this (usual jmh):
java -jar target/benchmarks.jar -prof perfasm
Yields the following assembly (without patch):
0x00007f5ff0587edc: movzbl 0xc(%rsi),%r10d
0,06% 0x00007f5ff0587ee1: mov $0x2,%r8d
0x00007f5ff0587ee7: mov $0x1,%eax
0x00007f5ff0587eec: test $0x2a,%r10d
and with patch:
0,08% 0x00007f27605888e7: testb $0x2a,0xc(%rsi)
I don't know if there's any chance that this performs worse? Maybe the
two instructions schedule better because CPU can do other work while
waiting for memory? It surely saves a register, which might come useful
under register pressure. And it compiles to more compact code.
Do you want me to take it further and come up with an actual working
microbenchmark? Or do you have benchmarks that you want to run?
> Roman, your change seems fine.
> I thought for SPARC it may regress but the only complex rule with
> AndI+LoadUB I found is for ConvI2L:
> match(Set dst (ConvI2L (AndI (LoadUB mem) mask)));
> I did not find anything on other platforms.
> Anyway it would be nice to run some performance testing.
> On 11/7/18 3:46 AM, Roman Kennke wrote:
>> In JDK-8203628, Aleksey added matchers to generate testb instruction (on
>> x86) for LoadB->AndI->CmpI sequence. However, in AndINode::Ideal() we
>> transform any LoadB->AndI to LoadUB->AndI which fails to match this
>> pattern. Which means we never actually match for testb. The original
>> matchers for LoadUB->AndI->CmpI needed to be removed because of
>> The proposed enhancement would prevent the transformation if we have a
>> straight LoadB->AndI->CmpI sequence. While the matcher is x86-only, I
>> don't think it actually hurts to prevent this particular sequence. It
>> should generate better code (e.g. single testb instructions) for things
>> byte field;
>> if ((field & 3) == 0) ..
>> or even:
>> if (field == x) .. because we always mask byte loads
>> or even (not sure) same for boolean fields.
>> Testing: passes hotspot/jtreg:tier1 here
>> Thoughts? Reviews?
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 833 bytes
Desc: OpenPGP digital signature
More information about the hotspot-compiler-dev