Request for reviews (M): 6987135: Performance regression on Intel platform with 32-bits edition between 6u13 and 6u14.

Tom Rodriguez tom.rodriguez at
Mon Nov 1 11:25:31 PDT 2010

That looks good.


On Oct 29, 2010, at 5:14 PM, Vladimir Kozlov wrote:

> Tom suggested to use second TEMP register instead of push/pop. Here is new webrev:
> Vladimir
> Vladimir Kozlov wrote:
>> Thank you, Tom
>> I updated webrev:
>> Tom Rodriguez wrote:
>>> On Oct 22, 2010, at 6:25 PM, Vladimir Kozlov wrote:
>>>> Fixed 6987135: Performance regression on Intel platform with 32-bits edition between 6u13 and 6u14.
>>>> Changes for 6603011 added the conversion of long
>>>> division by constant to the code with multiply.
>>>> But some modern cpus improved DIV instruction
>>>> performance. Use it for long division by constant
>>>> when it is faster than code with multiply.
>>> The formats in don't match the code.
>> Fixed.
>>> In modL_eReg_imm32, why can't the value be 0 or -1?
>> There are Ideal transformations for such divisor values,
>> DivL and ModL will never go to matcher with such divisor.
>> Asserts verify it.
>>> Why don't you use an immL definition that ensures that?  If imm is MININT then the pcon calculation will go wrong.
>> Matcher::use_asm_for_ldiv_by_con() has check for MININT.
>> I added verification check into asserts.
>>> I believe you could do the register declarations like this:
>> I updated both ModL and DivL code to use only dst.
>> Thanks,
>> Vladimir
>>> + instruct modL_eReg_imm32( eADXRegL dst, eRegL src, immL32 imm, eRegI tmp, eFlagsReg cr ) %{
>>> +   match(Set dst (ModL src imm));
>>> +   effect(TEMP dst, TEMP tmp, KILL cr );
>>> to leave the src and tmp unbound which would give the RA a little more freedom.  Actually wouldn't connecting src and dst directly result in fewer moves in the normal case?  You might need a new temp but it seems like there are quite a few moves of src into dst for the idivl.
>>> + instruct modL_eReg_imm32( eADXRegL dst, immL32 imm, eSIRegI tmp, eFlagsReg cr ) %{
>>> +   match(Set dst (ModL dst imm));
>>> +   effect( KILL cr );
>>> tom
>>>> Tested on US3, T1, T2, Sparc64, AMD and Intel latest cpus.

More information about the hotspot-compiler-dev mailing list