RFR(S): Vector popcount support

Doerr, Martin martin.doerr at sap.com
Wed Mar 14 11:03:33 UTC 2018


unfortunately, this change breaks the Windows 32 bit build:
ad_x86.obj : error LNK2019: unresolved external symbol "public: void __thiscall Assembler::vpopcntd(class XMMRegisterImpl *,class XMMRegisterImpl *,int)" (?vpopcntd at Assembler@@QAEXPAVXMMRegisterImpl@@0H at Z) referenced in function "private: virtual void __thiscall vpopcount16INode::emit(class CodeBuffer &,class PhaseRegAlloc *)const " (?emit at vpopcount16INode@@EBEXAAVCodeBuffer@@PAVPhaseRegAlloc@@@Z)

Should the new nodes not be in x86_64.ad?
The declaration of vpopcntd is in shared 32/64 bit code while the definition is in 64 bit code.

Is already somebody working on a fix?

Best regards,

-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
Sent: Dienstag, 13. März 2018 19:26
To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; hotspot-compiler-dev at openjdk.java.net
Subject: Re: RFR(S): Vector popcount support

On 3/13/18 11:15 AM, Lupusoru, Razvan A wrote:
> Hi Vladimir,
> I wanted to add at least one test variant that will exercise a smaller vector size (in this case 8 bytes). That is because when AVX512 is supported, it is possible to use smaller vectors with EVEX encoding. If you find it unnecessary, please feel free to remove it before integration.

Good to know.

Testing passed and I pushed changes.


> Thanks!
> --Razvan
> -----Original Message-----
> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
> Sent: Monday, March 12, 2018 6:33 PM
> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S): Vector popcount support
> Looks good. Thank you for explanation about b,w and q variants. I will start testing.
> Can you explain why you have additional test with MaxVectorSize=8?
> Thanks,
> Vladimir
> On 3/12/18 5:47 PM, Lupusoru, Razvan A wrote:
>> Hi Vladimir,
>> Thank you so much for the quick review.
>> I have addressed following issues:
>> - renamed support_avx512_vpopcntdq to supports_vpopcntdq
>> - No longer check UseAVX > 2 following your suggestion to clear flag
>> when AVX512 is not available
>> - Added length checking in predicate
>> - Added appropriate compiler tests to exercise functionality
>> Update patch is available at:
>> http://cr.openjdk.java.net/~rlupusoru/jdk_hs/webrev_vpopcount_03/
>> Regarding b and w variants, semantics of popcount are not consistent with other instructions which do allow narrowing to subword types. Namely, since popcount counts bits, narrowing ends up losing bit count information. Potentially, this can be fixed with some additional instructions being generated that test top bit and add appropriate count to each lane based on that top bit. But given users are likely to use Integer.bitCount without casting result to byte/short, I don't believe adding these variants is useful. If you believe otherwise, I can try to see what I can do.
>> Regarding q variant, it is currently not easily supported in vectorizer without some fundamental changes. That is because Long.bitCount returns int instead of long. The type mismatch in same chain of operations does not pass vectorizer alignment checking.
>> Let me know if you have any more comments or suggestions!
>> Thanks,
>> Razvan
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Friday, March 09, 2018 4:03 PM
>> To: Lupusoru, Razvan A <razvan.a.lupusoru at intel.com>;
>> hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: RFR(S): Vector popcount support
>> Hi Razvan,
>> In general changes are good. Do you plans to add vpopcntb,w too?
>> Use 'supports' with 's' in name as in other support functions names:
>> supports_avx512_vpopcntdq()
>> Also why use avx512 in function name? I know it is CPUID bit name. But do you have other vpopcntdq instructions, not avx512?
>> In assembler_x86.cpp and other places you don't need to check UseAVX,
>> support_avx512_vpopcntdq() is enough. You can clear feature bit in vm_version_x86.cpp when AVX < 3:
>>      if (UseAVX < 3) {
>>        _features &= ~CPU_AVX512F;
>>        _features &= ~CPU_AVX512DQ;
>>        _features &= ~CPU_AVX512CD;
>>        _features &= ~CPU_AVX512BW;
>>        _features &= ~CPU_AVX512VL;
>> +   _features &= ~CPU_AVX512_VPOPCNTDQ;
>>      }
>> In x86.ad you forgot to add length check in predicate() like next:
>> instruct vadd2I_reg(vecD dst, vecD src1, vecD src2) %{
>>      predicate(UseAVX > 0 && n->as_Vector()->length() == 2);
>> And, please, add code generation test to test/hotspot/jtreg/compiler/vectorization/ tests to verify correctness of vector operations.
>> Thanks,
>> Vladimir
>> On 3/9/18 2:54 PM, Lupusoru, Razvan A wrote:
>>> Hi everyone,
>>> As per "Intel Architecture Instruction Set Extensions and Future
>>> Features Programming Reference" manual [1], vector popcount
>>> instruction will be supported in future Intel ISA. I have updated the
>>> superword vectorizer to take advantage of this instruction. I have
>>> tested with Intel SDE [2] to confirm encoding and semantics are
>>> correctly implemented. Please take a look and let me know if you have
>>> any questions or comments.
>>> http://cr.openjdk.java.net/~rlupusoru/jdk_hs/webrev_vpopcount_01/inde
>>> x
>>> .html
>>> Thanks,
>>> Razvan
>>> [1]
>>> https://software.intel.com/sites/default/files/managed/c5/15/architec
>>> t ure-instruction-set-extensions-programming-reference.pdf
>>> [2]
>>> https://software.intel.com/en-us/articles/intel-software-development-
>>> e
>>> mulator
>>> [3] https://bugs.openjdk.java.net/browse/JDK-8199421

More information about the hotspot-compiler-dev mailing list