AVX512 intrinsics not taken on CPU supporting AVX512?

Kai Burjack kburjack at googlemail.com
Fri May 29 18:20:34 UTC 2020

I was just measuring performance of this code:
fromArray(SPECIES_512, es, 0).intoByteBuffer(bb, 0, nativeOrder());
comparing it with:
fromArray(SPECIES_256, es, 0).intoByteBuffer(bb, 0, nativeOrder());
fromArray(SPECIES_256, es, 8).intoByteBuffer(bb, 32, nativeOrder());
and found that the former was more than 10x slower than the latter on
a Xeon Platinum 8124M, which according to cpuinfo does support AVX512:
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm
constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf
tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm
3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep
bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb
avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
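
For reference, both variants are expected to write the same 64 bytes, so only the speed should differ. Below is a minimal scalar sketch of that expectation using plain java.nio only, not the Vector API. The element type is not stated above; the pairing of array offset 8 with byte offset 32 suggests 4-byte lanes, so float is assumed here. The class and method names (ScalarStoreSketch, store, oneWide, twoHalves) are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ScalarStoreSketch {
    // Scalar stand-in for one vector store: copies `lanes` floats from
    // es[offset..] into bb starting at byte index byteOffset, native order.
    // Assumption: es is float[] (4-byte lanes), which the post does not state.
    public static void store(float[] es, int offset, int lanes,
                             ByteBuffer bb, int byteOffset) {
        // duplicate() shares the content, so the caller's byte order is untouched
        ByteBuffer view = bb.duplicate().order(ByteOrder.nativeOrder());
        for (int i = 0; i < lanes; i++) {
            view.putFloat(byteOffset + i * Float.BYTES, es[offset + i]);
        }
    }

    // Mirrors the single 512-bit store: 16 lanes at byte offset 0.
    public static ByteBuffer oneWide(float[] es) {
        ByteBuffer bb = ByteBuffer.allocate(64);
        store(es, 0, 16, bb, 0);
        return bb;
    }

    // Mirrors the two 256-bit stores: 8 lanes at byte 0, 8 lanes at byte 32.
    public static ByteBuffer twoHalves(float[] es) {
        ByteBuffer bb = ByteBuffer.allocate(64);
        store(es, 0, 8, bb, 0);
        store(es, 8, 8, bb, 8 * Float.BYTES);
        return bb;
    }

    public static void main(String[] args) {
        float[] es = new float[16];
        for (int i = 0; i < es.length; i++) {
            es[i] = i + 0.5f;
        }
        // ByteBuffer.equals compares the remaining bytes of both buffers
        System.out.println(oneWide(es).equals(twoHalves(es)));
    }
}
```

This only pins down the expected memory layout; it says nothing about which instructions the JIT emits for either variant.
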
Are AVX512 mov intrinsics not implemented right now or why are they not

Current benchmark results:

