@pablodelara Hi, I benchmarked xor_gen 10+1 performance at different sizes. I can see that:
when the CPU L2 cache (1280K) can hold all data units (test length below 128K per unit), xor_gen is much faster with AVX512 enabled than with it disabled; for larger sizes, however, xor_gen is slower with AVX512 enabled than with it disabled.
OS: debian 9
GCC: 6.3
NASM: 2.12.01
CPU: Intel(R) Xeon(R) Silver 4314 CPU @ 2.40GHz
ISA-L: 2.31
I have confirmed that my CPU supports AVX512 through https://ark.intel.com/content/www/us/en/ark/products/215269/intel-xeon-silver-4314-processor-24m-cache-2-40-ghz.html. However, the AVX512 detection step failed when building ISA-L. I ran the detection code myself and got the following output:
$echo vinserti32x8 zmm0, ymm1, 1\; > tst.asm && nasm -f elf64 tst.asm && echo pass
tst.asm:1: error: invalid combination of opcode and operands
I want to know why the detection failed and whether I need to change anything in the OS kernel to enable AVX512. Thanks for answering!
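Note that the nasm command above only tests whether the assembler accepts the instruction; it never executes it, so it says nothing about the CPU or the kernel. To check the two sides separately, something like the following could be run. This is a minimal sketch, assuming Linux with /proc/cpuinfo available; `has_cpu_flag` is a helper name made up for illustration and is not part of ISA-L.

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]  (hypothetical helper, not part of ISA-L)
# Succeeds if FLAG is listed on the first "flags" line of a cpuinfo-style file.
# Returns 2 if the file cannot be read.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    [ -r "$file" ] || return 2
    # Split the flags line into one word per line, then match the whole word.
    grep -m1 '^flags' "$file" | tr ' \t' '\n\n' | grep -qx "$flag"
}

# CPU/kernel side: which AVX512 subsets does the kernel report?
for f in avx512f avx512dq avx512bw avx512vl; do
    if has_cpu_flag "$f"; then
        echo "$f: reported in /proc/cpuinfo"
    else
        echo "$f: not reported"
    fi
done

# Toolchain side: which assembler version is the build using?
if command -v nasm >/dev/null 2>&1; then
    nasm -v
else
    echo "nasm: not found in PATH"
fi
```

If the flags are reported but the nasm test still fails, the problem is more likely in the assembler than in the OS.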