~~NOTRANS~~
{{indexmenu_n>400}}

====== SIMD ======

Modern processors support SIMD (single instruction, multiple data) through instruction sets such as SSE or AVX on x64 and NEON on ARM. As of 2022, C/C++ compilers still struggle with automatic vectorization and often fail to generate these vector instructions. Assembly or compiler intrinsic functions can be used to force the use of these instructions.

===== CPU architectures/infos =====

  * https://en.wikichip.org/wiki/x86/extensions
  * https://en.wikichip.org/wiki/arm/versions
  * https://en.wikichip.org/wiki/arm_holdings#Microarchitectures
  * https://en.wikipedia.org/wiki/Raspberry_Pi#Specifications
  * https://en.wikipedia.org/wiki/Comparison_of_ARMv8-A_processors
    * lists the L1/L2/L3 cache sizes
  * The Raspberry Pi 4 uses the Broadcom BCM2711 chip

===== Libraries =====

  * https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html
  * https://github.com/VcDevel/Vc
  * https://github.com/hayguen/MIPP
    * its origin: https://github.com/aff3ct/MIPP
  * SSE/AVX to Neon
    * https://github.com/DLTcollab/sse2neon
    * https://github.com/kunpengcompute/AvxToNeon
  * Neon for x86/SSE
    * https://github.com/intel/ARM_NEON_2_x86_SSE
  * SIMD Everywhere
    * https://simd-everywhere.github.io/blog/
    * https://github.com/simd-everywhere/simde
    * https://github.com/jpcima/simde
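
===== Example: intrinsics =====

The introduction mentions compiler intrinsics as a way to get vector instructions without relying on auto-vectorization. The following is a minimal sketch of that idea, not a tuned implementation. The function name ''add_f32'' is hypothetical and chosen only for illustration. It assumes a compiler that defines ''%%__SSE__%%'' (or ''_M_X64'') on x64 and ''%%__ARM_NEON%%'' on ARM, and it falls back to plain scalar code elsewhere.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics (x86/x64) */
#elif defined(__ARM_NEON)
#include <arm_neon.h>    /* NEON intrinsics (ARM) */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n floats.
 * Processes 4 floats per iteration where SSE or NEON is available. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

#if defined(__SSE__) || defined(_M_X64)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);          /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#elif defined(__ARM_NEON)
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);        /* load 4 floats */
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif

    /* Scalar tail: remaining elements, or everything if no SIMD path exists. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Calling ''add_f32(out, a, b, 6)'' handles the first four elements in one vector operation and the last two in the scalar tail. Translation layers such as sse2neon or SIMDe from the list above aim to avoid the per-architecture ''#if'' branches by mapping one intrinsic set onto another.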