One way is to use the Function Multiversioning feature in GCC: write a test program that defines one version of a function per target, and see which version it picks for your CPU.
The foo function from the program below produces multiple symbols in the binary, and the "best" version supported by the CPU is picked at runtime:
$ nm a.out | grep foo
0000000000402236 T _Z3foov
000000000040224c T _Z3foov.arch_x86_64
0000000000402257 T _Z3foov.arch_x86_64_v2
0000000000402262 T _Z3foov.arch_x86_64_v3
000000000040226d T _Z3foov.arch_x86_64_v4
0000000000402290 W _Z3foov.resolver
0000000000402241 T _Z3foov.sse4.2
0000000000402290 i _Z7_Z3foovv
// multiversioning.c
#include <stdio.h>
// Each definition below is one version of foo, compiled for a different target.
// GCC also emits a resolver that selects one of them at run time.
__attribute__ ((target ("default")))
const char* foo () { return "default"; }
__attribute__ ((target ("sse4.2")))
const char* foo () { return "sse4.2"; }
__attribute__ ((target ("arch=x86-64")))
const char* foo () { return "x86-64-v1"; }
__attribute__ ((target ("arch=x86-64-v2")))
const char* foo () { return "x86-64-v2"; }
__attribute__ ((target ("arch=x86-64-v3")))
const char* foo () { return "x86-64-v3"; }
__attribute__ ((target ("arch=x86-64-v4")))
const char* foo () { return "x86-64-v4"; }
int main ()
{
    // Prints the label of whichever version the resolver selected.
    printf("%s\n", foo());
    return 0;
}
On my laptop, this prints
$ g++ multiversioning.c
$ ./a.out
x86-64-v3
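Note that the use of g++ is intentional here, even though the file is named multiversioning.c.
If I compiled with gcc instead, it would fail with error: redefinition of ‘foo’, because the C front end does not accept several definitions of the same function that differ only in their target attribute.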
… /sse3/.) The de-facto standard is that runtime CPU dispatching only needs to check the highest SSE feature flag it depends on. – Peter Cordes Jan 27 '21 at 19:02
… popcnt, but that's good to check explicitly. And other non-SIMD extensions like BMI1 are fully independent of SIMD (although since some BMI1/2 instructions use VEX encoding, they're normally only found on CPUs that support AVX. And unfortunately Intel even disables BMI1/2 on their Pentium/Celeron CPUs, perhaps as a way of fully disabling AVX.). – Peter Cordes Jan 27 '21 at 19:08
… -march=skylake-avx512. – Peter Cordes Jan 27 '21 at 19:12
… /lm/ will match anything containing those characters). I followed the exhaustive level definitions as used in the first answer (that’s where /ssse3/ without /sse3/ came from), even though as you say many of them are redundant. (I’ve been following the discussions leading up to the definition of these levels.) – Stephen Kitt Jan 27 '21 at 19:29
lm is long mode; checking for level 1 is basically just a sanity check of CPUID flags if you're already running a 64-bit kernel because those are all baseline for x86-64. (Also, my comments aren't fully directed at your answer, some of it I just wanted to put somewhere on this page for future readers. Also: Are older SIMD-versions available when using newer ones? / Do the MMX registers always exist in modern processors? / Does a processor that supports SSE4 support SSSE3 instructions?) – Peter Cordes Jan 27 '21 at 19:37