The intrinsic blend functions
__m256i _mm256_blend_epi16(__m256i a, __m256i b, const int imm8)
__m256i _mm256_blend_epi32(__m256i a, __m256i b, const int imm8)
have different performance characteristics. Among them, _mm256_blend_epi32() is the fastest, but its mask must be encoded into a const int imm8 at compile time. If I understand correctly, that prevents the current libsimdpp blend implementation from using it (see also #56).
For masks that are already known at compile time, I think it would be good to represent them in a new way. For instance, the blend mask could be represented as a tuple from the library boost::hana:
auto mask = hana::make_tuple(
    hana::true_c,  hana::true_c,  hana::true_c,  hana::true_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,
    hana::true_c,  hana::true_c,  hana::true_c,  hana::true_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c,
    hana::false_c, hana::false_c, hana::false_c, hana::false_c
);
The immediate mask for _mm256_blend_epi32() could then be computed at compile-time.
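To illustrate the idea without depending on Boost, here is a minimal sketch in plain C++14. It is not the code from the proof-of-concept repository: `lane_mask`, `to_imm8` and `blend32` are hypothetical names, and the mask is simplified to one flag per 32-bit lane (lane 0 first) instead of a hana tuple. The only fact relied on is that bit i of `imm8` in `_mm256_blend_epi32()` selects lane i from `b` when set and from `a` otherwise.

```cpp
#include <cstddef>

// Hypothetical stand-in for a compile-time mask: one bool per 32-bit lane,
// lane 0 first. A 256-bit vector has eight such lanes.
template <bool... Lanes>
struct lane_mask {
    static_assert(sizeof...(Lanes) == 8,
                  "a 256-bit vector has eight 32-bit lanes");
};

// Pack the lane flags into an immediate: bit i is set when lane i is true.
template <bool... Lanes>
constexpr int to_imm8(lane_mask<Lanes...>) {
    const bool flags[] = {Lanes...};
    int imm = 0;
    for (std::size_t i = 0; i < sizeof...(Lanes); ++i) {
        if (flags[i]) {
            imm |= 1 << i;
        }
    }
    return imm;
}

// Illustrative mask: take lanes 0 and 3 from b, everything else from a.
using example_mask = lane_mask<true, false, false, true,
                               false, false, false, false>;
static_assert(to_imm8(example_mask{}) == 0x09,
              "lanes 0 and 3 give bits 0 and 3");

#if defined(__AVX2__)
#include <immintrin.h>

// Hypothetical wrapper: the mask travels as a type, so the immediate is a
// constant expression by the time the intrinsic is called.
template <bool... Lanes>
inline __m256i blend32(__m256i a, __m256i b, lane_mask<Lanes...>) {
    constexpr int imm = to_imm8(lane_mask<Lanes...>{});
    return _mm256_blend_epi32(a, b, imm);
}
#endif
```

A call would then look like `blend32(a, b, example_mask{})`. Because the mask is part of the type, a library could also inspect it at compile time and fall back to a different blend intrinsic when the mask cannot be expressed per 32-bit lane.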
I made a proof-of-concept implementation of this in
https://github.com/eriksjolund/compile-time-simd-blend-mask