Replies: 5 comments 17 replies
-
Do you use CMake? Do you use its install feature? It's far easier to set the proper architectures with CMake. To enable only SSE4.1 and disable runtime dispatch, use the following flags:
cmake -GNinja -DKFR_ARCH=sse41 -DKFR_ENABLE_MULTIARCH=OFF -DCMAKE_INSTALL_PREFIX=install-dir <other-arguments>
ninja install
This automatically sets all the needed C++ defines and flags and ensures that the generated code can run on an SSE4.1 CPU.
Note that automatic CPU detection can be turned off at runtime by calling kfr::override_cpu. Example:
// somehow detect that we are running on Intel hybrid architecture.
// P-cores: AVX, AVX2, AVX512
// E-cores: AVX, AVX2
kfr::override_cpu(kfr::cpu_t::avx2); // common to all cores
-
Yes, we use CMake, but currently not for KFR, because of the aforementioned problems and because it originally didn't build with MSVC. For our current plugin projects, we actually created a KFRlib_static build, a library we can link to that exposes the functionality we need as simple C functions. We would like to change that and build KFRlib directly into the projects.
Is that thread-safe? Our plugins are effectively DLLs on Windows, and any global variables (like function pointers?) are then also shared. I want to avoid us calling [...] Maybe we can get away with using std::call_once on a static std::once_flag?
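The std::call_once idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `apply_cpu_override` and `ensure_cpu_override` are hypothetical names, and the stand-in merely counts invocations so the sketch compiles without KFR. In a real plugin its body would presumably be the kfr::override_cpu call from the reply above. Whether that call is itself thread-safe is not something this sketch can answer.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for the real override call (for example
// kfr::override_cpu(kfr::cpu_t::avx2) from the reply above). It only
// counts how often it runs, so the sketch builds without KFR.
std::atomic<int> g_override_calls{0};

void apply_cpu_override()
{
    g_override_calls.fetch_add(1, std::memory_order_relaxed);
}

// Hypothetical helper: call it at the top of every exported plugin
// entry point. std::call_once runs the callable exactly once per
// once_flag; concurrent callers block until that first run finishes.
void ensure_cpu_override()
{
    static std::once_flag flag;
    std::call_once(flag, apply_cpu_override);
}
```

Note that the static once_flag lives in the module that defines it, so each separately built DLL would run the override once for itself.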
-
Please also keep in mind that for macOS, we still need Neon support. Does disabling multi-arch also disable Neon? Do we need a separate define for that?
-
So can you tell me which lines I should add to your CMakeLists.txt to make KFR use only up to SSE4 on Intel CPUs and always use Neon on ARM CPUs?
-
You can always refer to how the KFR releases are built. The proper way to use KFR is to install it to a separate location (the Unix style of building libraries).
Windows
// LLVM must be installed (latest version is preferred)
REM !!! Replace the paths below !!!
REM execute this in the KFR source tree
call "C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvars64.bat"
cmake -B build-release -S . -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=dist -GNinja -DKFR_ENABLE_CAPI_BUILD=ON -DKFR_ARCH=sse41 -DKFR_ENABLE_MULTIARCH=OFF -DCMAKE_CXX_COMPILER="C:/Program Files/LLVM/bin/clang-cl.exe" -DCMAKE_LINKER="C:/Program Files/LLVM/bin/lld-link.exe" -DCMAKE_AR="C:/Program Files/LLVM/bin/llvm-lib.exe"
This builds MSVC-compatible binaries in [...]
macOS
brew install ninja
cmake -B build-release -S . -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=dist -GNinja -DCMAKE_OSX_ARCHITECTURES=x86_64 -DCMAKE_OSX_DEPLOYMENT_TARGET=10.13 -DKFR_ENABLE_CAPI_BUILD=ON -DKFR_ARCH=sse41 -DKFR_ENABLE_MULTIARCH=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON
-
In our projects, we cannot rely on all instruction sets (AVX, AVX2, etc.) being present on all cores (hybrid CPUs exist), and we have no control over which core our code runs on; it might even switch cores from one function call to the next.
Explanation: We sell plugin products. These are dynamic libraries that expose a certain number of functions. The hosting application might call these functions from any thread, at any time, on any core.
So we can't just detect the instruction set on startup, set a few function pointers, and be done with it. An instruction set that gets detected on core0 doesn't necessarily exist on core13.
We don't even need dynamic dispatch. We just want to make sure that on Intel CPUs, no instructions higher than SSE4 (4.1 & 4.2) get emitted.
Are the usual defines OK for that?
Or do we need to set others? What if we want to move away from using the C-API and use the "normal" C++ stuff? Do the same definitions work as well?
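Independently of which KFR defines turn out to be needed, the "nothing higher than SSE4" requirement can be checked from inside a translation unit. The sketch below is not part of KFR, and all function names in it are hypothetical. It relies on the predefined macros `__AVX__`, `__AVX2__` and `__AVX512F__`, which GCC, Clang and MSVC define when those instruction sets are enabled by the compiler flags, and on `__ARM_NEON`, which GCC and Clang define on Neon targets.

```cpp
// True when the compiler flags for this translation unit enable
// anything beyond the SSE4.x baseline discussed in this thread.
constexpr bool exceeds_sse4_baseline()
{
#if defined(__AVX__) || defined(__AVX2__) || defined(__AVX512F__)
    return true;
#else
    return false;
#endif
}

// True when this translation unit is compiled for an x86 target.
constexpr bool targets_x86()
{
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
    return true;
#else
    return false;
#endif
}

// True when Neon is enabled (macro as defined by GCC and Clang).
constexpr bool targets_neon()
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return true;
#else
    return false;
#endif
}

// Policy from this thread: on x86, stay at or below SSE4.x.
// Other targets (for example ARM with Neon) are not restricted here.
constexpr bool build_matches_policy()
{
    return !targets_x86() || !exceeds_sse4_baseline();
}
```

Putting `static_assert(build_matches_policy(), "AVX or newer is enabled");` into a source file would then fail the build if the flags drift. This only covers the compiler flags of the file being compiled. It says nothing about prebuilt libraries or functions compiled with per-function target attributes.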
I'm sorry to ask this, but the documentation is pretty poor, doesn't explain a whole lot, and has a ton of gaps. It's by far the weakest point and biggest hindrance to the adoption of KFRlib.