[Issue]: flang-new: runtime and math functions don't link for OpenMP target regions #201

VeeEM · 2024-11-08T11:56:06Z

Problem Description

I get many linker errors for OpenMP target regions when offloading to the GPU. Symbols from libFortranRuntime show up as undefined, and so do some math intrinsics such as cosh.

There are some other math intrinsics that do link successfully, like tanh.

@sfantao

Operating System

SUSE Linux Enterprise Server 15 SP5 (Cray OS on LUMI)

CPU

AMD EPYC 7742 64-Core

GPU

AMD Instinct MI250X

ROCm Version

ROCm 6.2.2

ROCm Component

flang

Steps to Reproduce

$ flang-new --version
AMD AFAR drop #4.0 9/28/24 flang-new version 20.0.0git (ssh://gerritgit/lightning/ec/llvm-project amd-feature/atd-fortran/2024.09.28 24385 1ad3ac337fa4b1a5a7621a4c5480028b54fffada)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /pfs/lustrep3/scratch/project_462000394/amd-sw/rocm-afar/5891/lib/llvm/bin
Build config: +assertions

$ cat link.F90 
program link
implicit none
real :: r
real, dimension(5) :: xs

!$omp target map(xs, r)
xs = 2
xs = modulo(xs, 3)
r = cosh(r)
r = tanh(r)
!$omp end target

end program

$ flang-new -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx90a -fdefault-real-8 link.F90
ld.lld: error: undefined symbol: _FortranAAssign
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)

ld.lld: error: undefined symbol: _FortranAModuloReal8
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)

ld.lld: error: undefined symbol: cosh
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)
clang: error: ld.lld command failed with exit code 1 (use -v to see invocation)
/pfs/lustrep3/scratch/project_462000394/amd-sw/rocm-afar/5891/lib/llvm/bin/clang-linker-wrapper: error: 'clang' failed
flang-new: error: linker command failed with exit code 1 (use -v to see invocation)
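
For triage, the undefined symbols in output like the above can be grouped by kind. The following is a minimal sketch in Python, not part of the original report. It assumes that Fortran runtime entry points carry the `_FortranA` prefix, as the two runtime symbols in this log do; the function name `classify_undefined` is made up for illustration, and the sample log is abbreviated from the output above.

```python
import re

# Abbreviated sample of the linker output from the report above.
LINKER_LOG = """\
ld.lld: error: undefined symbol: _FortranAAssign
>>> referenced by /tmp/a.out.amdgcn.gfx90a-6a405a.img.lto.o:(__omp_offloading_54bbb604_5101b8ee__QQmain_l6)
ld.lld: error: undefined symbol: _FortranAModuloReal8
ld.lld: error: undefined symbol: cosh
"""

# Matches the "undefined symbol" lines that ld.lld prints in this log.
UNDEFINED_RE = re.compile(
    r"^ld\.lld: error: undefined symbol: (\S+)$", re.MULTILINE
)


def classify_undefined(log):
    """Split undefined symbols into Fortran runtime entries and the rest."""
    runtime, other = [], []
    for sym in UNDEFINED_RE.findall(log):
        # Assumption: runtime entry points start with "_FortranA".
        if sym.startswith("_FortranA"):
            runtime.append(sym)
        else:
            other.append(sym)
    return runtime, other


runtime_syms, other_syms = classify_undefined(LINKER_LOG)
print("Fortran runtime:", runtime_syms)
print("other (e.g. math):", other_syms)
```

Symbols in the first group are the ones a device build of the Fortran runtime would be expected to provide; the second group needs separate investigation.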

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

No response

bcornille · 2024-11-25T16:21:36Z

The user guide documents that adding -lFortranRuntimeHostDevice to the link line resolves these link errors. As noted there, this device build of the Fortran runtime performs poorly, but it does allow user programs to link and run. That said, we very much appreciate reports of which runtime functionality user codes need, so that the runtime calls can be circumvented. For example, one would expect cosh to be lowered directly, without a call to the Fortran runtime.

bcornille · 2024-11-25T22:42:48Z

Math functions may alternatively require -lm when linking.

VeeEM · 2024-11-26T14:48:37Z

Thanks! With the drop 4.2 compiler I am able to link the runtime with -lFortranRuntimeHostDevice. I'm curious, why is performance poor with the device runtime? Is it just overhead from calling library functions or something else entirely? The program I'm working on uses assign, dot_product, mod, modulo and sum in some target regions.

I still have the same problem with the math functions: adding -lm to the compiler invocation does not help with linking cosh. tanh works fine, just as before.

$ flang-new -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx90a -fdefault-real-8 -lFortranRuntimeHostDevice -lm link.F90
ld.lld: error: undefined symbol: cosh
>>> referenced by a.out.amdgcn.gfx90a.img.lto.o:(__omp_offloading_54bbb604_4d007b9a__QQmain_l6)
>>> referenced by a.out.amdgcn.gfx90a.img.lto.o:(__omp_offloading_54bbb604_4d007b9a__QQmain_l6)
clang: error: ld.lld command failed with exit code 1 (use -v to see invocation)

If I compile with --save-temps and look into link-openmp-amdgcn-amd-amdhsa-gfx90a-llvmir.mlir, I see that the symbols for cosh and tanh look quite different from each other: cosh is plain cosh, but the symbol for tanh is __ocml_tanh_f64.
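
The same comparison can be scripted over the --save-temps output. The following is a rough sketch in Python under stated assumptions: the excerpt is hypothetical and only mimics how calls might appear in LLVM-dialect MLIR (it is not copied from the real file), and the helper name `split_callees` is made up for illustration. The only detail taken from this thread is that tanh shows up with an `__ocml_` prefix while cosh does not.

```python
import re

# Hypothetical excerpt for illustration only; NOT copied from the real
# link-openmp-amdgcn-amd-amdhsa-gfx90a-llvmir.mlir file.
SAMPLE_MLIR = """\
%1 = llvm.call @cosh(%0) : (f64) -> f64
%2 = llvm.call @__ocml_tanh_f64(%1) : (f64) -> f64
"""

# Assumption: calls are written as "llvm.call @<symbol>(...)".
CALL_RE = re.compile(r"llvm\.call\s+@([\w.$]+)")


def split_callees(text):
    """Return (callees with an __ocml_ prefix, all other callees), sorted."""
    ocml, other = set(), set()
    for name in CALL_RE.findall(text):
        if name.startswith("__ocml_"):
            ocml.add(name)
        else:
            other.add(name)
    return sorted(ocml), sorted(other)


ocml_calls, other_calls = split_callees(SAMPLE_MLIR)
print("lowered to __ocml_*:", ocml_calls)
print("left as plain calls:", other_calls)
```

To run it on a real build, read the .mlir file into a string and pass it to `split_callees`; names in the second list are candidates for the kind of undefined-symbol error shown above.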

bcornille · 2024-11-27T17:09:29Z

I've opened an internal ticket regarding cosh, so we will investigate. Drop 4.3 is available (we still need to update the user guide; https://repo.radeon.com/rocm/misc/flang/). It may improve some of the assignment performance issues. The runtime is not really a device-optimized library; it is mostly the existing runtime, a highly templated C++ library, compiled for the device.

bcornille added the flang label Nov 25, 2024
