How to use OpenMP the right way? #245
I have some example code you shared somewhere in here a couple of years ago (...already years?!) and I'm trying to take advantage of OpenMP's parallelism. Here's my attempt:
##[[
if ccinfo.is_gcc and not ccinfo.is_clang then
OPENMP = true
cflags '-DLINUX -D_REENTRANT -D_GNU_SOURCE -I/usr/include/apr-1.0 -fopenmp'
ldflags '-lapr-1'
cinclude '<apr_pools.h>'
linklib 'gomp'
end
]]
local function fibmod(n: uinteger, m: uinteger)
local a, b = 0_u, 1_u
##[[if OPENMP then
cemit '#pragma omp parallel for schedule(dynamic)'
end]]
for i=1_u,n do
a, b = b, (a + b) % m
end
return a
end
local res = fibmod(100000000, 1000000000000)
assert(res == 167760546875)
print(res)
The error I'm getting is the following:
/home/stefanos/.cache/nelua/fibmod.c: In function ‘fibmod_fibmod’:
/home/stefanos/.cache/nelua/fibmod.c:141:34: error: expected iteration declaration or initialization before ‘i’
141 | for(uint64_t i = 1U, _end = n; i <= _end; i += 1) {
| ^
error: C compilation for '/home/stefanos/.cache/nelua/fibmod' failed
shell returned 1
@edubart what is the right way to configure my pragma to make it work?
Each iteration of your loop depends on the previous iteration's values, so I don't see how it can be parallelized. Maybe the error is OpenMP's way of complaining about that.
Here's a recursive Fibonacci and a use of it that's sped up with OpenMP. The usage is the same as in examples/condots.nelua:
local function fib(n: integer)
if n < 2 then return n end
return fib(n - 1) + fib(n - 2)
end
local results: [1000]uinteger
## if pragmas.withmp then
## cflags '-fopenmp'
## cemit '#pragma omp parallel for schedule(dynamic)'
## end
for i=0, #results-1 do
results[i] = fib(30)
end
print(results[0])
Comparing them:
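To make the point about loop dependencies concrete, here is a minimal C sketch of the two loop shapes discussed in this thread. It is only an illustration, not Nelua's generated code: the helper name fill_independent and the NRESULTS size are made up for this example. The first loop writes one independent slot per iteration, so the pragma is safe; the second carries a and b from one iteration to the next, so it is left sequential.

```c
#include <stdint.h>

#define NRESULTS 64  /* hypothetical size, kept small so the sketch runs quickly */

/* Naive recursive Fibonacci: deliberately slow, so each call is real work. */
static uint64_t fib(uint64_t n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Safe to parallelize: iteration i writes only results[i] and reads
   nothing that another iteration writes. */
static void fill_independent(uint64_t *results, int count, uint64_t n) {
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < count; i++) {
        results[i] = fib(n);
    }
}

/* Not safe to parallelize: a and b carry state from one iteration to the
   next, so the iterations must run in order. No pragma here. */
static uint64_t fibmod(uint64_t n, uint64_t m) {
    uint64_t a = 0, b = 1;
    for (uint64_t i = 1; i <= n; i++) {
        uint64_t next = (a + b) % m;
        a = b;
        b = next;
    }
    return a;
}
```

Compiled with -fopenmp, the first loop is split across threads; without that flag the compiler ignores the pragma and the code still runs sequentially with the same results. A caller would pass an array of at least count elements, for example one declared as uint64_t results[NRESULTS].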
Belatedly, OpenMP's actual problem with your code is that it doesn't accept multiple initializations in a loop header (here, both i and _end in the generated for loop), and Nelua readily emits that. I only chanced upon a case with
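As a hedged illustration of that loop-header restriction, here is a small C sketch. OpenMP's canonical loop form allows a single loop variable in the init clause, so the shape from the error message is shown only in a comment, next to a form with the bound hoisted out of the header. The function sum_mod is a made-up example chosen because its iterations are independent; it is not taken from this thread, and it says nothing about how to get Nelua to emit this shape.

```c
#include <stdint.h>

/* Sum of (i % m) for i in 1..n. Iterations are independent apart from the
   accumulator, which the reduction clause combines across threads. */
static uint64_t sum_mod(uint64_t n, uint64_t m) {
    uint64_t sum = 0;

    /* Rejected by GCC under -fopenmp, because the init clause declares
       two variables:
           for (uint64_t i = 1U, _end = n; i <= _end; i += 1) { ... }   */

    const uint64_t end = n;  /* bound hoisted out of the loop header */
    #pragma omp parallel for reduction(+:sum)
    for (uint64_t i = 1; i <= end; i += 1) {
        sum += i % m;
    }
    return sum;
}
```

Fixing the header shape only removes the compile error. The fibmod loop in the question would still compute wrong results under this pragma, since each iteration needs the a and b produced by the previous one.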