Enzyme internal error while running neural ODE with Lux + Enzyme #2110

Open
heyyeahcrow opened this issue Nov 20, 2024 · 3 comments

@heyyeahcrow

Hi,

I tried to use AutoEnzyme as the automatic differentiation backend to train a neural network that predicts parameters for an ODE, following the DiffEqFlux example, but it fails with an Enzyme internal error followed by a long dump of LLVM output.

using Lux, DiffEqFlux, OrdinaryDiffEq, Plots, Printf, Statistics
using ComponentArrays
using Optimization, OptimizationOptimisers
using Enzyme
using Dates
using Random
using StaticArrays

function evolve!(dc, c, p, t)
    p1 = p[1]
    p2 = p[2]
    dc .= c .* p2 * p1
end

function simulate(i1, i2, a, b, t_span)
    p2 = exp(-i2 * a)
    p1 = i1 * b
    p = (p1, p2)
    c0 = [1.0 0.0; 1.0 0.0]
    prob = ODEProblem(evolve!, c0, t_span, p)
    sol = solve(prob, Euler(), save_everystep=false, dt = 0.5)
    return Array(sol[end])
end


rng = Xoshiro(0)
b = [0.0 1.0; 1.0 0.0]
a = 0.6
n = length(b[1, :])
i1 = 0.18
i2 = 2.5 
timespan = (0.0, 5.0)
ans = simulate(i1, i2, a, b, timespan)

display(ans)

inputs = [i1, i2]
input_size = length(inputs)
output_size = length(a) + length(b)
nn = Chain(
    Dense(input_size, input_size*3*n, tanh),
    Dense(input_size*3*n, output_size*2, tanh),
    Dense(output_size*2, output_size, sigmoid)
)

u, st = Lux.setup(rng, nn)

function predict_neuralode(u)
    # Get parameters from the neural network
    output, outst = nn(inputs, u, st)
    # Segregate the output
    p_a = output[1]
    pp_b = output[length(a)+1:end]
    p_b = zeros(n, n)
    index = 1
    for i in 1:n
        for j in 1:n
            p_b[i, j] = pp_b[index]
            index += 1
        end
    end
    nn_output = [p_a, p_b]
    println("nn_output: ", nn_output)
    pred = simulate(i1, i2, p_a, p_b, timespan)
    return Array(pred)
end

function loss_neuralode(ans, u)
    pred = predict_neuralode(u)
    loss = sum(abs2, ans .- pred)
    return loss, pred
end

loss, pred = loss_neuralode(ans, u)

loss_values = Float64[]
callback = function (p, l, pred; doplot = false)
    println(l)
    push!(loss_values, l)
end

pinit = ComponentArray(u)
callback(pinit, loss_neuralode(ans, pinit)...)

adtype = Optimization.AutoEnzyme()

optf = Optimization.OptimizationFunction((u,_) -> loss_neuralode(ans, u), adtype)
optprob = Optimization.OptimizationProblem(optf, pinit)

result_neuralode = Optimization.solve(
    optprob, OptimizationOptimisers.Adam(0.02); callback = callback, maxiters = 50)

The error log: error_log_2024-11-19_15-56-09.txt
The stacktrace: Stacktrace.txt

I also tried running the original example with Zygote and AutoZygote replaced by Enzyme and AutoEnzyme, but it returned the same error. The error occurs on both macOS and Windows.

Julia Version 1.11.1
Packages:
[b0b7db55] ComponentArrays v0.15.18
[aae7a2af] DiffEqFlux v4.1.0
[7da242da] Enzyme v0.13.14
⌅ [d9f16b24] Functors v0.4.12
[e6f89c97] LoggingExtras v1.1.0
[b2108857] Lux v1.2.3
[7f7a1694] Optimization v4.0.5
[42dfb2eb] OptimizationOptimisers v0.3.4
[1dea7af3] OrdinaryDiffEq v6.90.1
[91a5bcdd] Plots v1.40.9
[90137ffa] StaticArrays v1.9.8
[10745b16] Statistics v1.11.1
[e88e6eb3] Zygote v0.6.73
[ade2ca70] Dates v1.11.0
[56ddb016] Logging v1.11.0
[de0858da] Printf v1.11.0
[9a3f8284] Random v1.11.0

@wsmoses
Member

wsmoses commented Nov 21, 2024

Hi, this looks like an error in the 1.11 FFI call support in Enzyme. Two quick things:

  1. Can you test on the latest version of Enzyme? (I think this ought to be fixed; if not, we should fix it.)
  2. Can you make a reproducer that only has a direct autodiff call? @ChrisRackauckas may be able to help you with this.
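
For reference, a direct `Enzyme.autodiff` call has roughly the shape below. This is a minimal sketch, not the reporter's code: `toy_loss` is a hypothetical stand-in that replaces the Lux network and `solve` with a hand-written Euler loop, so it shows the call pattern only and is not expected to trigger the error by itself. A real reproducer would put the failing loss function in its place.

```julia
using Enzyme

# Hypothetical toy loss standing in for `loss_neuralode` (an assumption for illustration).
# It integrates dc/dt = k * c with explicit Euler, where k = θ[1] * exp(-θ[2]).
function toy_loss(θ::Vector{Float64})
    k = θ[1] * exp(-θ[2])
    c = 1.0
    dt = 0.5
    for _ in 1:10              # 10 steps of size 0.5 cover the span (0.0, 5.0)
        c += dt * k * c
    end
    return abs2(c - 2.0)       # squared error against an arbitrary target value
end

θ  = [0.18, 1.5]
dθ = zero(θ)                   # shadow buffer; Enzyme accumulates the gradient into it

# Reverse mode: `Active` marks the scalar return value,
# `Duplicated` pairs the input with its shadow.
Enzyme.autodiff(Reverse, toy_loss, Active, Duplicated(θ, dθ))

println("loss = ", toy_loss(θ))
println("grad = ", dθ)
```

Calling `autodiff` directly like this takes Optimization.jl and the callback machinery out of the picture, which makes it easier to tell whether the failure comes from Enzyme or from the surrounding stack.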

@heyyeahcrow
Author

> Hi, this looks like an error in the 1.11 FFI call support in Enzyme. Two quick things:
>
>   1. Can you test on the latest version of Enzyme? (I think this ought to be fixed; if not, we should fix it.)
>   2. Can you make a reproducer that only has a direct autodiff call? @ChrisRackauckas may be able to help you with this.

I updated Enzyme and it still shows the same error. I'm currently working on the second item.

BTW, do I need to vectorize all the inputs and outputs of my ODE?

@wsmoses
Member

wsmoses commented Nov 21, 2024

I don't think that should be needed here to get it to fail, but that's just an intuition.
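
If flattening is wanted anyway, "vectorizing" here would mean packing the scalar `a` and the matrix `b` into one flat vector and unpacking them inside the loss. A minimal sketch, assuming row-major packing to match the nested loop in `predict_neuralode`; the helper names `pack` and `unpack` are hypothetical and not part of any package:

```julia
# Hypothetical helpers (not from the report): flatten a scalar `a` and a square
# matrix `b` into one vector, and recover them again.
pack(a::Real, b::AbstractMatrix) = vcat(a, vec(permutedims(b)))  # row-major order

function unpack(v::AbstractVector, n::Integer)
    a = v[1]
    # `reshape` fills column-major, so transpose to undo the row-major packing
    b = permutedims(reshape(v[2:end], n, n))
    return a, b
end

a = 0.6
b = [0.0 1.0; 1.0 0.0]
v = pack(a, b)
a2, b2 = unpack(v, size(b, 1))
println(v)
println((a2, b2))
```

The `permutedims(reshape(...))` form produces the same matrix as the index-by-index loop in the report without mutating a preallocated array.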
