Conflict with CUDA Context after modifying FDK filtering step with CuPy and calling tigre.Atb() #484

Closed
HBanjak opened this issue Sep 6, 2023 · 3 comments


HBanjak commented Sep 6, 2023

I have integrated the CuPy library to modify the FDK filtering step to run on the GPU. After doing so, I've encountered an issue when calling the tigre.Atb() function.

Environment:

TIGRE version: 2.2
CUDA version: 11.1
GPU Model: NVIDIA GeForce GTX 1080 Ti
OS: Windows 10
Language: Python

Steps to Reproduce:

1. Modify the FDK filtering step using CuPy to run it on the GPU.
2. Run a loop where:
   - I use CuPy to perform GPU operations for the FDK filtering.
   - I call tigre.Atb() for backprojection.
   - I use CuPy again for other GPU operations.
3. On the second iteration, the program crashes with exit code -1073741819 (0xC0000005), which is related to a memory access violation error.
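
The loop described in these steps can be sketched roughly as below. This is only an illustrative harness, not the code from the report: `run_repro_loop` and its three stage arguments are hypothetical names, and the stages are passed in as plain callables so the sketch runs without a GPU. Because an access violation kills the process rather than raising a Python exception, the sketch prints a flushed marker before each stage, so the last line printed shows how far the run got.

```python
import sys


def run_repro_loop(gpu_filter, backproject, gpu_post, projections,
                   n_iter=2, log=sys.stderr):
    """Run filter -> backproject -> post-process n_iter times.

    All three stages are caller-supplied callables (hypothetical names).
    Returns the last result and the list of (iteration, stage) pairs completed.
    """
    stages = (("filter", gpu_filter),
              ("backproject", backproject),
              ("post", gpu_post))
    reached = []
    data = None
    for it in range(n_iter):
        data = projections  # every iteration starts from the raw projections
        for name, stage in stages:
            # Flush immediately: a hard crash would otherwise lose buffered output.
            print(f"iteration {it}: entering {name}", file=log, flush=True)
            data = stage(data)
            reached.append((it, name))
    return data, reached


# Stand-in stages so the sketch runs anywhere. On a real setup they would be
# replaced by something like the following (untested here, names assumed):
#   gpu_filter  = lambda p: cp.asnumpy(my_cupy_filter(cp.asarray(p)))
#   backproject = lambda p: tigre.Atb(p, geo, angles)
#   gpu_post    = lambda v: cp.asnumpy(cp.abs(cp.asarray(v)))
# where my_cupy_filter is a hypothetical user-written CuPy filter.
result, reached = run_repro_loop(
    gpu_filter=lambda p: [2 * x for x in p],
    backproject=sum,
    gpu_post=lambda v: v + 1,
    projections=[1, 2, 3],
)
```

With the real stages plugged in, a crash like the one reported would be expected to show a marker from iteration 1 (the second pass) as the last line of output.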

The issue seems to arise after calling tigre.Atb(), suggesting that something within this function or its dependencies might be altering or destroying the CUDA context in a way that prevents subsequent CuPy GPU operations from proceeding.

I'd appreciate any insights or solutions to this problem. It seems like there's a conflict between TIGRE's internal CUDA operations and the GPU operations performed using CuPy.

@AnderBiguri
Member

Hi @HBanjak, apologies, I was on annual leave.

First, I would suggest you have a look at #423, as it's basically the filtering on GPU. It's not finished yet, but I think it's working nevertheless.

Probably this line is causing the issues: https://github.com/CERN/TIGRE/blob/master/Common/CUDA/voxel_backprojection.cu#L616

In standard TIGRE use, the calls to Ax/Atb are modular and independent, so whether the context is destroyed afterwards is irrelevant there; it seems that line got left in at some point. AFAIK there should be no problem with you removing it, and hopefully that will fix your issue.

@HBanjak
Author

HBanjak commented Sep 12, 2023

Hi @AnderBiguri,

Thanks for your reply.

I followed your recommendation to remove the line at: https://github.com/CERN/TIGRE/blob/master/Common/CUDA/voxel_backprojection.cu#L616 and it fixed the issue.

I appreciate the time you took to help me fix this problem.

HBanjak closed this as completed Sep 12, 2023
@AnderBiguri
Member

Fantastic! Good to hear!
