Planning for plastimatch FDK migration
Rough draft of file list
FFT: ramp_filter.c
CUDA utils: cuda_util.cu, cuda_kernel_util.cu, cuda_mem.cu
OpenCL utils: autotune_opencl.cxx, opencl_probe.cxx, opencl_util.cxx, opencl_util_nvidia.cxx
FDK: fdk_main.c, fdk_opts.c, fdk_util.c, bowtie_correction.c, fdk_cuda.cu, fdk_opencl.cxx
Misc stuff: bstring_util.cxx, dir_list.c, file_util.cxx, fwrite_block.c, hnd_io.c, logfile.c, math_util.h, mha_io.c, print_and_exit.c, proj_image.c, proj_image_dir.c, proj_matrix.c, string_util.c, threading.c, plm_timer.c, volume.c, volume_limit.c, delayload.c
Questions for discussion
-
How does ITK FFT compare with FFTW? I would especially like to know for multicore.
- Note #1: for GPU we should consider native GPU FFT methods.
- Note #2: how does convolution compare to FFT (may be superior for empirical kernels and low-res images)?
- Simon:
- ITK uses either vnl (default) or fftw. On 363x512x512 Elekta projections, internally zero-padded to 363x512x1024, I measured 96 s for vnl and 17 s for fftw. Moreover, vnl does not support image dimensions that are not a power of 2. I found a note about FFT in ITK 4; it seems to be one of their concerns.
- I noticed in ramp_filter.c that you use a 1D FFT. I have observed much better performance with 2D or 3D FFT in FFTW. Is there an easy way to benchmark the plastimatch ramp filter? I could run the benchmark on the same dataset I used in the previous point.
- I have some issues with multicore. Without going into details, you cannot call the ITK fftw filters, e.g. FFTWRealToComplexConjugateImageFilter, from another filter's ThreadedGenerateData function; this is not thread safe. I guess I can correct for that, but it is not straightforward if we want to keep the ability to do both vnl and fftw. TO DISCUSS...
- GPU: we should start with benchmarking cufftw.
- I don't see what convolution could gain on CPU. Do we care about speed for low-res images? It will be fast anyway.
- http://www.fftw.org/fftw3_doc/Thread-safety.html#Thread-safety
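To make the convolution alternative in Note #2 concrete, here is a minimal sketch of spatial-domain ramp filtering of one detector row, using the standard discrete band-limited ramp (Ram-Lak) kernel. It is not the code in ramp_filter.c; the function names `ramp_kernel` and `ramp_filter_row`, the truncation to `half_width` taps, and the scaling by the sample spacing `tau` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Discrete band-limited ramp (Ram-Lak) kernel for taps
// n = -half_width .. +half_width at sample spacing tau:
//   h(0) = 1 / (4 tau^2)
//   h(n) = 0                       for even n != 0
//   h(n) = -1 / (pi^2 tau^2 n^2)   for odd n
// (hypothetical helper, not part of plastimatch or RTK)
std::vector<double> ramp_kernel(int half_width, double tau)
{
    assert(half_width >= 0 && tau > 0.0);
    const double pi = std::acos(-1.0);
    std::vector<double> h(2 * half_width + 1, 0.0);
    for (int n = -half_width; n <= half_width; ++n) {
        double v = 0.0;
        if (n == 0) {
            v = 1.0 / (4.0 * tau * tau);
        } else if (n % 2 != 0) {
            v = -1.0 / (pi * pi * tau * tau * n * n);
        }
        h[n + half_width] = v;
    }
    return h;
}

// Direct convolution of one detector row with the kernel.
// Samples outside the row are treated as zero, so no explicit
// zero padding is needed. Cost is O(row length * kernel length).
std::vector<double> ramp_filter_row(const std::vector<double>& row,
                                    const std::vector<double>& h,
                                    double tau)
{
    const int n = static_cast<int>(row.size());
    const int half = static_cast<int>(h.size()) / 2;
    std::vector<double> out(row.size(), 0.0);
    for (int i = 0; i < n; ++i) {
        double acc = 0.0;
        for (int k = -half; k <= half; ++k) {
            const int j = i - k;
            if (j >= 0 && j < n) {
                acc += h[k + half] * row[j];
            }
        }
        out[i] = tau * acc;  // Riemann-sum scaling of the convolution integral
    }
    return out;
}
```

For a 512-sample row and a full-length kernel, this is on the order of 512 x 1023 multiply-adds per row, which is the number to weigh against a padded 1024-point FFT when benchmarking. A shorter or empirical kernel reduces the cost proportionally.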
-
CUDA and OpenCL utils should build as library in subdirectory of RTK. Non-fdk code in plastimatch can link to that library.
- Simon: ok for a library but why in a subdirectory?
-
Plastimatch FDK has its own method for specifying geometry (.txt files, plus .pfm files which are used for generic systems). Do we want to keep this?
- Simon: intuitively, no, we want to keep one good one :-). Can you tell us more about pfm files?
-
For file/directory processing, we should perhaps migrate to ITK-style methods.
- Simon: I don't get this point, sorry. What processing?
-
Any difference between PLM and RTK for hnd processing?
- Simon: I mostly copy pasted your code, there should not be any difference.
-
ITK v4 will (perhaps) disallow raw pointer access to images. That is potentially a performance problem.
- Note #1: ITK stock iterators are very slow (IMO they are unusable)
- Note #2: We should benchmark
- Simon:
- Can we actually do any gpu processing without access to the raw pointer?
- (Maybe not a problem any more. -Greg) http://www.cmake.org/Wiki/ITK_Release_4.0
- If stock iterators are the basic ones, like itk::ImageRegionIterator, I'm surprised; I always use them. Are you sure they are slow? Yes, let's benchmark. Any specific operation in mind?
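As a possible starting point for that benchmark, here is a minimal timing harness with two reference traversals that sum every voxel of a contiguous float volume. It uses only the C++ standard library. `best_time_seconds`, `sum_raw`, and `sum_indexed` are hypothetical names, and the indexed loop is only a stand-in baseline, not a model of how ITK iterators behave. The itk::ImageRegionIterator loop would be added as a third callable and timed the same way.

```cpp
#include <chrono>
#include <cstddef>
#include <limits>
#include <vector>

// Best-of-N wall-clock time, in seconds, of a callable.
// Taking the minimum reduces noise from other processes.
template <class F>
double best_time_seconds(F&& f, int repeats)
{
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = std::chrono::steady_clock::now();
        f();
        const auto t1 = std::chrono::steady_clock::now();
        const double dt = std::chrono::duration<double>(t1 - t0).count();
        if (dt < best) {
            best = dt;
        }
    }
    return best;
}

// Candidate 1: raw pointer traversal of a contiguous buffer.
double sum_raw(const float* p, std::size_t n)
{
    double s = 0.0;
    for (const float* end = p + n; p != end; ++p) {
        s += *p;
    }
    return s;
}

// Candidate 2: (x, y, z) traversal that recomputes the linear
// offset for every voxel.
double sum_indexed(const std::vector<float>& v,
                   std::size_t nx, std::size_t ny, std::size_t nz)
{
    double s = 0.0;
    for (std::size_t z = 0; z < nz; ++z) {
        for (std::size_t y = 0; y < ny; ++y) {
            for (std::size_t x = 0; x < nx; ++x) {
                s += v[(z * ny + y) * nx + x];
            }
        }
    }
    return s;
}
```

In use, each traversal is wrapped in a lambda that stores its result in a variable read after timing, so the compiler cannot discard the loop. The timings are then compared on a volume of realistic size, for example 512 x 512 x 363 floats.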
-
PLM will need, at least temporarily, to retain the plm_image and plm_matrix methods for images and geometry. This is to maintain compatibility with the DRR code. Therefore, some bridge code will be needed.