Accelerating iterative CT reconstruction algorithms using Tensor Cores
Tensor Cores are specialized hardware units added to recent NVIDIA GPUs to speed up matrix multiplication-related tasks, such as convolutions and densely connected layers in neural networks. Due to their specific hardware implementation and programming model, Tensor Cores cannot be straightforwardly applied to applications outside machine learning. In this paper, we demonstrate the feasibility of using NVIDIA Tensor Cores for the acceleration of a non-machine-learning application: iterative Computed Tomography (CT) reconstruction. For large CT images and real-time CT scanning, the reconstruction time for many existing iterative reconstruction methods is relatively high, ranging from seconds to minutes, depending on the size of the image. Therefore, CT reconstruction is an application area that could potentially benefit from Tensor Core hardware acceleration. We first studied the reconstruction algorithm's performance as a function of the hardware-related parameters and proposed an approach to accelerate reconstruction on Tensor Cores. The results show that the proposed method provides about a 5x increase in speed and energy savings using the NVIDIA RTX 2080 Ti GPU for the parallel projection of 32 images of size 512 × 512. The relative reconstruction error due to the mixed-precision computations was almost equal to the error of single-precision (32-bit) floating-point computations. We then presented an approach for real-time and memory-limited applications by exploiting the symmetry of the system (i.e., the acquisition geometry). As the proposed approach is based on the conjugate gradient method, it can be generalized to extend its application to many research and industrial fields.
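To make the conjugate gradient and mixed-precision ideas in this abstract concrete, here is a minimal sketch, not the authors' implementation. It solves a tiny least-squares problem with conjugate gradient on the normal equations (CGLS) and emulates reduced-precision storage by rounding the system matrix to IEEE half precision. The names `to_half`, `matvec`, `rmatvec`, and `cgls` are hypothetical, and the accumulation here runs in ordinary Python floats rather than on Tensor Core hardware.

```python
import struct


def to_half(x):
    """Round a float to IEEE half precision and back (emulates FP16 storage)."""
    return struct.unpack("e", struct.pack("e", x))[0]


def matvec(A, x):
    """Forward projection: y = A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Back projection: x = A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def cgls(A, b, iters=50, tol=1e-12):
    """Conjugate gradient on the normal equations A^T A x = A^T b."""
    x = [0.0] * len(A[0])
    r = list(b)                      # residual b - A x for the start x = 0
    s = rmatvec(A, r)                # gradient direction A^T r
    p = list(s)
    gamma = sum(v * v for v in s)
    for _ in range(iters):
        if gamma < tol:              # squared gradient norm is small enough
            break
        q = matvec(A, p)
        alpha = gamma / sum(v * v for v in q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = sum(v * v for v in s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x
```

Calling `cgls` once with the original matrix and once with a copy whose entries were passed through `to_half` gives a rough feel for how much error reduced-precision storage alone introduces on a well-conditioned toy problem.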
Four-dimensional tomographic reconstruction by time domain decomposition
Since the beginnings of tomography, the requirement that the sample does not
change during the acquisition of one tomographic rotation has remained in place. We
derived and successfully implemented a tomographic reconstruction method which
relaxes this decades-old requirement of static samples. In the presented
method, dynamic tomographic data sets are decomposed in the temporal domain
using basis functions and an L1 regularization technique whose
penalty is applied to the spatial and temporal derivatives. We implemented
the iterative algorithm for solving the regularization problem on modern GPU
systems to demonstrate its practical use.
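The temporal decomposition and derivative penalty described above can be illustrated with a small sketch. It assumes a 1D spatial signal, finite differences as the derivatives, and separate weights for the spatial and temporal terms; the abstract fixes none of these choices, and the names `synthesize` and `l1_derivative_penalty` are hypothetical.

```python
def synthesize(coeffs, basis):
    """Build the time series x[t][i] = sum_k basis[k][t] * coeffs[k][i].

    coeffs[k] is the spatial coefficient image for temporal basis function k,
    and basis[k][t] is that basis function sampled at time step t.
    """
    n_time = len(basis[0])
    n_pix = len(coeffs[0])
    return [
        [sum(basis[k][t] * coeffs[k][i] for k in range(len(basis)))
         for i in range(n_pix)]
        for t in range(n_time)
    ]


def l1_derivative_penalty(x, lam_space, lam_time):
    """L1 norm of spatial and temporal finite differences, weighted separately."""
    space = sum(abs(row[i + 1] - row[i])
                for row in x for i in range(len(row) - 1))
    time = sum(abs(x[t + 1][i] - x[t][i])
               for t in range(len(x) - 1) for i in range(len(x[0])))
    return lam_space * space + lam_time * time
```

In a full reconstruction this penalty would be added to a data-fidelity term and minimized over the coefficients `coeffs`; the sketch only evaluates the regularizer.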
2.5D Deep Learning for CT Image Reconstruction using a Multi-GPU implementation
While Model Based Iterative Reconstruction (MBIR) of CT scans has been shown
to have better image quality than Filtered Back Projection (FBP), its use has
been limited by its high computational cost. More recently, deep convolutional
neural networks (CNN) have shown great promise in both denoising and
reconstruction applications. In this research, we propose a fast reconstruction
algorithm, which we call Deep Learning MBIR (DL-MBIR), for approximating MBIR
using a deep residual neural network. The DL-MBIR method is trained to produce
reconstructions that approximate true MBIR images using a 16-layer residual
convolutional neural network implemented on multiple GPUs using Google
TensorFlow. In addition, we propose 2D, 2.5D and 3D variations on the DL-MBIR
method and show that the 2.5D method achieves similar quality to the fully 3D
method, but with reduced computational cost.
Comment: IEEE Asilomar Conference on Signals, Systems, and Computers, 201
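The abstract does not define "2.5D". A common reading, assumed here, is that each output slice is predicted by a 2D network whose input channels are that slice plus a few neighbors along the axial direction. The sketch below only builds those input stacks; the function name `slabs_2p5d`, the `half_width` parameter, and the edge clamping are illustrative choices, not details from the paper.

```python
def slabs_2p5d(volume, half_width=1):
    """For each slice z, gather slices z-half_width .. z+half_width as channels.

    Indices that fall outside the volume are clamped to the nearest valid
    slice, so every slab has exactly 2 * half_width + 1 channels.
    """
    n = len(volume)
    slabs = []
    for z in range(n):
        idx = [min(max(z + d, 0), n - 1)
               for d in range(-half_width, half_width + 1)]
        slabs.append([volume[j] for j in idx])
    return slabs
```

With `half_width=0` this reduces to the plain 2D case, while a fully 3D method would instead convolve along the slice axis as well, which is where the extra computational cost comes from.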
High-Level Programming for Medical Imaging on Multi-GPU Systems Using the SkelCL Library
Application development for modern high-performance systems with Graphics Processing Units (GPUs) relies on low-level programming approaches like CUDA and OpenCL, which leads to complex, lengthy and error-prone programs.
In this paper, we present SkelCL – a high-level programming model for systems with multiple GPUs and its implementation as a library on top of OpenCL. SkelCL provides three main enhancements to the OpenCL standard: 1) computations are conveniently expressed using parallel patterns (skeletons); 2) memory management is simplified using parallel container data types; 3) an automatic data (re)distribution mechanism allows for scalability when using multi-GPU systems.
We use a real-world example from the field of medical imaging to motivate the design of our programming model and we show how application development using SkelCL is simplified without sacrificing performance: we were able to reduce the code size in our imaging example application by 50% while introducing only a moderate runtime overhead of less than 5%.
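The idea of skeletons with automatic data distribution can be sketched in a few lines. This is a conceptual analogy in plain Python, not the SkelCL API (SkelCL itself is a C++ library on top of OpenCL); `split_blocks`, `map_skeleton`, and `reduce_skeleton` are hypothetical names, and the "devices" are simulated by list blocks processed one after another.

```python
from functools import reduce


def split_blocks(data, n_devices):
    """Block-distribute data over n_devices (block sizes differ by at most 1)."""
    base, extra = divmod(len(data), n_devices)
    blocks, start = [], 0
    for d in range(n_devices):
        size = base + (1 if d < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def map_skeleton(f, data, n_devices=2):
    """Apply f to every element; each block stands in for one GPU's share."""
    mapped = [[f(v) for v in block] for block in split_blocks(data, n_devices)]
    return [v for block in mapped for v in block]   # gather results in order


def reduce_skeleton(op, data, identity, n_devices=2):
    """Reduce each block locally, then combine the per-device partial results."""
    partials = [reduce(op, block, identity)
                for block in split_blocks(data, n_devices)]
    return reduce(op, partials, identity)
```

The caller supplies only the element-wise function or the associative operator; the splitting and gathering stay inside the skeleton, which is the kind of simplification the abstract attributes to the library.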
Investigation of iterative image reconstruction in three-dimensional optoacoustic tomography
Iterative image reconstruction algorithms for optoacoustic tomography (OAT),
also known as photoacoustic tomography, can improve image
quality over analytic algorithms because they can incorporate accurate
models of the imaging physics, instrument response, and measurement noise.
However, to date, there have been few reported attempts to employ advanced
iterative image reconstruction algorithms for improving image quality in
three-dimensional (3D) OAT. In this work, we implement and investigate two
iterative image reconstruction methods for use with a 3D OAT small animal
imager: namely, a penalized least-squares (PLS) method employing a quadratic
smoothness penalty and a PLS method employing a total variation norm penalty.
The reconstruction algorithms employ accurate models of the ultrasonic
transducer impulse responses. Experimental data sets are employed to compare
the performances of the iterative reconstruction algorithms to that of a 3D
filtered backprojection (FBP) algorithm. By use of quantitative measures of
image quality, we demonstrate that the iterative reconstruction algorithms can
mitigate image artifacts and preserve spatial resolution more effectively than
FBP algorithms. These features suggest that the use of advanced image
reconstruction algorithms can improve the effectiveness of 3D OAT while
reducing the amount of data required for biomedical applications.
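The difference between the two penalties named above shows up in a 1D toy example. The sketch assumes finite differences and a generic penalized least-squares objective of the form ||Hf - g||^2 + lam * R(f); the function names and the 1D setting are illustrative and not taken from the paper.

```python
def quadratic_smoothness(f):
    """Sum of squared neighbor differences (quadratic smoothness penalty)."""
    return sum((b - a) ** 2 for a, b in zip(f, f[1:]))


def total_variation(f):
    """Sum of absolute neighbor differences (1D total variation)."""
    return sum(abs(b - a) for a, b in zip(f, f[1:]))


def pls_objective(H, f, g, lam, penalty):
    """Penalized least squares: ||H f - g||^2 + lam * penalty(f)."""
    residual = [sum(h * x for h, x in zip(row, f)) - gi
                for row, gi in zip(H, g)]
    return sum(r * r for r in residual) + lam * penalty(f)
```

For a sharp step `[0, 0, 0, 4, 4]` and a gradual ramp `[0, 1, 2, 3, 4]` with the same total rise, the quadratic penalty is four times larger for the step, while total variation scores both equally. That is the usual intuition for why a total variation penalty tends to preserve edges that a quadratic penalty smooths out.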