4,969 research outputs found

    Evaluation of Single-Chip, Real-Time Tomographic Data Processing on FPGA - SoC Devices

    Get PDF
    A novel approach to tomographic data processing has been developed and evaluated using the Jagiellonian PET (J-PET) scanner as an example. We propose a system in which there is no need for a powerful processing facility, local to the scanner, capable of reconstructing images on the fly. Instead, we introduce a Field Programmable Gate Array (FPGA) System-on-Chip (SoC) platform connected directly to the data streams coming from the scanner, which performs event building, filtering, coincidence search and Region-Of-Response (ROR) reconstruction in the programmable logic, and visualization on the integrated processors. The platform significantly reduces the data volume by converting raw data to a list-mode representation, while generating the visualization on the fly. Comment: IEEE Transactions on Medical Imaging, 17 May 201
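
    As a rough illustration of the coincidence-search step that the platform performs in programmable logic, the following Python sketch pairs time-ordered single events that fall within a fixed time window and hit different detector modules. The event layout and the 3 ns window are illustrative assumptions, not the J-PET data format, and the real system does this in FPGA logic rather than software.

    # Hypothetical sketch of a coincidence search over time-ordered single events.
    import numpy as np

    def find_coincidences(times_ns, detector_ids, window_ns=3.0):
        """Pair events whose time stamps differ by less than window_ns
        and which were registered on different detector modules."""
        order = np.argsort(times_ns)                    # ensure time ordering
        t, d = times_ns[order], detector_ids[order]
        pairs = []
        for i in range(len(t) - 1):
            if t[i + 1] - t[i] < window_ns and d[i + 1] != d[i]:
                pairs.append((order[i], order[i + 1]))  # list-mode style output
        return pairs

    # Example: five singles; only the first two form a valid coincidence,
    # since events 2 and 3 are close in time but hit the same module.
    times = np.array([10.0, 11.5, 50.0, 52.0, 90.0])
    dets = np.array([0, 1, 2, 2, 0])
    print(find_coincidences(times, dets))               # [(0, 1)]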

    Iterative Reconstruction of Cone-Beam Micro-CT Data

    Get PDF
    The use of x-ray computed tomography (CT) scanners has become widespread in both clinical and preclinical contexts. CT scanners can be used to noninvasively test for anatomical anomalies as well as to diagnose and monitor disease progression. However, the data acquired by a CT scanner must be reconstructed prior to use and interpretation. A reconstruction algorithm processes the data and outputs a three-dimensional image representing the x-ray attenuation properties of the scanned object. The algorithms in most widespread use today are based on filtered backprojection (FBP) methods. These algorithms are relatively fast and work well on high-quality data, but cannot easily handle data with missing projections or considerable amounts of noise. On the other hand, iterative reconstruction algorithms may offer benefits in such cases, but the computational burden associated with iterative reconstructions is prohibitive. In this work, we address this computational burden and present methods that make iterative reconstruction of high-resolution CT data possible in a reasonable amount of time. Our proposed techniques include parallelization, ordered subsets, reconstruction region restriction, and a modified version of the SIRT algorithm that reduces the overall run-time. When combining all of these techniques, we can reconstruct a 512 × 512 × 1022 image from acquired micro-CT data in less than thirty minutes.
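
    Since the thesis centres on a modified SIRT, a minimal NumPy sketch of the plain SIRT update may help fix ideas; it omits the thesis's modifications (ordered subsets, region restriction, parallelization), and the dense matrix A is only a stand-in for the cone-beam projector, which in practice is never formed explicitly.

    # Plain SIRT: x <- x + C A^T R (b - A x), with R and C the inverse row and
    # column sums of the system matrix.
    import numpy as np

    def sirt(A, b, n_iters=500):
        row_sums = A.sum(axis=1); row_sums[row_sums == 0] = 1.0
        col_sums = A.sum(axis=0); col_sums[col_sums == 0] = 1.0
        x = np.zeros(A.shape[1])
        for _ in range(n_iters):
            residual = b - A @ x                              # forward project
            x += (A.T @ (residual / row_sums)) / col_sums     # weighted backprojection
        return x

    # Toy problem: four "rays" through a three-voxel object.
    A = np.array([[1., 1., 0.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
    x_true = np.array([1.0, 2.0, 3.0])
    print(np.round(sirt(A, A @ x_true), 3))                   # approaches [1. 2. 3.]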

    PSNR Based Optimization Applied to Algebraic Reconstruction Technique for Image Reconstruction on a Multi-core System

    Get PDF
    The present work presents a parallel Algebraic Reconstruction Technique (pART) to reduce the time required to reconstruct artifact-free images from projections. ART is an iterative algorithm well known for reconstructing artifact-free images from a limited number of projections. In this work, a novel idea is introduced to optimize the number of iterations required to reconstruct an image, based on the Peak Signal-to-Noise Ratio (PSNR). However, ART suffers from poor computational speed. Hence, an attempt is made to reduce the computation time by running the iterative algorithm in a multi-core parallel environment. The execution times are computed for both serial and parallel implementations of ART using different projection data and tabulated for comparison. The experimental results demonstrate that the parallel computing environment provides a source of high computational power, making it possible to obtain the reconstructed image almost instantaneously.
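
    The abstract does not spell out how the PSNR criterion is applied, so the sketch below is only one plausible reading: run ART (Kaczmarz) sweeps and stop once the PSNR between successive reconstructions exceeds a threshold. The 40 dB threshold and the comparison against the previous iterate are assumptions for illustration.

    # Illustrative PSNR-based stopping rule wrapped around ART sweeps.
    import numpy as np

    def psnr(img, ref, peak):
        mse = np.mean((img - ref) ** 2)
        return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

    def art_with_psnr_stop(A, b, threshold_db=40.0, max_sweeps=200, relax=1.0):
        x = np.zeros(A.shape[1])
        for sweep in range(max_sweeps):
            x_prev = x.copy()
            for i in range(A.shape[0]):                   # one ART sweep, row by row
                a = A[i]                                  # assumes nonzero projection rows
                x += relax * (b[i] - a @ x) / (a @ a) * a
            if psnr(x, x_prev, peak=max(x.max(), 1e-12)) > threshold_db:
                return x, sweep + 1                       # successive iterates agree closely
        return x, max_sweeps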

    CUDA accelerated cone‐beam reconstruction

    Get PDF
    Cone-Beam Computed Tomography (CBCT) is an imaging method that reconstructs a 3D representation of an object from its 2D X-ray images. It is an important diagnostic tool in the medical field, especially dentistry. However, most 3D reconstruction algorithms are computationally intensive and time-consuming; this limitation constrains the use of CBCT. In recent years, high-end graphics cards, such as those powered by NVIDIA graphics processing units (GPUs), have become able to perform general-purpose computation. Due to the highly parallel nature of the 3D reconstruction algorithms, it is possible to implement them on the GPU to reduce the processing time to a practical level. Two of the most popular 3D cone-beam reconstruction algorithms are the Feldkamp-Davis-Kress algorithm (FDK) and the Algebraic Reconstruction Technique (ART). FDK constructs 3D images quickly, but the quality of its images is lower than that of ART images; ART, however, requires significantly more computation. Material ART is a recently developed algorithm that uses beam-hardening correction to eliminate artifacts. In this thesis, these three algorithms were implemented on NVIDIA's CUDA platform. The CUDA-based algorithms were tested on three different graphics cards, using phantom and real data. The test results show significant speedup when compared to the CPU software implementation. The speedup is sufficient to allow a moderately priced personal computer with an NVIDIA graphics card to process CBCT images in real time.
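
    To show why these algorithms map so naturally onto a GPU, here is a deliberately simplified, CPU-side NumPy sketch of the voxel-driven backprojection of a single cone-beam projection, the core step of FDK. Ramp filtering, detector offsets and bilinear interpolation are omitted, and the geometry parameters are generic assumptions; the point is that every voxel is updated independently, which is what makes a one-thread-per-voxel CUDA kernel effective.

    # Voxel-driven backprojection of one view with FDK-style distance weighting.
    import numpy as np

    def backproject_view(volume, proj, angle, sdd, sod, det_spacing, vox_spacing):
        nz, ny, nx = volume.shape
        zs, ys, xs = np.meshgrid(np.arange(nz), np.arange(ny), np.arange(nx), indexing="ij")
        # Voxel coordinates centred on the rotation axis.
        x = (xs - nx / 2.0) * vox_spacing
        y = (ys - ny / 2.0) * vox_spacing
        z = (zs - nz / 2.0) * vox_spacing
        c, s = np.cos(angle), np.sin(angle)
        depth = sod + x * c + y * s               # distance from source along the central ray
        u = sdd * (-x * s + y * c) / depth / det_spacing + proj.shape[1] / 2.0
        v = sdd * z / depth / det_spacing + proj.shape[0] / 2.0
        ui = np.clip(u, 0, proj.shape[1] - 1).astype(int)    # nearest-neighbour lookup
        vi = np.clip(v, 0, proj.shape[0] - 1).astype(int)
        volume += (sod / depth) ** 2 * proj[vi, ui]          # FDK distance weighting
        return volume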

    Tomographic reconstruction algorithms using optoelectronic devices

    Get PDF
    During the last two decades, iterative computerized tomography (CT) algorithms, such as ART (Algebraic Reconstruction Technique) and SIRT (Simultaneous Iterative Reconstruction Technique), have been applied to the solution of overdetermined and underdetermined systems. These algorithms arrive at the least-squares solution of the normal equations. In theory, such algorithms converge to the minimum-norm solution when a system is underdetermined, provided there are no computational errors and the initial vector is chosen properly. In practice, computational errors may lead to failure to converge to a unique solution. The dissertation introduces a method called the projection iterative reconstruction technique (PIRT), which differs from the other reconstruction algorithms used for solving underdetermined systems. Even though the differences between the method outlined in this dissertation and the algorithms proposed earlier are subtle, the proposed scheme guarantees convergence to a unique minimum-norm solution. Several acceleration techniques are discussed in the dissertation. Furthermore, the iterative algorithm can also be generalized and employed to solve other large, sparse linear systems.
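
    The convergence property described above is easy to check numerically. The snippet below is not PIRT itself; it only illustrates the classical result that ART/Kaczmarz started from the zero vector on a consistent underdetermined system converges to the minimum-norm solution, cross-checked against numpy.linalg.lstsq, which returns that same solution.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 8))       # 3 equations, 8 unknowns: underdetermined
    b = A @ rng.standard_normal(8)        # consistent right-hand side

    x = np.zeros(8)                       # the zero initial vector keeps x in the row space of A
    for _ in range(2000):
        for i in range(A.shape[0]):       # Kaczmarz / ART sweeps
            a = A[i]
            x += (b[i] - a @ x) / (a @ a) * a

    x_min_norm = np.linalg.lstsq(A, b, rcond=None)[0]
    print(np.allclose(x, x_min_norm, atol=1e-6))   # expected: True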

    Mapping Iterative Medical Imaging Algorithm on Cell Accelerator

    Get PDF
    Algebraic reconstruction techniques require about half the number of projections that Fourier backprojection methods do, which makes them safer in terms of the required radiation dose. The Algebraic Reconstruction Technique (ART) and its variant OS-SART (ordered-subset simultaneous ART) provide faster convergence with comparatively good image quality. However, the prohibitively long processing time of these techniques prevents their adoption in commercial CT machines. Parallel computing is one solution to this problem. With the advent of heterogeneous multicore architectures that exploit data-parallel applications, medical imaging algorithms such as OS-SART can be studied for increased performance. In this paper, we map OS-SART onto the Cell Broadband Engine (Cell BE), making effective use of its architectural features to provide an efficient mapping. The Cell BE consists of one PowerPC Processor Element (PPE) and eight SIMD coprocessors known as Synergistic Processor Elements (SPEs). The limited memory storage on each of the SPEs makes the mapping challenging. Therefore, we present optimization techniques to efficiently map the algorithm on the Cell BE for improved performance over the CPU version. We compare the performance of our proposed algorithm on the Cell BE to that of a Sun Fire x4600, a shared-memory machine. The Cell BE is five times faster than the AMD Opteron dual-core processor. The speedup of the algorithm on the Cell BE increases with the number of SPEs. We also experiment with various parameters that impact the performance of the algorithm, such as the number of subsets, the number of processing elements, and the number of DMA transfers between main memory and local memory.
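
    The ordered-subsets structure that the Cell BE mapping exploits can be sketched compactly: the projection rows are split into subsets and a SART-style update is applied one subset at a time, each sub-iteration being a natural unit of work to distribute. The matrix form and subset count below are illustrative, not the paper's implementation.

    # OS-SART sketch: loop over subsets of projections, normalising by the
    # subset's row and column sums (assumes a nonnegative system matrix).
    import numpy as np

    def os_sart(A, b, n_subsets=4, n_epochs=20, relax=1.0):
        m, n = A.shape
        subsets = np.array_split(np.random.permutation(m), n_subsets)
        x = np.zeros(n)
        for _ in range(n_epochs):
            for sub in subsets:                           # one sub-iteration per subset
                As, bs = A[sub], b[sub]
                row_sums = As.sum(axis=1); row_sums[row_sums == 0] = 1.0
                col_sums = As.sum(axis=0); col_sums[col_sums == 0] = 1.0
                x += relax * (As.T @ ((bs - As @ x) / row_sums)) / col_sums
        return x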

    Multi-GPU Acceleration of Iterative X-ray CT Image Reconstruction

    Get PDF
    X-ray computed tomography is a widely used medical imaging modality for screening and diagnosing diseases and for image-guided radiation therapy treatment planning. Statistical iterative reconstruction (SIR) algorithms have the potential to significantly reduce image artifacts by minimizing a cost function that models the physics and statistics of the data acquisition process in X-ray CT. SIR algorithms have superior performance compared to traditional analytical reconstructions for a wide range of applications including nonstandard geometries arising from irregular sampling, limited angular range, missing data, and low-dose CT. The main hurdle for the widespread adoption of SIR algorithms in multislice X-ray CT reconstruction problems is their slow convergence rate and associated computational time. We seek to design and develop fast parallel SIR algorithms for clinical X-ray CT scanners. Each of the following approaches is implemented on real clinical helical CT data acquired from a Siemens Sensation 16 scanner and compared to the straightforward implementation of the Alternating Minimization (AM) algorithm of O’Sullivan and Benac [1]. We parallelize the computationally expensive projection and backprojection operations by exploiting the massively parallel hardware architecture of 3 NVIDIA TITAN X Graphics Processing Unit (GPU) devices with CUDA programming tools and achieve an average speedup of 72X over a straightforward CPU implementation. We implement a multi-GPU based voxel-driven multislice analytical reconstruction algorithm called Feldkamp-Davis-Kress (FDK) [2] and achieve an average overall speedup of 1382X over the baseline CPU implementation by using 3 TITAN X GPUs. Moreover, we propose a novel adaptive surrogate-function based optimization scheme for the AM algorithm, resulting in more aggressive update steps in every iteration. On average, we double the convergence rate of our baseline AM algorithm and also improve image quality by using the adaptive surrogate function. We extend the multi-GPU and adaptive surrogate-function based acceleration techniques to dual-energy reconstruction problems as well. Furthermore, we design and develop a GPU-based deep Convolutional Neural Network (CNN) to denoise simulated low-dose X-ray CT images. Our experiments show significant improvements in image quality with our proposed deep CNN-based algorithm against some widely used denoising techniques including Block Matching 3-D (BM3D) and Weighted Nuclear Norm Minimization (WNNM). Overall, we have developed novel fast, parallel, computationally efficient methods to perform multislice statistical reconstruction and image-based denoising on clinically sized datasets.
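
    The multi-GPU decomposition can be pictured independently of the CUDA details: the projection views are split evenly across the devices, each device backprojects its share, and the partial volumes are summed. The sketch below shows only that work-partitioning logic; backproject_on_device is a hypothetical stand-in for the per-GPU kernels described in the thesis (real GPU calls release the GIL, so threads suffice for dispatch).

    import numpy as np
    from concurrent.futures import ThreadPoolExecutor

    def multi_gpu_backproject(all_views, n_devices, backproject_on_device):
        """Split view indices across devices, backproject concurrently, sum the partials."""
        chunks = np.array_split(np.arange(len(all_views)), n_devices)
        with ThreadPoolExecutor(max_workers=n_devices) as pool:
            partials = pool.map(lambda dc: backproject_on_device(dc[0], dc[1]), enumerate(chunks))
        return sum(partials)                  # reduce the per-device partial volumes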

    GPU-based ultra fast dose calculation using a finite pencil beam model

    Full text link
    Online adaptive radiation therapy (ART) is an attractive concept that promises the ability to deliver an optimal treatment in response to inter-fraction variability in patient anatomy. However, it has yet to be realized due to technical limitations. Fast dose deposition coefficient calculation is a critical component of the online planning process required for plan optimization of intensity-modulated radiation therapy (IMRT). Computer graphics processing units (GPUs) are well suited to provide the requisite fast performance for the data-parallel nature of dose calculation. In this work, we develop a dose calculation engine based on a finite-size pencil beam (FSPB) algorithm and a GPU parallel computing framework. The developed framework can accommodate any FSPB model. We test our implementation on a case of a water phantom and a case of a prostate cancer patient with varying beamlet and voxel sizes. All testing scenarios achieved speedups ranging from 200 to 400 times when using an NVIDIA Tesla C1060 card in comparison with a 2.27 GHz Intel Xeon CPU. The computational time for calculating the dose deposition coefficients for a 9-field prostate IMRT plan with this new framework is less than 1 second. This indicates that the GPU-based FSPB algorithm is well suited for online re-planning in adaptive radiotherapy. Comment: submitted to Physics in Medicine and Biology
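
    The central data structure here is the matrix of dose deposition coefficients: entry D[j, i] is the dose delivered to voxel j by beamlet i at unit intensity, so the total dose is simply D times the beamlet weight vector. The sketch below assembles such a matrix from a generic FSPB-like kernel (Gaussian lateral profile, exponential depth fall-off); the kernel form and parameters are stand-in assumptions, not the model used in the paper, whose contribution is computing these coefficients on the GPU far faster than this.

    import numpy as np

    def fspb_column(voxel_xyz, beamlet_xy, sigma_mm=3.0, mu_per_mm=0.005):
        """Dose per unit intensity of one beamlet at every voxel (one column of D)."""
        lateral = voxel_xyz[:, :2] - beamlet_xy          # off-axis distance in the beam frame
        r2 = np.sum(lateral ** 2, axis=1)
        depth = voxel_xyz[:, 2]                          # depth along the beam direction
        return np.exp(-r2 / (2 * sigma_mm ** 2)) * np.exp(-mu_per_mm * depth)

    def total_dose(voxel_xyz, beamlet_xys, intensities):
        D = np.stack([fspb_column(voxel_xyz, b) for b in beamlet_xys], axis=1)
        return D @ intensities                           # dose = D * beamlet weights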

    GraphLab: A New Framework for Parallel Machine Learning

    Full text link
    Designing and implementing efficient, provably correct parallel machine learning (ML) algorithms is challenging. Existing high-level parallel abstractions like MapReduce are insufficiently expressive, while low-level tools like MPI and Pthreads leave ML experts repeatedly solving the same design challenges. By targeting common patterns in ML, we developed GraphLab, which improves upon abstractions like MapReduce by compactly expressing asynchronous iterative algorithms with sparse computational dependencies while ensuring data consistency and achieving a high degree of parallel performance. We demonstrate the expressiveness of the GraphLab framework by designing and implementing parallel versions of belief propagation, Gibbs sampling, Co-EM, Lasso and Compressed Sensing. We show that using GraphLab we can achieve excellent parallel performance on large-scale real-world problems.
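
    A schematic rendering of the abstraction may make the contrast with MapReduce concrete: each vertex has an update function that reads and writes only its local scope, and a scheduler dynamically re-queues neighbours whose data may have become stale. This is plain sequential Python, not the GraphLab API; PageRank stands in for the update function, and the consistency and parallel-execution machinery is elided.

    from collections import deque

    def pagerank_update(v, data, in_nbrs, out_nbrs, damping=0.85, tol=1e-4):
        """Recompute one vertex's rank from its in-neighbours; return vertices to reschedule."""
        new_rank = (1 - damping) + damping * sum(
            data[u] / max(len(out_nbrs[u]), 1) for u in in_nbrs[v])
        changed = abs(new_rank - data[v]) > tol
        data[v] = new_rank
        return out_nbrs[v] if changed else []

    def run(vertices, in_nbrs, out_nbrs, update):
        data = {v: 1.0 for v in vertices}
        queue, queued = deque(vertices), set(vertices)
        while queue:                                     # dynamic scheduling until quiescence
            v = queue.popleft(); queued.discard(v)
            for w in update(v, data, in_nbrs, out_nbrs):
                if w not in queued:
                    queue.append(w); queued.add(w)
        return data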