15 research outputs found

    High Speed 3D Tomography on CPU, GPU, and FPGA

    Get PDF
    12 pages; 50% d'acceptationInternational audienceBack-projection (BP) is a costly computational step in tomography image reconstruction such as positron emission tomography (PET). To reduce the computation time, this paper presents a pipelined, prefetch, and parallelized architecture for PET BP (3PA-PET). The key feature of this architecture is its original memory access strategy, masking the high latency of the external memory. Indeed, the pattern of the memory references to the data acquired hinders the processing unit. The memory access bottleneck is overcome by an efficient use of the intrinsic temporal and spatial locality of the BP algorithm. A loop reordering allows an efficient use of general purpose processor's caches, for software implementation, as well as the 3D predictive and adaptive cache (3D-AP cache), when considering hardware implementations. Parallel hardware pipelines are also efficient thanks to a hierarchical 3D-AP cache: each pipeline performs a memory reference in about one clock cycle to reach a computational throughput close to 100%. The 3PA-PET architecture is prototyped on a system on programmable chip (SoPC) to validate the system and to measure its expected performances. Time performances are compared with a desktop PC, a workstation, and a graphic processor unit (GPU)

    Multi GPU parallelization of 3D bayesian CT algorithm and its application on real foam reonconstruction with incomplete data set

    Get PDF
    International audienceA great number of image reconstruction algorithms, based on analytical filtered backprojection, are implemented for X-ray Computed Tomography (CT) [1,2]. The limits of these methods appear when the number of projections is small, and/or not equidistributed around the object. That's the case in the context of dynamic study of fluids in foams for example, the data set are not complete due to the limited acquistion time. In this specific context, iterative algebraic methods are a solution to this lack of data. A great number of them are mainly based on least square criterion. Recently, we proposed a regularized version based on Bayesian estimation approach. The main problem that appears when using such methods as well as any iterative algebraic methods is the computation time and especially for projection and backprojection steps. In this study, first we show how we implemented some main steps of such algorithms which are the forward projection and backward backprojection steps on multi-GPU hardware, and then we show some results on real application of the 3D tomographic reconstruction of metallic foams from a small number of projections. Through this application, we also show the good quality of results as well as a significant speed up of the computation with GPU implementation (300 acceleration factor)

    Multi-streaming and multi-GPU optimization for a matched pair of Projector and Backprojector

    Get PDF
    International audienceIterative reconstruction methods are used in X-ray Computed Tomography in order to improve the quality of reconstruction compared to filtered backprojection methods. However, these methods are computationally expensive due to repeated projection and backprojection operations. Among the possible pairs of projector and backprojector, the Separable Footprint (SF) pair has the advantage to be matched in order to ensure the convergence of the reconstruction algorithm. Nevertheless, this pair implies more computations compared to unmatched pairs commonly used in order to reduce the computation time. In order to speed up this pair, the projector and the backprojector can be parallelized on GPU. Following one of our previous work, in this paper, we propose a new implementation which takes benefits from the factorized calculations of the SF pair in order to increase the number of data handled by each thread. We also describe the adaptation of this implementation for multi-streaming computations. The method is tested on large volumes of size 1024 3 and 2048 3 voxels

    GPU implementation of a 3D bayesian CT algorithm and its application on real foam reconstruction

    Get PDF
    5 pagesInternational audienceA great number of image reconstruction algorithms, based on analytical filtered backprojection, are implemented for X-ray Computed Tomography (CT) [1, 3]. The limits of these methods appear when the number of projections is small, and/or not equidistributed around the object. In this specific context, iterative algebraic methods are implemented. A great number of them are mainly based on least square criterion. Recently, we proposed a regularized version based on Bayesian estimation approach. The main problem that appears when using such methods as well as any iterative algebraic methods is the computation time and especially for projection and backprojection steps. In this paper, first we show how we implemented some main steps of such algorithems which are the forward projection and backward backprojection steps on GPU hardware, and then we show some results on real application of the 3D tomographic reconstruction of metallic foams from a small number of projections. Through this application, we also show the good quality of results as well as a significant speed up of the computation with GPU implementation

    Analysis of 3D Cone-Beam CT Image Reconstruction Performance on a FPGA

    Get PDF
    Efficient and accurate tomographic image reconstruction has been an intensive topic of research due to the increasing everyday usage in areas such as radiology, biology, and materials science. Computed tomography (CT) scans are used to analyze internal structures through capture of x-ray images. Cone-beam CT scans project a cone-shaped x-ray to capture 2D image data from a single focal point, rotating around the object. CT scans are prone to multiple artifacts, including motion blur, streaks, and pixel irregularities, therefore must be run through image reconstruction software to reduce visual artifacts. The most common algorithm used is the Feldkamp, Davis, and Kress (FDK) backprojection algorithm. The algorithm is computationally intensive due to the O(n4) backprojection step, running slowly with large CT data files on CPUs, but exceptionally well on GPUs due to the parallel nature of the algorithm. This thesis will analyze the performance of 3D cone-beam CT image reconstruction implemented in OpenCL on a FPGA embedded into a Power System

    Accélération sur serveur multi-GPUs de la reconstruction 3D d'une mousse de nickel par méthodes itératives algébriques régularisées

    Get PDF
    4 pagesNational audienceCet article traite de la reconstruction 3D en tomographie X de volume de grandes tailles (10243) à partir d'un nombre limité de projections. Lors d'acquisitions incomplètes en raison d'un temps limité d'acquisition ou bien dans le but de réduire la dose de rayon X, les méthodes analytiques standards de type rétroprojection filtrée offrent une qualité de reconstruction décevante. Les méthodes itératives algébriques régularisées permettent d'aller au delà de ces limitations mais souffrent d'un coût de calcul prohibitif pour son utilisation en pratique. Dans cet article, nous présentons la parallélisation des calculs des principaux opérateurs (projection, rétroprojection, convolution 3D) sur les processeurs graphiques de type "many cores". Cette parallélisation a permis une accélération significative de la reconstruction (facteur 300 sur données 10243 à l'aide de 8 GPUs). Par ailleurs, les résultats des reconstruction sur données simulées (phantom de Shepp Logan) et réelles en Contrôle Non Destructif (mousse de nickel) offrant une nette amélioration de la qualité de reconstruction par rapport à la méthode analytique standard FDK sont également présentés

    Four-dimensional tomographic reconstruction by time domain decomposition

    Full text link
    Since the beginnings of tomography, the requirement that the sample does not change during the acquisition of one tomographic rotation is unchanged. We derived and successfully implemented a tomographic reconstruction method which relaxes this decades-old requirement of static samples. In the presented method, dynamic tomographic data sets are decomposed in the temporal domain using basis functions and deploying an L1 regularization technique where the penalty factor is taken for spatial and temporal derivatives. We implemented the iterative algorithm for solving the regularization problem on modern GPU systems to demonstrate its practical use

    Evaluation of Single-Chip, Real-Time Tomographic Data Processing on FPGA - SoC Devices

    Get PDF
    A novel approach to tomographic data processing has been developed and evaluated using the Jagiellonian PET (J-PET) scanner as an example. We propose a system in which there is no need for powerful, local to the scanner processing facility, capable to reconstruct images on the fly. Instead we introduce a Field Programmable Gate Array (FPGA) System-on-Chip (SoC) platform connected directly to data streams coming from the scanner, which can perform event building, filtering, coincidence search and Region-Of-Response (ROR) reconstruction by the programmable logic and visualization by the integrated processors. The platform significantly reduces data volume converting raw data to a list-mode representation, while generating visualization on the fly.Comment: IEEE Transactions on Medical Imaging, 17 May 201
    corecore