41 research outputs found

    Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm

    Get PDF
    The increasing popularity of massively parallel architectures based on accelerators have opened up the possibility of significantly improving the performance of X-ray computed tomography (CT) applications towards achieving real-time imaging. However, achieving this goal is a challenging process, as most CT applications have not been designed for exploiting the amount of parallelism existing in these architectures. In this paper we present the massively parallel implementation and optimization of Mangoose(++), a CT application for reconstructing 3D volumes from 20 images collected by scanners based on cone-beam geometry. The main contribution of this paper are the following. First, we develop a modular application design that allows to exploit the functional parallelism inside the application and to facilitate the parallelization of individual application phases. Second, we identify a set of optimizations that can be applied individually and in combination for optimally deploying the application on a massively parallel multi-GPU system. Third, we present a study of surfing the optimization space of the modularized application and demonstrate that a significant benefit can be obtained from employing the adequate combination of application optimizations. (C) 2014 Elsevier Inc. All rights reserved.This work was partially funded by the Spanish Ministry of Science and Technology under the grant TIN2010-16497, the AMIT project (CEN-20101014) from the CDTI-CENIT program, RECAVA-RETIC Network (RD07/0014/2009), projects TEC2010-21619-C04-01, TEC2011-28972-C02-01, and PI11/00616 from the Spanish Ministerio de Ciencia e Innovacion, ARTEMIS program (S2009/DPI-1802), from the Comunidad de Madrid

    GPU-accelerated iterative reconstruction for limited-data tomography in CBCT systems

    Get PDF
    Standard cone-beam computed tomography (CBCT) involves the acquisition of at least 360 projections rotating through 360 degrees. Nevertheless, there are cases in which only a few projections can be taken in a limited angular span, such as during surgery, where rotation of the source-detector pair is limited to less than 180 degrees. Reconstruction of limited data with the conventional method proposed by Feldkamp, Davis and Kress (FDK) results in severe artifacts. Iterative methods may compensate for the lack of data by including additional prior information, although they imply a high computational burden and memory consumption. Results: We present an accelerated implementation of an iterative method for CBCT following the Split Bregman formulation, which reduces computational time through GPU-accelerated kernels. The implementation enables the reconstruction of large volumes (> 1024 3 pixels) using partitioning strategies in forward- and back-projection operations. We evaluated the algorithm on small-animal data for different scenarios with different numbers of projections, angular span, and projection size. Reconstruction time varied linearly with the number of projections and quadratically with projection size but remained almost unchanged with angular span. Forward- and back-projection operations represent 60% of the total computational burden. Conclusion: Efficient implementation using parallel processing and large-memory management strategies together with GPU kernels enables the use of advanced reconstruction approaches which are needed in limited-data scenarios. Our GPU implementation showed a significant time reduction (up to 48x) compared to a CPU-only implementation, resulting in a total reconstruction time from several hours to few minutes.This work has been supported by TEC2013-47270-R, RTC-2014-3028-1, TIN2016-79637-P (Spanish Ministerio de Economia y Competitividad), DPI2016-79075-R (Spanish Ministerio de Economia, Industria y Competitividad), CIBER CB07/09/0031 (Spanish Ministerio de Sanidad y Consumo), RePhrase 644235 (European Commission) and grant FPU14/03875 (Spanish Ministerio de Educacion, Cultura y Deporte)

    FUX-Sim: implementation of a fast universal simulation/reconstruction framework for X-ray systems

    Get PDF
    The availability of digital X-ray detectors, together with advances in reconstruction algorithms, creates an opportunity for bringing 3D capabilities to conventional radiology systems. The downside is that reconstruction algorithms for non-standard acquisition protocols are generally based on iterative approaches that involve a high computational burden. The development of new flexible X-ray systems could benefit from computer simulations, which may enable performance to be checked before expensive real systems are implemented. The development of simulation/reconstruction algorithms in this context poses three main difficulties. First, the algorithms deal with large data volumes and are computationally expensive, thus leading to the need for hardware and software optimizations. Second, these optimizations are limited by the high flexibility required to explore new scanning geometries, including fully configurable positioning of source and detector elements.This work was funded by the projects TEC2013-47270-R, RTC-2014-3028-1, TIN2016-79637-P, DPI2016-79075-R, and the Cardiovascular Research Network (RIC, RD12/0042/0057) from the Spanish Ministerio de Economía y Competitividad (www.mineco.gob.es/) and, FPU14/03875 grant from the Spanish Ministerio de Educación, Cultura y Deporte (http://www.mecd.gob.es). We also thank NVidia for providing the Tesla K40 device used to perform the experiments

    Novel high performance techniques for high definition computer aided tomography

    Get PDF
    Mención Internacional en el título de doctorMedical image processing is an interdisciplinary field in which multiple research areas are involved: image acquisition, scanner design, image reconstruction algorithms, visualization, etc. X-Ray Computed Tomography (CT) is a medical imaging modality based on the attenuation suffered by the X-rays as they pass through the body. Intrinsic differences in attenuation properties of bone, air, and soft tissue result in high-contrast images of anatomical structures. The main objective of CT is to obtain tomographic images from radiographs acquired using X-Ray scanners. The process of building a 3D image or volume from the 2D radiographs is known as reconstruction. One of the latest trends in CT is the reduction of the radiation dose delivered to patients through the decrease of the amount of acquired data. This reduction results in artefacts in the final images if conventional reconstruction methods are used, making it advisable to employ iterative reconstruction algorithms. There are numerous reconstruction algorithms available, from which we can highlight two specific types: traditional algorithms, which are fast but do not enable the obtaining of high quality images in situations of limited data; and iterative algorithms, slower but more reliable when traditional methods do not reach the quality standard requirements. One of the priorities of reconstruction is the obtaining of the final images in near real time, in order to reduce the time spent in diagnosis. To accomplish this objective, new high performance techniques and methods for accelerating these types of algorithms are needed. This thesis addresses the challenges of both traditional and iterative reconstruction algorithms, regarding acceleration and image quality. One common approach for accelerating these algorithms is the usage of shared-memory and heterogeneous architectures. In this thesis, we propose a novel simulation/reconstruction framework, namely FUX-Sim. This framework follows the hypothesis that the development of new flexible X-ray systems can benefit from computer simulations, which may also enable performance to be checked before expensive real systems are implemented. Its modular design abstracts the complexities of programming for accelerated devices to facilitate the development and evaluation of the different configurations and geometries available. In order to obtain near real execution times, low-level optimizations for the main components of the framework are provided for Graphics Processing Unit (GPU) architectures. Other alternative tackled in this thesis is the acceleration of iterative reconstruction algorithms by using distributed memory architectures. We present a novel architecture that unifies the two most important computing paradigms for scientific computing nowadays: High Performance Computing (HPC). The proposed architecture combines Big Data frameworks with the advantages of accelerated computing. The proposed methods presented in this thesis provide more flexible scanner configurations as they offer an accelerated solution. Regarding performance, our approach is as competitive as the solutions found in the literature. Additionally, we demonstrate that our solution scales with the size of the problem, enabling the reconstruction of high resolution images.El procesamiento de imágenes médicas es un campo interdisciplinario en el que participan múltiples áreas de investigación como la adquisición de imágenes, diseño de escáneres, algoritmos de reconstrucción de imágenes, visualización, etc. La tomografía computarizada (TC) de rayos X es una modalidad de imágen médica basada en el cálculo de la atenuación sufrida por los rayos X a medida que pasan por el cuerpo a escanear. Las diferencias intrínsecas en la atenuación de hueso, aire y tejido blando dan como resultado imágenes de alto contraste de estas estructuras anatómicas. El objetivo principal de la TC es obtener imágenes tomográficas a partir estas radiografías obtenidas mediante escáneres de rayos X. El proceso de construir una imagen o volumen en 3D a partir de las radiografías 2D se conoce como reconstrucción. Una de las últimas tendencias en la tomografía computarizada es la reducción de la dosis de radiación administrada a los pacientes a través de la reducción de la cantidad de datos adquiridos. Esta reducción da como resultado artefactos en las imágenes finales si se utilizan métodos de reconstrucción convencionales, por lo que es aconsejable emplear algoritmos de reconstrucción iterativos. Existen numerosos algoritmos de reconstrucción disponibles a partir de los cuales podemos destacar dos categorías: algoritmos tradicionales, rápidos pero no permiten obtener imágenes de alta calidad en situaciones en las que los datos son limitados; y algoritmos iterativos, más lentos pero más estables en situaciones donde los métodos tradicionales no alcanzan los requisitos en cuanto a la calidad de la imagen. Una de las prioridades de la reconstrucción es la obtención de las imágenes finales en tiempo casi real, con el fin de reducir el tiempo de diagnóstico. Para lograr este objetivo, se necesitan nuevas técnicas y métodos de alto rendimiento para acelerar estos algoritmos. Esta tesis aborda los desafíos de los algoritmos de reconstrucción tradicionales e iterativos, con respecto a la aceleración y la calidad de imagen. Un enfoque común para acelerar estos algoritmos es el uso de arquitecturas de memoria compartida y heterogéneas. En esta tesis, proponemos un nuevo sistema de simulación/reconstrucción, llamado FUX-Sim. Este sistema se construye alrededor de la hipótesis de que el desarrollo de nuevos sistemas de rayos X flexibles puede beneficiarse de las simulaciones por computador, en los que también se puede realizar un control del rendimiento de los nuevos sistemas a desarrollar antes de su implementación física. Su diseño modular abstrae las complejidades de la programación para aceleradores con el objetivo de facilitar el desarrollo y la evaluación de las diferentes configuraciones y geometrías disponibles. Para obtener ejecuciones en casi tiempo real, se proporcionan optimizaciones de bajo nivel para los componentes principales del sistema en las arquitecturas GPU. Otra alternativa abordada en esta tesis es la aceleración de los algoritmos de reconstrucción iterativa mediante el uso de arquitecturas de memoria distribuidas. Presentamos una arquitectura novedosa que unifica los dos paradigmas informáticos más importantes en la actualidad: computación de alto rendimiento (HPC) y Big Data. La arquitectura propuesta combina sistemas Big Data con las ventajas de los dispositivos aceleradores. Los métodos propuestos presentados en esta tesis proporcionan configuraciones de escáner más flexibles y ofrecen una solución acelerada. En cuanto al rendimiento, nuestro enfoque es tan competitivo como las soluciones encontradas en la literatura. Además, demostramos que nuestra solución escala con el tamaño del problema, lo que permite la reconstrucción de imágenes de alta resolución.This work has been mainly funded thanks to a FPU fellowship (FPU14/03875) from the Spanish Ministry of Education. It has also been partially supported by other grants: • DPI2016-79075-R. “Nuevos escenarios de tomografía por rayos X”, from the Spanish Ministry of Economy and Competitiveness. • TIN2016-79637-P Towards unification of HPC and Big Data Paradigms from the Spanish Ministry of Economy and Competitiveness. • Short-term scientific missions (STSM) grant from NESUS COST Action IC1305. • TIN2013-41350-P, Scalable Data Management Techniques for High-End Computing Systems from the Spanish Ministry of Economy and Competitiveness. • RTC-2014-3028-1 NECRA Nuevos escenarios clinicos con radiología avanzada from the Spanish Ministry of Economy and Competitiveness.Programa Oficial de Doctorado en Ciencia y Tecnología InformáticaPresidente: José Daniel García Sánchez.- Secretario: Katzlin Olcoz Herrero.- Vocal: Domenico Tali

    Advanced Image Reconstruction for Limited View Cone-Beam CT

    Get PDF
    In a standard CT acquisition, a high number of projections is obtained around the sample, generally covering an angular span of 360º. However, complexities may arise in some clinical scenarios such as surgery and emergency rooms or Intensive Care Units (ICUs) when the accessibility to the patient is limited due to the monitoring equipment attached. X-ray systems used in these cases are usually C-arms that only enable the acquisition of planar images within a limited angular range. Obtaining 3D images in these scenarios could be extremely interesting for diagnosis or image guided surgery. This would be based on the acquisition of a small number of projections within a limited angular span. Reconstruction of these limited-view data with conventional algorithms such as FDK result in streak artifacts and shape distortion deteriorating the image quality. In order to reduce these artifacts, advanced reconstruction methods can be used to compensate the lack of data by the incorporation of prior information. This bachelor thesis is framed on one of the lines of research carried out by the Biomedical Imaging and Instrumentation group from the Bioengineering and Aerospace Department of Universidad Carlos III de Madrid working jointly with the Hospital General Universitario Gregorio Marañón through its Instituto de Investigación Sanitaria. This line of research is carried out in collaboration with the company SEDECAL, which enables the direct transfer to the industry. Previous work showed that a new iterative reconstruction method proposed by the group, SCoLD, is able to restore the altered contour of the object, suppress greatly the streak artifacts and recover to some extend the image quality by restricting the space of search with a surface constraint. However, the evaluation was only carried out using a simulated mask that described the shape of the object obtained by thresholding a previous CT image of the sample, which is generally not available in real scenarios. The general objective of this thesis is the designing of a complete workflow to implement SCoLD in real scenarios. For that purpose, the 3D scanner Artec Eva was chosen to acquire the surface information of the sample, which was then transformed to be usable as prior information for SCoLD method. The evaluation done in a rodent study showed high similarity between the mask obtained from real data and the ideal mask obtained from a CT. Distortions in shape and streak artifacts in the limited-view FDK reconstruction were greatly reduced when using the real mask with the SCoLD reconstruction and the image quality was highly improved demonstrating the feasibility of the proposal.Grado en Ingeniería Biomédica (Plan 2010

    Quantum state characterization with deep neural networks

    Get PDF
    In this licentiate thesis, I explain some of the interdisciplinary topics connecting machine learning to quantum physics. The thesis is based on the two appended papers, where deep neural networks were used for the characterization of quantum systems. I discuss the connections between parameter estimation, inverse problems and machine learning to put the results of the appended papers in perspective. In these papers, we have shown how to incorporate prior knowledge of quantum physics and noise models in generative adversarial neural networks. This thesis further discusses how automatic differentiation techniques allow training such custom neural-network-based methods to characterize quantum systems or learn their description. In the appended papers, we have demonstrated that the neural-network approach could learn a quantum state description from an order of magnitude fewer data points and faster than an iterative maximum-likelihood estimation technique. The goal of the thesis is to bring such tools and techniques from machine learning to the physicist’s arsenal and to explore the intersection between quantum physics and machine learning

    Eight Biennial Report : April 2005 – March 2007

    No full text
    corecore