
    High-performance blob-based iterative three-dimensional reconstruction in electron tomography using multi-GPUs

    Background: Three-dimensional (3D) reconstruction in electron tomography (ET) has emerged as a leading technique to elucidate the molecular structures of complex biological specimens. Blob-based iterative methods are advantageous for 3D reconstruction in ET but demand huge computational costs. Multiple graphics processing units (multi-GPUs) offer an affordable platform to meet these demands. However, a synchronous communication scheme between multi-GPUs leads to idle GPU time, and the weighted matrix involved in iterative methods cannot be loaded into GPUs, especially for large images, due to the limited available memory of GPUs. Results: In this paper we propose a multilevel parallel strategy combined with an asynchronous communication scheme and a blob-ELLR data structure to efficiently perform blob-based iterative reconstructions on multi-GPUs. The asynchronous communication scheme is used to minimize idle GPU time by overlapping communications with computations. The blob-ELLR data structure needs only about 1/16 of the storage space of the ELLPACK-R (ELLR) data structure and yields significant acceleration. Conclusions: Experimental results indicate that the multilevel parallel scheme combined with the asynchronous communication scheme and the blob-ELLR data structure allows efficient implementations of 3D reconstruction in ET on multi-GPUs.
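    As a point of reference for the ELLPACK-R layout mentioned above, the following is a minimal sketch (my own illustration, not the paper's code) of ELLR storage of a sparse system matrix W and the matrix-vector product used in each iteration of an algebraic reconstruction step; the paper's blob-ELLR variant compresses this to roughly 1/16 of the storage, which is not reproduced here. The function names are hypothetical.

    # Minimal ELLPACK-R (ELLR) sketch: per-row padded values/indices plus a
    # row-length array, so each GPU thread (here, each loop iteration) only
    # touches the nonzeros of its own row.
    import numpy as np

    def to_ellr(W):
        """Convert a dense matrix with many zeros to ELLPACK-R arrays."""
        rows, _ = W.shape
        rowlen = (W != 0).sum(axis=1)            # nonzeros per row
        k = int(rowlen.max())                    # padded row width
        vals = np.zeros((rows, k), dtype=W.dtype)
        idx = np.zeros((rows, k), dtype=np.int64)
        for r in range(rows):
            nz = np.flatnonzero(W[r])
            vals[r, :nz.size] = W[r, nz]
            idx[r, :nz.size] = nz
        return vals, idx, rowlen

    def ellr_matvec(vals, idx, rowlen, x):
        """y = W @ x using the ELLR arrays."""
        y = np.zeros(vals.shape[0], dtype=x.dtype)
        for r in range(vals.shape[0]):           # on a GPU: one thread per row
            n = rowlen[r]
            y[r] = vals[r, :n] @ x[idx[r, :n]]
        return y

    # Tiny usage example with a random sparse weight matrix.
    rng = np.random.default_rng(0)
    W = rng.random((6, 8)) * (rng.random((6, 8)) < 0.3)
    vals, idx, rowlen = to_ellr(W)
    x = rng.random(8)
    assert np.allclose(ellr_matvec(vals, idx, rowlen, x), W @ x)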

    GPU-accelerated iterative reconstruction for limited-data tomography in CBCT systems

    Standard cone-beam computed tomography (CBCT) involves the acquisition of at least 360 projections rotating through 360 degrees. Nevertheless, there are cases in which only a few projections can be taken in a limited angular span, such as during surgery, where rotation of the source-detector pair is limited to less than 180 degrees. Reconstruction of limited data with the conventional method proposed by Feldkamp, Davis and Kress (FDK) results in severe artifacts. Iterative methods may compensate for the lack of data by including additional prior information, although they imply a high computational burden and memory consumption. Results: We present an accelerated implementation of an iterative method for CBCT following the Split Bregman formulation, which reduces computational time through GPU-accelerated kernels. The implementation enables the reconstruction of large volumes (> 1024³ voxels) using partitioning strategies in forward- and back-projection operations. We evaluated the algorithm on small-animal data for scenarios with different numbers of projections, angular spans, and projection sizes. Reconstruction time varied linearly with the number of projections and quadratically with projection size but remained almost unchanged with angular span. Forward- and back-projection operations represent 60% of the total computational burden. Conclusion: Efficient implementation using parallel processing and large-memory management strategies together with GPU kernels enables the use of advanced reconstruction approaches that are needed in limited-data scenarios. Our GPU implementation showed a significant time reduction (up to 48x) compared to a CPU-only implementation, reducing total reconstruction time from several hours to a few minutes. This work has been supported by TEC2013-47270-R, RTC-2014-3028-1, TIN2016-79637-P (Spanish Ministerio de Economia y Competitividad), DPI2016-79075-R (Spanish Ministerio de Economia, Industria y Competitividad), CIBER CB07/09/0031 (Spanish Ministerio de Sanidad y Consumo), RePhrase 644235 (European Commission) and grant FPU14/03875 (Spanish Ministerio de Educacion, Cultura y Deporte)
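    To make the Split Bregman formulation referenced above concrete, here is a minimal sketch (my own 1D illustration, not the authors' GPU code) of the splitting applied to a TV-regularized denoising objective min_u |Du|_1 + (mu/2)||u - f||^2; in the paper the data term instead involves the cone-beam projector, and the subproblems run as GPU kernels over volume partitions. Parameter values below are arbitrary.

    # Split Bregman iteration for 1D total-variation denoising.
    import numpy as np

    def shrink(x, t):
        """Soft thresholding: closed-form solution of the d-subproblem."""
        return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

    def split_bregman_tv_1d(f, mu=10.0, lam=1.0, n_iter=100):
        n = f.size
        D = np.diff(np.eye(n), axis=0)            # finite-difference operator
        A = mu * np.eye(n) + lam * D.T @ D        # fixed u-step system matrix
        u = f.copy()
        d = np.zeros(n - 1)
        b = np.zeros(n - 1)
        for _ in range(n_iter):
            u = np.linalg.solve(A, mu * f + lam * D.T @ (d - b))  # quadratic u-step
            d = shrink(D @ u + b, 1.0 / lam)                      # TV shrinkage
            b = b + D @ u - d                                     # Bregman update
        return u

    # Usage: denoise a noisy piecewise-constant signal.
    rng = np.random.default_rng(1)
    clean = np.repeat([0.0, 1.0, 0.3], 50)
    u = split_bregman_tv_1d(clean + 0.1 * rng.standard_normal(clean.size))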

    Multi-GPU Acceleration of Iterative X-ray CT Image Reconstruction

    X-ray computed tomography is a widely used medical imaging modality for screening and diagnosing diseases and for image-guided radiation therapy treatment planning. Statistical iterative reconstruction (SIR) algorithms have the potential to significantly reduce image artifacts by minimizing a cost function that models the physics and statistics of the data acquisition process in X-ray CT. SIR algorithms have superior performance compared to traditional analytical reconstructions for a wide range of applications including nonstandard geometries arising from irregular sampling, limited angular range, missing data, and low-dose CT. The main hurdle for the widespread adoption of SIR algorithms in multislice X-ray CT reconstruction problems is their slow convergence rate and associated computational time. We seek to design and develop fast parallel SIR algorithms for clinical X-ray CT scanners. Each of the following approaches is implemented on real clinical helical CT data acquired from a Siemens Sensation 16 scanner and compared to the straightforward implementation of the Alternating Minimization (AM) algorithm of O’Sullivan and Benac [1]. We parallelize the computationally expensive projection and backprojection operations by exploiting the massively parallel hardware architecture of 3 NVIDIA TITAN X Graphical Processing Unit (GPU) devices with CUDA programming tools and achieve an average speedup of 72X over a straightforward CPU implementation. We implement a multi-GPU based voxel-driven multislice analytical reconstruction algorithm called Feldkamp-Davis-Kress (FDK) [2] and achieve an average overall speedup of 1382X over the baseline CPU implementation by using 3 TITAN X GPUs. Moreover, we propose a novel adaptive surrogate-function based optimization scheme for the AM algorithm, resulting in more aggressive update steps in every iteration. On average, we double the convergence rate of our baseline AM algorithm and also improve image quality by using the adaptive surrogate function. We extend the multi-GPU and adaptive surrogate-function based acceleration techniques to dual-energy reconstruction problems as well. Furthermore, we design and develop a GPU-based deep Convolutional Neural Network (CNN) to denoise simulated low-dose X-ray CT images. Our experiments show significant improvements in the image quality with our proposed deep CNN-based algorithm against some widely used denoising techniques including Block Matching 3-D (BM3D) and Weighted Nuclear Norm Minimization (WNNM). Overall, we have developed novel fast, parallel, computationally efficient methods to perform multislice statistical reconstruction and image-based denoising on clinically-sized datasets
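    The following is a minimal sketch (illustrative only, not the thesis code) of the idea of partitioning the expensive forward- and back-projection across several devices, with a simple iterative update built on top. The dense matrix A is a hypothetical stand-in for the CT projector, n_dev=3 mirrors the three GPUs mentioned above, and the plain least-squares loop stands in for the statistical AM updates.

    # Partitioned projectors plus a simple simultaneous iterative update.
    import numpy as np

    def forward_project(A, x, n_dev=3):
        blocks = np.array_split(np.arange(A.shape[0]), n_dev)   # rows per device
        return np.concatenate([A[idx] @ x for idx in blocks])   # per-device work

    def back_project(A, y, n_dev=3):
        blocks = np.array_split(np.arange(A.shape[0]), n_dev)
        partial = [A[idx].T @ y[idx] for idx in blocks]          # per-device work
        return np.sum(partial, axis=0)                           # host-side reduce

    def sirt(A, y, n_iter=50, step=None):
        """Gradient-descent reconstruction using the partitioned projectors."""
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2               # safe step size
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            x = x + step * back_project(A, y - forward_project(A, x))
        return x

    # Usage on a toy system.
    rng = np.random.default_rng(2)
    A = rng.random((120, 40))
    x_true = rng.random(40)
    x_rec = sirt(A, A @ x_true)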

    Review: Deep learning in electron microscopy

    Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity with deep learning. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy

    Algorithmic and infrastructural software development for cryo electron tomography

    Many Cryo Electron Microscopy (cryoEM) software packages have accumulated significant technical debt over the years, resulting in overcomplicated codebases that are costly to maintain and that slow down development. In this thesis, we advocate for the development of open-source cryoEM core libraries as a solution to this debt, with the ultimate goal of improving the developer and user experience. First, a brief summary of cryoEM is presented, with an emphasis on projection algorithms and tomography. Second, the requirements of modern and future cryoEM image processing are discussed. Third, a new experimental cryoEM core library written in modern C++ is introduced. This library prioritises performance and code reusability, and is designed around a few core functions which offer an efficient model for manipulating multidimensional arrays at an index-wise and element-wise level. C++ template metaprogramming allowed us to develop modular and transparent compute backends that provide strong CPU and GPU performance, unified in an easy-to-use interface. Fourth, new projection algorithms are described, notably a grid-driven approach to accurately insert and sample central slices in three-dimensional (3D) Fourier space. A Fourier-based fused backward-forward projection, further improving the computational efficiency and accuracy of reprojections, is also presented. Fifth, as part of our efforts to test and showcase the library, we have started to implement a tilt-series alignment package that gathers existing and new techniques into an automated pipeline. The current program first estimates the per-tilt translations and specimen stage rotation using a coarse alignment based on cosine stretching. It then fits the Thon rings of each tilt image as part of a global optimization to estimate the specimen inclination. Finally, we use our Fourier-based fused reprojection to efficiently refine the per-tilt translations, and are starting to explore ways to refine the per-tilt stage rotations
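    The central-slice insertion and sampling mentioned above rests on the Fourier slice theorem. Below is a minimal numerical check of that theorem (an illustration, not the library's C++ implementation): the 2D FFT of a projection of a volume along z equals the kz = 0 central slice of the volume's 3D FFT. The thesis' grid-driven approach generalizes this to arbitrarily tilted slices with interpolation, which is omitted here.

    # Fourier slice theorem check on a toy volume.
    import numpy as np

    rng = np.random.default_rng(3)
    vol = rng.random((32, 32, 32))                 # toy 3D density, axes (z, y, x)

    proj = vol.sum(axis=0)                         # real-space projection along z
    slice_from_proj = np.fft.fftn(proj)            # 2D FFT of the projection

    vol_ft = np.fft.fftn(vol)                      # 3D FFT of the volume
    central_slice = vol_ft[0]                      # kz = 0 plane

    assert np.allclose(slice_from_proj, central_slice)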

    Adorym: A multi-platform generic x-ray image reconstruction framework based on automatic differentiation

    We describe and demonstrate an optimization-based x-ray image reconstruction framework called Adorym. Our framework provides a generic forward model, allowing one code framework to be used for a wide range of imaging methods ranging from near-field holography to and fly-scan ptychographic tomography. By using automatic differentiation for optimization, Adorym has the flexibility to refine experimental parameters including probe positions, multiple hologram alignment, and object tilts. It is written with strong support for parallel processing, allowing large datasets to be processed on high-performance computing systems. We demonstrate its use on several experimental datasets to show improved image quality through parameter refinement
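    As a rough sketch of the optimization-based reconstruction idea described above (this is not Adorym's API; the operator A and the reconstruct() helper are hypothetical), the object is recovered by gradient descent on a data-fidelity loss. Here the gradient of ||A x - y||^2 is written by hand for a toy linear forward model, whereas Adorym obtains gradients by automatic differentiation, which is what lets the same loop also refine experimental parameters such as probe positions, alignments, and tilts that enter the forward model.

    # Gradient-descent reconstruction against a generic forward model.
    import numpy as np

    def reconstruct(A, y, n_iter=200, step=None):
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step for least squares
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            residual = A @ x - y                     # forward model minus data
            grad = A.T @ residual                    # what autodiff would return
            x -= step * grad
        return x

    # Usage on a toy problem.
    rng = np.random.default_rng(4)
    A = rng.random((80, 50))                         # stand-in forward operator
    x_true = rng.random(50)
    x_rec = reconstruct(A, A @ x_true)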

    System Characterizations and Optimized Reconstruction Methods for Novel X-ray Imaging

    In the past decade, many new X-ray-based imaging technologies have emerged for different diagnostic purposes or imaging tasks. However, each faces one or more specific problems that prevent it from being effectively or efficiently employed. In this dissertation, four different novel X-ray-based imaging technologies are discussed, including propagation-based phase-contrast (PB-XPC) tomosynthesis, differential X-ray phase-contrast tomography (D-XPCT), projection-based dual-energy computed radiography (DECR), and tetrahedron beam computed tomography (TBCT). System characteristics are analyzed or optimized reconstruction methods are proposed for these imaging modalities. In the first part, we investigated the unique properties of the propagation-based phase-contrast imaging technique when combined with X-ray tomosynthesis. The Fourier slice theorem implies that the high-frequency components collected in the tomosynthesis data can be more reliably reconstructed. It is observed that the fringes or boundary enhancement introduced by the phase-contrast effects can serve as an accurate indicator of the true depth position in the tomosynthesis in-plane image. In the second part, we derived a sub-space framework to reconstruct images from a few-view D-XPCT data set. By introducing a proper mask, the high-frequency contents of the image can be theoretically preserved in a certain region of interest. A two-step reconstruction strategy is developed to mitigate the risk of subtle structures being oversmoothed when the commonly used total-variation regularization is employed in the conventional iterative framework. In the third part, we proposed a practical method to improve the quantitative accuracy of projection-based dual-energy material decomposition. It is demonstrated that applying a total-projection-length constraint along with the dual-energy measurements achieves a stabilized numerical solution of the decomposition problem, thus overcoming the disadvantage of the conventional approach, which is extremely sensitive to noise corruption. In the final part, we described the modified filtered backprojection and iterative image reconstruction algorithms specifically developed for TBCT. Special parallelization strategies are designed to facilitate GPU computing, demonstrating the capability to produce high-quality reconstructed volumetric images at very fast computational speed. For all the investigations mentioned above, both simulation and experimental studies have been conducted to demonstrate the feasibility and effectiveness of the proposed methodologies
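    To illustrate the constrained dual-energy decomposition idea in the third part above, here is a minimal sketch (my own illustration with made-up attenuation numbers, not the dissertation's method or code): two basis-material thicknesses (t1, t2) are recovered from low- and high-energy log measurements, first by directly inverting the 2x2 system and then with a total-projection-length constraint t1 + t2 = L that stabilizes the solution against noise.

    # Dual-energy material decomposition with and without a length constraint.
    import numpy as np

    # Columns: effective attenuation of material 1 and material 2 at (low, high) energy.
    M = np.array([[0.50, 0.30],
                  [0.30, 0.25]])
    t_true = np.array([2.0, 3.0])                   # cm of each basis material
    L = t_true.sum()                                # known total path length

    rng = np.random.default_rng(5)
    m = M @ t_true + 0.01 * rng.standard_normal(2)  # noisy log measurements

    # Unconstrained solve: invert the 2x2 system (noise is amplified when M is
    # ill-conditioned, which is the problem the constraint is meant to tame).
    t_unconstrained = np.linalg.solve(M, m)

    # Constrained least squares: substitute t2 = L - t1 and solve for t1 only.
    a = M[:, 0] - M[:, 1]                           # sensitivity to t1 under the constraint
    b = m - L * M[:, 1]
    t1 = float(a @ b) / float(a @ a)
    t_constrained = np.array([t1, L - t1])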