967 research outputs found

BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images

Author: Baruffa Fabio
Cossio Pilar
Hummer Gerhard
Lindenstruth Volker
Rampp Markus
Rohr David
Publication venue: 'Elsevier BV'
Publication date: 21/09/2016

In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with highly dynamic or heterogeneous systems not easily handled in traditional EM reconstruction. However, the calculation of these posteriors for large numbers of particles and models is computationally demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software scales nearly ideally both on pure CPU and on CPU+GPU architectures, thus enabling Bayesian analysis of tens of thousands of images in a reasonable time. The general mathematical framework and robust algorithms are not limited to cryo-electron microscopy but can be generalized for electron tomography and other imaging experiments.
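
The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes the per-image log-probabilities log P(image | model) have already been computed (in BioEM this is the expensive part), that images are treated as independent so their log-probabilities add, and that the prior over models is uniform unless given. The function names `log_sum_exp` and `rank_models` are hypothetical and are not part of the BioEM software.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def rank_models(log_like, log_prior=None):
    """Rank models by posterior probability.

    log_like[m][i] is the (assumed precomputed) log P(image i | model m).
    Returns a list of (model_index, posterior) sorted from best to worst.
    """
    n_models = len(log_like)
    if log_prior is None:
        # Assumption: uniform prior over the candidate models.
        log_prior = [-math.log(n_models)] * n_models
    # Independent images: log-probabilities add up per model.
    log_post = [log_prior[m] + sum(log_like[m]) for m in range(n_models)]
    norm = log_sum_exp(log_post)
    posterior = [math.exp(lp - norm) for lp in log_post]
    return sorted(enumerate(posterior), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Three hypothetical models scored against two images.
    scores = [[-10.0, -12.0], [-11.0, -11.5], [-20.0, -25.0]]
    for model, prob in rank_models(scores):
        print(model, prob)
```

Working in log space avoids underflow when thousands of small per-image probabilities are multiplied together.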

arXiv.org e-Print Archive

MPG.PuRe

Efficient algorithms for the fast computation of space charge effects caused by charged particles in particle accelerators

Author: Zheng Dawei (gnd: 112231213X)
Publication venue: Universität Rostock Rostock

In this dissertation, a Poisson solver is improved in three ways: an efficient integrated Green's function; the discrete cosine transform of the efficient integrated Green's function values; and an implicitly zero-padded fast Fourier transform for the charge density. In addition, high-performance computing techniques are used to improve efficiency further, namely OpenMP, OpenMP+CUDA, MPI, and MPI+OpenMP parallelization. The examples and simulation results are compared with those of a commonly used Poisson solver to demonstrate the accuracy of the approach.
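
As background for the zero-padding mentioned in this abstract, the sketch below shows the standard explicit version of the idea in one dimension: padding the charge density to twice its length makes a cyclic (Fourier-domain) convolution reproduce the open-boundary sum. It is not the dissertation's implicit scheme. The kernel `soft_kernel` is a hypothetical stand-in rather than the integrated Green's function, and a naive O(n^2) DFT replaces an FFT library to keep the example self-contained.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(values[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                    for j in range(n))
        out.append(total / n if inverse else total)
    return out

def soft_kernel(offset):
    """Hypothetical softened 1/|r| kernel on a unit-spaced grid."""
    return 1.0 / (abs(offset) + 0.5)

def potential_direct(rho, green):
    """Open-boundary potential: phi[i] = sum_j green(i - j) * rho[j]."""
    n = len(rho)
    return [sum(green(i - j) * rho[j] for j in range(n)) for i in range(n)]

def potential_padded(rho, green):
    """Same potential via cyclic convolution on a grid padded to 2n."""
    n = len(rho)
    m = 2 * n
    rho_pad = list(rho) + [0.0] * n
    # Offsets 0..n-1 are stored first, negative offsets wrap to the end.
    green_pad = [green(k) if k < n else green(k - m) for k in range(m)]
    product = [a * b for a, b in zip(dft(rho_pad), dft(green_pad))]
    phi = dft(product, inverse=True)
    return [value.real for value in phi[:n]]

if __name__ == "__main__":
    rho = [0.0, 1.0, 2.0, 0.5, 0.0, 0.0, 1.5, 0.0]
    print(potential_direct(rho, soft_kernel))
    print(potential_padded(rho, soft_kernel))
```

Without the padding, the cyclic convolution would wrap charge from one end of the grid onto the other, which corresponds to periodic rather than open boundaries.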

RosDok Rostocker Dokumentenserver

GPU-based Iterative Cone Beam CT Reconstruction Using Tight Frame Regularization

Author: Bin Dong
Cai J
Cho S
Dong B
Gu X
Han G Liang Z You J
Hestenes M R
Jacobs F
Jia X
Li M
Men C
Men C H
Meyer Y
NVIDIA
Sharp G C
Shen Z W
Shen Z W Toh K C Yun S
Sidky E Y
Steve B Jiang
Tang J
Xu F
Xun Jia
Yan G R
Yifei Lou
Publication venue: 'IOP Publishing'
Publication date: 05/05/2011

X-ray imaging dose from serial cone-beam CT (CBCT) scans raises a clinical concern in most image guided radiation therapy procedures. The goal of this paper is to develop a fast GPU-based algorithm to reconstruct high quality CBCT images from undersampled and noisy projection data so as to lower the imaging dose. For this purpose, we have developed an iterative tight frame (TF) based CBCT reconstruction algorithm. A condition that a real CBCT image has a sparse representation under a TF basis is imposed in the iteration process as regularization to the solution. To speed up the computation, a multi-grid method is employed. Our GPU implementation has achieved high computational efficiency and a CBCT image of resolution 512×512×70 can be reconstructed in ~5 min. We have tested our algorithm on a digital NCAT phantom and a physical Catphan phantom. It is found that our TF-based algorithm is able to reconstruct CBCT images in the context of undersampling and low mAs levels. We have also quantitatively analyzed the reconstructed CBCT image quality in terms of modulation-transfer-function and contrast-to-noise ratio under various scanning conditions. The results confirm the high CBCT image quality obtained from our TF algorithm. Moreover, our algorithm has also been validated in a real clinical context using a head-and-neck patient case. Comparisons of the developed TF algorithm and the current state-of-the-art TV algorithm have also been made in various cases studied in terms of reconstructed image quality and computation efficiency. Comment: 24 pages, 8 figures, accepted by Phys. Med. Bio
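
One common way to impose sparsity under a frame is to soft-threshold the transform coefficients inside each iteration. The sketch below shows that shrinkage step alone, on a 1D signal. The abstract does not state which tight frame or which update rule the paper uses, so the one-level orthonormal Haar transform (a trivial tight frame) and the helper names are assumptions chosen for brevity.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length list."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2
              for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward: exact reconstruction of the signal."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal

def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold`."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def shrink(signal, threshold):
    """Sparsify the detail coefficients, then transform back."""
    approx, detail = haar_forward(signal)
    detail = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, detail)

if __name__ == "__main__":
    noisy = [1.0, 1.1, 5.0, 4.9]
    print(shrink(noisy, 0.05))
```

In a full reconstruction loop, a step like `shrink` would alternate with a data-fidelity update against the measured projections; that part is omitted here.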

arXiv.org e-Print Archive

Crossref

A scalable, efficient scheme for evaluation of stencil computations over unstructured meshes

Author: King James
Kirby Robert Michael
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Stencil computations are a common class of operations that appear in many computational scientific and engineering applications. Stencil computations often benefit from compile-time analysis, exploiting data-locality, and parallelism. Post-processing of discontinuous Galerkin (dG) simulation solutions with B-spline kernels is an example of a numerical method which requires evaluating computationally intensive stencil operations over a mesh. Previous work on stencil computations has focused on structured meshes, while giving little attention to unstructured meshes. Performing stencil operations over an unstructured mesh requires sampling of heterogeneous elements which often leads to inefficient memory access patterns and limits data locality/reuse. In this paper, we present an efficient method for performing stencil computations over unstructured meshes which increases data-locality and cache efficiency, and a scalable approach for stencil tiling and concurrent execution. We provide experimental results in the context of post-processing of dG solutions that demonstrate the effectiveness of our approach.

The University of Utah: J. Willard Marriott Digital Library

The State of the Art in Flow Visualization: Dense and Texture-Based Techniques

Author: Benjamin Vrolijk
Daniel Weiskopf
Frits H. Post
Helmut Doleisch
Helwig Hauser
Robert S. Laramee
Publication venue: 'Wiley'
Publication date: 01/01/2004

Flow visualization has been a very attractive component of scientific visualization research for a long time. Usually very large multivariate datasets require processing. These datasets often consist of a large number of sample locations and several time steps. The steadily increasing performance of computers has recently become a driving factor for a reemergence in flow visualization research, especially in texture-based techniques. In this paper, dense, texture-based flow visualization techniques are discussed. This class of techniques attempts to provide a complete, dense representation of the flow field with high spatio-temporal coherency. An attempt at categorizing closely related solutions is incorporated and presented. Fundamentals are briefly addressed, as well as advantages and disadvantages of the methods. Categories and Subject Descriptors (according to ACM CCS): I.3 [Computer Graphics]: visualization, flow visualization, computational flow visualization

CiteSeerX

Cronfa at Swansea University

Three-Dimensional Photoacoustic Computed Tomography: Imaging Models and Reconstruction Algorithms

Author: Wang Kun
Publication venue: Washington University Open Scholarship
Publication date: 01/01/2012

Photoacoustic computed tomography (PACT), also known as optoacoustic tomography, is a rapidly emerging imaging modality that holds great promise for a wide range of biomedical imaging applications. Much effort has been devoted to the investigation of imaging physics and the optimization of experimental designs. Meanwhile, a variety of image reconstruction algorithms have been developed for the purpose of computed tomography. Most of these algorithms assume full knowledge of the acoustic pressure function on a measurement surface that either encloses the object or extends to infinity, which poses many difficulties for practical applications. To overcome these limitations, iterative image reconstruction algorithms have been actively investigated. However, little work has been conducted on imaging models that incorporate the characteristics of data acquisition systems. Moreover, when applied to experimental data, most studies simplify the inherent three-dimensional wave propagation to two-dimensional imaging models by introducing heuristic assumptions on the transducer responses and/or the object structures. One important reason is that three-dimensional image reconstruction is computationally burdensome. The inaccurate imaging models severely limit the performance of iterative image reconstruction algorithms in practice. In this dissertation, we propose a framework to construct imaging models that incorporate the characteristics of ultrasonic transducers. Based on the imaging models, we systematically investigate various iterative image reconstruction algorithms, including advanced algorithms that employ total variation-norm regularization. In order to accelerate three-dimensional image reconstruction, we develop parallel implementations on graphics processing units. In addition, we derive a fast Fourier-transform based analytical image reconstruction formula.
By use of iterative image reconstruction algorithms based on the proposed imaging models, PACT imaging scanners can have a compact size while maintaining high spatial resolution. The research demonstrates, for the first time, the feasibility and advantages of iterative image reconstruction algorithms in three-dimensional PACT.

Washington University St. Louis: Open Scholarship

Doctor of Philosophy

Author: King James Sokhom
Publication venue: University of Utah
Publication date: 01/01/2017

Memory access irregularities are a major bottleneck for bandwidth-limited problems on Graphics Processing Unit (GPU) architectures. GPU memory systems are designed to allow consecutive memory accesses to be coalesced into a single memory access. Noncontiguous accesses within a parallel group of threads working in lockstep may cause serialized memory transfers. Irregular algorithms may have data-dependent control flow and memory access, which requires runtime information to be evaluated. Compile-time methods for evaluating parallelism, such as static dependence graphs, are not capable of evaluating irregular algorithms. The goals of this dissertation are to study irregularities within the context of unstructured mesh and sparse matrix problems, analyze the impact of vectorization widths on irregularities, and present data-centric methods that improve control flow and memory access irregularity within those contexts. Reordering associative operations has often been exploited for performance gains in parallel algorithms. This dissertation presents a method for associative reordering of stencil computations over unstructured meshes that increases data reuse through caching. This novel parallelization scheme offers considerable speedups over standard methods. Vectorization widths can have significant impact on performance in vectorized computations. Although the hardware vector width is generally fixed, the logical vector width used within a computation can range from one up to the width of the computation. Significant performance differences can occur due to thread scheduling and resource limitations. This dissertation analyzes the impact of vectorization widths on dense numerical computations such as 3D dG postprocessing. It is difficult to efficiently perform dynamic updates on traditional sparse matrix formats. Explicitly controlling memory segmentation allows for in-place dynamic updates in sparse matrices.
Dynamically updating the matrix without rebuilding or sorting greatly improves processing time and overall throughput. This dissertation presents a new sparse matrix format, dynamic compressed sparse row (DCSR), which allows for dynamic streaming updates to a sparse matrix. A new method for parallel sparse matrix-matrix multiplication (SpMM) that uses dynamic updates is also presented.
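
The coalescing behaviour described at the start of this abstract can be made concrete with a small host-side model. The sketch counts how many aligned memory segments a group of threads touches in one load. The 32-thread group, the 128-byte segment size, and the function name `count_transactions` are illustrative assumptions, not figures taken from the dissertation, and real hardware differs between GPU generations.

```python
def count_transactions(addresses, segment_bytes=128):
    """Number of distinct aligned segments touched by one group of loads."""
    return len({address // segment_bytes for address in addresses})

WARP_SIZE = 32   # assumed number of threads executing in lockstep
FLOAT_BYTES = 4  # each thread loads one 4-byte value

# Consecutive threads read consecutive floats: one 128-byte segment.
contiguous = [FLOAT_BYTES * t for t in range(WARP_SIZE)]

# Each thread reads a float 128 bytes after its neighbour's.
strided = [FLOAT_BYTES * 32 * t for t in range(WARP_SIZE)]

# Indirect access through an index list, as in a mesh or sparse matrix.
indices = [(7 * t * t + 3) % 1024 for t in range(WARP_SIZE)]
gathered = [FLOAT_BYTES * i for i in indices]

if __name__ == "__main__":
    print("contiguous:", count_transactions(contiguous))
    print("strided:   ", count_transactions(strided))
    print("gathered:  ", count_transactions(gathered))
```

Reordering data or work so that threads in the same group touch neighbouring addresses lowers this count, which is the effect the data-centric methods in the abstract aim for.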

The University of Utah: J. Willard Marriott Digital Library

Doctor of Philosophy

Author: Kim Mark
Publication venue: University of Utah
Publication date: 01/01/2016

Visualizing surfaces is a fundamental technique in computer science and is frequently used across a wide range of fields such as computer graphics, biology, engineering, and scientific visualization. In many cases, visualizing an interface between boundaries can provide meaningful analysis or simplification of complex data. Some examples include physical simulation for animation, multimaterial mesh extraction in biophysiology, flow on airfoils in aeronautics, and integral surfaces. However, the quest for high-quality visualization, coupled with increasingly complex data, comes with a high computational cost. Therefore, new techniques are needed to solve surface visualization problems within a reasonable amount of time while also providing sophisticated visuals that are meaningful to scientists and engineers. In this dissertation, novel techniques are presented to facilitate surface visualization. First, a particle system for mesh extraction is parallelized on the graphics processing unit (GPU) with a red-black update scheme to achieve an order of magnitude speed-up over a central processing unit (CPU) implementation. Next, extending the red-black technique to multiple materials showed inefficiencies on the GPU. Therefore, we borrow the underlying data structure from the closest point method, the closest point embedding, and the particle system solver is switched to a hierarchical octree-based approach on the GPU. Third, to demonstrate that the closest point embedding is a fast, flexible data structure for surface particles, it is adapted to unsteady surface flow visualization at near-interactive speeds. Finally, the closest point embedding is a three-dimensional dense structure that does not scale well. Therefore, we introduce a closest point sparse octree that allows the closest point embedding to scale to higher resolution. Further, we demonstrate unsteady line integral convolution using the closest point method.
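
The red-black update scheme named in this abstract is easiest to see on a regular grid. The sketch below applies it to a Gauss-Seidel relaxation of the 2D Laplace equation, used here only as a stand-in for the dissertation's particle system. Cells are coloured like a checkerboard, and every cell of one colour depends only on cells of the other colour, so each half-sweep could run fully in parallel. The function names are hypothetical.

```python
def red_black_sweep(grid):
    """One Gauss-Seidel sweep in red-black order (updates grid in place)."""
    rows, cols = len(grid), len(grid[0])
    for color in (0, 1):  # 0 = "red" cells, 1 = "black" cells
        # All cells of one colour read only neighbours of the other colour,
        # so this double loop has no dependencies between its iterations.
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == color:
                    grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j] +
                                         grid[i][j - 1] + grid[i][j + 1])

def relax(grid, sweeps):
    """Apply a fixed number of red-black sweeps."""
    for _ in range(sweeps):
        red_black_sweep(grid)
    return grid

def make_grid(size, boundary_value):
    """Square grid with a constant boundary and a zero interior."""
    grid = [[0.0] * size for _ in range(size)]
    for k in range(size):
        grid[0][k] = grid[size - 1][k] = boundary_value
        grid[k][0] = grid[k][size - 1] = boundary_value
    return grid

if __name__ == "__main__":
    solution = relax(make_grid(8, 1.0), sweeps=200)
    print(solution[3][3])
```

On a GPU, each colour would typically map to one kernel launch, with a synchronization point between the two colours.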

The University of Utah: J. Willard Marriott Digital Library