A portable platform for accelerated PIC codes and its application to GPUs using OpenACC
We present a portable platform, called PIC_ENGINE, for accelerating
Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as
Graphics Processing Units (GPUs). The aim of this development is efficient
simulations on future exascale systems by allowing different parallelization
strategies depending on the application problem and the specific architecture.
To this end, this platform contains the basic steps of the PIC algorithm and
has been designed as a test bed for different algorithmic options and data
structures. Among the architectures that this engine can explore, particular
attention is given here to systems equipped with GPUs. The study demonstrates
that our portable PIC implementation based on the OpenACC programming model can
achieve performance closely matching theoretical predictions. Using the Cray
XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we
show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the
one on an Intel Sandybridge 8-core CPU by a factor of 3.4.
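The "basic steps of the PIC algorithm" mentioned in this abstract are usually described as charge deposition, field solve, field gather, and particle push. The following is a minimal 1D periodic sketch of that cycle in normalized units. It is not PIC_ENGINE code: the function names, the linear weighting, and the crude first-order Gauss's-law solver are all assumptions chosen for brevity.

```python
def deposit(positions, n_cells, dx, charge=1.0):
    """Scatter particle charge onto grid nodes with linear weighting."""
    rho = [0.0] * n_cells
    for x in positions:
        cell = int(x / dx) % n_cells
        frac = x / dx - int(x / dx)
        rho[cell] += charge * (1.0 - frac) / dx
        rho[(cell + 1) % n_cells] += charge * frac / dx
    return rho


def solve_field(rho, dx):
    """Integrate dE/dx = rho - mean(rho) on a periodic grid (illustrative solver).

    Subtracting the mean density stands in for a neutralizing background.
    """
    mean = sum(rho) / len(rho)
    field, acc = [], 0.0
    for r in rho:
        acc += (r - mean) * dx
        field.append(acc)
    offset = sum(field) / len(field)  # remove the arbitrary constant
    return [e - offset for e in field]


def gather(field, x, dx):
    """Interpolate the grid field to a particle position (linear weighting)."""
    n = len(field)
    cell = int(x / dx) % n
    frac = x / dx - int(x / dx)
    return field[cell] * (1.0 - frac) + field[(cell + 1) % n] * frac


def push(positions, velocities, field, dx, dt, q_over_m=-1.0):
    """Update velocities, then positions, wrapping positions periodically."""
    length = len(field) * dx
    for i, x in enumerate(positions):
        velocities[i] += q_over_m * gather(field, x, dx) * dt
        positions[i] = (x + velocities[i] * dt) % length


def step(positions, velocities, n_cells, dx, dt):
    """One full PIC cycle: deposit, solve, gather and push."""
    rho = deposit(positions, n_cells, dx)
    field = solve_field(rho, dx)
    push(positions, velocities, field, dx, dt)
    return rho, field
```

In a sketch like this the deposit and gather loops are where the choice of particle data layout matters most, which is the kind of algorithmic option the abstract says the platform is meant to test.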
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
In cryo-electron microscopy (EM), molecular structures are determined from
large numbers of projection images of individual particles. To harness the full
power of this single-molecule information, we use the Bayesian inference of EM
(BioEM) formalism. By ranking structural models using posterior probabilities
calculated for individual images, BioEM in principle addresses the challenge of
working with highly dynamic or heterogeneous systems not easily handled in
traditional EM reconstruction. However, the calculation of these posteriors for
large numbers of particles and models is computationally demanding. Here we
present highly parallelized, GPU-accelerated computer software that performs
this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI
parallelization combined with both CPU and GPU computing. The resulting BioEM
software scales nearly ideally both on pure CPU and on CPU+GPU architectures,
thus enabling Bayesian analysis of tens of thousands of images in a reasonable
time. The general mathematical framework and robust algorithms are not limited
to cryo-electron microscopy but can be generalized for electron tomography and
other imaging experiments.
Master slave en-face OCT/SLO
Master Slave optical coherence tomography (MS-OCT) is an OCT method that does not require resampling of data and can be used to deliver en-face images from several depths simultaneously. As the MS-OCT method requires substantial computational resources, the number of en-face images at multiple depths that can be produced in real time is limited. Here, we demonstrate progress in taking advantage of the parallel processing feature of the MS-OCT technology. Harnessing the capabilities of graphics processing units (GPUs), information from 384 depth positions is acquired in one raster with real-time display of up to 40 en-face OCT images. These exhibit comparable resolution and sensitivity to the images produced using the conventional Fourier domain based method. The GPU facilitates versatile real-time selection of parameters, such as the depth positions of the 40 images out of the set of 384 depth locations, as well as their axial resolution. In each updated displayed frame, in parallel with the 40 en-face OCT images, a scanning laser ophthalmoscopy (SLO) lookalike image is presented together with two B-scan OCT images oriented along orthogonal directions. The thickness of the SLO lookalike image is dynamically determined by the choice of number of en-face OCT images displayed in the frame and the choice of differential axial distance between them.
Interactive ray shading of FRep objects
In this paper we present a method for interactive rendering of general procedurally defined functionally represented (FRep) objects using graphics hardware acceleration, namely Graphics Processing Units (GPUs). We obtain interactive rates by using GPU acceleration for all computations in the rendering algorithm, such as ray-surface intersection, function evaluation and normal computations. We compute primary rays as well as secondary rays for shadows, reflection and refraction to obtain high-quality output visualization and to allow further extension to ray-tracing of FRep objects. The algorithm is well-suited for modern GPUs and provides acceptable interactive rates with good quality of the results. A wide range of objects can be rendered, including traditional skeletal implicit surfaces, constructive solids, and purely procedural objects such as 3D fractals.
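The three per-ray computations named in this abstract (function evaluation, ray-surface intersection, and normal computation) can be illustrated on the CPU with a short sketch. It assumes the common FRep convention that the defining function is non-negative inside the solid, and it uses fixed-step marching with bisection and a central-difference gradient. These are generic choices made for illustration, not necessarily the paper's GPU algorithm, and all function names are hypothetical.

```python
import math


def sphere(p, radius=1.0):
    """Defining function of a sphere: >= 0 inside, < 0 outside (assumed convention)."""
    x, y, z = p
    return radius * radius - (x * x + y * y + z * z)


def union(f, g):
    """Set-theoretic union of two FRep objects, using max as the simplest choice."""
    return lambda p: max(f(p), g(p))


def point_on_ray(origin, direction, t):
    return tuple(o + t * d for o, d in zip(origin, direction))


def intersect(f, origin, direction, t_max=10.0, step=0.01, iterations=40):
    """March along the ray until f changes sign, then refine the hit by bisection.

    Returns the ray parameter t of the first detected hit, or None on a miss.
    Fixed-step marching can skip features thinner than `step`.
    """
    t_prev = 0.0
    f_prev = f(point_on_ray(origin, direction, t_prev))
    t = step
    while t <= t_max:
        f_cur = f(point_on_ray(origin, direction, t))
        if f_prev < 0.0 <= f_cur:
            lo, hi = t_prev, t  # f < 0 at lo, f >= 0 at hi
            for _ in range(iterations):
                mid = 0.5 * (lo + hi)
                if f(point_on_ray(origin, direction, mid)) >= 0.0:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        t_prev, f_prev = t, f_cur
        t += step
    return None


def normal(f, p, h=1e-4):
    """Outward unit normal: negated, normalized central-difference gradient of f."""
    grad = []
    for axis in range(3):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        grad.append((f(tuple(plus)) - f(tuple(minus))) / (2.0 * h))
    length = math.sqrt(sum(g * g for g in grad))
    return tuple(-g / length for g in grad)
```

Secondary rays for shadows or reflection would reuse `intersect` with a new origin at the hit point and a direction derived from `normal`.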
Volume MLS Ray Casting
The method of Moving Least Squares (MLS) is a popular framework for reconstructing continuous functions from scattered data due to its rich mathematical properties and well-understood theoretical foundations. This paper applies MLS to volume rendering, providing a unified mathematical framework for ray casting of scalar data stored over regular as well as irregular grids. We use the MLS reconstruction to render smooth isosurfaces and to compute accurate derivatives for high-quality shading effects. We also present a novel, adaptive preintegration scheme to improve the efficiency of the ray casting algorithm by reducing the overall number of function evaluations, and an efficient implementation of our framework exploiting modern graphics hardware. The resulting system enables high-quality volume integration and shaded isosurface rendering for regular and irregular volume data.
Engineering and Applied Science
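As a small illustration of MLS reconstruction from scattered data, the sketch below fits a weighted linear polynomial around each query point in 1D. The Gaussian weight, the support parameter `h`, and the function name are assumptions made for the example; the paper works with 3D volume data and may use a different basis and weight function.

```python
import math


def mls_fit(xs, fs, x, h=0.5):
    """Moving least squares with a linear basis at query point x.

    Minimizes sum_i w_i * (a + b * (x_i - x) - f_i)**2 with Gaussian weights
    w_i = exp(-(x_i - x)**2 / h**2). Returns (a, b): a is the reconstructed
    value at x and b is the slope of the local fit, which approximates the
    derivative there.
    """
    s0 = s1 = s2 = r0 = r1 = 0.0
    for xi, fi in zip(xs, fs):
        d = xi - x
        w = math.exp(-(d * d) / (h * h))
        s0 += w
        s1 += w * d
        s2 += w * d * d
        r0 += w * fi
        r1 += w * d * fi
    # Solve the 2x2 normal equations [[s0, s1], [s1, s2]] [a, b]^T = [r0, r1]^T.
    det = s0 * s2 - s1 * s1
    a = (r0 * s2 - r1 * s1) / det
    b = (s0 * r1 - s1 * r0) / det
    return a, b
```

Because the samples need not be evenly spaced, the same routine covers regular and irregular sample layouts, and a linear basis reproduces linear data exactly. In a ray caster the value would be tested against the isovalue and the slope (a gradient in 3D) would feed the shading.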
Fast Reliable Ray-tracing of Procedurally Defined Implicit Surfaces Using Revised Affine Arithmetic
Fast and reliable rendering of implicit surfaces is an important area in the field of implicit modelling. Direct rendering, namely ray-tracing, is shown to be a suitable technique for obtaining good-quality visualisations of implicit surfaces. We present a technique for reliable ray-tracing of arbitrary procedurally defined implicit surfaces by using a modification of Affine Arithmetic called Revised Affine Arithmetic. A wide range of procedurally defined implicit objects can be rendered using this technique, including polynomial surfaces, constructive solids, pseudo-random objects, procedurally defined microstructures, and others. We compare our technique with other reliable techniques based on Interval and Affine Arithmetic to show that our technique provides the fastest, while still reliable, ray-surface intersections and ray-tracing. We also suggest possible modifications for the GPU implementation of this technique for real-time rendering of relatively simple implicit models and near real-time rendering of complex implicit models.
Accelerated High-Resolution Photoacoustic Tomography via Compressed Sensing
Current 3D photoacoustic tomography (PAT) systems offer either high image
quality or high frame rates but are not able to deliver high spatial and
temporal resolution simultaneously, which limits their ability to image dynamic
processes in living tissue. A particular example is the planar Fabry-Perot (FP)
scanner, which yields high-resolution images but takes several minutes to
sequentially map the photoacoustic field on the sensor plane, point-by-point.
However, as the spatio-temporal complexity of many absorbing tissue structures
is rather low, the data recorded in such a conventional, regularly sampled
fashion is often highly redundant. We demonstrate that combining variational
image reconstruction methods using spatial sparsity constraints with the
development of novel PAT acquisition systems capable of sub-sampling the
acoustic wave field can dramatically increase the acquisition speed while
maintaining a good spatial resolution: First, we describe and model two general
spatial sub-sampling schemes. Then, we discuss how to implement them using the
FP scanner and demonstrate the potential of these novel compressed sensing PAT
devices through simulated data from a realistic numerical phantom and through
measured data from a dynamic experimental phantom as well as from in-vivo
experiments. Our results show that images with good spatial resolution and
contrast can be obtained from highly sub-sampled PAT data if variational image
reconstruction methods that describe the tissue structures with suitable
sparsity-constraints are used. In particular, we examine the use of total
variation regularization enhanced by Bregman iterations. These novel
reconstruction strategies offer new opportunities to dramatically increase the
acquisition speed of PAT scanners that employ point-by-point sequential
scanning as well as reducing the channel count of parallelized schemes that use
detector arrays. Comment: submitted to "Physics in Medicine and Biology"