Efficient Parallel FFTs for Different Computational Models
We select the Fast Fourier Transform (FFT) to demonstrate a methodology for deriving the optimal parallel algorithm according to predetermined performance metrics, within a computational model. Following the vector space framework for parallel permutations, we provide a specification language to capture the algorithm, derive the optimal parallel FFT specification, compute the arithmetic, memory, communication and load-balance complexity metrics, apply the analytical performance evaluation to PRAM, LPRAM, BSP and LogP computational models, and compare with actual performance results.
Engineering and Applied Science
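For orientation, the butterfly structure that parallel FFT derivations typically start from can be sketched as a sequential radix-2 decimation-in-time FFT. This is a minimal illustration only: the function name `fft_radix2` is hypothetical, the power-of-two length restriction is an assumption, and the code is not the specification or the optimal parallel algorithm derived in the paper.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    Assumes len(x) is a power of two. The two recursive calls are
    independent of each other, which is the property a parallel
    formulation can exploit.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples

    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor applied to the odd half, then one butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out
```

For example, `fft_radix2([1, 2, 3, 4])` yields values close to `[10, -2+2j, -2, -2-2j]`, up to floating-point rounding.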
First-principle molecular dynamics with ultrasoft pseudopotentials: parallel implementation and application to extended bio-inorganic system
We present a plane-wave ultrasoft pseudopotential implementation of
first-principle molecular dynamics, which is well suited to model large
molecular systems containing transition metal centers. We describe an efficient
strategy for parallelization that includes special features to deal with the
augmented charge in the context of Vanderbilt's ultrasoft pseudopotentials. We
also discuss a simple approach to model molecular systems with a net charge
and/or large dipole/quadrupole moments. We present test applications to
manganese and iron porphyrins representative of a large class of biologically
relevant metallorganic systems. Our results show that accurate
Density-Functional Theory calculations on systems with several hundred atoms
are feasible with access to moderate computational resources.
Comment: 29 pages, 4 Postscript figures, revtex
Accelerated Modeling of Near and Far-Field Diffraction for Coronagraphic Optical Systems
Accurately predicting the performance of coronagraphs and tolerancing optical
surfaces for high-contrast imaging requires a detailed accounting of
diffraction effects. Unlike simple Fraunhofer diffraction modeling, near and
far-field diffraction effects, such as the Talbot effect, are captured by
plane-to-plane propagation using Fresnel and angular spectrum propagation. This
approach requires a sequence of computationally intensive Fourier transforms
and quadratic phase functions, which limit the design and aberration
sensitivity parameter space that can be explored at high fidelity in the
course of coronagraph design. This study presents the results of optimizing the
multi-surface propagation module of the open source Physical Optics Propagation
in PYthon (POPPY) package. This optimization was performed by implementing and
benchmarking Fourier transforms and array operations on graphics processing
units, as well as optimizing multithreaded numerical calculations using the
NumExpr python library where appropriate, to speed the end-to-end simulation of
observatory and coronagraph optical systems. Using realistic systems, this
study demonstrates a greater than five-fold decrease in wall-clock runtime over
POPPY's previous implementation and describes opportunities for further
improvements in diffraction modeling performance.
Comment: Presented at SPIE ASTI 2018, Austin Texas. 11 pages, 6 figures
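The plane-to-plane propagation described in this abstract can be illustrated with a textbook angular spectrum step: a forward FFT, multiplication by a phase transfer function, and an inverse FFT. The sketch below uses plain NumPy and is not POPPY's implementation. The function name `angular_spectrum_propagate`, its parameters, the square pixel grid, and the choice to discard evanescent components are all assumptions made for illustration.

```python
import numpy as np


def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance):
    """Propagate a sampled complex field between parallel planes (sketch).

    field: 2-D complex array sampled on a square grid (assumption).
    wavelength, pixel_pitch, distance: lengths in the same unit, e.g. metres.
    """
    ny, nx = field.shape
    # Spatial frequencies matching the FFT output ordering.
    fx = np.fft.fftfreq(nx, d=pixel_pitch)
    fy = np.fft.fftfreq(ny, d=pixel_pitch)
    fxx, fyy = np.meshgrid(fx, fy)

    # Longitudinal wavenumber; non-positive `arg` marks evanescent waves.
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))

    # Phase-only transfer function; evanescent components are zeroed here.
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)

    return np.fft.ifft2(np.fft.fft2(field) * transfer)
```

Each call costs two 2-D FFTs and one element-wise multiply, so a multi-surface system repeats this pattern once per plane. That repeated cost is the kind of workload the abstract reports moving to GPUs and multithreaded array operations.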
Application of graphics processing units to search pipelines for gravitational waves from coalescing binaries of compact objects
We report a novel application of a graphics processing unit (GPU) to accelerate the search pipelines for gravitational waves from coalescing binaries of compact objects. A total speed-up of 16-fold has been achieved with an NVIDIA GeForce 8800 Ultra GPU card compared with one core of a 2.5 GHz Intel Q9300 central processing unit (CPU). We show that substantial improvements are possible and discuss the reduction in CPU count required for the detection of inspiral sources afforded by the use of GPUs.