Accelerated Modeling of Near and Far-Field Diffraction for Coronagraphic Optical Systems
Accurately predicting the performance of coronagraphs and tolerancing optical
surfaces for high-contrast imaging requires a detailed accounting of
diffraction effects. Unlike simple Fraunhofer diffraction modeling, near and
far-field diffraction effects, such as the Talbot effect, are captured by
plane-to-plane propagation using Fresnel and angular spectrum propagation. This
approach requires a sequence of computationally intensive Fourier transforms
and quadratic phase functions, which limit the design and aberration
sensitivity parameter space that can be explored at high fidelity in the
course of coronagraph design. This study presents the results of optimizing the
multi-surface propagation module of the open source Physical Optics Propagation
in PYthon (POPPY) package. This optimization was performed by implementing and
benchmarking Fourier transforms and array operations on graphics processing
units, as well as optimizing multithreaded numerical calculations using the
NumExpr Python library where appropriate, to speed the end-to-end simulation of
observatory and coronagraph optical systems. Using realistic systems, this
study demonstrates a greater than five-fold decrease in wall-clock runtime over
POPPY's previous implementation and describes opportunities for further
improvements in diffraction modeling performance. Comment: Presented at SPIE ASTI 2018, Austin, Texas. 11 pages, 6 figures
Accelerating the Fourier split operator method via graphics processing units
Current generations of graphics processing units have turned into highly
parallel devices with general computing capabilities. Thus, graphics processing
units may be utilized, for example, to solve time-dependent partial
differential equations by the Fourier split operator method. In this
contribution, we demonstrate that graphics processing units are capable of
calculating fast Fourier transforms much more efficiently than traditional
central processing units. Thus, graphics processing units render efficient
implementations of the Fourier split operator method possible. Performance
gains of more than an order of magnitude as compared to implementations for
traditional central processing units are reached in the solution of the
time-dependent Schrödinger equation and the time-dependent Dirac equation.
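
As a rough illustration of the Fourier split operator method named in this abstract, the following CPU-only sketch advances a 1-D Schrödinger wave function with a Strang splitting: half a potential step in position space, a full kinetic step in Fourier space, then another half potential step. It is not the authors' GPU implementation. The harmonic potential, the grid parameters, the units (hbar = m = 1), and all function names are assumptions chosen for the example, and the small pure-Python FFT only stands in for an optimized FFT library.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse FFT, normalized by 1/n."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


def split_operator_step(psi, potential, wavenumbers, dt):
    """One Strang step for i dpsi/dt = (-1/2 d^2/dx^2 + V) psi (hbar = m = 1)."""
    half_potential = [cmath.exp(-0.5j * v * dt) for v in potential]
    kinetic = [cmath.exp(-0.5j * k * k * dt) for k in wavenumbers]
    psi = [h * p for h, p in zip(half_potential, psi)]      # half step in V
    psi_k = [t * p for t, p in zip(kinetic, fft(psi))]      # full step in T
    psi = ifft(psi_k)
    return [h * p for h, p in zip(half_potential, psi)]     # half step in V


def norm(psi, dx):
    """Discrete approximation of the integral of |psi|^2."""
    return sum(abs(p) ** 2 for p in psi) * dx


# Hypothetical test problem: ground state of a harmonic oscillator.
N = 256                      # grid points (power of two for the FFT)
L = 40.0                     # box length
dx = L / N
xs = [-L / 2 + j * dx for j in range(N)]
# Wavenumbers in standard FFT ordering (non-negative first, then negative).
ks = [2 * math.pi * (j if j < N // 2 else j - N) / L for j in range(N)]
V = [0.5 * x * x for x in xs]
psi0 = [math.pi ** -0.25 * math.exp(-0.5 * x * x) + 0j for x in xs]

psi = psi0
for _ in range(50):
    psi = split_operator_step(psi, V, ks, dt=0.01)

# The ground state is stationary, so its overlap with psi0 should stay near 1.
overlap = abs(sum(a.conjugate() * b for a, b in zip(psi0, psi)) * dx) ** 2
```

Every step costs one forward and one inverse FFT plus elementwise multiplications, which is why FFT throughput dominates the runtime of this method.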
DSPSR: Digital Signal Processing Software for Pulsar Astronomy
DSPSR is a high-performance, open-source, object-oriented, digital signal
processing software library and application suite for use in radio pulsar
astronomy. Written primarily in C++, the library implements an extensive range
of modular algorithms that can optionally exploit both multiple-core processors
and general-purpose graphics processing units. After over a decade of research
and development, DSPSR is now stable and in widespread use in the community.
This paper presents a detailed description of its functionality, justification
of major design decisions, analysis of phase-coherent dispersion removal
algorithms, and demonstration of performance on some contemporary
microprocessor architectures. Comment: 15 pages, 10 figures, to be published in PAS
The Parallel Algorithm for the 2-D Discrete Wavelet Transform
The discrete wavelet transform can be found at the heart of many
image-processing algorithms. Until now, the transform on general-purpose
processors (CPUs) was mostly computed using a separable lifting scheme. As the
lifting scheme consists of a small number of operations, it is preferred for
processing using single-core CPUs. However, for parallel processing on
multi-core processors, this scheme is inappropriate because of its large number
of steps. On such architectures, the number of steps corresponds to the number
of points at which data must be exchanged. Consequently, these points often
form a performance bottleneck. Our approach appropriately rearranges
calculations inside the transform, and thereby reduces the number of steps. In
other words, we propose a new scheme that is friendly to parallel environments.
When evaluated on multi-core CPUs, our scheme consistently outperforms the
original lifting scheme. The evaluation was performed on 61-core Intel Xeon Phi
and 8-core Intel Xeon processors. Comment: accepted for publication at ICGIP 201
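
To make the notion of lifting "steps" concrete, here is a minimal sketch of one level of a conventional 1-D lifting transform. It shows the baseline scheme the abstract starts from, not the rearranged scheme the authors propose. The abstract does not name a wavelet, so the reversible LeGall 5/3 integer wavelet, the symmetric boundary handling, and the function names are assumptions made for illustration. The update step reads the output of the predict step, and that dependency is the kind of synchronization point that becomes costly when the work is spread across cores.

```python
def lifting_forward(signal):
    """One level of 1-D LeGall 5/3 lifting on an even-length list of ints."""
    even = signal[0::2]
    odd = signal[1::2]
    n = len(odd)
    # Predict step: each odd sample minus the floor-average of its even
    # neighbours (the last even sample is mirrored at the right edge).
    detail = [
        odd[i] - ((even[i] + even[min(i + 1, n - 1)]) >> 1)
        for i in range(n)
    ]
    # Update step: depends on the detail values computed just above
    # (the first detail value is mirrored at the left edge).
    approx = [
        even[i] + ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
        for i in range(n)
    ]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo lifting_forward by running the two steps in reverse order."""
    n = len(detail)
    even = [
        approx[i] - ((detail[max(i - 1, 0)] + detail[i] + 2) >> 2)
        for i in range(n)
    ]
    odd = [
        detail[i] + ((even[i] + even[min(i + 1, n - 1)]) >> 1)
        for i in range(n)
    ]
    signal = [0] * (2 * n)
    signal[0::2] = even
    signal[1::2] = odd
    return signal


# Example usage on a short hypothetical row of pixel values.
row = [3, 7, 1, 8, 2, 9, 4, 6]
low, high = lifting_forward(row)
restored = lifting_inverse(low, high)
```

A separable 2-D transform would apply the same pair of steps along rows and then along columns, doubling the number of dependent steps.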