5,165 research outputs found

    Accelerated Modeling of Near and Far-Field Diffraction for Coronagraphic Optical Systems

    Full text link
    Accurately predicting the performance of coronagraphs and tolerancing optical surfaces for high-contrast imaging requires a detailed accounting of diffraction effects. Unlike simple Fraunhofer diffraction modeling, near and far-field diffraction effects, such as the Talbot effect, are captured by plane-to-plane propagation using Fresnel and angular spectrum propagation. This approach requires a sequence of computationally intensive Fourier transforms and quadratic phase functions, which limit the design and aberration sensitivity parameter space which can be explored at high-fidelity in the course of coronagraph design. This study presents the results of optimizing the multi-surface propagation module of the open source Physical Optics Propagation in PYthon (POPPY) package. This optimization was performed by implementing and benchmarking Fourier transforms and array operations on graphics processing units, as well as optimizing multithreaded numerical calculations using the NumExpr python library where appropriate, to speed the end-to-end simulation of observatory and coronagraph optical systems. Using realistic systems, this study demonstrates a greater than five-fold decrease in wall-clock runtime over POPPY's previous implementation and describes opportunities for further improvements in diffraction modeling performance.Comment: Presented at SPIE ASTI 2018, Austin Texas. 11 pages, 6 figure

    Accelerating the Fourier split operator method via graphics processing units

    Full text link
    Current generations of graphics processing units have turned into highly parallel devices with general computing capabilities. Thus, graphics processing units may be utilized, for example, to solve time dependent partial differential equations by the Fourier split operator method. In this contribution, we demonstrate that graphics processing units are capable to calculate fast Fourier transforms much more efficiently than traditional central processing units. Thus, graphics processing units render efficient implementations of the Fourier split operator method possible. Performance gains of more than an order of magnitude as compared to implementations for traditional central processing units are reached in the solution of the time dependent Schr\"odinger equation and the time dependent Dirac equation

    DSPSR: Digital Signal Processing Software for Pulsar Astronomy

    Full text link
    DSPSR is a high-performance, open-source, object-oriented, digital signal processing software library and application suite for use in radio pulsar astronomy. Written primarily in C++, the library implements an extensive range of modular algorithms that can optionally exploit both multiple-core processors and general-purpose graphics processing units. After over a decade of research and development, DSPSR is now stable and in widespread use in the community. This paper presents a detailed description of its functionality, justification of major design decisions, analysis of phase-coherent dispersion removal algorithms, and demonstration of performance on some contemporary microprocessor architectures.Comment: 15 pages, 10 figures, to be published in PAS

    The Parallel Algorithm for the 2-D Discrete Wavelet Transform

    Full text link
    The discrete wavelet transform can be found at the heart of many image-processing algorithms. Until now, the transform on general-purpose processors (CPUs) was mostly computed using a separable lifting scheme. As the lifting scheme consists of a small number of operations, it is preferred for processing using single-core CPUs. However, considering a parallel processing using multi-core processors, this scheme is inappropriate due to a large number of steps. On such architectures, the number of steps corresponds to the number of points that represent the exchange of data. Consequently, these points often form a performance bottleneck. Our approach appropriately rearranges calculations inside the transform, and thereby reduces the number of steps. In other words, we propose a new scheme that is friendly to parallel environments. When evaluating on multi-core CPUs, we consistently overcome the original lifting scheme. The evaluation was performed on 61-core Intel Xeon Phi and 8-core Intel Xeon processors.Comment: accepted for publication at ICGIP 201
    • …
    corecore