
    Determining an Out-of-Core FFT Decomposition Strategy for Parallel Disks by Dynamic Programming

    We present an out-of-core FFT algorithm based on the in-core FFT method developed by Swarztrauber. Our algorithm uses a recursive divide-and-conquer strategy, and each stage in the recursion presents several possibilities for how to split the problem into subproblems. We give a recurrence for the algorithm's I/O complexity on the Parallel Disk Model and show how to use dynamic programming to determine optimal splits at each recursive stage. The algorithm to determine the optimal splits takes only Theta(lg^2 N) time for an N-point FFT, and it is practical. The out-of-core FFT algorithm itself takes considerably longer

    ERS-1 SAR data processing

    To take full advantage of the synthetic aperture radar (SAR) to be flown on board the European Space Agency's Remote Sensing Satellite (ERS-1) (1989) and the Canadian Radarsat (1990), the implementation of a receiving station in Alaska is being studied to gather and process SAR data pertaining in particular to regions within the station's range of reception. The current SAR data processing requirement is estimated to be on the order of 5 minutes per day. The Interim Digital SAR Processor (IDP), which was under continual development through Seasat (1978) and SIR-B (1984), can process slightly more than 2 minutes of ERS-1 data per day. On the other hand, the Advanced Digital SAR Processor (ADSP), currently under development for the Shuttle Imaging Radar C (SIR-C, 1988) and the Venus Radar Mapper (VRM, 1988), is capable of processing ERS-1 SAR data at a real-time rate. To better suit the anticipated ERS-1 SAR data processing requirement, both a modified IDP and an ADSP derivative are being examined. For the modified IDP, a pipelined architecture is proposed for the mini-computer plus array processor arrangement to improve throughput. For the ADSP derivative, a simplified version is proposed to enhance ease of implementation and maintainability while maintaining real-time throughput rates. These processing systems are discussed and evaluated

    Multiprocessor Out-of-Core FFTs with Distributed Memory and Parallel Disks

    This paper extends an earlier out-of-core Fast Fourier Transform (FFT) method for a uniprocessor with the Parallel Disk Model (PDM) to use multiple processors. Four out-of-core multiprocessor methods are examined. Operationally, these methods differ in the size of mini-butterfly computed in memory and how the data are organized on the disks and in the distributed memory of the multiprocessor. The methods also perform differing amounts of I/O and communication. Two of them have the remarkable property that even though they are computing the FFT on a multiprocessor, all interprocessor communication occurs outside the mini-butterfly computations. Performance results on a small workstation cluster indicate that except for unusual combinations of problem size and memory size, the methods that do not perform interprocessor communication during the mini-butterfly computations require approximately 86% of the time of those that do. Moreover, the faster methods are much easier to implement

    Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs

    We present an improved version of the Dimensional Method for computing multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor system when the data consist of too many records to fit into memory. Data are spread across parallel disks and processed in sections. We use the Parallel Disk Model for analysis. The simple Dimensional Method performs the 1-dimensional FFTs for each dimension in turn. Between each dimension, an out-of-core permutation is used to rearrange the data to contiguous locations. The improved Dimensional Method processes multiple dimensions at a time. We show that determining an optimal sequence and groupings of dimensions is NP-complete. We then analyze the effects of two modifications to the Dimensional Method independently: processing multiple dimensions at one time, and processing single dimensions in a different order. Finally, we show a lower bound on the I/O complexity of the Dimensional Method and present an algorithm that is approximately asymptotically optimal

    Solving the Klein-Gordon equation using Fourier spectral methods: A benchmark test for computer performance

    The cubic Klein-Gordon equation is a simple but non-trivial partial differential equation whose numerical solution has the main building blocks required for the solution of many other partial differential equations. In this study, the library 2DECOMP&FFT is used in a Fourier spectral scheme to solve the Klein-Gordon equation and strong scaling of the code is examined on thirteen different machines for a problem size of 512^3. The results are useful in assessing likely performance of other parallel fast Fourier transform based programs for solving partial differential equations. The problem is chosen to be large enough to solve on a workstation, yet also of interest to solve quickly on a supercomputer, in particular for parametric studies. Unlike other high performance computing benchmarks, for this problem size, the time to solution will not be improved by simply building a bigger supercomputer. Comment: 10 page

    Out-of-Core Hydrodynamic Simulations for Cosmological Applications

    We present an out-of-core hydrodynamic code for high-resolution cosmological simulations that require terabytes of memory. Out-of-core computation refers to the technique of using disk space as virtual memory and transferring data in and out of main memory at high I/O bandwidth. The code is based on a two-level mesh scheme where short-range physics is solved on a high-resolution, localized mesh while long-range physics is captured on a lower-resolution, global mesh. The two-level mesh gravity solver allows FFTs to operate on data stored entirely in memory, which is much faster than the alternative of computing the transforms out-of-core through non-sequential disk accesses. We also describe an out-of-core initial conditions generator that is used to prepare large data sets for cosmological simulations. The out-of-core code is accurate, cost-effective, and memory-efficient, and the current version is implemented to run in parallel on shared-memory machines. I/O overhead is reduced to less than 10% by performing disk operations concurrently with numerical calculations. The current computational setup, which includes a 32-processor Alpha server and a 3 TB striped SCSI disk array, allows us to run cosmological simulations with up to 4000^3 grid cells and 2000^3 dark matter particles. Comment: 19 pages, 10 figures; accepted by New Astronom

    First-principle molecular dynamics with ultrasoft pseudopotentials: parallel implementation and application to extended bio-inorganic system

    We present a plane-wave ultrasoft pseudopotential implementation of first-principle molecular dynamics, which is well suited to model large molecular systems containing transition metal centers. We describe an efficient strategy for parallelization that includes special features to deal with the augmented charge in the context of Vanderbilt's ultrasoft pseudopotentials. We also discuss a simple approach to model molecular systems with a net charge and/or large dipole/quadrupole moments. We present test applications to manganese and iron porphyrins representative of a large class of biologically relevant metallorganic systems. Our results show that accurate Density-Functional Theory calculations on systems with several hundred atoms are feasible with access to moderate computational resources. Comment: 29 pages, 4 Postscript figures, revtex