303 research outputs found

An O(log^2 N) parallel algorithm for computing the eigenvalues of a symmetric tridiagonal matrix

Author: Swarztrauber Paul N.

An O(log^2 N) parallel algorithm is presented for computing the eigenvalues of a symmetric tridiagonal matrix using a parallel algorithm for computing the zeros of the characteristic polynomial. The method is based on a quadratic recurrence in which the characteristic polynomial is constructed on a binary tree from polynomials whose degree doubles at each level. Intervals that contain exactly one zero are determined by the zeros of polynomials at the previous level, which ensures that different processors compute different zeros. The exact behavior of the polynomials at the interval endpoints is used to eliminate the usual problems induced by finite precision arithmetic.

NASA Technical Reports Server
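
The idea of isolating each zero of the characteristic polynomial in its own interval can be illustrated with the classical serial Sturm-count bisection. This is a minimal sketch and not the O(log^2 N) parallel method of the abstract above: it swaps in the standard pivot-counting recurrence, and the function names and the zero-pivot guard value are assumptions made for illustration.

```python
def count_below(diag, off, x):
    """Count eigenvalues strictly below x for a symmetric tridiagonal matrix.

    diag holds the n diagonal entries, off the n-1 off-diagonal entries.
    The count is the number of negative pivots of (T - x*I).
    """
    count = 0
    q = 1.0
    for k, a in enumerate(diag):
        b2 = off[k - 1] ** 2 if k > 0 else 0.0
        q = (a - x) - b2 / q
        if q == 0.0:
            q = 1e-30  # assumed guard: nudge an exactly zero pivot
        if q < 0.0:
            count += 1
    return count


def tridiagonal_eigenvalues(diag, off, iterations=100):
    """Return all eigenvalues in ascending order by bisection."""
    n = len(diag)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radii = [(abs(off[i - 1]) if i > 0 else 0.0) +
             (abs(off[i]) if i < n - 1 else 0.0) for i in range(n)]
    lower = min(a - r for a, r in zip(diag, radii))
    upper = max(a + r for a, r in zip(diag, radii))

    values = []
    for k in range(n):  # each k is independent, so k could map to a processor
        lo, hi = lower, upper
        for _ in range(iterations):
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > k:
                hi = mid  # at least k+1 eigenvalues lie below mid
            else:
                lo = mid
        values.append(0.5 * (lo + hi))
    return values


print(tridiagonal_eigenvalues([2.0, 2.0, 2.0], [-1.0, -1.0]))
```

For the 3 by 3 example the exact eigenvalues are 2 - sqrt(2), 2, and 2 + sqrt(2). Each index k is handled independently, which is the property the abstract exploits when it assigns different zeros to different processors.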

Ordered fast Fourier transforms on a massively parallel hypercube multiprocessor

Author: Swarztrauber Paul N.
Tong Charles

Design alternatives for ordered fast Fourier transform (FFT) algorithms were examined on massively parallel hypercube multiprocessors such as the Connection Machine. Particular emphasis is placed on reducing communication, which is known to dominate the overall computing time. To this end, the ordering and computational phases of the FFT were combined, and sequence-to-processor maps that reduce communication were used. The class of ordered transforms is expanded to include any FFT in which the order of the transform is the same as that of the input sequence. Two such orderings are examined, namely standard-order and A-order, which can be implemented with equal ease on the Connection Machine, where orderings are determined by geometries and priorities. If the sequence has N = 2^r elements and the hypercube has P = 2^d processors, then a standard-order FFT can be implemented with d + r/2 + 1 parallel transmissions. An A-order sequence can be transformed with 2d - r/2 parallel transmissions, which is r - d + 1 fewer than the standard order. A parallel method for computing the trigonometric coefficients is presented that does not use trigonometric functions or interprocessor communication. A performance of 0.9 GFLOPS was obtained for an A-order transform on the Connection Machine.

NASA Technical Reports Server
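
The transmission counts quoted in the abstract above can be checked with a few lines of arithmetic. This is a small sketch that only evaluates the two formulas as stated; the function names and the sample values of r and d are assumptions chosen for illustration, not figures from the paper.

```python
def standard_order_transmissions(r, d):
    """Parallel transmissions for a standard-order FFT: d + r/2 + 1."""
    return d + r / 2 + 1


def a_order_transmissions(r, d):
    """Parallel transmissions for an A-order FFT: 2d - r/2."""
    return 2 * d - r / 2


# Hypothetical sizes: N = 2**20 elements on P = 2**16 processors.
r, d = 20, 16
standard = standard_order_transmissions(r, d)
a_order = a_order_transmissions(r, d)

print(standard)            # 27.0
print(a_order)             # 22.0
print(standard - a_order)  # 5.0, which equals r - d + 1
```

The difference (d + r/2 + 1) - (2d - r/2) simplifies to r - d + 1 for any r and d, which matches the saving stated in the abstract.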

On Reorganizing the Pentagon

Author: Swarztrauber S.A.
Publication venue: U.S. Naval War College Digital Commons
Publication date: 23/05/2018

U.S. Naval War College Digital Commons

Solving the shallow water equations on the Cray X-MP/48 and the connection machine 2

Author: Sato Richard K.
Swarztrauber Paul N.

The two-dimensional shallow water equations in Cartesian coordinates are solved on the Connection Machine 2 (CM-2) using both spectral and finite difference methods. A description of these implementations is presented together with a brief discussion of the CM-2 as it relates to these specific computations. The finite difference code was written in both C* and *LISP, and the spectral code was written in *LISP. The performance of the codes is compared with a FORTRAN version that was optimized for the Cray X-MP/48.

NASA Technical Reports Server

Efficient detection of a CW signal with a linear frequency drift

Author: Bailey David H.
Swarztrauber Paul N.

An efficient method is presented for the detection of a continuous wave (CW) signal with a frequency drift that is linear in time. Signals of this type occur in transmissions between any two locations that are accelerating relative to one another, e.g., transmissions from the Voyager spacecraft. We assume that both the frequency and the drift are unknown. We also assume that the signal is weak compared to the Gaussian noise. The signal is partitioned into subsequences whose discrete Fourier transforms provide a sequence of instantaneous spectra at equal time intervals. These spectra are then accumulated with a shift that is proportional to time. When the shift is equal to the frequency drift, the signal-to-noise ratio increases and detection occurs. Here, we show how to compute these accumulations for many shifts in an efficient manner using a variety of fast Fourier transforms (FFTs). Computing time is proportional to L log L, where L is the length of the time series.

NASA Technical Reports Server
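
The shifted accumulation of spectra described in the abstract above can be sketched by brute force. This is not the efficient L log L method of the paper: it uses a direct DFT and an exhaustive search over integer shifts, wraps shifted bins around the spectrum for simplicity, searches non-negative drifts only, and runs on a noise-free synthetic chirp. All function names and signal parameters are assumptions made for illustration.

```python
import cmath


def dft_power(segment):
    """Power spectrum of one segment via a direct O(n^2) DFT."""
    n = len(segment)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(segment))) ** 2
            for k in range(n)]


def detect_drift(signal, seg_len, max_shift):
    """Return (shift per segment in bins, starting bin) with the largest sum."""
    segments = [signal[i:i + seg_len]
                for i in range(0, len(signal) - seg_len + 1, seg_len)]
    spectra = [dft_power(s) for s in segments]
    best_total, best_shift, best_bin = 0.0, 0, 0
    for shift in range(max_shift + 1):
        for k in range(seg_len):
            # Accumulate along a line in the time-frequency plane whose
            # slope is `shift` bins per segment (wrapping at the edges).
            total = sum(spec[(k + shift * m) % seg_len]
                        for m, spec in enumerate(spectra))
            if total > best_total:
                best_total, best_shift, best_bin = total, shift, k
    return best_shift, best_bin


# Synthetic chirp: frequency rises by exactly one bin per segment and is
# centred on bin 5 in the first segment (hypothetical test values).
n, num_segments = 32, 8
f0 = 4.5 / n          # starting frequency in cycles per sample
drift = 1.0 / n ** 2  # frequency change per sample
signal = [cmath.exp(2j * cmath.pi * (f0 * t + 0.5 * drift * t * t))
          for t in range(n * num_segments)]

print(detect_drift(signal, n, max_shift=3))
```

When the trial shift matches the drift, every segment contributes its peak bin to the same sum, which is the mechanism the abstract relies on to raise the signal-to-noise ratio.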

On Strategy: A Critical Analysis of the Vietnam War.

Author: Swarztrauber S.A.
Publication venue: U.S. Naval War College Digital Commons
Publication date: 24/05/2018

U.S. Naval War College Digital Commons

Bureaucracy at War: U.S. Performance in the Vietnam Conflict.

Author: Swarztrauber S.A.
Publication venue: U.S. Naval War College Digital Commons
Publication date: 18/05/2018

U.S. Naval War College Digital Commons

A multidomain spectral method for solving elliptic equations

Author: Harald P. Pfeiffer
Lawrence E. Kidder
Mark A. Scheel
Saul A. Teukolsky
Publication venue: Elsevier BV
Publication date: 27/02/2002

We present a new solver for coupled nonlinear elliptic partial differential equations (PDEs). The solver is based on pseudo-spectral collocation with domain decomposition and can handle one- to three-dimensional problems. It has three distinct features. First, the combined problem of solving the PDE, satisfying the boundary conditions, and matching between different subdomains is cast into one set of equations readily accessible to standard linear and nonlinear solvers. Second, touching as well as overlapping subdomains are supported; both rectangular blocks with Chebyshev basis functions and spherical shells with an expansion in spherical harmonics are implemented. Third, the code is very flexible: the domain decomposition as well as the distribution of collocation points in each domain can be chosen at run time, and the solver is easily adaptable to new PDEs. The code has been used to solve the equations of the initial value problem of general relativity and should be useful in many other problems. We compare the new method to finite difference codes and find it superior in both runtime and accuracy, at least for the smooth problems considered here.

Comment: 31 pages, 8 figures

arXiv.org e-Print Archive

Crossref

Caltech Authors

CERN Document Server

Fast algorithms for spherical harmonic expansions, III

Author: Mark Tygert
Publication venue: Elsevier BV
Publication date: 05/04/2010

We accelerate the computation of spherical harmonic transforms, using what is known as the butterfly scheme. This provides a convenient alternative to the approach taken in the second paper from this series on "Fast algorithms for spherical harmonic expansions." The requisite precomputations become manageable when organized as a "depth-first traversal" of the program's control-flow graph, rather than as the perhaps more natural "breadth-first traversal" that processes one-by-one each level of the multilevel procedure. We illustrate the results via several numerical examples.

Comment: 14 pages, 1 figure, 6 tables

arXiv.org e-Print Archive

Crossref

A pseudospectral matrix method for time-dependent tensor fields on a spherical shell

Author: Bernd Brügmann
Publication venue: Elsevier BV
Publication date: 18/04/2011

We construct a pseudospectral method for the solution of time-dependent, non-linear partial differential equations on a three-dimensional spherical shell. The problem we address is the treatment of tensor fields on the sphere. As a test case we consider the evolution of a single black hole in numerical general relativity. A natural strategy would be the expansion in tensor spherical harmonics in spherical coordinates. Instead, we consider the simpler and potentially more efficient possibility of a double Fourier expansion on the sphere for tensors in Cartesian coordinates. As usual for the double Fourier method, we employ a filter to address time-step limitations and certain stability issues. We find that a tensor filter based on spin-weighted spherical harmonics is successful, while two simplified, non-spin-weighted filters do not lead to stable evolutions. The derivatives and the filter are implemented by matrix multiplication for efficiency. A key technical point is the construction of a matrix multiplication method for the spin-weighted spherical harmonic filter. As an example of the efficient parallelization of the double Fourier, spin-weighted filter method, we discuss an implementation on a GPU, which achieves a speed-up of up to a factor of 20 compared to a single-core CPU implementation.

Comment: 33 pages, 9 figures

arXiv.org e-Print Archive

Crossref