Fast Computation of Fourier Integral Operators
We introduce a general purpose algorithm for rapidly computing certain types
of oscillatory integrals which frequently arise in problems connected to wave
propagation and general hyperbolic equations. The problem is to evaluate
numerically a so-called Fourier integral operator (FIO) of the form
\int e^{2\pi i \Phi(x,\xi)} a(x,\xi) \hat{f}(\xi) d\xi at points given on
a Cartesian grid. Here, \xi is a frequency variable, \hat{f}(\xi) is the
Fourier transform of the input f, a(x,\xi) is an amplitude and
\Phi(x,\xi) is a phase function, which is typically as large as |\xi|;
hence the integral is highly oscillatory at high frequencies. Because an FIO is
a dense matrix, a naive matrix vector product with an input given on a
Cartesian grid of size N by N would require O(N^4) operations.
This paper develops a new numerical algorithm which requires O(N^{2.5} log N)
operations, and as low as O(\sqrt{N}) in storage space. It operates by
localizing the integral over polar wedges with small angular aperture in the
frequency plane. On each wedge, the algorithm factorizes the kernel
e^{2\pi i \Phi(x,\xi)} a(x,\xi) into two components: 1) a diffeomorphism which is
handled by means of a nonuniform FFT and 2) a residual factor which is handled
by numerical separation of the spatial and frequency variables. The key to the
complexity and accuracy estimates is that the separation rank of the residual
kernel is provably independent of the problem size. Several numerical
examples demonstrate the efficiency and accuracy of the proposed methodology.
We also discuss the potential of our ideas for various applications such as
reflection seismology.
Comment: 31 pages, 3 figures
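As a point of reference for the dense matrix-vector product mentioned in the abstract above, the following is a minimal sketch of the naive evaluation, not the algorithm of the paper. It assumes a periodic discretization on the unit square with integer frequencies, forms the full N^2 by N^2 kernel e^{2\pi i \Phi(x,\xi)} a(x,\xi), and applies it to the discrete Fourier transform of the input. The function names, the grid convention, and the variable-coefficient example phase are illustrative assumptions.

```python
import numpy as np


def naive_fio(f, phase, amplitude):
    """Directly evaluate sum_xi exp(2*pi*i*Phi(x, xi)) a(x, xi) f_hat(xi).

    f is an (N, N) array sampled at x = (n1/N, n2/N). The dense kernel has
    N^2 x N^2 entries, so both time and memory grow like N^4; this is only
    meant for small N (assumed discretization, for illustration).
    """
    N = f.shape[0]
    f_hat = np.fft.fft2(f) / N**2              # discrete Fourier coefficients
    x = np.arange(N) / N                       # spatial grid on [0, 1)
    xi = np.fft.fftfreq(N, d=1.0 / N)          # integer frequencies, FFT order
    X1, X2 = np.meshgrid(x, x, indexing="ij")
    K1, K2 = np.meshgrid(xi, xi, indexing="ij")
    pts = np.stack([X1.ravel(), X2.ravel()], axis=1)[:, None, :]   # (N^2, 1, 2)
    frq = np.stack([K1.ravel(), K2.ravel()], axis=1)[None, :, :]   # (1, N^2, 2)
    kernel = amplitude(pts, frq) * np.exp(2j * np.pi * phase(pts, frq))
    return (kernel @ f_hat.ravel()).reshape(N, N)


def fourier_phase(x, xi):
    # Phi(x, xi) = x . xi, for which the FIO is the inverse Fourier transform.
    return np.sum(x * xi, axis=-1)


def unit_amplitude(x, xi):
    # a(x, xi) = 1
    return 1.0


def example_phase(x, xi):
    # Hypothetical variable-coefficient phase x . xi + c(x) |xi|,
    # homogeneous of degree 1 in xi; c(x) is made up for this sketch.
    c = 0.1 * (2.0 + np.sin(2.0 * np.pi * x[..., 0]))
    return np.sum(x * xi, axis=-1) + c * np.linalg.norm(xi, axis=-1)
```

With \Phi(x,\xi) = x . \xi and a = 1 the sum is the inverse discrete Fourier transform, so `naive_fio(f, fourier_phase, unit_amplitude)` should reproduce f up to rounding. That makes a convenient correctness check before switching to a phase such as `example_phase`.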
A parallel butterfly algorithm
The butterfly algorithm is a fast algorithm which approximately evaluates a
discrete analogue of the integral transform \int K(x,y) g(y) dy at large
numbers of target points when the kernel, K(x,y), is approximately low-rank
when restricted to subdomains satisfying a certain simple geometric condition.
In d dimensions with O(N^d) quasi-uniformly distributed source and target
points, when each appropriate submatrix of K is approximately rank-r, the
running time of the algorithm is at most O(r^2 N^d log N). A parallelization of
the butterfly algorithm is introduced which, assuming a message latency of
\alpha and per-process inverse bandwidth of \beta, executes in at most O(r^2
N^d/p log N + (\beta r N^d/p + \alpha) log p) time using p processes. This
parallel algorithm was then instantiated in the form of the open-source
DistButterfly library for the special case where K(x,y)=exp(i \Phi(x,y)), where
\Phi(x,y) is a black-box, sufficiently smooth, real-valued phase function.
Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for
important classes of phase functions. Using quasi-uniform sources, hyperbolic
Radon transforms and an analogue of a 3D generalized Radon transform were
observed to strong-scale from 1-node/16-cores up to
1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively.
Comment: To appear in SIAM Journal on Scientific Computing
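The low-rank assumption on the kernel can be checked numerically on a simple case. The sketch below uses the 1D Fourier kernel K(x,y) = exp(i \Phi(x,y)) with \Phi(x,y) = 2\pi x y as a stand-in phase, and takes the geometric condition to be that the product of the widths of the target and source intervals is at most 1. Both choices are assumptions made for illustration, since the abstract does not spell out the condition, and the code does not use DistButterfly; all function names are hypothetical.

```python
import numpy as np


def numerical_rank(block, tol=1e-8):
    """Number of singular values above tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


def fourier_block(N, x_start, x_count, y_start, y_count):
    """Sub-block of K(x, y) = exp(2*pi*i*x*y).

    Targets are x = n/N in [0, 1); sources are integers y in [0, N).
    """
    x = (x_start + np.arange(x_count)) / N
    y = y_start + np.arange(y_count)
    return np.exp(2j * np.pi * np.outer(x, y))


def admissible_block_rank(N, tol=1e-8):
    """Rank of a block whose target width times source width equals 1.

    The block has sqrt(N) targets (width 1/sqrt(N)) and sqrt(N) sources
    (width sqrt(N)); N is assumed to be a perfect square.
    """
    m = int(round(np.sqrt(N)))
    return numerical_rank(fourier_block(N, 0, m, 0, m), tol)
```

Calling `admissible_block_rank` for N = 1024, 4096 and 16384 should give small ranks that do not grow with N, even though the block itself grows from 32 x 32 to 128 x 128. A block that violates the assumed condition, such as `fourier_block(1024, 0, 256, 0, 256)` with width product 64, has a much larger numerical rank.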