Butterfly Factorization
The paper introduces the butterfly factorization as a data-sparse
approximation for the matrices that satisfy a complementary low-rank property.
The factorization can be constructed efficiently if either fast algorithms for
applying the matrix and its adjoint are available or the entries of the matrix
can be sampled individually. For an N x N matrix, the resulting
factorization is a product of O(log N) sparse matrices, each with O(N)
non-zero entries. Hence, it can be applied rapidly in O(N log N) operations.
Numerical results are provided to demonstrate the effectiveness of the
butterfly factorization and its construction algorithms.
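A familiar exact instance of a matrix written as a product of a few sparse factors is the radix-2 FFT, which expresses the N-point DFT matrix as a bit-reversal permutation followed by log2(N) sparse butterfly matrices with 2N non-zero entries each. The sketch below illustrates that structure only; it is not the approximate construction described in the abstract, and all function names are hypothetical. It assumes N is a power of two.

```python
import cmath


def bit_reverse_permutation(n):
    """Indices 0..n-1 with their binary digits reversed (n a power of two)."""
    bits = n.bit_length() - 1
    return [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]


def butterfly_factors(n):
    """Sparse factors B_1, ..., B_L (L = log2 n) of the n-point DFT.

    Each factor is a list of rows; each row is a dict {column: value}
    holding exactly two non-zero entries.
    """
    factors = []
    size = 2
    while size <= n:
        half = size // 2
        rows = [dict() for _ in range(n)]
        for start in range(0, n, size):
            for j in range(half):
                w = cmath.exp(-2j * cmath.pi * j / size)  # twiddle factor
                top, bot = start + j, start + j + half
                rows[top][top] = 1.0
                rows[top][bot] = w
                rows[bot][top] = 1.0
                rows[bot][bot] = -w
        factors.append(rows)
        size *= 2
    return factors


def apply_sparse(rows, x):
    """Multiply a sparse factor (row dicts) by a dense vector."""
    return [sum(v * x[c] for c, v in row.items()) for row in rows]


def fft_via_factors(x):
    """Apply the DFT as: permutation first, then B_1, ..., B_L in order."""
    n = len(x)
    y = [x[p] for p in bit_reverse_permutation(n)]
    for rows in butterfly_factors(n):
        y = apply_sparse(rows, y)
    return y


def dft_direct(x):
    """Dense O(n^2) DFT used as a reference."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]
```

Applying the factors one after another costs about 2N multiply-adds per factor, which is where the O(N log N) total comes from in this special case.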
A parallel butterfly algorithm
The butterfly algorithm is a fast algorithm which approximately evaluates a
discrete analogue of the integral transform \int K(x,y) g(y) dy at large
numbers of target points when the kernel, K(x,y), is approximately low-rank
when restricted to subdomains satisfying a certain simple geometric condition.
In d dimensions with O(N^d) quasi-uniformly distributed source and target
points, when each appropriate submatrix of K is approximately rank-r, the
running time of the algorithm is at most O(r^2 N^d log N). A parallelization of
the butterfly algorithm is introduced which, assuming a message latency of
\alpha and per-process inverse bandwidth of \beta, executes in at most
O(r^2 N^d/p log N + (\beta r N^d/p + \alpha) log p) time using p processes. This
parallel algorithm was then instantiated in the form of the open-source
DistButterfly library for the special case where K(x,y)=exp(i \Phi(x,y)), where
\Phi(x,y) is a black-box, sufficiently smooth, real-valued phase function.
Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for
important classes of phase functions. Using quasi-uniform sources, hyperbolic
Radon transforms and an analogue of a 3D generalized Radon transform were
observed to strong-scale from 1 node/16 cores up to
1024 nodes/16,384 cores with greater than 90% and 82% efficiency, respectively.
Comment: To appear in SIAM Journal on Scientific Computing
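To make the shape of the stated running-time bound concrete, the following sketch evaluates a computation term r^2 N^d/p log N plus a communication term (\beta r N^d/p + \alpha) log p, and derives a model strong-scaling efficiency from it. This is only a toy cost model: the constants hidden by the O-notation are dropped, the per-operation cost `gamma` is an assumption not present in the abstract, the base-2 logarithm is an arbitrary choice, and the function names are hypothetical. It does not reproduce the measured efficiencies reported above.

```python
import math


def model_time(N, d, r, p, alpha, beta, gamma=1.0):
    """Toy evaluation of the bound with all hidden constants set to one.

    N, d  : points per dimension and dimension (N**d points in total)
    r     : approximate rank of the admissible submatrices
    p     : number of processes
    alpha : message latency (assumed value supplied by the caller)
    beta  : per-process inverse bandwidth (assumed value)
    gamma : time per arithmetic operation (assumption, not in the abstract)
    """
    work = N ** d
    compute = gamma * r ** 2 * work / p * math.log2(N)
    communicate = (beta * r * work / p + alpha) * math.log2(p)
    return compute + communicate


def strong_scaling_efficiency(N, d, r, p, alpha, beta, gamma=1.0):
    """Model efficiency T(1) / (p * T(p)) for a fixed problem size."""
    t1 = model_time(N, d, r, 1, alpha, beta, gamma)
    tp = model_time(N, d, r, p, alpha, beta, gamma)
    return t1 / (p * tp)
```

In this model the communication term vanishes for p = 1 because log p = 0, so any loss of efficiency comes from the (\beta r N^d/p + \alpha) log p term alone.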
Fast hyperbolic Radon transform represented as convolutions in log-polar coordinates
The hyperbolic Radon transform is a commonly used tool in seismic processing,
for instance in seismic velocity analysis, data interpolation and for multiple
removal. A direct implementation by summation of traces with different moveouts
is computationally expensive for large data sets. In this paper we present a
new method for fast computation of the hyperbolic Radon transforms. It is based
on using a log-polar sampling with which the main computational parts reduce to
computing convolutions. This allows for fast implementations by means of FFT.
In addition to the FFT operations, interpolation procedures are required for
switching between coordinates in the time-offset, Radon, and log-polar domains.
Graphics processing units (GPUs) are well suited as a computational
platform for this purpose, due to their hardware-supported interpolation routines
as well as optimized routines for FFT. Performance tests show large speed-ups
for the proposed algorithm. Hence, it is suitable for use in iterative methods,
and we provide examples for data interpolation and multiple removal using this
approach.
Comment: 21 pages, 10 figures, 2 tables
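For reference, the direct summation that the abstract calls computationally expensive can be sketched as follows. This is the slow baseline, not the log-polar convolution method of the paper. It assumes the common velocity-stack parametrization t = sqrt(tau^2 + (x/v)^2), linear interpolation in time, and hypothetical function names; the paper's own parametrization may differ.

```python
import math


def hyperbolic_radon_direct(data, offsets, dt, taus, velocities):
    """Sum each trace along t = sqrt(tau^2 + (x/v)^2) for every (tau, v).

    data[i][k] is the sample of trace i (offset offsets[i]) at time k * dt.
    The cost is len(taus) * len(velocities) * len(offsets) interpolations.
    """
    nt = len(data[0])
    panel = [[0.0] * len(velocities) for _ in taus]
    for i, tau in enumerate(taus):
        for j, v in enumerate(velocities):
            total = 0.0
            for trace, x in zip(data, offsets):
                s = math.sqrt(tau * tau + (x / v) ** 2) / dt  # fractional sample
                k = int(math.floor(s))
                if 0 <= k < nt - 1:
                    frac = s - k
                    total += (1.0 - frac) * trace[k] + frac * trace[k + 1]
            panel[i][j] = total
    return panel


def make_hyperbolic_event(offsets, nt, dt, tau0, v0):
    """Synthetic gather with one unit-amplitude hyperbolic event."""
    data = [[0.0] * nt for _ in offsets]
    for trace, x in zip(data, offsets):
        s = math.sqrt(tau0 * tau0 + (x / v0) ** 2) / dt
        k = int(math.floor(s))
        frac = s - k
        if 0 <= k < nt - 1:
            trace[k] += 1.0 - frac
            trace[k + 1] += frac
    return data
```

Scanning a synthetic gather built with `make_hyperbolic_event` over several trial velocities should give the largest stacked value at the velocity used to create the event, which is the basis of velocity analysis.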
Signal Flow Graph Approach to Efficient DST I-IV Algorithms
In this paper, fast and efficient discrete sine transformation (DST)
algorithms are presented based on the factorization of sparse, scaled
orthogonal, rotation, rotation-reflection, and butterfly matrices. These
algorithms are completely recursive and solely based on DST I-IV. The presented
algorithms have low arithmetic cost compared to the known fast DST algorithms.
Furthermore, signal flow graph representations of digital
structures are used to describe these efficient and recursive DST algorithms,
with an (n-1)-point signal flow graph for DST-I and n-point signal flow
graphs for DST II-IV.
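As background for the point counts, one common convention defines DST-I on n-1 points through the matrix with entries sqrt(2/n) sin(pi j k / n) for j, k = 1, ..., n-1. With this scaling the matrix is symmetric and orthogonal, hence its own inverse. The sketch below builds that dense matrix and checks the property numerically. It uses hypothetical function names and does not implement the recursive sparse factorization from the paper.

```python
import math


def dst1_matrix(n):
    """Orthonormal DST-I matrix of size (n-1) x (n-1)."""
    scale = math.sqrt(2.0 / n)
    return [
        [scale * math.sin(math.pi * j * k / n) for k in range(1, n)]
        for j in range(1, n)
    ]


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def max_deviation_from_identity(m):
    """Largest absolute entry of m - I."""
    return max(
        abs(m[i][j] - (1.0 if i == j else 0.0))
        for i in range(len(m))
        for j in range(len(m))
    )
```

Because the scaled DST-I matrix is an involution, applying the same transform twice recovers the input, so a fast forward algorithm also serves as the inverse.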