Fast algorithms for spherical harmonic expansions, III
We accelerate the computation of spherical harmonic transforms, using what is
known as the butterfly scheme. This provides a convenient alternative to the
approach taken in the second paper from this series on "Fast algorithms for
spherical harmonic expansions." The requisite precomputations become manageable
when organized as a "depth-first traversal" of the program's control-flow
graph, rather than as the perhaps more natural "breadth-first traversal" that
processes one-by-one each level of the multilevel procedure. We illustrate the
results via several numerical examples.
Comment: 14 pages, 1 figure, 6 tables
A parallel butterfly algorithm
The butterfly algorithm is a fast algorithm which approximately evaluates a
discrete analogue of the integral transform \int K(x,y) g(y) dy at large
numbers of target points when the kernel, K(x,y), is approximately low-rank
when restricted to subdomains satisfying a certain simple geometric condition.
In d dimensions with O(N^d) quasi-uniformly distributed source and target
points, when each appropriate submatrix of K is approximately rank-r, the
running time of the algorithm is at most O(r^2 N^d log N). A parallelization of
the butterfly algorithm is introduced which, assuming a message latency of
\alpha and per-process inverse bandwidth of \beta, executes in at most O(r^2
N^d/p log N + (\beta r N^d/p + \alpha) log p) time using p processes. This
parallel algorithm was then instantiated in the form of the open-source
DistButterfly library for the special case where K(x,y)=exp(i \Phi(x,y)), where
\Phi(x,y) is a black-box, sufficiently smooth, real-valued phase function.
Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for
important classes of phase functions. Using quasi-uniform sources, hyperbolic
Radon transforms and an analogue of a 3D generalized Radon transform were
observed to strong-scale from 1-node/16-cores up to
1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively.
Comment: To appear in SIAM Journal on Scientific Computing
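The running-time bound quoted in this abstract can be explored numerically. The following is a minimal sketch, not code from the DistButterfly library: it evaluates O(r^2 N^d/p log N + (\beta r N^d/p + \alpha) log p) with every hidden constant set to 1 and base-2 logarithms, both of which are assumptions. The function names and the parameter values in the demo loop are hypothetical, and the efficiencies it prints come from the model only, not from the Blue Gene/Q measurements.

```python
import math


def butterfly_time_bound(r, n, d, p, alpha, beta):
    """Evaluate the abstract's bound with all hidden constants set to 1.

    r: rank of each compressed block, n: points per dimension,
    d: dimension, p: number of processes,
    alpha: message latency, beta: per-process inverse bandwidth.
    """
    points_per_process = n**d / p
    compute = r**2 * points_per_process * math.log2(n)
    communicate = (beta * r * points_per_process + alpha) * math.log2(p)
    return compute + communicate


def model_efficiency(r, n, d, p, alpha, beta, p_base=1):
    """Strong-scaling efficiency of the model relative to p_base processes."""
    t_base = butterfly_time_bound(r, n, d, p_base, alpha, beta)
    t_p = butterfly_time_bound(r, n, d, p, alpha, beta)
    return (t_base * p_base) / (t_p * p)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the trend.
    for p in (16, 256, 4096, 16384):
        eff = model_efficiency(r=8, n=1024, d=2, p=p,
                               alpha=1e3, beta=1.0, p_base=16)
        print(f"p={p:6d}  model efficiency={eff:.3f}")
```

With p = 1 the communication term vanishes because log p = 0, so the model reduces to the sequential bound O(r^2 N^d log N) stated earlier in the abstract.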
Butterfly Factorization
The paper introduces the butterfly factorization as a data-sparse
approximation for the matrices that satisfy a complementary low-rank property.
The factorization can be constructed efficiently if either fast algorithms for
applying the matrix and its adjoint are available or the entries of the matrix
can be sampled individually. For an N \times N matrix, the resulting
factorization is a product of O(log N) sparse matrices, each with O(N)
non-zero entries. Hence, it can be applied rapidly in O(N log N) operations.
Numerical results are provided to demonstrate the effectiveness of the
butterfly factorization and its construction algorithms.
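A classical exact instance of this kind of sparse-factor structure is the radix-2 FFT, which writes the n x n DFT matrix as log2(n) sparse factors with 2n non-zero entries each, applied after a bit-reversal permutation. The sketch below builds those factors in pure Python and checks them against a dense DFT. It illustrates the structure only: it is not the paper's construction algorithm, which produces approximate factors for general complementary low-rank matrices, and all function names are hypothetical.

```python
import cmath


def butterfly_factors(n):
    """Return (factors, perm) with DFT_n x = A_1 A_2 ... A_L (x permuted by perm).

    Each factor is a sparse matrix stored as {(row, col): value}.
    n must be a power of two, at least 2.
    """
    assert n >= 2 and n & (n - 1) == 0
    factors = []
    size = n
    while size >= 2:
        half = size // 2
        w = cmath.exp(-2j * cmath.pi / size)
        factor = {}
        for start in range(0, n, size):
            for k in range(half):
                top, bot = start + k, start + k + half
                twiddle = w ** k
                factor[(top, top)] = 1
                factor[(top, bot)] = twiddle
                factor[(bot, top)] = 1
                factor[(bot, bot)] = -twiddle
        factors.append(factor)
        size = half
    bits = n.bit_length() - 1
    perm = [int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)]
    return factors, perm


def apply_sparse(factor, vec):
    """Sparse matrix-vector product; cost is proportional to the non-zero count."""
    out = [0j] * len(vec)
    for (i, j), value in factor.items():
        out[i] += value * vec[j]
    return out


def butterfly_dft(x):
    """Apply the factors right to left, starting from the bit-reversed input."""
    factors, perm = butterfly_factors(len(x))
    vec = [x[p] for p in perm]
    for factor in reversed(factors):
        vec = apply_sparse(factor, vec)
    return vec


def dense_dft(x):
    """Reference O(n^2) DFT used only to check the factorization."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]
```

For n = 16, `butterfly_factors(16)` returns 4 factors with 32 stored entries each, so one application touches 128 entries instead of the 256 of the dense matrix; the gap widens as n grows.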
A butterfly-based direct solver using hierarchical LU factorization for Poggio-Miller-Chang-Harrington-Wu-Tsai equations
A butterfly-based hierarchical LU factorization scheme for solving the PMCHWT equations for analyzing scattering from homogeneous dielectric objects is presented. The proposed solver judiciously re-orders the discretized integral operator and butterfly-compresses blocks in the operator and its LU factors. The observed memory and CPU complexities scale as O(N log^2 N) and O(N^1.5 log N), respectively. The proposed solver is applied to the analysis of scattering from several large-scale dielectric objects.
Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/143676/1/mop31166.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/143676/2/mop31166_am.pd