27 research outputs found

    Fast algorithms for spherical harmonic expansions, III

    We accelerate the computation of spherical harmonic transforms, using what is known as the butterfly scheme. This provides a convenient alternative to the approach taken in the second paper from this series on "Fast algorithms for spherical harmonic expansions." The requisite precomputations become manageable when organized as a "depth-first traversal" of the program's control-flow graph, rather than as the perhaps more natural "breadth-first traversal" that processes one-by-one each level of the multilevel procedure. We illustrate the results via several numerical examples. Comment: 14 pages, 1 figure, 6 tables

    A parallel butterfly algorithm

    The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform \int K(x,y) g(y) dy at large numbers of target points when the kernel, K(x,y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of \alpha and per-process inverse bandwidth of \beta, executes in at most O(r^2 (N^d/p) log N + (\beta r N^d/p + \alpha) log p) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x,y) = exp(i \Phi(x,y)), where \Phi(x,y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a 3D generalized Radon transform were observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. Comment: To appear in SIAM Journal on Scientific Computing
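The low-rank condition in the abstract above can be checked numerically on a toy case. The following is a minimal sketch, not code from the paper or from DistButterfly: it assumes the Fourier phase \Phi(x,y) = 2\pi x y in one dimension, and the names `kernel`, `numerical_rank`, the block offsets, and the tolerance are all illustrative choices. It compares a 64 x 64 submatrix whose row and column boxes have width product 1 with a 64 x 64 submatrix whose columns span the whole source domain.

```python
import numpy as np

def numerical_rank(block, tol=1e-8):
    """Number of singular values above tol relative to the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

def kernel(xi, yj):
    """Assumed toy kernel K(x, y) = exp(2*pi*i*x*y)."""
    return np.exp(2j * np.pi * np.outer(xi, yj))

N = 4096
x = np.arange(N)        # target points: integer frequencies 0..N-1
y = np.arange(N) / N    # source points: uniform grid on [0, 1)
w = 64                  # block size, sqrt(N)

# Rows cover an x-interval of width 64, columns a y-interval of width 64/N,
# so the product of the two widths is 1.
admissible = kernel(x[1000:1000 + w], y[2000:2000 + w])

# Same rows, but the 64 columns are spread over all of [0, 1):
# the product of the widths is 64, and the block is a scaled DFT matrix.
wide = kernel(x[1000:1000 + w], y[::N // w])

rank_admissible = numerical_rank(admissible)
rank_wide = numerical_rank(wide)
print(rank_admissible, rank_wide)
```

Under these assumptions the first block has a numerical rank far below its size, while the second is full rank, which is the distinction the geometric condition is meant to capture.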

    Butterfly Factorization

    The paper introduces the butterfly factorization as a data-sparse approximation for the matrices that satisfy a complementary low-rank property. The factorization can be constructed efficiently if either fast algorithms for applying the matrix and its adjoint are available or the entries of the matrix can be sampled individually. For an N \times N matrix, the resulting factorization is a product of O(log N) sparse matrices, each with O(N) non-zero entries. Hence, it can be applied rapidly in O(N log N) operations. Numerical results are provided to demonstrate the effectiveness of the butterfly factorization and its construction algorithms.
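The structure described above, a product of O(log N) sparse factors with O(N) non-zeros each, has a familiar exact special case in the radix-2 FFT. The sketch below illustrates only that special case, not the paper's construction algorithm, which is approximate and targets general complementary low-rank matrices. The helper names `butterfly_block`, `fft_factors`, and `apply_factors` are hypothetical, and the factors are stored densely for readability where a real implementation would use sparse storage.

```python
import numpy as np

def butterfly_block(m):
    """m x m radix-2 butterfly [[I, D], [I, -D]] with D = diag(exp(-2*pi*i*k/m))."""
    half = m // 2
    twiddle = np.diag(np.exp(-2j * np.pi * np.arange(half) / m))
    eye = np.eye(half)
    return np.block([[eye, twiddle], [eye, -twiddle]])

def fft_factors(n):
    """Return (factors, perm) such that DFT_n @ x equals
    factors[0] @ factors[1] @ ... @ factors[-1] @ x[perm], for n a power of two."""
    levels = n.bit_length() - 1
    assert n >= 2 and n == 1 << levels, "n must be a power of two"
    factors = []
    m = n
    while m >= 2:
        # Block-diagonal factor: n/m copies of the m x m butterfly.
        factors.append(np.kron(np.eye(n // m), butterfly_block(m)))
        m //= 2
    # Bit-reversal permutation of the input indices.
    perm = np.array([int(format(i, f"0{levels}b")[::-1], 2) for i in range(n)])
    return factors, perm

def apply_factors(factors, perm, x):
    """Apply the factorization right to left: permute, then smallest butterflies first."""
    y = x[perm]
    for f in reversed(factors):
        y = f @ y
    return y

n = 64
factors, perm = fft_factors(n)
rng = np.random.default_rng(0)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = apply_factors(factors, perm, x)
nonzeros = [int(np.count_nonzero(f)) for f in factors]
print(len(factors), nonzeros[0], np.allclose(y, np.fft.fft(x)))
```

Here each of the log2(n) factors holds exactly 2n non-zero entries, so applying all of them with sparse storage would cost O(n log n) operations, matching the count quoted in the abstract.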

    A butterfly‐based direct solver using hierarchical LU factorization for Poggio‐Miller‐Chang‐Harrington‐Wu‐Tsai equations

    A butterfly‐based hierarchical LU factorization scheme for solving the PMCHWT equations for analyzing scattering from homogeneous dielectric objects is presented. The proposed solver judiciously re‐orders the discretized integral operator and butterfly‐compresses blocks in the operator and its LU factors. The observed memory and CPU complexities scale as O(N log^2 N) and O(N^{1.5} log N), respectively. The proposed solver is applied to the analysis of scattering from several large‐scale dielectric objects. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/143676/1/mop31166.pdf https://deepblue.lib.umich.edu/bitstream/2027.42/143676/2/mop31166_am.pd