637 research outputs found

    Fast evaluation of real and complex exponential sums

    Recently, the butterfly approximation scheme and hierarchical approximations have been proposed for the efficient computation of integral transforms with oscillatory and with asymptotically smooth kernels, respectively. Combining both approaches, we propose a fast Fourier-Laplace transform, which in particular allows for the fast evaluation of polynomials at nodes in the complex unit disk. All theoretical results are illustrated by numerical experiments.
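
    For reference, the computation being accelerated is the evaluation of $p(z_j) = \sum_{k=0}^{N-1} c_k z_j^k$ at $M$ nodes $z_j$ in the closed unit disk, which costs $O(MN)$ when done directly. The sketch below (function name and setup are illustrative, not from the paper) gives this direct baseline; on the unit circle the sum is a nonequispaced Fourier sum, while for $|z| < 1$ the terms decay geometrically, which is roughly the smoothness the hierarchical part exploits.

```python
import numpy as np

def eval_poly_unit_disk(coeffs, nodes):
    """Direct O(M*N) Horner evaluation of p(z) = sum_k c_k z^k at complex
    nodes; the baseline a fast Fourier-Laplace transform replaces with a
    near-linear-time approximation (illustrative, not the paper's code)."""
    result = np.zeros(len(nodes), dtype=complex)
    for c in coeffs[::-1]:              # highest-degree coefficient first
        result = result * nodes + c
    return result

rng = np.random.default_rng(0)
r, t = rng.uniform(0, 1, 100), rng.uniform(0, 1, 100)
nodes = r * np.exp(2j * np.pi * t)      # nodes z = r e^{2 pi i t} in the disk
coeffs = rng.standard_normal(64) + 1j * rng.standard_normal(64)
values = eval_poly_unit_disk(coeffs, nodes)
```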

    A Fast Butterfly Algorithm for the Computation of Fourier Integral Operators

    This paper is concerned with the fast computation of Fourier integral operators of the general form $\int_{\mathbb{R}^d} e^{2\pi i \Phi(x,k)} f(k)\, dk$, where $k$ is a frequency variable, $\Phi(x,k)$ is a phase function obeying a standard homogeneity condition, and $f$ is a given input. This is of interest because such fundamental computations are connected with the problem of finding numerical solutions to wave equations, and also frequently arise in applications including reflection seismology, curvilinear tomography and others. In two dimensions, when the input and output are sampled on $N \times N$ Cartesian grids, a direct evaluation requires $O(N^4)$ operations, which is often prohibitively expensive. This paper introduces a novel algorithm running in $O(N^2 \log N)$ time, i.e., with near-optimal computational complexity, whose overall structure follows that of the butterfly algorithm [Michielssen and Boag, IEEE Trans Antennas Propagat 44 (1996), 1086-1093]. Underlying this algorithm is a mathematical insight concerning the restriction of the kernel $e^{2\pi i \Phi(x,k)}$ to subsets of the time and frequency domains. Whenever these subsets obey a simple geometric condition, the restricted kernel is approximately low-rank; we propose constructing such low-rank approximations using a special interpolation scheme, which prefactors the oscillatory component, interpolates the remaining nonoscillatory part and, lastly, remodulates the outcome. A byproduct of this scheme is that the whole algorithm is highly efficient in terms of memory requirements. Numerical results demonstrate the performance and illustrate the empirical properties of this algorithm.
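
    The low-rank insight can be checked numerically: restrict the kernel to a pair of boxes whose width product is $O(1)$ and look at the singular value decay. The sketch below uses an illustrative phase (not one from the paper) and confirms that such a restricted block has small numerical rank, which is what the interpolation scheme exploits.

```python
import numpy as np

# Restrict exp(2*pi*i*Phi(x,k)) to a box pair with width product O(1) and
# inspect its singular values; the fast decay is the low-rank property
# behind the butterfly algorithm. Phase and box sizes are illustrative.
N = 1024
wx, wk = 1.0 / 32, 32.0                     # box widths with wx * wk = 1
x = np.linspace(0.5, 0.5 + wx, 64)          # box in space
k = np.linspace(N / 2, N / 2 + wk, 64)      # box in frequency

def Phi(x, k):
    # a simple phase, homogeneous of degree 1 in k (illustrative choice)
    return x[:, None] * k[None, :] * (1 + 0.3 * np.sin(2 * np.pi * x))[:, None]

K = np.exp(2j * np.pi * Phi(x, k))
s = np.linalg.svd(K, compute_uv=False)
print("numerical rank:", int(np.sum(s / s[0] > 1e-8)))   # small
```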

    A Multiscale Butterfly Algorithm for Multidimensional Fourier Integral Operators

    This paper presents an efficient multiscale butterfly algorithm for computing Fourier integral operators (FIOs) of the form $(\mathcal{L} f)(x) = \int_{\mathbb{R}^d} a(x,\xi)\, e^{2\pi i \Phi(x,\xi)} \hat{f}(\xi)\, d\xi$, where $\Phi(x,\xi)$ is a phase function, $a(x,\xi)$ is an amplitude function, and $f(x)$ is a given input. The frequency domain is hierarchically decomposed into a union of Cartesian coronas. The integral kernel $a(x,\xi) e^{2\pi i \Phi(x,\xi)}$ in each corona satisfies a special low-rank property that enables the application of a butterfly algorithm on the Cartesian phase-space grid. This leads to an algorithm with quasi-linear operation complexity and linear memory complexity. Unlike previous butterfly methods for FIOs, this new approach is simple and reduces the computational cost by avoiding extra coordinate transformations. Numerical examples in two and three dimensions are provided to demonstrate the practical advantages of the new algorithm.
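
    A minimal sketch of the corona decomposition, assuming dyadic annuli in the $\ell^\infty$ norm on a centered Cartesian grid (the paper's exact partition may differ in details):

```python
import numpy as np

def corona_masks(n):
    """Partition an n x n frequency grid centered at the origin into a
    center box plus dyadic Cartesian coronas {xi : lo <= |xi|_inf < 2 lo}.
    Illustrative helper, not code from the paper."""
    xi = np.arange(n) - n // 2
    r = np.maximum(np.abs(xi)[:, None], np.abs(xi)[None, :])   # |xi|_inf
    masks, lo = [r < 1], 1                                     # center box
    while lo < n // 2:
        hi = min(2 * lo, n // 2)
        upper = (r < hi) if hi < n // 2 else (r <= hi)         # close the outermost corona
        masks.append((r >= lo) & upper)
        lo = hi
    return masks

masks = corona_masks(64)
assert np.all(sum(m.astype(int) for m in masks) == 1)   # the masks tile the grid exactly
```

    The multiscale algorithm then masks $\hat{f}$ with each corona and runs a butterfly factorization per corona, so the kernel's oscillation is controlled scale by scale.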

    Butterfly-Net: Optimal Function Representation Based on Convolutional Neural Networks

    Deep networks, especially convolutional neural networks (CNNs), have been successfully applied in various areas of machine learning as well as to challenging problems in other scientific and engineering fields. This paper introduces Butterfly-Net, a low-complexity CNN with structured and sparse cross-channel connections, together with a Butterfly initialization strategy for a family of networks. Theoretical analysis of the approximation power of Butterfly-Net to the Fourier representation of input data shows that the error decays exponentially as the depth increases. Combining Butterfly-Net with a fully connected neural network, a large class of problems is proved to be well approximated with network complexity depending on the effective frequency bandwidth instead of the input dimension. Regular CNNs are covered as a special case in our analysis. Numerical experiments validate the analytical results on the approximation of Fourier kernels and energy functionals of the Poisson equation. Moreover, all experiments support that training from Butterfly initialization outperforms training from random initialization. Also, adding the remaining cross-channel connections, although it significantly increases the parameter count, does not much improve post-training accuracy and makes training more sensitive to the data distribution.
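
    The connectivity that Butterfly initialization encodes can be illustrated with the classical radix-2 factorization of the DFT into $\log_2 N$ sparse butterfly stages; Butterfly-Net hard-wires this pattern into its cross-channel connections. A didactic numpy sketch (not the paper's code):

```python
import numpy as np

def butterfly_factors(N):
    """Radix-2 Cooley-Tukey factorization: DFT_N = A_{log2 N} ... A_1 P,
    with each A_s block-diagonal (a 'butterfly' stage) and P a bit-reversal
    permutation. Didactic sketch of the sparsity pattern."""
    logN = int(np.log2(N))
    bitrev = [int(format(i, f"0{logN}b")[::-1], 2) for i in range(N)]
    P = np.eye(N)[bitrev]                       # (P x)[i] = x[bitrev(i)]
    factors = []
    for s in range(1, logN + 1):
        half = 2 ** (s - 1)                     # half the butterfly size at stage s
        D = np.diag(np.exp(-2j * np.pi * np.arange(half) / (2 * half)))
        block = np.block([[np.eye(half), D], [np.eye(half), -D]])
        factors.append(np.kron(np.eye(N // (2 * half)), block))
    return factors, P

N = 16
factors, P = butterfly_factors(N)
F = np.eye(N, dtype=complex)
for A in factors:                               # smallest butterflies act first
    F = A @ F
assert np.allclose(F @ P, np.fft.fft(np.eye(N)))   # recovers the DFT matrix
```

    Each stage carries only about $2N$ nonzeros, so the full product costs $O(N \log N)$, which is the complexity scaling Butterfly-Net inherits.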

    A Unified Framework for Oscillatory Integral Transform: When to use NUFFT or Butterfly Factorization?

    This paper concerns the fast evaluation of the matvec $g = Kf$ for $K \in \mathbb{C}^{N \times N}$, the discretization of the oscillatory integral transform $g(x) = \int K(x,\xi) f(\xi)\, d\xi$ with a kernel function $K(x,\xi) = \alpha(x,\xi) e^{2\pi i \Phi(x,\xi)}$, where $\alpha(x,\xi)$ is a smooth amplitude function and $\Phi(x,\xi)$ is a piecewise smooth phase function with $O(1)$ discontinuous points in $x$ and $\xi$. A unified framework is proposed to compute $Kf$ with $O(N \log N)$ time and memory complexity via either the non-uniform fast Fourier transform (NUFFT) or the butterfly factorization (BF), together with an $O(N)$ fast algorithm to determine whether NUFFT or BF is more suitable. This framework works in two cases: 1) explicit formulas for the amplitude and phase functions are known; 2) only indirect access to the amplitude and phase functions is available. In the case of indirect access, our main contributions are: 1) an $O(N \log N)$ algorithm for recovering the amplitude and phase functions, based on a new low-rank matrix recovery algorithm; 2) a new stable and nearly optimal BF with amplitude and phase functions in the form of a low-rank factorization (IBF-MAT), used to evaluate the matvec $Kf$. Numerical results are provided to demonstrate the effectiveness of the proposed framework.
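
    A toy version of the NUFFT-vs-BF decision: sample the residual kernel $e^{2\pi i (\Phi(x,\xi) - x\xi)}$ on a coarse grid and check its numerical rank. This is only in the spirit of the paper's $O(N)$ test; the sampling sizes, tolerance, and rank threshold below are illustrative.

```python
import numpy as np

def prefers_nufft(Phi, N, samples=64, tol=1e-4, max_rank=8):
    """Low residual rank suggests NUFFT (a Fourier kernel times a rank-r
    correction); high rank suggests a butterfly factorization. All
    thresholds are illustrative, not the paper's."""
    x = np.linspace(0, 1, samples)[:, None]
    xi = np.linspace(0, N, samples)[None, :]
    R = np.exp(2j * np.pi * (Phi(x, xi) - x * xi))   # residual kernel
    s = np.linalg.svd(R, compute_uv=False)
    rank = int(np.sum(s / s[0] > tol))
    return rank <= max_rank, rank

fourier_like = lambda x, xi: x * xi + np.sin(2 * np.pi * x) + np.log(1 + xi)
radon_like = lambda x, xi: xi * np.sqrt(1 + x**2)
print(prefers_nufft(fourier_like, N=256))   # (True, 1): residual is exactly rank one
print(prefers_nufft(radon_like, N=256))     # (False, ...): fall back to BF
```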

    Fast and backward stable transforms between spherical harmonic expansions and bivariate Fourier series

    A rapid transformation is derived between spherical harmonic expansions and their analogues in a bivariate Fourier series. The change of basis is described in two steps: firstly, expansions in normalized associated Legendre functions of all orders are converted to those of order zero and one; then, these intermediate expressions are re-expanded in trigonometric form. The first step proceeds with a butterfly factorization of the well-conditioned matrices of connection coefficients. The second step proceeds with fast orthogonal polynomial transforms via hierarchically off-diagonal low-rank matrix decompositions. Total pre-computation requires at best $\mathcal{O}(n^3 \log n)$ flops; and, asymptotically optimal execution time of $\mathcal{O}(n^2 \log^2 n)$ is rigorously proved via connection to Fourier integral operators.
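
    The second step, re-expansion in trigonometric form, can be seen in its simplest instance: a Legendre series is converted to a Chebyshev (cosine) series by sampling at Chebyshev points and applying a DCT-II built from the FFT. The dense $O(n^2)$ sampling below stands in for the fast structured transforms the paper actually uses; only the trigonometric step is fast here.

```python
import numpy as np
from numpy.polynomial import chebyshev, legendre

def legendre_to_chebyshev(c_leg):
    """Re-expand a Legendre series in the Chebyshev basis via values at
    Chebyshev points and an FFT-based DCT-II. Didactic sketch; the paper
    replaces the dense sampling with fast structured transforms."""
    n = len(c_leg)
    theta = (np.arange(n) + 0.5) * np.pi / n
    vals = legendre.legval(np.cos(theta), c_leg)     # O(n^2) sampling step
    v = np.concatenate([vals, vals[::-1]])           # even extension for the DCT
    c_cheb = np.real(np.exp(-1j * np.pi * np.arange(n) / (2 * n))
                     * np.fft.fft(v)[:n]) / n
    c_cheb[0] /= 2
    return c_cheb

c_leg = np.random.default_rng(1).standard_normal(32)
c_cheb = legendre_to_chebyshev(c_leg)
t = np.linspace(-1, 1, 7)
assert np.allclose(legendre.legval(t, c_leg), chebyshev.chebval(t, c_cheb))
```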

    A fast butterfly algorithm for the hyperbolic Radon transform

    We introduce a fast butterfly algorithm for the hyperbolic Radon transform commonly used in seismic data processing. For two-dimensional data, the algorithm runs in complexity $O(N^2 \log N)$, where $N$ is representative of the number of points in either dimension of data space or model space. Using a series of examples, we show that the proposed algorithm is significantly more efficient than conventional integration.
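
    For reference, the conventional integration being accelerated is the direct $O(N^3)$ evaluation of $m(\tau, q) = \sum_x d(\sqrt{\tau^2 + q^2 x^2}, x)$. A didactic sketch with nearest-sample lookup in time (discretization choices are illustrative):

```python
import numpy as np

def hyperbolic_radon_direct(d, t, x, tau, q):
    """Direct hyperbolic Radon transform of data d(t, x): O(N^3) for N
    points per axis, vs. O(N^2 log N) for the butterfly algorithm."""
    dt = t[1] - t[0]
    m = np.zeros((len(tau), len(q)))
    for i, t0 in enumerate(tau):
        for j, qj in enumerate(q):
            tt = np.sqrt(t0**2 + (qj * x) ** 2)          # hyperbolic moveout
            idx = np.rint((tt - t[0]) / dt).astype(int)  # nearest time sample
            ok = (idx >= 0) & (idx < len(t))             # stay inside the record
            m[i, j] = d[idx[ok], np.nonzero(ok)[0]].sum()
    return m

nt, nx = 128, 64
t, x = np.linspace(0, 2, nt), np.linspace(0, 3, nx)
d = np.random.default_rng(2).standard_normal((nt, nx))   # toy gather d(t, x)
m = hyperbolic_radon_direct(d, t, x, tau=t, q=np.linspace(0, 0.5, 32))
```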

    Randomized estimation of spectral densities of large matrices made accurate

    For a large Hermitian matrix $A \in \mathbb{C}^{N \times N}$, it is often the case that the only affordable operation is matrix-vector multiplication. In such cases, randomized methods are a powerful way to estimate the spectral density (or density of states) of $A$. However, randomized methods developed so far for estimating spectral densities only extract information from different random vectors independently, and the accuracy is therefore inherently limited to $\mathcal{O}(1/\sqrt{N_v})$, where $N_v$ is the number of random vectors. In this paper we demonstrate that the "$\mathcal{O}(1/\sqrt{N_v})$ barrier" can be overcome by taking advantage of the correlated information of random vectors when properly filtered by polynomials of $A$. Our method uses the fact that the estimation of the spectral density essentially requires the computation of the trace of a series of matrix functions that are numerically low rank. By repeatedly applying $A$ to the same set of random vectors and taking different linear combinations of the results, we can sweep through the entire spectrum of $A$ by building such low-rank decompositions at different parts of the spectrum. Under some assumptions, we demonstrate that a robust and efficient implementation of such a spectrum sweeping method can compute the spectral density accurately with $\mathcal{O}(N^2)$ computational cost and $\mathcal{O}(N)$ memory cost. Numerical results indicate that the new method can significantly outperform existing randomized methods in terms of accuracy. As an application, we demonstrate a way to accurately compute the trace of a smooth matrix function by carefully balancing the smoothness of the integrand and the regularized density of states using a deconvolution procedure.
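
    The baseline whose $\mathcal{O}(1/\sqrt{N_v})$ error the paper improves on is the Hutchinson-type estimator of the smoothed density $\phi_\sigma(E) = \frac{1}{N} \operatorname{tr} g_\sigma(EI - A)$. A minimal sketch (a dense eigendecomposition applies the Gaussian filter purely for clarity; practical codes use Chebyshev or Lanczos recurrences on matvecs):

```python
import numpy as np

rng = np.random.default_rng(0)
N, Nv, sigma = 200, 30, 0.05
A = rng.standard_normal((N, N))
A = (A + A.T) / np.sqrt(2 * N)           # Hermitian test matrix (semicircle spectrum)
lam, U = np.linalg.eigh(A)               # used here only to apply g(E*I - A)

def dos_estimate(E):
    """Hutchinson estimate of (1/N) tr g_sigma(E*I - A); the error decays
    only like 1/sqrt(Nv), which is the barrier the paper overcomes."""
    g = np.exp(-((E - lam) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    V = rng.choice([-1.0, 1.0], size=(N, Nv))        # random sign probes
    W = U @ (g[:, None] * (U.T @ V))                 # g(E*I - A) @ V
    return np.sum(V * W) / (Nv * N)                  # mean of v^T g v, over N

for E in np.linspace(-1.5, 1.5, 4):
    exact = np.mean(np.exp(-((E - lam) ** 2) / (2 * sigma**2))) / (sigma * np.sqrt(2 * np.pi))
    print(f"E={E:+.2f}  estimate={dos_estimate(E):.3f}  exact={exact:.3f}")
```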

    Monarch: Expressive Structured Matrices for Efficient and Accurate Training

    Large neural networks excel in many domains, but they are expensive to train and fine-tune. A popular approach to reduce their compute or memory requirements is to replace dense weight matrices with structured ones (e.g., sparse, low-rank, Fourier transform). These methods have not seen widespread adoption (1) in end-to-end training due to unfavorable efficiency-quality tradeoffs, and (2) in dense-to-sparse fine-tuning due to lack of tractable algorithms to approximate a given dense weight matrix. To address these issues, we propose a class of matrices (Monarch) that is hardware-efficient (they are parameterized as products of two block-diagonal matrices for better hardware utilization) and expressive (they can represent many commonly used transforms). Surprisingly, the problem of approximating a dense weight matrix with a Monarch matrix, though nonconvex, has an analytical optimal solution. These properties of Monarch matrices unlock new ways to train and fine-tune sparse and dense models. We empirically validate that Monarch can achieve favorable accuracy-efficiency tradeoffs in several end-to-end sparse training applications: speeding up ViT and GPT-2 training on ImageNet classification and Wikitext-103 language modeling by 2x with comparable model quality, and reducing the error on PDE solving and MRI reconstruction tasks by 40%. In sparse-to-dense training, with a simple technique called "reverse sparsification," Monarch matrices serve as a useful intermediate representation to speed up GPT-2 pretraining on OpenWebText by 2x without quality drop. The same technique brings 23% faster BERT pretraining than even the very optimized implementation from Nvidia that set the MLPerf 1.1 record. In dense-to-sparse fine-tuning, as a proof-of-concept, our Monarch approximation algorithm speeds up BERT fine-tuning on GLUE by 1.7x with comparable accuracy.
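
    The core structure is easy to sketch: for $N = m^2$, a Monarch-style matrix is a product of two block-diagonal factors interleaved with reshape-transpose permutations, so a matvec costs $O(N^{1.5})$ rather than $O(N^2)$. The parameterization below is one common way to write this structure, not necessarily the authors' exact convention.

```python
import numpy as np

def monarch_matvec(B1, B2, x):
    """y = M x with M = P^T blkdiag(B2) P blkdiag(B1), where P is the
    (m, m) reshape-transpose permutation and B1, B2 are stacks of m dense
    (m, m) blocks. O(N^1.5) work for N = m^2; a structural sketch only."""
    m = B1.shape[0]
    X = x.reshape(m, m)
    X = np.einsum("bij,bj->bi", B1, X)   # block-diagonal B1 on contiguous chunks
    X = X.T                              # permutation P (reshape-transpose)
    X = np.einsum("bij,bj->bi", B2, X)   # block-diagonal B2
    return X.T.reshape(-1)               # P^T, back to a flat vector

def blkdiag(B):                          # dense reference, for checking only
    m = B.shape[0]
    M = np.zeros((m * m, m * m))
    for b in range(m):
        M[b * m:(b + 1) * m, b * m:(b + 1) * m] = B[b]
    return M

rng = np.random.default_rng(3)
m = 4
B1, B2 = rng.standard_normal((2, m, m, m))
P = np.eye(m * m)[np.arange(m * m).reshape(m, m).T.reshape(-1)]
M = P.T @ blkdiag(B2) @ P @ blkdiag(B1)  # the dense Monarch-style matrix
x = rng.standard_normal(m * m)
assert np.allclose(M @ x, monarch_matvec(B1, B2, x))
```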