LU decomposition and Toeplitz decomposition of a neural network
It is well-known that any matrix $A$ has an LU decomposition. Less well-known is the fact that it has a 'Toeplitz decomposition' $A = T_1 T_2 \cdots T_r$, where the $T_i$'s are Toeplitz matrices. We will prove that any continuous function $f : \mathbb{R}^n \to \mathbb{R}^m$ has an approximation to arbitrary accuracy by a neural network that takes the form $L_1 \sigma_1 U_1 \sigma_2 L_2 \sigma_3 U_2 \cdots L_r \sigma_{2r-1} U_r$, i.e., where the weight matrices alternate between lower and upper triangular matrices, $\sigma_i(x) := \sigma(x - b_i)$ for some bias vector $b_i$, and the activation $\sigma$ may be chosen to be essentially any uniformly continuous nonpolynomial function. The same result also holds with Toeplitz matrices, i.e., $f \approx T_1 \sigma_1 T_2 \sigma_2 \cdots \sigma_{r-1} T_r$ to arbitrary accuracy, and likewise for Hankel matrices. A consequence of our Toeplitz result is a fixed-width universal approximation theorem for convolutional neural networks, which so far have only arbitrary-width versions. Since our results apply in particular to the case when $f$ is a general neural network, we may regard them as LU and Toeplitz decompositions of a neural network. The practical implication of our results is that one may vastly reduce the number of weight parameters in a neural network without sacrificing its power of universal approximation. We will present several experiments on real data sets to show that imposing such structures on the weight matrices sharply reduces the number of training parameters with almost no noticeable effect on test accuracy.
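The alternating triangular structure is straightforward to impose in practice by masking dense weight matrices. Below is a minimal NumPy sketch of a forward pass of the form $L_1 \sigma_1 U_1 \sigma_2 \cdots U_r$ described above; the function name, sizes, and the choice $\sigma = \tanh$ are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def triangular_net(x, weights, biases, sigma=np.tanh):
    """Forward pass with weight matrices alternating between lower and
    upper triangular, and shifted activations sigma_i(x) = sigma(x - b_i)
    between consecutive factors (illustrative sketch only)."""
    h = x
    for i, W in enumerate(weights):
        # Impose the structure by masking a dense matrix:
        # even positions lower triangular (L), odd positions upper (U).
        T = np.tril(W) if i % 2 == 0 else np.triu(W)
        h = T @ h
        if i < len(weights) - 1:  # no activation after the last factor
            h = sigma(h - biases[i])
    return h

rng = np.random.default_rng(0)
n = 8
weights = [rng.standard_normal((n, n)) for _ in range(4)]  # L1, U1, L2, U2
biases = [rng.standard_normal(n) for _ in range(3)]
y = triangular_net(rng.standard_normal(n), weights, biases)
```

Each triangular $n \times n$ factor carries only $n(n+1)/2$ free parameters instead of $n^2$, roughly halving the trainable weights per layer, which is the kind of parameter reduction the abstract refers to.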
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or
implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k))
floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
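The two-stage framework the abstract describes (randomized range finding, then deterministic post-processing of the reduced matrix) can be sketched compactly in NumPy. The oversampling parameter p and the number of power iterations q below are common illustrative choices, not prescriptions from the paper.

```python
import numpy as np

def randomized_svd(A, k, p=10, q=1, rng=None):
    """Sketch of the two-stage randomized framework: sample the range
    of A, compress to that subspace, then decompose deterministically.
    k: target rank; p: oversampling; q: power iterations (assumed values)."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    # Stage A: random sampling to capture most of the action of A.
    Omega = rng.standard_normal((n, k + p))   # Gaussian test matrix
    Y = A @ Omega
    for _ in range(q):                        # optional power iterations
        Y = A @ (A.T @ Y)                     # (simplified; no re-orthonormalization)
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis for the sample
    # Stage B: compress to the subspace, then a deterministic SVD.
    B = Q.T @ A                               # (k+p) x n reduced matrix
    Uhat, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Uhat
    return U[:, :k], s[:k], Vt[:k]
```

With a Gaussian test matrix the dominant cost is the dense products with the n × (k+p) sample matrix; the O(mn log(k)) count cited in (i) comes from replacing the Gaussian test matrix with a structured random matrix that admits fast multiplication.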
Support estimation of a sample space-time covariance matrix
The ensemble-optimum support for a sample space-time covariance matrix can be determined from the ground truth space-time covariance and the variance of the estimator. In this paper we provide approximations that permit the estimation of the sample-optimum support from the estimate itself, given a suitable detection threshold. In simulations, we provide some insight into the (in)sensitivity and dependencies of this threshold.
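The abstract does not spell out the estimator, but the basic operation it builds on, selecting a support from the sample estimate via a detection threshold, can be illustrated as follows; the hard-threshold rule and all names here are hypothetical stand-ins, not the paper's actual approximations.

```python
import numpy as np

def sample_covariance_support(X, tau):
    """Hypothetical illustration: estimate the support of a space-time
    covariance by hard-thresholding the sample covariance estimate.
    X: (num_snapshots, dim) data matrix; tau: detection threshold."""
    N = X.shape[0]
    R_hat = (X.conj().T @ X) / N      # sample covariance estimate
    support = np.abs(R_hat) > tau     # entries retained as nonzero
    return support, R_hat
```

In this toy version, the sensitivity to the threshold tau plays the role of the detection-threshold dependence the abstract investigates in simulation.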