Computation- and Space-Efficient Implementation of SSA
The computational complexity of the different steps of basic SSA is
discussed. It is shown that using general-purpose "black-box" routines (e.g.,
those found in packages like LAPACK) wastes considerable time, since the
special Hankel structure of the trajectory matrix is not taken into account.
We outline several state-of-the-art algorithms (for example, Lanczos-based
truncated SVD) that can be modified to exploit the structure of the trajectory
matrix. The key components here are the Hankel matrix-vector multiplication
and the hankelization operator. We show that both can be computed efficiently
by means of the Fast Fourier Transform. These methods reduce the worst-case
computational complexity from O(N^3) to O(k N log(N)), where N is the series
length and k is the number of desired eigentriples.
Comment: 27 pages, 8 figures
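As a concrete illustration, here is a minimal NumPy sketch of the FFT-based
Hankel matrix-vector product (the hankelization operator admits a similar FFT
formulation); the function name and interface are ours, not the paper's:

    import numpy as np

    def hankel_matvec(f, v):
        # Compute y = H @ v, where H[i, j] = f[i + j] is the L x K
        # trajectory (Hankel) matrix of the series f, len(f) = L + K - 1.
        # FFT-based convolution costs O(N log N) instead of O(L * K).
        N, K = len(f), len(v)
        L = N - K + 1
        n = N + K - 1  # pad length at which circular convolution is linear
        # y[i] = sum_j f[i + j] * v[j] is a cross-correlation, i.e. the
        # convolution of f with the reversed vector v.
        conv = np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(v[::-1], n), n)
        return conv[K - 1 : K - 1 + L]

A Lanczos-based truncated SVD touches the trajectory matrix only through such
products, which is what makes the overall O(k N log(N)) complexity attainable.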
SPAN: A Stochastic Projected Approximate Newton Method
Second-order optimization methods have desirable convergence properties.
However, the exact Newton method requires expensive computation for the Hessian
and its inverse. In this paper, we propose SPAN, a novel approximate and fast
Newton method. SPAN computes the inverse of the Hessian matrix via low-rank
approximation and stochastic Hessian-vector products. Our experiments on
multiple benchmark datasets demonstrate that SPAN outperforms existing
first-order and second-order optimization methods in terms of convergence
wall-clock time. Furthermore, we provide a theoretical analysis of the
per-iteration complexity, the approximation error, and the convergence rate.
Both the theoretical analysis and the experimental results show that our
proposed method achieves a better trade-off between convergence rate and
per-iteration efficiency.
Comment: Appeared at AAAI 2020, 25 pages, 6 figures
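The abstract does not spell out SPAN's exact update, but the general shape of
such methods is easy to sketch. The following is an illustrative NumPy
fragment, assuming a user-supplied (possibly stochastic) Hessian-vector
product hvp; all names and the damping scheme here are our assumptions, not
the paper's:

    import numpy as np

    def approx_newton_step(hvp, grad, k=10, damping=1e-3, seed=0):
        # Build a rank-k approximation H ~ V diag(lam) V^T from k
        # Hessian-vector probes, then apply a damped inverse on that
        # subspace (and the identity off it) to the gradient.
        d = grad.shape[0]
        rng = np.random.default_rng(seed)
        # Each probe column costs a single HVP evaluation.
        Y = np.column_stack([hvp(w) for w in rng.standard_normal((k, d))])
        Q, _ = np.linalg.qr(Y)                  # basis for the captured subspace
        B = Q.T @ np.column_stack([hvp(q) for q in Q.T])
        lam, U = np.linalg.eigh((B + B.T) / 2)  # small k x k eigenproblem
        V = Q @ U
        c = V.T @ grad
        return (grad - V @ c) + V @ (c / (np.abs(lam) + damping))

The step would then be used as w <- w - eta * approx_newton_step(hvp, grad).
Curvature is inverted only on the low-rank subspace the probes capture, which
is what keeps the per-iteration cost close to that of a first-order method.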
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
Low-rank matrix approximations, such as the truncated singular value
decomposition and the rank-revealing QR decomposition, play a central role in
data analysis and scientific computing. This work surveys and extends recent
research which demonstrates that randomization offers a powerful tool for
performing low-rank matrix approximation. These techniques exploit modern
computational architectures more fully than classical methods and open the
possibility of dealing with truly massive data sets.
This paper presents a modular framework for constructing randomized
algorithms that compute partial matrix decompositions. These methods use
random sampling to identify a subspace that captures most of the action of a
matrix. The input matrix is then compressed—either explicitly or
implicitly—to this subspace, and the reduced matrix is manipulated
deterministically to obtain the desired low-rank factorization. In many cases,
this approach beats its classical competitors in terms of accuracy,
robustness, and/or speed. These claims are supported by extensive numerical
experiments and a detailed error analysis.
The specific benefits of randomized techniques depend on the computational
environment. Consider the model problem of finding the k dominant components
of the singular value decomposition of an m × n matrix. (i) For a dense input
matrix, randomized algorithms require O(mn log(k)) floating-point operations
(flops), in contrast to O(mnk) for classical algorithms. (ii) For a sparse
input matrix, the flop count matches classical Krylov subspace methods, but
the randomized approach is more robust and can easily be reorganized to
exploit multiprocessor architectures. (iii) For a matrix that is too large to
fit in fast memory, the randomized techniques require only a constant number
of passes over the data, as opposed to O(k) passes for classical algorithms.
In fact, it is sometimes possible to perform matrix approximation with a
single pass over the data.
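The two-stage structure described above is compact enough to state in code.
Here is a minimal NumPy sketch using a Gaussian test matrix (which costs
O(mn(k+p)) flops; the O(mn log(k)) figure is attained with structured test
matrices such as a subsampled random Fourier transform):

    import numpy as np

    def randomized_svd(A, k, p=10, q=1, seed=0):
        # Stage A: random sampling identifies an orthonormal basis Q
        # that captures most of the action of A (p = oversampling,
        # q = power iterations for slowly decaying spectra).
        rng = np.random.default_rng(seed)
        m, n = A.shape
        Y = A @ rng.standard_normal((n, k + p))
        Q, _ = np.linalg.qr(Y)
        for _ in range(q):
            Q, _ = np.linalg.qr(A.T @ Q)
            Q, _ = np.linalg.qr(A @ Q)
        # Stage B: compress A to the subspace and solve the small
        # problem deterministically.
        B = Q.T @ A                      # (k + p) x n reduced matrix
        Uh, s, Vt = np.linalg.svd(B, full_matrices=False)
        return Q @ Uh[:, :k], s[:k], Vt[:k]

The factors satisfy A ~ U diag(s) V^T, with the approximation error governed
by the singular values of A beyond the k-th, up to a modest oversampling
factor.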
Parallel time-dependent variational principle algorithm for matrix product states
Combining the time-dependent variational principle (TDVP) algorithm with the
parallelization scheme introduced by Stoudenmire and White for the density
matrix renormalization group (DMRG), we present the first parallel matrix
product state (MPS) algorithm capable of time evolving one-dimensional (1D)
quantum lattice systems with long-range interactions. We benchmark the accuracy
and performance of the algorithm by simulating quenches in the long-range Ising
and XY models. We show that our code scales well up to 32 processes, with
parallel efficiencies as high as 86%. Finally, we calculate the dynamical
correlation function of a 201-site Heisenberg XXX spin chain with long-range
interactions, which is challenging to compute sequentially. These results pave
the way for the application of tensor networks to increasingly complex
many-body systems.
Comment: Version accepted for publication in Phys. Rev. B. Text clarified and
references updated. Main text: 11 pages, 13 figures. Appendices: 3 pages, 3
figures. Supplemental material: 4 pages, 3 figures