1,067 research outputs found
Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity
Dense kernel matrices Θ ∈ R^(N×N) obtained from N point evaluations of a covariance function G at locations {x_i}_{1≤i≤N} arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset S ⊂ {1,…,N}×{1,…,N}, with #S = O(N log(N) log^d(N/ϵ)), such that the zero fill-in incomplete Cholesky factorization of the sparse matrix Θ_(i,j)1_((i,j)∈S) is an ϵ-approximation of Θ. This factorization can provably be obtained in complexity O(N log(N) log^d(N/ϵ)) in space and O(N log^2(N) log^(2d)(N/ϵ)) in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that d can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the x_i and does not require an analytic representation of G. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity O(N log(N) log^d(N/ϵ)) in space and O(N log^2(N) log^(2d)(N/ϵ)) in time, improving upon the state of the art for general elliptic operators.
Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity
Dense kernel matrices Θ ∈ R^(N×N) obtained from point evaluations of a covariance function G at locations {x_i}_{1≤i≤N} arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset S ⊂ {1,…,N}×{1,…,N}, with #S = O(N log(N) log^d(N/ϵ)), such that the zero fill-in block-incomplete Cholesky decomposition of Θ_(i,j)1_((i,j)∈S) is an ϵ-approximation of Θ. This block-factorization can provably be obtained in O(N log^2(N)(log(1/ϵ)+log^2(N))^(4d+1)) complexity in time. Numerical evidence further suggests that element-wise Cholesky decomposition with the same ordering constitutes an O(N log^2(N) log^(2d)(N/ϵ)) solver. The algorithm only needs to know the spatial configuration of the x_i and does not require an analytic representation of G. Furthermore, an approximate PCA with optimal rate of convergence in the operator norm can easily be read off from this decomposition. Hence, by using only subsampling and the incomplete Cholesky decomposition, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky decomposition we also obtain a near-linear-time solver for elliptic PDEs.
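To make the construction in the two abstracts above concrete, the following NumPy sketch builds a dense exponential kernel matrix, zeroes it outside a distance-based sparsity pattern S, and runs a zero fill-in incomplete Cholesky factorization restricted to S. It only illustrates the mechanics: the kernel, length scale, radius, and point count are arbitrary choices, and the accuracy and complexity guarantees of the paper rely on a maximin (coarse-to-fine) ordering with index-dependent radii that is not reproduced here.

import numpy as np

rng = np.random.default_rng(0)
N = 400
x = rng.uniform(size=(N, 2))                      # sampling locations in [0, 1]^2

# dense exponential (Matern-1/2) kernel matrix with a small nugget (illustrative choices)
dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
theta = np.exp(-dist / 0.1) + 1e-6 * np.eye(N)

# sparsity pattern S: keep entries whose locations are within a fixed radius rho
rho = 0.3
S = dist <= rho

# zero fill-in incomplete Cholesky: only entries inside the pattern are stored and updated
L = np.tril(np.where(S, theta, 0.0))
for k in range(N):
    # guard the pivot: IC(0) can break down for general SPD matrices, so clamp for this toy demo
    L[k, k] = np.sqrt(max(L[k, k], 1e-12))
    L[k + 1:, k] = np.where(S[k + 1:, k], L[k + 1:, k] / L[k, k], 0.0)
    for j in range(k + 1, N):
        if S[j, k]:
            L[j:, j] -= np.where(S[j:, j], L[j:, k] * L[j, k], 0.0)

print("fill of S:", S.mean())
print("relative error of L L^T:", np.linalg.norm(L @ L.T - theta) / np.linalg.norm(theta))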
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.
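As a concrete aside on the data-uploading question mentioned above, the snippet below sketches amplitude encoding, one standard proposal: a classical vector of length at most 2^n is padded and normalised to serve as the amplitude vector of an n-qubit state, and the overlap between two encoded states (the quantity a swap test would estimate on hardware) then acts as a similarity value. Plain NumPy state vectors are used purely for illustration; this is not claimed to be the encoding the review settles on.

import numpy as np

def amplitude_encode(x, n_qubits):
    # pad and normalise a classical vector into a valid 2^n-dimensional quantum state vector
    psi = np.zeros(2 ** n_qubits)
    psi[: len(x)] = x
    return psi / np.linalg.norm(psi)

a = amplitude_encode(np.array([3.0, 1.0, 2.0]), n_qubits=2)
b = amplitude_encode(np.array([1.0, 0.5, 0.0, 2.0]), n_qubits=2)

# squared overlap |<a|b>|^2: what a swap test estimates, usable as a kernel value between data points
print("squared overlap:", (a @ b) ** 2)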
Kernel Methods are Competitive for Operator Learning
We present a general kernel-based framework for learning operators between
Banach spaces along with a priori error analysis and comprehensive numerical
comparisons with popular neural net (NN) approaches such as Deep Operator Net
(DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We
consider the setting where the input/output spaces of the target operator
G†: U → V are reproducing kernel Hilbert spaces (RKHS), the data comes in the
form of partial observations φ(u_i), ϕ(v_i) of input/output functions
v_i = G†(u_i) (i = 1, …, N), and the measurement operators φ: U → R^n and
ϕ: V → R^m are linear. Writing ψ: R^n → U and χ: R^m → V for the optimal
recovery maps associated with φ and ϕ, we approximate G† with Ḡ = χ ∘ f̄ ∘ φ,
where f̄ is an optimal recovery approximation of f† := ϕ ∘ G† ∘ ψ: R^n → R^m.
We show that, even when using vanilla kernels (e.g., linear or Matérn), our
approach is competitive in terms of
cost-accuracy trade-off and either matches or beats the performance of NN
methods on a majority of benchmarks. Additionally, our framework offers several
advantages inherited from kernel methods: simplicity, interpretability,
convergence guarantees, a priori error estimates, and Bayesian uncertainty
quantification. As such, it can serve as a natural benchmark for operator
learning.
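The recipe above can be sketched in a few dozen lines of NumPy. The setup below (the antiderivative operator on [0, 1] as target, point-evaluation measurements, a squared-exponential kernel, and all grid sizes and length scales) is an assumed toy configuration rather than one of the paper's benchmarks: f_bar is a vector-valued kernel ridge regressor between observation vectors, and chi interpolates the predicted point values back to a function.

import numpy as np

rng = np.random.default_rng(1)
n_grid = 32                                   # measurement points; phi, varphi = point evaluations
xs = np.linspace(0.0, 1.0, n_grid)

def random_inputs(num):
    # random smooth inputs u(x) = sum_k a_k sin(k*pi*x); an illustrative input distribution
    a = rng.normal(size=(num, 4)) / np.arange(1, 5)
    return sum(a[:, [k]] * np.sin((k + 1) * np.pi * xs) for k in range(4))

def true_operator(U):
    # toy target operator: v(x) = integral of u from 0 to x, via the trapezoid rule
    dx = xs[1] - xs[0]
    incr = (U[:, 1:] + U[:, :-1]) / 2.0 * dx
    return np.concatenate([np.zeros((U.shape[0], 1)), np.cumsum(incr, axis=1)], axis=1)

def rbf(A, B, ell):
    # squared-exponential kernel between the rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ell ** 2))

# training data: partial observations (u_i(xs), v_i(xs)) of input/output function pairs
U_train = random_inputs(200)
V_train = true_operator(U_train)

# f_bar : R^n -> R^m, vector-valued kernel ridge regression between observation vectors
K = rbf(U_train, U_train, ell=5.0) + 1e-6 * np.eye(len(U_train))
coef = np.linalg.solve(K, V_train)

def f_bar(u_obs):
    return rbf(np.atleast_2d(u_obs), U_train, ell=5.0) @ coef

# chi : R^m -> functions, kernel interpolation of the predicted point values back to a function
Kx = rbf(xs[:, None], xs[:, None], ell=0.1) + 1e-8 * np.eye(n_grid)

def chi(v_obs, x_new):
    return rbf(x_new[:, None], xs[:, None], ell=0.1) @ np.linalg.solve(Kx, v_obs)

# composed surrogate G_bar = chi o f_bar o phi, evaluated on a fresh input function
u_test = random_inputs(1)
v_pred = f_bar(u_test)[0]
v_true = true_operator(u_test)[0]
print("relative error on the grid:", np.linalg.norm(v_pred - v_true) / np.linalg.norm(v_true))
print("G_bar(u)(0.5) =", chi(v_pred, np.array([0.5]))[0])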
Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems
In this paper we review basic and emerging models and associated algorithms
for large-scale tensor networks, especially Tensor Train (TT) decompositions
using novel mathematical and graphical representations. We discuss the concept
of tensorization (i.e., creating very high-order tensors from lower-order
original data) and super compression of data achieved via quantized tensor
train (QTT) networks. The purpose of tensorization and quantization is to
achieve, via low-rank tensor approximations, "super" compression and a
meaningful, compact representation of structured data. The main objective of
this paper is to show how tensor networks can be used to solve a wide class of
big data optimization problems (that are far from tractable by classical
numerical methods) by applying tensorization and performing all operations
using relatively small-size matrices and tensors, together with iteratively
optimized and approximate tensor contractions.
Keywords: Tensor networks, tensor train (TT) decompositions, matrix product
states (MPS), matrix product operators (MPO), basic tensor operations,
tensorization, distributed representation of data, optimization problems for
very large-scale problems: generalized eigenvalue decomposition (GEVD),
PCA/SVD, canonical correlation analysis (CCA).
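As a minimal illustration of tensorization and quantized tensor trains, the sketch below reshapes a length-2^14 vector into a 2 x 2 x ... x 2 tensor and compresses it by sequential truncated SVDs (the TT-SVD idea). The test signal, tolerance, and dimensions are arbitrary choices for demonstration and are not taken from the paper.

import numpy as np

d = 14                                            # vector of length 2^14 = 16384
t = np.linspace(0.0, 1.0, 2 ** d)
signal = np.sin(40 * np.pi * t) * np.exp(-3 * t)  # smooth, highly compressible test data

def tt_svd(tensor, tol=1e-10):
    """TT cores of `tensor` via sequential truncated SVDs."""
    dims = tensor.shape
    cores, r_prev, rest = [], 1, tensor
    for k in range(len(dims) - 1):
        mat = rest.reshape(r_prev * dims[k], -1)
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r = max(1, int(np.sum(s > tol * s[0])))   # keep singular values above the tolerance
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        rest = np.diag(s[:r]) @ Vt[:r, :]
        r_prev = r
    cores.append(rest.reshape(r_prev, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    # contract the cores back into the full tensor, then flatten to a vector
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape(-1)

cores = tt_svd(signal.reshape([2] * d))
approx = tt_reconstruct(cores)
print("TT ranks:", [c.shape[2] for c in cores[:-1]])
print("parameters: full =", signal.size, ", TT =", sum(c.size for c in cores))
print("relative error:", np.linalg.norm(approx - signal) / np.linalg.norm(signal))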
Certified and fast computations with shallow covariance kernels
Many techniques for data science and uncertainty quantification demand
efficient tools to handle Gaussian random fields, which are defined in terms of
their mean functions and covariance operators. Recently, parameterized Gaussian
random fields have gained increased attention, due to their higher degree of
flexibility. However, especially if the random field is parameterized through
its covariance operator, classical random field discretization techniques fail
or become inefficient. In this work we introduce and analyze a new and
certified algorithm for the low-rank approximation of a parameterized family of
covariance operators which represents an extension of the adaptive cross
approximation method for symmetric positive definite matrices. The algorithm
relies on an affine linear expansion of the covariance operator with respect to
the parameters, which needs to be computed in a preprocessing step using, e.g.,
the empirical interpolation method. We discuss and test our new approach for
isotropic covariance kernels, such as Matérn kernels. The numerical results
demonstrate the advantages of our approach in terms of computational time and
confirm that the proposed algorithm provides the basis of a fast sampling
procedure for parameter-dependent Gaussian random fields.
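A closely related non-parametric building block, pivoted Cholesky (which performs ACA-style greedy low-rank approximation for symmetric positive definite matrices using only on-demand column evaluations), is sketched below for a Matérn-3/2 covariance matrix. The length scale, point count, and tolerance are illustrative, and the paper's certified, parameter-dependent extension via an affine expansion in the parameters is not reproduced.

import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(size=(1000, 2))               # sample locations (illustrative)

def matern32_column(j, ell=0.5):
    # j-th column of the Matern-3/2 covariance matrix with length scale ell
    r = np.linalg.norm(X - X[j], axis=1)
    return (1.0 + np.sqrt(3.0) * r / ell) * np.exp(-np.sqrt(3.0) * r / ell)

def pivoted_cholesky(n, get_column, diag, tol=1e-6, max_rank=300):
    """Greedy low-rank factor L (n x k) with L @ L.T approximating the SPD matrix."""
    d = diag.astype(float).copy()
    L = np.zeros((n, max_rank))
    for k in range(max_rank):
        j = int(np.argmax(d))                 # pivot: largest remaining diagonal entry
        if d[j] <= tol * diag.max():
            return L[:, :k]
        col = get_column(j) - L[:, :k] @ L[j, :k]
        L[:, k] = col / np.sqrt(col[j])
        d -= L[:, k] ** 2
    return L

L = pivoted_cholesky(len(X), matern32_column, np.ones(len(X)))   # Matern kernels have unit diagonal
print("rank of the approximation:", L.shape[1])

# error check against the full matrix (only affordable at this toy size)
K_full = np.stack([matern32_column(j) for j in range(len(X))], axis=1)
print("relative error:", np.linalg.norm(L @ L.T - K_full) / np.linalg.norm(K_full))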