
    Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

    Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \leq i \leq N} \subset \mathbb{R}^d$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset $S \subset \{1, \dots, N\}^2$, with $\#S = \mathcal{O}(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in incomplete Cholesky factorization of the sparse matrix $\Theta_{ij} \mathbf{1}_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorization can provably be obtained in complexity $\mathcal{O}(N \log(N) \log^d(N/\epsilon))$ in space and $\mathcal{O}(N \log^2(N) \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity $\mathcal{O}(N \log^d(N/\epsilon))$ in space and $\mathcal{O}(N \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators.
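
    A minimal, self-contained sketch of the two ingredients named above (a coarse-to-fine maximin ordering of the points and a zero fill-in incomplete Cholesky factorization restricted to a distance-based sparsity pattern S) might look as follows in Python. This is not the authors' implementation; the exponential kernel, the radius parameter rho, the jitter, and the simplified pattern are all illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.uniform(size=(400, 2))                       # sampling points x_i in R^2

        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        Theta = np.exp(-4.0 * d) + 1e-8 * np.eye(len(x))     # dense kernel matrix (with jitter)

        # Maximin (coarse-to-fine) ordering: repeatedly pick the point farthest from
        # those already selected; ell[k] records that distance (a length scale).
        order, ell = [0], [d.max()]
        dist = d[0].copy()
        for _ in range(len(x) - 1):
            nxt = int(np.argmax(dist))
            order.append(nxt)
            ell.append(dist[nxt])
            dist = np.minimum(dist, d[nxt])
        P, ell = np.array(order), np.array(ell)
        Theta_p, d_p = Theta[np.ix_(P, P)], d[np.ix_(P, P)]

        # Sparsity pattern S: lower-triangular pairs whose distance is at most rho times
        # the smaller of the two length scales (a simplified stand-in for the paper's S).
        rho = 3.0
        S = np.tril(d_p <= rho * np.minimum.outer(ell, ell))

        def ichol(A, mask):
            """Zero fill-in incomplete Cholesky: entries outside `mask` stay zero."""
            n = A.shape[0]
            L = np.zeros_like(A)
            for i in range(n):
                for j in range(i + 1):
                    if not mask[i, j]:
                        continue
                    s = A[i, j] - L[i, :j] @ L[j, :j]
                    if i == j:
                        L[i, i] = np.sqrt(max(s, 1e-12))     # guard against breakdown
                    else:
                        L[i, j] = s / L[j, j]
            return L

        L = ichol(Theta_p, S)
        err = np.linalg.norm(L @ L.T - Theta_p) / np.linalg.norm(Theta_p)
        print(f"kept {S.sum()} of {len(x) * (len(x) + 1) // 2} lower-triangular entries, "
              f"relative error {err:.1e}")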

    Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

    Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \leq i \leq N}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset $S \subset \{1, \dots, N\} \times \{1, \dots, N\}$, with $\#S = \mathcal{O}(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in block-incomplete Cholesky decomposition of $\Theta_{ij} \mathbf{1}_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This block factorisation can provably be obtained in complexity $\mathcal{O}(N \log^2(N) (\log(1/\epsilon) + \log^2(N))^{4d+1})$ in time. Numerical evidence further suggests that element-wise Cholesky decomposition with the same ordering constitutes an $\mathcal{O}(N \log^2(N) \log^{2d}(N/\epsilon))$ solver. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, an approximate PCA with optimal rate of convergence in the operator norm can easily be read off from this decomposition. Hence, by using only subsampling and the incomplete Cholesky decomposition, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky decomposition we also obtain a near-linear-time solver for elliptic PDEs.

    Quantum machine learning: a classical perspective

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication, alongside the increasing size of datasets, is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed. Comment: v3, 33 pages; typos corrected and references added.

    Kernel Methods are Competitive for Operator Learning

    We present a general kernel-based framework for learning operators between Banach spaces along with a priori error analysis and comprehensive numerical comparisons with popular neural net (NN) approaches such as Deep Operator Net (DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We consider the setting where the input/output spaces of the target operator $\mathcal{G}^\dagger : \mathcal{U} \to \mathcal{V}$ are reproducing kernel Hilbert spaces (RKHS), the data comes in the form of partial observations $\phi(u_i), \varphi(v_i)$ of input/output functions $v_i = \mathcal{G}^\dagger(u_i)$ ($i = 1, \ldots, N$), and the measurement operators $\phi : \mathcal{U} \to \mathbb{R}^n$ and $\varphi : \mathcal{V} \to \mathbb{R}^m$ are linear. Writing $\psi : \mathbb{R}^n \to \mathcal{U}$ and $\chi : \mathbb{R}^m \to \mathcal{V}$ for the optimal recovery maps associated with $\phi$ and $\varphi$, we approximate $\mathcal{G}^\dagger$ with $\bar{\mathcal{G}} = \chi \circ \bar{f} \circ \phi$, where $\bar{f}$ is an optimal recovery approximation of $f^\dagger := \varphi \circ \mathcal{G}^\dagger \circ \psi : \mathbb{R}^n \to \mathbb{R}^m$. We show that, even when using vanilla kernels (e.g., linear or Matérn), our approach is competitive in terms of cost-accuracy trade-off and either matches or beats the performance of NN methods on a majority of benchmarks. Additionally, our framework offers several advantages inherited from kernel methods: simplicity, interpretability, convergence guarantees, a priori error estimates, and Bayesian uncertainty quantification. As such, it can serve as a natural benchmark for operator learning. Comment: 35 pages, 10 figures.
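
    The pipeline described above can be illustrated with a small Python sketch in which $\phi$ and $\varphi$ are point evaluations on a grid, $\bar{f}$ is a plain RBF kernel ridge regression, and the target operator is the antiderivative $u \mapsto \int_0^{\cdot} u(t)\,dt$. The operator, kernel, grid, and regularization are illustrative assumptions rather than the paper's benchmarks, and the optimal recovery maps are replaced by the grid values themselves.

        import numpy as np

        rng = np.random.default_rng(1)
        n_grid, n_train, n_test = 64, 200, 20
        s = np.linspace(0.0, 1.0, n_grid)

        def sample_inputs(k):
            # smooth random input functions: truncated random Fourier series
            freqs = np.arange(1, 6)[None, :, None]
            coefs = rng.normal(size=(k, 5, 1))
            return (coefs * np.sin(np.pi * freqs * s[None, None, :])).sum(axis=1)

        def antiderivative(U):
            # target operator G(u)(x) = int_0^x u(t) dt, via the trapezoid rule
            dt = s[1] - s[0]
            inc = np.cumsum(0.5 * dt * (U[:, 1:] + U[:, :-1]), axis=1)
            return np.concatenate([np.zeros((U.shape[0], 1)), inc], axis=1)

        U_train, U_test = sample_inputs(n_train), sample_inputs(n_test)
        V_train, V_test = antiderivative(U_train), antiderivative(U_test)

        def rbf(A, B, gamma=0.5):
            # RBF kernel on the measurement vectors phi(u) in R^n
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2 / A.shape[1])

        # f_bar: kernel ridge regression from phi(u) in R^n to varphi(v) in R^m
        lam = 1e-6
        K = rbf(U_train, U_train)
        alpha = np.linalg.solve(K + lam * np.eye(n_train), V_train)
        V_pred = rbf(U_test, U_train) @ alpha

        rel_err = np.linalg.norm(V_pred - V_test) / np.linalg.norm(V_test)
        print(f"relative L2 test error: {rel_err:.2e}")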

    Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems

    In this paper we review basic and emerging models and associated algorithms for large-scale tensor networks, especially Tensor Train (TT) decompositions, using novel mathematical and graphical representations. We discuss the concept of tensorization (i.e., creating very high-order tensors from lower-order original data) and the super-compression of data achieved via quantized tensor train (QTT) networks. The purpose of tensorization and quantization is to achieve, via low-rank tensor approximations, "super" compression and a meaningful, compact representation of structured data. The main objective of this paper is to show how tensor networks can be used to solve a wide class of big data optimization problems (that are far from tractable by classical numerical methods) by applying tensorization and performing all operations using relatively small matrices and tensors, with iteratively optimized and approximate tensor contractions. Keywords: tensor networks, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, tensorization, distributed representation of data, optimization problems for very large-scale problems: generalized eigenvalue decomposition (GEVD), PCA/SVD, canonical correlation analysis (CCA). Comment: arXiv admin note: text overlap with arXiv:1403.204
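
    As a concrete illustration of tensorization followed by TT compression, the Python sketch below quantizes a sampled exponential into a 2 x 2 x ... x 2 tensor and compresses it with the standard TT-SVD (sequential truncated SVDs). The test signal and truncation tolerance are illustrative assumptions, and this is a generic textbook routine rather than the full machinery reviewed in the paper.

        import numpy as np

        def tt_svd(T, eps=1e-10):
            """Decompose tensor T into TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
            shape = T.shape
            cores, r_prev = [], 1
            M = T.reshape(1, -1)
            for n_k in shape[:-1]:
                M = M.reshape(r_prev * n_k, -1)
                U, sv, Vt = np.linalg.svd(M, full_matrices=False)
                r = max(1, int((sv > eps * sv[0]).sum()))    # truncation rank
                cores.append(U[:, :r].reshape(r_prev, n_k, r))
                M = sv[:r, None] * Vt[:r, :]
                r_prev = r
            cores.append(M.reshape(r_prev, shape[-1], 1))
            return cores

        def tt_to_full(cores):
            out = cores[0]
            for G in cores[1:]:
                out = np.tensordot(out, G, axes=1)           # contract the rank index
            return out.reshape([c.shape[1] for c in cores])

        # Tensorization / quantization: a length-2^10 signal viewed as a 2x2x...x2 tensor
        d = 10
        t = np.linspace(0.0, 1.0, 2 ** d)
        f = np.exp(-3.0 * t)                                 # smooth signal -> low QTT ranks
        T = f.reshape((2,) * d)

        cores = tt_svd(T, eps=1e-10)
        approx = tt_to_full(cores).reshape(-1)
        print("QTT ranks:", [c.shape[-1] for c in cores[:-1]])
        print("relative error:", np.linalg.norm(approx - f) / np.linalg.norm(f))
        print("stored entries:", sum(c.size for c in cores), "vs", f.size)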

    Certified and fast computations with shallow covariance kernels

    Many techniques for data science and uncertainty quantification demand efficient tools to handle Gaussian random fields, which are defined in terms of their mean functions and covariance operators. Recently, parameterized Gaussian random fields have gained increased attention, due to their higher degree of flexibility. However, especially if the random field is parameterized through its covariance operator, classical random field discretization techniques fail or become inefficient. In this work we introduce and analyze a new and certified algorithm for the low-rank approximation of a parameterized family of covariance operators, which represents an extension of the adaptive cross approximation method for symmetric positive definite matrices. The algorithm relies on an affine linear expansion of the covariance operator with respect to the parameters, which needs to be computed in a preprocessing step using, e.g., the empirical interpolation method. We discuss and test our new approach for isotropic covariance kernels, such as Matérn kernels. The numerical results demonstrate the advantages of our approach in terms of computational time and confirm that the proposed algorithm provides the basis of a fast sampling procedure for parameter-dependent Gaussian random fields.
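
    The classical building block that this work extends, adaptive cross approximation of a single symmetric positive definite covariance matrix (equivalently, a pivoted Cholesky low-rank approximation), can be sketched in a few lines of Python. The Matérn-3/2 kernel, the point set, and the tolerance are illustrative assumptions; the certified, parameterized extension introduced in the paper is not reproduced here.

        import numpy as np

        rng = np.random.default_rng(2)
        x = rng.uniform(size=(500, 2))                       # sampling points in [0, 1]^2

        def matern32(a, b, ell=0.5):
            # Matern-3/2 covariance kernel (unit variance, length scale ell)
            r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
            c = np.sqrt(3.0) * r / ell
            return (1.0 + c) * np.exp(-c)

        def aca_spd(kernel, pts, tol=1e-6, max_rank=200):
            """Greedy low-rank approximation C ~= L @ L.T using diagonal pivoting."""
            n = pts.shape[0]
            diag = np.array([kernel(pts[i:i + 1], pts[i:i + 1])[0, 0] for i in range(n)])
            L = np.zeros((n, 0))
            for _ in range(max_rank):
                p = int(np.argmax(diag))                     # pivot: largest residual variance
                if diag[p] <= tol:
                    break
                col = kernel(pts, pts[p:p + 1])[:, 0] - L @ L[p]
                l_new = col / np.sqrt(diag[p])
                L = np.column_stack([L, l_new])
                diag = np.maximum(diag - l_new ** 2, 0.0)    # update residual diagonal
            return L

        L = aca_spd(matern32, x)
        C = matern32(x, x)
        print("rank:", L.shape[1],
              "relative error:", np.linalg.norm(C - L @ L.T) / np.linalg.norm(C))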