
    Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

    Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \leq i \leq N} \subset \mathbb{R}^d$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously distributed sampling points, we show how to identify a subset $S \subset \{1, \dots, N\}^2$, with $\#S = \mathcal{O}(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in incomplete Cholesky factorization of the sparse matrix $\Theta_{ij} \mathbf{1}_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This factorization can provably be obtained in complexity $\mathcal{O}(N \log(N) \log^d(N/\epsilon))$ in space and $\mathcal{O}(N \log^2(N) \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity $\mathcal{O}(N \log^d(N/\epsilon))$ in space and $\mathcal{O}(N \log^{2d}(N/\epsilon))$ in time, improving upon the state of the art for general elliptic operators.
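
    A minimal, self-contained sketch of the two ingredients named above (a coarse-to-fine maximin ordering of the points and a zero fill-in incomplete Cholesky factorization restricted to a distance-based sparsity pattern S) might look as follows in Python. This is not the authors' implementation; the exponential kernel, the radius parameter rho, the jitter, and the simplified pattern are all illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.uniform(size=(400, 2))                       # sampling points x_i in R^2

        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
        Theta = np.exp(-4.0 * d) + 1e-8 * np.eye(len(x))     # dense kernel matrix (with jitter)

        # Maximin (coarse-to-fine) ordering: repeatedly pick the point farthest from
        # those already selected; ell[k] records that distance (a length scale).
        order, ell = [0], [d.max()]
        dist = d[0].copy()
        for _ in range(len(x) - 1):
            nxt = int(np.argmax(dist))
            order.append(nxt)
            ell.append(dist[nxt])
            dist = np.minimum(dist, d[nxt])
        P, ell = np.array(order), np.array(ell)
        Theta_p, d_p = Theta[np.ix_(P, P)], d[np.ix_(P, P)]

        # Sparsity pattern S: lower-triangular pairs whose distance is at most rho times
        # the smaller of the two length scales (a simplified stand-in for the paper's S).
        rho = 3.0
        S = np.tril(d_p <= rho * np.minimum.outer(ell, ell))

        def ichol(A, mask):
            """Zero fill-in incomplete Cholesky: entries outside `mask` stay zero."""
            n = A.shape[0]
            L = np.zeros_like(A)
            for i in range(n):
                for j in range(i + 1):
                    if not mask[i, j]:
                        continue
                    s = A[i, j] - L[i, :j] @ L[j, :j]
                    if i == j:
                        L[i, i] = np.sqrt(max(s, 1e-12))     # guard against breakdown
                    else:
                        L[i, j] = s / L[j, j]
            return L

        L = ichol(Theta_p, S)
        err = np.linalg.norm(L @ L.T - Theta_p) / np.linalg.norm(Theta_p)
        print(f"kept {S.sum()} of {len(x) * (len(x) + 1) // 2} lower-triangular entries, "
              f"relative error {err:.1e}")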

    Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity

    Dense kernel matrices $\Theta \in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{x_i\}_{1 \leq i \leq N}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and approximately equally spaced sampling points, we show how to identify a subset $S \subset \{1, \dots, N\} \times \{1, \dots, N\}$, with $\#S = \mathcal{O}(N \log(N) \log^d(N/\epsilon))$, such that the zero fill-in block-incomplete Cholesky decomposition of $\Theta_{ij} \mathbf{1}_{(i,j) \in S}$ is an $\epsilon$-approximation of $\Theta$. This block factorisation can provably be obtained in complexity $\mathcal{O}(N \log^2(N) (\log(1/\epsilon) + \log^2(N))^{4d+1})$ in time. Numerical evidence further suggests that element-wise Cholesky decomposition with the same ordering constitutes an $\mathcal{O}(N \log^2(N) \log^{2d}(N/\epsilon))$ solver. The algorithm only needs to know the spatial configuration of the $x_i$ and does not require an analytic representation of $G$. Furthermore, an approximate PCA with optimal rate of convergence in the operator norm can easily be read off from this decomposition. Hence, by using only subsampling and the incomplete Cholesky decomposition, we obtain, at nearly linear complexity, the compression, inversion, and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky decomposition we also obtain a near-linear-time solver for elliptic PDEs.

    Quantum machine learning: a classical perspective

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication, alongside the increasing size of datasets, is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed. Comment: v3, 33 pages; typos corrected and references added.

    Kernel Methods are Competitive for Operator Learning

    We present a general kernel-based framework for learning operators between Banach spaces along with a priori error analysis and comprehensive numerical comparisons with popular neural net (NN) approaches such as Deep Operator Net (DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We consider the setting where the input/output spaces of the target operator $\mathcal{G}^\dagger : \mathcal{U} \to \mathcal{V}$ are reproducing kernel Hilbert spaces (RKHS), the data comes in the form of partial observations $\phi(u_i), \varphi(v_i)$ of input/output functions $v_i = \mathcal{G}^\dagger(u_i)$ ($i = 1, \ldots, N$), and the measurement operators $\phi : \mathcal{U} \to \mathbb{R}^n$ and $\varphi : \mathcal{V} \to \mathbb{R}^m$ are linear. Writing $\psi : \mathbb{R}^n \to \mathcal{U}$ and $\chi : \mathbb{R}^m \to \mathcal{V}$ for the optimal recovery maps associated with $\phi$ and $\varphi$, we approximate $\mathcal{G}^\dagger$ with $\bar{\mathcal{G}} = \chi \circ \bar{f} \circ \phi$, where $\bar{f}$ is an optimal recovery approximation of $f^\dagger := \varphi \circ \mathcal{G}^\dagger \circ \psi : \mathbb{R}^n \to \mathbb{R}^m$. We show that, even when using vanilla kernels (e.g., linear or Matérn), our approach is competitive in terms of cost-accuracy trade-off and either matches or beats the performance of NN methods on a majority of benchmarks. Additionally, our framework offers several advantages inherited from kernel methods: simplicity, interpretability, convergence guarantees, a priori error estimates, and Bayesian uncertainty quantification. As such, it can serve as a natural benchmark for operator learning. Comment: 35 pages, 10 figures.
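
    The pipeline described above can be illustrated with a small Python sketch in which $\phi$ and $\varphi$ are point evaluations on a grid, $\bar{f}$ is a plain RBF kernel ridge regression, and the target operator is the antiderivative $u \mapsto \int_0^{\cdot} u(t)\,dt$. The operator, kernel, grid, and regularization are illustrative assumptions rather than the paper's benchmarks, and the optimal recovery maps are replaced by the grid values themselves.

        import numpy as np

        rng = np.random.default_rng(1)
        n_grid, n_train, n_test = 64, 200, 20
        s = np.linspace(0.0, 1.0, n_grid)

        def sample_inputs(k):
            # smooth random input functions: truncated random Fourier series
            freqs = np.arange(1, 6)[None, :, None]
            coefs = rng.normal(size=(k, 5, 1))
            return (coefs * np.sin(np.pi * freqs * s[None, None, :])).sum(axis=1)

        def antiderivative(U):
            # target operator G(u)(x) = int_0^x u(t) dt, via the trapezoid rule
            dt = s[1] - s[0]
            inc = np.cumsum(0.5 * dt * (U[:, 1:] + U[:, :-1]), axis=1)
            return np.concatenate([np.zeros((U.shape[0], 1)), inc], axis=1)

        U_train, U_test = sample_inputs(n_train), sample_inputs(n_test)
        V_train, V_test = antiderivative(U_train), antiderivative(U_test)

        def rbf(A, B, gamma=0.5):
            # RBF kernel on the measurement vectors phi(u) in R^n
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2 / A.shape[1])

        # f_bar: kernel ridge regression from phi(u) in R^n to varphi(v) in R^m
        lam = 1e-6
        K = rbf(U_train, U_train)
        alpha = np.linalg.solve(K + lam * np.eye(n_train), V_train)
        V_pred = rbf(U_test, U_train) @ alpha

        rel_err = np.linalg.norm(V_pred - V_test) / np.linalg.norm(V_test)
        print(f"relative L2 test error: {rel_err:.2e}")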

    Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems

    In this paper we review basic and emerging models and associated algorithms for large-scale tensor networks, especially Tensor Train (TT) decompositions, using novel mathematical and graphical representations. We discuss the concept of tensorization (i.e., creating very high-order tensors from lower-order original data) and the super-compression of data achieved via quantized tensor train (QTT) networks. The purpose of tensorization and quantization is to achieve, via low-rank tensor approximations, "super" compression and a meaningful, compact representation of structured data. The main objective of this paper is to show how tensor networks can be used to solve a wide class of big data optimization problems (that are far from tractable by classical numerical methods) by applying tensorization and performing all operations using relatively small matrices and tensors, with iteratively optimized and approximate tensor contractions. Keywords: tensor networks, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, tensorization, distributed representation of data, optimization problems for very large-scale problems: generalized eigenvalue decomposition (GEVD), PCA/SVD, canonical correlation analysis (CCA). Comment: arXiv admin note: text overlap with arXiv:1403.204
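
    As a concrete illustration of tensorization followed by TT compression, the Python sketch below quantizes a sampled exponential into a 2 x 2 x ... x 2 tensor and compresses it with the standard TT-SVD (sequential truncated SVDs). The test signal and truncation tolerance are illustrative assumptions, and this is a generic textbook routine rather than the full machinery reviewed in the paper.

        import numpy as np

        def tt_svd(T, eps=1e-10):
            """Decompose tensor T into TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
            shape = T.shape
            cores, r_prev = [], 1
            M = T.reshape(1, -1)
            for n_k in shape[:-1]:
                M = M.reshape(r_prev * n_k, -1)
                U, sv, Vt = np.linalg.svd(M, full_matrices=False)
                r = max(1, int((sv > eps * sv[0]).sum()))    # truncation rank
                cores.append(U[:, :r].reshape(r_prev, n_k, r))
                M = sv[:r, None] * Vt[:r, :]
                r_prev = r
            cores.append(M.reshape(r_prev, shape[-1], 1))
            return cores

        def tt_to_full(cores):
            out = cores[0]
            for G in cores[1:]:
                out = np.tensordot(out, G, axes=1)           # contract the rank index
            return out.reshape([c.shape[1] for c in cores])

        # Tensorization / quantization: a length-2^10 signal viewed as a 2x2x...x2 tensor
        d = 10
        t = np.linspace(0.0, 1.0, 2 ** d)
        f = np.exp(-3.0 * t)                                 # smooth signal -> low QTT ranks
        T = f.reshape((2,) * d)

        cores = tt_svd(T, eps=1e-10)
        approx = tt_to_full(cores).reshape(-1)
        print("QTT ranks:", [c.shape[-1] for c in cores[:-1]])
        print("relative error:", np.linalg.norm(approx - f) / np.linalg.norm(f))
        print("stored entries:", sum(c.size for c in cores), "vs", f.size)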

    Certified and fast computations with shallow covariance kernels

    Many techniques for data science and uncertainty quantification demand efficient tools to handle Gaussian random fields, which are defined in terms of their mean functions and covariance operators. Recently, parameterized Gaussian random fields have gained increased attention, due to their higher degree of flexibility. However, especially if the random field is parameterized through its covariance operator, classical random field discretization techniques fail or become inefficient. In this work we introduce and analyze a new and certified algorithm for the low-rank approximation of a parameterized family of covariance operators, which represents an extension of the adaptive cross approximation method for symmetric positive definite matrices. The algorithm relies on an affine linear expansion of the covariance operator with respect to the parameters, which needs to be computed in a preprocessing step using, e.g., the empirical interpolation method. We discuss and test our new approach for isotropic covariance kernels, such as Matérn kernels. The numerical results demonstrate the advantages of our approach in terms of computational time and confirm that the proposed algorithm provides the basis of a fast sampling procedure for parameter-dependent Gaussian random fields.
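
    The classical building block that this work extends, adaptive cross approximation of a single symmetric positive definite covariance matrix (equivalently, a pivoted Cholesky low-rank approximation), can be sketched in a few lines of Python. The Matérn-3/2 kernel, the point set, and the tolerance are illustrative assumptions; the certified, parameterized extension introduced in the paper is not reproduced here.

        import numpy as np

        rng = np.random.default_rng(2)
        x = rng.uniform(size=(500, 2))                       # sampling points in [0, 1]^2

        def matern32(a, b, ell=0.5):
            # Matern-3/2 covariance kernel (unit variance, length scale ell)
            r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
            c = np.sqrt(3.0) * r / ell
            return (1.0 + c) * np.exp(-c)

        def aca_spd(kernel, pts, tol=1e-6, max_rank=200):
            """Greedy low-rank approximation C ~= L @ L.T using diagonal pivoting."""
            n = pts.shape[0]
            diag = np.array([kernel(pts[i:i + 1], pts[i:i + 1])[0, 0] for i in range(n)])
            L = np.zeros((n, 0))
            for _ in range(max_rank):
                p = int(np.argmax(diag))                     # pivot: largest residual variance
                if diag[p] <= tol:
                    break
                col = kernel(pts, pts[p:p + 1])[:, 0] - L @ L[p]
                l_new = col / np.sqrt(diag[p])
                L = np.column_stack([L, l_new])
                diag = np.maximum(diag - l_new ** 2, 0.0)    # update residual diagonal
            return L

        L = aca_spd(matern32, x)
        C = matern32(x, x)
        print("rank:", L.shape[1],
              "relative error:", np.linalg.norm(C - L @ L.T) / np.linalg.norm(C))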