521 research outputs found

    Optimal approximation of infinite-dimensional holomorphic functions II: recovery from i.i.d. pointwise samples

    Infinite-dimensional, holomorphic functions have been studied in detail over the last several decades, due to their relevance to parametric differential equations and computational uncertainty quantification. The approximation of such functions from finitely many samples is of particular interest, due to the practical importance of constructing surrogate models to complex mathematical models of physical processes. In a previous work [5], we studied the approximation of so-called Banach-valued, $(\boldsymbol{b},\varepsilon)$-holomorphic functions on the infinite-dimensional hypercube $[-1,1]^{\mathbb{N}}$ from $m$ (potentially adaptive) samples. In particular, we derived lower bounds for the adaptive $m$-widths for classes of such functions, which showed that certain algebraic rates of the form $m^{1/2-1/p}$ are the best possible regardless of the sampling-recovery pair. In this work, we continue this investigation by focusing on the practical case where the samples are pointwise evaluations drawn identically and independently from a probability measure. Specifically, for Hilbert-valued $(\boldsymbol{b},\varepsilon)$-holomorphic functions, we show that the same rates can be achieved (up to a small polylogarithmic or algebraic factor) for essentially arbitrary tensor-product Jacobi (ultraspherical) measures. Our reconstruction maps are based on least squares and compressed sensing procedures using the corresponding orthonormal Jacobi polynomials. In doing so, we strengthen and generalize past work that has derived weaker nonuniform guarantees for the uniform and Chebyshev measures (and corresponding polynomials) only. We also extend various best $s$-term polynomial approximation error bounds to arbitrary Jacobi polynomial expansions. Overall, we demonstrate that i.i.d. pointwise samples are near-optimal for the recovery of infinite-dimensional, holomorphic functions.

    Optimal approximation of infinite-dimensional holomorphic functions

    Over the last decade, approximating functions in infinite dimensions from samples has gained increasing attention in computational science and engineering, especially in computational uncertainty quantification. This is primarily due to the relevance of functions that are solutions to parametric differential equations in various fields, e.g. chemistry, economics, engineering, and physics. While acquiring accurate and reliable approximations of such functions is inherently difficult, current benchmark methods exploit the fact that such functions often belong to certain classes of holomorphic functions to get algebraic convergence rates in infinite dimensions with respect to the number of (potentially adaptive) samples $m$. Our work focuses on providing theoretical approximation guarantees for the class of $(\boldsymbol{b},\varepsilon)$-holomorphic functions, demonstrating that these algebraic rates are the best possible for Banach-valued functions in infinite dimensions. We establish lower bounds using a reduction to a discrete problem in combination with the theory of $m$-widths, Gelfand widths and Kolmogorov widths. We study two cases, known and unknown anisotropy, in which the relative importance of the variables is known and unknown, respectively. A key conclusion of our paper is that in the latter setting, approximation from finite samples is impossible without some inherent ordering of the variables, even if the samples are chosen adaptively. Finally, in both cases, we demonstrate near-optimal, non-adaptive (random) sampling and recovery strategies which achieve close to the same rates as the lower bounds.

    CS4ML: A general framework for active learning with arbitrary data based on Christoffel functions

    We introduce a general framework for active learning in regression problems. Our framework extends the standard setup by allowing for general types of data, rather than merely pointwise samples of the target function. This generalization covers many cases of practical interest, such as data acquired in transform domains (e.g., Fourier data), vector-valued data (e.g., gradient-augmented data), data acquired along continuous curves, and multimodal data (i.e., combinations of different types of measurements). Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces (model classes). We introduce the concept of generalized Christoffel functions and show how these can be used to optimize the sampling measures. We prove that this leads to near-optimal sample complexity in various important cases. This paper focuses on applications in scientific computing, where active learning is often desirable, since it is usually expensive to generate data. We demonstrate the efficacy of our framework for gradient-augmented learning with polynomials, Magnetic Resonance Imaging (MRI) using generative models, and adaptive sampling for solving PDEs using Physics-Informed Neural Networks (PINNs).

    A unified framework for learning with nonlinear model classes from arbitrary linear samples

    This work considers the fundamental problem of learning an unknown object from training data using a given model class. We introduce a unified framework that allows for objects in arbitrary Hilbert spaces, general types of (random) linear measurements as training data and general types of nonlinear model classes. We establish a series of learning guarantees for this framework. These guarantees provide explicit relations between the amount of training data and properties of the model class to ensure near-best generalization bounds. In doing so, we also introduce and develop the key notion of the variation of a model class with respect to a distribution of sampling operators. To exhibit the versatility of this framework, we show that it can accommodate many different types of well-known problems of interest. We present examples such as matrix sketching by random sampling, compressed sensing with isotropic vectors, active learning in regression and compressed sensing with generative models. In all cases, we show how known results become straightforward corollaries of our general learning guarantees. For compressed sensing with generative models, we also present a number of generalizations and improvements of recent results. In summary, our work not only introduces a unified way to study learning unknown objects from general types of data, but also establishes a series of general theoretical guarantees which consolidate and improve various known results.

    Near-optimal learning of Banach-valued, high-dimensional functions via deep neural networks

    The past decade has seen increasing interest in applying Deep Learning (DL) to Computational Science and Engineering (CSE). Driven by impressive results in applications such as computer vision, Uncertainty Quantification (UQ), genetics, simulations and image processing, DL is increasingly supplanting classical algorithms, and seems poised to revolutionize scientific computing. However, DL is not yet well-understood from the standpoint of numerical analysis. Little is known about the efficiency and reliability of DL from the perspectives of stability, robustness, accuracy, and sample complexity. In particular, approximating solutions to parametric PDEs is an objective of UQ for CSE. Training data for such problems is often scarce and corrupted by errors. Moreover, the target function is a possibly infinite-dimensional smooth function taking values in the PDE solution space, generally an infinite-dimensional Banach space. This paper provides arguments for Deep Neural Network (DNN) approximation of such functions, with both known and unknown parametric dependence, that overcome the curse of dimensionality. We establish practical existence theorems that describe classes of DNNs with dimension-independent architecture size and training procedures based on minimizing the (regularized) $\ell^2$-loss which achieve near-optimal algebraic rates of convergence. These results involve key extensions of compressed sensing for Banach-valued recovery and polynomial emulation with DNNs. When approximating solutions of parametric PDEs, our results account for all sources of error, i.e., sampling, optimization, approximation and physical discretization, and allow for training high-fidelity DNN approximations from coarse-grained sample data. Our theoretical results fall into the category of non-intrusive methods, providing a theoretical alternative to classical methods for high-dimensional approximation. Comment: 49 pages

    Learning smooth functions in high dimensions: from sparse polynomials to deep neural networks

    Learning approximations to smooth target functions of many variables from finite sets of pointwise samples is an important task in scientific computing and its many applications in computational science and engineering. Despite well over half a century of research on high-dimensional approximation, this remains a challenging problem. Yet, significant advances have been made in the last decade towards efficient methods for doing this, commencing with so-called sparse polynomial approximation methods and continuing most recently with methods based on Deep Neural Networks (DNNs). In tandem, there have been substantial advances in the relevant approximation theory and analysis of these techniques. In this work, we survey this recent progress. We describe the contemporary motivations for this problem, which stem from parametric models and computational uncertainty quantification; the relevant function classes, namely, classes of infinite-dimensional, Banach-valued, holomorphic functions; fundamental limits of learnability from finite data for these classes; and finally, sparse polynomial and DNN methods for efficiently learning such functions from finite data. For the latter, there is currently a significant gap between the approximation theory of DNNs and the practical performance of deep learning. Aiming to narrow this gap, we develop the topic of practical existence theory, which asserts the existence of dimension-independent DNN architectures and training strategies that achieve provably near-optimal generalization errors in terms of the amount of training data.