
    Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems

    A new class of non-homogeneous state-affine systems is introduced for use in reservoir computing. Sufficient conditions are identified that guarantee, first, that the associated reservoir computers with linear readouts are causal, time-invariant, and satisfy the fading memory property and, second, that a subset of this class is universal in the category of fading memory filters with stochastic almost surely uniformly bounded inputs. This means that any discrete-time filter that satisfies the fading memory property with random inputs of that type can be uniformly approximated by elements in the non-homogeneous state-affine family. Comment: 41 pages.
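
    A minimal NumPy sketch (not taken from the paper) of the recursion behind such reservoir computers: the state evolves as x_t = p(z_t) x_{t-1} + q(z_t), where p and q are matrix- and vector-valued polynomials of the input, and only the linear readout is trained. All dimensions, coefficient scales, and the toy fading-memory target below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper).
state_dim, poly_degree = 20, 2

# Random coefficients for the matrix polynomial p(z) and vector polynomial q(z).
# Small coefficients keep p(z) contracting for |z| <= 1, so the fading-memory
# property plausibly holds for this toy system.
P = [0.05 * rng.standard_normal((state_dim, state_dim)) for _ in range(poly_degree + 1)]
Q = [rng.standard_normal(state_dim) for _ in range(poly_degree + 1)]

def p(z):
    """Matrix-valued polynomial in the scalar input z."""
    return sum(P[k] * z**k for k in range(poly_degree + 1))

def q(z):
    """Vector-valued polynomial in z; q != 0 makes the system non-homogeneous."""
    return sum(Q[k] * z**k for k in range(poly_degree + 1))

def run_reservoir(inputs):
    """State recursion x_t = p(z_t) x_{t-1} + q(z_t)."""
    x = np.zeros(state_dim)
    states = []
    for z in inputs:
        x = p(z) @ x + q(z)
        states.append(x.copy())
    return np.array(states)

# Almost surely uniformly bounded stochastic input and a toy causal
# fading-memory target: y_t = 0.5 z_t + 0.3 z_{t-1} + 0.2 z_{t-2}.
inputs = rng.uniform(-1.0, 1.0, size=1000)
targets = np.convolve(inputs, [0.5, 0.3, 0.2])[: len(inputs)]

# Only the linear readout y_t = W x_t is trained, here by least squares.
X = run_reservoir(inputs)
W, *_ = np.linalg.lstsq(X, targets, rcond=None)
print("training MSE:", np.mean((X @ W - targets) ** 2))
```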

    A Transfer Principle: Universal Approximators Between Metric Spaces From Euclidean Universal Approximators

    We build universal approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using universal approximators between Euclidean spaces as building blocks. Earlier results assume that the output space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by "randomization": our approximators output discrete probability measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of $\mathcal{X}$ and $\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces, $\mathbb{R}$-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning. Comment: 14 Figures, 3 Tables, 78 Pages (Main 40, Proofs 26, Acknowledgments and References 12).
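
    A minimal sketch of the "randomization" idea, under illustrative assumptions: a Euclidean building block (here a random-feature map standing in for any Euclidean universal approximator) produces mixing weights over a fixed set of atoms in the output space, so the output is a discrete probability measure; when the output space is barycentric, the measure collapses to a point. The atom locations, dimensions, and feature map below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

# Hypothetical setup: atoms y_1..y_N are fixed points of the output space,
# here embedded in R^{d_out} for concreteness.
n_atoms, n_features, d_in, d_out = 8, 64, 3, 2
atoms = rng.standard_normal((n_atoms, d_out))   # Dirac locations y_1..y_N
A = rng.standard_normal((n_features, d_in))     # random-feature map standing in
b = rng.standard_normal(n_features)             # for a trained Euclidean approximator
C = 0.1 * rng.standard_normal((n_atoms, n_features))

def approximator(x):
    """Map x to the discrete probability measure sum_i w_i(x) * delta_{y_i}."""
    features = np.tanh(A @ x + b)               # Euclidean building block
    weights = softmax(C @ features)             # mixing weights on the atoms
    return weights, atoms

def barycentric_output(x):
    """For barycentric output spaces (e.g. a Banach space), collapse the
    measure to its barycenter and obtain a point-valued function."""
    weights, ys = approximator(x)
    return weights @ ys

x = rng.standard_normal(d_in)
w, _ = approximator(x)
print("weights sum to", w.sum(), "; barycenter:", barycentric_output(x))
```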

    The Universal Approximation Property

    The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models' potential. This paper works towards overcoming these challenges by presenting a characterization, a representation, a construction method, and an existence result, each of which applies to any universal approximator on most function spaces of practical interest. Our characterization result is used to describe which activation functions allow the feed-forward architecture to maintain its universal approximation capabilities when multiple constraints are imposed on its final layers and its remaining layers are only sparsely connected. These include a rescaled and shifted Leaky ReLU activation function but not the ReLU activation function. Our construction and representation results are used to exhibit a simple modification of the feed-forward architecture, which can approximate any continuous function with non-pathological growth, uniformly on the entire Euclidean input space. This improves the known capabilities of the feed-forward architecture.
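
    To make the activation-function statement concrete, here is a small sketch of a plain feed-forward pass using a rescaled and shifted Leaky ReLU. The particular scale and shift values, and the layer sizes, are illustrative assumptions and not the paper's construction.

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    """Leaky ReLU: non-zero slope everywhere, unlike the plain ReLU."""
    return np.where(x >= 0, x, alpha * x)

def rescaled_shifted_leaky_relu(x, scale=1.5, shift=-0.5, alpha=0.1):
    """A rescaled and shifted Leaky ReLU; the scale/shift values here are
    illustrative only, not the specific choice from the paper."""
    return scale * leaky_relu(x + shift, alpha)

def feedforward(x, weights, biases, activation):
    """Plain feed-forward pass: affine maps interleaved with the activation,
    with a purely affine final layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = activation(W @ h + b)
    W, b = weights[-1], biases[-1]
    return W @ h + b

rng = np.random.default_rng(2)
sizes = [4, 16, 16, 1]                                  # illustrative layer widths
weights = [rng.standard_normal((m, n)) / np.sqrt(n) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

x = rng.standard_normal(sizes[0])
print(feedforward(x, weights, biases, rescaled_shifted_leaky_relu))
```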

    Universal Regular Conditional Distributions

    We introduce a general framework for approximating regular conditional distributions (RCDs). Our approximations of these RCDs are implemented by a new class of geometric deep learning models with inputs in $\mathbb{R}^d$ and outputs in the Wasserstein-$1$ space $\mathcal{P}_1(\mathbb{R}^D)$. We find that the models built using our framework can approximate any continuous function from $\mathbb{R}^d$ to $\mathcal{P}_1(\mathbb{R}^D)$ uniformly on compacts, and quantitative rates are obtained. We identify two methods for avoiding the "curse of dimensionality", i.e., the number of parameters determining the approximating neural network depends only polynomially on the involved dimension and the approximation error. The first solution describes functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ which can be efficiently approximated on any compact subset of $\mathbb{R}^d$. Conversely, the second approach describes sets in $\mathbb{R}^d$ on which any function in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ can be efficiently approximated. Our framework is used to obtain an affirmative answer to the open conjecture of Bishop (1994), namely that mixture density networks are universal regular conditional distributions. The predictive performance of the proposed models is evaluated against comparable learning models on various probabilistic prediction tasks in the context of ELMs, model uncertainty, and heteroscedastic regression. All the results are obtained for more general input and output spaces and thus apply to geometric deep learning contexts. Comment: Keywords: Universal Regular Conditional Distributions, Geometric Deep Learning, Measure-Valued Neural Networks, Conditional Expectation, Uncertainty Quantification. Additional Information: 27 Pages + 22 Page Appendix, 7 Tables.
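
    A minimal sketch of a mixture density network of the kind referenced by the Bishop (1994) conjecture, under illustrative assumptions: a small network maps an input in $\mathbb{R}^d$ to the weights, means, and scales of a Gaussian mixture over $\mathbb{R}^D$, i.e. a measure-valued output approximating the conditional distribution. All dimensions and parameterization details below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

# Hypothetical dimensions for an MDN mapping R^d to mixtures over R^D.
d, D, n_components, hidden = 3, 2, 5, 32

W1 = rng.standard_normal((hidden, d)) / np.sqrt(d)
b1 = rng.standard_normal(hidden)
out_dim = n_components * (1 + D + 1)      # per component: weight logit, mean, log-scale
W2 = rng.standard_normal((out_dim, hidden)) / np.sqrt(hidden)
b2 = rng.standard_normal(out_dim)

def mdn_forward(x):
    """Return the parameters of the conditional mixture approximating P(. | x)."""
    h = np.tanh(W1 @ x + b1)
    raw = (W2 @ h + b2).reshape(n_components, 1 + D + 1)
    weights = softmax(raw[:, 0])          # mixing probabilities
    means = raw[:, 1:1 + D]               # component means in R^D
    scales = np.exp(raw[:, -1])           # positive isotropic scales
    return weights, means, scales

x = rng.standard_normal(d)
w, mu, s = mdn_forward(x)
print("mixture weights:", w)
print("first component mean:", mu[0], "scale:", s[0])
```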