12 research outputs found

    Convergence bounds for empirical nonlinear least-squares

    We consider best approximation problems in a nonlinear subset $\mathcal{M}$ of a Banach space of functions $(\mathcal{V},\|\bullet\|)$. The norm is assumed to be a generalization of the $L^2$-norm for which only a weighted Monte Carlo estimate $\|\bullet\|_n$ can be computed. The objective is to obtain an approximation $v\in\mathcal{M}$ of an unknown function $u\in\mathcal{V}$ by minimizing the empirical norm $\|u-v\|_n$. In the case of linear subspaces $\mathcal{M}$ it is well-known that such least squares approximations can become inaccurate and unstable when the number of samples $n$ is too close to the number of parameters $m = \operatorname{dim}(\mathcal{M})$. We review this statement for general nonlinear subsets and establish error bounds for the empirical best approximation error. Our results are based on a restricted isometry property (RIP) which holds in probability, and we show that $n \gtrsim m$ is sufficient for the RIP to be satisfied with high probability. Several model classes are examined where analytical statements can be made about the RIP. Numerical experiments illustrate some of the obtained stability bounds. Comment: 32 pages, 18 figures; major revision
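
    The following is a minimal sketch of the empirical least-squares problem described above, restricted to the simplest linear setting: the target function, the uniform samples on [0,1], the monomial subspace, and the use of numpy.linalg.lstsq as the minimizer are illustrative assumptions, not the paper's construction. It shows how the approximation degrades when the number of samples n is close to the number of parameters m and improves as n grows.

# Sketch: empirical least-squares approximation of an "unknown" function u over a
# linear subspace spanned by monomials (assumed example; the paper treats general
# nonlinear model classes and weighted Monte Carlo estimates).
import numpy as np

rng = np.random.default_rng(0)
u = lambda x: np.exp(np.sin(2 * np.pi * x))      # target function, only known through samples

def empirical_best_approximation(n, m):
    """Minimize the empirical norm ||u - v||_n over span{1, x, ..., x^(m-1)}."""
    x = rng.uniform(0.0, 1.0, size=n)            # Monte Carlo sample points
    A = np.vander(x, m, increasing=True)         # design matrix of the linear subspace
    coeffs, *_ = np.linalg.lstsq(A, u(x), rcond=None)
    # Estimate the true L2 error on a dense grid.
    xs = np.linspace(0.0, 1.0, 10_000)
    v = np.vander(xs, m, increasing=True) @ coeffs
    return np.sqrt(np.mean((u(xs) - v) ** 2))

m = 10
for n in (m, 2 * m, 10 * m, 100 * m):
    print(f"n = {n:5d}, m = {m}: estimated L2 error = {empirical_best_approximation(n, m):.3e}")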

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages
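
    As a small illustration of the tensor train (TT) format emphasized above, the sketch below decomposes a dense array into TT cores by sequential truncated SVDs (a TT-SVD style procedure). The random test tensor, the fixed maximal rank, and the function name are assumptions made for this example; the monograph itself covers far more general networks and contraction schemes.

# Sketch: TT-SVD style decomposition of a d-way array into tensor-train cores
# (illustrative example; ranks are truncated to a fixed maximum).
import numpy as np

def tt_decompose(tensor, max_rank=8):
    """Return a list of TT cores of shape (r_{k-1}, n_k, r_k)."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(s))            # truncate the TT rank
        cores.append(U[:, :r_new].reshape(rank, dims[k], r_new))
        # Carry the remaining factor to the next unfolding.
        mat = (np.diag(s[:r_new]) @ Vt[:r_new]).reshape(r_new * dims[k + 1], -1)
        rank = r_new
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

# Usage: compress a random 4-way tensor and compare parameter counts.
T = np.random.default_rng(1).standard_normal((6, 6, 6, 6))
cores = tt_decompose(T, max_rank=4)
print([c.shape for c in cores])
print("full entries:", T.size, "TT parameters:", sum(c.size for c in cores))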