7 research outputs found

    Deep Dimension Reduction for Supervised Representation Learning

    The success of deep supervised learning depends on its automatic data representation abilities. Among all the characteristics of an ideal representation for high-dimensional complex data, information preservation, low dimensionality and disentanglement are the most essential ones. In this work, we propose a deep dimension reduction (DDR) approach to achieving a good data representation with these characteristics for supervised learning. At the population level, we formulate the ideal representation learning task as finding a nonlinear dimension reduction map that minimizes the sum of losses characterizing conditional independence and disentanglement. We estimate the target map at the sample level nonparametrically with deep neural networks. We derive a bound on the excess risk of the deep nonparametric estimator. The proposed method is validated via comprehensive numerical experiments and real data analysis in the context of regression and classification.

    On the geometry of Stein variational gradient descent

    Bayesian inference problems require sampling or approximating high-dimensional probability distributions. The focus of this paper is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm. This construction leads to interacting particle systems, the mean-field limit of which is a gradient flow on the space of probability distributions equipped with a certain geometrical structure. We leverage this viewpoint to shed some light on the convergence properties of the algorithm, in particular addressing the problem of choosing a suitable positive definite kernel function. Our analysis leads us to consider certain nondifferentiable kernels with adjusted tails. We demonstrate significant performance gains of these in various numerical experiments.
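
    For orientation, the interacting particle update that the abstract refers to can be sketched as below. This is a minimal illustration of the standard Stein variational gradient descent step, not the paper's method: it assumes a smooth Gaussian RBF kernel with a fixed bandwidth `h`, whereas the paper argues for nondifferentiable kernels with adjusted tails. The names `rbf_kernel`, `svgd_step`, `run_svgd`, and all default parameter values are hypothetical choices made for this sketch.

```python
import numpy as np


def rbf_kernel(x, h):
    """Gaussian RBF kernel (an assumption of this sketch, not the paper's kernel).

    Returns k[j, i] = exp(-||x_j - x_i||^2 / h) and
    grad_k[j, i] = gradient of k(x_j, x_i) with respect to x_j.
    """
    diff = x[:, None, :] - x[None, :, :]          # diff[j, i] = x_j - x_i
    sq_dist = np.sum(diff ** 2, axis=-1)          # shape (n, n)
    k = np.exp(-sq_dist / h)
    grad_k = (-2.0 / h) * diff * k[:, :, None]    # shape (n, n, d)
    return k, grad_k


def svgd_step(x, grad_log_p, step_size=0.1, h=1.0):
    """One SVGD update for particles x of shape (n, d).

    phi_i = (1/n) * sum_j [ k(x_j, x_i) * grad_log_p(x_j) + grad_{x_j} k(x_j, x_i) ]
    The first term pulls particles toward high density; the second
    term pushes them apart so they do not collapse to a single mode.
    """
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    score = grad_log_p(x)                         # shape (n, d)
    phi = (k.T @ score + grad_k.sum(axis=0)) / n
    return x + step_size * phi


def run_svgd(x0, grad_log_p, n_steps=500, step_size=0.1, h=1.0):
    """Iterate the SVGD update a fixed number of times."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = svgd_step(x, grad_log_p, step_size, h)
    return x
```

    As a usage example, for a Gaussian target with mean `mu` and identity covariance the score is `lambda x: -(x - mu)`, and `run_svgd(x0, score)` moves an initial particle cloud `x0` toward that target. The fixed bandwidth is a simplification; in practice the kernel and its scale strongly affect behaviour, which is the question the paper studies.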

    On the geometry of Stein variational gradient descent

    Bayesian inference problems require sampling or approximating high-dimensional probability distributions. The focus of this paper is on the recently introduced Stein variational gradient descent methodology, a class of algorithms that rely on iterated steepest descent steps with respect to a reproducing kernel Hilbert space norm. This construction leads to interacting particle systems, the mean-field limit of which is a gradient flow on the space of probability distributions equipped with a certain geometrical structure. We leverage this viewpoint to shed some light on the convergence properties of the algorithm, in particular addressing the problem of choosing a suitable positive definite kernel function. Our analysis leads us to consider certain nondifferentiable kernels with adjusted tails. We demonstrate significant performance gains of these in various numerical experiments. Comment: 39 pages, 4 figures