2,239 research outputs found

    Convergence of Online Mirror Descent

    In this paper we consider online mirror descent (OMD) algorithms, a class of scalable online learning algorithms exploiting data geometric structures through mirror maps. Necessary and sufficient conditions are presented in terms of the step size sequence $\{\eta_t\}_{t}$ for the convergence of an OMD algorithm with respect to the expected Bregman distance induced by the mirror map. The condition is $\lim_{t\to\infty}\eta_t=0$, $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances. It is reduced to $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of zero variances, for which linear convergence may be achieved by taking a constant step size sequence. A sufficient condition on the almost sure convergence is also given. We establish tight error bounds under mild conditions on the mirror map, the loss function, and the regularizer. Our results are achieved by some novel analysis on the one-step progress of the OMD algorithm using smoothness and strong convexity of the mirror map and the loss function. Comment: Published in Applied and Computational Harmonic Analysis, 202
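The step size conditions in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm or experiments: the helper names (`omd`, `squared_loss_grad`, `make_samples`, `sq_dist`), the least-squares loss, the synthetic data, and the default Euclidean mirror map $\Psi(w)=\tfrac12\|w\|^2$ (for which OMD reduces to online gradient descent) are all assumptions chosen for illustration. The update is $\nabla\Psi(w_{t+1}) = \nabla\Psi(w_t) - \eta_t \nabla f(w_t, z_t)$.

```python
import random

def omd(grad_loss, samples, step_size, w0, grad_psi=None, grad_psi_inv=None):
    """Online mirror descent sketch (hypothetical helper, not from the paper).

    Update: grad_psi(w_{t+1}) = grad_psi(w_t) - eta_t * grad_loss(w_t, z_t).
    """
    # Assumption: default to the Euclidean mirror map Psi(w) = ||w||^2 / 2,
    # whose gradient and inverse gradient are both the identity.
    grad_psi = grad_psi or (lambda w: list(w))
    grad_psi_inv = grad_psi_inv or (lambda v: list(v))
    w = list(w0)
    for t, z in enumerate(samples, start=1):
        eta = step_size(t)
        g = grad_loss(w, z)
        # Gradient step in the dual (mirror) space, then map back.
        dual = [d - eta * gi for d, gi in zip(grad_psi(w), g)]
        w = grad_psi_inv(dual)
    return w

def squared_loss_grad(w, z):
    """Gradient of the loss 0.5 * (<w, x> - y)^2 with respect to w."""
    x, y = z
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

def make_samples(w_star, n, noise_std, seed=0):
    """Synthetic stream y = <w_star, x> + Gaussian noise (illustrative data)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in w_star]
        y = sum(wi * xi for wi, xi in zip(w_star, x)) + rng.gauss(0.0, noise_std)
        samples.append((x, y))
    return samples

def sq_dist(u, v):
    """Squared Euclidean distance, the Bregman distance of Psi up to a factor 1/2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

w_star = [1.0, -2.0]

# Zero-variance case: a constant step size (its sum diverges) is enough.
noiseless = make_samples(w_star, 2000, noise_std=0.0)
w_const = omd(squared_loss_grad, noiseless, lambda t: 0.5, [0.0, 0.0])

# Positive-variance case: eta_t -> 0 while sum(eta_t) = infinity.
noisy = make_samples(w_star, 20000, noise_std=0.5)
w_decay = omd(squared_loss_grad, noisy, lambda t: 0.5 / t ** 0.5, [0.0, 0.0])

print(sq_dist(w_const, w_star), sq_dist(w_decay, w_star))
```

In this sketch the noiseless run with a constant step size should drive the error to floating-point precision, while the noisy run relies on the decaying schedule $\eta_t = 0.5/\sqrt{t}$, which satisfies both conditions stated in the abstract. A non-Euclidean mirror map can be tried by passing `grad_psi` and `grad_psi_inv`.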

    Iterative Regularization for Learning with Convex Loss Functions

    We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method. Unlike other regularization approaches, in iterative regularization no constraint or penalization is considered, and generalization is achieved by (early) stopping an empirical iteration. We consider a nonparametric setting, in the framework of reproducing kernel Hilbert spaces, and prove finite sample bounds on the excess risk under general regularity conditions. Our study provides a new class of efficient regularized learning algorithms and gives insights on the interplay between statistics and optimization in machine learning

    Learning gradients on manifolds

    A common belief in high-dimensional data analysis is that data are concentrated on a low-dimensional manifold. This motivates simultaneous dimension reduction and regression on manifolds. We provide an algorithm for learning gradients on manifolds for dimension reduction for high-dimensional data with few observations. We obtain generalization error bounds for the gradient estimates and show that the convergence rate depends on the intrinsic dimension of the manifold and not on the dimension of the ambient space. We illustrate the efficacy of this approach empirically on simulated and real data and compare the method to other dimension reduction procedures. Comment: Published at http://dx.doi.org/10.3150/09-BEJ206 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    On a conjecture of Z. Ditzian

    A conjecture of Z. Ditzian on Bernstein polynomials is proved. This yields additional information on the problem of characterizing the rate of convergence for Bernstein polynomials