
    Convergence of Unregularized Online Learning Algorithms

    In this paper we study the convergence of online gradient descent algorithms in reproducing kernel Hilbert spaces (RKHSs) without regularization. We establish a sufficient condition and a necessary condition for the convergence of excess generalization errors in expectation, as well as a sufficient condition for almost sure convergence. With high probability, we provide explicit convergence rates of the excess generalization errors for both the averaged iterates and the last iterate, which in turn also imply convergence rates with probability one. To the best of our knowledge, this is the first high-probability convergence rate for the last iterate of online gradient descent without strong convexity. Without any boundedness assumptions on the iterates, our results are derived through a novel use of two measures of the algorithm's one-step progress, one in terms of generalization errors and the other in terms of distances in the RKHS, where the variances of the involved martingales are cancelled out by the descent property of the algorithm.
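    As a rough illustration of this class of algorithms (not the authors' exact construction), the sketch below runs unregularized online gradient descent for least squares in an RKHS induced by a Gaussian kernel; the kernel width, the step-size rule eta(t) = 1/sqrt(t), and the synthetic data stream are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """RBF kernel K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(z)) ** 2) / (2.0 * sigma ** 2))

def online_kernel_gd(stream, eta, sigma=1.0):
    """Unregularized online gradient descent for the square loss in an RKHS.

    The iterate is kept as a kernel expansion f_t(x) = sum_i alpha_i K(x_i, x);
    each step is f_{t+1} = f_t - eta(t) * (f_t(x_t) - y_t) * K(x_t, .),
    i.e. one new center with coefficient -eta(t) * (f_t(x_t) - y_t).
    """
    centers, alphas = [], []
    for t, (x_t, y_t) in enumerate(stream, start=1):
        # Evaluate the current iterate at the incoming example.
        f_xt = sum(a * gaussian_kernel(c, x_t, sigma) for c, a in zip(centers, alphas))
        # One-step functional gradient update (no regularization term).
        centers.append(x_t)
        alphas.append(-eta(t) * (f_xt - y_t))
    return centers, alphas

# Example: a small synthetic stream with eta(t) = 1 / sqrt(t).
rng = np.random.default_rng(0)
stream = [(x, np.sin(3 * x) + 0.1 * rng.standard_normal()) for x in rng.uniform(-1, 1, 200)]
centers, alphas = online_kernel_gd(stream, eta=lambda t: 1.0 / np.sqrt(t))
```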

    Super-Linear Convergence of Dual Augmented-Lagrangian Algorithm for Sparsity Regularized Estimation

    We analyze the convergence behaviour of a recently proposed algorithm for regularized estimation called the Dual Augmented Lagrangian (DAL). Our analysis is based on a new interpretation of DAL as a proximal minimization algorithm. We show theoretically that, under some conditions, DAL converges super-linearly in a non-asymptotic and global sense. Owing to a special modelling of sparse estimation problems in the context of machine learning, the assumptions we make are milder and more natural than those made in conventional analyses of augmented Lagrangian algorithms. In addition, the new interpretation enables us to generalize DAL to a wide variety of sparse estimation problems. We experimentally confirm our analysis on a large-scale $\ell_1$-regularized logistic regression problem and extensively compare the efficiency of the DAL algorithm to previously proposed algorithms on both synthetic and benchmark datasets. Comment: 51 pages, 9 figures
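    DAL itself performs a proximal (augmented-Lagrangian) outer iteration whose inner problem is solved in the dual; the sketch below is not DAL but a minimal proximal-gradient (ISTA-style) baseline for the same $\ell_1$-regularized logistic regression objective, shown only to illustrate the soft-thresholding prox machinery such methods build on. The function names, step size, and iteration counts are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox operator of tau * ||.||_1: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_grad_l1_logistic(A, y, lam, step=0.01, iters=1000):
    """Proximal-gradient (ISTA-style) baseline for
    min_w  (1/n) * sum_i log(1 + exp(-y_i * a_i^T w)) + lam * ||w||_1,
    with labels y_i in {-1, +1}.  This is a baseline, not the DAL algorithm.
    """
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (A @ w)
        # Gradient of the averaged logistic loss.
        grad = -(A.T @ (y / (1.0 + np.exp(margins)))) / n
        # Forward (gradient) step followed by the l1 prox (backward step).
        w = soft_threshold(w - step * grad, step * lam)
    return w
```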

    Convergence of Online Mirror Descent

    In this paper we consider online mirror descent (OMD) algorithms, a class of scalable online learning algorithms that exploit the geometric structure of the data through mirror maps. Necessary and sufficient conditions are presented in terms of the step size sequence $\{\eta_t\}_t$ for the convergence of an OMD algorithm with respect to the expected Bregman distance induced by the mirror map. The condition is $\lim_{t\to\infty}\eta_t=0,\ \sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances. It reduces to $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of zero variances, for which linear convergence may be achieved by taking a constant step size sequence. A sufficient condition for almost sure convergence is also given. We establish tight error bounds under mild conditions on the mirror map, the loss function, and the regularizer. Our results are achieved by a novel analysis of the one-step progress of the OMD algorithm using the smoothness and strong convexity of the mirror map and the loss function. Comment: Published in Applied and Computational Harmonic Analysis, 202
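    A minimal sketch of one concrete OMD instance, assuming the negative-entropy mirror map on the probability simplex (the exponentiated-gradient update); the step-size rule eta(t) = 1/sqrt(t) is chosen only because it satisfies the stated condition ($\eta_t \to 0$ and $\sum_t \eta_t = \infty$) and is not taken from the paper.

```python
import numpy as np

def omd_negative_entropy(grad_stream, d, eta):
    """Online mirror descent with the negative-entropy mirror map on the simplex.

    The mirror update is multiplicative: w_{t+1, i} is proportional to
    w_{t, i} * exp(-eta(t) * g_{t, i}), followed by renormalization
    (the Bregman projection back onto the probability simplex).
    """
    w = np.full(d, 1.0 / d)                    # start from the uniform distribution
    for t, g_t in enumerate(grad_stream, start=1):
        w = w * np.exp(-eta(t) * np.asarray(g_t))  # multiplicative (mirror) step
        w = w / w.sum()                            # project back to the simplex
    return w

# Step sizes eta(t) = 1 / sqrt(t) satisfy eta_t -> 0 and sum_t eta_t = infinity.
rng = np.random.default_rng(0)
w = omd_negative_entropy((rng.standard_normal(5) for _ in range(1000)), d=5,
                         eta=lambda t: 1.0 / np.sqrt(t))
```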

    Sparse Bilinear Logistic Regression

    In this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. Such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis. The underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. We provide a theoretical guarantee of global convergence and estimate the asymptotic convergence rate using the Kurdyka-Łojasiewicz inequality. A range of experiments with simulated and real data demonstrates that sparse bilinear logistic regression outperforms current techniques in several important applications. Comment: 27 pages, 5 figures
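    The sketch below illustrates the block coordinate descent idea for a bilinear logistic model with score $u^\top X v$ and $\ell_1$ penalties on u and v, handling each block with a few proximal-gradient steps; it omits the paper's specifics (bias term, exact subproblem solver, stopping rules), and all names and constants are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau * ||.||_1: elementwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sparse_bilinear_logreg(Xs, y, lam_u, lam_v, outer=20, inner=200, step=0.01):
    """Block coordinate descent for a bilinear logistic model score(X) = u^T X v
    with l1 penalties on u and v; labels y_i in {-1, +1}.

    With v fixed, the problem in u is an ordinary sparse logistic regression with
    features X_i v (and symmetrically for v), so each block is handled here by a
    few proximal-gradient steps.
    """
    Xs = np.asarray(Xs)                        # shape (n, p, q)
    n, p, q = Xs.shape
    u = np.full(p, 1.0 / np.sqrt(p))
    v = np.full(q, 1.0 / np.sqrt(q))

    def prox_grad_block(features, w, lam):
        # ISTA-style steps on an l1-regularized logistic subproblem.
        for _ in range(inner):
            margins = y * (features @ w)
            grad = -(features.T @ (y / (1.0 + np.exp(margins)))) / n
            w = soft_threshold(w - step * grad, step * lam)
        return w

    for _ in range(outer):
        u = prox_grad_block(Xs @ v, u, lam_u)                          # update u, v fixed
        v = prox_grad_block(np.einsum('npq,p->nq', Xs, u), v, lam_v)   # update v, u fixed
    return u, v
```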