642 research outputs found

Convergence of Unregularized Online Learning Algorithms

Author: Guo Zheng-Chu
Lei Yunwen
Shi Lei
Publication date: 09/08/2017

In this paper we study the convergence of online gradient descent algorithms in reproducing kernel Hilbert spaces (RKHSs) without regularization. We establish a sufficient condition and a necessary condition for the convergence of excess generalization errors in expectation. A sufficient condition for the almost sure convergence is also given. With high probability, we provide explicit convergence rates of the excess generalization errors for both averaged iterates and the last iterate, which in turn also imply convergence rates with probability one. To the best of our knowledge, this is the first high-probability convergence rate for the last iterate of online gradient descent algorithms without strong convexity. Without any boundedness assumptions on iterates, our results are derived by a novel use of two measures of the algorithm's one-step progress, respectively by generalization errors and by distances in RKHSs, where the variances of the involved martingales are cancelled out by the descent property of the algorithm.
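
For orientation, the kind of algorithm this abstract refers to can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the squared loss, the Gaussian kernel, the polynomially decaying step size `eta1 * t**(-theta)`, and the names `gaussian_kernel`, `online_kernel_gd`, and `predict` are all hypothetical choices made here. The iterate is stored as a kernel expansion, and both the last iterate and a running average of the iterates are returned.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=0.2):
    """Gaussian (RBF) kernel; an illustrative choice, not taken from the paper."""
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def online_kernel_gd(xs, ys, eta1=0.5, theta=0.5, kernel=gaussian_kernel):
    """Unregularized online gradient descent with the squared loss (a sketch).

    The iterate is kept as the expansion sum_i coef[i] * K(xs[i], .),
    starting from the zero function. Returns the coefficients of the last
    iterate and of the running average of the iterates after each update.
    """
    n = len(xs)
    coef = np.zeros(n)
    avg = np.zeros(n)
    for t in range(n):
        # Evaluate the current iterate at the new point x_t.
        pred = sum(coef[i] * kernel(xs[i], xs[t]) for i in range(t))
        # Assumed step size schedule: eta_t = eta1 * t^(-theta), t = 1, 2, ...
        eta = eta1 * (t + 1) ** (-theta)
        # The RKHS gradient of 0.5 * (f(x_t) - y_t)^2 is (f(x_t) - y_t) * K(x_t, .),
        # so the update only adds one new coefficient; no regularization term.
        coef[t] = -eta * (pred - ys[t])
        # Running mean of the coefficient vectors seen so far.
        avg += (coef - avg) / (t + 1)
    return coef, avg

def predict(coef, xs, x, kernel=gaussian_kernel):
    """Evaluate the kernel expansion with coefficients `coef` at a point x."""
    return sum(c * kernel(xi, x) for c, xi in zip(coef, xs))
```

Calling `online_kernel_gd(xs, ys)` on arrays of inputs and targets returns the two coefficient vectors; `predict` then evaluates either the last or the averaged iterate at a new point.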

arXiv.org e-Print Archive

University of Birmingham Research Portal

Super-Linear Convergence of Dual Augmented-Lagrangian Algorithm for Sparsity Regularized Estimation

Author: Masashi Sugiyama
Ryota Tomioka
Taiji Suzuki
Tong Zhang
Publication date: 01/01/2011

We analyze the convergence behaviour of a recently proposed algorithm for regularized estimation called Dual Augmented Lagrangian (DAL). Our analysis is based on a new interpretation of DAL as a proximal minimization algorithm. We theoretically show under some conditions that DAL converges super-linearly in a non-asymptotic and global sense. Due to a special modelling of sparse estimation problems in the context of machine learning, the assumptions we make are milder and more natural than those made in conventional analysis of augmented Lagrangian algorithms. In addition, the new interpretation enables us to generalize DAL to wide varieties of sparse estimation problems. We experimentally confirm our analysis in a large-scale $\ell_1$-regularized logistic regression problem and extensively compare the efficiency of the DAL algorithm to previously proposed algorithms on both synthetic and benchmark datasets. Comment: 51 pages, 9 figures

arXiv.org e-Print Archive

CiteSeerX

Convergence of Online Mirror Descent

Author: Lei Yunwen
Zhou Ding-Xuan
Publication date: 13/12/2019

In this paper we consider online mirror descent (OMD) algorithms, a class of scalable online learning algorithms exploiting data geometric structures through mirror maps. Necessary and sufficient conditions are presented in terms of the step size sequence $\{\eta_t\}_{t}$ for the convergence of an OMD algorithm with respect to the expected Bregman distance induced by the mirror map. The condition is $\lim_{t\to\infty}\eta_t=0$, $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances. It is reduced to $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of zero variances for which the linear convergence may be achieved by taking a constant step size sequence. A sufficient condition on the almost sure convergence is also given. We establish tight error bounds under mild conditions on the mirror map, the loss function, and the regularizer. Our results are achieved by some novel analysis on the one-step progress of the OMD algorithm using smoothness and strong convexity of the mirror map and the loss function. Comment: Published in Applied and Computational Harmonic Analysis, 202
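
The role of the step size sequence can be made concrete with a small Python sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the mirror map is chosen as half the squared p-norm, the loss is the squared loss of a linear model, and the names `grad_half_sq_norm`, `online_mirror_descent`, `constant_steps`, and `decaying_steps` are hypothetical. A constant schedule has a divergent sum, which is the regime the abstract associates with zero variances; the decaying schedule `eta1 * t**(-theta)` with `0 < theta <= 1` also tends to zero, matching the condition stated for positive variances.

```python
import numpy as np

def grad_half_sq_norm(v, p):
    """Gradient of 0.5 * ||v||_p^2 for 1 < p < infinity (zero at the origin)."""
    norm = np.linalg.norm(v, ord=p)
    if norm == 0.0:
        return np.zeros_like(v)
    return np.sign(v) * np.abs(v) ** (p - 1) * norm ** (2 - p)

def constant_steps(eta):
    """Constant schedule: the sum of the step sizes diverges, but they do not vanish."""
    return lambda t: eta

def decaying_steps(eta1, theta=0.5):
    """eta_t = eta1 * t^(-theta), 0 < theta <= 1: vanishing steps with divergent sum."""
    return lambda t: eta1 * t ** (-theta)

def online_mirror_descent(xs, ys, step_size, p=1.5):
    """OMD for linear least squares with mirror map 0.5 * ||w||_p^2 (a sketch).

    Each round maps w to the dual space with the gradient of the mirror map,
    takes a gradient step there, and maps back with the gradient of the
    conjugate map 0.5 * ||.||_q^2, where 1/p + 1/q = 1.
    """
    q = p / (p - 1.0)
    w = np.zeros(xs.shape[1])
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        grad = (w @ x - y) * x  # gradient of 0.5 * (w.x - y)^2 in w
        dual = grad_half_sq_norm(w, p) - step_size(t) * grad
        w = grad_half_sq_norm(dual, q)
    return w
```

With `p=2.0` both maps are the identity and the loop reduces to plain online gradient descent, which is a convenient sanity check; for example, `online_mirror_descent(xs, ys, decaying_steps(0.1))` runs the p-norm variant with vanishing steps.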

arXiv.org e-Print Archive

University of Birmingham Research Portal

Sparse Bilinear Logistic Regression

Author: Baraniuk Richard G.
Shi Jianing V.
Xu Yangyang
Publication date: 15/04/2014

In this paper, we introduce the concept of sparse bilinear logistic regression for decision problems involving explanatory variables that are two-dimensional matrices. Such problems are common in computer vision, brain-computer interfaces, style/content factorization, and parallel factor analysis. The underlying optimization problem is bi-convex; we study its solution and develop an efficient algorithm based on block coordinate descent. We provide a theoretical guarantee for global convergence and estimate the asymptotic convergence rate using the Kurdyka-Łojasiewicz inequality. A range of experiments with simulated and real data demonstrate that sparse bilinear logistic regression outperforms current techniques in several important applications. Comment: 27 pages, 5 figures
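
A rough Python sketch can make the bi-convex structure mentioned above more tangible. It is an assumption-laden illustration rather than the algorithm from the paper: the model is taken to be `sigmoid(u^T X v + b)` with l1 penalties on `u` and `v`, each block is updated by one proximal gradient step with a fixed step size, and the names `soft_threshold`, `sigmoid`, and `fit_sparse_bilinear_logreg` as well as all default values are hypothetical.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (entrywise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_sparse_bilinear_logreg(Xs, ys, lam=0.01, step=0.1, n_iter=200, seed=0):
    """Block coordinate proximal gradient sketch for a bilinear logistic model.

    Xs has shape (n, s, t), one matrix per sample; ys holds labels in {0, 1}.
    With v fixed the problem in u is an ordinary l1-penalized logistic
    regression on the features X_i v, and symmetrically for v, which is
    what makes the objective bi-convex.
    """
    rng = np.random.default_rng(seed)
    n, s, t = Xs.shape
    # Start away from zero: at u = v = 0 both block gradients vanish.
    u = rng.normal(scale=0.5, size=s)
    v = rng.normal(scale=0.5, size=t)
    b = 0.0
    for _ in range(n_iter):
        # u-block: features are X_i v, shape (n, s).
        feats = Xs @ v
        resid = sigmoid(feats @ u + b) - ys
        u = soft_threshold(u - step * feats.T @ resid / n, step * lam)
        # v-block: features are X_i^T u, shape (n, t).
        feats = np.einsum("nst,s->nt", Xs, u)
        resid = sigmoid(feats @ v + b) - ys
        v = soft_threshold(v - step * feats.T @ resid / n, step * lam)
        # Intercept: plain gradient step, no penalty.
        resid = sigmoid(feats @ v + b) - ys
        b -= step * resid.mean()
    return u, v, b
```

A fixed step size is used only to keep the sketch short; a practical implementation would typically pick the step per block from a Lipschitz estimate or a line search.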

arXiv.org e-Print Archive

CiteSeerX