A Generalized Online Mirror Descent with Applications to Classification and Regression
Online learning algorithms are fast, memory-efficient, easy to implement, and
applicable to many prediction problems, including classification, regression,
and ranking. Several online algorithms have been proposed over the past few decades,
some based on additive updates, like the Perceptron, and some on multiplicative
updates, like Winnow. A unifying perspective on the design and the analysis of
online algorithms is provided by online mirror descent, a general prediction
strategy from which most first-order algorithms can be obtained as special
cases. We generalize online mirror descent to time-varying regularizers with
generic updates. Unlike standard mirror descent, our more general formulation
also captures second-order algorithms, algorithms for composite losses, and
algorithms for adaptive filtering. Moreover, we recover, and sometimes improve,
known regret bounds as special cases of our analysis using specific
regularizers. Finally, we show the power of our approach by deriving a new
second-order algorithm whose regret bound is invariant with respect to
arbitrary rescalings of individual features.
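The additive-versus-multiplicative dichotomy mentioned above (Perceptron-like versus Winnow-like updates) can be sketched as two instances of mirror descent that differ only in the mirror map. This is a minimal illustrative sketch with a mistake-driven hinge-type update, not the paper's general time-varying-regularizer algorithm; all names and the learning rate are assumptions.

```python
import numpy as np

def omd(data, labels, mirror="euclidean", eta=0.5):
    """Online mirror descent for binary classification (illustrative sketch).

    mirror="euclidean" -> additive, Perceptron-like updates;
    mirror="entropic"  -> multiplicative, Winnow-like (exponentiated-gradient)
                          updates, with weights kept on the probability simplex.
    """
    d = data.shape[1]
    if mirror == "euclidean":
        w = np.zeros(d)
    else:
        w = np.full(d, 1.0 / d)  # entropic mirror map: start at simplex center
    mistakes = 0
    for x, y in zip(data, labels):
        if y * np.dot(w, x) <= 0:            # prediction mistake
            mistakes += 1
            if mirror == "euclidean":
                w = w + eta * y * x          # additive gradient step
            else:
                w = w * np.exp(eta * y * x)  # multiplicative step
                w = w / w.sum()              # renormalize onto the simplex
    return w, mistakes
```

Swapping the mirror map swaps the geometry of the update while the surrounding algorithm, and its regret analysis, stays the same; that is the unification the abstract refers to.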
An Efficient Primal-Dual Prox Method for Non-Smooth Optimization
We study non-smooth optimization problems in machine learning, where both
the loss function and the regularizer are non-smooth. Previous
studies on efficient empirical loss minimization assume either a smooth loss
function or a strongly convex regularizer, making them unsuitable for
non-smooth optimization. We develop a simple yet efficient method for a family
of non-smooth optimization problems where the dual form of the loss function is
bilinear in primal and dual variables. We cast a non-smooth optimization
problem into a minimax optimization problem, and develop a primal-dual prox
method that solves it at an O(1/T) rate (assuming that the proximal step can
be solved efficiently), significantly faster than the O(1/sqrt(T))
convergence rate of standard subgradient descent. Our empirical study
verifies the efficiency of the proposed
method for various non-smooth optimization problems that arise ubiquitously in
machine learning by comparing it to state-of-the-art first-order methods.
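The minimax reformulation can be illustrated on a problem where both the loss and the regularizer are non-smooth, min_w ||Xw - b||_1 + lam*||w||_1, whose saddle-point form is min_w max_{||y||_inf <= 1} y^T(Xw - b) + lam*||w||_1. The sketch below uses a generic Chambolle-Pock-style primal-dual prox iteration as a stand-in, not the paper's exact method; the step-size rule and all names are assumptions.

```python
import numpy as np

def primal_dual_prox(X, b, lam=0.1, n_iter=2000):
    """Primal-dual prox sketch for min_w ||Xw - b||_1 + lam * ||w||_1.

    Dual variable y lives in the l_inf unit ball (conjugate of the l_1 loss);
    step sizes satisfy the usual sigma * tau * ||X||^2 < 1 condition.
    """
    n, d = X.shape
    L = np.linalg.norm(X, 2)       # operator norm (largest singular value)
    sigma = tau = 0.9 / L
    w = np.zeros(d)
    w_bar = w.copy()
    y = np.zeros(n)
    for _ in range(n_iter):
        # dual ascent step, then projection onto the l_inf ball
        y = np.clip(y + sigma * (X @ w_bar - b), -1.0, 1.0)
        # primal descent step, then soft-thresholding (prox of lam * ||.||_1)
        w_new = w - tau * (X.T @ y)
        w_new = np.sign(w_new) * np.maximum(np.abs(w_new) - tau * lam, 0.0)
        w_bar = 2 * w_new - w      # extrapolation step
        w = w_new
    return w
```

Both the dual projection and the soft-thresholding prox are closed-form here, which is exactly the "proximal step can be solved efficiently" assumption under which the fast rate in the abstract holds.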