9 research outputs found
Recovering the Optimal Solution by Dual Random Projection
Random projection has been widely used in data classification. It maps
high-dimensional data into a low-dimensional subspace in order to reduce the
computational cost in solving the related optimization problem. While previous
studies are focused on analyzing the classification performance of using random
projection, in this work, we consider the recovery problem, i.e., how to
accurately recover the optimal solution to the original optimization problem in
the high-dimensional space based on the solution learned from the subspace
spanned by random projections. We present a simple algorithm, termed Dual
Random Projection, that uses the dual solution of the low-dimensional
optimization problem to recover the optimal solution to the original problem.
Our theoretical analysis shows that with a high probability, the proposed
algorithm is able to accurately recover the optimal solution to the original
problem, provided that the data matrix is of low rank or can be well
approximated by a low rank matrix.Comment: The 26th Annual Conference on Learning Theory (COLT 2013
A Modern Introduction to Online Learning
In this monograph, I introduce the basic concepts of Online Learning through
a modern view of Online Convex Optimization. Here, online learning refers to
the framework of regret minimization under worst-case assumptions. I present
first-order and second-order algorithms for online learning with convex losses,
in Euclidean and non-Euclidean settings. All the algorithms are clearly
presented as instantiation of Online Mirror Descent or
Follow-The-Regularized-Leader and their variants. Particular attention is given
to the issue of tuning the parameters of the algorithms and learning in
unbounded domains, through adaptive and parameter-free online learning
algorithms. Non-convex losses are dealt through convex surrogate losses and
through randomization. The bandit setting is also briefly discussed, touching
on the problem of adversarial and stochastic multi-armed bandits. These notes
do not require prior knowledge of convex analysis and all the required
mathematical tools are rigorously explained. Moreover, all the proofs have been
carefully chosen to be as simple and as short as possible.Comment: Fixed more typos, added more history bits, added local norms bounds
for OMD and FTR
A modern introduction to online learning
In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I
present first-order and second-order algorithms for online learning with convex losses, in Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiation of Online Mirror Descent or Follow-The RegularizedLeader and their variants. Particular attention is given to the issue of tuning the parameters of the algorithms and learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis and all the required mathematical tools are rigorously explained. Moreover, all the proofs have been
carefully chosen to be as simple and as short as possible. I want to thank all the people that checked the proofs and reasonings in these notes. In particular, the students in my class that mercilessly pointed out my mistakes, Nicolo Campolongo that found all the typos in my formulas, and Jake Abernethy for the brainstorming on how to make the Tsallis proof even simpler.https://arxiv.org/pdf/1912.13213.pdfFirst author draf
Stochastic Composite Least-Squares Regression with Convergence Rate O(1/n)
International audienceWe consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting of initial conditions in O(1/n 2), and in terms of dependence on the noise and dimension d of the problem, as O(d/n). Our new algorithm is based on averaged accelerated regularized gradient descent, and may also be analyzed through finer assumptions on initial conditions and the Hessian matrix, leading to dimension-free quantities that may still be small while the " optimal " terms above are large. In order to characterize the tightness of these new bounds, we consider an application to non-parametric regression and use the known lower bounds on the statistical performance (without computational limits), which happen to match our bounds obtained from a single pass on the data and thus show optimality of our algorithm in a wide variety of particular trade-offs between bias and variance
Online learning meets optimization in the dual
Abstract. We describe a novel framework for the design and analysis of online learning algorithms based on the notion of duality in constrained optimization. We cast a sub-family of universal online bounds as an optimization problem. Using the weak duality theorem we reduce the process of online learning to the task of incrementally increasing the dual objective function. The amount by which the dual increases serves as a new and natural notion of progress. We are thus able to tie the primal objective value and the number of prediction mistakes using and the increase in the dual. The end result is a general framework for designing and analyzing old and new online learning algorithms in the mistake bound model.