Tight Lower Bounds for Multiplicative Weights Algorithmic Families
We study the fundamental problem of prediction with expert advice and develop
regret lower bounds for a large family of algorithms for this problem. We
develop simple adversarial primitives that lend themselves to various
combinations, leading to sharp lower bounds for many algorithmic families. We
use these primitives to show that the classic Multiplicative Weights Algorithm
(MWA) has a regret of , thereby completely closing
the gap between upper and lower bounds. We further show a regret lower bound of
for a much more general family of
algorithms than MWA, in which the learning rate can be varied arbitrarily over
time, or even drawn from arbitrary distributions over time. We also use our
primitives to construct adversaries in the geometric horizon setting for MWA to
precisely characterize the regret at for the case
of experts, and a lower bound of
for the case of an arbitrary number of experts.
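The MWA update itself is simple to state. Below is a minimal sketch (a generic textbook implementation with illustrative names, not the paper's adversarial constructions):

```python
import math
import random

def mwa(experts_losses, eta):
    """Multiplicative Weights: keep a weight per expert; after each round,
    multiply each expert's weight by exp(-eta * loss). Losses lie in [0, 1].
    Returns the regret of the expected loss against the best expert."""
    n = len(experts_losses[0])              # number of experts
    weights = [1.0] * n
    total_loss = 0.0
    for losses in experts_losses:           # one row of per-expert losses per round
        z = sum(weights)
        probs = [w / z for w in weights]
        # expected loss of sampling an expert from the current distribution
        total_loss += sum(p * l for p, l in zip(probs, losses))
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    best = min(sum(col) for col in zip(*experts_losses))
    return total_loss - best

# usage: T rounds, n experts, eta tuned on the order of sqrt(ln(n) / T)
T, n = 500, 4
rng = random.Random(0)
seq = [[rng.random() for _ in range(n)] for _ in range(T)]
regret = mwa(seq, math.sqrt(math.log(n) / T))
```

Tuning the learning rate on the order of sqrt(ln(n)/T) gives the classic O(sqrt(T ln n)) upper bound that the lower bounds above are measured against.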
Online Learning with an Almost Perfect Expert
We study the multiclass online learning problem where a forecaster makes a
sequence of predictions using the advice of experts. Our main contribution
is to analyze the regime where the best expert makes at most mistakes and
to show that when , the expected number of mistakes made by
the optimal forecaster is at most . We also describe
an adversary strategy showing that this bound is tight and that the worst case
is attained for binary prediction.
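For intuition on mistake bounds of this form, the classic deterministic Weighted Majority forecaster already guarantees O(m + log n) mistakes in the binary setting when the best of n experts makes at most m mistakes. A minimal sketch (a standard textbook baseline, not the optimal forecaster analyzed above):

```python
def weighted_majority(expert_preds, outcomes, beta=0.5):
    """Weighted Majority for binary prediction: predict by weighted vote and
    multiply the weight of every erring expert by beta. If the best expert
    makes at most m mistakes, the forecaster makes O(m + log n) mistakes."""
    n = len(expert_preds[0])
    w = [1.0] * n
    mistakes = 0
    for preds, y in zip(expert_preds, outcomes):   # preds, y in {0, 1}
        vote1 = sum(wi for wi, p in zip(w, preds) if p == 1)
        vote0 = sum(wi for wi, p in zip(w, preds) if p == 0)
        guess = 1 if vote1 >= vote0 else 0
        if guess != y:
            mistakes += 1
        w = [wi * beta if p != y else wi for wi, p in zip(w, preds)]
    return mistakes
```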
Analysis of Perturbation Techniques in Online Learning
The most commonly used regularization technique in machine learning is to directly add a penalty function to the optimization objective. For example, regularization is universally applied to a wide range of models including linear regression and neural networks. The alternative regularization technique, which has become essential in modern applications of machine learning, is implicit regularization by injecting random noise into the training data.
In fact, this idea of using random perturbations as a regularizer underlies one of the earliest algorithms for online learning, in which a learner chooses actions iteratively on a data sequence that may be designed adversarially to thwart the learning process. One such classical algorithm is known as Follow The Perturbed Leader (FTPL).
This dissertation presents new interpretations of FTPL. In the first part, we show that FTPL is equivalent to playing the gradients of a stochastically smoothed potential function in the dual space. In the second part, we show that FTPL is the extension of a differentially private mechanism that has inherent stability guarantees. These perspectives lead to novel frameworks for FTPL regret analysis, which not only prove strong performance guarantees but also help characterize the optimal choice of noise distributions. Furthermore, they extend to the partial information setting where the learner observes only part of the input data.
PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143968/1/chansool_1.pd
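A minimal FTPL sketch for the experts setting, assuming exponential perturbations (the noise distribution and names here are illustrative choices, not the dissertation's analysis):

```python
import math
import random

def ftpl(experts_losses, eta, rng):
    """Follow The Perturbed Leader: each round, perturb every expert's
    cumulative loss with fresh random noise and follow the expert that
    looks best under the perturbation. Returns regret vs. best expert."""
    n = len(experts_losses[0])
    cum = [0.0] * n                        # cumulative loss of each expert
    total = 0.0                            # cumulative loss of the learner
    for losses in experts_losses:
        # exponential noise of scale 1/eta, subtracted as an optimism bonus
        perturbed = [c - rng.expovariate(eta) for c in cum]
        leader = min(range(n), key=lambda i: perturbed[i])
        total += losses[leader]
        cum = [c + l for c, l in zip(cum, losses)]
    return total - min(cum)
```

Larger noise (smaller eta) stabilizes the leader choice but follows stale information longer; the trade-off again yields O(sqrt(T log n)) regret for suitably tuned noise.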
Improved Kernel Alignment Regret Bound for Online Kernel Learning
In this paper, we improve the kernel alignment regret bound for online kernel
learning in the regime of the Hinge loss function. The previous algorithm
achieves a regret of at a computational
complexity (space and per-round time) of ,
where is called the \textit{kernel alignment}. We propose an
algorithm whose regret bound and computational complexity are better than the
previous results. Our results depend on the decay rate of the eigenvalues of
the kernel matrix. If the eigenvalues of the kernel matrix decay exponentially,
then our algorithm enjoys a regret of at a
computational complexity of . Otherwise, our algorithm enjoys a
regret of at a computational complexity of
. We extend our algorithm to batch learning and
obtain an excess risk bound
which improves on the previous bound.
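As context for the computational trade-off discussed above, a baseline unbudgeted kernelized online gradient descent on the Hinge loss looks as follows; storing every support vector is exactly the growing per-round cost that approximation schemes such as the paper's aim to reduce (names and parameters are illustrative, not the paper's algorithm):

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def online_kernel_hinge(stream, eta=0.5, kernel=gaussian_kernel):
    """Kernelized online gradient descent on the Hinge loss. The hypothesis
    is f(x) = sum_i alpha_i * K(x_i, x); whenever the hinge loss is active,
    the new example is appended as a support vector, so space and per-round
    time grow linearly in the number of rounds."""
    support, alphas = [], []
    mistakes = 0
    for x, y in stream:                   # labels y in {-1, +1}
        f = sum(a * kernel(s, x) for s, a in zip(support, alphas))
        if y * f <= 0:
            mistakes += 1
        if y * f < 1:                     # hinge loss is active
            support.append(x)
            alphas.append(eta * y)        # gradient step on the hinge loss
    return mistakes, len(support)
```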
Online Sequential Decision-Making with Unknown Delays
In the field of online sequential decision-making, we address the problem of
delays using the framework of online convex optimization (OCO), where the
feedback for a decision can arrive with an unknown delay. Unlike previous
research, which is limited to the Euclidean norm and gradient information, we
propose three families of delayed algorithms based on approximate solutions to
handle different types of received feedback. Our proposed algorithms are
versatile and applicable to general norms. Specifically, we introduce a family
of Follow the Delayed Regularized Leader algorithms for feedback with full
information on the loss function, a family of Delayed Mirror Descent
algorithms for feedback with gradient information on the loss function, and a
family of Simplified Delayed Mirror Descent algorithms for feedback with the
value information of the loss function's gradients at the corresponding
decision points. For each type of algorithm, we provide corresponding regret
bounds under general convexity and relative strong convexity, respectively. We
also demonstrate the efficiency of each algorithm under different norms
through concrete examples. Furthermore, our theoretical results are consistent
with the current best bounds when specialized to the standard non-delayed
setting.
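The delayed-feedback loop can be sketched in the Euclidean special case as plain delayed online gradient descent; the general algorithms above replace this update with mirror-descent steps under other norms (a minimal illustration, with names and the delay pattern chosen for the example):

```python
def delayed_ogd(grad_fns, delays, dim, eta):
    """Online gradient descent with delayed feedback: the gradient computed
    at round t becomes available only at round t + delays[t], and updates
    are applied as the delayed feedback arrives. Returns the played points."""
    x = [0.0] * dim
    played = []                        # decision made at each round
    pending = {}                       # arrival round -> gradients waiting
    for t, grad in enumerate(grad_fns):
        played.append(list(x))
        g = grad(x)                    # gradient at the point actually played
        pending.setdefault(t + delays[t], []).append(g)
        for gd in pending.pop(t, []):  # feedback from earlier rounds arrives
            x = [xi - eta * gi for xi, gi in zip(x, gd)]
    return played
```

The learner thus acts on stale gradients; longer delays slow convergence, which is why the regret bounds above depend on the delay structure.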