A Drifting-Games Analysis for Online Learning and Applications to Boosting
We provide a general mechanism to design online learning algorithms based on
a minimax analysis within a drifting-games framework. Different online learning
settings (Hedge, multi-armed bandit problems and online convex optimization)
are studied by converting them into various kinds of drifting games. The original
minimax analysis for drifting games is then used and generalized by applying a
series of relaxations, starting from choosing a convex surrogate of the 0-1
loss function. With different choices of surrogates, we not only recover
existing algorithms, but also propose new algorithms that are totally
parameter-free and enjoy other useful properties. Moreover, our drifting-games
framework naturally allows us to study high probability bounds without
resorting to any concentration results, and also a generalized notion of regret
that measures how good the algorithm is compared to all but the top small
fraction of candidates. Finally, we translate our new Hedge algorithm into a
new adaptive boosting algorithm that is computationally faster as shown in
experiments, since it ignores a large number of examples on each round.
Comment: In NIPS 201
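For concreteness, here is a minimal sketch of the classical Hedge (exponential weights) update that this line of work builds on; the fixed learning rate `eta` is an illustrative assumption, whereas the paper's algorithms are parameter-free.

```python
import numpy as np

def hedge(losses, eta=0.5):
    """Classical Hedge: keep a weight per expert, update multiplicatively.

    losses: (T, N) array with losses[t, i] = loss of expert i at round t,
    assumed to lie in [0, 1]. Returns the distributions played each round.
    """
    T, N = losses.shape
    w = np.ones(N)
    plays = []
    for t in range(T):
        p = w / w.sum()                 # distribution over experts this round
        plays.append(p)
        w *= np.exp(-eta * losses[t])   # exponential-weights update
    return np.array(plays)
```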
Private Learning Implies Online Learning: An Efficient Reduction
We study the relationship between the notions of differentially private
learning and online learning in games. Several recent works have shown that
differentially private learning implies online learning, but an open problem of
Neel, Roth, and Wu \cite{NeelAaronRoth2018} asks whether this implication is
{\it efficient}. Specifically, does an efficient differentially private learner
imply an efficient online learner? In this paper we resolve this open question
in the context of pure differential privacy. We derive an efficient black-box
reduction from differentially private learning to online learning from expert
advice.
UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits
In this work, we address the open problem of finding low-complexity
near-optimal multi-armed bandit algorithms for sequential decision making
problems. Existing bandit algorithms are either sub-optimal and computationally
simple (e.g., UCB1) or optimal and computationally complex (e.g., kl-UCB). We
propose a boosting approach to Upper Confidence Bound based algorithms for
stochastic bandits, that we call UCBoost. Specifically, we propose two types of
UCBoost algorithms. We show that UCBoost($D$) enjoys $O(1)$ complexity for each
arm per round as well as a regret guarantee that is $1/e$-close to that of the
kl-UCB algorithm. We propose an approximation-based UCBoost algorithm,
UCBoost($\epsilon$), that enjoys a regret guarantee $\epsilon$-close to that of
kl-UCB as well as $O(\log(1/\epsilon))$ complexity for each arm per round.
Hence, our algorithms provide practitioners a practical way to trade optimality
with computational complexity. Finally, we present numerical results which show
that UCBoost($\epsilon$) can achieve the same regret performance as the
standard kl-UCB while incurring only $1\%$ of the computational cost of kl-UCB.
Comment: Accepted by IJCAI 201
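To make the complexity gap concrete, here is a minimal sketch of the kl-UCB index computation that UCBoost aims to approximate cheaply; the bisection routine and its iteration count are illustrative assumptions, not the paper's construction.

```python
import math

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, iters=32):
    """kl-UCB index: the largest q >= mean with pulls * KL(mean, q) <= log t.

    Found by bisection (KL(mean, q) is increasing in q for q >= mean); each
    index costs `iters` KL evaluations per arm per round, the expense that
    UCBoost trades against regret optimality.
    """
    target = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo
```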
Optimal and Adaptive Algorithms for Online Boosting
We study online boosting, the task of converting any weak online learner into
a strong online learner. Based on a novel and natural definition of weak online
learnability, we develop two online boosting algorithms. The first algorithm is
an online version of boost-by-majority. By proving a matching lower bound, we
show that this algorithm is essentially optimal in terms of the number of weak
learners and the sample complexity needed to achieve a specified accuracy. This
optimal algorithm, however, is not adaptive. Using tools from online loss
minimization, we derive an adaptive online boosting algorithm that is also
parameter-free, but not optimal. Both algorithms work with base learners that
can handle example importance weights directly, as well as by rejection
sampling examples with probability defined by the booster. Results are
complemented with an extensive experimental study.
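The rejection-sampling interface mentioned above can be sketched in a few lines; the `weak_learner.update` method and `max_weight` bound are assumed interfaces for illustration, not the paper's API.

```python
import random

def feed_weak_learner(example, weight, weak_learner, max_weight=1.0):
    """Pass a booster-weighted example to a base learner that only accepts
    unweighted examples: accept it with probability weight / max_weight, so
    that in expectation the base learner sees the weighted distribution."""
    if random.random() < weight / max_weight:
        weak_learner.update(example)
```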
Fast rates with high probability in exp-concave statistical learning
We present an algorithm for the statistical learning setting with a bounded
exp-concave loss in $d$ dimensions that obtains excess risk
$O(d \log(1/\delta)/n)$ with probability at least $1 - \delta$. The core
technique is to boost the confidence of recent in-expectation $O(d/n)$ excess
risk bounds for empirical risk minimization (ERM), without sacrificing the
rate, by leveraging a Bernstein condition which holds due to exp-concavity. We
also show that with probability $1 - \delta$ the standard ERM method obtains
excess risk $O(d (\log n + \log(1/\delta))/n)$. We further show that a regret bound for
any online learner in this setting translates to a high probability excess risk
bound for the corresponding online-to-batch conversion of the online learner.
Lastly, we present two high probability bounds for the exp-concave model
selection aggregation problem that are quantile-adaptive in a certain sense.
The first bound is for a purely exponential weights type algorithm, obtains a
nearly optimal rate, and has no explicit dependence on the Lipschitz continuity
of the loss. The second bound requires Lipschitz continuity but obtains the
optimal rate.
Comment: added results on model selection aggregation (Section 7)
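As background for the online-to-batch claim, the standard in-expectation conversion (notation mine, a sketch of the well-known argument rather than the paper's high-probability version): for a convex loss, the averaged iterate $\bar{f} = \frac{1}{n}\sum_{t=1}^{n} f_t$ of any online learner satisfies

$$\mathbb{E}\big[R(\bar{f})\big] - R(f^{*}) \;\le\; \frac{\mathbb{E}[\mathrm{Regret}_n]}{n},$$

by Jensen's inequality and the fact that $f_t$ depends only on the first $t-1$ samples; the paper shows how such regret bounds lift to high-probability excess risk bounds.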
Learning Reductions that Really Work
We provide a summary of the mathematical and computational techniques that
have enabled learning reductions to effectively address a wide class of
problems, and show that this approach to solving machine learning problems can
be broadly useful.
Thompson Sampling for the MNL-Bandit
We consider a sequential subset selection problem under parameter
uncertainty, where at each time step, the decision maker selects a subset of
cardinality $K$ from $N$ possible items (arms), and observes a (bandit)
feedback in the form of the index of one of the items in said subset, or none.
Each item in the index set is ascribed a certain value (reward), and the
feedback is governed by a Multinomial Logit (MNL) choice model whose parameters
are a priori unknown. The objective of the decision maker is to maximize the
expected cumulative rewards over a finite horizon $T$, or alternatively,
minimize the regret relative to an oracle that knows the MNL parameters. We
refer to this as the MNL-Bandit problem. This problem is representative of a
larger family of exploration-exploitation problems that involve a combinatorial
objective, and arise in several important application domains. We present an
approach to adapt Thompson Sampling to this problem and show that it achieves
near-optimal regret as well as attractive numerical performance.
Comment: Accepted for presentation at Conference on Learning Theory (COLT) 201
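A toy sketch of one Thompson-sampling round for the MNL-Bandit may help; the Gaussian stand-in posterior and brute-force assortment search are illustrative assumptions (the paper's sampling scheme and optimization are more refined).

```python
import itertools
import numpy as np

def mnl_expected_revenue(S, v, r):
    """Expected revenue of assortment S under an MNL model with item
    utilities v and rewards r (the no-purchase option has utility 1)."""
    denom = 1.0 + sum(v[i] for i in S)
    return sum(r[i] * v[i] for i in S) / denom

def thompson_round(rng, post_mean, post_std, r, K):
    """One round: sample utilities from a stand-in Gaussian posterior, then
    pick the best assortment of size <= K by brute force (fine for small N)."""
    v = np.clip(rng.normal(post_mean, post_std), 1e-6, None)
    best_S, best_rev = (), 0.0
    for k in range(1, K + 1):
        for S in itertools.combinations(range(len(r)), k):
            rev = mnl_expected_revenue(S, v, r)
            if rev > best_rev:
                best_S, best_rev = S, rev
    return best_S
```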
Not-So-Random Features
We propose a principled method for kernel learning, which relies on a
Fourier-analytic characterization of translation-invariant or
rotation-invariant kernels. Our method produces a sequence of feature maps,
iteratively refining the SVM margin. We provide rigorous guarantees for
optimality and generalization, interpreting our algorithm as online
equilibrium-finding dynamics in a certain two-player min-max game. Evaluations
on synthetic and real-world datasets demonstrate scalability and consistent
improvements over related random features-based methods.
Comment: Published as a conference paper at ICLR 201
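For contrast with the learned feature maps above, a minimal sketch of the classical random Fourier features baseline (the standard construction, not this paper's method):

```python
import numpy as np

def random_fourier_features(X, D=200, gamma=1.0, seed=0):
    """Random Fourier features z(x) with z(x)^T z(y) ~ exp(-gamma ||x-y||^2).

    The Gaussian kernel's spectral measure is N(0, 2*gamma*I), so frequencies
    are drawn from it and paired with uniform random phases.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, np.sqrt(2 * gamma), size=(X.shape[1], D))
    b = rng.uniform(0.0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```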
Composite Binary Losses
We study losses for binary classification and class probability estimation
and extend the understanding of them from margin losses to general composite
losses which are the composition of a proper loss with a link function. We
characterise when margin losses can be proper composite losses, explicitly show
how to determine a symmetric loss in full from half of one of its partial
losses, introduce an intrinsic parametrisation of composite binary losses and
give a complete characterisation of the relationship between proper losses and
``classification calibrated'' losses. We also consider the question of the
``best'' surrogate binary loss. We introduce a precise notion of ``best'' and
show there exist situations where two convex surrogate losses are
incommensurable. We provide a complete explicit characterisation of the
convexity of composite binary losses in terms of the link function and the
weight function associated with the proper loss which make up the composite
loss. This characterisation suggests new ways of ``surrogate tuning''. Finally,
in an appendix we present some new algorithm-independent results on the
relationship between properness, convexity and robustness to misclassification
noise for binary losses and show that all convex proper losses are non-robust
to misclassification noise.
Comment: 38 pages, 4 figures. Submitted to JML
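As a concrete instance of the composite-loss framework (a standard example, with notation mine): composing the proper log-loss with the logit link $\psi(p) = \log\frac{p}{1-p}$, so that $\psi^{-1}(v) = \frac{1}{1+e^{-v}}$, gives

$$\ell(1, v) = \log\big(1 + e^{-v}\big), \qquad \ell(-1, v) = \log\big(1 + e^{v}\big),$$

i.e., the familiar logistic margin loss arises as a proper composite loss $\ell(y, v) = \lambda(y, \psi^{-1}(v))$.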
Exp-Concavity of Proper Composite Losses
The goal of online prediction with expert advice is to find a decision
strategy which will perform almost as well as the best expert in a given pool
of experts, on any sequence of outcomes. This problem has been widely studied
and $O(\sqrt{T})$ and $O(\log T)$ regret bounds can be achieved for convex
losses (\cite{zinkevich2003online}) and strictly convex losses with bounded
first and second derivatives (\cite{hazan2007logarithmic}) respectively. In
special cases like the Aggregating Algorithm (\cite{vovk1995game}) with mixable
losses and the Weighted Average Algorithm (\cite{kivinen1999averaging}) with
exp-concave losses, it is possible to achieve $O(1)$ regret bounds.
\cite{van2012exp} has argued that mixability and exp-concavity are roughly
equivalent under certain conditions. Thus by understanding the underlying
relationship between these two notions we can gain the best of both algorithms
(strong theoretical performance guarantees of the Aggregating Algorithm and the
computational efficiency of the Weighted Average Algorithm). In this paper we
provide a complete characterization of the exp-concavity of any proper
composite loss. Using this characterization and the mixability condition of
proper losses (\cite{van2012mixability}), we show that it is possible to
transform (re-parameterize) any $\beta$-mixable binary proper loss into a
$\beta$-exp-concave composite loss with the same $\beta$. In the multi-class
case, we propose an approximation approach for this transformation
- …
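For reference, the two notions at the heart of this abstract, in standard form (notation mine): a loss $\ell$ is $\alpha$-exp-concave if $v \mapsto \exp(-\alpha\,\ell(y, v))$ is concave for every outcome $y$, and $\beta$-mixable if for every distribution $\pi$ over predictions there is a single prediction $v_\pi$ such that, for all outcomes $y$,

$$\ell(y, v_\pi) \;\le\; -\tfrac{1}{\beta} \log \int \exp\big(-\beta\,\ell(y, v)\big)\, \mathrm{d}\pi(v).$$

Exp-concavity with parameter $\beta$ implies $\beta$-mixability; the re-parameterization described above supplies a converse for binary proper losses.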