Private Learning Implies Online Learning: An Efficient Reduction
We study the relationship between the notions of differentially private
learning and online learning in games. Several recent works have shown that
differentially private learning implies online learning, but an open problem of
Neel, Roth, and Wu (2018) asks whether this implication is
{\it efficient}. Specifically, does an efficient differentially private learner
imply an efficient online learner? In this paper we resolve this open question
in the context of pure differential privacy. We derive an efficient black-box
reduction from differentially private learning to online learning from expert
advice.
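The online side of this reduction is prediction with expert advice. As a reference point only, here is a minimal sketch of the standard multiplicative-weights (Hedge) experts learner that such a reduction targets; this is textbook material, not the paper's reduction, and the function name and interface are illustrative.

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Prediction with expert advice via multiplicative weights (Hedge).

    loss_rounds: list of per-round loss vectors (one loss in [0, 1] per
    expert). Returns the learner's expected loss and each expert's total.
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    learner_loss = 0.0
    expert_loss = [0.0] * n
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]  # play an expert drawn ~ probs
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            expert_loss[i] += l
            weights[i] *= math.exp(-eta * l)  # downweight lossy experts
    return learner_loss, expert_loss
```

With a suitable learning rate, the learner's cumulative loss exceeds the best expert's by only O(sqrt(T log n)) over T rounds.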
The Sample Complexity of Multi-Distribution Learning for VC Classes
Multi-distribution learning is a natural generalization of PAC learning to
settings with multiple data distributions. There remains a significant gap
between the known upper and lower bounds for PAC-learnable classes. In
particular, though we understand the sample complexity of learning a VC
dimension $d$ class on $k$ distributions to be
$O\big(\epsilon^{-2}\ln(k)(d+k) + \min\{\epsilon^{-1}dk,\ \epsilon^{-4}\ln(k)d\}\big)$,
the best lower bound is $\Omega\big(\epsilon^{-2}(d + k\ln(k))\big)$. We discuss recent progress on this
problem and some hurdles that are fundamental to the use of game dynamics in
statistical learning.
Comment: 11 pages. Authors are ordered alphabetically. Open problem presented
at the 36th Annual Conference on Learning Theory
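The closing remark about game dynamics refers to the standard minimax framing: multi-distribution learning is commonly cast as a zero-sum game between a learner and an adversary that reweights distributions, solved by no-regret play. A toy sketch of that machinery on a small matrix game (assuming payoffs in [0, 1] and finite strategy sets; this is the generic technique, not the paper's construction):

```python
import math

def solve_zero_sum(A, rounds=2000, eta=0.05):
    """Approximate the value of max_x min_y x^T A y via no-regret dynamics:
    the row player runs Hedge over rows while the column player best-responds.
    The running average of the realized payoffs approaches the game's value.
    Entries of A are assumed to lie in [0, 1]."""
    m, n = len(A), len(A[0])
    w = [1.0] * m
    avg = 0.0
    for t in range(1, rounds + 1):
        s = sum(w)
        x = [wi / s for wi in w]  # row player's current mixed strategy
        col = [sum(x[i] * A[i][j] for i in range(m)) for j in range(n)]
        j = min(range(n), key=lambda jj: col[jj])  # column best response
        avg += (col[j] - avg) / t                  # running mean of payoffs
        for i in range(m):
            w[i] *= math.exp(eta * A[i][j])        # Hedge update (gains)
    return avg
```

On matching pennies the average play converges to the game's value 1/2, with error governed by Hedge's regret.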
An Improved Relaxation for Oracle-Efficient Adversarial Contextual Bandits
We present an oracle-efficient relaxation for the adversarial contextual
bandits problem, where the contexts are sequentially drawn i.i.d from a known
distribution and the cost sequence is chosen by an online adversary. Our
algorithm has a regret bound of $O(T^{2/3}(K\log(|\Pi|))^{1/3})$ and makes at
most $O(K)$ calls per round to an offline optimization oracle, where $K$
denotes the number of actions, $T$ denotes the number of rounds and $\Pi$
denotes the set of policies. This is the first result to improve the prior
best bound of $O((TK)^{2/3}(\log(|\Pi|))^{1/3})$ as obtained by Syrgkanis et
al. at NeurIPS 2016, and the first to match the original bound of Langford and
Zhang at NeurIPS 2007, which was obtained for the stochastic case.
Comment: Appears in NeurIPS 202
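For orientation, the classical oracle-based template behind bounds of this shape is epsilon-greedy exploration with a cost-sensitive offline oracle, in the spirit of Langford and Zhang's Epoch-Greedy. The sketch below is not the paper's relaxation-based algorithm: the finite policy class, the importance-weighted (IPS) oracle, and all names are illustrative assumptions.

```python
import random

def ips_oracle(policies, data):
    """Hypothetical offline oracle: return the policy minimizing an
    importance-weighted (IPS) cost estimate over logged bandit data."""
    def est(pi):
        return sum(c / p for (x, a, c, p) in data if pi(x) == a)
    return min(policies, key=est)

def epsilon_greedy(policies, contexts, cost, K, eps=0.2, seed=0):
    """Toy epsilon-greedy contextual bandit: explore uniformly with
    probability eps, otherwise play the oracle's current policy, and
    refit from logged data with one oracle call per round."""
    rng = random.Random(seed)
    data, total = [], 0.0
    pi_hat = policies[0]
    for x in contexts:
        greedy = pi_hat(x)
        a = rng.randrange(K) if rng.random() < eps else greedy
        p = eps / K + (1.0 - eps) * (a == greedy)  # propensity of action a
        c = cost(x, a)
        total += c
        data.append((x, a, c, p))
        pi_hat = ips_oracle(policies, data)
    return total
```

The point of the oracle-efficiency line of work is to keep this "one offline optimization per round" structure while achieving adversarial regret guarantees that plain epsilon-greedy only enjoys in the stochastic case.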
Online Improper Learning with an Approximation Oracle
We revisit the question of reducing online learning to approximate
optimization of the offline problem. In this setting, we give two algorithms
with near-optimal performance in the full information setting: they guarantee
optimal regret and require only poly-logarithmically many calls to the
approximation oracle per iteration. Furthermore, these algorithms apply to the
more general setting of improper learning. In the bandit setting, our
algorithm also significantly improves the best previously known oracle
complexity while maintaining the same regret.
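The full-information version of this online-to-offline template is often illustrated with Follow-the-Perturbed-Leader, which turns an offline minimization oracle into an online learner by perturbing cumulative losses. A minimal sketch over d coordinates, assuming an exact rather than approximate oracle (the approximation-oracle and bandit machinery that is the paper's actual contribution is not reproduced here):

```python
import random

def ftpl_experts(loss_rounds, eta=0.1, seed=0):
    """Follow-the-Perturbed-Leader over d coordinates. Each round, the
    "offline oracle" (here just an argmin) is called once on the cumulative
    losses minus a fresh exponential perturbation of mean 1/eta."""
    rng = random.Random(seed)
    d = len(loss_rounds[0])
    cum = [0.0] * d
    total = 0.0
    for losses in loss_rounds:
        perturbed = [cum[i] - rng.expovariate(eta) for i in range(d)]
        i_star = min(range(d), key=lambda i: perturbed[i])  # oracle call
        total += losses[i_star]
        for i in range(d):
            cum[i] += losses[i]
    return total, cum
```

With this schedule the learner makes exactly one oracle call per round; the cited algorithms instead tolerate oracles that only solve the offline problem approximately, which breaks the naive argmin analysis.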