
    Private Learning Implies Online Learning: An Efficient Reduction

    We study the relationship between the notions of differentially private learning and online learning in games. Several recent works have shown that differentially private learning implies online learning, but an open problem of Neel, Roth, and Wu (2018) asks whether this implication is efficient. Specifically, does an efficient differentially private learner imply an efficient online learner? In this paper we resolve this open question in the context of pure differential privacy. We derive an efficient black-box reduction from differentially private learning to online learning from expert advice.
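
    As context for the target of this reduction, online learning from expert advice is commonly instantiated by the multiplicative-weights (Hedge) algorithm. The following is a minimal sketch of that standard algorithm, not of the paper's reduction; the learning rate eta and the assumption that losses lie in [0, 1] are illustrative choices.

    import numpy as np

    def hedge(loss_stream, n_experts, eta=0.1):
        # Maintain one weight per expert; play the normalized weights as a
        # distribution and update multiplicatively with the observed losses.
        weights = np.ones(n_experts)
        total_loss = 0.0
        for losses in loss_stream:
            losses = np.asarray(losses, dtype=float)   # assumed to lie in [0, 1]
            probs = weights / weights.sum()            # distribution over experts
            total_loss += float(probs @ losses)        # learner's expected loss
            weights *= np.exp(-eta * losses)           # exponential-weights update
        return total_loss

    # Toy usage: 5 experts, 1000 rounds of random losses.
    rng = np.random.default_rng(0)
    print(hedge((rng.random(5) for _ in range(1000)), n_experts=5))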

    The Sample Complexity of Multi-Distribution Learning for VC Classes

    Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2}\ln(k)(d + k) + \min\{\epsilon^{-1} dk, \epsilon^{-4}\ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k\ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
    Comment: 11 pages. Authors are ordered alphabetically. Open problem presented at the 36th Annual Conference on Learning Theory.
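
    To get a feel for the gap between the two stated bounds, one can evaluate the expressions inside the $O(\cdot)$ and $\Omega(\cdot)$ directly; constants are ignored, and the particular values of $\epsilon$, $d$, $k$ below are arbitrary illustrations, not taken from the paper.

    from math import log

    def upper_bound(eps, d, k):
        # eps^-2 * ln(k) * (d + k) + min{ eps^-1 * d * k, eps^-4 * ln(k) * d }
        return eps**-2 * log(k) * (d + k) + min(eps**-1 * d * k, eps**-4 * log(k) * d)

    def lower_bound(eps, d, k):
        # eps^-2 * (d + k * ln(k))
        return eps**-2 * (d + k * log(k))

    for d, k, eps in [(10, 100, 0.1), (100, 10, 0.05)]:
        print(d, k, eps, upper_bound(eps, d, k) / lower_bound(eps, d, k))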

    An Improved Relaxation for Oracle-Efficient Adversarial Contextual Bandits

    We present an oracle-efficient relaxation for the adversarial contextual bandits problem, where the contexts are sequentially drawn i.i.d. from a known distribution and the cost sequence is chosen by an online adversary. Our algorithm has a regret bound of $O(T^{\frac{2}{3}}(K\log(|\Pi|))^{\frac{1}{3}})$ and makes at most $O(K)$ calls per round to an offline optimization oracle, where $K$ denotes the number of actions, $T$ denotes the number of rounds, and $\Pi$ denotes the set of policies. This is the first result to improve the prior best bound of $O((TK)^{\frac{2}{3}}(\log(|\Pi|))^{\frac{1}{3}})$ obtained by Syrgkanis et al. at NeurIPS 2016, and the first to match the original bound of Langford and Zhang at NeurIPS 2007, which was obtained for the stochastic case.
    Comment: Appears in NeurIPS 2023.
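
    Up to constants, the improvement over the prior bound is a factor of $K^{1/3}$, since $(TK)^{2/3}(\log|\Pi|)^{1/3} / \big(T^{2/3}(K\log|\Pi|)^{1/3}\big) = K^{1/3}$. The snippet below simply evaluates both expressions; the values of $T$, $K$, and $|\Pi|$ are arbitrary choices for illustration.

    from math import log

    def new_bound(T, K, n_policies):
        # T^(2/3) * (K * log|Pi|)^(1/3), the bound stated in this abstract
        return T ** (2 / 3) * (K * log(n_policies)) ** (1 / 3)

    def prior_bound(T, K, n_policies):
        # (T * K)^(2/3) * (log|Pi|)^(1/3), the Syrgkanis et al. (2016) bound
        return (T * K) ** (2 / 3) * log(n_policies) ** (1 / 3)

    T, K, n_policies = 10_000, 20, 1_000
    print(prior_bound(T, K, n_policies) / new_bound(T, K, n_policies))  # roughly K ** (1/3)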

    Online Improper Learning with an Approximation Oracle

    We revisit the question of reducing online learning to approximate optimization of the offline problem. In this setting, we give two algorithms with near-optimal performance in the full information setting: they guarantee optimal regret and require only poly-logarithmically many calls to the approximation oracle per iteration. Furthermore, these algorithms extend to the more general problem of improper learning. In the bandit setting, our algorithm also significantly improves the best previously known oracle complexity while maintaining the same regret.
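
    The common thread in these oracle-based reductions is a per-round call to an offline optimizer over the (possibly perturbed) cumulative losses. As a generic illustration of that interface only, and not of this paper's algorithm, a follow-the-perturbed-leader style loop over a finite decision set could look like the sketch below; the oracle, decision set, and loss vectors are all hypothetical stand-ins.

    import numpy as np

    def offline_oracle(linear_loss, decisions):
        # Stand-in for an offline optimizer over a finite decision set: returns the
        # decision with the smallest linear loss. A real approximation oracle would
        # only return a near-optimal point.
        return decisions[int(np.argmin(decisions @ linear_loss))]

    def perturbed_leader(loss_stream, decisions, scale=1.0, seed=0):
        # Generic follow-the-perturbed-leader loop: one oracle call per round on the
        # perturbed cumulative loss vector.
        rng = np.random.default_rng(seed)
        cumulative = np.zeros(decisions.shape[1])
        total = 0.0
        for loss in loss_stream:
            play = offline_oracle(cumulative - rng.exponential(scale, cumulative.shape), decisions)
            total += float(play @ loss)
            cumulative += loss
        return total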