The Sample Complexity of Multi-Distribution Learning for VC Classes

Abstract

Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though the sample complexity of learning a class of VC dimension $d$ on $k$ distributions is known to be $O(\epsilon^{-2} \ln(k)(d + k) + \min\{\epsilon^{-1} dk,\ \epsilon^{-4} \ln(k)\, d\})$, the best known lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.

Comment: 11 pages. Authors are ordered alphabetically. Open problem presented at the 36th Annual Conference on Learning Theory.
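The gap between the bounds stated above can be summarized in one display, using the abstract's notation ($d$ the VC dimension, $k$ the number of distributions, $\epsilon$ the accuracy); $m(\epsilon, d, k)$ is shorthand introduced here for the optimal sample complexity, not notation from the paper:

```latex
% Known bounds on the sample complexity m(eps, d, k) of
% multi-distribution learning for a VC dimension d class
% over k distributions (notation as in the abstract).
\[
\underbrace{\Omega\!\left(\epsilon^{-2}\bigl(d + k\ln k\bigr)\right)}_{\text{best known lower bound}}
\;\le\;
m(\epsilon, d, k)
\;\le\;
\underbrace{O\!\left(\epsilon^{-2}\ln(k)(d + k) + \min\bigl\{\epsilon^{-1} dk,\ \epsilon^{-4}\ln(k)\,d\bigr\}\right)}_{\text{known upper bound}}.
\]
```

Note that the two sides disagree in their dependence on both $\epsilon$ and the interaction between $d$ and $k$, which is the open problem the paper discusses.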
