
    The unstable formula theorem revisited

    We first prove that Littlestone classes, those which model theorists call stable, characterize learnability in a new statistical model: a learner in this new setting outputs the same hypothesis, up to measure zero, with probability one, after a uniformly bounded number of revisions. This fills a certain gap in the literature, and sets the stage for an approximation theorem characterizing Littlestone classes in terms of a range of learning models, by analogy to definability of types in model theory. We then give a complete analogue of Shelah's celebrated (and perhaps a priori untranslatable) Unstable Formula Theorem in the learning setting, with algorithmic arguments taking the place of the infinite
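    For background (not part of the abstract above): the Littlestone classes referred to here are, by the standard definition, classes of finite Littlestone dimension, measured via binary mistake trees; a minimal statement of that standard definition is sketched below in LaTeX.

    % Background sketch: the standard mistake-tree definition of Littlestone dimension.
    % A class H of {0,1}-valued hypotheses on a domain X shatters a complete binary
    % tree of depth d, whose internal nodes are labeled by points of X, if every
    % root-to-leaf path (x_1, b_1), ..., (x_d, b_d) is realized by some h in H
    % with h(x_i) = b_i for all i.
    \[
      \operatorname{Ldim}(\mathcal{H}) \;=\; \sup\{\, d \in \mathbb{N} : \mathcal{H}\ \text{shatters some mistake tree of depth } d \,\}.
    \]
    % Littlestone (stable) classes are exactly those with finite Ldim(H).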

    Stability in Online Learning: From Random Perturbations in Bandit Problems to Differential Privacy

    Online learning is an area of machine learning that studies algorithms that make sequential predictions on data arriving incrementally. In this thesis, we investigate the stability of online learning algorithms in two different settings. First, we examine random perturbation methods as a source of stability in bandit problems. Second, we study stability as a key concept connecting online learning and differential privacy. The first two chapters study the statistical properties of the perturbation technique in both stochastic and adversarial multi-armed bandit problems. We provide the first general analysis of perturbations for the stochastic multi-armed bandit problem. We also show that the open problem regarding minimax-optimal perturbations for adversarial bandits cannot be solved in two ways that might seem very natural. The next two chapters consider stationary and non-stationary stochastic linear bandits, respectively. We develop two randomized exploration strategies: (1) replacing optimism with a simple randomization when choosing a confidence level in optimism-based algorithms, and (2) directly injecting random perturbations into current estimates to overcome the conservatism from which optimism-based algorithms generally suffer. Furthermore, we study the statistical and computational aspects of both strategies. While at first glance it may seem that online learning and differential privacy have little in common, there is a strong connection between them via the notion of stability, since the definition of differential privacy is, at its core, a form of stability. The final chapter investigates whether the recently established equivalence between online and private learnability in binary classification extends to multiclass classification and regression.
    PhD thesis, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169709/1/baekjin_1.pd
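    To make strategy (2) above concrete, here is a minimal sketch of perturbation-based exploration for a stochastic multi-armed bandit: empirical means are perturbed with Gaussian noise instead of being inflated with an optimistic confidence bonus. The Gaussian noise scale, its 1/sqrt(count) decay, and the Bernoulli reward environment are illustrative assumptions, not the thesis's exact algorithm.

    # Sketch: perturbation-based exploration for a stochastic multi-armed bandit.
    # Each round, perturb every arm's empirical mean with Gaussian noise and play
    # the arm with the highest perturbed estimate (instead of a UCB-style bonus).
    # The noise scale and Bernoulli environment are illustrative assumptions.
    import numpy as np

    def perturbed_bandit(true_means, horizon, sigma=1.0, seed=None):
        rng = np.random.default_rng(seed)
        k = len(true_means)
        counts = np.zeros(k)          # number of pulls per arm
        sums = np.zeros(k)            # sum of observed rewards per arm
        rewards = []
        for t in range(horizon):
            if t < k:                 # pull each arm once to initialize
                arm = t
            else:
                means = sums / counts
                # Perturbation shrinks as an arm is pulled more often,
                # mimicking the width of a confidence interval.
                noise = rng.normal(0.0, sigma / np.sqrt(counts))
                arm = int(np.argmax(means + noise))
            r = rng.binomial(1, true_means[arm])   # Bernoulli reward (assumed)
            counts[arm] += 1
            sums[arm] += r
            rewards.append(r)
        return np.array(rewards)

    # Example usage: average reward approaches the best arm's mean (0.7) over time.
    rewards = perturbed_bandit([0.3, 0.5, 0.7], horizon=5000, seed=0)
    print("average reward:", rewards.mean())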

    Multiclass versus binary differentially private PAC learning

    We show a generic reduction from multiclass differentially private PAC learning to binary private PAC learning. We apply this transformation to a recently proposed binary private PAC learner to obtain a private multiclass learner with sample complexity that has a polynomial dependence on the multiclass Littlestone dimension and a poly-logarithmic dependence on the number of classes. This yields a doubly exponential improvement in the dependence on both parameters over learners from previous work. Our proof extends the notion of Ψ-dimension defined in the work of Ben-David et al. [5] to the online setting and explores its general properties.
    https://proceedings.neurips.cc/paper/2021/file/c1d53b7a97707b5cd1815c8d228d8ef1-Paper.pd

    Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes

    In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the learned model by comparing it with the best hypothesis in the class (the agnostic setting). Taking a step beyond these classic setups that involve only a single hypothesis class, we study a variety of problems that involve two hypothesis classes simultaneously. We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning: given two binary hypothesis classes S and B, we assume that the data is labeled according to a hypothesis in the source class S and require the learned model to achieve an accuracy comparable to the best hypothesis in the benchmark class B. Even when both S and B have infinite VC dimensions, comparative learning can still have a small sample complexity. We show that the sample complexity of comparative learning is characterized by the mutual VC dimension VC(S,B), which we define to be the maximum size of a subset shattered by both S and B. We also show a similar result in the online setting, where we give a regret characterization in terms of the analogous mutual Littlestone dimension Ldim(S,B). These results also hold for partial hypotheses. We additionally show that the insights necessary to characterize the sample complexity of comparative learning can be applied to other tasks involving two hypothesis classes. In particular, we characterize the sample complexity of realizable multiaccuracy and multicalibration using the mutual fat-shattering dimension, an analogue of the mutual VC dimension for real-valued hypotheses. This not only solves an open problem proposed by Hu, Peale, and Reingold (2022), but also leads to independently interesting results extending classic ones about regression, boosting, and covering numbers to our two-hypothesis-class setting.
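    As an illustration of the mutual VC dimension VC(S,B) defined above, the following brute-force sketch computes it for finite hypothesis classes over a finite domain, with each hypothesis encoded as a tuple of 0/1 labels. This exponential-time check is purely illustrative and is not an algorithm from the paper.

    # Brute-force illustration of the mutual VC dimension VC(S, B): the largest
    # subset of the domain shattered by *both* classes. Hypotheses are tuples of
    # 0/1 labels over a finite domain; illustrative only (exponential time).
    from itertools import combinations

    def shatters(hypotheses, subset):
        """True if every 0/1 pattern on `subset` is realized by some hypothesis."""
        patterns = {tuple(h[i] for i in subset) for h in hypotheses}
        return len(patterns) == 2 ** len(subset)

    def mutual_vc(S, B, domain_size):
        for d in range(domain_size, 0, -1):
            for subset in combinations(range(domain_size), d):
                if shatters(S, subset) and shatters(B, subset):
                    return d
        return 0

    # Tiny example over a 3-point domain: S shatters {0,1}, B shatters {1,2},
    # and both shatter {1}, so VC(S,B) = 1 here.
    S = {(0,0,0), (0,1,0), (1,0,0), (1,1,0)}
    B = {(0,0,0), (0,0,1), (0,1,0), (0,1,1)}
    print(mutual_vc(S, B, 3))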