The unstable formula theorem revisited
We first prove that Littlestone classes, those which model theorists call stable, characterize learnability in a new statistical model: a learner in this new setting outputs the same hypothesis, up to measure zero, with probability one, after a uniformly bounded number of revisions. This fills a certain gap in the literature, and sets the stage for an approximation theorem characterizing Littlestone classes in terms of a range of learning models, by analogy to definability of types in model theory. We then give a complete analogue of Shelah's celebrated (and perhaps a priori untranslatable) Unstable Formula Theorem in the learning setting, with algorithmic arguments taking the place of the infinite
Stability in Online Learning: From Random Perturbations in Bandit Problems to Differential Privacy
Online learning is an area of machine learning that studies algorithms that make sequential predictions on data arriving incrementally. In this thesis, we investigate stability of online learning algorithms in two different settings. First, we examine random perturbation methods as a source of stability in bandit problems. Second, we study stability as a key concept connecting online learning and differential privacy.
The first two chapters study the statistical properties of the perturbation technique in both stochastic and adversarial multi-armed bandit problems. We provide the first general analysis of perturbations for the stochastic multi-armed bandit problem. We also show that the open problem regarding minimax optimal perturbations for adversarial bandits cannot be solved in two ways that might seem very natural.
The next two chapters consider stationary and non-stationary stochastic linear bandits respectively. We develop two randomized exploration strategies: (1) by replacing optimism with a simple randomization when deciding a confidence level in optimism based algorithms, or (2) by directly injecting the random perturbations to current estimates to overcome the conservatism that optimism based algorithms generally suffer from. Furthermore, we study the statistical and computational aspects of both of these strategies.
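As a minimal illustration of strategy (2) above, injecting random perturbations into current estimates, the following sketch (not code from the thesis; all names and parameter choices are illustrative) runs a Gaussian-perturbation strategy on a simulated Bernoulli multi-armed bandit. The noise added to each empirical mean shrinks as the arm is pulled more often, so randomness supplies the exploration that optimism-based confidence bounds would otherwise provide:

```python
import numpy as np

def perturbed_exploration(true_means, horizon, scale=1.0, seed=0):
    """Randomized exploration for a stochastic multi-armed bandit:
    add Gaussian noise (shrinking as 1/sqrt(pulls)) to each empirical
    mean and play the arm with the highest perturbed estimate.
    `true_means` are Bernoulli parameters used only to simulate rewards."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    counts = np.ones(k)                              # pull each arm once to start
    sums = rng.binomial(1, true_means).astype(float)
    for _ in range(horizon - k):
        noise = scale * rng.standard_normal(k) / np.sqrt(counts)
        arm = int(np.argmax(sums / counts + noise))  # perturbed greedy choice
        counts[arm] += 1
        sums[arm] += rng.binomial(1, true_means[arm])
    return counts.astype(int)

pulls = perturbed_exploration([0.2, 0.5, 0.8], horizon=2000)
```

With this setup the arm with the highest mean ends up pulled far more often than the others, while every arm retains some exploration early on.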
While at first glance it may seem that online learning and differential privacy have little in common, there is a strong connection between them via the notion of stability, since the definition of differential privacy is, at its core, a form of stability. The final chapter investigates whether the recently established equivalence between online and private learnability in binary classification extends to multi-class classification and regression.
PhD thesis, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/169709/1/baekjin_1.pd
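To make the stability reading of differential privacy concrete, here is a small self-contained check (illustrative only, not from the thesis) using the classic randomized-response mechanism: a randomized algorithm is ε-DP exactly when its output distributions on any two neighboring inputs agree pointwise within a multiplicative factor of e^ε.

```python
import math

def randomized_response(bit, eps):
    """Output distribution of eps-DP randomized response on one bit:
    report it truthfully with probability e^eps / (1 + e^eps)."""
    p = math.exp(eps) / (1 + math.exp(eps))
    return {bit: p, 1 - bit: 1 - p}

def satisfies_dp(dist0, dist1, eps, tol=1e-9):
    """Pure differential privacy as a stability condition: on every
    output, the two neighboring distributions are within a factor e^eps
    of each other (tol absorbs floating-point rounding)."""
    bound = math.exp(eps) * (1 + tol)
    return all(dist0[o] <= bound * dist1[o] and dist1[o] <= bound * dist0[o]
               for o in dist0)

d0 = randomized_response(0, eps=0.5)  # distributions on the two
d1 = randomized_response(1, eps=0.5)  # neighboring one-bit inputs
```

The mechanism satisfies the stated ε = 0.5 but fails the check for any smaller ε, matching the tightness of randomized response.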
Multiclass versus binary differentially private PAC learning
We show a generic reduction from multiclass differentially private PAC learning to binary private PAC learning. We apply this transformation to a recently proposed binary private PAC learner to obtain a private multiclass learner with sample complexity that has a polynomial dependence on the multiclass Littlestone dimension and a poly-logarithmic dependence on the number of classes. This yields a doubly exponential improvement in the dependence on both parameters over learners from previous work. Our proof extends the notion of -dimension defined in work of Ben-David et al. [5] to the online setting and explores its general properties.
https://proceedings.neurips.cc/paper/2021/file/c1d53b7a97707b5cd1815c8d228d8ef1-Paper.pd
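For intuition about the Littlestone dimension that drives this sample complexity, the binary version can be computed for a small finite class directly from its mistake-tree recursion. This is a brute-force sketch under illustrative conventions (hypotheses represented as tuples of labels over a fixed finite domain), not code from the paper:

```python
def ldim(H, n):
    """Littlestone dimension of a finite binary class, each hypothesis
    a tuple of labels over n fixed domain points.  Recursion:
    Ldim(H) = max over points x of
              1 + min(Ldim({h: h(x)=0}), Ldim({h: h(x)=1})),
    with Ldim = 0 for a single hypothesis and -1 for the empty class."""
    H = set(H)
    if not H:
        return -1
    best = 0
    for i in range(n):
        H0 = [h for h in H if h[i] == 0]
        H1 = [h for h in H if h[i] == 1]
        if H0 and H1:  # point i is splittable: recurse on both branches
            best = max(best, 1 + min(ldim(H0, n), ldim(H1, n)))
    return best

# Thresholds h_t(x) = 1[x >= t] on domain {0,1,2,3}: VC dimension 1,
# but Littlestone dimension 2 (a depth-2 mistake tree can be realized).
thresholds = [tuple(int(x >= t) for x in range(4)) for t in range(5)]
d = ldim(thresholds, 4)
```

The threshold example also shows why online (Littlestone) and statistical (VC) complexities can diverge, which is what makes the polynomial dependence on the Littlestone dimension the meaningful parameter here.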
Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes
In many learning theory problems, a central role is played by a hypothesis class: we might assume that the data is labeled according to a hypothesis in the class (usually referred to as the realizable setting), or we might evaluate the learned model by comparing it with the best hypothesis in the class (the agnostic setting). Taking a step beyond these classic setups that involve only a single hypothesis class, we study a variety of problems that involve two hypothesis classes simultaneously.
We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning: given two binary hypothesis classes S and B, we assume that the data is labeled according to a hypothesis in the source class S and require the learned model to achieve an accuracy comparable to the best hypothesis in the benchmark class B. Even when both S and B have infinite VC dimensions, comparative learning can still have a small sample complexity. We show that the sample complexity of comparative learning is characterized by the mutual VC dimension VC(S,B) which we define to be the maximum size of a subset shattered by both S and B. We also show a similar result in the online setting, where we give a regret characterization in terms of the analogous mutual Littlestone dimension Ldim(S,B). These results also hold for partial hypotheses.
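The mutual VC dimension defined above can be computed by exhaustive search for small finite classes. The sketch below is illustrative only (hypotheses as label tuples over a fixed finite domain; all names are ours, not the paper's):

```python
from itertools import combinations, product

def shatters(H, idxs):
    """Does class H (hypotheses as label tuples over a fixed finite
    domain) realize all 2^|idxs| label patterns on the points idxs?"""
    patterns = {tuple(h[i] for i in idxs) for h in H}
    return len(patterns) == 2 ** len(idxs)

def mutual_vc(S, B, n):
    """Mutual VC dimension VC(S,B): the size of the largest subset of
    the n domain points shattered by both S and B (exhaustive search)."""
    best = 0
    for d in range(1, n + 1):
        if any(shatters(S, idxs) and shatters(B, idxs)
               for idxs in combinations(range(n), d)):
            best = d
    return best

# Thresholds on 4 points have VC dimension 1; the class of all
# functions has VC dimension 4; their mutual VC dimension is 1.
thresholds = [tuple(int(x >= t) for x in range(4)) for t in range(5)]
all_fns = list(product([0, 1], repeat=4))
```

Note that VC(S,B) is at most min(VC(S), VC(B)) and can be much smaller, which is why comparative learning can be cheap even when each class alone is hard.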
We additionally show that the insights necessary to characterize the sample complexity of comparative learning can be applied to other tasks involving two hypothesis classes. In particular, we characterize the sample complexity of realizable multiaccuracy and multicalibration using the mutual fat-shattering dimension, an analogue of the mutual VC dimension for real-valued hypotheses. This not only solves an open problem proposed by Hu, Peale, Reingold (2022), but also leads to independently interesting results extending classic ones about regression, boosting, and covering numbers to our two-hypothesis-class setting.