PAC-Bayes Analysis of Multi-view Learning
This paper presents eight PAC-Bayes bounds to analyze the generalization
performance of multi-view classifiers. These bounds adopt data-dependent
Gaussian priors that emphasize classifiers with high view agreement. The
center of the prior for the first two bounds is the origin, while the center of
the prior for the third and fourth bounds is given by a data-dependent vector.
A key technical ingredient in deriving these bounds is a pair of logarithmic
determinant inequalities, which differ in whether the dimensionality of the
data is involved. The centers of the fifth and sixth bounds are calculated on a
separate subset of the training set. The last two bounds use unlabeled data to
represent view agreements and are thus applicable to semi-supervised multi-view
learning. We evaluate all the presented multi-view PAC-Bayes bounds on
benchmark data and compare them with previous single-view PAC-Bayes bounds. The
usefulness and performance of the multi-view bounds are discussed. Comment: 35 pages.
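For orientation, a standard single-view PAC-Bayes bound (McAllester's bound in Maurer's form) has the following shape; the multi-view bounds above keep this overall form but build view agreement into the prior P. This generic statement is given here only as background and is not one of the paper's eight bounds.

\[
  \Pr\Big[\,\forall Q:\;
    \mathbb{E}_{h\sim Q}[R(h)] \le \mathbb{E}_{h\sim Q}[\hat{R}(h)]
    + \sqrt{\frac{\mathrm{KL}(Q\,\|\,P) + \ln\frac{2\sqrt{m}}{\delta}}{2m}}
  \,\Big] \ge 1-\delta,
\]

where P is a prior over classifiers fixed before seeing the m training examples, Q is any posterior chosen after seeing them, R and \hat{R} are the true and empirical risks of the randomized (Gibbs) classifier, and δ ∈ (0, 1).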
Maximum Margin Multiclass Nearest Neighbors
We develop a general framework for margin-based multicategory classification
in metric spaces. The basic work-horse is a margin-regularized version of the
nearest-neighbor classifier. We prove generalization bounds that match the
state of the art in sample size n and significantly improve the dependence on
the number of classes k. Our point of departure is a nearly Bayes-optimal
finite-sample risk bound independent of k. Although k-free, this bound is
unregularized and non-adaptive, which motivates our main result: Rademacher and
scale-sensitive margin bounds with a logarithmic dependence on k. As the best
previous risk estimates in this setting were of order √k, our bound is
exponentially sharper. From the algorithmic standpoint, in doubling metric
spaces our classifier may be trained on n examples and evaluated on new points
efficiently.
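As a rough illustration of the kind of classifier involved, the following sketch builds a compressed multiclass nearest-neighbor rule from a greedy net at scale gamma; the function names and the greedy construction are illustrative choices, not the paper's algorithm.

    import numpy as np

    def gamma_net(X, y, gamma):
        # Greedy gamma-net: keep a training point only if it lies farther than
        # gamma from every prototype kept so far.  Training points absorbed by
        # a prototype of a different label play the role of margin errors.
        kept = []
        for i in range(len(X)):
            if all(np.linalg.norm(X[i] - X[j]) > gamma for j in kept):
                kept.append(i)
        return X[kept], y[kept]

    def nn_predict(prototypes, labels, x):
        # Multiclass 1-nearest-neighbor prediction over the compressed set.
        return labels[np.argmin(np.linalg.norm(prototypes - x, axis=1))]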
Active Nearest-Neighbor Learning in Metric Spaces
We propose a pool-based non-parametric active learning algorithm for general
metric spaces, called MArgin Regularized Metric Active Nearest Neighbor
(MARMANN), which outputs a nearest-neighbor classifier. We give prediction
error guarantees that depend on the noisy-margin properties of the input
sample, and are competitive with those obtained by previously proposed passive
learners. We prove that the label complexity of MARMANN is significantly lower
than that of any passive learner with similar error guarantees. MARMANN is
based on a generalized sample compression scheme, and a new label-efficient
active model-selection procedure.
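The abstract does not spell out the query-selection rule; purely as an illustration of pool-based selection in a metric space, a farthest-first covering heuristic (an assumed stand-in, not MARMANN's actual procedure) can be written as:

    import numpy as np

    def next_query(pool, labeled):
        # Farthest-first selection: among the unlabeled pool, query the point
        # whose distance to its nearest already-labeled point is largest.
        dists = np.linalg.norm(pool[:, None, :] - labeled[None, :, :], axis=2)
        return int(np.argmax(dists.min(axis=1)))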
A New PAC-Bayesian Perspective on Domain Adaptation
We study the issue of PAC-Bayesian domain adaptation: We want to learn, from
a source domain, a majority vote model dedicated to a target one. Our
theoretical contribution brings a new perspective by deriving an upper-bound on
the target risk where the distributions' divergence---expressed as a
ratio---controls the trade-off between a source error measure and the target
voters' disagreement. Our bound suggests that one has to focus on regions where
the source data is informative. From this result, we derive a PAC-Bayesian
generalization bound, and specialize it to linear classifiers. Then, we infer a
learning algorithm and perform experiments on real data. Comment: Published at ICML 2016.
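Schematically, and with symbols introduced here rather than taken from the paper, a bound of the kind the abstract describes reads

\[
  R_T(G_Q) \;\lesssim\; \beta(T\,\|\,S)\cdot e_S(Q) \;+\; \tfrac{1}{2}\,d_T(Q) \;+\; \eta,
\]

where β(T‖S) is a ratio-based divergence between the target and source distributions, e_S(Q) is a source error measure, d_T(Q) is the expected disagreement on the target domain between pairs of voters drawn from the posterior Q, and η covers the regions where the source is uninformative. This is a schematic reading of the abstract, not the paper's exact statement.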
PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
While there has been progress in developing non-vacuous generalization bounds
for deep neural networks, these bounds tend to be uninformative about why deep
learning works. In this paper, we develop a compression approach based on
quantizing neural network parameters in a linear subspace, profoundly improving
on previous results to provide state-of-the-art generalization bounds on a
variety of tasks, including transfer learning. We use these tight bounds to
better understand the role of model size, equivariance, and the implicit biases
of optimization, for generalization in deep learning. Notably, we find large
models can be compressed to a much greater extent than previously known,
encapsulating Occam's razor. We also argue for data-independent bounds in
explaining generalization. Comment: NeurIPS 2022. Code is available at
https://github.com/activatedgeek/tight-pac-baye
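To make the compression idea concrete, here is a minimal sketch of the general recipe (quantized coordinates in a fixed linear subspace); the random projection and the uniform grid are placeholder choices, not the authors' implementation.

    import numpy as np

    def compress_in_subspace(theta, dim=256, levels=16, seed=0):
        # Represent the flat parameter vector theta by quantized coordinates in
        # a low-dimensional linear subspace.  Only the quantized coordinates
        # (plus a small codebook) then need to be encoded when computing an
        # Occam/PAC-Bayes compression bound.
        rng = np.random.default_rng(seed)
        basis = rng.standard_normal((theta.size, dim)) / np.sqrt(theta.size)
        coords = basis.T @ theta                    # coordinates in the subspace
        grid = np.linspace(coords.min(), coords.max(), levels)
        quantized = grid[np.abs(coords[:, None] - grid[None, :]).argmin(axis=1)]
        return basis @ quantized                    # compressed reconstruction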