926 research outputs found
On Margins and Derandomisation in PAC-Bayes
International audienceWe give a general recipe for derandomising PAC-Bayesian bounds using margins, with the critical ingredient being that our randomised predictions concentrate around some value. The tools we develop traightforwardly lead to margin bounds for various classifiers, including linear prediction—a class that includes boosting and the support vector machine—single-hidden-layer neural networks with an unusual erf activation function, and deep ReLU networks. Further, we extend to partially-derandomised predictors where only some of the randomness is removed, letting us extend bounds to cases where the concentration properties of our predictors are otherwise poor
A New PAC-Bayesian Perspective on Domain Adaptation
We study the issue of PAC-Bayesian domain adaptation: We want to learn, from
a source domain, a majority vote model dedicated to a target one. Our
theoretical contribution brings a new perspective by deriving an upper-bound on
the target risk where the distributions' divergence---expressed as a
ratio---controls the trade-off between a source error measure and the target
voters' disagreement. Our bound suggests that one has to focus on regions where
the source data is informative.From this result, we derive a PAC-Bayesian
generalization bound, and specialize it to linear classifiers. Then, we infer a
learning algorithmand perform experiments on real data.Comment: Published at ICML 201
PAC-Bayesian Analysis of the Exploration-Exploitation Trade-off
We develop a coherent framework for integrative simultaneous analysis of the
exploration-exploitation and model order selection trade-offs. We improve over
our preceding results on the same subject (Seldin et al., 2011) by combining
PAC-Bayesian analysis with Bernstein-type inequality for martingales. Such a
combination is also of independent interest for studies of multiple
simultaneously evolving martingales.Comment: On-line Trading of Exploration and Exploitation 2 - ICML-2011
workshop. http://explo.cs.ucl.ac.uk/workshop
PAC-Bayesian Theory Meets Bayesian Inference
We exhibit a strong link between frequentist PAC-Bayesian risk bounds and the
Bayesian marginal likelihood. That is, for the negative log-likelihood loss
function, we show that the minimization of PAC-Bayesian generalization risk
bounds maximizes the Bayesian marginal likelihood. This provides an alternative
explanation to the Bayesian Occam's razor criteria, under the assumption that
the data is generated by an i.i.d distribution. Moreover, as the negative
log-likelihood is an unbounded loss function, we motivate and propose a
PAC-Bayesian theorem tailored for the sub-gamma loss family, and we show that
our approach is sound on classical Bayesian linear regression tasks.Comment: Published at NIPS 2015
(http://papers.nips.cc/paper/6569-pac-bayesian-theory-meets-bayesian-inference
PAC-Bayes Analysis of Multi-view Learning
This paper presents eight PAC-Bayes bounds to analyze the generalization
performance of multi-view classifiers. These bounds adopt data dependent
Gaussian priors which emphasize classifiers with high view agreements. The
center of the prior for the first two bounds is the origin, while the center of
the prior for the third and fourth bounds is given by a data dependent vector.
An important technique to obtain these bounds is two derived logarithmic
determinant inequalities whose difference lies in whether the dimensionality of
data is involved. The centers of the fifth and sixth bounds are calculated on a
separate subset of the training set. The last two bounds use unlabeled data to
represent view agreements and are thus applicable to semi-supervised multi-view
learning. We evaluate all the presented multi-view PAC-Bayes bounds on
benchmark data and compare them with previous single-view PAC-Bayes bounds. The
usefulness and performance of the multi-view bounds are discussed.Comment: 35 page
Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary -Mixing Processes
Pac-Bayes bounds are among the most accurate generalization bounds for
classifiers learned from independently and identically distributed (IID) data,
and it is particularly so for margin classifiers: there have been recent
contributions showing how practical these bounds can be either to perform model
selection (Ambroladze et al., 2007) or even to directly guide the learning of
linear classifiers (Germain et al., 2009). However, there are many practical
situations where the training data show some dependencies and where the
traditional IID assumption does not hold. Stating generalization bounds for
such frameworks is therefore of the utmost interest, both from theoretical and
practical standpoints. In this work, we propose the first - to the best of our
knowledge - Pac-Bayes generalization bounds for classifiers trained on data
exhibiting interdependencies. The approach undertaken to establish our results
is based on the decomposition of a so-called dependency graph that encodes the
dependencies within the data, in sets of independent data, thanks to graph
fractional covers. Our bounds are very general, since being able to find an
upper bound on the fractional chromatic number of the dependency graph is
sufficient to get new Pac-Bayes bounds for specific settings. We show how our
results can be used to derive bounds for ranking statistics (such as Auc) and
classifiers trained on data distributed according to a stationary {\ss}-mixing
process. In the way, we show how our approach seemlessly allows us to deal with
U-processes. As a side note, we also provide a Pac-Bayes generalization bound
for classifiers learned on data from stationary -mixing distributions.Comment: Long version of the AISTATS 09 paper:
http://jmlr.csail.mit.edu/proceedings/papers/v5/ralaivola09a/ralaivola09a.pd
- …