PAC-Bayes and Domain Adaptation
We provide two main contributions in PAC-Bayesian theory for domain
adaptation where the objective is to learn, from a source distribution, a
well-performing majority vote on a different, but related, target distribution.
Firstly, we improve the approach we previously proposed in Germain et al.
(2013), which relies on a novel distribution pseudodistance based on a
disagreement averaging, allowing us to derive a new, tighter domain
adaptation bound for the target risk. While this bound stands in the spirit
of common domain adaptation works, we derive a second bound (introduced in
Germain et al., 2016) that brings a new perspective on domain adaptation: an
upper bound on the target risk in which the distributions' divergence,
expressed as a ratio, controls the trade-off between a source error measure
and the target voters' disagreement. We discuss and compare both results,
from which we obtain PAC-Bayesian generalization bounds. Furthermore, from
the PAC-Bayesian specialization to linear classifiers, we infer two learning
algorithms, and we evaluate them on real data.
Comment: Neurocomputing, Elsevier, 2019. arXiv admin note: substantial text
overlap with arXiv:1503.0694
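As a rough numerical illustration of the disagreement averaging behind such a pseudodistance, the rho-weighted disagreement of a finite set of +/-1 voters can be estimated on a sample from each domain and the two averages compared. Everything below (the linear voters W, the weights rho, the Gaussian samples) is hypothetical; this is a minimal sketch of the quantity involved, not the papers' actual bound.

```python
import numpy as np

def rho_disagreement(votes, rho):
    """Expected pairwise disagreement E_{h,h'~rho} E_x 1[h(x) != h'(x)]
    for +/-1 voters; votes has shape (n_voters, n_samples)."""
    margin = rho @ votes                 # E_{h~rho} h(x) for each sample x
    # For +/-1 outputs, 1[h != h'] = (1 - h h') / 2, and
    # E_{h,h'~rho} E_x h(x) h'(x) equals the mean of margin(x)^2.
    return float((1.0 - np.mean(margin ** 2)) / 2.0)

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 2))             # 10 hypothetical linear voters
rho = np.full(10, 0.1)                   # uniform weights over the voters

Xs = rng.normal(0.0, 1.0, size=(500, 2))  # toy source sample
Xt = rng.normal(0.7, 1.0, size=(500, 2))  # toy target sample, shifted marginal

dis_s = rho_disagreement(np.sign(Xs @ W.T).T, rho)
dis_t = rho_disagreement(np.sign(Xt @ W.T).T, rho)

# a disagreement-based pseudodistance compares the two domain averages
print(abs(dis_s - dis_t))
```

For +/-1 voters the disagreement always lies in [0, 1/2], so the gap between the two averages is itself bounded, which is what makes it usable inside a risk bound.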
A New PAC-Bayesian Perspective on Domain Adaptation
We study the issue of PAC-Bayesian domain adaptation: We want to learn, from
a source domain, a majority vote model dedicated to a target one. Our
theoretical contribution brings a new perspective by deriving an upper-bound on
the target risk where the distributions' divergence---expressed as a
ratio---controls the trade-off between a source error measure and the target
voters' disagreement. Our bound suggests that one has to focus on regions where
the source data is informative. From this result, we derive a PAC-Bayesian
generalization bound, and specialize it to linear classifiers. Then, we infer
a learning algorithm and perform experiments on real data.
Comment: Published at ICML 2016
An Improvement to the Domain Adaptation Bound in a PAC-Bayesian context
This paper provides a theoretical analysis of domain adaptation based on the
PAC-Bayesian theory. We propose an improvement of the previous domain
adaptation bound obtained by Germain et al. in two ways. We first give
another generalization bound that is tighter and easier to interpret.
Moreover, we provide a new analysis of the constant term appearing in the
bound, which can be of high interest for developing new algorithmic
solutions.
Comment: NIPS 2014 Workshop on Transfer and Multi-task Learning: Theory Meets
Practice, Dec 2014, Montréal, Canada
PAC-Bayesian Learning and Domain Adaptation
In machine learning, Domain Adaptation (DA) arises when the distribution
generating the test (target) data differs from the one generating the
learning (source) data. It is well known that DA is a hard task even under
strong assumptions, among which covariate shift, where the source and target
distributions diverge only in their marginals, i.e., they have the same
labeling function. Another popular approach is to consider a hypothesis
class that brings the two distributions closer while implying a low error
for both tasks. This is a VC-dimension approach that restricts the
complexity of a hypothesis class in order to get good generalization.
Instead, we propose a PAC-Bayesian approach that seeks suitable weights to
be given to each hypothesis in order to build a majority vote. We prove a
new DA bound in the PAC-Bayesian context. This leads us to design the first
DA-PAC-Bayesian algorithm based on the minimization of the proposed bound.
In doing so, we seek a \rho-weighted majority vote that takes into account a
trade-off between three quantities. The first two quantities are, as usual
in the PAC-Bayesian approach, (a) the complexity of the majority vote
(measured by a Kullback-Leibler divergence) and (b) its empirical risk
(measured by the \rho-average error on the source sample). The third
quantity is (c) the capacity of the majority vote to distinguish some
structural difference between the source and target samples.
Comment: https://sites.google.com/site/multitradeoffs2012
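For a finite voter set, quantities (a) and (b) have direct empirical estimates; quantity (c), a divergence between the source and target samples, is specific to the paper's bound and not reproduced here. The voters, labels, prior pi, and posterior rho below are hypothetical toy values, a minimal sketch assuming +/-1 voters.

```python
import numpy as np

def kl_categorical(rho, pi):
    """(a) complexity: KL(rho || pi) between posterior and prior weight
    vectors over a finite set of voters."""
    mask = rho > 0
    return float(np.sum(rho[mask] * np.log(rho[mask] / pi[mask])))

def gibbs_risk(votes, y, rho):
    """(b) empirical risk: the rho-average of the voters' errors on the
    labeled source sample; votes is (n_voters, n_samples) in {-1, +1}."""
    per_voter_error = (votes != y).mean(axis=1)
    return float(rho @ per_voter_error)

# hypothetical toy setup: 5 voters, 100 labeled source points
rng = np.random.default_rng(1)
votes = rng.choice([-1, 1], size=(5, 100))
y = rng.choice([-1, 1], size=100)
pi = np.full(5, 0.2)                        # uniform prior
rho = np.array([0.4, 0.3, 0.1, 0.1, 0.1])   # a learned posterior

print(kl_categorical(rho, pi), gibbs_risk(votes, y, rho))
```

A bound-minimization algorithm of the kind described would adjust rho to trade the KL term against the rho-average source error (and the divergence term (c)).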
PAC-Bayesian Domain Adaptation Bounds for Multi-view learning
This paper presents a series of new results for domain adaptation in the
multi-view learning setting. The incorporation of multiple views into
domain adaptation has received little attention in previous studies. To
address this, we propose an analysis of generalization bounds with
PAC-Bayesian theory to consolidate the two paradigms, which are currently
treated separately. Firstly, building on previous work by Germain et al.,
we adapt the distribution distance they proposed for domain adaptation to
the concept of multi-view learning. Thus, we introduce a novel distance that
is tailored to the multi-view domain adaptation setting. Then, we give
PAC-Bayesian bounds for estimating the introduced divergence. Finally, we
compare the new bounds with those of previous studies.
Comment: arXiv admin note: text overlap with arXiv:2004.11829 by other
authors
A New PAC-Bayesian View of Domain Adaptation
We propose a new theoretical study of domain adaptation for majority vote classifiers (from a source to a target domain). We upper bound the target risk by a trade-off between only two terms: the voters' joint errors on the source domain, and the voters' disagreement on the target one. Hence, this new study is simpler than other analyses that usually rely on three terms. We also derive a PAC-Bayesian generalization bound leading to a DA algorithm for linear classifiers.
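For +/-1 voters with a finite-support posterior, both terms of such a two-term trade-off admit closed-form empirical estimates: the joint error needs source labels, while the disagreement is computed from unlabeled target points only. The linear voters and Gaussian samples below are hypothetical; a minimal sketch of the two quantities, not the paper's algorithm.

```python
import numpy as np

def joint_error(votes, y, rho):
    """Source term: E_{h,h'~rho} E_{(x,y)} 1[h(x)!=y] 1[h'(x)!=y].
    For +/-1 voters this equals E_x ( E_{h~rho} 1[h(x)!=y] )^2."""
    p_err = (1.0 - y * (rho @ votes)) / 2.0   # per-point Gibbs error
    return float(np.mean(p_err ** 2))

def disagreement(votes, rho):
    """Target term: E_{h,h'~rho} E_x 1[h(x) != h'(x)], label-free."""
    margin = rho @ votes
    return float((1.0 - np.mean(margin ** 2)) / 2.0)

# hypothetical data: labeled source, unlabeled target, 8 linear voters
rng = np.random.default_rng(2)
W = rng.normal(size=(8, 3))
rho = np.full(8, 1 / 8)
Xs = rng.normal(size=(300, 3))
ys = np.sign(Xs[:, 0])                        # toy labeling function
Xt = rng.normal(0.3, 1.0, size=(300, 3))      # shifted target marginal

je = joint_error(np.sign(Xs @ W.T).T, ys, rho)
dt = disagreement(np.sign(Xt @ W.T).T, rho)
print(je, dt)
```

That the target term requires no labels is what makes this two-term formulation directly usable in the unsupervised DA setting the abstract describes.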
A PAC-Bayesian Approach for Domain Adaptation with Specialization to Linear Classifiers
We provide a first PAC-Bayesian analysis for domain adaptation (DA), which arises when the learning and test distributions differ. It relies on a novel distribution pseudodistance based on a disagreement averaging. Using this measure, we derive a PAC-Bayesian DA bound for the stochastic Gibbs classifier. This bound has the advantage of being directly optimizable for any hypothesis space. We specialize it to linear classifiers, and design a learning algorithm which shows interesting results on a synthetic problem and on a popular sentiment annotation task. This opens the door to tackling DA tasks by making use of all the PAC-Bayesian tools.
Theoretic Analysis and Extremely Easy Algorithms for Domain Adaptive Feature Learning
Domain adaptation problems arise in a variety of applications, where a
training dataset from the \textit{source} domain and a test dataset from the
\textit{target} domain typically follow different distributions. The primary
difficulty in designing effective learning models to solve such problems lies
in how to bridge the gap between the source and target distributions. In this
paper, we provide a comprehensive analysis of feature learning algorithms
used in conjunction with linear classifiers for domain adaptation. Our
analysis shows
that in order to achieve good adaptation performance, the second moments of the
source domain distribution and target domain distribution should be similar.
Based on our new analysis, a novel extremely easy feature learning algorithm
for domain adaptation is proposed. Furthermore, our algorithm is extended by
leveraging multiple layers, leading to a deep linear model. We evaluate the
effectiveness of the proposed algorithms in terms of domain adaptation tasks on
the Amazon review dataset and the spam dataset from the ECML/PKDD 2006
discovery challenge.
Comment: ijca
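The claim that the source and target second moments should be similar suggests a simple illustrative transform: whiten the source features with their own covariance, then re-color them with the target covariance. This is a generic moment-matching sketch under the assumption of full-rank features; the data and function name are hypothetical, and this is not the algorithm proposed in the paper.

```python
import numpy as np

def match_second_moments(Xs, Xt, eps=1e-6):
    """Map source features so their covariance matches the target's:
    whiten with the source covariance, re-color with the target's.
    The source mean is preserved; only second moments are aligned."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])

    def sqrtm(C, inv=False):
        # symmetric matrix power via eigendecomposition (C is SPD here)
        w, V = np.linalg.eigh(C)
        return (V * w ** (-0.5 if inv else 0.5)) @ V.T

    mu = Xs.mean(axis=0)
    return (Xs - mu) @ sqrtm(Cs, inv=True) @ sqrtm(Ct) + mu

# hypothetical toy data with different covariance structures
rng = np.random.default_rng(3)
Xs = rng.normal(size=(400, 3))
Xt = rng.normal(size=(400, 3)) @ np.diag([2.0, 1.0, 0.5])
Xs_adapted = match_second_moments(Xs, Xt)
```

After the transform, a linear classifier trained on Xs_adapted sees features whose second-order statistics match the target domain, which is the condition the abstract's analysis identifies as necessary for good adaptation.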
PAC-Bayesian Bounds on Rate-Efficient Classifiers
We derive analytic bounds on the noise invariance of majority vote
classifiers operating on compressed inputs. Specifically, starting from
recent bounds on the true risk of majority vote classifiers, we extend the
applicability of PAC-Bayesian theory to quantify the resilience of majority
votes to input noise stemming from compression. The derived bounds are
intuitive in binary classification settings, where they can be measured as
expressions of voter differentials and voter pair agreement. By combining
measures of input distortion with analytic guarantees on noise invariance,
we prescribe rate-efficient machines to compress inputs without affecting
subsequent classification. Our validation shows how bounding noise
invariance can inform the compression stage for any majority vote classifier
such that worst-case implications of bad input reconstructions are known,
and inputs can be compressed to the minimum amount of information needed
prior to inference.
A survey on domain adaptation theory: learning bounds and theoretical guarantees
Most well-known machine learning algorithms, comprising both supervised and
semi-supervised methods, work well only under a common assumption: the
training and test data follow the same distribution. When the distribution
changes, most
statistical models must be reconstructed from newly collected data, which for
some applications can be costly or impossible to obtain. Therefore, it has
become necessary to develop approaches that reduce the need and the effort to
obtain new labeled samples by exploiting data that are available in related
areas, and using these further across similar fields. This has given rise to a
new machine learning framework known as transfer learning: a learning setting
inspired by the capability of a human being to extrapolate knowledge across
tasks to learn more efficiently. Despite the large number of different
transfer learning scenarios, the main objective of this survey is to provide
an overview
of the state-of-the-art theoretical results in a specific, and arguably the
most popular, sub-field of transfer learning, called domain adaptation. In this
sub-field, the data distribution is assumed to change across the training and
the test data, while the learning task remains the same. We provide a first
up-to-date description of existing results related to the domain adaptation
problem
that cover learning bounds based on different statistical learning frameworks
- …