A theoretical framework for supervised learning from regions
Supervised learning is investigated for the case in which the data are represented not only by labeled points but also by labeled regions of the input space. In the limit case, such
regions degenerate to single points and the proposed approach reduces to the classical learning context. The adopted framework entails the minimization
of a functional obtained by introducing a loss function that involves such regions. An additive regularization term is expressed via differential operators that model
the smoothness properties of the desired input/output relationship. Representer
theorems are given, proving that the optimization problem associated with learning
from labeled regions has a unique solution, which takes on the form of a linear
combination of kernel functions determined by the differential operators together
with the regions themselves. As a relevant situation, the case of regions given
by multi-dimensional intervals (i.e., “boxes”) is investigated, which models prior
knowledge expressed by logical propositions.
Domain adaptation of weighted majority votes via perturbed variation-based self-labeling
In machine learning, the domain adaptation problem arises when the test
(target) and the train (source) data are generated from different
distributions. A key applied issue is thus the design of algorithms able to
generalize on a new distribution, for which we have no label information. We
focus on learning classification models defined as a weighted majority vote
over a set of real-valued functions. In this context, Germain et al. (2013)
have shown that a measure of disagreement between these functions is crucial to
control. The core of this measure is a theoretical bound--the C-bound (Lacasse
et al., 2007)--which involves the disagreement and leads to a well-performing
majority vote learning algorithm in the usual non-adaptive supervised setting:
MinCq. In this work, we propose a framework to extend MinCq to a domain
adaptation scenario. This procedure takes advantage of the recent perturbed
variation divergence between distributions proposed by Harel and Mannor (2012).
Justified by a theoretical bound on the target risk of the vote, we provide
MinCq with a target sample labeled by a perturbed variation-based
self-labeling focused on the regions where the source and target marginals
appear similar. We also study the influence of our self-labeling, from which we
deduce an original process for tuning the hyperparameters. Finally, our
framework, called PV-MinCq, shows very promising results on a synthetic
rotation and translation problem.
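To make the C-bound mentioned in this abstract more concrete, the following is a minimal sketch of its empirical second-moment form, 1 - (E[M])^2 / E[M^2], where M is the margin of the weighted majority vote. It is not MinCq or PV-MinCq. The function names, the data layout (one row of voter outputs per example), and the toy values are assumptions made for illustration only.

```python
from typing import List, Sequence


def vote_margins(votes: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 labels: Sequence[int]) -> List[float]:
    """Margin y * sum_j w_j * h_j(x) of the weighted vote on each example.

    votes[i][j] is the output in [-1, 1] of voter j on example i,
    weights are assumed non-negative and summing to 1,
    labels are in {-1, +1}.
    """
    return [y * sum(w * v for w, v in zip(weights, row))
            for row, y in zip(votes, labels)]


def empirical_c_bound(margins: Sequence[float]) -> float:
    """Empirical C-bound: 1 - (mean margin)^2 / (mean squared margin).

    Only meaningful when the mean margin is strictly positive.
    """
    n = len(margins)
    first_moment = sum(margins) / n
    second_moment = sum(m * m for m in margins) / n
    if first_moment <= 0 or second_moment == 0:
        raise ValueError("C-bound requires a strictly positive mean margin")
    return 1.0 - first_moment ** 2 / second_moment


def majority_vote_risk(margins: Sequence[float]) -> float:
    """Fraction of examples on which the vote is wrong (margin <= 0)."""
    return sum(1 for m in margins if m <= 0) / len(margins)


# Toy data (hypothetical): 4 examples, 3 voters with uniform weights.
toy_votes = [[1, 1, -1], [1, -1, 1], [-1, -1, -1], [1, 1, 1]]
toy_labels = [1, 1, -1, -1]
toy_weights = [1 / 3, 1 / 3, 1 / 3]

toy_margins = vote_margins(toy_votes, toy_weights, toy_labels)
print(majority_vote_risk(toy_margins), empirical_c_bound(toy_margins))
```

On this toy sample the vote errs on one example out of four, and the computed bound stays above that error rate. The sketch only evaluates the quantity on a given sample; a guarantee on unseen data additionally needs the PAC-Bayesian analysis referred to in the abstracts.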
Domain Adaptation of Majority Votes via Perturbed Variation-based Label Transfer
We tackle the PAC-Bayesian Domain Adaptation (DA) problem. This problem arises when
one desires to learn, from a source distribution, a good weighted majority vote
(over a set of classifiers) on a different target distribution. In this
context, the disagreement between classifiers is known to be crucial to control. In
the non-DA supervised setting, a theoretical bound - the C-bound - involves this
disagreement and leads to a majority vote learning algorithm: MinCq. In this
work, we extend MinCq to DA by taking advantage of an elegant divergence
between distributions called the Perturbed Variation (PV). Firstly, justified by
a new formulation of the C-bound, we provide MinCq with a target sample labeled
by a PV-based self-labeling focused on regions where the source and
target marginal distributions are close. Secondly, we propose an original
process for tuning the hyperparameters. Our framework shows very promising
results on a toy problem.