Generalization Bounds for Representative Domain Adaptation
In this paper, we propose a novel framework to analyze the theoretical
properties of the learning process for a representative type of domain
adaptation that combines data from multiple sources and one target
(referred to as representative domain adaptation). In particular, we use the
integral probability metric to measure the difference between the distributions
of two domains and compare it with the H-divergence and the discrepancy
distance. We develop Hoeffding-type, Bennett-type, and McDiarmid-type
deviation inequalities for multiple domains, and then present the
symmetrization inequality for representative domain adaptation. Next, we use
the derived inequalities to obtain Hoeffding-type and Bennett-type
generalization bounds, both of which are based on the uniform entropy number.
Moreover, we present generalization bounds based on the Rademacher complexity.
Finally, we analyze the asymptotic convergence and the rate of convergence of
the learning process for representative domain adaptation. We discuss the
factors that affect the asymptotic behavior of the learning process, and
numerical experiments support our theoretical findings. We also compare our
results with existing results on domain adaptation and with classical results
under the same-distribution assumption.
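For reference, the integral probability metric invoked above has a standard
definition that is not specific to this paper: for a function class
$\mathcal{F}$ and distributions $P$ and $Q$,
\[ D_{\mathcal{F}}(P, Q) = \sup_{f \in \mathcal{F}} \big| \mathbb{E}_{x \sim P}[f(x)] - \mathbb{E}_{x \sim Q}[f(x)] \big|, \]
and particular choices of $\mathcal{F}$ recover familiar distances (for
example, the class of 1-Lipschitz functions yields the Wasserstein-1
distance).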
Distribution-Based Categorization of Classifier Transfer Learning
Transfer Learning (TL) aims to transfer knowledge acquired in one problem,
the source problem, onto another problem, the target problem, dispensing with
the bottom-up construction of the target model. Due to its relevance, TL has
gained significant interest in the Machine Learning community, since it paves
the way to devise intelligent learning models that can easily be tailored to
many different applications. As is natural in a fast-evolving area, a wide
variety of TL methods, settings, and nomenclature have been proposed so far,
and many works report different names for the same concepts. This mixture of
concepts and terminology obscures the TL field and hinders its proper
consideration. In this paper we present a review of the literature covering
the majority of classification TL methods, together with a distribution-based
categorization of TL under a common nomenclature suitable for classification
problems. Under this perspective, three main TL categories are presented,
discussed, and illustrated with examples.
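As a rough sketch of the distribution-based view (a common formalization from
the TL literature, not spelled out in this abstract): write a domain as a
feature space with a marginal distribution, $\mathcal{D} = (\mathcal{X}, P(X))$,
and a task as a label space with a conditional distribution,
$\mathcal{T} = (\mathcal{Y}, P(Y \mid X))$. Transfer learning then uses
knowledge from a source pair $(\mathcal{D}_S, \mathcal{T}_S)$ to improve
learning on a target pair $(\mathcal{D}_T, \mathcal{T}_T)$ with
$\mathcal{D}_S \neq \mathcal{D}_T$ or $\mathcal{T}_S \neq \mathcal{T}_T$, and a
distribution-based categorization distinguishes settings by which of $P(X)$
and $P(Y \mid X)$ differ between source and target.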
Bounded-Distortion Metric Learning
Metric learning aims to embed one metric space into another to benefit tasks
like classification and clustering. Although a greatly distorted metric space
has a high degree of freedom to fit training data, it is prone to overfitting
and numerical inaccuracy. This paper presents {\it bounded-distortion metric
learning} (BDML), a new metric learning framework which amounts to finding an
optimal Mahalanobis metric space with a bounded-distortion constraint. An
efficient solver based on the multiplicative weights update method is proposed.
Moreover, we generalize BDML to pseudo-metric learning and devise a
semidefinite relaxation and a randomized algorithm to approximately solve it.
We further provide theoretical analysis showing that distortion is a key
ingredient in the stability and generalization ability of our BDML algorithm.
Extensive experiments on several benchmark datasets yield promising results.
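To make the bounded-distortion constraint concrete, here is a minimal sketch
(our own illustration in Python, not the authors' multiplicative-weights
solver): it runs projected gradient descent on a toy pairwise loss over a
Mahalanobis matrix M and, after every step, clips the eigenvalues of M into
[1, c^2], which caps the distortion of the induced embedding at c. The loss,
step size, and eigenvalue interval are illustrative assumptions.

import numpy as np

def project_bounded_distortion(M, c):
    # Project symmetric M so that its eigenvalues lie in [1, c**2].
    # The induced map x -> M^{1/2} x then distorts Euclidean distances
    # by at most a factor of c.
    eigvals, eigvecs = np.linalg.eigh(M)
    eigvals = np.clip(eigvals, 1.0, c ** 2)
    return eigvecs @ np.diag(eigvals) @ eigvecs.T

def bdml_sketch(X, pairs, labels, c=3.0, lr=0.01, steps=200):
    # X: (n, d) data; pairs: list of (i, j); labels: +1 similar, -1 dissimilar.
    d = X.shape[1]
    M = np.eye(d)
    for _ in range(steps):
        grad = np.zeros((d, d))
        for (i, j), y in zip(pairs, labels):
            diff = (X[i] - X[j])[:, None]
            # The squared Mahalanobis distance is diff.T @ M @ diff, whose
            # gradient in M is diff @ diff.T; the sign of y pulls similar
            # pairs together and pushes dissimilar pairs apart.
            grad += y * (diff @ diff.T)
        M = project_bounded_distortion(M - lr * grad, c)
    return M

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
M = bdml_sketch(X, pairs=[(0, 1), (2, 3), (4, 5)], labels=[1, 1, -1])
print(np.linalg.cond(M))  # at most c**2 = 9 by construction

The eigenvalue clipping is one simple way to enforce a distortion bound on a
Mahalanobis metric; the paper's actual algorithm instead treats the bound as a
constraint handled by the multiplicative weights update method.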