Metric Learning for Graph-based Domain Adaptation
In many domain adaptation formulations, it is assumed that a large amount of unlabeled data is available from the domain of interest (the target domain), some portion of which may be labeled, along with a large amount of labeled data from other domains, also known as source domain(s). Motivated by the fact that labeled data is hard to obtain in any domain, we design algorithms for settings in which there is a large amount of unlabeled data from all domains, a small portion of which may be labeled. We build on recent advances in graph-based semi-supervised learning and supervised metric learning. Given all instances, labeled and unlabeled, from all domains, we build a large similarity graph over them, where an edge exists between two instances if they are close according to some metric. Instead of using a predefined metric, as is commonly done, we feed the labeled instances into metric-learning algorithms and (re)construct a data-dependent metric, which is then used to construct the graph. We employ different types of edges depending on the domain identity of the two vertices each edge connects, and learn the weights of each edge. Experimental results show that our approach leads to a significant reduction in classification error across domains, and performs better than two state-of-the-art models on the task of sentiment classification.
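The graph construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mahalanobis_sq` and `knn_graph`, the k-nearest-neighbour rule, and the fixed edge-type weights `w_within` and `w_cross` are all assumptions made for illustration (the abstract states that edge weights are learned, and does not say how). The metric matrix `M` stands in for the output of whatever metric-learning algorithm is used and is assumed to be symmetric positive semidefinite.

```python
import numpy as np

def mahalanobis_sq(X, M):
    """Pairwise squared distances (x_i - x_j)^T M (x_i - x_j), for symmetric M."""
    XM = X @ M
    sq = np.einsum("ij,ij->i", XM, X)            # x_i^T M x_i for each row
    D = sq[:, None] + sq[None, :] - 2.0 * (XM @ X.T)
    return np.maximum(D, 0.0)                    # clip tiny negative round-off

def knn_graph(X, M, domains, k=2, w_within=1.0, w_cross=0.5):
    """Symmetric k-NN graph under metric M with one weight per edge type.

    w_within / w_cross are hypothetical fixed values; in the abstract the
    weights of the different edge types are learned rather than fixed.
    """
    D = mahalanobis_sq(X, M)
    np.fill_diagonal(D, np.inf)                  # exclude self-loops
    n = X.shape[0]
    W = np.zeros((n, n))
    neighbours = np.argsort(D, axis=1)[:, :k]    # k closest points per row
    for i in range(n):
        for j in neighbours[i]:
            same_domain = domains[i] == domains[j]
            W[i, j] = W[j, i] = w_within if same_domain else w_cross
    return W

# Toy example: four 2-D instances from two domains, identity metric.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])
domains = [0, 1, 0, 1]
W = knn_graph(X, np.eye(2), domains, k=1)
```

Replacing `np.eye(2)` with a learned matrix changes which instances count as neighbours, which is the role the data-dependent metric plays in the abstract.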
Domain Adaptation on Graphs by Learning Aligned Graph Bases
A common assumption in semi-supervised learning with graph models is that the
class label function varies smoothly on the data graph, resulting in the rather
strict prior that the label function has low-frequency content. Meanwhile, in
many classification problems, the label function may vary abruptly in certain
graph regions, resulting in high-frequency components. Although the
semi-supervised estimation of class labels is an ill-posed problem in general,
in several applications it is possible to find a source graph on which the
label function has similar frequency content to that on the target graph where
the actual classification problem is defined. In this paper, we propose a
method for domain adaptation on graphs motivated by these observations. Our
algorithm is based on learning the spectrum of the label function in a source
graph with many labeled nodes, and transferring the information of the spectrum
to the target graph with fewer labeled nodes. While the frequency content of
the class label function can be identified through the graph Fourier transform,
it is not easy to transfer the Fourier coefficients directly between the two
graphs, since no one-to-one match exists between the Fourier basis vectors of
independently constructed graphs in the domain adaptation setting. We solve
this problem by learning a transformation between the Fourier bases of the two
graphs that flexibly "aligns" them. The unknown class label function on the
target graph is then reconstructed such that its spectrum matches that on the
source graph while also ensuring the consistency with the available labels. The
proposed method is tested in the classification of image, online product
review, and social network data sets. Comparative experiments suggest that the
proposed algorithm performs better than recent domain adaptation methods in the
literature in most settings.
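The notion of the "frequency content" of a label function can be made concrete with a small sketch of the graph Fourier transform. It assumes the common convention of using the eigenvectors of the combinatorial Laplacian L = D - W as the Fourier basis; the abstract does not state which Laplacian the paper uses, and the function names here are hypothetical. The sketch covers only the transform itself, not the learned alignment between the source and target bases.

```python
import numpy as np

def graph_fourier_basis(W):
    """Eigen-decomposition of the combinatorial Laplacian L = D - W.

    Returns eigenvalues (graph frequencies, ascending) and the matrix U
    whose columns are the corresponding Fourier basis vectors.
    """
    L = np.diag(W.sum(axis=1)) - W
    eigvals, U = np.linalg.eigh(L)   # eigh: L is symmetric
    return eigvals, U

def gft(U, f):
    """Graph Fourier transform: project the signal f onto the basis."""
    return U.T @ f

def igft(U, coeffs):
    """Inverse transform: rebuild the signal from its spectrum."""
    return U @ coeffs

# Path graph on four nodes: 0 - 1 - 2 - 3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
eigvals, U = graph_fourier_basis(W)

smooth = np.ones(4)                          # constant label function
abrupt = np.array([1.0, -1.0, 1.0, -1.0])    # label flips on every edge

smooth_spectrum = gft(U, smooth)   # energy only at frequency 0
abrupt_spectrum = gft(U, abrupt)   # most energy at the highest frequency
```

Because two independently constructed graphs have different matrices `U`, the coefficients computed on one graph cannot simply be reused on the other, which is the difficulty the abstract addresses by aligning the bases.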