Structure-Preserved Unsupervised Domain Adaptation

Abstract

Domain adaptation has been a primary approach to addressing the lack of labels in many data mining tasks. Although considerable effort has been devoted to domain adaptation with promising results, most existing work learns a classifier on a source domain and then predicts the labels for target data, so that only the instances near the boundary determine the hyperplane and the overall structure information is ignored. Moreover, little work has been done on multi-source domain adaptation. To that end, we develop a novel unsupervised domain adaptation framework that preserves the whole structure of the source domains and uses it to guide target structure learning in a semi-supervised clustering fashion. To our knowledge, this is the first time that the domain adaptation problem has been reformulated as a semi-supervised clustering problem with target labels as missing values. Furthermore, by introducing an augmented matrix, we design a non-trivial solution that maps exactly onto a K-means-like optimization problem with a modified distance function and centroid update rule, and can therefore be solved efficiently. Extensive experiments on several widely used databases show substantial improvements of our proposed approach over state-of-the-art methods.
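To make the idea of clustering with target labels as missing values more concrete, the following is a minimal sketch under stated assumptions, not the exact formulation of the paper. It assumes a single source domain, an augmented matrix built as [features | one-hot labels], squared Euclidean distance, and that every class appears in the source data. The function name masked_kmeans and the weight lam are hypothetical and introduced only for illustration.

```python
import numpy as np


def masked_kmeans(X_src, y_src, X_tgt, n_classes, lam=1.0, n_iter=50):
    """K-means-like clustering on an augmented matrix [features | one-hot labels].

    Illustrative assumption: target rows have a missing label block, so that
    block is skipped in both the distance and the centroid update.
    """
    n_src = X_src.shape[0]
    X = np.vstack([X_src, X_tgt]).astype(float)

    # Label block: one-hot for source rows, left at zero (missing) for target rows.
    L = np.zeros((X.shape[0], n_classes))
    L[np.arange(n_src), y_src] = 1.0
    has_label = np.zeros(X.shape[0], dtype=bool)
    has_label[:n_src] = True

    # Initialise from source class means so that cluster k corresponds to class k.
    C_feat = np.vstack([X_src[y_src == k].mean(axis=0) for k in range(n_classes)])
    C_lab = np.eye(n_classes)

    for _ in range(n_iter):
        # Modified distance: feature term for all rows, label term only where labels exist.
        d_feat = ((X[:, None, :] - C_feat[None, :, :]) ** 2).sum(axis=2)
        d_lab = ((L[:, None, :] - C_lab[None, :, :]) ** 2).sum(axis=2)
        assign = (d_feat + lam * has_label[:, None] * d_lab).argmin(axis=1)

        # Modified update: feature centroid from all members, label centroid
        # from labelled (source) members only.
        new_feat, new_lab = C_feat.copy(), C_lab.copy()
        for k in range(n_classes):
            members = assign == k
            if members.any():
                new_feat[k] = X[members].mean(axis=0)
            labelled = members & has_label
            if labelled.any():
                new_lab[k] = L[labelled].mean(axis=0)

        if np.allclose(new_feat, C_feat) and np.allclose(new_lab, C_lab):
            break
        C_feat, C_lab = new_feat, new_lab

    # Cluster indices of the target rows serve as their predicted labels.
    return assign[n_src:], C_feat
```

In this sketch the source labels pull each cluster toward a single class through the label block, while the target instances influence only the feature part of each centroid, so the source structure guides the grouping of the unlabeled target data.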
