Framework for transfer learning: Maximization of quadratic mutual information to create discriminative subspaces

Abstract

In pattern recognition and computer vision, transfer learning has emerged as an important topic in recent years. It is motivated by the human visual system, which accumulates prior knowledge and experience to make sense of a novel domain. Learning an effective model for a classification or recognition task in a new domain (dataset) requires sufficient data with ground-truth information. With the advance of photo-capturing devices, visual data are generated in enormous quantities every moment, yet most of these data remain unannotated. Manually collecting and annotating training data is expensive, so a learned model may suffer a performance bottleneck due to poor generalization and label scarcity. Moreover, an existing trained model may become outdated if the distribution of the training data differs from the distribution on which the model is tested. Traditional machine learning methods generally assume that training and test data are sampled from the same distribution, an assumption that is often violated in real-life scenarios. Therefore, adapting an existing model, or exploiting the knowledge of a label-rich domain, becomes essential to cope with continuously evolving data distributions and the lack of label information in a novel domain. In other words, a knowledge transfer process aims to minimize the distribution divergence between domains so that a classifier trained on a source dataset can also generalize to the target domain. In this thesis, we propose a novel framework for transfer learning that creates a common subspace by maximizing the non-parametric quadratic mutual information (QMI) between data and the corresponding class labels. We extend prior work on QMI to the knowledge-transfer setting by introducing soft class assignment and instance weighting for data across domains.
The proposed approach learns a class-discriminative subspace by leveraging soft labeling. By employing a suitable weighting scheme, the method also identifies samples with underlying shared similarity across domains in order to maximize their impact on subspace learning. Variants of the proposed framework, a parameter-sensitivity analysis, extensive experiments on benchmark datasets, and performance comparisons with recent competitive methods are provided to demonstrate the efficacy of our framework.
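To make the core quantity concrete, the following is a minimal sketch of a non-parametric (Parzen-window) QMI estimator between projected samples and hard class labels, in the style of information-theoretic learning. It is an illustrative assumption-laden sketch, not the thesis's full framework: the function names (`qmi`, `_cross_information_potential`), the Gaussian kernel bandwidth `sigma`, and the toy two-class data are all hypothetical choices, and the soft-labeling and cross-domain instance-weighting extensions described above are omitted.

```python
import numpy as np

def _cross_information_potential(A, B, sigma):
    """Sum of Gaussian kernels G(a_i - b_j; 2*sigma^2 I) over all pairs.

    Convolving two Parzen windows of width sigma yields a Gaussian of
    variance 2*sigma^2, hence the doubled bandwidth below.
    """
    d = A.shape[1]
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    norm = (4.0 * np.pi * sigma ** 2) ** (-d / 2.0)
    return norm * np.exp(-sq_dists / (4.0 * sigma ** 2)).sum()

def qmi(Y, labels, sigma=1.0):
    """Parzen estimate of quadratic mutual information between Y (N x d)
    and discrete labels: sum_c integral (p(y,c) - p(y)p(c))^2 dy."""
    N = len(Y)
    all_pairs = _cross_information_potential(Y, Y, sigma)
    v_in = v_all = v_btw = 0.0
    for c in np.unique(labels):
        Yc = Y[labels == c]
        prior = len(Yc) / N
        v_in += _cross_information_potential(Yc, Yc, sigma)          # within-class
        v_all += prior ** 2 * all_pairs                              # marginals
        v_btw += prior * _cross_information_potential(Yc, Y, sigma)  # cross term
    return (v_in + v_all - 2.0 * v_btw) / N ** 2

# Toy demo: two classes separated along the first feature only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([-3.0, 0.0], 1.0, (100, 2)),
               rng.normal([+3.0, 0.0], 1.0, (100, 2))])
labels = np.array([0] * 100 + [1] * 100)

w_disc = np.array([[1.0], [0.0]])   # projection onto the discriminative axis
w_noise = np.array([[0.0], [1.0]])  # projection onto the noise axis

qmi_disc = qmi(X @ w_disc, labels)
qmi_noise = qmi(X @ w_noise, labels)
```

A subspace-learning method of this kind would treat the projection matrix as the optimization variable and ascend the gradient of `qmi` with respect to it; here the two fixed projections simply illustrate that a class-discriminative direction yields a markedly higher QMI value than a non-discriminative one.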
