Deep causal representation learning for unsupervised domain adaptation
Studies show that the representations learned by deep neural networks can be
transferred to similar prediction tasks in other domains for which we do not
have enough labeled data. However, as we transition to higher layers in the
model, the representations become more task-specific and less generalizable.
Recent research on deep domain adaptation proposed to mitigate this problem by
forcing the deep model to learn more transferable feature representations
across domains. This is achieved by incorporating domain adaptation methods
into the deep learning pipeline. The majority of existing models learn
transferable feature representations that are highly correlated with the
outcome. However, correlations are not always transferable. In this paper, we
propose a novel deep causal representation learning framework for unsupervised
domain adaptation, which learns domain-invariant causal
representations of the input from the source domain. We simulate a virtual
target domain using reweighted samples from the source domain and estimate the
causal effect of features on the outcomes. An extensive comparative study
demonstrates the strengths of the proposed model for unsupervised domain
adaptation via causal representations.
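The reweighting step that simulates a virtual target domain can be illustrated in isolation. The sketch below is a toy, not the paper's method: it assumes the source and virtual-target densities are known Gaussians, whereas the paper learns the sample weights; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1-D toy: source features follow N(0, 1); the "virtual target" is a shifted
# N(1, 1) that we mimic by reweighting source samples instead of sampling it
xs = rng.normal(0.0, 1.0, 20000)

def gauss_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# importance weights = target density / source density (known in this toy)
w = gauss_pdf(xs, 1.0, 1.0) / gauss_pdf(xs, 0.0, 1.0)
w /= w.sum()

# the reweighted source now matches the virtual target's statistics, so
# feature effects can be estimated "on the target" using only source data
target_mean_est = float(np.sum(w * xs))
print(target_mean_est)
```

The weighted mean lands near the virtual target's mean of 1.0, which is the sense in which reweighted source samples stand in for an unseen target domain.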
Deep Transfer Learning with Joint Adaptation Networks
Deep networks have been successfully applied to learn transferable features
for adapting models from a source domain to a different target domain. In this
paper, we present joint adaptation networks (JAN), which learn a transfer
network by aligning the joint distributions of multiple domain-specific layers
across domains based on a joint maximum mean discrepancy (JMMD) criterion.
An adversarial training strategy is adopted to maximize JMMD such that the
distributions of the source and target domains are made more distinguishable.
Learning can be performed by stochastic gradient descent with the gradients
computed by back-propagation in linear time. Experiments show that our model
yields state-of-the-art results on standard datasets.
Comment: 34th International Conference on Machine Learning
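The key construction here is that the kernel on the joint distribution of several layers is the product of per-layer kernels. A minimal numpy sketch of a JMMD-style estimate (biased, toy bandwidths, random stand-in activations; the real model trains against this quantity adversarially):

```python
import numpy as np

def joint_mmd2(layers_s, layers_t, gammas=(0.5, 0.5)):
    """Biased joint-MMD estimate: the joint kernel is the PRODUCT of
    per-layer RBF kernels. layers_s / layers_t are lists of (n, d_l)
    activation arrays, one per adapted layer."""
    def gram(A, B, g):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-g * d2)
    def joint(As, Bs):
        K = np.ones((As[0].shape[0], Bs[0].shape[0]))
        for A, B, g in zip(As, Bs, gammas):
            K *= gram(A, B, g)          # product kernel over layers
        return K.mean()
    return joint(layers_s, layers_s) + joint(layers_t, layers_t) \
        - 2 * joint(layers_s, layers_t)

rng = np.random.default_rng(0)
n = 150
feat_s, logit_s = rng.normal(0, 1, (n, 4)), rng.normal(0, 1, (n, 3))
feat_t, logit_t = rng.normal(1, 1, (n, 4)), rng.normal(0.5, 1, (n, 3))
d_same = joint_mmd2([feat_s, logit_s], [feat_s, logit_s])
d_shift = joint_mmd2([feat_s, logit_s], [feat_t, logit_t])
print(d_same, d_shift)
```

Identical samples give a joint MMD of zero, while the shifted pair gives a strictly positive value, which is what the feature learner would be driven to minimize.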
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Data of different modalities generally convey complementary but heterogeneous
information, and a more discriminative representation is often preferred by
combining multiple data modalities such as RGB and infrared features. In
practice, however, obtaining both data channels is challenging. For example,
RGB surveillance cameras are often barred from private spaces, which conflicts
with the need for abnormal-activity detection for personal security. As a
result, building a full multi-modal representation from partial data channels
is clearly desirable. In this
paper, we propose novel Partial-modal Generative Adversarial Networks
(PM-GANs), which learn a full-modal representation using data from only
partial modalities. The full representation is obtained by substituting a
generated representation for the missing data channel. Extensive experiments
are conducted to
verify the performance of our proposed method on action recognition, compared
with four state-of-the-art methods. In addition, a new Infrared-Visible
Dataset for action recognition is introduced; it will be the first publicly
available action dataset containing paired infrared and visible spectrum data.
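The substitution step can be sketched with toy numbers (everything here is illustrative: the real generator is trained adversarially against the paired modalities, not a fixed random map):

```python
import numpy as np

# minimal sketch of the partial-modality idea: at test time only the
# infrared channel is observed; a (pretrained, here random) generator maps
# it to a stand-in for the missing RGB representation, and the fused vector
# is used for recognition as if both modalities were present
rng = np.random.default_rng(0)
G = rng.normal(scale=0.1, size=(8, 8))     # toy "generator" weights
ir_feat = rng.normal(size=8)               # observed infrared representation
rgb_hat = np.tanh(G @ ir_feat)             # generated stand-in for RGB
full = np.concatenate([ir_feat, rgb_hat])  # full-modal representation
print(full.shape)
```

The downstream recognizer consumes `full` unchanged, which is why the generated channel can transparently replace the missing one.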
Unsupervised Domain Adaptation with Residual Transfer Networks
The recent success of deep neural networks relies on massive amounts of
labeled data. For a target task where labeled data is unavailable, domain
adaptation can transfer a learner from a different source domain. In this
paper, we propose a new approach to domain adaptation in deep networks that can
jointly learn adaptive classifiers and transferable features from labeled data
in the source domain and unlabeled data in the target domain. We relax a
shared-classifier assumption made by previous methods and assume that the
source classifier and target classifier differ by a residual function. We
enable classifier adaptation by plugging several layers into the deep network to
explicitly learn the residual function with reference to the target classifier.
We fuse features of multiple layers with tensor product and embed them into
reproducing kernel Hilbert spaces to match distributions for feature
adaptation. The adaptation can be achieved in most feed-forward models by
extending them with new residual layers and loss functions, which can be
trained efficiently via back-propagation. Empirical evidence shows that the new
approach outperforms state-of-the-art methods on standard domain adaptation
benchmarks.
Comment: 30th Conference on Neural Information Processing Systems (NIPS 2016),
Barcelona, Spain
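The residual relationship between the two classifiers can be written down directly. A minimal sketch (function names, shapes, and the tanh residual are illustrative stand-ins for the paper's residual layers):

```python
import numpy as np

# the source classifier is modeled as the target classifier plus a small
# learned residual block: fS(x) = fT(x) + dF(fT(x))
def f_target(x, W):              # shared backbone + target classifier
    return x @ W

def residual(z, V):              # residual correction, small by design
    return np.tanh(z @ V)

def f_source(x, W, V):
    z = f_target(x, W)
    return z + residual(z, V)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W = rng.normal(scale=0.1, size=(3, 2))
V = np.zeros((2, 2))             # zero residual -> the classifiers coincide
print(np.allclose(f_source(x, W, V), f_target(x, W)))
```

With a zero residual the two classifiers coincide, recovering the shared-classifier assumption as a special case; training the residual relaxes it.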
Learning Transferable Features with Deep Adaptation Networks
Recent studies reveal that a deep neural network can learn transferable
features which generalize well to novel tasks for domain adaptation. However,
as deep features eventually transition from general to specific along the
network, the feature transferability drops significantly in higher layers with
increasing domain discrepancy. Hence, it is important to formally reduce the
dataset bias and enhance the transferability in task-specific layers. In this
paper, we propose a new Deep Adaptation Network (DAN) architecture, which
generalizes deep convolutional neural network to the domain adaptation
scenario. In DAN, hidden representations of all task-specific layers are
embedded in a reproducing kernel Hilbert space where the mean embeddings of
different domain distributions can be explicitly matched. The domain
discrepancy is further reduced using an optimal multi-kernel selection method
for mean embedding matching. DAN can learn transferable features with
statistical guarantees and scales linearly via an unbiased estimate of the
kernel embedding. Extensive empirical evidence shows that the proposed
architecture yields state-of-the-art image classification error rates on
standard domain adaptation benchmarks.
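Both ingredients named above — the multi-kernel and the linear-time unbiased estimate — fit in a few lines. A hedged numpy sketch (toy bandwidths and data; this follows the standard streaming MMD estimator that pairs up consecutive samples, not DAN's full training loop):

```python
import numpy as np

def mmd2_linear(X, Y, gammas=(0.5, 1.0, 2.0)):
    """Linear-time unbiased MMD^2 with a sum of RBF kernels (multi-kernel):
    consecutive samples are paired so each pair is touched once."""
    m = min(len(X), len(Y)) // 2 * 2
    x1, x2 = X[0:m:2], X[1:m:2]
    y1, y2 = Y[0:m:2], Y[1:m:2]
    def k(a, b):
        d2 = ((a - b) ** 2).sum(-1)
        return sum(np.exp(-g * d2) for g in gammas)
    h = k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1)
    return float(h.mean())

rng = np.random.default_rng(1)
same = mmd2_linear(rng.normal(0, 1, (2000, 2)), rng.normal(0, 1, (2000, 2)))
shift = mmd2_linear(rng.normal(0, 1, (2000, 2)), rng.normal(1.5, 1, (2000, 2)))
print(same, shift)
```

The estimate is near zero for matching distributions and clearly positive under a shift, and the cost is one pass over the data — the linear scaling the abstract claims.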
Learning Multiple Tasks with Multilinear Relationship Networks
Deep networks trained on large-scale data can learn transferable features to
promote learning multiple tasks. Since deep features eventually transition from
general to specific along deep networks, a fundamental problem of multi-task
learning is how to exploit the task relatedness underlying parameter tensors
and improve feature transferability in the multiple task-specific layers. This
paper presents Multilinear Relationship Networks (MRN) that discover the task
relationships based on novel tensor normal priors over parameter tensors of
multiple task-specific layers in deep convolutional networks. By jointly
learning transferable features and multilinear relationships of tasks and
features, MRN is able to alleviate the dilemma of negative-transfer in the
feature layers and under-transfer in the classifier layer. Experiments show
that MRN yields state-of-the-art results on three multi-task learning datasets.
Comment: NIPS 2017
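The notion of task relatedness "underlying parameter tensors" can be made concrete with a toy computation (this is only the unfolding-covariance intuition, not MRN's tensor normal prior, which is learned jointly with the features):

```python
import numpy as np

# stack per-task weight matrices into a 3-way tensor W[task, in, out];
# the covariance of the mode-1 (task) unfolding exposes task relationships
rng = np.random.default_rng(0)
T, d_in, d_out = 3, 5, 4
Wt = rng.normal(size=(T, d_in, d_out))
Wt[1] = Wt[0] + 0.1 * rng.normal(size=(d_in, d_out))  # task 1 ~ task 0
unf = Wt.reshape(T, -1)                 # mode-1 unfolding: (tasks, in*out)
cov = unf @ unf.T / unf.shape[1]        # toy task-covariance matrix
print(cov[0, 1], cov[0, 2])
```

Tasks with nearby parameters show a large covariance entry, while unrelated tasks hover near zero — the signal a tensor normal prior exploits to decide where transfer helps.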
Wasserstein Distance Guided Representation Learning for Domain Adaptation
Domain adaptation aims at generalizing a high-performance learner on a target
domain via utilizing the knowledge distilled from a source domain which has a
different but related data distribution. One solution to domain adaptation is
to learn domain invariant feature representations while the learned
representations should also be discriminative in prediction. To learn such
representations, domain adaptation frameworks usually include a domain
invariant representation learning approach to measure and reduce the domain
discrepancy, as well as a discriminator for classification. Inspired by
Wasserstein GAN, in this paper we propose a novel approach to learn domain
invariant feature representations, namely Wasserstein Distance Guided
Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by
the domain critic, to estimate empirical Wasserstein distance between the
source and target samples and optimizes the feature extractor network to
minimize the estimated Wasserstein distance in an adversarial manner. The
theoretical advantages of Wasserstein distance for domain adaptation lie in its
gradient property and promising generalization bound. Empirical studies on
common sentiment and image classification adaptation datasets demonstrate that
our proposed WDGRL outperforms the state-of-the-art domain invariant
representation learning approaches.
Comment: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI
2018)
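The "gradient property" is easiest to see in one dimension, where the empirical Wasserstein-1 distance has a closed form. A toy sketch (WDGRL itself estimates this quantity with a trained critic network; the sorting trick below is only the 1-D special case):

```python
import numpy as np

def w1_empirical(a, b):
    """Empirical 1-D Wasserstein-1 distance between equal-size samples:
    sort both and average the coordinate gaps (the optimal 1-D coupling)."""
    return float(np.abs(np.sort(a) - np.sort(b)).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, 4000)
# as the target slides toward the source, W1 shrinks smoothly -- even when
# the supports barely overlap, unlike divergences that saturate
dists = [w1_empirical(src, rng.normal(s, 1.0, 4000)) for s in (4.0, 2.0, 1.0)]
print(dists)
```

The distance tracks the shift almost exactly (about 4, 2, 1), so minimizing it gives the feature extractor a useful gradient at every stage of alignment.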
Multi-Adversarial Domain Adaptation
Recent advances in deep domain adaptation reveal that adversarial learning
can be embedded into deep networks to learn transferable features that reduce
distribution discrepancy between the source and target domains. Existing domain
adversarial adaptation methods based on single domain discriminator only align
the source and target data distributions without exploiting the complex
multimode structures. In this paper, we present a multi-adversarial domain
adaptation (MADA) approach, which captures multimode structures to enable
fine-grained alignment of different data distributions based on multiple domain
discriminators. The adaptation can be achieved by stochastic gradient descent
with the gradients computed by back-propagation in linear time. Empirical
evidence demonstrates that the proposed model outperforms state-of-the-art
methods on standard domain adaptation datasets.
Comment: AAAI 2018 Oral. arXiv admin note: substantial text overlap with
arXiv:1705.10667, arXiv:1707.0790
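The routing of each example to class-wise discriminators can be sketched for a single example (all weights and probabilities below are toy values; MADA trains the K discriminators and the classifier jointly):

```python
import numpy as np

# each class k gets its own domain discriminator D_k; an example's feature
# is routed to D_k weighted by the classifier's probability y_hat[k], so
# alignment is fine-grained per class mode rather than one global match
rng = np.random.default_rng(0)
K, d = 3, 4
f = rng.normal(size=d)                   # feature of one (source) example
y_hat = np.array([0.7, 0.2, 0.1])        # predicted class probabilities

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = rng.normal(size=(K, d))              # toy per-class discriminator weights
# per-class domain predictions on the probability-weighted feature
domain_probs = [sigmoid(W[k] @ (y_hat[k] * f)) for k in range(K)]
# total domain loss sums the K discriminators (source label = 1 here)
loss = float(-np.mean([np.log(p) for p in domain_probs]))
print(loss)
```

Because the weights `y_hat[k]` concentrate each example on its likely class, a confidently classified example mostly influences one discriminator, which is how the multimode structure is captured.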
Learning to Transfer Examples for Partial Domain Adaptation
Domain adaptation is critical for learning in new and unseen environments.
With domain adversarial training, deep networks can learn disentangled and
transferable features that effectively diminish the dataset shift between the
source and target domains for knowledge transfer. In the era of Big Data, the
ready availability of large-scale labeled datasets has stimulated wide interest
in partial domain adaptation (PDA), which transfers a recognizer from a labeled
large domain to an unlabeled small domain. It extends standard domain
adaptation to the scenario where target labels are only a subset of source
labels. Under the condition that target labels are unknown, the key challenge
of PDA is how to transfer relevant examples in the shared classes to promote
positive transfer, and ignore irrelevant ones in the specific classes to
mitigate negative transfer. In this work, we propose a unified approach to PDA,
Example Transfer Network (ETN), which jointly learns domain-invariant
representations across the source and target domains, and a progressive
weighting scheme that quantifies the transferability of source examples while
controlling their importance to the learning task in the target domain. A
thorough evaluation on several benchmark datasets shows that our approach
achieves state-of-the-art results for partial domain adaptation tasks.
Comment: CVPR 2019 accepted
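The progressive weighting scheme reduces to scoring and renormalizing source examples. A toy sketch (the logits below are made up; in ETN the transferability score comes from an auxiliary domain discriminator trained alongside the model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# auxiliary-discriminator logits per source example: high = target-like,
# low = likely from a source-only (outlier) class
aux_logits = np.array([2.0, 0.0, -2.0])
scores = sigmoid(aux_logits)              # transferability score in (0, 1)
w = scores / scores.mean()                # normalize to mean 1 per batch
per_example_ce = np.array([0.4, 0.9, 0.6])  # toy classification losses
weighted_loss = float(np.mean(w * per_example_ce))
print(w, weighted_loss)
```

Target-like source examples keep near-full weight while outlier-class examples are suppressed, promoting positive transfer and mitigating negative transfer as described above.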
Task-generalizable Adversarial Attack based on Perceptual Metric
Deep neural networks (DNNs) can be easily fooled by adding human
imperceptible perturbations to the images. These perturbed images are known as
`adversarial examples' and pose a serious threat to security and safety
critical systems. A litmus test for the strength of adversarial examples is
their transferability across different DNN models in a black-box setting (i.e.,
when the target model's architecture and parameters are not known to the
attacker).
Current attack algorithms that seek to enhance adversarial transferability work
at the decision level, i.e., they generate perturbations that alter the
network's decisions. This leads to two key limitations: (a) An attack is dependent on the
task-specific loss function (e.g. softmax cross-entropy for object recognition)
and therefore does not generalize beyond its original task. (b) The adversarial
examples are specific to the network architecture and demonstrate poor
transferability to other network architectures. We propose a novel approach to
create adversarial examples that can broadly fool different networks on
multiple tasks. Our approach is based on the following intuition: "Perceptual
metrics based on neural network features are highly generalizable and show
excellent performance in measuring and stabilizing input distortions; therefore,
an ideal attack that creates maximum distortions in the network feature space
should yield highly transferable examples." We report extensive experiments
to show how adversarial examples generalize across multiple networks for
classification, object detection, and segmentation tasks.
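The feature-space objective can be demonstrated with a toy extractor where the gradient is available in closed form (a single random linear layer stands in for "network features"; the step sizes and budget are arbitrary, and this is only the optimization principle, not the paper's attack):

```python
import numpy as np

# maximize the distance between clean and perturbed activations of a fixed
# feature extractor, instead of any task-specific loss
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))              # toy "feature extractor"
x = rng.normal(size=8)                    # clean input
eps, steps, lr = 0.1, 20, 0.05
x_adv = x + rng.uniform(-eps, eps, size=8)    # random start inside the ball
for _ in range(steps):
    g = W.T @ (W @ (x_adv - x))           # grad of 0.5 * ||W(x' - x)||^2
    x_adv = np.clip(x_adv + lr * np.sign(g),  # FGSM-style ascent step...
                    x - eps, x + eps)         # ...kept in the L_inf ball
print(float(np.linalg.norm(W @ (x_adv - x))))
```

Because the objective never references a task loss or output layer, the same perturbation recipe applies unchanged across classification, detection, and segmentation heads, which is the source of the claimed task generality.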