Deep causal representation learning for unsupervised domain adaptation
Studies show that the representations learned by deep neural networks can be
transferred to similar prediction tasks in other domains for which we do not
have enough labeled data. However, as we transition to higher layers in the
model, the representations become more task-specific and less generalizable.
Recent research on deep domain adaptation proposed to mitigate this problem by
forcing the deep model to learn more transferable feature representations
across domains. This is achieved by incorporating domain adaptation methods
into the deep learning pipeline. The majority of existing models learn
transferable feature representations that are highly correlated with the
outcome. However, correlations are not always transferable. In this paper, we
propose a novel deep causal representation learning framework for unsupervised
domain adaptation, which learns domain-invariant causal
representations of the input from the source domain. We simulate a virtual
target domain using reweighted samples from the source domain and estimate the
causal effect of features on the outcomes. An extensive comparative study
demonstrates the strengths of the proposed model for unsupervised domain
adaptation via causal representations.
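The reweighting step that simulates a virtual target domain can be illustrated in isolation. The sketch below is a toy, not the paper's method: it assumes the source and virtual-target densities are known Gaussians, whereas the paper learns the sample weights; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# 1-D toy: source features follow N(0, 1); the "virtual target" is a shifted
# N(1, 1) that we mimic by reweighting source samples instead of sampling it
xs = rng.normal(0.0, 1.0, 20000)

def gauss_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# importance weights = target density / source density (known in this toy)
w = gauss_pdf(xs, 1.0, 1.0) / gauss_pdf(xs, 0.0, 1.0)
w /= w.sum()

# the reweighted source now matches the virtual target's statistics, so
# feature effects can be estimated "on the target" using only source data
target_mean_est = float(np.sum(w * xs))
print(target_mean_est)
```

The weighted mean lands near the virtual target's mean of 1.0, which is the sense in which reweighted source samples stand in for an unseen target domain.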
Deep Transfer Learning with Joint Adaptation Networks
Deep networks have been successfully applied to learn transferable features
for adapting models from a source domain to a different target domain. In this
paper, we present joint adaptation networks (JAN), which learn a transfer
network by aligning the joint distributions of multiple domain-specific layers
across domains based on a joint maximum mean discrepancy (JMMD) criterion.
An adversarial training strategy is adopted to maximize JMMD such that the
distributions of the source and target domains are made more distinguishable.
Learning can be performed by stochastic gradient descent with the gradients
computed by back-propagation in linear time. Experiments show that our model
yields state-of-the-art results on standard datasets.
Comment: 34th International Conference on Machine Learning
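The key construction here is that the kernel on the joint distribution of several layers is the product of per-layer kernels. A minimal numpy sketch of a JMMD-style estimate (biased, toy bandwidths, random stand-in activations; the real model trains against this quantity adversarially):

```python
import numpy as np

def joint_mmd2(layers_s, layers_t, gammas=(0.5, 0.5)):
    """Biased joint-MMD estimate: the joint kernel is the PRODUCT of
    per-layer RBF kernels. layers_s / layers_t are lists of (n, d_l)
    activation arrays, one per adapted layer."""
    def gram(A, B, g):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-g * d2)
    def joint(As, Bs):
        K = np.ones((As[0].shape[0], Bs[0].shape[0]))
        for A, B, g in zip(As, Bs, gammas):
            K *= gram(A, B, g)          # product kernel over layers
        return K.mean()
    return joint(layers_s, layers_s) + joint(layers_t, layers_t) \
        - 2 * joint(layers_s, layers_t)

rng = np.random.default_rng(0)
n = 150
feat_s, logit_s = rng.normal(0, 1, (n, 4)), rng.normal(0, 1, (n, 3))
feat_t, logit_t = rng.normal(1, 1, (n, 4)), rng.normal(0.5, 1, (n, 3))
d_same = joint_mmd2([feat_s, logit_s], [feat_s, logit_s])
d_shift = joint_mmd2([feat_s, logit_s], [feat_t, logit_t])
print(d_same, d_shift)
```

Identical samples give a joint MMD of zero, while the shifted pair gives a strictly positive value, which is what the feature learner would be driven to minimize.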
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Data of different modalities generally convey complementary but heterogeneous
information, and a more discriminative representation is often preferred by
combining multiple data modalities such as RGB and infrared features. In
practice, however, obtaining both data channels is challenging. For example,
RGB surveillance cameras are often barred from private spaces, which conflicts
with the need for abnormal-activity detection for personal security. As a
result, building a full multi-modal representation from partial data channels
is clearly desirable. In this
paper, we propose novel Partial-modal Generative Adversarial Networks
(PM-GANs), which learn a full-modal representation using data from only
partial modalities. The full representation is obtained by substituting a
generated representation for the missing data channel. Extensive experiments
are conducted to
verify the performance of our proposed method on action recognition, compared
with four state-of-the-art methods. In addition, a new Infrared-Visible
Dataset for action recognition is introduced; it will be the first publicly
available action dataset containing paired infrared and visible spectrum data.
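The substitution step can be sketched with toy numbers (everything here is illustrative: the real generator is trained adversarially against the paired modalities, not a fixed random map):

```python
import numpy as np

# minimal sketch of the partial-modality idea: at test time only the
# infrared channel is observed; a (pretrained, here random) generator maps
# it to a stand-in for the missing RGB representation, and the fused vector
# is used for recognition as if both modalities were present
rng = np.random.default_rng(0)
G = rng.normal(scale=0.1, size=(8, 8))     # toy "generator" weights
ir_feat = rng.normal(size=8)               # observed infrared representation
rgb_hat = np.tanh(G @ ir_feat)             # generated stand-in for RGB
full = np.concatenate([ir_feat, rgb_hat])  # full-modal representation
print(full.shape)
```

The downstream recognizer consumes `full` unchanged, which is why the generated channel can transparently replace the missing one.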
Unsupervised Domain Adaptation with Residual Transfer Networks
The recent success of deep neural networks relies on massive amounts of
labeled data. For a target task where labeled data is unavailable, domain
adaptation can transfer a learner from a different source domain. In this
paper, we propose a new approach to domain adaptation in deep networks that can
jointly learn adaptive classifiers and transferable features from labeled data
in the source domain and unlabeled data in the target domain. We relax a
shared-classifier assumption made by previous methods and assume that the
source classifier and target classifier differ by a residual function. We
enable classifier adaptation by plugging several layers into the deep network to
explicitly learn the residual function with reference to the target classifier.
We fuse features of multiple layers with tensor product and embed them into
reproducing kernel Hilbert spaces to match distributions for feature
adaptation. The adaptation can be achieved in most feed-forward models by
extending them with new residual layers and loss functions, which can be
trained efficiently via back-propagation. Empirical evidence shows that the new
approach outperforms state-of-the-art methods on standard domain adaptation
benchmarks.
Comment: 30th Conference on Neural Information Processing Systems (NIPS 2016),
Barcelona, Spain
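The residual relationship between the two classifiers can be written down directly. A minimal sketch (function names, shapes, and the tanh residual are illustrative stand-ins for the paper's residual layers):

```python
import numpy as np

# the source classifier is modeled as the target classifier plus a small
# learned residual block: fS(x) = fT(x) + dF(fT(x))
def f_target(x, W):              # shared backbone + target classifier
    return x @ W

def residual(z, V):              # residual correction, small by design
    return np.tanh(z @ V)

def f_source(x, W, V):
    z = f_target(x, W)
    return z + residual(z, V)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
W = rng.normal(scale=0.1, size=(3, 2))
V = np.zeros((2, 2))             # zero residual -> the classifiers coincide
print(np.allclose(f_source(x, W, V), f_target(x, W)))
```

With a zero residual the two classifiers coincide, recovering the shared-classifier assumption as a special case; training the residual relaxes it.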
Learning Transferable Features with Deep Adaptation Networks
Recent studies reveal that a deep neural network can learn transferable
features which generalize well to novel tasks for domain adaptation. However,
as deep features eventually transition from general to specific along the
network, the feature transferability drops significantly in higher layers with
increasing domain discrepancy. Hence, it is important to formally reduce the
dataset bias and enhance the transferability in task-specific layers. In this
paper, we propose a new Deep Adaptation Network (DAN) architecture, which
generalizes deep convolutional neural network to the domain adaptation
scenario. In DAN, hidden representations of all task-specific layers are
embedded in a reproducing kernel Hilbert space where the mean embeddings of
different domain distributions can be explicitly matched. The domain
discrepancy is further reduced using an optimal multi-kernel selection method
for mean embedding matching. DAN can learn transferable features with
statistical guarantees and scales linearly via an unbiased estimate of the
kernel embedding. Extensive empirical evidence shows that the proposed
architecture yields state-of-the-art image classification error rates on
standard domain adaptation benchmarks.
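Both ingredients named above — the multi-kernel and the linear-time unbiased estimate — fit in a few lines. A hedged numpy sketch (toy bandwidths and data; this follows the standard streaming MMD estimator that pairs up consecutive samples, not DAN's full training loop):

```python
import numpy as np

def mmd2_linear(X, Y, gammas=(0.5, 1.0, 2.0)):
    """Linear-time unbiased MMD^2 with a sum of RBF kernels (multi-kernel):
    consecutive samples are paired so each pair is touched once."""
    m = min(len(X), len(Y)) // 2 * 2
    x1, x2 = X[0:m:2], X[1:m:2]
    y1, y2 = Y[0:m:2], Y[1:m:2]
    def k(a, b):
        d2 = ((a - b) ** 2).sum(-1)
        return sum(np.exp(-g * d2) for g in gammas)
    h = k(x1, x2) + k(y1, y2) - k(x1, y2) - k(x2, y1)
    return float(h.mean())

rng = np.random.default_rng(1)
same = mmd2_linear(rng.normal(0, 1, (2000, 2)), rng.normal(0, 1, (2000, 2)))
shift = mmd2_linear(rng.normal(0, 1, (2000, 2)), rng.normal(1.5, 1, (2000, 2)))
print(same, shift)
```

The estimate is near zero for matching distributions and clearly positive under a shift, and the cost is one pass over the data — the linear scaling the abstract claims.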
Learning Multiple Tasks with Multilinear Relationship Networks
Deep networks trained on large-scale data can learn transferable features to
promote learning multiple tasks. Since deep features eventually transition from
general to specific along deep networks, a fundamental problem of multi-task
learning is how to exploit the task relatedness underlying parameter tensors
and improve feature transferability in the multiple task-specific layers. This
paper presents Multilinear Relationship Networks (MRN) that discover the task
relationships based on novel tensor normal priors over parameter tensors of
multiple task-specific layers in deep convolutional networks. By jointly
learning transferable features and multilinear relationships of tasks and
features, MRN is able to alleviate the dilemma of negative-transfer in the
feature layers and under-transfer in the classifier layer. Experiments show
that MRN yields state-of-the-art results on three multi-task learning datasets.
Comment: NIPS 2017
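The notion of task relatedness "underlying parameter tensors" can be made concrete with a toy computation (this is only the unfolding-covariance intuition, not MRN's tensor normal prior, which is learned jointly with the features):

```python
import numpy as np

# stack per-task weight matrices into a 3-way tensor W[task, in, out];
# the covariance of the mode-1 (task) unfolding exposes task relationships
rng = np.random.default_rng(0)
T, d_in, d_out = 3, 5, 4
Wt = rng.normal(size=(T, d_in, d_out))
Wt[1] = Wt[0] + 0.1 * rng.normal(size=(d_in, d_out))  # task 1 ~ task 0
unf = Wt.reshape(T, -1)                 # mode-1 unfolding: (tasks, in*out)
cov = unf @ unf.T / unf.shape[1]        # toy task-covariance matrix
print(cov[0, 1], cov[0, 2])
```

Tasks with nearby parameters show a large covariance entry, while unrelated tasks hover near zero — the signal a tensor normal prior exploits to decide where transfer helps.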
Wasserstein Distance Guided Representation Learning for Domain Adaptation
Domain adaptation aims at generalizing a high-performance learner on a target
domain via utilizing the knowledge distilled from a source domain which has a
different but related data distribution. One solution to domain adaptation is
to learn domain invariant feature representations while the learned
representations should also be discriminative in prediction. To learn such
representations, domain adaptation frameworks usually include a domain
invariant representation learning approach to measure and reduce the domain
discrepancy, as well as a discriminator for classification. Inspired by
Wasserstein GAN, in this paper we propose a novel approach to learn domain
invariant feature representations, namely Wasserstein Distance Guided
Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by
the domain critic, to estimate empirical Wasserstein distance between the
source and target samples and optimizes the feature extractor network to
minimize the estimated Wasserstein distance in an adversarial manner. The
theoretical advantages of Wasserstein distance for domain adaptation lie in its
gradient property and promising generalization bound. Empirical studies on
common sentiment and image classification adaptation datasets demonstrate that
our proposed WDGRL outperforms the state-of-the-art domain invariant
representation learning approaches.
Comment: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI
2018)
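The "gradient property" is easiest to see in one dimension, where the empirical Wasserstein-1 distance has a closed form. A toy sketch (WDGRL itself estimates this quantity with a trained critic network; the sorting trick below is only the 1-D special case):

```python
import numpy as np

def w1_empirical(a, b):
    """Empirical 1-D Wasserstein-1 distance between equal-size samples:
    sort both and average the coordinate gaps (the optimal 1-D coupling)."""
    return float(np.abs(np.sort(a) - np.sort(b)).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, 4000)
# as the target slides toward the source, W1 shrinks smoothly -- even when
# the supports barely overlap, unlike divergences that saturate
dists = [w1_empirical(src, rng.normal(s, 1.0, 4000)) for s in (4.0, 2.0, 1.0)]
print(dists)
```

The distance tracks the shift almost exactly (about 4, 2, 1), so minimizing it gives the feature extractor a useful gradient at every stage of alignment.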
Multi-Adversarial Domain Adaptation
Recent advances in deep domain adaptation reveal that adversarial learning
can be embedded into deep networks to learn transferable features that reduce
distribution discrepancy between the source and target domains. Existing domain
adversarial adaptation methods based on single domain discriminator only align
the source and target data distributions without exploiting the complex
multimode structures. In this paper, we present a multi-adversarial domain
adaptation (MADA) approach, which captures multimode structures to enable
fine-grained alignment of different data distributions based on multiple domain
discriminators. The adaptation can be achieved by stochastic gradient descent
with the gradients computed by back-propagation in linear time. Empirical
evidence demonstrates that the proposed model outperforms state-of-the-art
methods on standard domain adaptation datasets.
Comment: AAAI 2018 Oral. arXiv admin note: substantial text overlap with
arXiv:1705.10667, arXiv:1707.0790
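The routing of each example to class-wise discriminators can be sketched for a single example (all weights and probabilities below are toy values; MADA trains the K discriminators and the classifier jointly):

```python
import numpy as np

# each class k gets its own domain discriminator D_k; an example's feature
# is routed to D_k weighted by the classifier's probability y_hat[k], so
# alignment is fine-grained per class mode rather than one global match
rng = np.random.default_rng(0)
K, d = 3, 4
f = rng.normal(size=d)                   # feature of one (source) example
y_hat = np.array([0.7, 0.2, 0.1])        # predicted class probabilities

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = rng.normal(size=(K, d))              # toy per-class discriminator weights
# per-class domain predictions on the probability-weighted feature
domain_probs = [sigmoid(W[k] @ (y_hat[k] * f)) for k in range(K)]
# total domain loss sums the K discriminators (source label = 1 here)
loss = float(-np.mean([np.log(p) for p in domain_probs]))
print(loss)
```

Because the weights `y_hat[k]` concentrate each example on its likely class, a confidently classified example mostly influences one discriminator, which is how the multimode structure is captured.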
Learning to Transfer Examples for Partial Domain Adaptation
Domain adaptation is critical for learning in new and unseen environments.
With domain adversarial training, deep networks can learn disentangled and
transferable features that effectively diminish the dataset shift between the
source and target domains for knowledge transfer. In the era of Big Data, the
ready availability of large-scale labeled datasets has stimulated wide interest
in partial domain adaptation (PDA), which transfers a recognizer from a labeled
large domain to an unlabeled small domain. It extends standard domain
adaptation to the scenario where target labels are only a subset of source
labels. Under the condition that target labels are unknown, the key challenge
of PDA is how to transfer relevant examples in the shared classes to promote
positive transfer, and ignore irrelevant ones in the specific classes to
mitigate negative transfer. In this work, we propose a unified approach to PDA,
Example Transfer Network (ETN), which jointly learns domain-invariant
representations across the source and target domains, and a progressive
weighting scheme that quantifies the transferability of source examples while
controlling their importance to the learning task in the target domain. A
thorough evaluation on several benchmark datasets shows that our approach
achieves state-of-the-art results for partial domain adaptation tasks.
Comment: CVPR 2019 accepted
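The progressive weighting scheme reduces to scoring and renormalizing source examples. A toy sketch (the logits below are made up; in ETN the transferability score comes from an auxiliary domain discriminator trained alongside the model):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# auxiliary-discriminator logits per source example: high = target-like,
# low = likely from a source-only (outlier) class
aux_logits = np.array([2.0, 0.0, -2.0])
scores = sigmoid(aux_logits)              # transferability score in (0, 1)
w = scores / scores.mean()                # normalize to mean 1 per batch
per_example_ce = np.array([0.4, 0.9, 0.6])  # toy classification losses
weighted_loss = float(np.mean(w * per_example_ce))
print(w, weighted_loss)
```

Target-like source examples keep near-full weight while outlier-class examples are suppressed, promoting positive transfer and mitigating negative transfer as described above.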
Task-generalizable Adversarial Attack based on Perceptual Metric
Deep neural networks (DNNs) can be easily fooled by adding human
imperceptible perturbations to the images. These perturbed images are known as
`adversarial examples' and pose a serious threat to security and safety
critical systems. A litmus test for the strength of adversarial examples is
their transferability across different DNN models in a black-box setting (i.e.,
when the target model's architecture and parameters are not known to the
attacker).
Current attack algorithms that seek to enhance adversarial transferability work
at the decision level, i.e., they generate perturbations that alter the
network's decisions. This leads to two key limitations: (a) An attack is dependent on the
task-specific loss function (e.g. softmax cross-entropy for object recognition)
and therefore does not generalize beyond its original task. (b) The adversarial
examples are specific to the network architecture and demonstrate poor
transferability to other network architectures. We propose a novel approach to
create adversarial examples that can broadly fool different networks on
multiple tasks. Our approach is based on the following intuition: "Perceptual
metrics based on neural network features are highly generalizable and show
excellent performance in measuring and stabilizing input distortions; therefore,
an ideal attack that creates maximum distortions in the network feature space
should yield highly transferable examples." We report extensive experiments
to show how adversarial examples generalize across multiple networks for
classification, object detection, and segmentation tasks.
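The feature-space objective can be demonstrated with a toy extractor where the gradient is available in closed form (a single random linear layer stands in for "network features"; the step sizes and budget are arbitrary, and this is only the optimization principle, not the paper's attack):

```python
import numpy as np

# maximize the distance between clean and perturbed activations of a fixed
# feature extractor, instead of any task-specific loss
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))              # toy "feature extractor"
x = rng.normal(size=8)                    # clean input
eps, steps, lr = 0.1, 20, 0.05
x_adv = x + rng.uniform(-eps, eps, size=8)    # random start inside the ball
for _ in range(steps):
    g = W.T @ (W @ (x_adv - x))           # grad of 0.5 * ||W(x' - x)||^2
    x_adv = np.clip(x_adv + lr * np.sign(g),  # FGSM-style ascent step...
                    x - eps, x + eps)         # ...kept in the L_inf ball
print(float(np.linalg.norm(W @ (x_adv - x))))
```

Because the objective never references a task loss or output layer, the same perturbation recipe applies unchanged across classification, detection, and segmentation heads, which is the source of the claimed task generality.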