108,072 research outputs found
Adversarial Domain Adaptation Being Aware of Class Relationships
Adversarial training is a useful approach to promote the learning of
transferable representations across the source and target domains, which has
been widely applied for domain adaptation (DA) tasks based on deep neural
networks. Until very recently, existing adversarial domain adaptation (ADA)
methods have ignored the useful information in the label space, which is an
important factor accounting for the complicated data distributions associated
with different semantic classes. In particular, inter-class semantic
relationships have rarely been considered or discussed in current transfer
learning work. In this paper, we propose a novel relationship-aware
adversarial domain adaptation (RADA) algorithm, which first utilizes a single
multi-class domain discriminator to enforce the learning of inter-class
dependency structure during domain-adversarial training and then aligns this
structure with the inter-class dependencies that are characterized from
training the label predictor on source domain. Specifically, we impose a
regularization term to penalize the structure discrepancy between the
inter-class dependencies respectively estimated from domain discriminator and
label predictor. Through this alignment, our proposed method makes the
adversarial domain adaptation aware of the class relationships. Empirical
studies show that the incorporation of class relationships significantly
improves the performance on benchmark datasets.
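The alignment described in the abstract above can be pictured with a toy sketch: estimate an inter-class similarity matrix from the class-level parameters of both the multi-class domain discriminator and the label predictor, then penalize the discrepancy between the two as a regularizer. The function names, the cosine-similarity choice, and the Frobenius-norm penalty are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def class_similarity(class_weights):
    # class_weights: (num_classes, feat_dim) class-level parameters.
    # Cosine similarity between class rows serves as a proxy for the
    # inter-class dependency structure (an illustrative choice).
    norms = np.linalg.norm(class_weights, axis=1, keepdims=True)
    unit = class_weights / np.clip(norms, 1e-8, None)
    return unit @ unit.T

def structure_discrepancy(discriminator_w, predictor_w):
    # Squared Frobenius norm between the two dependency matrices; adding
    # this term to the training objective would push the discriminator's
    # class structure toward the one learned by the label predictor.
    diff = class_similarity(discriminator_w) - class_similarity(predictor_w)
    return float(np.sum(diff ** 2))
```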
Joint auto-encoders: a flexible multi-task learning framework
The incorporation of prior knowledge into learning is essential in achieving
good performance based on small noisy samples. Such knowledge is often
incorporated through the availability of related data arising from domains and
tasks similar to the one of current interest. Ideally one would like to allow
both the data for the current task and for previous related tasks to
self-organize the learning system in such a way that commonalities and
differences between the tasks are learned in a data-driven fashion. We develop
a framework for learning multiple tasks simultaneously, based on sharing
features that are common to all tasks, achieved through the use of a modular
deep feedforward neural network consisting of shared branches, dealing with the
common features of all tasks, and private branches, learning the specific
unique aspects of each task. Once an appropriate weight sharing architecture
has been established, learning takes place through standard algorithms for
feedforward networks, e.g., stochastic gradient descent and its variations. The
method deals with domain adaptation and multi-task learning in a unified
fashion, and can easily deal with data arising from different types of sources.
Numerical experiments demonstrate the effectiveness of learning in domain
adaptation and transfer learning setups, and provide evidence for the flexible
and task-oriented representations arising in the network.
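A minimal sketch of the shared/private branch idea, with NumPy standing in for a deep learning framework: each task owns its private branch and head, while the shared branch is reused across tasks. Layer sizes, names, and the single-hidden-layer shape are arbitrary assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def task_forward(x, shared_w, private_w, head_w):
    # The shared branch processes features common to all tasks; the
    # private branch captures the task-specific aspects. The task head
    # sees the concatenation of both feature sets.
    shared = relu(x @ shared_w)
    private = relu(x @ private_w)
    features = np.concatenate([shared, private], axis=1)
    return features @ head_w
```

In training, gradients from every task would update `shared_w`, while each task's loss updates only its own `private_w` and `head_w`.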
Improving Transferability of Deep Neural Networks
Learning from small amounts of labeled data is a challenge in the area of
deep learning. This is currently addressed by Transfer Learning where one
learns the small data set as a transfer task from a larger source dataset.
Transfer Learning can deliver higher accuracy if the hyperparameters and source
dataset are chosen well. One of the important parameters is the learning rate
for the layers of the neural network. We show through experiments on the
ImageNet22k and Oxford Flowers datasets that improvements in accuracy in the
range of 127% can be obtained by a proper choice of learning rates. We also show that
the images/label parameter for a dataset can potentially be used to determine
optimal learning rates for the layers to get the best overall accuracy. We
additionally validate this method on a sample of real-world image
classification tasks from a public visual recognition API.
Comment: 15 pages, 11 figures, 2 tables, Workshop on Domain Adaptation for
Visual Understanding (Joint IJCAI/ECAI/AAMAS/ICML 2018 Workshop). Keywords:
deep learning, transfer learning, finetuning, deep neural network,
experimentation
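The layer-wise learning-rate idea can be sketched as a scaled SGD update. The per-layer scale values, and the heuristic of shrinking rates for early layers, are assumptions for illustration, not the schedule the paper derives from the images/label parameter.

```python
def layerwise_sgd_step(params, grads, base_lr, layer_scales):
    # Multiply the base learning rate by a per-layer scale; layers absent
    # from layer_scales fall back to the unscaled base rate.
    return {name: value - base_lr * layer_scales.get(name, 1.0) * grads[name]
            for name, value in params.items()}
```

For example, giving a hypothetical `conv1` layer a scale of 0.1 updates it ten times more slowly than the classifier head, preserving transferred low-level features while the head adapts to the target task.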
Zero-Annotation Object Detection with Web Knowledge Transfer
Object detection is one of the major problems in computer vision, and has
been extensively studied. Most of the existing detection works rely on
labor-intensive supervision, such as ground truth bounding boxes of objects or
at least image-level annotations. In contrast, we propose an object
detection method that does not require any form of human annotation on target
tasks, by exploiting freely available web images. In order to facilitate
effective knowledge transfer from web images, we introduce a multi-instance
multi-label domain adaptation learning framework with two key innovations. First
of all, we propose an instance-level adversarial domain adaptation network with
attention on foreground objects to transfer the object appearances from web
domain to target domain. Second, to preserve the class-specific semantic
structure of transferred object features, we propose a simultaneous transfer
mechanism to transfer the supervision across domains through pseudo strong
label generation. With our end-to-end framework that simultaneously learns a
weakly supervised detector and transfers knowledge across domains, we achieved
significant improvements over baseline methods on the benchmark datasets.
Comment: Accepted in ECCV 2018
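The pseudo strong label generation step can be illustrated as confidence thresholding over a weak detector's outputs, promoting confident detections to pseudo ground truth for cross-domain training. The threshold value and data layout are assumptions, not the paper's exact mechanism.

```python
def pseudo_strong_labels(boxes, scores, labels, threshold=0.8):
    # Keep only high-confidence detections and promote them to pseudo
    # ground-truth (box, label) pairs that supervise the detector in the
    # target domain.
    return [(box, label)
            for box, score, label in zip(boxes, scores, labels)
            if score >= threshold]
```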
Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units
Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language
Processing (NLP), thanks to its high performance on many tasks, especially in
low-resourced scenarios. Notably, TL is widely used for neural domain
adaptation to transfer valuable knowledge from high-resource to low-resource
domains. In the standard fine-tuning scheme of TL, a model is initially
pre-trained on a source domain and subsequently fine-tuned on a target domain
and, therefore, source and target domains are trained using the same
architecture. In this paper, we show through interpretation methods that such a
scheme, despite its efficiency, suffers from a major limitation. Indeed,
although capable of adapting to new domains, pre-trained neurons struggle with
learning certain patterns that are specific to the target domain. Moreover, we
shed light on the hidden negative transfer occurring despite the high
relatedness between source and target domains, which may diminish the final
gain brought by transfer learning. To address these problems, we propose to
augment the pre-trained model with normalised, weighted and randomly
initialised units that foster a better adaptation while maintaining the
valuable source knowledge. We show that our approach exhibits significant
improvements to the standard fine-tuning scheme for neural domain adaptation
from the news domain to the social media domain on four NLP tasks:
part-of-speech tagging, chunking, named entity recognition and morphosyntactic
tagging.
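The augmentation step can be sketched as appending randomly initialised output units to a pre-trained weight matrix, normalised to the scale of the existing units. The mean-column-norm normalisation here is a stand-in assumption for the paper's normalisation and weighting scheme.

```python
import numpy as np

def augment_with_random_units(pretrained_w, num_new, rng):
    # pretrained_w: (in_dim, out_dim). Append randomly initialised output
    # units whose column norms match the average norm of the pre-trained
    # columns, so the new units neither dominate nor vanish at the start
    # of fine-tuning.
    scale = np.linalg.norm(pretrained_w, axis=0).mean()
    new = rng.normal(size=(pretrained_w.shape[0], num_new))
    new *= scale / np.linalg.norm(new, axis=0, keepdims=True)
    return np.concatenate([pretrained_w, new], axis=1)
```

The pre-trained columns are left untouched, so the valuable source knowledge is retained while the fresh units are free to learn target-specific patterns.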
Interventional Domain Adaptation
Domain adaptation (DA) aims to transfer discriminative features learned from
source domain to target domain. Most of DA methods focus on enhancing feature
transferability through domain-invariance learning. However, the source-learned
discriminability itself might be biased by spurious correlations and thus
unsafely transferable, \emph{i.e.}, some source-specific
features are correlated with category labels. We find that standard
domain-invariance learning suffers from such correlations and incorrectly
transfers the source-specifics. To address this issue, we intervene in the
learning of feature discriminability using unlabeled target data to guide it to
get rid of the domain-specific part and be safely transferable. Concretely, we
generate counterfactual features that distinguish the domain-specifics from
domain-sharable part through a novel feature intervention strategy. To prevent
domain-specific features from persisting, the feature discriminability is
trained to be invariant to mutations in the domain-specific part of the
counterfactual features. In experiments on typical \emph{one-to-one}
unsupervised domain adaptation and on challenging domain-agnostic adaptation
tasks, the consistent performance improvements of our method over
state-of-the-art approaches validate that the learned discriminative features
are more safely transferable and generalize well to novel domains.
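A toy picture of feature intervention: overwrite the (assumed known) domain-specific dimensions of a feature vector with values from a donor sample, leaving the shared dimensions untouched. A discriminability objective that is invariant to such mutations cannot rely on the overwritten dimensions. The explicit index mask is a simplifying assumption; the paper's intervention strategy is learned, not hand-specified.

```python
import numpy as np

def counterfactual_features(features, donor, specific_idx):
    # Replace only the domain-specific dimensions with the donor's values,
    # producing a counterfactual feature whose shared content is unchanged.
    mutated = features.copy()
    mutated[:, specific_idx] = donor[:, specific_idx]
    return mutated
```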
Automatic Survey-Invariant Variable Star Classification
Machine learning techniques have been successfully used to classify variable
stars on widely-studied astronomical surveys. These datasets have been
available to astronomers long enough to allow deep analyses of many variable
sources and the generation of useful catalogs of identified variable stars.
The products of these studies are labeled data that
enable supervised learning models to be trained successfully. However, when
these models are blindly applied to data from new sky surveys their performance
drops significantly. Furthermore, unlabeled data becomes available at a much
higher rate than its labeled counterpart, since labeling is a manual and
time-consuming effort. Domain adaptation techniques aim to learn from a domain
where labeled data is available, the \textit{source domain}, and through some
adaptation perform well on a different domain, the \textit{target domain}. We
propose a full probabilistic model that represents the joint distribution of
features from two surveys as well as a probabilistic transformation of the
features from one survey to the other. This allows us to transfer labeled
data to a study where it is not available and to effectively run a variable
star classification model in a new survey. Our model represents the features of
each domain as a Gaussian mixture and models the transformation as a
translation, rotation and scaling of each separate component. We perform tests
using three different variability catalogs: EROS, MACHO, and HiTS, which
differ in, among other respects, the number of observations per star, cadence,
observational time span, and optical bands observed.
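The per-component transformation described above (translation, rotation, and scaling of each Gaussian mixture component) acts on a component's parameters in closed form: under the map x -> s R x + t, the mean becomes s R mu + t and the covariance becomes s^2 R Sigma R^T. A minimal sketch, with the parameterization as an assumption:

```python
import numpy as np

def transform_component(mean, cov, rotation, scale, translation):
    # Map one Gaussian component between surveys via x -> scale * R x + t.
    # The combined linear part A = scale * R transforms the mean affinely
    # and the covariance by congruence.
    A = scale * rotation
    return A @ mean + translation, A @ cov @ A.T
```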
Optimal Bayesian Transfer Learning
Transfer learning has recently attracted significant research attention, as
it simultaneously learns from different source domains, which have plenty of
labeled data, and transfers the relevant knowledge to the target domain with
limited labeled data to improve the prediction performance. We propose a
Bayesian transfer learning framework where the source and target domains are
related through the joint prior density of the model parameters. The modeling
of joint prior densities enables better understanding of the "transferability"
between domains. We define a joint Wishart density for the precision matrices
of the Gaussian feature-label distributions in the source and target domains to
act like a bridge that transfers the useful information of the source domain to
help classification in the target domain by improving the target posteriors.
Using several theorems in multivariate statistics, the posteriors and posterior
predictive densities are derived in closed forms with hypergeometric functions
of matrix argument, leading to our novel closed-form and fast Optimal Bayesian
Transfer Learning (OBTL) classifier. Experimental results on both synthetic and
real-world benchmark data confirm the superb performance of the OBTL compared
to other state-of-the-art transfer learning and domain adaptation methods.
Comment: IEEE Transactions on Signal Processing
Domain-Agnostic Learning with Anatomy-Consistent Embedding for Cross-Modality Liver Segmentation
Domain Adaptation (DA) has the potential to greatly help the generalization
of deep learning models. However, the current literature usually assumes to
transfer the knowledge from the source domain to a specific known target
domain. Domain Agnostic Learning (DAL) proposes a new task of transferring
knowledge from the source domain to data from multiple heterogeneous target
domains. In this work, we propose the Domain-Agnostic Learning framework with
Anatomy-Consistent Embedding (DALACE) that works on both domain-transfer and
task-transfer to learn a disentangled representation, aiming to not only be
invariant to different modalities but also preserve anatomical structures for
the DA and DAL tasks in cross-modality liver segmentation. We validated and
compared our model with state-of-the-art methods, including CycleGAN, Task
Driven Generative Adversarial Network (TD-GAN), and Domain Adaptation via
Disentangled Representations (DADR). For the DA task, our DALACE model
outperformed CycleGAN, TD-GAN, and DADR with a DSC of 0.847, compared to
0.721, 0.793, and 0.806 respectively. For the DAL task, our model improved the
DSC to 0.794, compared to 0.522, 0.719, and 0.742 for CycleGAN, TD-GAN, and
DADR. Further, we
visualized the success of disentanglement, which added human interpretability
of the learned meaningful representations. Through ablation analysis, we
specifically showed the concrete benefits of disentanglement for downstream
tasks and the role of supervision for better disentangled representation with
segmentation consistency to be invariant to domains with the proposed
Domain-Agnostic Module (DAM) and to preserve anatomical information with the
proposed Anatomy-Preserving Module (APM).
Hypothesis Disparity Regularized Mutual Information Maximization
We propose a hypothesis disparity regularized mutual information
maximization~(HDMI) approach to tackle unsupervised hypothesis transfer -- as
an effort towards unifying hypothesis transfer learning (HTL) and unsupervised
domain adaptation (UDA) -- where the knowledge from a source domain is
transferred solely through hypotheses and adapted to the target domain in an
unsupervised manner. In contrast to the prevalent HTL and UDA approaches that
typically use a single hypothesis, HDMI employs multiple hypotheses to leverage
the underlying distributions of the source and target hypotheses. To better
utilize the crucial relationship among different hypotheses -- as opposed to
unconstrained optimization of each hypothesis independently -- while adapting
to the unlabeled target domain through mutual information maximization, HDMI
incorporates a hypothesis disparity regularization that coordinates the target
hypotheses to jointly learn better target representations while preserving more
transferable source knowledge with better-calibrated prediction uncertainty.
HDMI achieves state-of-the-art adaptation performance on benchmark datasets for
UDA in the context of HTL, without the need to access the source data during
the adaptation.
Comment: Accepted to AAAI 2021
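One simple way to picture a hypothesis disparity term is as the mean squared deviation of each hypothesis's predicted distribution from the ensemble mean; minimising it coordinates the hypotheses rather than optimising each independently. This particular disparity measure is an illustrative assumption, not necessarily the regularizer used in HDMI.

```python
import numpy as np

def softmax(logits):
    shifted = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return shifted / shifted.sum(axis=-1, keepdims=True)

def hypothesis_disparity(logits_per_hypothesis):
    # Stack each hypothesis's class distribution and measure how far each
    # one deviates from the ensemble average; zero iff all hypotheses agree.
    probs = np.stack([softmax(l) for l in logits_per_hypothesis])
    return float(np.mean((probs - probs.mean(axis=0)) ** 2))
```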