Factorized Adversarial Networks for Unsupervised Domain Adaptation
In this paper, we propose Factorized Adversarial Networks (FAN) to solve
unsupervised domain adaptation problems for image classification tasks. For both the source and target domains, our networks map the data distribution into a latent feature space that is factorized into a domain-specific subspace, which captures domain-specific characteristics, and a task-specific subspace, which retains category information. Unsupervised domain
adaptation is achieved by adversarial training to minimize the discrepancy
between the distributions of two task-specific subspaces from source and target
domains. We demonstrate that the proposed approach outperforms state-of-the-art
methods on multiple benchmark datasets used in the literature for unsupervised
domain adaptation. Furthermore, we collect two real-world tagging datasets that are much larger than existing benchmark datasets, and obtain significant improvements over the baselines, demonstrating the practical value of our approach.
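To make the factorization concrete, here is a minimal PyTorch sketch of the idea as described above: the latent code is split into a domain-specific part and a task-specific part, and a discriminator is trained adversarially so the task-specific parts of the two domains become indistinguishable. All module names, sizes, and the split-in-half layout are illustrative assumptions, not the authors' actual FAN architecture.

```python
# Hedged sketch: factorized latent space + adversarial alignment of
# the task-specific subspaces. Sizes/architectures are assumptions.
import torch
import torch.nn as nn

class FactorizedEncoder(nn.Module):
    def __init__(self, in_dim=2048, latent_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 1024), nn.ReLU(),
                                      nn.Linear(1024, latent_dim))

    def forward(self, x):
        z = self.backbone(x)
        half = z.shape[1] // 2
        return z[:, :half], z[:, half:]   # (domain-specific, task-specific)

classifier = nn.Linear(256, 31)           # category head on task features
discriminator = nn.Sequential(nn.Linear(256, 64), nn.ReLU(),
                              nn.Linear(64, 1))  # source vs. target

def adversarial_step(enc, xs, ys, xt, bce=nn.BCEWithLogitsLoss(),
                     ce=nn.CrossEntropyLoss()):
    _, ts = enc(xs)                        # task-specific source features
    _, tt = enc(xt)                        # task-specific target features
    cls_loss = ce(classifier(ts), ys)      # supervised loss, source only
    # the discriminator tries to tell domains apart from task features
    d_loss = bce(discriminator(ts.detach()), torch.ones(len(ts), 1)) + \
             bce(discriminator(tt.detach()), torch.zeros(len(tt), 1))
    # the encoder is trained to fool it, shrinking the domain discrepancy
    g_loss = bce(discriminator(tt), torch.ones(len(tt), 1))
    return cls_loss, d_loss, g_loss
```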
Domain Adaptation from Synthesis to Reality in Single-model Detector for Video Smoke Detection
This paper proposes a method for video smoke detection using synthetic smoke samples. Virtual data can automatically provide precise and richly annotated samples. However, the learning of smoke representations is hurt by the appearance gap between real and synthetic smoke samples. Existing research mainly adapts features extracted from the original annotated samples, treating object detection and domain adaptation as two independent parts. To train a strong detector on rich synthetic samples, we instead build the adaptation into the detection layer of state-of-the-art single-model detectors (SSD and MS-CNN). The training procedure is end-to-end: classification, localization, and adaptation are combined in the learning. The proposed model surpasses the original baseline in our experiments. Meanwhile, our results show that detectors based on adversarial adaptation are superior to detectors based on discrepancy adaptation. Code will be made publicly available at http://smoke.ustc.edu.cn. Moreover, domain adaptation for a two-stage detector is described in Appendix A.
Comment: The manuscript, approved by all authors, is our original work and was previously submitted to Pattern Recognition for peer review. There are 4532 words, 6 figures and 1 table in this manuscript.
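As a rough illustration of combining classification, localization, and adaptation in one end-to-end objective, the following hedged PyTorch sketch adds an adversarial domain loss on detection-layer features to a detector's usual losses. The pooling, the weight lam, and the assumption that the detector's own losses are supplied externally are simplifications, not the paper's exact construction.

```python
# Hedged sketch: joint detection + adversarial adaptation objective.
# cls_loss/loc_loss are the detector's usual losses on annotated
# synthetic data (computed elsewhere); feat_s/feat_t are detection-layer
# feature maps of shape (N, C, H, W) from synthetic and real frames.
import torch
import torch.nn.functional as F

def total_loss(cls_loss, loc_loss, feat_s, feat_t, domain_clf, lam=0.1):
    ps = domain_clf(feat_s.mean(dim=(2, 3)))   # synthetic -> label 1
    pt = domain_clf(feat_t.mean(dim=(2, 3)))   # real      -> label 0
    adapt = F.binary_cross_entropy_with_logits(ps, torch.ones_like(ps)) + \
            F.binary_cross_entropy_with_logits(pt, torch.zeros_like(pt))
    # single end-to-end objective: classification + localization + adaptation
    return cls_loss + loc_loss + lam * adapt
```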
Multiple Subspace Alignment Improves Domain Adaptation
We present a novel unsupervised domain adaptation (DA) method for
cross-domain visual recognition. Though subspace methods have found success in
DA, their performance is often limited due to the assumption of approximating
an entire dataset using a single low-dimensional subspace. Instead, we develop
a method to effectively represent the source and target datasets via a
collection of low-dimensional subspaces, and subsequently align them by
exploiting the natural geometry of the space of subspaces, the Grassmann
manifold. We demonstrate the effectiveness of this approach, using empirical
studies on two widely used benchmarks, with state-of-the-art domain adaptation performance.
Comment: under review at ICASSP 2019
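A small NumPy/scikit-learn sketch of the core idea may help: represent each domain by a collection of per-cluster PCA subspaces rather than one global subspace, and compare them via principal angles, the natural geometry on the Grassmann manifold. Cluster counts, dimensions, and the nearest-subspace matching rule are illustrative assumptions.

```python
# Hedged sketch: multiple per-cluster subspaces + Grassmann distances.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def cluster_subspaces(X, n_clusters=4, dim=10):
    """Fit one low-dimensional PCA basis per k-means cluster."""
    labels = KMeans(n_clusters, n_init=10).fit_predict(X)
    bases = []
    for k in range(n_clusters):
        pca = PCA(n_components=dim).fit(X[labels == k])
        bases.append(pca.components_.T)        # (d, dim) orthonormal basis
    return bases

def grassmann_dist(U, V):
    """Geodesic distance from principal angles between two subspaces."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)  # cosines of angles
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    return np.linalg.norm(theta)

# toy usage: match each source subspace to its nearest target subspace
src = cluster_subspaces(np.random.randn(400, 50))
tgt = cluster_subspaces(np.random.randn(400, 50))
pairs = [(i, min(range(len(tgt)), key=lambda j: grassmann_dist(src[i], tgt[j])))
         for i in range(len(src))]
```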
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up for a
possible solution accordingly
Deep Grassmann Manifold Optimization for Computer Vision
In this work, we propose methods that advance four areas in the field of computer vision: dimensionality reduction, deep feature embeddings, visual domain adaptation, and deep neural network compression. We combine concepts from the fields of manifold geometry and deep learning to develop cutting-edge methods in each of these areas. Each of the methods proposed in this work achieves state-of-the-art results in our experiments. We propose the Proxy Matrix Optimization (PMO) method for optimization over orthogonal matrix manifolds, such as the Grassmann manifold. This optimization technique is designed to be highly flexible, enabling it to be leveraged in many situations where traditional manifold optimization methods cannot be used.
We first use PMO in the field of dimensionality reduction, where we propose an iterative optimization approach to Principal Component Analysis (PCA) in a framework called Proxy Matrix Optimization based PCA (PM-PCA). We also demonstrate how PM-PCA can be used to solve the general L_p-PCA problem, a variant of PCA that uses arbitrary fractional norms, which can be more robust to outliers. We then present Cascaded Projection (CaP), a method which uses tensor compression based on PMO to reduce the number of filters in deep neural networks. This, in turn, reduces the number of computational operations required to process each image with the network. Cascaded Projection is the first end-to-end trainable method for network compression that uses standard backpropagation to learn the optimal tensor compression. In the area of deep feature embeddings, we introduce Deep Euclidean Feature Representations through Adaptation on the Grassmann manifold (DEFRAG), which leverages PMO. The DEFRAG method improves the feature embeddings learned by deep neural networks through the use of auxiliary loss functions and Grassmann manifold optimization. Lastly, in the area of visual domain adaptation, we propose Manifold-Aligned Label Transfer for Domain Adaptation (MALT-DA) to transfer knowledge from samples in a known domain to an unknown domain based on cross-domain cluster correspondences.
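The proxy-matrix idea is simple enough to sketch. In this hedged PyTorch example, an unconstrained proxy matrix is optimized with ordinary SGD and re-projected onto the set of orthonormal bases (a point on the Grassmann manifold) at each step; the SVD projection and the PCA-like toy objective are assumptions, not the dissertation's exact PMO algorithm.

```python
# Hedged sketch of proxy-matrix optimization over orthonormal bases.
import torch

def project_orthonormal(A):
    # nearest matrix with orthonormal columns (polar/SVD projection);
    # gradients flow through the (differentiable) SVD
    U, _, Vh = torch.linalg.svd(A, full_matrices=False)
    return U @ Vh

d, k = 50, 5
proxy = torch.randn(d, k, requires_grad=True)   # unconstrained proxy matrix
X = torch.randn(400, d)                         # toy data
opt = torch.optim.SGD([proxy], lr=1e-2)

for step in range(200):
    W = project_orthonormal(proxy)              # current manifold point
    loss = -(X @ W).pow(2).sum() / len(X)       # maximize projected variance
    opt.zero_grad()
    loss.backward()
    opt.step()
```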
A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation
Recently, DNN model compression based on network architecture design, e.g., SqueezeNet, has attracted a lot of attention. No accuracy drop on image classification is observed on these extremely compact networks, compared to well-known models. An emerging question, however, is whether these model compression techniques hurt a DNN's learning abilities beyond classifying images on a single dataset.
Our preliminary experiment shows that these compression methods could degrade
domain adaptation (DA) ability, though the classification performance is
preserved. Therefore, we propose a new compact network architecture and
unsupervised DA method in this paper. The DNN is built on a new basic module
Conv-M which provides more diverse feature extractors without significantly
increasing parameters. The unified framework of our DA method will
simultaneously learn invariance across domains, reduce divergence of feature
representations, and adapt label prediction. Our DNN has 4.1M parameters, which
is only 6.7% of AlexNet or 59% of GoogLeNet. Experiments show that our DNN
obtains GoogLeNet-level accuracy both on classification and DA, and our DA
method slightly outperforms previous competitive ones. Putting it all together, our DA strategy based on our DNN achieves state-of-the-art performance on sixteen of eighteen DA tasks on the popular Office-31 and Office-Caltech datasets.
Comment: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'17)
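The abstract does not spell out Conv-M's internals, but the stated goal, more diverse feature extractors without significantly more parameters, suggests a multi-branch block. The following PyTorch sketch is purely an assumption-laden illustration with plain, dilated, and pooled branches; the paper's actual Conv-M composition may differ.

```python
# Hedged, hypothetical Conv-M-style block: cheap parallel branches
# with distinct receptive fields, concatenated along channels.
import torch
import torch.nn as nn

class ConvM(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        b = out_ch // 3
        self.plain   = nn.Conv2d(in_ch, b, 3, padding=1)
        self.dilated = nn.Conv2d(in_ch, b, 3, padding=2, dilation=2)
        self.pooled  = nn.Sequential(nn.AvgPool2d(3, stride=1, padding=1),
                                     nn.Conv2d(in_ch, out_ch - 2 * b, 1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # concatenate branch outputs along the channel dimension
        return self.act(torch.cat([self.plain(x), self.dilated(x),
                                   self.pooled(x)], dim=1))

# usage: ConvM(64, 96)(torch.randn(1, 64, 32, 32)).shape == (1, 96, 32, 32)
```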
Unsupervised Domain Adaptation Using Approximate Label Matching
Domain adaptation addresses the problem created when training data is
generated by a so-called source distribution, but test data is generated by a
significantly different target distribution. In this work, we present
approximate label matching (ALM), a new unsupervised domain adaptation
technique that creates and leverages a rough labeling on the test samples, then
uses these noisy labels to learn a transformation that aligns the source and
target samples. We show that the transformation estimated by ALM has favorable
properties compared to transformations estimated by other methods, which do not
use any kind of target labeling. Our model is regularized by requiring that a
classifier trained to discriminate source from transformed target samples
cannot distinguish between the two. We experiment with ALM on simulated and
real data, and show that it outperforms techniques commonly used in the field.
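A rough NumPy/scikit-learn sketch of the approximate-label-matching idea: obtain a noisy labeling of the target with a source-trained classifier, then fit a linear map that sends target samples toward same-(pseudo)label source samples. The class-mean least-squares fit is a simplifying assumption and omits ALM's adversarial regularizer.

```python
# Hedged sketch: pseudo-label the target, then align it to the source.
import numpy as np
from sklearn.linear_model import LogisticRegression

def alm_transform(Xs, ys, Xt):
    clf = LogisticRegression(max_iter=1000).fit(Xs, ys)
    yt_hat = clf.predict(Xt)                     # rough, noisy target labeling
    classes = [c for c in np.unique(ys) if (yt_hat == c).any()]
    Ms = np.stack([Xs[ys == c].mean(0) for c in classes])       # source means
    Mt = np.stack([Xt[yt_hat == c].mean(0) for c in classes])   # target means
    # linear map T minimizing ||Mt @ T - Ms||_F (least squares), so
    # transformed target samples line up with the source by class
    T, *_ = np.linalg.lstsq(Mt, Ms, rcond=None)
    return T, clf

# usage: predictions on the adapted target are clf.predict(Xt @ T)
```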
Domain-Adversarial Training of Neural Networks
We introduce a new representation learning approach for domain adaptation, in
which data at training and test time come from similar but different
distributions. Our approach is directly inspired by the theory on domain
adaptation suggesting that, for effective domain transfer to be achieved,
predictions must be made based on features that cannot discriminate between the
training (source) and test (target) domains. The approach implements this idea
in the context of neural network architectures that are trained on labeled data
from the source domain and unlabeled data from the target domain (no labeled
target-domain data is necessary). As the training progresses, the approach
promotes the emergence of features that are (i) discriminative for the main
learning task on the source domain and (ii) indiscriminate with respect to the
shift between the domains. We show that this adaptation behaviour can be
achieved in almost any feed-forward model by augmenting it with a few standard
layers and a new gradient reversal layer. The resulting augmented architecture
can be trained using standard backpropagation and stochastic gradient descent,
and can thus be implemented with little effort using any of the deep learning
packages. We demonstrate the success of our approach for two distinct
classification problems (document sentiment analysis and image classification),
where state-of-the-art domain adaptation performance on standard benchmarks is
achieved. We also validate the approach for the descriptor learning task in the context of a person re-identification application.
Comment: Published in JMLR: http://jmlr.org/papers/v17/15-239.html
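The gradient reversal layer the abstract describes is compact enough to sketch directly in PyTorch: identity on the forward pass, sign-flipped and scaled gradient on the backward pass, so the feature extractor learns to confuse the domain classifier while everything trains with standard backpropagation. The lambda schedule used in the paper is omitted here.

```python
# Gradient reversal layer: identity forward, negated gradient backward.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)            # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # flip and scale the gradient flowing into the feature extractor
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# usage: domain_logits = domain_head(grad_reverse(features, lam))
```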
Recent Progresses in Deep Learning based Acoustic Models (Updated)
In this paper, we summarize recent progress made in deep learning based
acoustic models and the motivation and insights behind the surveyed techniques.
We first discuss acoustic models that can effectively exploit variable-length
contextual information, such as recurrent neural networks (RNNs), convolutional
neural networks (CNNs), and their various combination with other models. We
then describe acoustic models that are optimized end-to-end, with an emphasis on feature representations learned jointly with the rest of the system, the
connectionist temporal classification (CTC) criterion, and the attention-based
sequence-to-sequence model. We further illustrate robustness issues in speech
recognition systems, and discuss acoustic model adaptation, speech enhancement
and separation, and robust training strategies. We also cover modeling
techniques that lead to more efficient decoding and discuss possible future
directions in acoustic model research.
Comment: This is an updated version, with the latest literature up to ICASSP 2018, of the paper: Dong Yu and Jinyu Li, "Recent Progresses in Deep Learning based Acoustic Models," IEEE/CAA Journal of Automatica Sinica, vol. 4, no. 3, 2017
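Since the survey highlights the connectionist temporal classification (CTC) criterion, a minimal example of computing a CTC loss with PyTorch's built-in torch.nn.CTCLoss may be useful; all shapes and the vocabulary size are illustrative.

```python
# Minimal CTC loss example with PyTorch's built-in criterion.
import torch

T, N, C = 50, 4, 20                 # time steps, batch size, classes (0 = blank)
log_probs = torch.randn(T, N, C).log_softmax(2)          # model outputs
targets = torch.randint(1, C, (N, 10), dtype=torch.long) # label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 10, dtype=torch.long)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```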
SALT: Subspace Alignment as an Auxiliary Learning Task for Domain Adaptation
Unsupervised domain adaptation aims to transfer and adapt knowledge learned
from a labeled source domain to an unlabeled target domain. Key components of
unsupervised domain adaptation include: (a) maximizing performance on the
target, and (b) aligning the source and target domains. Traditionally, these
tasks have either been considered as separate, or assumed to be implicitly
addressed together with high-capacity feature extractors. When considered
separately, alignment is usually viewed as a problem of aligning data
distributions, either through geometric approaches such as subspace alignment
or through distributional alignment such as optimal transport. This paper presents a hybrid approach: we assume simplified data geometry in the form of subspaces and treat alignment as an auxiliary task to the primary task of maximizing performance on the source. Leveraging this tractable subspace geometry keeps the alignment simple, and we synergistically allow certain parameters derived from the closed-form auxiliary solution to be affected by gradients from the primary task. The proposed
approach represents a unique fusion of geometric and model-based alignment with
gradients from a data-driven primary task. Our approach, termed SALT, is a simple framework that matches and sometimes outperforms the state of the art on multiple standard benchmarks.
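A hedged PyTorch sketch of subspace alignment as an auxiliary task: PCA bases Us and Ut yield the classic closed-form alignment matrix M = Us^T Ut, source features are projected through Us M before classification, and primary-task gradients can flow back through this closed-form solution. Which parameters actually receive those gradients in SALT is not specified here, so treat this purely as an illustration.

```python
# Hedged sketch: closed-form subspace alignment inside the training loop.
import torch

def pca_basis(X, k):
    X = X - X.mean(0, keepdim=True)
    _, _, Vh = torch.linalg.svd(X, full_matrices=False)
    return Vh[:k].T                              # (d, k) orthonormal basis

def salt_step(feat_s, ys, feat_t, classifier, k=32,
              ce=torch.nn.CrossEntropyLoss()):
    Us, Ut = pca_basis(feat_s, k), pca_basis(feat_t, k)
    M = Us.T @ Ut                                # closed-form alignment matrix
    zs = feat_s @ (Us @ M)                       # source mapped toward target subspace
    # primary-task loss; gradients can flow back through the closed form
    return ce(classifier(zs), ys)
```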