Learning from Label Proportions with Generative Adversarial Networks
In this paper, we leverage generative adversarial networks (GANs) to derive
an effective algorithm, LLP-GAN, for learning from label proportions (LLP),
where only bag-level proportion information about the labels is available.
With its end-to-end structure, LLP-GAN performs approximation through an
adversarial learning mechanism, without imposing restrictive distributional
assumptions. Accordingly, the final instance-level classifier can be induced
directly from the discriminator. Under mild assumptions, we give an explicit
generative representation and prove the global optimality of LLP-GAN.
Moreover, compared with existing methods, our approach inherits the
scalability of deep models. Experiments on several benchmark datasets
demonstrate the clear advantages of the proposed approach.
Comment: Accepted as a conference paper at NeurIPS 201
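To make the LLP setting concrete, the bag-level supervision can be written as a cross-entropy between a bag's average predicted class distribution and its known label proportions. The NumPy sketch below is a generic proportion loss, not LLP-GAN's adversarial objective; the logits and proportions are made-up illustrative values.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def bag_proportion_loss(logits, proportions):
    """Cross-entropy between the bag's average predicted class
    distribution and its known label proportions -- the only
    supervision available in the LLP setting."""
    avg_pred = softmax(logits).mean(axis=0)  # (num_classes,)
    return -np.sum(proportions * np.log(avg_pred + 1e-12))

# A bag of 4 instances with 2 classes and known proportions 75%/25%.
bag_logits = np.array([[2.0, -1.0], [1.5, -0.5], [3.0, 0.0], [-1.0, 2.0]])
loss = bag_proportion_loss(bag_logits, np.array([0.75, 0.25]))
```

The loss is minimized exactly when the bag's average prediction equals the given proportions, which is why instance-level labels are never needed.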
Normalized Wasserstein Distance for Mixture Distributions with Applications in Adversarial Learning and Domain Adaptation
Understanding proper distance measures between distributions is at the core
of several learning tasks such as generative models, domain adaptation,
clustering, etc. In this work, we focus on mixture distributions that arise
naturally in several application domains where the data contains different
sub-populations. For mixture distributions, established distance measures such
as the Wasserstein distance do not take into account imbalanced mixture
proportions. Thus, even if two mixture distributions have identical mixture
components but different mixture proportions, the Wasserstein distance between
them will be large. This often leads to undesired results in distance-based
learning methods for mixture distributions. In this paper, we resolve this
issue by introducing the Normalized Wasserstein measure. The key idea is to
introduce mixture proportions as optimization variables, effectively
normalizing mixture proportions in the Wasserstein formulation. Using the
proposed normalized Wasserstein measure leads to significant performance gains
for mixture distributions with imbalanced mixture proportions compared to the
vanilla Wasserstein distance. We demonstrate the effectiveness of the proposed
measure in GANs, domain adaptation, and adversarial clustering on several
benchmark datasets.
Comment: Accepted at ICCV 201
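The failure mode described above is easy to reproduce in one dimension. The pure-NumPy toy below (with made-up Gaussian components) shows that the vanilla Wasserstein-1 distance between two mixtures with identical components but swapped proportions is large, and that rebalancing the proportions before comparing, which is the essence of the normalized measure, drives it toward zero. The threshold-based component split is an illustrative shortcut, not the paper's optimization over proportion variables.

```python
import numpy as np

rng = np.random.default_rng(0)
comp = lambda mu, n: rng.normal(mu, 0.1, size=n)

# Identical components N(0, 0.1) and N(5, 0.1), swapped proportions.
p = np.concatenate([comp(0.0, 900), comp(5.0, 100)])
q = np.concatenate([comp(0.0, 100), comp(5.0, 900)])

def w1(a, b):
    """1-D Wasserstein-1 distance between equal-size samples."""
    return np.abs(np.sort(a) - np.sort(b)).mean()

w_vanilla = w1(p, q)  # large: driven purely by the proportion mismatch

def rebalance(x, thresh=2.5, n=500):
    """Resample each component to a common 50/50 proportion."""
    lo, hi = x[x < thresh], x[x >= thresh]
    return np.concatenate([rng.choice(lo, n), rng.choice(hi, n)])

w_norm = w1(rebalance(p), rebalance(q))  # near zero
```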
Partly Supervised Multitask Learning
Semi-supervised learning has recently been attracting attention as an
alternative to fully supervised models that require large pools of labeled
data. Moreover, optimizing a model for multiple tasks can provide better
generalizability than single-task learning. Leveraging self-supervision and
adversarial training, we propose a novel general purpose semi-supervised,
multiple-task model---namely, self-supervised, semi-supervised, multitask
learning (SMTL)---for accomplishing two important tasks in medical imaging,
segmentation and diagnostic classification. Experimental results on chest and
spine X-ray datasets suggest that our SMTL model significantly outperforms
semi-supervised single-task, semi-/fully-supervised multitask, and
fully-supervised single-task models, even with a 50% reduction in class and
segmentation labels. We hypothesize that our proposed model can be effective
in tackling limited-annotation problems for joint training, not only in
medical imaging domains but also for general-purpose vision tasks.
Comment: 10 pages, 8 figures, 3 tables
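To make the multitask setup concrete, a common way to jointly train segmentation and classification heads over a shared backbone is a weighted sum of a Dice term and a cross-entropy term. The NumPy sketch below is that generic objective only; it omits SMTL's self-supervised and adversarial components, and the weight `alpha` and the toy inputs are hypothetical.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a binary segmentation map."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def multitask_loss(seg_pred, seg_target, cls_pred, cls_target, alpha=0.5):
    """Weighted sum of a segmentation (Dice) term and a
    classification (cross-entropy) term over a shared backbone."""
    ce = -np.sum(cls_target * np.log(cls_pred + 1e-12))
    return alpha * dice_loss(seg_pred, seg_target) + (1.0 - alpha) * ce

seg_t = np.zeros((8, 8)); seg_t[2:6, 2:6] = 1.0
loss = multitask_loss(seg_t * 0.9, seg_t,
                      np.array([0.8, 0.2]), np.array([1.0, 0.0]))
```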
Unsupervised Learning for Cell-level Visual Representation in Histopathology Images with Generative Adversarial Networks
The visual attributes of cells, such as the nuclear morphology and chromatin
openness, are critical for histopathology image analysis. By learning
cell-level visual representation, we can obtain a rich mix of features that are
highly reusable for various tasks, such as cell-level classification, nuclei
segmentation, and cell counting. In this paper, we propose a unified generative
adversarial network architecture with a new loss formulation to perform
robust cell-level visual representation learning in an unsupervised setting.
Our model is not only label-free and easily trained but also capable of
cell-level unsupervised classification with interpretable visualization, which
achieves promising results in the unsupervised classification of bone marrow
cellular components. Based on the proposed cell-level visual representation
learning, we further develop a pipeline that exploits the varieties of cellular
elements to perform histopathology image classification, the advantages of
which are demonstrated on bone marrow datasets.
Comment: Accepted for publication in IEEE Journal of Biomedical and Health Informatics
Unpaired Photo-to-Caricature Translation on Faces in the Wild
Recently, much progress has been made in image-to-image translation owing to
the success of conditional Generative Adversarial Networks (cGANs), and
unpaired methods based on a cycle-consistency loss, such as DualGAN, CycleGAN,
and DiscoGAN, have become popular. However, translation tasks that require
high-level visual information conversion remain very challenging, such as
photo-to-caricature translation, which demands satire, exaggeration,
lifelikeness, and artistry. We present an approach for learning to translate
faces in the wild from the source photo domain to the target caricature domain
in different styles, which can also be applied to other high-level
image-to-image translation tasks. To capture global structure along with local
statistics during translation, we design a dual-pathway model with one coarse
discriminator and one fine discriminator. For the generator, we add a
perceptual loss alongside the adversarial and cycle-consistency losses to
achieve representation learning for the two domains. Style can also be learned
from an auxiliary noise input. Experiments on photo-to-caricature translation
of faces in the wild show a considerable performance gain of our proposed
method over state-of-the-art translation methods, as well as its potential for
real applications.
Comment: 28 pages, 11 figures
On Target Shift in Adversarial Domain Adaptation
Discrepancy between training and testing domains is a fundamental problem in
the generalization of machine learning techniques. Recently, several approaches
have been proposed to learn domain invariant feature representations through
adversarial deep learning. However, label shift, where the percentage of data
in each class is different between domains, has received less attention. Label
shift naturally arises in many contexts, especially in behavioral studies where
the behaviors are freely chosen. In this work, we propose a method called
Domain Adversarial nets for Target Shift (DATS) to address label shift while
learning a domain invariant representation. This is accomplished by using
distribution matching to estimate label proportions in a blind test set. We
extend this framework to handle multiple domains by developing a scheme to
upweight the source domains most similar to the target domain. Empirical
results show that this framework performs well under large label shift in
synthetic and real experiments, demonstrating its practical importance.
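The distribution-matching step can be illustrated with a moment-matching toy: under the target-shift assumption, the class-conditional feature distributions are shared across domains, so the target feature mean is a proportion-weighted combination of the known class means, and the proportions can be recovered by least squares. All data below is synthetic, and this is a simplified stand-in for DATS's actual matching procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
# Hypothetical class-conditional feature means, assumed shared
# across domains (the target-shift assumption).
class_means = rng.normal(size=(3, d))

# A blind target set with unknown, shifted label proportions.
true_p = np.array([0.6, 0.3, 0.1])
target_feats = np.concatenate([
    rng.normal(class_means[c], 0.5, size=(int(6000 * true_p[c]), d))
    for c in range(3)
])

# Moment matching: solve class_means.T @ p ~= mean(target_feats).
mu_t = target_feats.mean(axis=0)
p_hat, *_ = np.linalg.lstsq(class_means.T, mu_t, rcond=None)
p_hat = np.clip(p_hat, 0.0, None)
p_hat /= p_hat.sum()  # project back onto the probability simplex
```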
Data augmentation for low resource sentiment analysis using generative adversarial networks
Sentiment analysis is a task that may suffer from a lack of data in certain
cases, as the datasets are often generated and annotated by humans. In cases
where data is inadequate for training discriminative models, generative models
may aid training via data augmentation. Generative Adversarial Networks (GANs)
are one such model, and they have advanced the state of the art in several
tasks, including image and text generation. In this paper, I train GAN models on
low resource datasets, then use them for the purpose of data augmentation
towards improving sentiment classifier generalization. Given the constraints of
limited data, I explore various techniques to train the GAN models. I also
present an analysis of the quality of generated GAN data as more training data
for the GAN is made available. In this analysis, the generated data is
evaluated as a test set (against a model trained on real data points) as well
as a training set to train classification models. Finally, I also conduct a
visual analysis by projecting the generated and the real data into a
two-dimensional space using the t-Distributed Stochastic Neighbor Embedding
(t-SNE) method.
Comment: Accepted to the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 201
Conditional Infilling GANs for Data Augmentation in Mammogram Classification
Deep learning approaches to breast cancer detection in mammograms have
recently shown promising results. However, such models are constrained by the
limited size of publicly available mammography datasets, in large part due to
privacy concerns and the high cost of generating expert annotations. Limited
dataset size is further exacerbated by substantial class imbalance since
"normal" images dramatically outnumber those with findings. Given the rapid
progress of generative models in synthesizing realistic images, and the known
effectiveness of simple data augmentation techniques (e.g. horizontal
flipping), we ask if it is possible to synthetically augment mammogram datasets
using generative adversarial networks (GANs). We train a class-conditional GAN
to perform contextual in-filling, which we then use to synthesize lesions onto
healthy screening mammograms. First, we show that GANs are capable of
generating high-resolution synthetic mammogram patches. Next, we experimentally
evaluate using the augmented dataset to improve breast cancer classification
performance. We observe that a ResNet-50 classifier trained with GAN-augmented
training data produces a higher AUROC compared to the same model trained only
on traditionally augmented data, demonstrating the potential of our approach.
Comment: To appear in MICCAI 2018, Breast Image Analysis Workshop
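The contextual in-filling step can be pictured as masked compositing: the generator's output replaces only the masked region of an otherwise healthy image. In the sketch below a constant array stands in for the GAN output, and the mask geometry is made up.

```python
import numpy as np

def composite_infill(real, generated, mask):
    """Keep the real image outside the mask; paste the generator's
    synthesized content inside it."""
    return mask * generated + (1.0 - mask) * real

real = np.ones((8, 8))       # stand-in for a healthy image patch
gen = np.full((8, 8), 0.3)   # stand-in for the GAN's in-filled output
mask = np.zeros((8, 8)); mask[2:6, 2:6] = 1.0
out = composite_infill(real, gen, mask)
```

Because the healthy context is preserved exactly, the classifier only ever sees synthetic pixels where a lesion was deliberately placed.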
Virtual Conditional Generative Adversarial Networks
When trained on multimodal image datasets, ordinary Generative Adversarial
Networks (GANs) are usually outperformed by class-conditional GANs and
ensemble GANs, but conditional GANs are restricted to labeled datasets and
ensemble GANs lack efficiency. We propose a novel GAN variant called the
virtual conditional GAN (vcGAN), which is not only an ensemble GAN with
multiple generative paths that adds almost zero network parameters, but also a
conditional GAN that can be trained on unlabeled datasets without explicit
clustering steps or objectives other than the adversarial loss. Inside the
vcGAN's generator, a learnable "analog-to-digital converter (ADC)" module maps
a slice of the input multivariate Gaussian noise to discrete/digital noise (a
virtual label), according to which a selector chooses the corresponding
generative path to produce the sample. All generative paths share the same
decoder network, and in each path the decoder is fed a concatenation of a
different pre-computed, amplified one-hot vector and the input Gaussian noise.
Experiments on several balanced and imbalanced image datasets demonstrate that
vcGAN converges faster and achieves an improved Fréchet Inception Distance
(FID). In addition, we show that, as a training byproduct, the ADC learns the
categorical probability of each mode and that each generative path generates
samples of a specific mode, which enables class-conditional sampling. Code is
available at https://github.com/annonnymmouss/vcgan
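One plausible reading of the ADC module is sketched below in NumPy: an argmax over a slice of the Gaussian noise yields the discrete virtual label, whose amplified one-hot encoding is concatenated back onto the noise before the shared decoder. The slice size, amplification factor, and argmax rule are assumptions here; the paper's exact mechanism may differ.

```python
import numpy as np

def adc_select(z, n_paths, amplify=10.0):
    """Map a slice of the noise to a discrete 'virtual label' and
    build the decoder input: amplified one-hot ++ original noise."""
    label = int(np.argmax(z[:n_paths]))  # discrete/digital noise
    one_hot = np.zeros(n_paths)
    one_hot[label] = amplify             # pre-computed amplified one-hot
    return label, np.concatenate([one_hot, z])

rng = np.random.default_rng(0)
z = rng.normal(size=16)
label, decoder_input = adc_select(z, n_paths=4)
```

Since the label is a deterministic function of the noise, sampling a path requires no extra parameters, which matches the "almost zero network parameters" claim.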
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Data of different modalities generally convey complementary but heterogeneous
information, and a more discriminative representation is often preferred when
combining multiple data modalities, such as RGB and infrared features. In
reality, however, obtaining both data channels is challenging due to many
limitations. For example, RGB surveillance cameras are often restricted from
private spaces, which conflicts with the need for abnormal activity detection
for personal security. As a result, using partial data channels to build a
full multi-modal representation is clearly desirable. In this paper, we
propose novel Partial-modal Generative Adversarial Networks (PM-GANs) that
learn a full-modal representation using data from only partial modalities.
The full representation is achieved by a generated representation
in place of the missing data channel. Extensive experiments are conducted to
verify the performance of our proposed method on action recognition, compared
with four state-of-the-art methods. Meanwhile, a new Infrared-Visible Dataset
for action recognition is introduced, and will be the first publicly available
action dataset that contains paired infrared and visible spectrum