2,389 research outputs found
A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
Most existing zero-shot learning methods consider the problem as a visual
semantic embedding one. Given the demonstrated capability of Generative
Adversarial Networks(GANs) to generate images, we instead leverage GANs to
imagine unseen categories from text descriptions and hence recognize novel
classes with no examples being seen. Specifically, we propose a simple yet
effective generative model that takes as input noisy text descriptions about an
unseen class (e.g.Wikipedia articles) and generates synthesized visual features
for this class. With added pseudo data, zero-shot learning is naturally
converted to a traditional classification problem. Additionally, to preserve
the inter-class discrimination of the generated features, a visual pivot
regularization is proposed as an explicit supervision. Unlike previous methods
using complex engineered regularizers, our approach can suppress the noise well
without additional regularization. Empirically, we show that our method
consistently outperforms the state of the art on the largest available
benchmarks on Text-based Zero-shot Learning.Comment: To appear in CVPR1
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
While recent neural encoder-decoder models have shown great promise in
modeling open-domain conversations, they often generate dull and generic
responses. Unlike past work that has focused on diversifying the output of the
decoder at word-level to alleviate this problem, we present a novel framework
based on conditional variational autoencoders that captures the discourse-level
diversity in the encoder. Our model uses latent variables to learn a
distribution over potential conversational intents and generates diverse
responses using only greedy decoders. We have further developed a novel variant
that is integrated with linguistic prior knowledge for better performance.
Finally, the training procedure is improved by introducing a bag-of-word loss.
Our proposed models have been validated to generate significantly more diverse
responses than baseline approaches and exhibit competence in discourse-level
decision-making.Comment: Appeared in ACL2017 proceedings as a long paper. Correct a
calculation mistake in Table 1 E-bow & A-bow and results into higher score
Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels
In an effort to further advance semi-supervised generative and classification
tasks, we propose a simple yet effective training strategy called dual pseudo
training (DPT), built upon strong semi-supervised learners and diffusion
models. DPT operates in three stages: training a classifier on partially
labeled data to predict pseudo-labels; training a conditional generative model
using these pseudo-labels to generate pseudo images; and retraining the
classifier with a mix of real and pseudo images. Empirically, DPT consistently
achieves SOTA performance of semi-supervised generation and classification
across various settings. In particular, with one or two labels per class, DPT
achieves a Fr\'echet Inception Distance (FID) score of 3.08 or 2.52 on ImageNet
256x256. Besides, DPT outperforms competitive semi-supervised baselines
substantially on ImageNet classification tasks, achieving top-1 accuracies of
59.0 (+2.8), 69.5 (+3.0), and 74.4 (+2.0) with one, two, or five labels per
class, respectively. Notably, our results demonstrate that diffusion can
generate realistic images with only a few labels (e.g., <0.1%) and generative
augmentation remains viable for semi-supervised classification. Our code is
available at https://github.com/ML-GSAI/DPT.Comment: Accepted to NeurIPS 202
Causal-structure Driven Augmentations for Text OOD Generalization
The reliance of text classifiers on spurious correlations can lead to poor
generalization at deployment, raising concerns about their use in
safety-critical domains such as healthcare. In this work, we propose to use
counterfactual data augmentation, guided by knowledge of the causal structure
of the data, to simulate interventions on spurious features and to learn more
robust text classifiers. We show that this strategy is appropriate in
prediction problems where the label is spuriously correlated with an attribute.
Under the assumptions of such problems, we discuss the favorable sample
complexity of counterfactual data augmentation, compared to importance
re-weighting. Pragmatically, we match examples using auxiliary data, based on
diff-in-diff methodology, and use a large language model (LLM) to represent a
conditional probability of text. Through extensive experimentation on learning
caregiver-invariant predictors of clinical diagnoses from medical narratives
and on semi-synthetic data, we demonstrate that our method for simulating
interventions improves out-of-distribution (OOD) accuracy compared to baseline
invariant learning algorithms.Comment: Forthcoming in NeurIPS 202
- …