Disentanglement of Latent Representations via Sparse Causal Interventions
The process of generating data such as images is controlled by independent
and unknown factors of variation. The retrieval of these variables has been
studied extensively in the disentanglement, causal representation learning, and
independent component analysis fields. Recently, approaches merging these
domains together have shown great success. Instead of directly representing the
factors of variation, the problem of disentanglement can be seen as finding the
interventions on one image that yield a change to a single factor. Following
this assumption, we introduce a new method for disentanglement inspired by
causal dynamics that combines causality theory with vector-quantized
variational autoencoders. Our model considers the quantized vectors as causal
variables and links them in a causal graph. It performs causal interventions on
the graph and generates atomic transitions affecting a unique factor of
variation in the image. We also introduce a new task of action retrieval that
consists of finding the action responsible for the transition between two
images. We test our method on standard synthetic and real-world disentanglement
datasets. We show that it can effectively disentangle the factors of variation
and perform precise interventions on high-level semantic attributes of an image
without affecting its quality, even with imbalanced data distributions.
Comment: 16 pages (10 pages for the main paper and 6 pages for the supplement), 14 figures, submitted to IJCAI 2023. V2: added link to repository
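The core idea above — treating quantized latent vectors as causal variables and intervening on one at a time — can be illustrated with a minimal sketch. This is not the paper's implementation; the codebook size, embedding dimension, and the intervention rule are all illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: vector quantization maps each encoder output to
# its nearest codebook entry; an "atomic" causal intervention is then
# modelled as swapping a single quantized code while keeping the rest.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # 8 discrete codes, 4-dim embeddings
latents = rng.normal(size=(3, 4))    # encoder outputs for one image

def quantize(z, codebook):
    # index of the nearest codebook entry (Euclidean distance)
    dists = np.linalg.norm(z[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1)

codes = quantize(latents, codebook)

# Intervention on one causal variable: replace its code, leave the
# others untouched, so only a single factor of variation changes.
intervened = codes.copy()
intervened[0] = (codes[0] + 1) % len(codebook)
```

Decoding `intervened` instead of `codes` would, under the paper's assumptions, change exactly one high-level factor in the generated image.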
A Sequential Set Generation Method for Predicting Set-Valued Outputs
Consider a general machine learning setting where the output is a set of
labels or sequences. This output set is unordered and its size varies with the
input. Whereas multi-label classification methods seem a natural first
resort, they are not readily applicable to set-valued outputs, both because
the output space grows combinatorially and because conventional sequence
generation does not reflect the order-free nature of sets. In this paper, we propose a unified
framework--sequential set generation (SSG)--that can handle output sets of
labels and sequences. SSG is a meta-algorithm that leverages any probabilistic
learning method for label or sequence prediction, but employs a proper
regularization such that a new label or sequence is generated repeatedly until
the full set is produced. Though SSG is sequential in nature, it does not
penalize the ordering of the appearance of the set elements and can be applied
to a variety of set output problems, such as a set of classification labels or
sequences. We perform experiments with both benchmark and synthetic data sets
and demonstrate SSG's strong performance over baseline methods.
Comment: Published at AAAI 201
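The meta-algorithm described above can be sketched in a few lines. The predictor interface below is an assumption for illustration, not the paper's API: any probabilistic model that emits one element (or a stop symbol) given the elements generated so far would fit.

```python
# Illustrative sketch of sequential set generation: elements are emitted
# one at a time until a stop symbol, and the result is returned as an
# unordered set, so element order carries no meaning.
STOP = "<stop>"

def generate_set(next_element, max_steps=10):
    """next_element(generated_so_far) -> a new element or STOP."""
    out = set()
    for _ in range(max_steps):
        elem = next_element(out)
        if elem == STOP:
            break
        out.add(elem)
    return out

# Toy stand-in predictor: emits labels from a fixed pool, then stops.
pool = ["cat", "dog", "bird"]
def toy_predictor(so_far):
    for label in pool:
        if label not in so_far:
            return label
    return STOP

result = generate_set(toy_predictor)  # an unordered set of labels
```

Because the output is collected into a set, two runs that emit the same elements in different orders are equivalent, which mirrors the order-free property SSG is designed around.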
Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
We equip a smaller Language Model to generalise to answering challenging
compositional questions that have not been seen in training. To do so we
propose a combination of multitask supervised pretraining on up to 93 tasks
designed to instill diverse reasoning abilities, and a dense retrieval system
that aims to retrieve a set of evidential paragraph fragments. Recent progress
in question-answering has been achieved either through prompting methods
against very large pretrained Language Models in zero or few-shot fashion, or
by fine-tuning smaller models, sometimes in conjunction with information
retrieval. We focus on the less explored question of the extent to which
zero-shot generalisation can be enabled in smaller models with retrieval
against a corpus within which sufficient information to answer a particular
question may not exist. We establish strong baselines in this setting for
diverse evaluation datasets (StrategyQA, CommonsenseQA, IIRC, DROP, Musique and
ARC-DA), and show that performance can be significantly improved by adding
retrieval-augmented training datasets which are designed to expose our models
to a variety of heuristic reasoning strategies such as weighing partial
evidence or ignoring an irrelevant context.
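The dense retrieval component described above scores paragraph fragments against a question in embedding space. The following is a generic sketch of that scoring step, not the paper's system; the embeddings here are random stand-ins for the outputs of a trained encoder.

```python
import numpy as np

# Generic dense-retrieval scoring sketch: rank paragraph embeddings by
# cosine similarity to the question embedding and keep the top-k.
def top_k_fragments(query_emb, para_embs, k=2):
    q = query_emb / np.linalg.norm(query_emb)
    p = para_embs / np.linalg.norm(para_embs, axis=1, keepdims=True)
    scores = p @ q                    # cosine similarity per paragraph
    return np.argsort(-scores)[:k]    # indices of the k best fragments

rng = np.random.default_rng(1)
paragraphs = rng.normal(size=(5, 16))                  # stand-in corpus
question = paragraphs[3] + 0.01 * rng.normal(size=16)  # near paragraph 3

idx = top_k_fragments(question, paragraphs, k=2)
```

In the retrieval-augmented setting the abstract describes, the top-ranked fragments would be concatenated into the smaller model's input, even when no single fragment fully answers the question.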