Flow Factorized Representation Learning
A prominent goal of representation learning research is to achieve
representations which are factorized in a useful manner with respect to the
ground truth factors of variation. The fields of disentangled and equivariant
representation learning have approached this ideal from a range of
complementary perspectives; however, to date, most approaches have proven to
either be ill-specified or insufficiently flexible to effectively separate all
realistic factors of interest in a learned latent space. In this work, we
propose an alternative viewpoint on such structured representation learning
which we call Flow Factorized Representation Learning, and demonstrate it to
learn both more efficient and more usefully structured representations than
existing frameworks. Specifically, we introduce a generative model which
specifies a distinct set of latent probability paths that define different
input transformations. Each latent flow is generated by the gradient field of a
learned potential following dynamic optimal transport. Our novel setup brings
new understandings to both \textit{disentanglement} and \textit{equivariance}.
We show that our model achieves higher likelihoods on standard representation
learning benchmarks while simultaneously being closer to approximately
equivariant models. Furthermore, we demonstrate that the transformations
learned by our model are flexibly composable and can also extrapolate to new
data, implying a degree of robustness and generalizability approaching the
ultimate goal of usefully factorized representation learning.
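The mechanism described above is concrete enough to sketch: each transformation
corresponds to a flow in latent space driven by the gradient field of a learned
scalar potential. The following is a minimal sketch under that reading, not the
authors' implementation; the `Potential` module, the Euler integration scheme,
and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Potential(nn.Module):
    """Scalar potential u(z, t) whose gradient field drives one latent flow."""
    def __init__(self, latent_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, t], dim=-1)).squeeze(-1)

def flow_latent(z0: torch.Tensor, potential: Potential,
                n_steps: int = 10, dt: float = 0.1) -> torch.Tensor:
    """Euler-integrate dz/dt = grad_z u(z, t), i.e. a gradient-field flow."""
    z = z0
    for k in range(n_steps):
        z = z.detach().requires_grad_(True)
        t = torch.full((z.shape[0], 1), k * dt)
        u = potential(z, t).sum()              # sum -> scalar for autograd
        (grad_z,) = torch.autograd.grad(u, z)
        z = z + dt * grad_z
    return z.detach()

z0 = torch.randn(8, 16)                 # a batch of latent codes
z1 = flow_latent(z0, Potential(16))     # codes after one learned transformation
```

Different transformations would use different potentials; composing flows then
amounts to integrating one gradient field after another.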
Learning disentangled representations via product manifold projection
We propose a novel approach to disentangle the generative factors of
variation underlying a given set of observations. Our method builds upon the
idea that the (unknown) low-dimensional manifold underlying the data space can
be explicitly modeled as a product of submanifolds. This definition of
disentanglement gives rise to a novel weakly-supervised algorithm for
recovering the unknown explanatory factors behind the data. At training time,
our algorithm only requires pairs of non-i.i.d. data samples whose elements
share at least one, possibly multidimensional, generative factor of variation.
We require no knowledge of the nature of these transformations, and do not make
any limiting assumption on the properties of each subspace. Our approach is
easy to implement, and can be successfully applied to different kinds of data
(from images to 3D surfaces) undergoing arbitrary transformations. In addition
to standard synthetic benchmarks, we showcase our method in challenging
real-world applications, where we compare favorably with the state of the art.
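To make the weak supervision concrete, here is a hedged sketch of how a latent
space factored into subspaces (one per submanifold) can be trained from pairs
known to share a factor: the subspace codes for the shared factor are pulled
together. The subspace sizes, the consistency loss, and all names below are
illustrative assumptions, not the paper's actual objective.

```python
import torch

def subspace_consistency_loss(z_a, z_b, shared_slices):
    """Penalize disagreement on the latent subspaces the pair is known to share."""
    loss = torch.zeros(())
    for sl in shared_slices:
        loss = loss + (z_a[:, sl] - z_b[:, sl]).pow(2).mean()
    return loss

# Latent space modeled as a product of 3 submanifolds (assumed split).
subspaces = [slice(0, 4), slice(4, 10), slice(10, 16)]
z_a, z_b = torch.randn(8, 16), torch.randn(8, 16)   # codes for a data pair
# The pair shares (say) the second, possibly multidimensional, factor.
loss = subspace_consistency_loss(z_a, z_b, shared_slices=[subspaces[1]])
```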
Weakly supervised causal representation learning
Learning high-level causal representations together with a causal model from unstructured low-level data such as pixels is impossible from observational data alone. We prove under mild assumptions that this representation is, however, identifiable in a weakly supervised setting. This involves a dataset with paired samples before and after random, unknown interventions, but no further labels. We then introduce implicit latent causal models, variational autoencoders that represent causal variables and causal structure without having to optimize an explicit discrete graph structure. On simple image data, including a novel dataset of simulated robotic manipulation, we demonstrate that such models can reliably identify the causal structure and disentangle causal variables.
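The supervision signal here is nothing more than paired samples before and
after a random, unknown intervention. A toy sketch of generating such a
dataset, using an assumed two-variable linear SCM (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pair():
    # Ancestral sampling from a toy SCM: latent 0 -> latent 1.
    z1 = rng.normal()
    z2 = 0.8 * z1 + rng.normal()
    pre = np.array([z1, z2])
    # Random intervention: overwrite one variable, then re-propagate children.
    target = rng.integers(2)
    post = pre.copy()
    post[target] = rng.normal()
    if target == 0:                  # intervening on z1 also changes z2
        post[1] = 0.8 * post[0] + rng.normal()
    return pre, post                 # no label saying which variable changed

pairs = [sample_pair() for _ in range(1000)]
```

In the paper these latents are only observed through a rendering function
(pixels); the pairing, not any label, is what makes them identifiable.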
SW-VAE: Weakly Supervised Learn Disentangled Representation Via Latent Factor Swapping
Representation disentanglement is an important goal of representation
learning that benefits various downstream tasks. To achieve this goal, many
unsupervised representation disentanglement approaches have been developed.
However, training without any supervision signal has proven inadequate for
disentangled representation learning. Therefore, we propose a novel
weakly supervised training approach, named SW-VAE, which incorporates pairs
of input observations as supervision signals by using the generative factors
of datasets. Furthermore, we introduce strategies to gradually increase the
learning difficulty during training to smooth the training process. As shown
on several datasets, our model shows significant improvement over
state-of-the-art (SOTA) methods on representation disentanglement tasks.
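The "latent factor swapping" in the title suggests the core operation:
exchanging latent dimensions between the codes of a supervised pair. A minimal
sketch, with hypothetical dimension indices; decoding the swapped codes should
reconstruct the original inputs, and that reconstruction error is the signal.

```python
import torch

def swap_factors(z_a: torch.Tensor, z_b: torch.Tensor, shared_dims):
    """Exchange the latent dims for factors the pair is known to share."""
    z_a2, z_b2 = z_a.clone(), z_b.clone()
    z_a2[:, shared_dims] = z_b[:, shared_dims]
    z_b2[:, shared_dims] = z_a[:, shared_dims]
    return z_a2, z_b2

z_a, z_b = torch.randn(8, 10), torch.randn(8, 10)
# Assume dims 2 and 5 encode the factors this pair shares (illustrative).
z_a2, z_b2 = swap_factors(z_a, z_b, shared_dims=[2, 5])
```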
Disentangled Generative Causal Representation Learning
This paper proposes a Disentangled gEnerative cAusal Representation (DEAR)
learning method. Unlike existing disentanglement methods that enforce
independence of the latent variables, we consider the general case where the
underlying factors of interest can be causally correlated. We show that
previous methods with independent priors fail to disentangle causally
correlated factors. Motivated by this finding, we propose a new disentangled
learning method called DEAR that enables causal controllable generation and
causal representation learning. The key ingredient of this new formulation is
to use a structural causal model (SCM) as the prior for a bidirectional
generative model. The prior is then trained jointly with a generator and an
encoder using a suitable GAN loss incorporated with supervision. We provide
theoretical justification of the identifiability and asymptotic consistency of
the proposed method, which guarantees disentangled causal representation
learning under appropriate conditions. We conduct extensive experiments on both
synthesized and real data sets to demonstrate the effectiveness of DEAR in
causal controllable generation, and the benefits of the learned representations
for downstream tasks in terms of sample efficiency and distributional
robustness.
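The key ingredient, an SCM prior replacing the usual independent prior, can be
illustrated with a linear-Gaussian special case. DEAR's actual prior is more
general, so treat this purely as a sketch of how causally correlated latents
arise; the adjacency matrix and weights are illustrative assumptions.

```python
import torch

def scm_prior_sample(A: torch.Tensor, n: int) -> torch.Tensor:
    """Sample z satisfying z = A^T z + eps, i.e. z = (I - A^T)^{-1} eps,
    where A is a strictly triangular DAG adjacency matrix."""
    d = A.shape[0]
    eps = torch.randn(n, d)
    return eps @ torch.linalg.inv(torch.eye(d) - A.T).T

A = torch.zeros(3, 3)
A[0, 1] = 0.7    # latent 0 -> latent 1
A[0, 2] = -0.5   # latent 0 -> latent 2: latents are correlated, not independent
z = scm_prior_sample(A, n=1000)
```

Under an independent prior the off-diagonal covariance of z would be zero;
here it is not, which is exactly the regime the independence-based methods
fail to disentangle.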
Object-centric architectures enable efficient causal representation learning
Causal representation learning has shown a variety of settings in which we
can disentangle latent variables with identifiability guarantees (up to some
reasonable equivalence class). Common to all of these approaches is the
assumption that (1) the latent variables are represented as $d$-dimensional
vectors, and (2) that the observations are the output of some injective
generative function of these latent variables. While these assumptions appear
benign, we show that when the observations are of multiple objects, the
generative function is no longer injective and disentanglement fails in
practice. We can address this failure by combining recent developments in
object-centric learning and causal representation learning. By modifying the
Slot Attention architecture (arXiv:2006.15055), we develop an object-centric
architecture that leverages weak supervision from sparse perturbations to
disentangle each object's properties. This approach is more data-efficient in
the sense that it requires significantly fewer perturbations than a comparable
approach that encodes to a Euclidean space, and we show that it successfully
disentangles the properties of a set of objects in a series of simple
image-based disentanglement experiments.
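To see how sparse perturbations provide weak supervision at the slot level,
here is a schematic sketch: slots from the original and perturbed images are
matched, and the matched-slot differences are penalized to be sparse, since
only one object's property changed. Slot Attention itself is omitted, and the
matching and penalty below are assumptions about the general recipe rather
than the paper's exact loss.

```python
import torch
from scipy.optimize import linear_sum_assignment

def sparse_change_penalty(slots_a: torch.Tensor, slots_b: torch.Tensor):
    """Match slots across the pair, then L1-penalize per-slot changes so the
    perturbation is explained by as few slots (and dimensions) as possible."""
    cost = torch.cdist(slots_a, slots_b)              # (n_slots, n_slots)
    row, col = linear_sum_assignment(cost.detach().numpy())
    row, col = torch.as_tensor(row), torch.as_tensor(col)
    diffs = slots_a[row] - slots_b[col]
    return diffs.abs().sum()

slots_a = torch.randn(5, 32)   # 5 slots, 32-dim each, for image x
slots_b = torch.randn(5, 32)   # slots for the sparsely perturbed image
penalty = sparse_change_penalty(slots_a, slots_b)
```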
On the Transfer of Disentangled Representations in Realistic Settings
Learning meaningful representations that disentangle the underlying structure
of the data generating process is considered to be of key importance in machine
learning. While disentangled representations were found to be useful for
diverse tasks such as abstract reasoning and fair classification, their
scalability and real-world impact remain questionable. We introduce a new
high-resolution dataset with 1M simulated images and over 1,800 annotated
real-world images of the same setup. In contrast to previous work, this new
dataset exhibits correlations, a complex underlying structure, and allows us to
evaluate transfer to unseen simulated and real-world settings where the encoder
i) remains in distribution or ii) is out of distribution. We propose new
architectures in order to scale disentangled representation learning to
realistic high-resolution settings and conduct a large-scale empirical study of
disentangled representations on this dataset. We observe that disentanglement
is a good predictor for out-of-distribution (OOD) task performance.
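The transfer evaluation described above (a frozen encoder with a downstream
readout scored both in and out of distribution) can be sketched in a few lines;
the encoder, readout, and data here are placeholders, not the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))

def encode(x):                     # placeholder for a frozen, pretrained encoder
    return np.tanh(x @ W)

x_train = rng.normal(size=(200, 8))           # simulated (in-distribution)
x_ood = rng.normal(loc=2.0, size=(50, 8))     # shifted (out-of-distribution)
y_train, y_ood = x_train[:, 0], x_ood[:, 0]   # a ground-truth factor to predict

readout = LinearRegression().fit(encode(x_train), y_train)
score_id = readout.score(encode(x_train), y_train)
score_ood = readout.score(encode(x_ood), y_ood)  # the drop measures transfer
```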
Representation Disentanglement via Regularization by Causal Identification
In this work, we propose the use of a causal collider structured model to
describe the underlying data generative process assumptions in disentangled
representation learning. This extends the conventional i.i.d. factorization
assumption, which is inadequate for learning from biased datasets (e.g., with
sampling selection bias). The collider structure explains how conditional
dependencies between the underlying generative variables may exist even when
these variables are in reality unrelated, complicating disentanglement. Under
the rubric of causal inference, we show this issue can be reconciled under the
condition of causal identification, attainable from data together with a
combination of constraints aimed at controlling the dependencies
characteristic of the \textit{collider} model. For
this, we propose regularization by identification (ReI), a modular
regularization engine designed to align the behavior of large scale generative
models with the disentanglement constraints imposed by causal identification.
Empirical evidence on standard benchmarks demonstrates the superiority of ReI
in learning disentangled representations in a variational framework. On a
real-world dataset, we additionally show that our framework yields
interpretable representations that are robust to out-of-distribution examples
and that align with the true expected effect from domain knowledge.
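Schematically, ReI is described as a modular regularization engine: a penalty
term added to the variational objective to enforce identification constraints.
The sketch below shows only that plumbing; the off-diagonal covariance penalty
is a generic stand-in for dependency control, not the paper's actual
identification constraint.

```python
import torch

def rei_penalty(z: torch.Tensor) -> torch.Tensor:
    """Stand-in dependency control: penalize off-diagonal latent covariance.
    (The real ReI constraints come from causal identification.)"""
    zc = z - z.mean(dim=0, keepdim=True)
    cov = zc.T @ zc / (z.shape[0] - 1)
    off_diag = cov - torch.diag(torch.diagonal(cov))
    return off_diag.pow(2).sum()

def regularized_objective(elbo: torch.Tensor, z: torch.Tensor, lam: float = 1.0):
    """Variational loss = -ELBO plus the modular identification penalty."""
    return -elbo + lam * rei_penalty(z)
```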