Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification. Comment: Accepted at ICCV 2017
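The custom feature transform layer described above acts on hidden representations so that a known input transformation becomes an explicit, parameterized operation in feature space. A minimal sketch of one common choice (not necessarily the paper's exact layer) is to pair up latent dimensions and rotate each 2-D plane by the transformation parameter; the function name and latent layout here are illustrative assumptions:

```python
import numpy as np

def feature_transform(z, theta):
    """Hypothetical feature transform layer: rotate each 2-D plane of
    the latent code by angle theta, so a known transformation of the
    input becomes an explicit, controllable operation in feature space."""
    z = z.reshape(-1, 2)                 # pair up latent dimensions
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])      # 2-D rotation matrix
    return (z @ R.T).reshape(-1)

z = np.random.default_rng(0).standard_normal(8)

# Composing two feature-space rotations equals one rotation by the
# summed angle, mirroring how the encoded transformation composes.
a = feature_transform(feature_transform(z, 0.3), 0.4)
b = feature_transform(z, 0.7)
assert np.allclose(a, b)
```

Because the layer is linear and parameterized, a person or algorithm can re-render with explicit control by choosing theta before decoding.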
Holographic-(V)AE: an end-to-end SO(3)-Equivariant (Variational) Autoencoder in Fourier Space
Group-equivariant neural networks have emerged as a data-efficient approach
to solve classification and regression tasks, while respecting the relevant
symmetries of the data. However, little work has been done to extend this
paradigm to the unsupervised and generative domains. Here, we present
Holographic-(V)AE (H-(V)AE), a fully end-to-end SO(3)-equivariant (variational)
autoencoder in Fourier space, suitable for unsupervised learning and generation
of data distributed around a specified origin. H-(V)AE is trained to
reconstruct the spherical Fourier encoding of data, learning in the process a
latent space with a maximally informative invariant embedding alongside an
equivariant frame describing the orientation of the data. We extensively test
the performance of H-(V)AE on diverse datasets and show that its latent space
efficiently encodes the categorical features of spherical images and structural
features of protein atomic environments. Our work can further be seen as a case
study for equivariant modeling of a data distribution by reconstructing its
Fourier encoding.
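The "maximally informative invariant embedding" above can be illustrated with the classic rotation-invariant statistic of spherical Fourier coefficients: a rotation acts on each degree-l block by an orthogonal (Wigner-D) matrix, so per-degree squared norms are unchanged. This toy sketch (not H-(V)AE's trained encoder) checks that property for degrees l = 0, 1 in a real basis:

```python
import numpy as np

rng = np.random.default_rng(0)

def power_spectrum(coeffs):
    """Rotation-invariant embedding: the squared norm of each degree-l
    block of spherical Fourier coefficients. A rotation acts on each
    block by an orthogonal matrix, leaving these norms fixed."""
    return np.array([np.dot(c, c) for c in coeffs])

# Toy coefficients for degrees l = 0 and l = 1 (blocks of size 1 and 3).
coeffs = [rng.standard_normal(1), rng.standard_normal(3)]

# A rotation acts on the degree-1 block by some orthogonal matrix;
# here we draw a random one via QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
rotated = [coeffs[0], Q @ coeffs[1]]

assert np.allclose(power_spectrum(coeffs), power_spectrum(rotated))
```

The discarded information, the per-block orientation, is what the equivariant frame in H-(V)AE's latent space retains, so data can still be generated in a chosen pose.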
Learning the Irreducible Representations of Commutative Lie Groups
We present a new probabilistic model of compact commutative Lie groups that
produces invariant-equivariant and disentangled representations of data. To
define the notion of disentangling, we borrow a fundamental principle from
physics that is used to derive the elementary particles of a system from its
symmetries. Our model employs a newfound Bayesian conjugacy relation that
enables fully tractable probabilistic inference over compact commutative Lie
groups -- a class that includes the groups that describe the rotation and
cyclic translation of images. We train the model on pairs of transformed image
patches, and show that the learned invariant representation is highly effective
for classification.
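The cyclic translation group mentioned above gives the simplest concrete view of these irreducible representations: the DFT block-diagonalizes the shift action into 1-D complex characters, so Fourier magnitudes are shift-invariant while phases transform equivariantly. A short sketch (standard Fourier theory, not the paper's probabilistic model):

```python
import numpy as np

# The irreps of the cyclic translation group are 1-D complex characters.
# The DFT diagonalizes the shift operator, so |F| is shift-invariant
# and each phase rotates at a rate set by its frequency (equivariance).
n, shift = 16, 5
x = np.random.default_rng(1).standard_normal(n)
shifted = np.roll(x, shift)

F, Fs = np.fft.fft(x), np.fft.fft(shifted)
k = np.arange(n)

assert np.allclose(np.abs(F), np.abs(Fs))                        # invariant part
assert np.allclose(Fs, F * np.exp(-2j * np.pi * shift * k / n))  # equivariant part
```

The invariant magnitudes are the kind of representation that makes the learned features effective for classification, while the phases carry the pose information that is disentangled from identity.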
- …