Learning to Convolve: A Generalized Weight-Tying Approach
Recent work (Cohen & Welling, 2016) has shown that generalizations of
convolutions, based on group theory, provide powerful inductive biases for
learning. In these generalizations, filters are not only translated but can
also be rotated, flipped, etc. However, coming up with exact models of how to
rotate a 3 x 3 filter on a square pixel-grid is difficult. In this paper, we
learn how to transform filters for use in the group convolution, focusing on
roto-translation. For this, we learn a filter basis and all rotated versions of
that filter basis. Filters are then encoded by a set of rotation invariant
coefficients. To rotate a filter, we switch the basis. We demonstrate we can
produce feature maps with low sensitivity to input rotations, while achieving
high performance on MNIST and CIFAR-10.
Comment: Accepted to ICML 201
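The basis-switching idea can be sketched with an analytic steerable basis standing in for the learned one. Everything below (the Gaussian-derivative basis, the 3 x 3 grid, and the function names) is an illustrative assumption, not the paper's learned filters:

```python
import numpy as np

# Analytic stand-in for the learned basis: a pair of first-order
# Gaussian-derivative-style filters Gx, Gy on a 3x3 grid. This pair
# steers exactly under rotation, which is the property the paper learns.
ys, xs = np.mgrid[-1:2, -1:2].astype(float)
g = np.exp(-(xs**2 + ys**2) / 2.0)
Gx, Gy = xs * g, ys * g

def make_filter(a, b):
    """Encode a filter by its coefficients (a, b) over the basis."""
    return a * Gx + b * Gy

def rotate_filter(a, b, theta):
    """Rotate a filter by switching the basis, keeping the coefficients:
    Gx^theta = cos(t) Gx + sin(t) Gy,  Gy^theta = -sin(t) Gx + cos(t) Gy."""
    Gx_t = np.cos(theta) * Gx + np.sin(theta) * Gy
    Gy_t = -np.sin(theta) * Gx + np.cos(theta) * Gy
    return a * Gx_t + b * Gy_t

f = make_filter(1.0, 0.0)                  # a pure Gx filter
f90 = rotate_filter(1.0, 0.0, np.pi / 2)   # rotating Gx by 90 degrees gives Gy
print(np.allclose(f90, Gy))                # True
```

Because the coefficients (a, b) never change, they are the rotation-invariant encoding of the filter; only the basis rotates.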
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 201
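A minimal sketch of a feature transform layer acting on a disentangled latent, assuming an input rotation is mirrored in feature space by block-diagonal 2-D rotations of latent pairs. The pairing scheme and function names are our illustrative assumptions, not the paper's exact layer:

```python
import numpy as np

def feature_transform(z, theta):
    """Illustrative feature transform layer: split the latent z into 2-D
    sub-vectors and rotate each pair by theta, i.e. apply a block-diagonal
    rotation to the hidden representation."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    pairs = z.reshape(-1, 2)          # (d/2, 2) pairs of latent coordinates
    return (pairs @ R.T).reshape(-1)  # rotate every pair, flatten back

z = np.array([1.0, 0.0, 0.0, 2.0])
z_rot = feature_transform(z, np.pi / 2)
# pair (1, 0) -> (0, 1);  pair (0, 2) -> (-2, 0)
print(np.allclose(z_rot, [0.0, 1.0, -2.0, 0.0]))  # True
```

Because the transform is an explicit group action on the latent, a person or algorithm can re-render with controlled pose by choosing theta directly, and transforms compose: applying 0.3 then 0.4 equals applying 0.7.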
Sampling Theorems for Unsupervised Learning in Linear Inverse Problems
Solving a linear inverse problem requires knowledge about the underlying
signal model. In many applications, this model is a priori unknown and has to
be learned from data. However, it is impossible to learn the model using
observations obtained via a single incomplete measurement operator, as there is
no information outside the range of the inverse operator, resulting in a
chicken-and-egg problem: to learn the model we need reconstructed signals, but
to reconstruct the signals we need to know the model. Two ways to overcome this
limitation are using multiple measurement operators or assuming that the signal
model is invariant to a certain group action. In this paper, we present
necessary and sufficient sampling conditions for learning the signal model from
partial measurements, which depend only on the dimension of the model and the
number of operators or the properties of the group action that the model is
invariant to. As our results are agnostic to the learning algorithm, they shed
light on the fundamental limitations of learning from incomplete data and have
implications for a wide range of practical algorithms, such as dictionary
learning, matrix completion, and deep neural networks.
Comment: arXiv admin note: substantial text overlap with arXiv:2201.1215
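The chicken-and-egg limitation is easy to see numerically. In the toy setup below (ours, not the paper's), two candidate signals that differ only inside a single operator's nullspace produce identical measurements, while a second operator resolves the ambiguity:

```python
import numpy as np

# A single incomplete operator A1 measures only the first coordinate,
# so any signal component in its nullspace (the e2 direction) is invisible.
A1 = np.array([[1.0, 0.0]])          # incomplete: its range misses e2
A2 = np.array([[0.0, 1.0]])          # a second operator covers the gap

x, x_alt = np.array([1.0, 3.0]), np.array([1.0, -5.0])

# One operator: the two candidate signals are indistinguishable.
print(np.allclose(A1 @ x, A1 @ x_alt))   # True -- no info in A1's nullspace

# Two operators: the ambiguity is resolved.
print(np.allclose(A2 @ x, A2 @ x_alt))   # False
```

The sampling conditions in the paper quantify how many such operators (or how rich a group action) are needed before the signal model itself becomes learnable from measurements alone.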
Sensing Theorems for Unsupervised Learning in Linear Inverse Problems
Solving an ill-posed linear inverse problem requires knowledge about the underlying signal model. In many applications, this model is a priori unknown and has to be learned from data. However, it is impossible to learn the model using observations obtained via a single incomplete measurement operator, as there is no information about the signal model in the nullspace of the operator, resulting in a chicken-and-egg problem: to learn the model we need reconstructed signals, but to reconstruct the signals we need to know the model. Two ways to overcome this limitation are using multiple measurement operators or assuming that the signal model is invariant to a certain group action. In this paper, we present necessary and sufficient sensing conditions for learning the signal model from measurement data alone, which depend only on the dimension of the model and the number of operators or the properties of the group action that the model is invariant to. As our results are agnostic to the learning algorithm, they shed light on the fundamental limitations of learning from incomplete data and have implications for a wide range of practical algorithms, such as dictionary learning, matrix completion, and deep neural networks.
Homomorphism AutoEncoder — Learning Group Structured Representations from Observed Transitions
How agents can learn internal models that veridically represent interactions with the real world is a largely open question. As machine learning moves towards representations containing not just observational but also interventional knowledge, we study this problem using tools from representation learning and group theory. We propose methods enabling an agent acting upon the world to learn internal representations of sensory information that are consistent with actions that modify it. We use an autoencoder equipped with a group representation acting on its latent space, trained using an equivariance-derived loss to enforce a suitable homomorphism property on the group representation. In contrast to existing work, our approach does not require prior knowledge of the group and does not restrict the set of actions the agent can perform. We motivate our method theoretically, and show empirically that it can learn a group representation of the actions, thereby capturing the structure of the set of transformations applied to the environment. We further show that this allows agents to predict the effect of sequences of future actions with improved accuracy.
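The equivariance-derived loss can be sketched as follows, assuming a 2-D latent space and a rotation-matrix parameterization of the group representation. Both choices, and the identity encoder in the toy check, are illustrative simplifications of the learned setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(theta):
    """Candidate group representation on the 2-D latent space (a rotation
    matrix here; the paper learns the representation, and our single-angle
    parameterization is an assumption)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def equivariance_loss(encode, x, x_next, theta):
    """Homomorphism penalty: acting on the input and then encoding should
    match encoding first and then acting through rho on the latent."""
    return np.sum((encode(x_next) - rho(theta) @ encode(x)) ** 2)

# Toy world where the action genuinely rotates a planar observation and the
# encoder is the identity, so the loss at the true angle should be ~0.
encode = lambda obs: obs
x = rng.normal(size=2)
theta = 0.7
x_next = rho(theta) @ x              # observed transition under the action
print(equivariance_loss(encode, x, x_next, theta))   # ~0.0
```

Minimizing this loss over both the encoder and the representation is what pushes rho toward an actual homomorphism from actions to latent-space transformations.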
I2I: Image to Icosahedral Projection for Object Reasoning from Single-View Images
Reasoning about 3D objects based on 2D images is challenging due to large
variations in appearance caused by viewing the object from different
orientations. Ideally, our model would be invariant or equivariant to changes
in object pose. Unfortunately, this is typically not possible with 2D image
input because we do not have an a priori model of how the image would change
under out-of-plane object rotations. The only SO(3)-equivariant
models that currently exist require point cloud input rather than 2D images. In
this paper, we propose a novel model architecture based on icosahedral group
convolution that reasons in SO(3) by projecting the input image onto
an icosahedron. As a result of this projection, the model is approximately
equivariant to rotation in SO(3). We apply this model to an object
pose estimation task and find that it outperforms reasonable baselines.
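The projection step can be sketched very roughly as follows. Our simplification: we assign each viewing direction to the nearest icosahedron vertex rather than resampling onto its 20 faces, and we omit the group convolution that follows, so this is only the geometric skeleton of the idea:

```python
import numpy as np

# The 12 icosahedron vertices: all cyclic permutations of (0, ±1, ±phi),
# where phi is the golden ratio.
phi = (1 + np.sqrt(5)) / 2
verts = np.array([p for s1 in (1, -1) for s2 in (1, -1)
                  for p in [(0, s1, s2 * phi),
                            (s1, s2 * phi, 0),
                            (s2 * phi, 0, s1)]], dtype=float)
verts /= np.linalg.norm(verts, axis=1, keepdims=True)   # unit vertices

def project(direction):
    """Index of the icosahedron vertex nearest to a viewing direction:
    the cell of the sphere this direction's pixel would land in."""
    d = np.asarray(direction, dtype=float)
    return int(np.argmax(verts @ (d / np.linalg.norm(d))))

# A direction pointing straight at vertex 0 projects to vertex 0.
print(project(verts[0]))   # 0
```

Binning image rays into such spherical cells is what lets the subsequent icosahedral group convolution treat the image as a signal on (a discretization of) the sphere.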
MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning
This paper introduces MDP homomorphic networks for deep reinforcement
learning. MDP homomorphic networks are neural networks that are equivariant
under symmetries in the joint state-action space of an MDP. Current approaches
to deep reinforcement learning do not usually exploit knowledge about such
structure. By building this prior knowledge into policy and value networks
using an equivariance constraint, we can reduce the size of the solution space.
We specifically focus on group-structured symmetries (invertible
transformations). Additionally, we introduce an easy method for constructing
equivariant network layers numerically, so the system designer need not solve
the constraints by hand, as is typically done. We construct MDP homomorphic
MLPs and CNNs that are equivariant under either a group of reflections or
rotations. We show that such networks converge faster than unstructured
baselines on CartPole, a grid world, and Pong.
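One standard numerical construction for such equivariant layers is group averaging. The sketch below (our construction, not necessarily the paper's exact procedure) symmetrizes a linear policy layer under a CartPole-style reflection symmetry, where negating the state should swap the two action logits:

```python
import numpy as np

rng = np.random.default_rng(1)

# Reflection group {e, s} acting on a 4-D state (s negates the state) and
# on 2 action logits (s swaps left/right).
rho_in = [np.eye(4), -np.eye(4)]
P = np.array([[0.0, 1.0], [1.0, 0.0]])
rho_out = [np.eye(2), P]

# Symmetrize an unconstrained layer by averaging over the group:
# W_eq = (1/|G|) * sum_g rho_out(g) W rho_in(g)^{-1}.
W = rng.normal(size=(2, 4))
W_eq = sum(ro @ W @ np.linalg.inv(ri) for ro, ri in zip(rho_out, rho_in)) / 2

x = rng.normal(size=4)
# Equivariance: transforming the state transforms the policy logits.
print(np.allclose(W_eq @ (-x), P @ (W_eq @ x)))   # True
```

Averaging works for any finite group representation pair and removes the need to solve the equivariance constraints by hand, which is the convenience the paper's numerical method also provides.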
Bispectral Neural Networks
We present a neural network architecture, Bispectral Neural Networks (BNNs),
for learning representations that are invariant to the actions of compact
commutative groups on the space over which a signal is defined. The model
incorporates the ansatz of the bispectrum, an analytically defined group
invariant that is complete -- that is, it preserves all signal structure while
removing only the variation due to group actions. Here, we demonstrate that
BNNs are able to simultaneously learn groups, their irreducible
representations, and corresponding complete invariant maps purely from the
symmetries implicit in data. Further, we demonstrate that the completeness
property endows these networks with strong adversarial robustness. This work
establishes Bispectral Neural Networks as a powerful computational primitive
for robust invariant representation learning.
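The invariance the ansatz builds on can be checked directly for the cyclic translation group using the classical analytic bispectrum. This sketch evaluates the fixed invariant; BNNs instead learn the group and the invariant map from data:

```python
import numpy as np

def bispectrum(x):
    """Bispectrum of a 1-D signal under cyclic translation:
    B(k1, k2) = F(k1) F(k2) conj(F(k1 + k2 mod n)).
    The translation phases cancel, so B is shift-invariant."""
    F = np.fft.fft(x)
    n = len(x)
    k = np.arange(n)
    return F[:, None] * F[None, :] * np.conj(F[(k[:, None] + k[None, :]) % n])

rng = np.random.default_rng(2)
x = rng.normal(size=8)
x_shift = np.roll(x, 3)                 # group action: cyclic shift

# Invariance is easy to verify numerically:
print(np.allclose(bispectrum(x), bispectrum(x_shift)))   # True
```

Completeness is the stronger property the paper emphasizes: unlike the power spectrum, the bispectrum retains all signal structure except the group variation, which is what underlies the reported adversarial robustness.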