Learning SO(3) Equivariant Representations with Spherical CNNs
We address the problem of 3D rotation equivariance in convolutional neural
networks. 3D rotations have been a challenging nuisance in 3D classification
tasks requiring higher capacity and extended data augmentation in order to
tackle it. We model 3D data with multi-valued spherical functions and we
propose a novel spherical convolutional network that implements exact
convolutions on the sphere by realizing them in the spherical harmonic domain.
Resulting filters have local symmetry and are localized by enforcing smooth
spectra. We apply a novel pooling on the spectral domain and our operations are
independent of the underlying spherical resolution throughout the network. We
show that networks with much lower capacity and without requiring data
augmentation can exhibit performance comparable to the state of the art in
standard retrieval and classification benchmarks. Comment: Camera-ready. Accepted to ECCV'18 as an oral presentation.
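The key idea above (exact convolution on the sphere via the spherical harmonic domain, plus spectral pooling) can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes a zonal (rotationally symmetric) filter, for which spherical convolution reduces to rescaling each degree-l block of coefficients by sqrt(4*pi/(2l+1)) * h_{l,0}; the coefficient layout and names are assumptions.

```python
import numpy as np

def spherical_conv_spectral(f_lm, h_l):
    """Convolution on the sphere, computed exactly in the spectral domain.

    f_lm: list where f_lm[l] is a complex array of shape (2l+1,) holding
    the degree-l spherical harmonic coefficients of the signal (m = -l..l).
    h_l: h_l[l] is the zonal filter's coefficient h_{l,0}.
    For a zonal filter, (f * h)_{l,m} = sqrt(4*pi/(2l+1)) * h_{l,0} * f_{l,m}.
    """
    out = []
    for l, block in enumerate(f_lm):
        scale = np.sqrt(4.0 * np.pi / (2 * l + 1)) * h_l[l]
        out.append(scale * block)
    return out

def spectral_pool(f_lm, new_bandlimit):
    """Pooling in the spectral domain: truncate to a lower bandlimit.

    This depends only on the coefficients, not on any particular spherical
    sampling grid, matching the resolution-independence noted above.
    """
    return f_lm[:new_bandlimit]
```

Because each degree is only rescaled, the operation commutes with 3D rotations, which mix coefficients only within a degree; this is the source of the SO(3) equivariance.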
I2I: Image to Icosahedral Projection for Object Reasoning from Single-View Images
Reasoning about 3D objects based on 2D images is challenging due to large
variations in appearance caused by viewing the object from different
orientations. Ideally, our model would be invariant or equivariant to changes
in object pose. Unfortunately, this is typically not possible with 2D image
input because we do not have an a priori model of how the image would change
under out-of-plane object rotations. The only SO(3)-equivariant
models that currently exist require point cloud input rather than 2D images. In
this paper, we propose a novel model architecture based on icosahedral group
convolution that reasons in SO(3) by projecting the input image onto
an icosahedron. As a result of this projection, the model is approximately
equivariant to rotation in SO(3). We apply this model to an object
pose estimation task and find that it outperforms reasonable baselines.
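The operation underlying icosahedral group convolution can be sketched for any finite group via its Cayley table. A self-contained illustration, not the paper's code: the icosahedral group has 60 elements, so we substitute a small cyclic group purely to keep the example short; all names here are assumptions.

```python
import numpy as np

def group_conv(f, psi, mul, inv):
    """Convolution over a finite group: (f * psi)(g) = sum_h f(h) psi(g^{-1} h).

    f, psi: signals on the group, indexed by element id 0..n-1.
    mul: n x n Cayley table, mul[a, b] = a * b.
    inv: inv[a] = a^{-1}.
    """
    n = len(f)
    out = np.zeros(n)
    for g in range(n):
        for h in range(n):
            out[g] += f[h] * psi[mul[inv[g], h]]
    return out

# Stand-in group: the cyclic group Z_5, encoded by its Cayley table.
n = 5
mul = np.fromfunction(lambda a, b: (a + b) % n, (n, n), dtype=int)
inv = np.array([(-a) % n for a in range(n)])
```

Left-translating the input, (L_u f)(g) = f(u^{-1} g), translates the output the same way; that equivariance is what the model inherits (approximately) after projecting the 2D image onto the icosahedron.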
Equivariance with Learned Canonicalization Functions
Symmetry-based neural networks often constrain the architecture in order to
achieve invariance or equivariance to a group of transformations. In this
paper, we propose an alternative that avoids this architectural constraint by
learning to produce canonical representations of the data. These
canonicalization functions can readily be plugged into non-equivariant backbone
architectures. We offer explicit ways to implement them for some groups of
interest. We show that this approach enjoys universality while providing
interpretable insights. Our main hypothesis, supported by our empirical
results, is that learning a small neural network to perform canonicalization is
better than using predefined heuristics. Our experiments show that learning the
canonicalization function is competitive with existing techniques for learning
equivariant functions across many tasks, including image classification,
N-body dynamics prediction, point cloud classification and part segmentation,
while being faster across the board. Comment: 21 pages, 5 figures.
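The canonicalization idea above can be sketched for 2D rotations: a canonicalization function predicts a pose for the input, the input is mapped into that canonical pose, and then any non-equivariant backbone can be applied. In the paper this function is a small learned network; the stand-in below (angle of the centroid-to-farthest-point direction) is a hand-written heuristic used purely for illustration.

```python
import numpy as np

def canonical_angle(points):
    """Illustrative stand-in for a learned canonicalization function:
    the angle of the direction from the centroid to the farthest point."""
    d = points - points.mean(axis=0)
    i = int(np.argmax((d ** 2).sum(axis=1)))
    return np.arctan2(d[i, 1], d[i, 0])

def canonicalize(points, canon_fn=canonical_angle):
    """Rotate the cloud by the negative of the predicted angle, so the
    distinguished direction always lands on the x-axis. The output can
    then be fed to any non-equivariant backbone."""
    t = -canon_fn(points)
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return points @ R.T
```

Rotating the input cloud changes the predicted angle by exactly the same rotation, so the canonicalized output is unchanged; invariance of the full pipeline follows without any constraint on the backbone.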