Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 201
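As a rough illustration of what a feature transform layer acting on hidden representations could look like, the sketch below rotates consecutive pairs of latent features by a known angle. This is an assumption made for illustration, not the authors' confirmed implementation: the function name `feature_transform`, the pairing of features, and the use of plain NumPy are all hypothetical.

```python
import numpy as np

def feature_transform(z, theta):
    """Rotate consecutive pairs of latent features by angle theta (radians).

    Hypothetical sketch: the latent vector is viewed as a stack of 2D
    sub-vectors, and each one is multiplied by the same 2x2 rotation matrix.
    """
    z = np.asarray(z, dtype=float)
    if z.size % 2 != 0:
        raise ValueError("latent vector length must be even")
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    pairs = z.reshape(-1, 2)                    # one row per 2D sub-vector
    return (pairs @ rot.T).reshape(z.shape)     # apply rot to every row

# Toy latent code standing in for an encoder output.
z = np.array([1.0, 0.0, 0.5, -2.0])
z_rot = feature_transform(z, np.pi / 2)
```

With this construction, applying the transform for angle a and then for angle b gives the same result as applying it once for a + b, so interpolating the angle corresponds to a predictable path in feature space.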
Salient Local 3D Features for 3D Shape Retrieval
In this paper we describe a new formulation of 3D salient local features,
based on the voxel grid and inspired by the Scale Invariant Feature Transform
(SIFT). We use it to identify the salient keypoints (invariant points) on a 3D
voxelized model and calculate invariant 3D local feature descriptors at these
keypoints. We then use the bag of words approach on the 3D local features to
represent the 3D models for shape retrieval. An advantage of the method is
that it can be applied to rigid as well as articulated and deformable 3D
models. Finally, we apply the approach to 3D shape retrieval on the McGill
articulated shape benchmark and compare the retrieval results with those of
other methods.
Comment: Three-Dimensional Imaging, Interaction, and Measurement. Edited by
Beraldin, J. Angelo; Cheok, Geraldine S.; McCarthy, Michael B.;
Neuschaefer-Rube, Ulrich; Baskurt, Atilla M.; McDowall, Ian E.; Dolinsky,
Margaret. Proceedings of the SPIE, Volume 7864, pp. 78640S-78640S-8 (2011).
Conference Location: San Francisco Airport, California, USA ISBN:
9780819484017 Date: 10 March 201
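The bag-of-words step mentioned in the abstract above can be sketched as follows: local descriptors from a collection are clustered into a codebook, and each model is then represented by a normalized histogram of its nearest codewords. This is a minimal sketch under stated assumptions. The descriptors here are random stand-ins, the plain k-means loop is a generic choice rather than the paper's procedure, and the names `build_codebook` and `bow_histogram` are hypothetical.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster descriptors into k codewords with a basic k-means loop."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[idx].copy()
    for _ in range(iters):
        # Distance from every descriptor to every center.
        d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:            # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Represent one model as a normalized histogram over codewords."""
    d = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)
    return counts / counts.sum()

rng = np.random.default_rng(1)
all_desc = rng.normal(size=(200, 8))        # stand-in for descriptors pooled over a collection
codebook = build_codebook(all_desc, k=5)
model_hist = bow_histogram(rng.normal(size=(30, 8)), codebook)
```

Two models can then be compared by any histogram distance, which is what makes the representation usable for retrieval regardless of how many keypoints each model produced.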
Learning SO(3) Equivariant Representations with Spherical CNNs
We address the problem of 3D rotation equivariance in convolutional neural
networks. 3D rotations have been a challenging nuisance in 3D classification
tasks, requiring higher capacity and extended data augmentation to handle
them. We model 3D data with multi-valued spherical functions and we
propose a novel spherical convolutional network that implements exact
convolutions on the sphere by realizing them in the spherical harmonic domain.
Resulting filters have local symmetry and are localized by enforcing smooth
spectra. We apply a novel pooling on the spectral domain and our operations are
independent of the underlying spherical resolution throughout the network. We
show that networks with much lower capacity and without requiring data
augmentation can exhibit performance comparable to the state of the art in
standard retrieval and classification benchmarks.
Comment: Camera-ready. Accepted to ECCV'18 as oral presentation
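Realizing a spherical convolution in the harmonic domain can be sketched as a per-degree multiplication of coefficients. The example below uses the standard convolution theorem for zonal (axially symmetric) filters, in which the output coefficient of degree l and order m is the input coefficient scaled by 2π·sqrt(4π/(2l+1)) and by the filter's order-0 coefficient. The normalization constant depends on the harmonic convention, and the data layout, the function names, the decaying example filter, and the truncation-based pooling are all assumptions made for illustration, not the paper's code.

```python
import numpy as np

def spherical_conv_spectral(f_hat, h_zonal):
    """Convolve a spherical signal with a zonal filter in the spectral domain.

    f_hat   : list where f_hat[l] holds the 2l+1 coefficients of degree l
    h_zonal : array where h_zonal[l] is the filter's (l, m=0) coefficient
    """
    out = []
    for l, f_l in enumerate(f_hat):
        scale = 2.0 * np.pi * np.sqrt(4.0 * np.pi / (2 * l + 1))
        out.append(scale * np.asarray(f_l) * h_zonal[l])
    return out

def truncate_bandwidth(f_hat, new_bandwidth):
    """Hypothetical spectral pooling: keep only the lowest degrees."""
    return f_hat[:new_bandwidth]

bandwidth = 4
rng = np.random.default_rng(0)
# Random complex coefficients standing in for a transformed input signal.
f_hat = [rng.normal(size=2 * l + 1) + 1j * rng.normal(size=2 * l + 1)
         for l in range(bandwidth)]
h_zonal = np.exp(-0.5 * np.arange(bandwidth))   # smooth, decaying spectrum (assumed)
g_hat = spherical_conv_spectral(f_hat, h_zonal)
g_pooled = truncate_bandwidth(g_hat, 2)
```

Because every operation acts on coefficients rather than on samples, nothing in the sketch depends on the resolution of a spherical sampling grid.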
Low-rank SIFT: An Affine Invariant Feature for Place Recognition
In this paper, we present a novel affine-invariant feature based on SIFT,
leveraging the regular appearance of man-made objects. The feature achieves
full affine invariance without needing to simulate over affine parameter space.
Low-rank SIFT, as we name the feature, is based on our observation that local
tilts, which are caused by changes in camera axis orientation, can be
normalized by converting local patches to standard low-rank forms. Rotation,
translation, and scaling invariance can be achieved in ways similar to SIFT.
As an extension of SIFT, our method adds a prior to solve the ill-posed
affine parameter estimation problem and normalizes them directly, and is
applicable to objects with regular structures. Furthermore, owing to recent
breakthroughs in convex optimization, such parameters can be computed
efficiently. We demonstrate its effectiveness in place recognition, our
major application. As extra contributions, we also describe our pipeline of
constructing geotagged building database from the ground up, as well as an
efficient scheme for automatic feature selection.
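The intuition behind a "standard low-rank form" can be checked numerically: a regular pattern aligned with the image axes has very low matrix rank, and a tilt-like distortion raises both the rank and the nuclear norm (the sum of singular values, a common convex surrogate for rank). The sketch below only demonstrates that observation on a synthetic stripe patch. It does not implement the paper's normalization, and the patch size, stripe period, and shear value are arbitrary choices for illustration.

```python
import numpy as np

def stripe_patch(size=32, period=8, shear=0.0):
    """Synthetic patch of vertical stripes, optionally sheared along the rows."""
    rows, cols = np.meshgrid(np.arange(size), np.arange(size), indexing="ij")
    return np.sin(2.0 * np.pi * (cols + shear * rows) / period)

aligned = stripe_patch(shear=0.0)     # regular structure seen head-on
tilted = stripe_patch(shear=0.5)      # same structure after a shear

# Numerical rank with an explicit tolerance, and nuclear norm of each patch.
rank_aligned = np.linalg.matrix_rank(aligned, tol=1e-8)
rank_tilted = np.linalg.matrix_rank(tilted, tol=1e-8)
nuc_aligned = np.linalg.norm(aligned, "nuc")
nuc_tilted = np.linalg.norm(tilted, "nuc")
```

Both patches have the same overall energy, so the larger nuclear norm of the sheared patch reflects its extra rank rather than a change in contrast.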
- …