Learning to Convolve: A Generalized Weight-Tying Approach
Recent work (Cohen & Welling, 2016) has shown that generalizations of
convolutions, based on group theory, provide powerful inductive biases for
learning. In these generalizations, filters are not only translated but can
also be rotated, flipped, etc. However, coming up with exact models of how to
rotate a 3 x 3 filter on a square pixel-grid is difficult. In this paper, we
learn how to transform filters for use in the group convolution, focussing on
roto-translation. For this, we learn a filter basis and all rotated versions of
that filter basis. Filters are then encoded by a set of rotation invariant
coefficients. To rotate a filter, we switch the basis. We demonstrate we can
produce feature maps with low sensitivity to input rotations, while achieving
high performance on MNIST and CIFAR-10. Comment: Accepted to ICML 201
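The basis-switching idea in this abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: the basis is random rather than learned, only exact 90-degree rotations are used (so the rotated bases can be computed rather than learned), and the helper names `rot90` and `combine` are hypothetical. The point is only that a filter stored as coefficients over a basis is rotated by reusing the same coefficients with the rotated basis.

```python
import random


def rot90(patch):
    """Rotate a square patch (list of rows) by 90 degrees counter-clockwise."""
    n = len(patch)
    return [[patch[c][n - 1 - r] for c in range(n)] for r in range(n)]


def combine(coeffs, basis):
    """Build a filter as the linear combination sum_k coeffs[k] * basis[k]."""
    n = len(basis[0])
    return [
        [sum(a * b[r][c] for a, b in zip(coeffs, basis)) for c in range(n)]
        for r in range(n)
    ]


random.seed(0)
K = 4  # number of basis filters (arbitrary choice for this sketch)

# Stand-in for a learned 3 x 3 filter basis (assumption: random values).
basis = [[[random.gauss(0, 1) for _ in range(3)] for _ in range(3)] for _ in range(K)]

# rotated_bases[g] holds every basis element rotated by g * 90 degrees.
rotated_bases = [basis]
for _ in range(3):
    rotated_bases.append([rot90(b) for b in rotated_bases[-1]])

# The coefficients are shared by all rotated copies of the filter.
coeffs = [random.gauss(0, 1) for _ in range(K)]

filt = combine(coeffs, rotated_bases[0])      # original filter
filt_rot = combine(coeffs, rotated_bases[1])  # same coefficients, rotated basis
```

Because rotation is linear, `filt_rot` matches `rot90(filt)` here. For rotations that do not map the pixel grid onto itself, no such exact operator is available, which is the case the abstract proposes to handle by learning the rotated bases.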
Principled Design and Implementation of Steerable Detectors
We provide a complete pipeline for the detection of patterns of interest in
an image. In our approach, the patterns are assumed to be adequately modeled by
a known template, and are located at unknown position and orientation. We
propose a continuous-domain additive image model, where the analyzed image is
the sum of the template and an isotropic background signal with self-similar
isotropic power-spectrum. The method is able to learn an optimal steerable
filter fulfilling the SNR criterion based on one single template and background
pair, that therefore strongly responds to the template, while optimally
decoupling from the background model. The proposed filter then allows for a
fast detection process, with the unknown orientation estimation through the use
of steerability properties. In practice, the implementation requires
discretizing the continuous-domain formulation on polar grids, which is performed
using radial B-splines. We demonstrate the practical usefulness of our method
on a variety of template approximation and pattern detection experiments.
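Orientation estimation through steerability can be sketched with the classic first-derivative-of-Gaussian pair. This is not the optimal SNR-based filter or the radial B-spline discretization described in the abstract; the grid size, `sigma`, and all function names below are assumptions chosen for illustration. The sketch shows that a filter at any angle is a linear combination of two fixed filters, so the best-matching angle follows from two inner products.

```python
import math


def grid(half=4):
    """Integer sample points of a (2*half+1) x (2*half+1) patch centred at 0."""
    return [(x, y) for y in range(-half, half + 1) for x in range(-half, half + 1)]


def gx(x, y, sigma=1.5):
    """x-derivative of a Gaussian (up to a constant factor)."""
    return -x * math.exp(-(x * x + y * y) / (2 * sigma ** 2))


def gy(x, y, sigma=1.5):
    """y-derivative of a Gaussian (up to a constant factor)."""
    return -y * math.exp(-(x * x + y * y) / (2 * sigma ** 2))


def steered(theta, pts):
    """Filter rotated by theta, written as cos(theta)*gx + sin(theta)*gy."""
    return [math.cos(theta) * gx(x, y) + math.sin(theta) * gy(x, y) for x, y in pts]


def estimate_orientation(patch, pts):
    """Angle maximising the steered response, from two basis responses only."""
    rx = sum(p * gx(x, y) for p, (x, y) in zip(patch, pts))
    ry = sum(p * gy(x, y) for p, (x, y) in zip(patch, pts))
    # response(theta) = cos(theta)*rx + sin(theta)*ry peaks at atan2(ry, rx)
    return math.atan2(ry, rx)


pts = grid()
true_theta = math.radians(40)
patch = steered(true_theta, pts)  # noise-free "image" containing the rotated template
est_theta = estimate_orientation(patch, pts)
```

In this noise-free setting the estimate recovers the true angle up to floating-point error, without evaluating the filter at a dense set of orientations.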
Deformable kernels for early vision
Early vision algorithms often have a first stage of linear filtering that 'extracts' from the image information at multiple scales of resolution and multiple orientations. A common difficulty in the design and implementation of such schemes is that one feels compelled to discretize coarsely the space of scales and orientations in order to reduce computation and storage costs. A technique is presented that allows: 1) computing the best approximation of a given family using linear combinations of a small number of 'basis' functions; and 2) describing all finite-dimensional families, i.e., the families of filters for which a finite-dimensional representation is possible with no error. The technique is based on singular value decomposition and may be applied to generating filters in arbitrary dimensions and subject to arbitrary deformations. The relevant functional analysis results are reviewed and precise conditions for the decomposition to be feasible are stated. Experimental results are presented that demonstrate the applicability of the technique to generating multi-orientation, multi-scale 2D edge-detection kernels. The implementation issues are also discussed.
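The SVD-based approximation described above can be sketched on a discretized family of oriented kernels. The sketch assumes NumPy is available; the kernel shape, the `aspect` parameter, the grid size, and the function names are illustrative assumptions, not taken from the paper. Each row of the matrix is one kernel of the family, and the singular values give the error of the best approximation by k basis functions.

```python
import numpy as np


def oriented_kernel(theta, half=5, sigma=1.5, aspect=3.0):
    """Edge-like kernel at angle theta; aspect > 1 elongates it along the edge."""
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    u = xs * np.cos(theta) + ys * np.sin(theta)    # across the edge
    v = -xs * np.sin(theta) + ys * np.cos(theta)   # along the edge
    return -u * np.exp(-(u ** 2 + (v / aspect) ** 2) / (2 * sigma ** 2))


def family_matrix(n_orient=64, aspect=3.0):
    """Stack the flattened kernels of the family, one orientation per row."""
    thetas = np.linspace(0, 2 * np.pi, n_orient, endpoint=False)
    return np.stack([oriented_kernel(t, aspect=aspect).ravel() for t in thetas])


def relative_errors(family):
    """errors[k] = relative Frobenius error of the best k-function approximation."""
    s = np.linalg.svd(family, compute_uv=False)
    # tail[k] = sqrt(sum of squared singular values with index >= k)
    tail = np.sqrt(np.cumsum((s ** 2)[::-1])[::-1])
    return tail / tail[0]


errors_elongated = relative_errors(family_matrix(aspect=3.0))
errors_isotropic = relative_errors(family_matrix(aspect=1.0))
```

With `aspect=1.0` the kernel reduces to a first derivative of a Gaussian, an example of a finite-dimensional family: two basis functions reproduce every orientation, so `errors_isotropic[2]` is zero up to rounding. The elongated family is not covered by two functions, and `errors_elongated` shows how the error falls as more basis functions are kept.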
Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM
We introduce a new rotationally invariant viewing angle classification method
for identifying, among a large number of Cryo-EM projection images, similar
views without prior knowledge of the molecule. Our rotationally invariant
features are based on the bispectrum. Each image is denoised and compressed
using steerable principal component analysis (PCA) such that rotating an image
is equivalent to phase shifting the expansion coefficients. Thus we are able to
extend the theory of bispectrum of 1D periodic signals to 2D images. The
randomized PCA algorithm is then used to efficiently reduce the dimensionality
of the bispectrum coefficients, enabling fast computation of the similarity
between any pair of images. The nearest neighbors provide an initial
classification of similar viewing angles. In this way, rotational alignment is
only performed for images with their nearest neighbors. The initial nearest
neighbor classification and alignment are further improved by a new
classification method called vector diffusion maps. Our pipeline for viewing
angle classification and alignment is experimentally shown to be faster and
more accurate than reference-free alignment with rotationally invariant K-means
clustering, MSA/MRA 2D classification, and their modern approximations.
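The invariance that motivates the bispectrum features can be checked in the 1D periodic case the abstract refers to. This sketch covers only that 1D case, not the steerable PCA expansion or the rest of the pipeline; the function names and the test signal are assumptions. A circular shift multiplies each Fourier coefficient by a phase, and those phases cancel in the product F[k1] * F[k2] * conj(F[k1 + k2]).

```python
import cmath
import math


def dft(signal):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def bispectrum(signal):
    """B[k1][k2] = F[k1] * F[k2] * conj(F[(k1 + k2) mod n])."""
    f = dft(signal)
    n = len(f)
    return [
        [f[k1] * f[k2] * f[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


def max_abs_diff(a, b):
    """Largest entrywise difference between two bispectra."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


signal = [0.3, 1.2, -0.7, 2.0, 0.1, -1.5, 0.8, 0.4]
shift = 3
shifted = signal[shift:] + signal[:shift]  # circular shift of the same signal

difference = max_abs_diff(bispectrum(signal), bispectrum(shifted))
```

Here `difference` is zero up to floating-point error, while the Fourier coefficients of the two signals differ in phase. Since the abstract states that rotating an image phase-shifts its steerable PCA coefficients, the same cancellation is what makes the resulting features rotationally invariant.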