Convolutional neural network architecture for geometric matching
We address the problem of determining correspondences between two images in
agreement with a geometric model such as an affine or thin-plate spline
transformation, and estimating its parameters. The contributions of this work
are three-fold. First, we propose a convolutional neural network architecture
for geometric matching. The architecture is based on three main components that
mimic the standard steps of feature extraction, matching and simultaneous
inlier detection and model parameter estimation, while being trainable
end-to-end. Second, we demonstrate that the network parameters can be trained
from synthetically generated imagery without the need for manual annotation and
that our matching layer significantly increases generalization capabilities to
never seen before images. Finally, we show that the same model can perform both
instance-level and category-level matching giving state-of-the-art results on
the challenging Proposal Flow dataset.

Comment: In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017).
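The matching component described above can be sketched as a dense correlation layer: every feature of one image is compared with every feature of the other, and the resulting similarity volume is what a regression network would map to transformation parameters. A minimal NumPy sketch (the shapes and the regression step are assumptions, not the paper's exact architecture):

```python
import numpy as np

def matching_layer(feat_a, feat_b):
    """Correlation-style matching layer (a sketch of the idea):
    compare every feature in image A with every feature in image B.
    feat_a, feat_b: (h, w, d) L2-normalised dense feature maps."""
    h, w, d = feat_a.shape
    a = feat_a.reshape(-1, d)            # (h*w, d)
    b = feat_b.reshape(-1, d)            # (h*w, d)
    corr = a @ b.T                       # (h*w, h*w) pairwise similarities
    # Each spatial position of A now carries its full similarity profile
    # over B; a regression CNN would map this volume to the parameters of
    # an affine (6-dof) or thin-plate spline transform.
    return corr.reshape(h, w, h * w)

rng = np.random.default_rng(0)
fa = rng.normal(size=(4, 4, 8))
fa /= np.linalg.norm(fa, axis=-1, keepdims=True)
corr = matching_layer(fa, fa)
print(corr.shape)  # (4, 4, 16)
```

Matching an image against itself, as here, yields a similarity of 1.0 at each position's own location, which is a quick sanity check on the normalisation.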
Class-Based Feature Matching Across Unrestricted Transformations
We develop a novel method for class-based feature matching across large changes in viewing conditions. The method is based on the property that when objects share a similar part, the similarity is preserved across viewing conditions. Given a feature and a training set of object images, we first identify the subset of objects that share this feature. The transformation of the feature's appearance across viewing conditions is determined mainly by properties of the feature, rather than of the object in which it is embedded. Therefore, the transformed feature will be shared by approximately the same set of objects. Based on this consistency requirement, corresponding features can be reliably identified from a set of candidate matches. Unlike previous approaches, the proposed scheme compares feature appearances only in similar viewing conditions, rather than across different viewing conditions. As a result, the scheme is not restricted to locally planar objects or affine transformations. The approach also does not require examples of correct matches. We show that by using the proposed method, a dense set of accurate correspondences can be obtained. Experimental comparisons demonstrate that matching accuracy is significantly improved over previous schemes. Finally, we show that the scheme can be successfully used for invariant object recognition.
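The consistency requirement above can be illustrated with a small sketch: a query feature and a correct candidate match should appear in approximately the same subset of training objects, so candidates can be scored by the overlap of those subsets. The Jaccard score and all names below are my own illustration, not the paper's formulation:

```python
def consistency_match(query_sets, candidate_sets):
    """Score each candidate match by how well its set of supporting
    objects overlaps the query feature's set (Jaccard overlap).
    query_sets / candidate_sets: dicts mapping feature id -> set of
    training objects in which that feature was found."""
    def jaccard(a, b):
        return len(a & b) / len(a | b) if a | b else 0.0

    matches = {}
    for qf, qset in query_sets.items():
        # Pick the candidate whose object subset is most consistent.
        matches[qf] = max(candidate_sets,
                          key=lambda cf: jaccard(qset, candidate_sets[cf]))
    return matches

query = {"q1": {"car", "bus"}}
cands = {"c1": {"car", "bus", "van"}, "c2": {"tree"}}
print(consistency_match(query, cands))  # {'q1': 'c1'}
```

Note that appearance is never compared across viewing conditions here; only the object subsets, collected within each condition, are compared.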
Affine invariant visual phrases for object instance recognition
Object instance recognition approaches based on the bag-of-words model are severely affected by the loss of spatial consistency during retrieval. As a result, costly RANSAC verification is needed to ensure geometric consistency between the query and the retrieved images. A common alternative is to inject geometric information directly into the retrieval procedure, by endowing the visual words with additional information. Most of the existing approaches in this category can efficiently handle only restricted classes of geometric transformations, such as scale and translation. In this paper, we propose a simple and efficient scheme that can cover the more complex class of full affine transformations. We demonstrate the usefulness of our approach in the case of planar object instance recognition, such as recognition of books, logos, traffic signs, etc.

This work was funded by a Google Faculty Research Award, the Marie Curie grant CIG-334283-HRGP, and a CNRS chaire d'excellence.

This is the author accepted manuscript. The final version is available at http://dx.doi.org/10.1109/MVA.2015.715312
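One classical way to endow visual words with geometry that survives a full affine map is to use affine invariants of feature-point configurations, such as ratios of triangle areas, since an affine transform scales all areas by the same factor |det A|. The following is a generic illustration of that invariance, not the paper's specific construction:

```python
import numpy as np

def area(p, q, r):
    """Signed area of triangle pqr."""
    return 0.5 * ((q[0]-p[0])*(r[1]-p[1]) - (r[0]-p[0])*(q[1]-p[1]))

def area_ratio(p, q, r, s):
    """Ratio of two triangle areas over four points: invariant under any
    full affine map, because both areas are scaled by the same |det A|."""
    return area(p, q, r) / area(p, q, s)

pts = [np.array(v, float) for v in [(0, 0), (2, 0), (1, 2), (3, 1)]]
A = np.array([[1.5, 0.3], [0.2, 0.9]])   # arbitrary affine map
t = np.array([4.0, -2.0])
warped = [A @ p + t for p in pts]
print(np.isclose(area_ratio(*pts), area_ratio(*warped)))  # True
```

A scheme built on such invariants can index visual phrases without restricting itself to scale-and-translation geometry.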
Provably scale-covariant networks from oriented quasi quadrature measures in cascade
This article presents a continuous model for hierarchical networks based on a
combination of mathematically derived models of receptive fields and
biologically inspired computations. Based on a functional model of complex
cells in terms of an oriented quasi quadrature combination of first- and
second-order directional Gaussian derivatives, we couple such primitive
computations in cascade over combinatorial expansions over image orientations.
Scale-space properties of the computational primitives are analysed and it is
shown that the resulting representation allows for provable scale and rotation
covariance. A prototype application to texture analysis is developed and it is
demonstrated that a simplified mean-reduced representation of the resulting
QuasiQuadNet leads to promising experimental results on three texture datasets.

Comment: 12 pages, 3 figures, 1 table.
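The oriented quasi quadrature idea combines first- and second-order directional Gaussian derivative responses into a single phase-insensitive measure. A minimal sketch for the x direction, with an assumed relative weight C and scale normalisation (the article derives specific choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def quasi_quadrature(image, sigma=2.0, C=1.0):
    """Quasi quadrature measure along x (a sketch, not the exact model):
    Q = sqrt(Lx^2 + C * sigma^2 * Lxx^2), combining a first-order and a
    scale-normalised second-order Gaussian derivative so that the response
    is largely insensitive to the local phase of the signal."""
    Lx  = gaussian_filter(image, sigma, order=(0, 1))  # d/dx
    Lxx = gaussian_filter(image, sigma, order=(0, 2))  # d2/dx2
    return np.sqrt(Lx**2 + C * sigma**2 * Lxx**2)

# A step edge: the measure responds near the edge, not in flat regions.
x = np.zeros((1, 64)); x[:, 32:] = 1.0
q = quasi_quadrature(x)
print(q.shape)  # (1, 64)
```

Coupling such units in cascade over a set of image orientations, as the article describes, yields the hierarchical QuasiQuadNet representation.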
Invariance of visual operations at the level of receptive fields
Receptive field profiles registered by cell recordings have shown that
mammalian vision has developed receptive fields tuned to different sizes and
orientations in the image domain as well as to different image velocities in
space-time. This article presents a theoretical model by which families of
idealized receptive field profiles can be derived mathematically from a small
set of basic assumptions that correspond to structural properties of the
environment. The article also presents a theory for how basic invariance
properties to variations in scale, viewing direction and relative motion can be
obtained from the output of such receptive fields, using complementary
selection mechanisms that operate over the output of families of receptive
fields tuned to different parameters. Thereby, the theory shows how basic
invariance properties of a visual system can be obtained already at the level
of receptive fields, and we can explain the different shapes of receptive field
profiles found in biological vision from a requirement that the visual system
should be invariant to the natural types of image transformations that occur in
its environment.

Comment: 40 pages, 17 figures.
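The selection mechanism over a family of scale-tuned receptive fields can be sketched with scale-normalised Gaussian derivatives: rescaling the derivative response by powers of sigma makes responses comparable across the family, so the scale giving the strongest response tracks the size of the image structure. The normalisation and the blob experiment below are a standard scale-space illustration consistent with, but not identical to, the article's derivations:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def receptive_field_response(image, sigma, order):
    """Idealised receptive field as a scale-normalised Gaussian derivative.
    order = (dy, dx) derivative orders; the sigma^(dy+dx) factor makes
    responses comparable across the scale-tuned family."""
    dy, dx = order
    return sigma ** (dy + dx) * gaussian_filter(image, sigma, order=order)

# A family tuned to different scales: scale selection picks the sigma
# giving the strongest normalised response at the centre of a blob.
img = np.zeros((65, 65)); img[32, 32] = 1.0
img = gaussian_filter(img, 4.0)                 # a blob of size ~4
sigmas = [1.0, 2.0, 4.0, 8.0]
resp = [abs(receptive_field_response(img, s, (0, 2))[32, 32])
        for s in sigmas]
print(sigmas[int(np.argmax(resp))])  # 4.0
```

The selected scale following the blob size is the one-dimensional analogue of the invariance-by-selection mechanism the article develops for scale, viewing direction, and motion.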