11,176 research outputs found
Class-Based Feature Matching Across Unrestricted Transformations
We develop a novel method for class-based feature matching across large changes in viewing conditions. The method is based on the property that when objects share a similar part, the similarity is preserved across viewing conditions. Given a feature and a training set of object images, we first identify the subset of objects that share this feature. The transformation of the feature's appearance across viewing conditions is determined mainly by properties of the feature, rather than of the object in which it is embedded. Therefore, the transformed feature will be shared by approximately the same set of objects. Based on this consistency requirement, corresponding features can be reliably identified from a set of candidate matches. Unlike previous approaches, the proposed scheme compares feature appearances only in similar viewing conditions, rather than across different viewing conditions. As a result, the scheme is not restricted to locally planar objects or affine transformations. The approach also does not require examples of correct matches. We show that by using the proposed method, a dense set of accurate correspondences can be obtained. Experimental comparisons demonstrate that matching accuracy is significantly improved over previous schemes. Finally, we show that the scheme can be successfully used for invariant object recognition
Computational role of eccentricity dependent cortical magnification
We develop a sampling extension of M-theory focused on invariance to scale
and translation. Quite surprisingly, the theory predicts an architecture of
early vision with increasing receptive field sizes and a high resolution fovea
-- in agreement with data about the cortical magnification factor, V1 and the
retina. From the slope of the inverse of the magnification factor, M-theory
predicts a cortical "fovea" in V1 in the order of by basic units at
each receptive field size -- corresponding to a foveola of size around
minutes of arc at the highest resolution, degrees at the lowest
resolution. It also predicts uniform scale invariance over a fixed range of
scales independently of eccentricity, while translation invariance should
depend linearly on spatial frequency. Bouma's law of crowding follows in the
theory as an effect of cortical area-by-cortical area pooling; the Bouma
constant is the value expected if the signature responsible for recognition in
the crowding experiments originates in V2. From a broader perspective, the
emerging picture suggests that visual recognition under natural conditions
takes place by composing information from a set of fixations, with each
fixation providing recognition from a space-scale image fragment -- that is an
image patch represented at a set of increasing sizes and decreasing
resolutions
A Novel Hybrid CNN-AIS Visual Pattern Recognition Engine
Machine learning methods are used today for most recognition problems.
Convolutional Neural Networks (CNN) have time and again proved successful for
many image processing tasks primarily for their architecture. In this paper we
propose to apply CNN to small data sets like for example, personal albums or
other similar environs where the size of training dataset is a limitation,
within the framework of a proposed hybrid CNN-AIS model. We use Artificial
Immune System Principles to enhance small size of training data set. A layer of
Clonal Selection is added to the local filtering and max pooling of CNN
Architecture. The proposed Architecture is evaluated using the standard MNIST
dataset by limiting the data size and also with a small personal data sample
belonging to two different classes. Experimental results show that the proposed
hybrid CNN-AIS based recognition engine works well when the size of training
data is limited in siz
A Framework for Symmetric Part Detection in Cluttered Scenes
The role of symmetry in computer vision has waxed and waned in importance
during the evolution of the field from its earliest days. At first figuring
prominently in support of bottom-up indexing, it fell out of favor as shape
gave way to appearance and recognition gave way to detection. With a strong
prior in the form of a target object, the role of the weaker priors offered by
perceptual grouping was greatly diminished. However, as the field returns to
the problem of recognition from a large database, the bottom-up recovery of the
parts that make up the objects in a cluttered scene is critical for their
recognition. The medial axis community has long exploited the ubiquitous
regularity of symmetry as a basis for the decomposition of a closed contour
into medial parts. However, today's recognition systems are faced with
cluttered scenes, and the assumption that a closed contour exists, i.e. that
figure-ground segmentation has been solved, renders much of the medial axis
community's work inapplicable. In this article, we review a computational
framework, previously reported in Lee et al. (2013), Levinshtein et al. (2009,
2013), that bridges the representation power of the medial axis and the need to
recover and group an object's parts in a cluttered scene. Our framework is
rooted in the idea that a maximally inscribed disc, the building block of a
medial axis, can be modeled as a compact superpixel in the image. We evaluate
the method on images of cluttered scenes.Comment: 10 pages, 8 figure
- …