Coplanar Repeats by Energy Minimization
This paper proposes an automated method to detect, group and rectify
arbitrarily-arranged coplanar repeated elements via energy minimization. The
proposed energy functional combines several features that model how planes with
coplanar repeats are projected into images and captures global interactions
between different coplanar repeat groups and scene planes. An inference
framework based on a recent variant of α-expansion is described and fast
convergence is demonstrated. We compare the proposed method to two widely-used
geometric multi-model fitting methods using a new dataset of annotated images
containing multiple scene planes with coplanar repeats in varied arrangements.
The evaluation shows a significant improvement in the accuracy of
rectifications computed from coplanar repeats detected with the proposed method
versus those detected with the baseline methods. Comment: 14 pages with supplemental materials attached
The constitution of visual perceptual units in the functional architecture of V1
The scope of this paper is to consider a mean field neural model which takes into
account the functional neurogeometry of the visual cortex, modelled as a group
of rotations and translations. The model generalizes well-known results of
Bressloff and Cowan which, in the absence of input, account for hallucination
patterns. The main result of our study consists in showing that, in the presence of
a visual input, the eigenmodes of the linearized operator which become stable
represent perceptual units present in the image. The result is closely related
to dimensionality reduction and clustering problems.
Recognition by directed attention to recursively partitioned images
A learning/recognition model (and instantiating program) is described which recursively combines the learning paradigms of conceptual clustering (Michalski, 1980) and learning-from-examples to resolve the ambiguities of real-world recognition. The model is based on neuropsychological and psychological evidence that the visual system is analytic, hierarchical, and composed of a parallel/serial dichotomy (many, see conclusions by Crick, 1984). Emulating the experimental evidence, parallel processes in the model decompose the image into components and cluster the constituents in much the same way as the image processing technique known as moment analysis (Alt, 1962). Serial, attentive mechanisms then reassemble the decompositions by investigating spatial relationships between components. The use of attentive mechanisms extends the moment analysis technique to handle alterations in structure and solves the contention problem created by combining the two learning paradigms. The contention results from a disagreement between the teacher and the model on what constitutes the salient features at the highest level of the symbol. There are four cases ZBT must handle, two of which result from the disagreement with the teacher. The parallel/serial dichotomy represents a vertical/horizontal tradeoff between the invariant and variant features of a domain. The resultant learned hierarchy allows ZBT to recognize structural differences while avoiding problems of exponential growth
Radially-Distorted Conjugate Translations
This paper introduces the first minimal solvers that jointly solve for
affine-rectification and radial lens distortion from coplanar repeated
patterns. Even with imagery from moderately distorted lenses, plane
rectification using the pinhole camera model is inaccurate or invalid. The
proposed solvers incorporate lens distortion into the camera model and extend
accurate rectification to wide-angle imagery, which is now common from consumer
cameras. The solvers are derived from constraints induced by the conjugate
translations of an imaged scene plane, which are integrated with the division
model for radial lens distortion. The hidden-variable trick with ideal
saturation is used to reformulate the constraints so that the solvers generated
by the Gröbner-basis method are stable, small and fast.
Rectification and lens distortion are recovered from either one conjugately
translated affine-covariant feature or two independently translated
similarity-covariant features. The proposed solvers are used in a RANSAC-based
estimator, which gives accurate rectifications after few iterations. The
proposed solvers are evaluated against the state-of-the-art and demonstrate
significantly better rectifications on noisy measurements. Qualitative results
on diverse imagery demonstrate high-accuracy undistortions and rectifications.
The source code is publicly available at https://github.com/prittjam/repeats
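As a rough illustration of the division model for radial lens distortion named in this abstract, the sketch below maps distorted points to undistorted ones using the common one-parameter form, x_u = c + (x_d - c) / (1 + λ‖x_d - c‖²). It is not the paper's solver and is not taken from the linked repository: the function name `undistort_division`, the parameter `lam`, and the default distortion center at the origin are assumptions made for this example, and the points are assumed to be in normalized image coordinates.

```python
import numpy as np


def undistort_division(points, lam, center=(0.0, 0.0)):
    """Undistort 2D points with a one-parameter division model.

    Hypothetical helper for illustration only. `points` is an (N, 2)
    array-like of distorted points, `lam` is the distortion parameter,
    and `center` is the assumed center of distortion.
    """
    pts = np.asarray(points, dtype=float)
    c = np.asarray(center, dtype=float)
    d = pts - c                                  # offsets from the distortion center
    r2 = np.sum(d * d, axis=-1, keepdims=True)   # squared radius of each point
    return c + d / (1.0 + lam * r2)              # division-model undistortion


if __name__ == "__main__":
    distorted = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.5]]
    print(undistort_division(distorted, lam=-0.5))
```

With `lam = 0` the mapping is the identity, which corresponds to the pinhole model; under this convention a negative `lam` pushes points away from the center, with the displacement growing with radius.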
Detecting the presence of large buildings in natural images
This paper addresses the issue of classification of low-level features into high-level semantic concepts for the purpose of semantic annotation of consumer photographs. We adopt a multi-scale approach that relies on edge detection to extract an edge orientation-based feature description of the image, and apply an SVM learning technique to infer the presence of a dominant building object in a general purpose collection of digital photographs. The approach exploits prior knowledge on the image context through an assumption that all input images are "outdoor", i.e. indoor/outdoor classification (the context determination stage) has been performed. The proposed approach is validated on a diverse dataset of 1720 images and its performance compared with that of the MPEG-7 edge histogram descriptor.
Learning Object Categories From Internet Image Searches
In this paper, we describe a simple approach to learning models of visual object categories from images gathered from Internet image search engines. The images for a given keyword are typically highly variable, with a large fraction being unrelated to the query term, and thus pose a challenging environment from which to learn. By training our models directly from Internet images, we remove the need to laboriously compile training data sets, required by most other recognition approaches; this opens up the possibility of learning object category models "on-the-fly." We describe two simple approaches, derived from the probabilistic latent semantic analysis (pLSA) technique for text document analysis, that can be used to automatically learn object models from these data. We show two applications of the learned model: first, to rerank the images returned by the search engine, thus improving the quality of the search engine; and second, to recognize objects in other image data sets.