44 research outputs found
Recommended from our members
Empirical Statistics and Stochastic Models for Visual Signals
Mathematic
On the Computational Architecture of the Neocortex I: The Role of the Thalamo-Cortical Loop
This paper proposes that each area of the cortex carries on its calculations with the active participation of a nucleus in the thalamus with which it is reciprocally and topographically connected. Each cortical area is responsible for maintaining and updating the organism's knowledge of a specific aspect of the world, ranging from low level raw data to high level abstract representations, and involving interpreting stimuli and generating actions. In doing this, it will draw on multiple sources of expertise, learned from experience, creating multiple, often conflicting, hypotheses which are integrated by the action of the thalamic neurons and then sent back to the standard input layer of the cortex. Thus this nucleus plays the role of an 'active blackboard' on which the current best reconstruction of some aspect of the world is always displayed. Evidence for this theory is reviewed and experimental tests are proposed. A sequel to this paper will discuss the cortico-cortical loops and propose quite different computational roles for them.
Mathematic
2D-Shape Analysis Using Conformal Mapping
The study of 2D shapes and their similarities is a central problem in the field of vision. It arises in particular from the task of classifying and recognizing objects from their observed silhouette. Defining natural distances between 2D shapes creates a metric space of shapes, whose mathematical structure is inherently relevant to the classification task. One intriguing metric space comes from using conformal mappings of 2D shapes into each other, via the theory of Teichmüller spaces. In this space every simple closed curve in the plane (a "shape") is represented by a "fingerprint" which is a diffeomorphism of the unit circle to itself (a differentiable and invertible, periodic function). More precisely, every shape defines a unique equivalence class of such diffeomorphisms up to right multiplication by a Möbius map. The fingerprint does not change if the shape is varied by translations and scaling, and any such equivalence class comes from some shape. This coset space, equipped with the infinitesimal Weil-Petersson (WP) Riemannian norm, is a metric space. In this space, the shortest path between any two shapes is unique, and is given by a geodesic connecting them. Their distance from each other is given by integrating the WP-norm along that geodesic. In this paper we concentrate on solving the "welding" problem of "sewing" together conformally the interior and exterior of the unit circle, glued on the unit circle by a given diffeomorphism, to obtain the unique 2D shape associated with this diffeomorphism. This will allow us to go back and forth between 2D shapes and their representing diffeomorphisms in this "space of shapes". We then present an efficient method for computing the unique shortest path, the geodesic of shape morphing between any two end-point shapes. The group of diffeomorphisms of S^1 acts as a group of isometries on the space of shapes and we show how this can be used to define shape transformations, such as "adding a protruding limb" to any shape.
Mathematic
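The statement that a fingerprint is defined only up to right multiplication by a Möbius map can be illustrated with a small numerical sketch. It is not taken from the paper: the function names `mobius_disk` and `circle_map` and the parameters `a` and `phi` are hypothetical, and the sketch assumes the standard parameterization of Möbius maps of the unit disk, e^{i phi} (z - a) / (1 - conj(a) z) with |a| < 1. Such a map sends the unit circle to itself, so composing a fingerprint with it on the right gives another representative of the same equivalence class.

```python
import cmath
import math


def mobius_disk(z, a, phi):
    """Moebius map of the unit disk: e^{i phi} (z - a) / (1 - conj(a) z).

    Assumes |a| < 1; 'a' and 'phi' are illustrative parameter names.
    """
    a = complex(a)
    return cmath.exp(1j * phi) * (z - a) / (1 - a.conjugate() * z)


def circle_map(theta, a, phi):
    """Angle in [0, 2*pi) of the image of the circle point e^{i theta}."""
    w = mobius_disk(cmath.exp(1j * theta), a, phi)
    return cmath.phase(w) % (2 * math.pi)


def right_multiply(fingerprint, a, phi):
    """Return theta -> fingerprint(A(theta)), another representative
    of the same equivalence class (hypothetical helper)."""
    return lambda theta: fingerprint(circle_map(theta, a, phi))
```

With `a = 0` and `phi = 0` the map is the identity on the circle; any other admissible choice reparameterizes the circle without changing the shape that the equivalence class stands for.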
Vanishing Geodesic Distance on Spaces of Submanifolds and Diffeomorphisms
The L^2-metric or Fubini-Study metric on the non-linear Grassmannian of all submanifolds of type M in a Riemannian manifold (N, g) induces geodesic distance 0. We discuss another metric which involves the mean curvature and show that its geodesic distance is a good topological metric. The vanishing phenomenon for the geodesic distance holds also for all diffeomorphism groups for the L^2-metric.
Mathematic
A Stochastic Grammar of Images
This exploratory paper seeks a stochastic and context-sensitive grammar of images. The grammar should achieve the following four objectives and thus serve as a unified framework of representation, learning, and recognition for a large number of object categories. (i) The grammar represents both the hierarchical decompositions from scenes to objects, parts, primitives, and pixels by terminal and non-terminal nodes, and the contexts for spatial and functional relations by horizontal links between the nodes. It formulates each object category as the set of all possible valid configurations produced by the grammar. (ii) The grammar is embodied in a simple And-Or graph representation where each Or-node points to alternative sub-configurations and an And-node is decomposed into a number of components. This representation supports recursive top-down/bottom-up procedures for image parsing under the Bayesian framework and makes it convenient to scale up in complexity. Given an input image, the image parsing task constructs a most probable parse graph on-the-fly as the output interpretation, and this parse graph is a subgraph of the And-Or graph after making choices at the Or-nodes. (iii) A probabilistic model is defined on this And-Or graph representation to account for the natural occurrence frequency of objects and parts as well as their relations. This model is learned from a relatively small training set per category and then sampled to synthesize a large number of configurations to cover novel object instances in the test set. This generalization capability is mostly missing in discriminative machine learning methods and can largely improve recognition performance in experiments. (iv) To fill the well-known semantic gap between symbols and raw signals, the grammar includes a series of visual dictionaries and organizes them through graph composition. At the bottom level the dictionary is a set of image primitives, each having a number of anchor points with open bonds to link with other primitives. These primitives can be combined to form larger and larger graph structures for parts and objects. The ambiguities in inferring local primitives shall be resolved through top-down computation using larger structures. Finally these primitives form a primal sketch representation which will generate the input image with every pixel explained. The proposed grammar integrates three prominent representations in the literature: stochastic grammars for composition, Markov (or graphical) models for contexts, and sparse coding with primitives (wavelets). It also combines the structure-based and appearance-based methods in the vision literature. Finally the paper presents three case studies to illustrate the proposed grammar.
Mathematic
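The And-Or graph idea in objectives (ii) and (iii) can be made concrete with a toy sketch. This is only an illustration under stated assumptions, not the paper's implementation: the `Node` class, the `sample_parse` and `enumerate_configs` helpers, the "clock" grammar, and its branch probabilities are all hypothetical. The sketch also omits the horizontal (contextual) links and the image likelihood. An And-node expands all of its children, an Or-node picks exactly one, and each set of Or-choices yields one valid configuration.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    kind: str  # "and", "or", or "leaf"
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # Or-nodes only


def sample_parse(node, rng):
    """Sample one configuration: the list of terminals of a parse graph."""
    if node.kind == "leaf":
        return [node.name]
    if node.kind == "or":
        # An Or-node selects exactly one alternative, weighted by probs.
        child = rng.choices(node.children, weights=node.probs, k=1)[0]
        return sample_parse(child, rng)
    # An And-node is decomposed into all of its components.
    terminals = []
    for child in node.children:
        terminals.extend(sample_parse(child, rng))
    return terminals


def enumerate_configs(node):
    """List every valid configuration the (finite, acyclic) grammar produces."""
    if node.kind == "leaf":
        return [[node.name]]
    if node.kind == "or":
        return [cfg for c in node.children for cfg in enumerate_configs(c)]
    configs = [[]]
    for child in node.children:
        configs = [a + b for a in configs for b in enumerate_configs(child)]
    return configs


# Hypothetical toy grammar: a clock is a frame AND a set of hands.
leaf = lambda name: Node(name, "leaf")
frame = Node("frame", "or", [leaf("round"), leaf("square")], [0.7, 0.3])
hands = Node("hands", "or", [leaf("two_hands"), leaf("three_hands")], [0.5, 0.5])
clock = Node("clock", "and", [frame, hands])
```

Calling `enumerate_configs(clock)` lists the four configurations of this toy category, and `sample_parse(clock, random.Random(0))` draws one of them according to the Or-node probabilities, which mirrors how a learned model could be sampled to synthesize new instances.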