5,803 research outputs found

    A survey of visual preprocessing and shape representation techniques

    Get PDF
    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention)

    Ventral-stream-like shape representation : from pixel intensity values to trainable object-selective COSFIRE models

    Get PDF
    Keywords: hierarchical representation, object recognition, shape, ventral stream, vision and scene understanding, robotics, handwriting analysisThe remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interests embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background which on its turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective to recognize and localize (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms.peer-reviewe

    View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention Are Coordinated Using Surface-Based Attentional Shrouds

    Full text link
    Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Change blindness: eradication of gestalt strategies

    Get PDF
    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task

    Predictive coding: A Possible Explanation of Filling-in at the blind spot

    Full text link
    Filling-in at the blind-spot is a perceptual phenomenon in which the visual system fills the informational void, which arises due to the absence of retinal input corresponding to the optic disc, with surrounding visual attributes. Though there are enough evidence to conclude that some kind of neural computation is involved in filling-in at the blind spot especially in the early visual cortex, the knowledge of the actual computational mechanism is far from complete. We have investigated the bar experiments and the associated filling-in phenomenon in the light of the hierarchical predictive coding framework, where the blind-spot was represented by the absence of early feed-forward connection. We recorded the responses of predictive estimator neurons at the blind-spot region in the V1 area of our three level (LGN-V1-V2) model network. These responses are in agreement with the results of earlier physiological studies and using the generative model we also showed that these response profiles indeed represent the filling-in completion. These demonstrate that predictive coding framework could account for the filling-in phenomena observed in several psychophysical and physiological experiments involving bar stimuli. These results suggest that the filling-in could naturally arise from the computational principle of hierarchical predictive coding (HPC) of natural images.Comment: 23 pages, 9 figure

    Linking Visual Development and Learning to Information Processing: Preattentive and Attentive Brain Dynamics

    Full text link
    National Science Foundation (SBE-0354378); Office of Naval Research (N00014-95-1-0657

    Cortical Dynamics of 3-D Surface Perception: Binocular and Half-Occluded Scenic Images

    Full text link
    Previous models of stereopsis have concentrated on the task of binocularly matching left and right eye primitives uniquely. A disparity smoothness constraint is often invoked to limit the number of possible matches. These approaches neglect the fact that surface discontinuities are both abundant in natural everyday scenes, and provide a useful cue for scene segmentation. da Vinci stereopsis refers to the more general problem of dealing with surface discontinuities and their associated unmatched monocular regions within binocular scenes. This study develops a mathematical realization of a neural network theory of biological vision, called FACADE Theory, that shows how early cortical stereopsis processes are related to later cortical processes of 3-D surface representation. The mathematical model demonstrates through computer simulation how the visual cortex may generate 3-D boundary segmentations and use them to control filling-in of 3-D surface properties in response to visual scenes. Model mechanisms correctly match disparate binocular regions while filling-in monocular regions with the correct depth within a binocularly viewed scene. This achievement required introduction of a new multiscale binocular filter for stereo matching which clarifies how cortical complex cells match image contours of like contrast polarity, while pooling signals from opposite contrast polarities. Competitive interactions among filter cells suggest how false binocular matches and unmatched monocular cues, which contain eye-of-origin information, arc automatically handled across multiple spatial scales. This network also helps to explain data concerning context-sensitive binocular matching. Pooling of signals from even-symmetric and odd-symmctric simple cells at complex cells helps to eliminate spurious activity peaks in matchable signals. Later stages of cortical processing by the blob and interblob streams, including refined concepts of cooperative boundary grouping and reciprocal stream interactions between boundary and surface representations, arc modeled to provide a complete simulation of the da Vinci stereopsis percept.Office of Naval Research (N00014-95-I-0409, N00014-85-1-0657, N00014-92-J-4015, N00014-91-J-4100); Airforce Office of Scientific Research (90-0175); National Science Foundation (IRI-90-00530); The James S. McDonnell Foundation (94-40
    corecore