14,807 research outputs found
Linking the Laminar Circuits of Visual Cortex to Visual Perception
A detailed neural model is being developed of how the laminar circuits of visual cortical areas V1 and V2 implement context-sensitive binding processes such as perceptual grouping and attention, and develop and learn in a stable way. The model clarifies how preattentive and attentive perceptual mechanisms are linked within these laminar circuits, notably how bottom-up, top-down, and horizontal cortical connections interact. Laminar circuits allow the responses of visual cortical neurons to be influenced, not only by the stimuli within their classical receptive fields, but also by stimuli in the extra-classical surround. Such context-sensitive visual processing can greatly enhance the analysis of visual scenes, especially those containing targets that are low contrast, partially occluded, or crowded by distractors. Attentional enhancement can selectively propagate along groupings of both real and illusory contours, thereby showing how attention can selectively enhance object representations. Model mechanisms clarify how intracortical and intercortical feedback help to stabilize cortical development and learning. Although feedback plays a key role, fast feedforward processing is possible in response to unambiguous information.Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); National Science Foundation (IRI-97-20333); Office of Naval Research (N00014-95-1-0657
The Laminar Architecture of Visual Cortex and Image Processing Technology
The mammalian neocortex is organized into layers which include circuits that form functional columns in cortical maps. A major unsolved problem concerns how bottom-up, top-down, and horizontal interactions are organized within cortical layers to generate adaptive behaviors. This article summarizes a model, called the LAMINART model, of how these interactions help visual cortex to realize: (1) the binding process whereby cortex groups distributed data into coherent object representations; (2) the attentional process whereby cortex selectively processes important events; and (3) the developmental and learning processes whereby cortex stably grows and tunes its circuits to match environmental constraints. Such Laminar Computing completes perceptual groupings that realize the property of Analog Coherence, whereby winning groupings bind together their inducing features without losing their ability to represent analog values of these features. Laminar Computing also efficiently unifies the computational requirements of preattentive filtering and grouping with those of attentional selection. It hereby shows how Adaptive Resonance Theory (ART) principles may be realized within the laminar circuits of neocortex. Applications include boundary segmentation and surface filling-in algorithms for processing Synthetic Aperture Radar images.Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); Office of Naval Research (N00014-95-1-0657
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM)consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved
Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy
In this paper we shall consider the problem of deploying attention to subsets
of the video streams for collating the most relevant data and information of
interest related to a given task. We formalize this monitoring problem as a
foraging problem. We propose a probabilistic framework to model observer's
attentive behavior as the behavior of a forager. The forager, moment to moment,
focuses its attention on the most informative stream/camera, detects
interesting objects or activities, or switches to a more profitable stream. The
approach proposed here is suitable to be exploited for multi-stream video
summarization. Meanwhile, it can serve as a preliminary step for more
sophisticated video surveillance, e.g. activity and behavior analysis.
Experimental results achieved on the UCR Videoweb Activities Dataset, a
publicly available dataset, are presented to illustrate the utility of the
proposed technique.Comment: Accepted to IEEE Transactions on Image Processin
Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
This paper presents a self-supervised method for visual detection of the
active speaker in a multi-person spoken interaction scenario. Active speaker
detection is a fundamental prerequisite for any artificial cognitive system
attempting to acquire language in social settings. The proposed method is
intended to complement the acoustic detection of the active speaker, thus
improving the system robustness in noisy conditions. The method can detect an
arbitrary number of possibly overlapping active speakers based exclusively on
visual information about their face. Furthermore, the method does not rely on
external annotations, thus complying with cognitive development. Instead, the
method uses information from the auditory modality to support learning in the
visual domain. This paper reports an extensive evaluation of the proposed
method using a large multi-person face-to-face interaction dataset. The results
show good performance in a speaker dependent setting. However, in a speaker
independent setting the proposed method yields a significantly lower
performance. We believe that the proposed method represents an essential
component of any artificial cognitive system or robotic platform engaging in
social interactions.Comment: 10 pages, IEEE Transactions on Cognitive and Developmental System
Towards a Unified Theory of Neocortex: Laminar Cortical Circuits for Vision and Cognition
A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of pre-attentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
Change blindness: eradication of gestalt strategies
Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task
- …