709 research outputs found

    Unbiased Directed Object Attention Graph for Object Navigation

    Full text link
    Object navigation tasks require agents to locate specific objects in unknown environments based on visual information. Previously, graph convolutions were used to implicitly explore the relationships between objects. However, due to differences in visibility among objects, it is easy to generate biases in object attention. Thus, in this paper, we propose a directed object attention (DOA) graph to guide the agent in explicitly learning the attention relationships between objects, thereby reducing the object attention bias. In particular, we use the DOA graph to perform unbiased adaptive object attention (UAOA) on the object features and unbiased adaptive image attention (UAIA) on the raw images, respectively. To distinguish features in different branches, a concise adaptive branch energy distribution (ABED) method is proposed. We assess our methods on the AI2-Thor dataset. Compared with the state-of-the-art (SOTA) method, our method reports 7.4%, 8.1% and 17.6% increase in success rate (SR), success weighted by path length (SPL) and success weighted by action efficiency (SAE), respectively.Comment: 13 pages, ready to ACM Mutimedia, under revie

    View-Invariant Object Category Learning, Recognition, and Search: How Spatial and Object Attention Are Coordinated Using Surface-Based Attentional Shrouds

    Full text link
    Air Force Office of Scientific Research (F49620-01-1-0397); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Texture Segregation By Visual Cortex: Perceptual Grouping, Attention, and Learning

    Get PDF
    A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture boundary attributes like discontinuities in orientation and texture flow curvature as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. Object boundar output of the model is compared to computer vision algorithms using a set of human segmented photographic images. The model classifies textures and suppresses noise using a multiple scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Topdown modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures. Importance of the surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark studies vary from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention.Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624

    Eye fixation during multiple object attention is based on a representation of discrete spatial foci

    Get PDF
    We often look at and attend to several objects at once. How the brain determines where to point our eyes when we do this is poorly understood. Here we devised a novel paradigm to discriminate between different models of spatial selection guiding fixation. In contrast to standard static attentional tasks where the eye remains fixed at a predefined location, observers selected their own preferred fixation position while they tracked static targets that were arranged in specific geometric configurations and which changed identity over time. Fixations were best predicted by a representation of discrete spatial foci, not a polygonal grouping, simple 2-foci division of attention or a circular spotlight. Moreover, attentional performance was incompatible with serial selection, suggesting that attentional selection and fixation share the same spatial representation. Together with previous findings on fixational microsaccades during covert attention, our results suggest a more nuanced definition of overt vs. covert attention.Publisher PDFPeer reviewe

    Attention-controlled acquisition of a qualitative scene model for mobile robots

    Get PDF
    Haasch A. Attention-controlled acquisition of a qualitative scene model for mobile robots. Bielefeld (Germany): Bielefeld University; 2007.Robots that are used to support humans in dangerous environments, e.g., in manufacture facilities, are established for decades. Now, a new generation of service robots is focus of current research and about to be introduced. These intelligent service robots are intended to support humans in everyday life. To achieve a most comfortable human-robot interaction with non-expert users it is, thus, imperative for the acceptance of such robots to provide interaction interfaces that we humans are accustomed to in comparison to human-human communication. Consequently, intuitive modalities like gestures or spontaneous speech are needed to teach the robot previously unknown objects and locations. Then, the robot can be entrusted with tasks like fetch-and-carry orders even without an extensive training of the user. In this context, this dissertation introduces the multimodal Object Attention System which offers a flexible integration of common interaction modalities in combination with state-of-the-art image and speech processing techniques from other research projects. To prove the feasibility of the approach the presented Object Attention System has successfully been integrated in different robotic hardware. In particular, the mobile robot BIRON and the anthropomorphic robot BARTHOC of the Applied Computer Science Group at Bielefeld University. Concluding, the aim of this work, to acquire a qualitative Scene Model by a modular component offering object attention mechanisms, has been successfully achieved as demonstrated on numerous occasions like reviews for the EU-integrated Project COGNIRON or demos

    Towards a Unified Theory of Neocortex: Laminar Cortical Circuits for Vision and Cognition

    Full text link
    A key goal of computational neuroscience is to link brain mechanisms to behavioral functions. The present article describes recent progress towards explaining how laminar neocortical circuits give rise to biological intelligence. These circuits embody two new and revolutionary computational paradigms: Complementary Computing and Laminar Computing. Circuit properties include a novel synthesis of feedforward and feedback processing, of digital and analog processing, and of pre-attentive and attentive processing. This synthesis clarifies the appeal of Bayesian approaches but has a far greater predictive range that naturally extends to self-organizing processes. Examples from vision and cognition are summarized. A LAMINART architecture unifies properties of visual development, learning, perceptual grouping, attention, and 3D vision. A key modeling theme is that the mechanisms which enable development and learning to occur in a stable way imply properties of adult behavior. It is noted how higher-order attentional constraints can influence multiple cortical regions, and how spatial and object attention work together to learn view-invariant object categories. In particular, a form-fitting spatial attentional shroud can allow an emerging view-invariant object category to remain active while multiple view categories are associated with it during sequences of saccadic eye movements. Finally, the chapter summarizes recent work on the LIST PARSE model of cognitive information processing by the laminar circuits of prefrontal cortex. LIST PARSE models the short-term storage of event sequences in working memory, their unitization through learning into sequence, or list, chunks, and their read-out in planned sequential performance that is under volitional control. LIST PARSE provides a laminar embodiment of Item and Order working memories, also called Competitive Queuing models, that have been supported by both psychophysical and neurobiological data. These examples show how variations of a common laminar cortical design can embody properties of visual and cognitive intelligence that seem, at least on the surface, to be mechanistically unrelated.National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624
    • …
    corecore