
    Conjunctive Visual and Auditory Development via Real-Time Dialogue

    Human developmental learning is capable of dealing with the dynamic visual world, speech-based dialogue, and their complex real-time association. However, an architecture that realizes this for robotic cognitive development has not previously been reported. This paper takes up that challenge. The proposed architecture does not require a strict coupling between visual and auditory stimuli. Two major operations contribute to the "abstraction" process: multiscale temporal priming and high-dimensional numeric abstraction through internal responses with reduced variance. As a basic principle of developmental learning, the programmer does not know the nature of the world events at the time of programming and, thus, a hand-designed task-specific representation is not possible. We successfully tested the architecture on the SAIL robot under an unprecedented, challenging multimodal interaction mode: real-time speech dialogue is used as a teaching source for simultaneous and incremental visual learning and language acquisition, while the robot is viewing a dynamic world that contains a rotating object to which the dialogue is referring.
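A minimal sketch of the "multiscale temporal priming" idea from this abstract: maintain running traces of an input signal at several temporal scales via exponential decay. The decay values and function name are illustrative assumptions, not taken from the paper.

```python
def multiscale_priming(signal, decays=(0.9, 0.99, 0.999)):
    # Maintain one exponentially decaying trace of the input per
    # temporal scale; faster decay = shorter-term priming.
    # Decay constants are assumed for illustration.
    traces = [0.0] * len(decays)
    history = []
    for x in signal:
        traces = [d * t + (1 - d) * x for d, t in zip(decays, traces)]
        history.append(tuple(traces))
    return history
```

For a constant input, the fast-decay trace converges toward the input value well before the slow-decay trace does, giving the downstream abstraction stage features at multiple time scales.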

    Real-Time Cognitive Computing Architecture for Data Fusion in a Dynamic Environment

    A novel cognitive computing architecture is conceptualized for processing multiple channels of multi-modal sensory data streams simultaneously, and fusing the information in real time to generate intelligent reaction sequences. This unique architecture is capable of assimilating parallel data streams that could be analog, digital, or synchronous/asynchronous, and could be programmed to act as a knowledge synthesizer and/or an "intelligent perception" processor. In this architecture, bio-inspired models of visual-pathway and olfactory-receptor processing are combined as processing components to achieve the composite function of "searching for a source of food while avoiding the predator." The architecture is particularly suited for scene analysis from visual and odorant data.
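A minimal sketch of one stage such an architecture needs: merging asynchronous, timestamped sensory streams into a single time-ordered sequence before fusion. The stream contents and tuple layout are illustrative assumptions, not details from the paper.

```python
import heapq

def fuse_streams(*streams):
    # Merge asynchronous sensory streams into one time-ordered
    # sequence. Each stream is an iterable of
    # (timestamp, channel, value) tuples, already sorted by time.
    return list(heapq.merge(*streams, key=lambda event: event[0]))
```

A real-time implementation would consume the streams incrementally rather than materializing a list, but the ordering guarantee is the same.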

    Visual Saliency Based on Multiscale Deep Features

    Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this CVPR 2015 paper, we discover that a high-quality visual saliency model can be trained with multiscale features extracted using a popular deep learning architecture, convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for extracting features at three different scales. We then propose a refinement method to enhance the spatial coherence of our saliency results. Finally, aggregating multiple saliency maps computed for different levels of image segmentation can further boost the performance, yielding saliency maps better than those generated from a single segmentation. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving state-of-the-art performance on all public benchmarks, improving the F-measure by 5.0% and 13.2% respectively on the MSRA-B dataset and our new dataset (HKU-IS), and lowering the mean absolute error by 5.7% and 35.1% respectively on these two datasets.
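Two pieces of the evaluation and aggregation described above can be sketched compactly: the F-measure on binarized saliency maps, and the averaging of maps computed at different segmentation levels. The beta-squared weighting of 0.3 is a common convention in the saliency literature; the paper's exact setting, and the uniform aggregation weights, are assumptions here.

```python
import numpy as np

def f_measure(pred, gt, beta2=0.3):
    # F-measure between a binarized saliency prediction and a
    # ground-truth mask; beta2 = 0.3 (assumed) weights precision
    # over recall, as is common in saliency benchmarks.
    tp = np.logical_and(pred, gt).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)

def aggregate_maps(maps, weights=None):
    # Average saliency maps computed at different segmentation
    # levels; the paper reports that aggregation beats any single
    # segmentation level (uniform weights are an assumption).
    maps = np.stack(maps)
    if weights is None:
        weights = np.ones(len(maps)) / len(maps)
    return np.tensordot(weights, maps, axes=1)
```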

    Computer Analysis of Architecture Using Automatic Image Understanding

    In the past few years, computer vision and pattern recognition systems have become increasingly powerful, expanding the range of tasks enabled by machine vision. Here we show that computer analysis of building images can provide a quantitative analysis of architecture and quantify similarities between city architectural styles. Images of buildings from 18 cities and three countries were acquired using Google StreetView and used to train a machine vision system to identify the location of the imaged building based on the image's visual content. Experimental results show that the system can identify the geographical location of a StreetView image. More importantly, the algorithm was able to group the cities and countries and provide a phylogeny of the similarities between architectural styles as captured by StreetView images. These results demonstrate that computer vision and pattern recognition algorithms can perform the complex cognitive task of analyzing images of buildings, and can be used to measure and quantify visual similarities and differences between different architectural styles. This experiment provides a new paradigm for studying architecture, based on a quantitative approach that can enhance traditional manual observation and analysis. The source code used for the analysis is open and publicly available.
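One common way to turn a location classifier into the kind of style phylogeny described above is to read the classifier's confusion matrix as a similarity signal: cities the classifier confuses often are treated as stylistically close. The construction below is a generic sketch of that idea, assuming row-normalized confusion rates; the paper's exact distance measure is not given here.

```python
def style_distances(confusion):
    # Convert a classifier's city-confusion matrix (confusion[i][j]
    # = fraction of city i's images labeled as city j) into a
    # symmetric dissimilarity matrix suitable for hierarchical
    # clustering / phylogeny construction. The averaging scheme is
    # an illustrative assumption.
    n = len(confusion)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d[i][j] = 1.0 - 0.5 * (confusion[i][j] + confusion[j][i])
    return d
```

Feeding such a matrix to a standard hierarchical clustering routine yields the tree of style similarities.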

    Visual Reference and Iconic Content

    Evidence from cognitive science supports the claim that humans and other animals see the world as divided into objects. Although this claim is widely accepted, it remains unclear whether the mechanisms of visual reference have representational content or are directly instantiated in the functional architecture. I put forward a version of the former approach that construes object files as icons for objects. This view is consistent with the evidence that motivates the architectural account, can respond to the key arguments against representational accounts, and has explanatory advantages. I draw general lessons for the philosophy of perception and the naturalization of intentionality.

    Client-Server Approach for Managing Visual Attention, Integrated in a Cognitive Architecture for a Social Robot

    This paper proposes a novel system for managing visual attention in social robots. The system is based on a client/server approach that allows integration with a cognitive architecture controlling the robot. The core of this architecture is a distributed knowledge graph, in which perceptual needs are expressed by the presence of arcs to stimuli that need to be perceived. The attention server sends motion commands to the actuators of the robot, while the attention clients send requests through the common knowledge representation. The common knowledge graph is shared by all levels of the architecture. The system has been implemented on ROS and tested on a social robot to verify the validity of the approach, and was used to solve the tests proposed in the RoboCup@Home and SciRoc robotic competitions. The tests have been used to quantitatively compare the proposal to traditional visual attention mechanisms. (Supported by the Ministerio de Ciencia, Innovación y Universidades.)
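A minimal, in-memory sketch of the client/server pattern described above: clients express perceptual needs as arcs in a shared graph, and the attention server serves them in turn. All class and arc names here are illustrative; the actual system is a distributed knowledge graph on ROS with motion commands sent to real actuators.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    # Perceptual needs are arcs from a client node to the stimulus
    # it needs perceived (arc label "wants-to-see" is assumed).
    arcs: list = field(default_factory=list)

    def request(self, client, stimulus):
        self.arcs.append((client, "wants-to-see", stimulus))

class AttentionServer:
    # Reads pending arcs from the shared graph and decides where
    # the robot should look next. A real server would arbitrate by
    # priority and emit actuator motion commands instead of
    # returning the stimulus name.
    def __init__(self, graph):
        self.graph = graph

    def next_target(self):
        if not self.graph.arcs:
            return None
        _client, _label, stimulus = self.graph.arcs.pop(0)
        return stimulus
```

The key design point is that clients never talk to the server directly: coordination happens entirely through the shared knowledge representation.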

    Towards the generation of visual qualia in artificial cognitive architectures

    Proceedings of: Brain Inspired Cognitive Systems (BICS 2010), Madrid, Spain, 14-16 July 2010. The nature and the generation of qualia in machines is a highly controversial issue. Even the existence of such a concept in the realm of artificial systems is often neglected or denied. In this work, we adopt a pragmatic approach to this problem using the Synthetic Phenomenology perspective. Specifically, we explore the generation of visual qualia in an artificial cognitive architecture inspired by the Global Workspace Theory (GWT). We argue that preliminary results obtained as part of this research line will help to characterize and identify artificial qualia as the direct products of conscious perception in machines. Additionally, we provide a computational model for integrated covert and overt perception in the framework of the GWT. A simple form of the apparent motion effect is used as a preliminary experimental context and a practical case study for the generation of synthetic visual experience. Thanks to an internal inspection subsystem, we are able to analyze both covert and overt percepts generated by our system when confronted with visual stimuli. The inspection of the internal states generated within the cognitive architecture enables us to discuss possible analogies with human cognition processes. This work was supported in part by the Spanish Ministry of Education under CICYT grant TRA2007-67374-C02-02.