4,178 research outputs found

    Motion Invariance in Visual Environments

    The puzzle of computer vision might find new and challenging solutions once we realize that most successful methods work at the image level, which is remarkably more difficult than directly processing visual streams, as happens in nature. In this paper, we claim that processing visual streams naturally leads to the formulation of the motion invariance principle, which enables the construction of a new theory of visual learning based on convolutional features. The theory addresses a number of intriguing questions that arise in natural vision and offers a well-posed computational scheme for the discovery of convolutional filters over the retina. The filters are driven by the Euler-Lagrange differential equations derived from the principle of least cognitive action, which parallels the laws of mechanics. Unlike traditional convolutional networks, which need massive supervision, the proposed theory offers a truly new scenario in which feature learning takes place by unsupervised processing of video signals. An experimental report of the theory is presented, in which we show that features extracted under motion invariance yield an improvement that can be assessed by measuring information-based indexes.
    Comment: arXiv admin note: substantial text overlap with arXiv:1801.0711
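
For readers unfamiliar with the variational language used in this abstract, the generic form of an action functional and its Euler-Lagrange equation is sketched below. This is only the textbook template from classical mechanics: the symbols $q$ (standing in here for the filter parameters as a function of time), $L$, and $T$ are illustrative assumptions, and the specific "cognitive action" Lagrangian of the paper is not given in the abstract.

```latex
\[
  \mathcal{A}[q] = \int_{0}^{T} L\bigl(t, q(t), \dot{q}(t)\bigr)\,dt,
  \qquad
  \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0 .
\]
```

A trajectory $q(t)$ that makes $\mathcal{A}$ stationary must satisfy the differential equation on the right, which is the sense in which the learned filters are "driven by" Euler-Lagrange equations.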

    Symbol Emergence in Robotics: A Survey

    Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can communicate smoothly with human users over the long term both require an understanding of the dynamics of symbol systems. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory-motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.
    Comment: submitted to Advanced Robotic

    Robots that Say ‘No’. Affective Symbol Grounding and the Case of Intent Interpretations

    © 2017 IEEE. This article has been accepted for publication in a forthcoming issue of IEEE Transactions on Cognitive and Developmental Systems. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
    Modern theories on early child language acquisition tend to focus on referential words, mostly nouns, labeling concrete objects or physical properties. In this experimental proof-of-concept study, we show how nonreferential negation words, typically belonging to a child's first ten words, may be acquired. A child-like humanoid robot is deployed in speech-wise unconstrained interaction with naïve human participants. In agreement with psycholinguistic observations, we corroborate the hypothesis that affect plays a pivotal role in the socially distributed acquisition process, where the adept conversation partner provides linguistic interpretations of the affective displays of the less adept speaker. Negation words are prosodically salient within intent interpretations that are triggered by the learner's display of affect. From there, they can be picked up and used by the budding language learner, which may involve the grounding of these words in the very affective states that triggered them in the first place. The pragmatic analysis of the robot's linguistic performance indicates that the correct timing of negative utterances is essential for the listener to infer the meaning of otherwise ambiguous negative utterances. In order to assess the robot's performance thoroughly, comparative data from psycholinguistic studies of parent-child dyads is needed, highlighting the need for further interdisciplinary work.
    Peer reviewe

    Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

    It is common to implicitly assume access to intelligently captured inputs (e.g., photos from a human photographer), yet autonomously capturing good observations is itself a major challenge. We address the problem of learning to look around: if a visual agent has the ability to voluntarily acquire new views to observe its environment, how can it learn efficient exploratory behaviors to acquire informative observations? We propose a reinforcement learning solution, where the agent is rewarded for actions that reduce its uncertainty about the unobserved portions of its environment. Based on this principle, we develop a recurrent neural network-based approach to perform active completion of panoramic natural scenes and 3D object shapes. Crucially, the learned policies are not tied to any recognition task nor to the particular semantic content seen during training. As a result, 1) the learned "look around" behavior is relevant even for new tasks in unseen environments, and 2) training data acquisition involves no manual labeling. Through tests in diverse settings, we demonstrate that our approach learns useful generic policies that transfer to new unseen tasks and environments. Completion episodes are shown at https://goo.gl/BgWX3W
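
The reward principle described in this abstract (reward actions that reduce uncertainty about unobserved parts of the scene) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: each view is reduced to a single number, unobserved views are "completed" with the mean of the observed ones, and the function names are hypothetical. The paper itself uses a recurrent neural network for completion, which is not reproduced here.

```python
import numpy as np


def completion_error(true_views, observed_mask):
    """Mean squared error of a naive completion of the scene.

    Unobserved views are filled in with the mean of the observed views
    (a stand-in for a learned completion model).
    """
    fill_value = true_views[observed_mask].mean()
    guess = np.where(observed_mask, true_views, fill_value)
    return float(np.mean((guess - true_views) ** 2))


def uncertainty_reduction_reward(true_views, observed_mask, new_view):
    """Reward = drop in completion error after observing `new_view`.

    Returns the reward and the updated mask; the input mask is not modified.
    No manual labels are needed: the scene itself is the training signal.
    """
    before = completion_error(true_views, observed_mask)
    new_mask = observed_mask.copy()
    new_mask[new_view] = True
    after = completion_error(true_views, new_mask)
    return before - after, new_mask


if __name__ == "__main__":
    # Four views of a scene; only view 0 has been observed so far.
    views = np.array([0.0, 0.0, 10.0, 0.0])
    mask = np.array([True, False, False, False])
    for candidate in (1, 2):
        reward, _ = uncertainty_reduction_reward(views, mask, candidate)
        print(f"looking at view {candidate}: reward = {reward}")
```

In this toy scene, looking at the view that differs from what has already been seen earns a larger reward than looking at a redundant one, which is the behavior a policy trained on such a signal would be pushed towards.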

    Action in Mind: Neural Models for Action and Intention Perception

    To notice, recognize, and ultimately perceive others’ actions and to discern the intention behind those observed actions is an essential skill for social communication and markedly improves the chances of survival. Encountering dangerous behavior from a person or an animal, for instance, requires an immediate and suitable reaction. In addition, as social creatures, we need to perceive, interpret, and judge correctly other individuals’ actions as a fundamental skill for our social life. In other words, our survival and success in adaptive social behavior and nonverbal communication depend heavily on our ability to thrive in complex social situations. However, it has been shown that humans can spontaneously decode animacy and social interactions even from strongly impoverished stimuli, and this is a fundamental part of human experience that develops early in infancy and is shared with other primates. In addition, it is well established that perceptual and motor representations of actions are tightly coupled and share common mechanisms. This coupling between action perception and action execution plays a critical role in action understanding, as postulated in various studies, and is potentially important for our social cognition. This interaction is likely mediated by action-selective neurons in the superior temporal sulcus (STS), premotor cortex, and parietal cortex. The STS and TPJ have also been identified as a coarse neural substrate for the processing of social interaction stimuli. Despite this localization, the exact neural circuits underlying this processing remain unclear. The aim of this thesis is to understand the neural mechanisms behind action-perception coupling and to investigate further how the human brain perceives different classes of social interactions. To achieve this goal, we first introduce a neural model that provides a unifying account of multiple experiments on the interaction between action execution and action perception. The model correctly reproduces the interactions between action observation and execution in several experiments and provides a link towards electrophysiologically detailed models of the relevant circuits. This model might thus provide a starting point for the detailed quantitative investigation of how motor plans interact with perceptual action representations at the level of single-cell mechanisms. Second, we present a simple neural model that reproduces some of the key observations in psychophysical experiments about the perception of animacy and social interactions from stimuli. Even in its simple form, the model proves that animacy and social interaction judgments might partly be derived by very elementary operations in hierarchical neural vision systems, without the need for sophisticated or accurate probabilistic inference.

    A Virtual Conversational Agent for Teens with Autism: Experimental Results and Design Lessons

    We present the design of an online social skills development interface for teenagers with autism spectrum disorder (ASD). The interface is intended to enable private conversation practice anywhere, anytime, using a web browser. Users converse informally with a virtual agent, receiving real-time feedback on nonverbal cues as well as summary feedback. The prototype was developed in consultation with an expert UX designer, two psychologists, and a pediatrician. Using data from 47 individuals, feedback and dialogue generation were automated using a hidden Markov model and a schema-driven dialogue manager capable of handling multi-topic conversations. We conducted a study with nine high-functioning ASD teenagers. Through a thematic analysis of post-experiment interviews, we identified several key design considerations, notably: 1) Users should be fully briefed at the outset about the purpose and limitations of the system, to avoid unrealistic expectations. 2) An interface should incorporate positive acknowledgment of behavior change. 3) Realistic appearance of a virtual agent and responsiveness are important in engaging users. 4) Conversation personalization, for instance in prompting laconic users for more input and reciprocal questions, would help the teenagers engage for longer terms and increase the system's utility.