
    Integration of Action and Language Knowledge: A Roadmap for Developmental Robotics

    This position paper proposes that the study of embodied cognitive agents, such as humanoid robots, can advance our understanding of the cognitive development of complex sensorimotor, linguistic, and social learning skills. This in turn will benefit the design of cognitive robots capable of learning to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions. Four key areas of research challenges are discussed, specifically issues related to the understanding of: 1) how agents learn and represent compositional actions; 2) how agents learn and represent compositional lexica; 3) the dynamics of social interaction and learning; and 4) how compositional action and language representations are integrated to bootstrap the cognitive system. The review of specific issues and progress in these areas is then translated into a practical roadmap based on a series of milestones. These milestones provide a possible set of cognitive robotics goals and test scenarios, thus acting as a research roadmap for future work on cognitive developmental robotics.

    Multimodal Fusion as Communicative Acts during Human-Robot Interaction

    Research on dialogue systems is a very active area in social robotics. During the last two decades, these systems have evolved from ones based only on speech recognition and synthesis to modern systems that include new components and multimodality. By multimodal dialogue we mean the interchange of information among several interlocutors, using not just voice as the means of transmission but all the available channels, such as gestures, facial expressions, touch, and sounds. These channels add information to the message transmitted in every dialogue turn. The dialogue manager (IDiM) is one of the components of the robotic dialog system (RDS) and is in charge of managing the dialogue flow during the conversational turns. To do so, it must coherently handle the inputs and outputs of information that flow through different communication channels: audio, vision, radio frequency, touch, etc. In our approach, this multichannel input of information is temporally fused into communicative acts (CAs). Each CA groups the information that flows through the different input channels into the same pack, transmitting a unique message or global idea. This temporal fusion of information therefore allows the IDiM to abstract away from the channels used during the interaction, focusing only on the message, not on the way it is transmitted. This article presents the whole RDS and describes how the multimodal fusion of information into CAs is performed. Finally, several scenarios where multimodal dialogue is used are presented.
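
    The core of this approach is easiest to picture as a grouping operation over time-stamped events. Below is a minimal Python sketch of such a fusion step; the Event and CommunicativeAct classes and the fixed 0.5 s fusion window are illustrative assumptions, not the authors' implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    channel: str      # e.g. "audio", "vision", "touch", "rfid"
    timestamp: float  # seconds since interaction start
    payload: dict     # channel-specific content

@dataclass
class CommunicativeAct:
    events: list = field(default_factory=list)

    def channels(self):
        return {e.channel for e in self.events}

def fuse(events, window=0.5):
    """Group time-ordered multimodal events into communicative acts:
    an event within `window` seconds of the previous one joins the
    current CA; a longer gap starts a new CA."""
    acts, current, last_t = [], None, None
    for e in sorted(events, key=lambda e: e.timestamp):
        if current is None or e.timestamp - last_t > window:
            current = CommunicativeAct()
            acts.append(current)
        current.events.append(e)
        last_t = e.timestamp
    return acts

# Speech and a pointing gesture arriving close together fuse into one
# CA; a touch event two seconds later opens a new one.
stream = [
    Event("audio", 0.00, {"text": "put it there"}),
    Event("vision", 0.20, {"gesture": "pointing", "target": (1.2, 0.4)}),
    Event("touch", 2.50, {"sensor": "left_shoulder"}),
]
for i, ca in enumerate(fuse(stream)):
    print(f"CA {i}: channels = {sorted(ca.channels())}")
```

    Once fused this way, a dialogue manager can consume CAs without caring which channels carried the information, which is the abstraction the abstract describes.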

    Developmental Bootstrapping of AIs

    Although some current AIs surpass human abilities in closed artificial worlds such as board games, their abilities in the real world are limited. They make strange mistakes and do not notice them. They cannot be instructed easily, fail to use common sense, and lack curiosity. They do not make good collaborators. The mainstream approaches for creating AIs are the traditional manually constructed symbolic AI approach and generative and deep learning approaches, including large language models (LLMs). These systems are not well suited for creating robust and trustworthy AIs. Although it is outside the mainstream, the developmental bootstrapping approach has more potential. In developmental bootstrapping, AIs develop competences like human children do. They start with innate competences. They interact with the environment and learn from their interactions. They incrementally extend their innate competences with self-developed competences. They interact with and learn from people and establish perceptual, cognitive, and common grounding. They acquire the competences they need through bootstrapping. However, developmental robotics has not yet produced AIs with robust adult-level competences. Projects have typically stopped at the Toddler Barrier, corresponding to human development at about two years of age, before speech is fluent. They also do not bridge the Reading Barrier: skillfully and skeptically drawing on the socially developed information resources that power current LLMs. The next competences in human cognitive development involve intrinsic motivation, imitation learning, imagination, coordination, and communication. This position paper lays out the logic, prospects, gaps, and challenges for extending the practice of developmental bootstrapping to acquire further competences and create robust, resilient, and human-compatible AIs.
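
    The bootstrapping loop itself is simple to state: start from innate competences and repeatedly acquire any competence whose prerequisites are already in place. The toy sketch below illustrates this; the particular skill graph and acquisition rule are invented for illustration and are not from the paper:

```python
# Toy developmental-bootstrapping loop: an agent starts with innate
# competences and incrementally unlocks self-developed ones whose
# prerequisites it already has. The skill graph is illustrative only.
PREREQS = {
    "reach":       {"look"},
    "grasp":       {"reach"},
    "point":       {"reach"},
    "follow_gaze": {"look"},
    "name_object": {"grasp", "follow_gaze"},  # needs joint attention
    "request":     {"name_object", "point"},
}

def bootstrap(innate):
    acquired = set(innate)
    progressed = True
    while progressed:
        progressed = False
        for skill, needs in PREREQS.items():
            if skill not in acquired and needs <= acquired:
                acquired.add(skill)
                print(f"acquired {skill!r} (built on {sorted(needs)})")
                progressed = True
    return acquired

bootstrap({"look"})  # innate competence; the rest is bootstrapped
```

    In this toy version every eligible skill is acquired immediately; in the paper's framing, each acquisition would instead require extended interaction with the environment and with people.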

    Survey: Robot Programming by Demonstration

    Robot programming by demonstration (PbD) started about 30 years ago and has grown in importance during the past decade. The rationale for moving from purely preprogrammed robots to very flexible user-based interfaces for training a robot to perform a task is three-fold. First and foremost, PbD, also referred to as imitation learning, is a powerful mechanism for reducing the complexity of search spaces for learning. When observing either good or bad examples, one can reduce the search for a possible solution either by starting the search from the observed good solution (a local optimum) or, conversely, by eliminating from the search space what is known to be a bad solution. Imitation learning is thus a powerful tool for enhancing and accelerating learning in both animals and artifacts. Second, imitation learning offers an implicit means of training a machine, such that explicit and tedious programming of a task by a human user can be minimized or eliminated. Imitation learning is thus a "natural" means of interacting with a machine that is accessible to lay people. And third, studying and modeling the coupling of perception and action, which is at the core of imitation learning, helps us understand the mechanisms by which the self-organization of perception and action can arise during development. The reciprocal interaction of perception and action could explain how competence in motor control can be grounded in the rich structure of perceptual variables and, vice versa, how the processes of perception can develop as a means to create successful actions. The promises of PbD were thus multiple. On the one hand, one hoped that it would make learning faster, in contrast to tedious reinforcement learning or trial-and-error methods. On the other hand, one expected that the methods, being user-friendly, would broaden the application of robots in human daily environments. Recent progress in the field, reviewed in this chapter, shows that it has made a leap forward over the past decade toward these goals and that these promises may be fulfilled very soon.
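
    The search-space argument can be made concrete with a toy optimization: seeding a local search at a demonstrated, near-optimal solution converges within a small evaluation budget, while a blind start does not. Everything below (the 1-D cost function, step size, budget) is an illustrative assumption, not from the survey:

```python
import random

def cost(x):
    # Toy task cost with its optimum at x = 2.0.
    return (x - 2.0) ** 2

def hill_climb(x0, steps=200, sigma=0.1):
    """Simple stochastic hill climbing from starting point x0."""
    x, best = x0, cost(x0)
    for _ in range(steps):
        cand = x + random.gauss(0.0, sigma)
        c = cost(cand)
        if c < best:
            x, best = cand, c
    return x, best

random.seed(0)
demo_start = 1.8                       # demonstrated solution: good but imperfect
blind_start = random.uniform(-50, 50)  # no demonstration available

print("seeded by demonstration:", hill_climb(demo_start))
print("blind initialization:   ", hill_climb(blind_start))
```

    The demonstration-seeded run refines an already-good solution within the budget; the blind run, starting far away, barely approaches the optimum, which is the sense in which imitation reduces the search.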