3 research outputs found

    Autonomous Acquisition of Natural Situated Communication

    Get PDF
    An important part of human intelligence, both historically and operationally, is our ability to communicate. We learn how to communicate, and maintain our communicative skills, in a society of communicators, a highly effective way to reach and maintain proficiency in this complex skill. Principles that might allow artificial agents to learn language this way are incompletely known at present; the multi-dimensional nature of socio-communicative skills is beyond every machine learning framework proposed so far. Our work begins to address this challenge by proposing a way for observation-based machine learning of natural language and communication. Our framework can learn complex communicative skills with minimal up-front knowledge. The system learns by incrementally producing predictive models of causal relationships in observed data, guided by goal inference and reasoning using forward-inverse models. We present results from two experiments in which our S1 agent learns human communication by observing two humans interacting in a real-time, TV-style interview, using multimodal communicative gesture and situated language to talk about recycling various materials and objects. S1 learns complex multimodal language and communicative acts, including a vocabulary of 100 words forming natural sentences with relatively complex sentence structure, manual deictic reference, and anaphora. S1 is seeded only with high-level information about the goals of the interviewer and interviewee, and a small ontology; no grammar or other linguistic information is provided to S1 a priori. The agent learns the pragmatics, semantics, and syntax of complex spoken utterances and gestures from scratch, by observing the humans compare and contrast the cost and pollution associated with recycling aluminum cans, glass bottles, newspaper, plastic, and wood. After 20 hours of observation, S1 can perform an unscripted TV interview with a human, in the same style, without making mistakes.
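    The abstract describes the learning mechanism only at a high level. The toy sketch below illustrates the general idea of incrementally built predictive (forward) and inverse models over observed context-action-outcome transitions; all class and label names are hypothetical and this is not the S1 system's actual implementation.

```python
from collections import defaultdict

class PredictiveModel:
    """Toy causal model: counts how often an observed action, taken in a
    given context, is followed by each outcome."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, context, action, outcome):
        # Incrementally update the model from one observed transition.
        self.counts[(context, action)][outcome] += 1

    def predict(self, context, action):
        # Forward model: most likely outcome of taking `action` in `context`.
        outcomes = self.counts[(context, action)]
        return max(outcomes, key=outcomes.get) if outcomes else None

    def infer_action(self, context, goal):
        # Inverse model: which observed action most reliably led to `goal`?
        best_action, best_p = None, 0.0
        for (ctx, action), outcomes in self.counts.items():
            if ctx != context:
                continue
            p = outcomes.get(goal, 0) / sum(outcomes.values())
            if p > best_p:
                best_action, best_p = action, p
        return best_action

# Learn from observed interviewer/interviewee turns, then choose an act
# predicted to achieve an inferred goal.
model = PredictiveModel()
model.observe("asked-about-cost", "answer-with-comparison", "topic-advanced")
model.observe("asked-about-cost", "stay-silent", "question-repeated")
print(model.infer_action("asked-about-cost", "topic-advanced"))  # answer-with-comparison
```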

    A Probabilistic Approach to Learning a Visually Grounded Language Model through Human-Robot Interaction

    No full text
    Language is among the most fascinating and complex cognitive activities, and it develops rapidly from the first months of an infant's life. The aim of the present work is to provide a humanoid robot with the cognitive, perceptual, and motor skills fundamental for the acquisition of a rudimentary form of language. We present a novel probabilistic model, inspired by findings in the cognitive sciences, able to associate spoken words with their perceptually grounded meanings. The main focus is on acquiring the meaning of perceptual categories (e.g. red, blue, circle, above), rather than specific world entities (e.g. an apple, a toy). Our probabilistic model is based on a variant of the multi-instance learning technique, and it enables a robotic platform to learn grounded meanings of adjective and noun terms. The system can be used to understand and generate appropriate natural language descriptions of real objects in a scene, and it has been successfully tested on the NAO humanoid robotic platform.
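    As a rough illustration of the word-grounding problem described above, the sketch below shows a simple cross-situational, multi-instance-style association of words with perceptual feature vectors. It is not the paper's probabilistic model; the feature encoding and function names are assumptions made for the example.

```python
import numpy as np

def learn_word_models(observations, n_features):
    """observations: list of (words, bag) pairs, where `bag` is an array of
    shape (n_objects, n_features) holding normalized perceptual features
    (e.g. hue, size, roundness) for every object visible in the scene.
    Returns a mean feature profile per word."""
    sums, counts = {}, {}
    for words, bag in observations:
        # The true referent inside the bag is unknown, so credit every
        # instance equally; features that co-occur consistently with a
        # word come to dominate its profile across many scenes.
        bag_mean = np.asarray(bag, dtype=float).mean(axis=0)
        for w in words:
            sums[w] = sums.get(w, np.zeros(n_features)) + bag_mean
            counts[w] = counts.get(w, 0) + 1
    return {w: sums[w] / counts[w] for w in sums}

def describe(obj_features, word_models):
    # Generation: pick the word whose learned profile is closest to the object.
    obj = np.asarray(obj_features, dtype=float)
    return min(word_models, key=lambda w: float(np.linalg.norm(word_models[w] - obj)))

# Two scenes: a red circle next to a blue square, then a lone red square.
obs = [(["red", "circle"], [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]),
       (["red", "square"], [[1.0, 0.0, 0.0]])]
models = learn_word_models(obs, n_features=3)
print(describe([1.0, 0.0, 0.1], models))  # prints "square" for this toy data
```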

    Learning Hierarchical Compositional Task Definitions through Online Situated Interactive Language Instruction

    Full text link
    Artificial agents, from robots to personal assistants, have become competent workers in many settings and embodiments, but for the most part they are limited to the capabilities and tasks with which they were initially programmed. Learning in these settings has predominantly focused on improving the agent's performance on a task, not on learning the actual definition of a task. The primary method for imbuing an agent with a task definition has been programming by humans who have detailed knowledge of the task, domain, and agent architecture. In contrast, humans quickly learn new tasks from scratch, often from instruction by another human. If we want AI agents to be flexible and dynamically extendable, they will need to emulate these learning capabilities rather than be limited to task definitions acquired through programming. This dissertation explores how an Interactive Task Learning (ITL) agent can rapidly learn the complete definition or formulation of novel tasks through online natural language instruction from a human instructor. Recent advances in natural language processing, memory systems, computer vision, spatial reasoning, robotics, and cognitive architectures make the time ripe to study how such knowledge can be automatically acquired, represented, transferred, and operationalized. We present a learning approach embodied in an ITL agent that interactively learns the meaning of task concepts (the goals, actions, failure conditions, and task-specific terms) for 60 games and puzzles. In our approach, the agent learns hierarchical symbolic representations of task knowledge that enable it to transfer and compose knowledge, analyze and debug multiple interpretations, and communicate with the teacher to resolve ambiguity. Our results show that the agent can correctly generalize, disambiguate, and transfer concepts across variations of language descriptions and world representations, even with distractors present.
    PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
    https://deepblue.lib.umich.edu/bitstream/2027.42/153434/1/jrkirk_1.pd
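    To make the idea of hierarchical, compositional task definitions concrete, here is a minimal sketch of a task-definition data structure in which a newly taught task reuses a previously learned one. The names, fields, and example tasks are hypothetical and do not reflect the dissertation's actual agent representations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class TaskDef:
    name: str
    goal: Callable[[Dict], bool]                 # goal condition over a world state
    failure: Callable[[Dict], bool]              # failure condition
    actions: List[str] = field(default_factory=list)         # primitive actions
    subtasks: List["TaskDef"] = field(default_factory=list)  # composition

    def achieved(self, state: Dict) -> bool:
        # A composite task is achieved when its own goal holds and every
        # subtask's goal also holds, so learned tasks can be reused whole.
        return self.goal(state) and all(t.achieved(state) for t in self.subtasks)

    def failed(self, state: Dict) -> bool:
        return self.failure(state) or any(t.failed(state) for t in self.subtasks)

# "tower" reuses a previously taught "stack" task instead of being
# re-programmed from scratch.
stack = TaskDef("stack",
                goal=lambda s: s.get("blocks-stacked", 0) >= 2,
                failure=lambda s: s.get("block-dropped", False),
                actions=["pick-up", "put-on"])
tower = TaskDef("tower",
                goal=lambda s: s.get("blocks-stacked", 0) >= 3,
                failure=lambda s: s.get("block-dropped", False),
                subtasks=[stack])
print(tower.achieved({"blocks-stacked": 3}))  # True
```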