
    Vision-Based Urban Navigation Procedures for Verbally Instructed Robots

    The work presented in this thesis is part of a project in instruction-based learning (IBL) for mobile robots, where a robot is designed that can be instructed by its users through unconstrained natural language. The robot uses vision guidance to follow route instructions in a miniature town model. The aim of the work presented here was to determine the functional vocabulary of the robot in the form of "primitive procedures". In contrast to previous work in the field of instructable robots, this was done following a "user-centred" approach, where the main concern was to create primitive procedures that can be directly associated with natural language instructions. To achieve this, a corpus of human-to-human natural language instructions was collected and analysed. A set of primitive actions was found with which the collected corpus could be represented. These primitive actions were then implemented as robot-executable procedures. Natural language instructions are under-specified when destined to be executed by a robot, because instructors omit information that they consider "commonsense" and rely on the listener's sensory-motor capabilities to determine the details of the task execution. In this thesis the under-specification problem is solved by determining the missing information, either during the learning of new routes or during their execution by the robot. During learning, the missing information is determined by imitating the commonsense approach human listeners take to achieve the same purpose. During execution, missing information, such as the location of road layout features mentioned in route instructions, is determined from the robot's view by using image template matching. The original contribution of this thesis, in both these methods, lies in the fact that they are driven by the natural language examples found in the corpus collected for the IBL project. During the testing phase, a high success rate of primitive calls, when these were considered individually, showed that the under-specification problem has overall been solved. A novel method for testing the primitive procedures, as part of complete route descriptions, is also proposed in this thesis: the performance of human subjects when driving the robot, following route descriptions, was compared with the performance of the robot when executing the same route descriptions. The results obtained from this comparison clearly indicated where errors occur, from the time when a human speaker gives a route description to the time when the task is executed by a human listener or by the robot. Finally, a software speed controller is proposed in this thesis to control the wheel speeds of the robot used in this project. The controller employs PI (Proportional and Integral) and PID (Proportional, Integral and Derivative) control and provides a good alternative to expensive hardware.
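    The thesis code is not reproduced in the abstract; a minimal discrete-time sketch of such a software speed controller, in Python, might look as follows. PI control is the special case kd = 0; the class name, gains, and sample time are illustrative assumptions, not the thesis implementation.

    # Minimal discrete-time PID wheel-speed controller sketch (illustrative;
    # not the thesis implementation). PI control falls out by setting kd = 0.
    class WheelSpeedController:
        def __init__(self, kp, ki, kd, dt, out_min=-1.0, out_max=1.0):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.dt = dt                 # control period in seconds
            self.integral = 0.0
            self.prev_error = 0.0
            self.out_min, self.out_max = out_min, out_max

        def update(self, target_speed, measured_speed):
            error = target_speed - measured_speed
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            u = self.kp * error + self.ki * self.integral + self.kd * derivative
            return max(self.out_min, min(self.out_max, u))  # clamp motor command

    # Example: PI control (kd = 0) of one wheel at 50 Hz.
    controller = WheelSpeedController(kp=0.8, ki=0.3, kd=0.0, dt=0.02)
    command = controller.update(target_speed=0.5, measured_speed=0.42)

    Clamping the output keeps the motor command in range; a production controller would also need anti-windup on the integral term itself.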

    Low-level grounding in a multimodal mobile service robot conversational system using graphical models

    The main task of a service robot with a voice-enabled communication interface is to engage a user in dialogue, providing access to the services it is designed for. In managing such interaction, inferring the user goal (intention) from the request for a service at each dialogue turn is the key issue. Under service robot deployment conditions, speech recognition limitations with noisy speech input and inexperienced users may jeopardize user goal identification. In this paper, we introduce a grounding-state-based model motivated by reducing the risk of communication failure due to incorrect user goal identification. The model exploits the multiple modalities available in the service robot system to provide evidence for reaching grounding states. For the speech input to be handled as sufficiently grounded (correctly understood) by the robot, four proposed states have to be reached. Bayesian networks combining speech and non-speech modalities during user goal identification are used to estimate the probability that each grounding state has been reached. These probabilities serve as a basis for detecting whether the user is attending to the conversation, as well as for deciding on an alternative input modality (e.g., buttons) when the speech modality is unreliable. The Bayesian networks used in the grounding model are specially designed for modularity and computationally efficient inference. The potential of the proposed model is demonstrated by comparing a conversational system for the mobile service robot RoboX employing only speech recognition for user goal identification with a system equipped with multimodal grounding. The evaluation experiments use component- and system-level metrics for technical (objective) and user-based (subjective) evaluation, with multimodal data collected during the conversations of the robot RoboX with users.
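    The paper's Bayesian networks are not specified in the abstract; the sketch below only illustrates the general idea of fusing one speech and one non-speech cue into a posterior for a single grounding state. The structure, variable names, and probabilities are invented for illustration.

    # Illustrative two-evidence Bayesian update for one grounding state
    # (hypothetical numbers; not the networks from the paper).

    # Prior that the user's goal was correctly understood at this turn.
    P_GROUNDED = 0.5

    # Likelihoods of each observation given grounded / not grounded.
    P_HIGH_ASR_CONF = {True: 0.85, False: 0.30}   # speech modality
    P_USER_FACING   = {True: 0.90, False: 0.50}   # non-speech modality

    def grounding_posterior(high_asr_conf, user_facing):
        """P(grounding state reached | the two observations), assuming the
        observations are conditionally independent given the state."""
        joint = {}
        for grounded in (True, False):
            prior = P_GROUNDED if grounded else 1.0 - P_GROUNDED
            p_asr = P_HIGH_ASR_CONF[grounded] if high_asr_conf else 1.0 - P_HIGH_ASR_CONF[grounded]
            p_face = P_USER_FACING[grounded] if user_facing else 1.0 - P_USER_FACING[grounded]
            joint[grounded] = prior * p_asr * p_face
        return joint[True] / (joint[True] + joint[False])

    # A low posterior could trigger the fallback modality, matching the
    # paper's idea of switching to buttons when speech is unreliable.
    p = grounding_posterior(high_asr_conf=False, user_facing=True)
    if p < 0.6:
        print(f"grounding uncertain (p={p:.2f}); switching to button input")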

    The 1990 progress report and future plans

    This document describes the progress and plans of the Artificial Intelligence Research Branch (RIA) at ARC in 1990. Activities span a range from basic scientific research to engineering development to fielded NASA applications, particularly those applications that are enabled by basic research carried out at RIA. Work is conducted in-house and through collaborative partners in academia and industry. Our major focus is on a limited number of research themes, with a dual commitment to technical excellence and proven applicability to NASA's short-, medium-, and long-term problems. RIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at JPL and with AI applications groups at all NASA centers.

    Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement

    We present a new method, PARsing And visual GrOuNding (ParaGon), for grounding natural language in object placement tasks. Natural language generally describes objects and spatial relations with compositionality and ambiguity, two major obstacles to effective language grounding. For compositionality, ParaGon parses a language instruction into an object-centric graph representation to ground objects individually. For ambiguity, ParaGon uses a novel particle-based graph neural network to reason about object placements under uncertainty. Essentially, ParaGon integrates a parsing algorithm into a probabilistic, data-driven learning framework. It is fully differentiable and trained end-to-end from data for robustness against complex, ambiguous language input. Comment: To appear in ICRA 2023
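    ParaGon's actual parser and particle-based graph neural network are not described here in enough detail to reproduce; the toy Python sketch below only illustrates what an object-centric graph for an instruction such as "put the cup between the plate and the bowl" might look like, so that each referred object can be grounded individually. All classes and fields are hypothetical, not the paper's representation.

    # Toy object-centric graph for a placement instruction (illustrative only;
    # not ParaGon's actual representation or parser). Requires Python 3.10+.
    from dataclasses import dataclass, field

    @dataclass
    class ObjectNode:
        name: str                       # referring expression for one object
        grounded_id: int | None = None  # index of the matched scene object

    @dataclass
    class PlacementGraph:
        target: ObjectNode                          # object to be placed
        relations: list = field(default_factory=list)  # (relation, node) edges

    # "Put the cup between the plate and the bowl" ->
    graph = PlacementGraph(target=ObjectNode("cup"))
    graph.relations.append(("between", ObjectNode("plate")))
    graph.relations.append(("between", ObjectNode("bowl")))

    # Each reference node is grounded to a detected object individually; the
    # relation edges then constrain the placement, which ParaGon represents
    # with particles rather than a single point estimate.
    for _, node in graph.relations:
        node.grounded_id = 0  # placeholder: a vision module would set this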

    Modeling Human-Robot-Interaction based on generic Interaction Patterns

    Peltason J. Modeling Human-Robot-Interaction based on generic Interaction Patterns. Bielefeld: Bielefeld University; 2014.

    Programming Robots by Demonstration using Augmented Reality

    The world is experiencing the fourth industrial revolution, Industry 4.0, marked by the increasing intelligence and automation of manufacturing systems. Nevertheless, some tasks are too complex or too expensive to be fully automated; it would be more efficient if the machine could work with the human, not only sharing the same workspace but also acting as a useful collaborator. A possible solution to this problem lies in human-robot interaction systems: understanding the applications where they can be helpful and the challenges they face. In this context, a better interaction between machines and operators can lead to multiple benefits, such as less, better, and easier training, a safer environment for the operator, and the capacity to solve problems more quickly. The topic of this dissertation is relevant because it is necessary to learn and implement the technologies that contribute most to simpler and more efficient work in industry. This dissertation proposes the development of an industrial prototype of a human-machine interaction system through Extended Reality (XR), in which the objective is to enable an industrial operator without any programming experience to program a collaborative robot using the Microsoft HoloLens 2. The system is divided into two parts: the tracking system, which records the operator's hand movement, and the translator of the programming-by-demonstration system, which builds the program to be sent to the robot to execute the task. The monitoring and supervision system runs on the Microsoft HoloLens 2, programmed using the Unity platform and Visual Studio. The core of the programming-by-demonstration system was developed in Robot Operating System (ROS). The robots included in this interface are the Universal Robots UR5 (collaborative robot) and the ABB IRB 2600 (industrial robot). Moreover, the interface was built to easily add other robots.
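    The dissertation's ROS pipeline is not shown in the abstract; as a minimal sketch under stated assumptions (a ROS 1 node and a hypothetical /hololens/hand_pose topic publishing geometry_msgs/PoseStamped), the recording side of such a programming-by-demonstration system might accumulate demonstrated hand poses as waypoints like this:

    #!/usr/bin/env python
    # Minimal sketch of the recording side of a programming-by-demonstration
    # pipeline in ROS 1 (illustrative; topic name and message type assumed).
    import rospy
    from geometry_msgs.msg import PoseStamped

    waypoints = []  # hand poses captured during the demonstration

    def moved(prev, curr, threshold=0.01):
        # Naive down-sampling filter: keep a pose only after ~1 cm of motion.
        dx = curr.pose.position.x - prev.pose.position.x
        dy = curr.pose.position.y - prev.pose.position.y
        dz = curr.pose.position.z - prev.pose.position.z
        return (dx * dx + dy * dy + dz * dz) ** 0.5 > threshold

    def on_hand_pose(msg):
        if not waypoints or moved(waypoints[-1], msg):
            waypoints.append(msg)

    if __name__ == "__main__":
        rospy.init_node("demonstration_recorder")
        rospy.Subscriber("/hololens/hand_pose", PoseStamped, on_hand_pose)
        rospy.spin()  # a translator node would later convert the waypoints
                      # into a trajectory for the UR5 or IRB 2600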