
    Relational reinforcement learning with guided demonstrations

    This manuscript version is made available under the CC-BY-NC-ND 4.0 license: http://creativecommons.org/licenses/by-nc-nd/4.0

    Model-based reinforcement learning is a powerful paradigm for learning tasks in robotics. However, in-depth exploration is usually required and the actions have to be known in advance. We therefore propose a novel algorithm that integrates the option of requesting teacher demonstrations to learn new domains with fewer action executions and no prior knowledge. Demonstrations allow new actions to be learned and greatly reduce the amount of exploration required, but they are only requested when they are expected to yield a significant improvement, because the teacher's time is considered more valuable than the robot's time. Moreover, selecting the appropriate action to demonstrate is not an easy task, so some guidance is provided to the teacher. The rule-based model is analyzed to determine the parts of the state that may be incomplete, and to provide the teacher with a set of possible problems for which a demonstration is needed. Rule analysis is also used to find better alternative models and to complete subgoals before requesting help, thereby minimizing the number of requested demonstrations. These improvements were demonstrated in a set of experiments, which included domains from the international planning competition and a robotic task. Adding teacher demonstrations and rule analysis reduced the amount of exploration required by up to 60% in some domains, and improved the success ratio by 35% in others.
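    To make the cost trade-off concrete, the sketch below shows one way the demonstration-request criterion described above could be expressed: a demonstration is requested only when its expected improvement outweighs the higher cost assigned to the teacher's time. The function name, the cost model, and the 5:1 cost ratio are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: request a teacher demonstration only when it is
# expected to help enough to justify the teacher's (more expensive) time.

def should_request_demo(expected_gain_demo, expected_gain_explore,
                        teacher_cost=5.0, robot_cost=1.0):
    """Return True when a teacher demonstration is worth its cost.

    expected_gain_demo    -- estimated learning progress from one demonstration
    expected_gain_explore -- estimated progress from one autonomous action
    teacher_cost          -- cost of one demonstration (teacher time is
                             weighted higher than robot time)
    robot_cost            -- cost of one autonomous action execution
    """
    return expected_gain_demo / teacher_cost > expected_gain_explore / robot_cost

# Example: with a 5:1 cost ratio, a demonstration is requested only if it is
# expected to help at least 5x as much as one autonomous exploration step.
print(should_request_demo(expected_gain_demo=8.0, expected_gain_explore=1.0))
```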

    Learning relational models with human interaction for planning in robotics

    Automated planning has proven useful for solving problems where an agent has to maximize a reward function by executing actions. As planners have been improved to solve more expressive and difficult problems, there is increasing interest in using planning to improve efficiency in robotic tasks. However, planners rely on a domain model, which has to be either handcrafted or learned. Although learning domain models can be very costly, recent approaches provide generalization capabilities and integrate human feedback to reduce the amount of experience required to learn. In this thesis we propose new methods that allow an agent with no previous knowledge to solve certain problems more efficiently by using task planning. First, we show how to apply probabilistic planning to improve robot performance in manipulation tasks (such as cleaning dirt or clearing tableware from a table). Planners obtain sequences of actions that get the best result in the long term, beating reactive strategies. Second, we introduce new reinforcement learning algorithms where the agent can actively request demonstrations from a teacher to learn new actions and speed up the learning process. In particular, we propose an algorithm that allows the user to set the minimum quality to be achieved, where a higher quality also implies that a larger number of demonstrations will be requested. Moreover, the learned model is analyzed to extract the unlearned or problematic parts of the model. This information allows the agent to provide guidance to the teacher when a demonstration is requested, and to avoid irrecoverable errors. Finally, a new domain model learner is introduced that, in addition to relational probabilistic action models, can also learn exogenous effects. This learner can be integrated with existing planners and reinforcement learning algorithms to solve a wide range of problems. In summary, we improve the use of learning and task planning to solve unknown tasks. The improvements allow an agent to obtain a larger benefit from planners, learn faster, balance the number of action executions and teacher demonstrations, avoid irrecoverable errors, interact with a teacher to solve difficult problems, and adapt to the behavior of other agents by learning their dynamics. All the proposed methods were compared with state-of-the-art approaches, and were also demonstrated in different scenarios, including challenging robotic tasks.
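    The relational probabilistic action models the thesis learns can be pictured as precondition-outcome rules. Below is a minimal sketch of that kind of rule (in the spirit of noisy relational rules); the dataclass fields and the example blocks-world rule are illustrative assumptions, not the thesis's exact representation.

```python
# Hypothetical sketch of a relational probabilistic action rule.
from dataclasses import dataclass

@dataclass
class Rule:
    action: str          # action name with variable arguments
    preconditions: list  # literals that must hold for the rule to apply
    outcomes: list       # (probability, effect literals) pairs

# Example: picking up a block succeeds 90% of the time; otherwise nothing changes.
pickup = Rule(
    action="pickup(X)",
    preconditions=["clear(X)", "ontable(X)", "handempty()"],
    outcomes=[
        (0.9, ["holding(X)", "not ontable(X)", "not handempty()"]),
        (0.1, []),       # noise outcome: the state is unchanged
    ],
)

def show_outcomes(rule):
    """Print each outcome with its probability (a planner would sample these)."""
    for prob, effects in rule.outcomes:
        print(f"{rule.action}: p={prob} -> {effects or 'no change'}")

show_outcomes(pickup)
```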

    Hybrid approaches for mobile robot navigation

    The work described in this thesis contributes to the efficient solution of mobile robot navigation problems. A series of new evolutionary approaches is presented. Two novel evolutionary planners have been developed that reduce the computational overhead in generating plans of mobile robot movements. In comparison with the best-performing evolutionary scheme reported in the literature, the first of the planners significantly reduces the plan calculation time in static environments. The second planner was able to generate avoidance strategies in response to unexpected events arising from the presence of moving obstacles. To overcome limitations in responsiveness and the unrealistic assumptions regarding a priori knowledge that are inherent in planner-based navigation systems, subsequent work concentrated on hybrid approaches. These included a reactive component to rapidly and autonomously identify environmental features that were represented by a small number of critical waypoints. Not only is memory usage dramatically reduced by such a simplified representation, but the calculation time to determine new plans is also significantly reduced. Further significant enhancements of this work were, firstly, dynamic avoidance to limit the likelihood of potential collisions with moving obstacles and, secondly, exploration to identify statistically the dynamic characteristics of the environment. Finally, by retaining more extensive environmental knowledge gained during previous navigation activities, the capability of the hybrid navigation system was enhanced to allow planning to be performed for any start point and goal point.
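    As a rough illustration of how an evolutionary planner over waypoint paths operates, the sketch below evaluates candidate paths by length plus an obstacle penalty and applies a trivial (1+1) mutation step. The grid map, penalty weights, and mutation scheme are illustrative assumptions, not the thesis's planners.

```python
# Hypothetical sketch of one evolutionary step over a waypoint path.
import math
import random

OBSTACLES = {(2, 2), (2, 3), (3, 2)}   # occupied grid cells (assumed map)

def path_fitness(path):
    """Lower is better: total Euclidean length plus an obstacle penalty."""
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    collisions = sum(1 for p in path if p in OBSTACLES)
    return length + 100.0 * collisions

def mutate(path):
    """Jitter one intermediate waypoint (start and goal stay fixed)."""
    if len(path) <= 2:
        return path
    i = random.randrange(1, len(path) - 1)
    x, y = path[i]
    new = list(path)
    new[i] = (x + random.choice((-1, 0, 1)), y + random.choice((-1, 0, 1)))
    return new

# One generation of a (1+1) evolution strategy: keep the mutant if it is fitter.
path = [(0, 0), (2, 2), (5, 5)]
candidate = mutate(path)
if path_fitness(candidate) < path_fitness(path):
    path = candidate
print(path, path_fitness(path))
```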

    Razonamiento basado en casos aplicado a la planificación heurística

    Automated Planning is a branch of Artificial Intelligence that studies how to build totally or partially ordered sequences of actions. These sequences, called plans, transform the state of the environment with the aim of achieving a given set of goals. Heuristic planning is a recent planning paradigm that solves problems with search algorithms guided by an evaluation function called a heuristic. Heuristic planning remains one of the top approaches today, mainly because domain-independent heuristics can be built with an automated procedure. These heuristics have two drawbacks. The first is their computational cost, which imposes size restrictions on the problems that can be solved within a reasonable time. The second is the poor guidance the heuristics give to the algorithms in some domains, which can cause the search for a solution to be fruitless. Consequently, part of the research community focuses on applying machine learning techniques that were used in the past within other planning paradigms, with the aim of improving planner efficiency. The objective of this dissertation is to build a case-based reasoning system that supports the search process of a heuristic planner. We integrate the knowledge given by domain case bases as search control, and evaluate it empirically on a set of domains whose diversity allows the technique to be validated. We also analyze the learned knowledge in order to find relations between the domain-specific information being gathered and the improvements obtained by the planners in terms of time or plan quality.
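    A minimal sketch of the case-based reasoning idea above: retrieve the stored case whose problem most resembles the new one and reuse its plan as search guidance. The similarity measure (shared goal literals) and the case format are illustrative assumptions, not the dissertation's actual system.

```python
# Hypothetical sketch of case retrieval to support heuristic search.

case_base = [
    {"goals": {"on(a,b)", "on(b,c)"}, "plan": ["unstack(b)", "stack(b,c)", "stack(a,b)"]},
    {"goals": {"clear(a)"},           "plan": ["unstack(b)"]},
]

def retrieve(goals):
    """Return the case sharing the most goal literals with the new problem."""
    return max(case_base, key=lambda c: len(c["goals"] & goals))

# The retrieved plan can seed or bias the heuristic planner's search,
# steering it in domains where the heuristic alone gives poor guidance.
new_goals = {"on(a,b)", "clear(c)"}
case = retrieve(new_goals)
print("reused guidance:", case["plan"])
```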

    Grounded Semantic Reasoning for Robotic Interaction with Real-World Objects

    Robots are increasingly transitioning from specialized, single-task machines to general-purpose systems that operate in unstructured environments, such as homes, offices, and warehouses. In these real-world domains, robots need to manipulate novel objects while adapting to changes in environments and goals. Semantic knowledge, which concisely describes target domains with symbols, can potentially reveal the meaningful patterns shared between problems and environments. However, existing robots are yet to effectively reason about semantic data encoding complex relational knowledge, or to jointly reason about symbolic semantic data and multimodal data pertinent to robotic manipulation (e.g., object point clouds, 6-DoF poses, and attributes detected with multimodal sensing). This dissertation develops semantic reasoning frameworks capable of modeling complex semantic knowledge grounded in robot perception and action. We show that grounded semantic reasoning enables robots to more effectively perceive, model, and interact with objects in real-world environments. Specifically, this dissertation makes the following contributions: (1) a survey providing a unified view for the diversity of works in the field by formulating semantic reasoning as the integration of knowledge sources, computational frameworks, and world representations; (2) a method for predicting missing relations in large-scale knowledge graphs by leveraging type hierarchies of entities, effectively avoiding ambiguity while maintaining generalization of multi-hop reasoning patterns; (3) a method for predicting unknown properties of objects in various environmental contexts, outperforming prior knowledge graph and statistical relational learning methods due to the use of n-ary relations for modeling object properties; (4) a method for purposeful robotic grasping that accounts for a broad range of contexts (including object visual affordance, material, state, and task constraint), outperforming existing approaches in novel contexts and for unknown objects; (5) a systematic investigation into the generalization of task-oriented grasping that includes a benchmark dataset of 250k grasps, and a novel graph neural network that incorporates semantic relations into end-to-end learning of 6-DoF grasps; (6) a method for rearranging novel objects into semantically meaningful spatial structures based on high-level language instructions, more effectively capturing multi-object spatial constraints than existing pairwise spatial representations; (7) a novel planning-inspired approach that iteratively optimizes placements of partially observed objects subject to both physical constraints and semantic constraints inferred from language instructions.
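    To illustrate the flavor of contribution (2), the sketch below uses an entity type hierarchy to prune implausible relations between two entities, the way type constraints can disambiguate knowledge-graph completion. The toy hierarchy, relation signatures, and entity names are illustrative assumptions, not the dissertation's model.

```python
# Hypothetical sketch: type hierarchies constrain which relations are plausible.

parents = {"mug": "container", "container": "object",
           "knife": "tool", "tool": "object"}

def is_a(entity, type_):
    """Walk the type hierarchy upward to test type membership."""
    while entity is not None:
        if entity == type_:
            return True
        entity = parents.get(entity)
    return False

# Each relation constrains the types of its subject and object.
relation_signatures = {"contains": ("container", "object"),
                       "cuts": ("tool", "object")}

def plausible_relations(subject, obj):
    """Keep only relations whose type signature both arguments satisfy."""
    return [r for r, (s_t, o_t) in relation_signatures.items()
            if is_a(subject, s_t) and is_a(obj, o_t)]

print(plausible_relations("mug", "knife"))    # ['contains']
print(plausible_relations("knife", "mug"))    # ['cuts']
```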

    Efficient Reinforcement Learning for Autonomous Navigation

    More and more authors regard the concept of rational agents as central to the study of artificial intelligence. The goal of this work was to advance that approach: to design, implement, and test a rational robot agent in several real-world environments. The robot agent is to learn the solution to a challenging navigation problem on its own. The focus is not on building a map of the environment, but on developing methods that allow the agent to solve the navigation problem autonomously in different environments and to continuously improve the solutions it finds. Many methods of modern artificial intelligence, such as neural networks, evolutionary algorithms, and reinforcement learning, are applied in this work. The well-known reinforcement learning method is used in developing the agents. By incorporating information that was available but previously unused, the learning process becomes more efficient. Furthermore, the architecture applied in the rational agent significantly reduces the number of decision steps required to solve the task, which results in a further increase in the efficiency of the learning process. Equipped with a suitable architecture and efficient learning methods, the rational agent can learn its route directly in the real world and improve it after every run.
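    To make the reinforcement-learning setup concrete, the sketch below runs standard tabular Q-learning on a toy grid navigation task. The grid size, rewards, and hyperparameters are illustrative assumptions, not the thesis's agent or architecture.

```python
# Hypothetical sketch: tabular Q-learning on a 5x5 grid navigation task.
import random
from collections import defaultdict

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # four grid moves
GOAL = (4, 4)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

Q = defaultdict(float)                          # Q[(state, action)]

def step(state, action):
    """Move on the grid; reaching the goal yields reward 1, else -0.01."""
    nx = min(max(state[0] + action[0], 0), 4)
    ny = min(max(state[1] + action[1], 0), 4)
    nxt = (nx, ny)
    return nxt, (1.0 if nxt == GOAL else -0.01), nxt == GOAL

for episode in range(500):
    s = (0, 0)
    done = False
    while not done:
        # Epsilon-greedy action selection.
        a = random.choice(ACTIONS) if random.random() < EPS else \
            max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Standard one-step Q-learning update.
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

print("learned value at start:", max(Q[((0, 0), act)] for act in ACTIONS))
```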

    Learning relational navigation policies

    Navigation is one of the fundamental tasks for a mobile robot. The majority of path planning approaches have been designed to solve the given problem entirely from scratch, given the current and goal configurations of the robot. Although these approaches yield highly efficient plans, the computed policies typically do not transfer to other, similar tasks. We propose to learn relational decision trees as abstract navigation strategies from example paths. Relational abstraction has several interesting and important properties. First, it allows a mobile robot to generalize navigation plans from specific examples provided by users or exploration. Second, the navigation policy learned in one environment can be transferred to unknown environments. In several experiments with real robots in a real environment and in simulated runs, we demonstrate the usefulness of our approach. © 2006 IEEE.
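    The sketch below shows why a relational policy transfers: decisions branch on relations between the robot, its current room, and the goal rather than on metric coordinates, so the same tree applies in any map providing those relations. The predicates and the hand-written tree are illustrative assumptions, not the learned trees from the paper.

```python
# Hypothetical sketch of an abstract, relational navigation policy.

def policy(state):
    """A tiny relational decision tree mapping abstract states to actions."""
    if state["in_goal_room"]:
        return "move_to(goal)"
    if state["connected(current_room, goal_room)"]:
        return "cross_doorway(goal_room)"
    return "move_to(nearest_doorway)"

# The same abstract state description works in any environment that can
# supply these relations, which is what enables policy transfer.
print(policy({"in_goal_room": False,
              "connected(current_room, goal_room)": True}))
```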