3 research outputs found

    Feature learning for multi-task inverse reinforcement learning

    In this paper we study the question of lifelong learning of behaviors from human demonstrations by an intelligent system. One approach is to model the observed demonstrations by a stationary policy. Inverse reinforcement learning, on the other hand, searches for a reward function that makes the observed policy close to optimal in the corresponding Markov decision process. This approach provides a model of the task solved by the demonstrator and has been shown to lead to better generalization in unknown contexts. However, both approaches focus on learning a single task from the expert demonstration. In this paper we propose a feature learning approach for inverse reinforcement learning in which several different tasks are demonstrated, but in which each task is modeled as a mixture of several simpler, primitive tasks. We present an algorithm based on alternating gradient descent that simultaneously learns a dictionary of primitive tasks (in the form of reward functions) and their combination into an approximation of the task underlying the observed behavior. We illustrate how this approach enables efficient reuse of knowledge from previous demonstrations. Namely, knowledge of tasks that were previously observed by the learner is used to improve the learning of a new composite behavior, thus achieving transfer of knowledge between tasks.
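The alternation described in this abstract can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes the per-task reward weights are given directly as a matrix `R` (in inverse reinforcement learning they are not observed and must be inferred from demonstrations), and it minimises a plain squared reconstruction error. The names `alternating_gd`, `squared_error`, `A` (mixing coefficients) and `D` (dictionary of primitive rewards) are hypothetical.

```python
import random

def squared_error(R, A, D):
    """0.5 * ||A D - R||^2 for list-of-lists matrices."""
    k = len(D)
    return 0.5 * sum(
        (sum(A[i][p] * D[p][j] for p in range(k)) - R[i][j]) ** 2
        for i in range(len(R)) for j in range(len(R[0]))
    )

def alternating_gd(R, k, steps=2000, lr=0.05, seed=0):
    """Approximate R (tasks x features) by A @ D using alternating gradient steps.

    Assumption: R is known. This only illustrates the alternation pattern.
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    A = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n)]
    D = [[rng.uniform(-0.5, 0.5) for _ in range(m)] for _ in range(k)]

    def residual():
        return [[sum(A[i][p] * D[p][j] for p in range(k)) - R[i][j]
                 for j in range(m)] for i in range(n)]

    for _ in range(steps):
        # Step 1: gradient step on the mixing coefficients A, dictionary D fixed.
        E = residual()
        for i in range(n):
            for p in range(k):
                A[i][p] -= lr * sum(E[i][j] * D[p][j] for j in range(m))
        # Step 2: gradient step on the dictionary D, coefficients A fixed.
        E = residual()
        for p in range(k):
            for j in range(m):
                D[p][j] -= lr * sum(E[i][j] * A[i][p] for i in range(n))
    return A, D

# Toy data: two primitive reward vectors and three tasks mixing them.
R = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.5, 0.0],
     [0.5, 0.5, 0.25, 0.25]]
A, D = alternating_gd(R, k=2)
```

In this toy setting the third task is an equal mixture of the first two, so a dictionary of two primitives is enough to reconstruct all three rows.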

    Movement primitives as a robotic tool to interpret trajectories through learning-by-doing

    Articulated movements are fundamental in many human and robotic tasks. While humans can learn and generalise arbitrarily long sequences of movements, and in particular can optimise them to fit the constraints and features of their body, robots are often programmed to execute precise but fixed point-to-point patterns. This study proposes a new approach to interpreting and reproducing articulated and complex trajectories as a set of known robot-based primitives. Instead of achieving accurate reproductions, the proposed approach aims at interpreting data in an agent-centred fashion, according to an agent's primitive movements. The method improves the accuracy of a reproduction with an incremental process that first seeks a rough approximation by capturing the most essential features of a demonstrated trajectory. Observing the discrepancy between the demonstrated and reproduced trajectories, the process then proceeds with incremental decompositions and new searches in sub-optimal parts of the trajectory. The aim is to achieve an agent-centred interpretation and progressive learning that fits the robot's capability in the first place, as opposed to a data-centred decomposition analysis. Tests on both geometric and human-generated trajectories reveal that the use of the agent's own primitives results in remarkable robustness and generalisation properties of the method. In particular, because trajectories are understood and abstracted by means of agent-optimised primitives, the method has two main features: 1) Reproduced trajectories are general and represent an abstraction of the data. 2) The algorithm is capable of reconstructing highly noisy or corrupted data without pre-processing thanks to an implicit and emergent noise suppression and feature detection. This study suggests a novel bio-inspired approach to interpreting, learning and reproducing articulated movements and trajectories. Possible applications include drawing, writing, movement generation, object manipulation, and other tasks where the performance requires human-like interpretation and generalisation capabilities.

    Entwicklung von Methoden zur Unterscheidung und Interpretation von Bewegungsmustern in dynamischen Szenen (Development of methods for distinguishing and interpreting movement patterns in dynamic scenes)

    In the research field of mobile assistive robots, motion sequences play an increasingly important role. The movements of the person interacting with the assistive robot in particular carry a good deal of information that can be used to improve the interaction. An important question in the analysis of movements is how to represent the motion trajectories. It must also be clarified which similarity measures can be used in the more complex methods and which special requirements they must satisfy. The core of the work consists of three methods that, in essence, can predict the further course of an observed movement over a longer period of time: echo state networks, local models, and spatio-temporal non-negative matrix factorization (NMF). The work as a whole sees itself as one of the first steps towards a systematic investigation of motion sequences. It is intended to enable a developer to choose, from a broad palette of tools, the right one for their specific application.
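For readers unfamiliar with NMF, the following is a minimal sketch of plain non-negative matrix factorization using the standard Lee-Seung multiplicative updates for the squared error. It is not the spatio-temporal variant named in the abstract and performs no trajectory prediction. The matrix `V`, the helper names and the choice of `k` are assumptions made purely for illustration.

```python
import random

def matmul(X, Y):
    """Multiply two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, k, steps=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n x m) as W (n x k) times H (k x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy non-negative data; rows could stand for time steps, columns for features.
V = [[1.0, 0.0, 2.0],
     [2.0, 0.0, 4.0],
     [0.0, 3.0, 1.0],
     [1.0, 3.0, 3.0]]
W, H = nmf(V, k=2)
```

Because the updates only multiply by non-negative ratios, `W` and `H` stay non-negative as long as the initial values and `V` are non-negative.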