159 research outputs found
Data-Driven Imitation Learning for a Shopkeeper Robot with Periodically Changing Product Information
Data-driven imitation learning enables service robots to learn social interaction behaviors, but these systems cannot adapt after training to changes in the environment, such as changing products in a store. To solve this, a novel learning system that uses neural attention and approximate string matching to copy information from a product information database to its output is proposed. A camera shop interaction dataset was simulated for training/testing. The proposed system was found to outperform a baseline and a previous state of the art in an offline, human-judged evaluation
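The database lookup this abstract describes can be illustrated with a small sketch. It covers only the approximate string matching step, using Python's standard `difflib`; the neural attention component is not shown. The product names, fields, cutoff value, and the `lookup_attribute` helper are hypothetical and do not come from the paper.

```python
import difflib

# Hypothetical product database; names and values are illustrative only.
PRODUCT_DB = {
    "Alpha Zoom 200": {"price": "550", "weight": "280 g"},
    "Beta Compact 50": {"price": "400", "weight": "290 g"},
    "Gamma Pro 900": {"price": "950", "weight": "480 g"},
}


def lookup_attribute(mention, attribute, cutoff=0.6):
    """Return `attribute` of the product whose name best matches `mention`.

    Matching is case-insensitive and tolerant of small spelling errors.
    Returns None when no product name is similar enough.
    """
    lowered = {name.lower(): name for name in PRODUCT_DB}
    matches = difflib.get_close_matches(
        mention.lower(), list(lowered), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    return PRODUCT_DB[lowered[matches[0]]].get(attribute)
```

Because the answer is copied from the database at query time, updating an entry in `PRODUCT_DB` changes the output without retraining anything, which is the property the abstract emphasizes.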
Fast Lifelong Adaptive Inverse Reinforcement Learning from Demonstrations
Learning from Demonstration (LfD) approaches empower end-users to teach
robots novel tasks via demonstrations of the desired behaviors, democratizing
access to robotics. However, current LfD frameworks are not capable of fast
adaptation to heterogeneous human demonstrations nor the large-scale deployment
in ubiquitous robotics applications. In this paper, we propose a novel LfD
framework, Fast Lifelong Adaptive Inverse Reinforcement learning (FLAIR). Our
approach (1) leverages learned strategies to construct policy mixtures for fast
adaptation to new demonstrations, allowing for quick end-user personalization,
(2) distills common knowledge across demonstrations, achieving accurate task
inference; and (3) expands its model only when needed in lifelong deployments,
maintaining a concise set of prototypical strategies that can approximate all
behaviors via policy mixtures. We empirically validate that FLAIR achieves
adaptability (i.e., the robot adapts to heterogeneous, user-specific task
preferences), efficiency (i.e., the robot achieves sample-efficient
adaptation), and scalability (i.e., the model grows sublinearly with the number
of demonstrations while maintaining high performance). FLAIR surpasses
benchmarks across three control tasks with an average 57% improvement in policy
returns and an average 78% fewer episodes required for demonstration modeling
using policy mixtures. Finally, we demonstrate the success of FLAIR in a table
tennis task and find users rate FLAIR as having higher task (p<.05) and
personalization (p<.05) performance
Reinforcement Learning from Demonstration
Off-the-shelf Reinforcement Learning (RL) algorithms suffer from slow learning performance, partly because they are expected to learn a task from scratch merely through an agent's own experience. In this thesis, we show that learning from scratch is a limiting factor for learning performance, and that RL agents can learn a task faster when prior knowledge is available. We evaluate relevant previous work and our own algorithms in various experiments. Our first contribution is the first implementation and evaluation of an existing interactive RL algorithm in a real-world domain with a humanoid robot. Interactive RL had previously been evaluated in a simulated domain, which motivated us to evaluate its practicality on a robot. Our evaluation shows that guidance reduces learning time, and that its positive effects increase with state space size. A natural follow-up question after our first evaluation was how other previous works compare to interactive RL. Our second contribution is an analysis of a user study in which naive human teachers demonstrated a real-world object-catching task with a humanoid robot. We present the first comparison of several previous works in a common real-world domain with a user study. One conclusion of the user study was the high potential of RL despite poor usability due to its slow learning rate. As an effort to improve the learning efficiency of RL learners, our third contribution is a novel human-agent knowledge transfer algorithm. Using demonstrations from three teachers with varying expertise in a simulated domain, we show that, regardless of skill level, human demonstrations can improve the asymptotic performance of an RL agent. As an alternative approach for encoding human knowledge in RL, we investigated the use of reward shaping. Our final contributions are the Static Inverse Reinforcement Learning Shaping and Dynamic Inverse Reinforcement Learning Shaping algorithms, which use human demonstrations to recover a shaping reward function. 
Our experiments in simulated domains show that our approach outperforms the state-of-the-art in cumulative reward, learning rate and asymptotic performance. Overall we show that human demonstrators with varying skills can help RL agents to learn tasks more efficiently
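As a rough illustration of reward shaping in general, the sketch below uses the standard potential-based form, in which the shaping term is the discounted difference of a potential function between successive states. The abstract does not say how the Static and Dynamic IRL Shaping algorithms build their shaping reward, so the potential function here is a hand-written stand-in and all names are assumptions.

```python
def shaped_reward(reward, potential, state, next_state, gamma=0.99):
    """Add a potential-based shaping term to the environment reward.

    The shaping term is gamma * potential(next_state) - potential(state).
    """
    return reward + gamma * potential(next_state) - potential(state)


# Hypothetical potential: states closer to a goal at position 10 score higher.
# In a demonstration-driven setting this function would be learned, not hand-written.
def distance_potential(state, goal=10):
    return -abs(goal - state)
```

With `gamma=1.0`, a step from state 3 to state 4 earns a shaping bonus of 1.0 on top of the environment reward, since the agent moved one unit closer to the goal.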
The Future of Humanoid Robots
This book provides state of the art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines, and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information to those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and his experience is reflected in editing the content of the book
Mutual reinforcement learning to improve robots as trainers
Recently, collaborative robots have begun to train humans to achieve complex tasks, and the mutual information exchange between them can lead to successful robot-human collaborations. In this thesis we demonstrate the application and effectiveness of a new approach called mutual reinforcement learning (MRL), where both humans and autonomous agents act as reinforcement learners in a skill transfer scenario over continuous communication and feedback. An autonomous agent initially acts as an instructor who can teach a novice human participant complex skills using the MRL strategy. While teaching skills in a physical (block-building) or simulated (Tetris) environment, the expert tries to identify appropriate reward channels preferred by each individual and adapts itself accordingly using an exploration-exploitation strategy. These reward channel preferences can identify important behaviors of the human participants, because they may well exercise the same behaviors in similar situations later. In this way, skill transfer takes place between an expert system and a novice human operator. We divided the subject population into three groups and observed the skill transfer phenomenon, analyzing it with Simpson's psychometric model. 5-point Likert scales were also used to identify the cognitive models of the human participants. We obtained a shared cognitive model which not only improves human cognition but also enhances the robot's cognitive strategy to understand the mental model of its human partners while building a successful robot-human collaborative framework
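One simple way to picture the exploration-exploitation step over reward channels is an epsilon-greedy selector, sketched below. The abstract does not name the strategy the thesis uses, so epsilon-greedy is an assumption, as are the channel names, the numeric feedback signal, and the `ChannelSelector` class.

```python
import random


class ChannelSelector:
    """Epsilon-greedy choice among reward channels (illustrative only)."""

    def __init__(self, channels, epsilon=0.1, seed=None):
        self.channels = list(channels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {c: 0 for c in self.channels}
        self.values = {c: 0.0 for c in self.channels}

    def choose(self):
        # Explore a random channel with probability epsilon,
        # otherwise exploit the channel with the best average feedback so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.channels)
        return max(self.channels, key=lambda c: self.values[c])

    def update(self, channel, feedback):
        # Incremental running mean of the feedback observed for this channel.
        self.counts[channel] += 1
        n = self.counts[channel]
        self.values[channel] += (feedback - self.values[channel]) / n
```

A separate selector per participant would let the instructor settle on different channels for different people, which matches the per-individual adaptation the abstract describes.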
Review of the techniques used in motor-cognitive human-robot skill transfer
Conventional robot programming methods severely limit the reusability of skills during development. Engineers programme a robot in a targeted manner for the realisation of predefined skills. The low reusability of general-purpose robot skills is mainly reflected in an inability to handle novel and complex scenarios. Skill transfer aims to transfer human skills to general-purpose manipulators or mobile robots to replicate human-like behaviours. Skill transfer methods that are commonly used at present, such as learning from demonstration (LfD) or imitation learning, endow the robot with the expert's low-level motor and high-level decision-making ability, so that skills can be reproduced and generalised according to perceived context. The improvement of robot cognition usually relates to an improvement in the autonomous high-level decision-making ability. Based on the idea of establishing a generic or specialised robot skill library, robots are expected to autonomously reason about the needs for using skills and plan compound movements according to sensory input. In recent years, many successful studies in this area have demonstrated their effectiveness. Herein, a detailed review is provided on the transferring techniques of skills, applications, advancements, and limitations, especially in LfD. Future research directions are also suggested
"There should be some kind of a deeper meaning for English language in classes as well": examination of Finnish upper secondary school students' views on applying video games in English language education from sociocultural and ecological perspectives
The purpose of this master's thesis is to examine how the contemporary sociocultural and ecological learning theories applied in the National Core Curriculum support the use of video games in language learning and teaching, and how students themselves perceive language learning and the role of video games in their learning process. Moreover, this study examines how the use of video games in language learning, and the video game Detroit: Become Human in particular, can promote students' learning process and, more specifically, what kinds of affordances playing video games can offer students who participate in this kind of activity. The method applied in this study is qualitative content analysis. The data for this study was gathered from a learning experiment that consisted of a gaming session and an interview with a small group of upper secondary school students. That is, the research materials were the game Detroit: Become Human, film recordings from the gaming session and an audio recording from the interview. The analysis of the data revealed that students value personal motivation and meaningful learning that steers away from textbooks and written exams. Furthermore, it appeared that playing video games in class promotes students' learning process by providing an engaging and interactive environment for cross-curricular learning. Thus, affordances that arise from this kind of environment offer students a learning environment that evokes various emotions and reflections among them and encourages discussion among peers
Children's perception and interpretation of robots and robot behaviour
The world of robotics, like that of all technology, is changing rapidly (Melson, et al., 2009).
As part of an inter-disciplinary project investigating the emergence of artificial culture in
robot societies, this study set out to examine children's perception of robots and interpretation
of robot behaviour. This thesis is situated in an interdisciplinary field of human-robot
interactions, drawing on research from the disciplines of sociology and psychology as well as
the fields of engineering and ethics. The study was divided into four phases: phase one
involved children from two primary schools drawing a picture and writing a story about their
robot. In phase two, children observed e-puck robots interacting. Children were asked
questions regarding the function and purpose of the robots' actions. Phase three entailed data
collection at a public event: Manchester Science Festival. Three activities at the festival: "XRay
Art Under Your Skin", "Swarm Robots" and "Build-a-Bugbot" formed the focus of this
phase. In the first activity, children were asked to draw the components of a robot and were
then asked questions about their drawings. During the second exercise, children's comments
were noted as they watched e-puck robot demonstrations. In the third exercise, children were
shown images and asked whether these images were a robot or a "no-bot". They were then
prompted to provide explanations for their answers.
Phase 4 of the research involved children identifying patterns of behaviour amongst e-pucks.
This phase of the project was undertaken as a pilot for the "open science" approach to
research to be used by the wider project within which this PhD was nested. Consistent with
existing literature, children endowed robots with animate and inanimate characteristics
holding multiple understandings of robots simultaneously. The notion of control appeared to
be important in children's conception of animacy. The results indicated that children's
perceptions of the locus of control play an important role in whether they
view robots as autonomous agents or controllable entities. The ways in which children
perceive robots and robot behaviour, in particular the ways in which children give meaning to
robots and robot behaviour will potentially come to characterise a particular generation.
Therefore, research should not only concentrate on the impact of these technologies on
children but should focus on capturing children's perceptions and viewpoints to better
understand the impact of the changing technological world on the lives of children
Computational Methods for Cognitive and Cooperative Robotics
In the last decades design methods in control engineering made substantial progress in
the areas of robotics and computer animation. Nowadays these methods incorporate the
newest developments in machine learning and artificial intelligence. But the problems
of flexible and online-adaptive combinations of motor behaviors remain challenging for
human-like animations and for humanoid robotics. In this context, biologically-motivated
methods for the analysis and re-synthesis of human motor programs provide new insights
in and models for the anticipatory motion synthesis.
This thesis presents the author's achievements in the areas of cognitive and developmental robotics, cooperative and humanoid robotics and intelligent and machine learning methods in computer graphics. The first part of the thesis in the chapter "Goal-directed Imitation for Robots" considers imitation learning in cognitive and developmental robotics.
The work presented here details the author's progress in the development of hierarchical
motion recognition and planning inspired by recent discoveries of the functions of mirror-neuron cortical circuits in primates. The overall architecture is capable of "learning for
imitation" and "learning by imitation". The complete system includes a low-level real-time
capable path planning subsystem for obstacle avoidance during arm reaching. The learning-based path planning subsystem is universal for all types of anthropomorphic robot arms, and is capable of knowledge transfer at the level of individual motor acts.
Next, the problems of learning and synthesis of motor synergies, the spatial and spatio-temporal combinations of motor features in sequential multi-action behavior, and the
problems of task-related action transitions are considered in the second part of the thesis
"Kinematic Motion Synthesis for Computer Graphics and Robotics". In this part, a new
approach of modeling complex full-body human actions by mixtures of time-shift invariant
motor primitives is presented. The online-capable full-body motion generation architecture
based on dynamic movement primitives driving the time-shift invariant motor synergies
was implemented as an online-reactive adaptive motion synthesis for computer graphics
and robotics applications.
The last chapter of the thesis, entitled "Contraction Theory and Self-organized Scenarios
in Computer Graphics and Robotics", is dedicated to optimal control strategies in multi-agent scenarios of large crowds of agents expressing highly nonlinear behaviors. This last
part presents new mathematical tools for stability analysis and synthesis of multi-agent
cooperative scenarios.
- …