
    The Assistive Multi-Armed Bandit

    Learning the preferences implicit in the choices humans make is a well-studied problem in both economics and computer science. However, most work assumes that humans act (noisily) optimally with respect to their preferences; such approaches can fail when people are themselves still learning about what they want. In this work, we introduce the assistive multi-armed bandit, in which a robot assists a human playing a bandit task to maximize cumulative reward. In this problem, the human does not know the reward function but can learn it through the rewards received from arm pulls; the robot observes only which arms the human pulls, not the reward associated with each pull. We give necessary and sufficient conditions for successfully assisting the human in this framework. Surprisingly, better human performance in isolation does not necessarily lead to better performance when assisted by the robot: a human policy can do better by effectively communicating its observed rewards to the robot. We conduct proof-of-concept experiments that support these results. We see this work as contributing towards a theory behind algorithms for human-robot interaction.
    Comment: Accepted to HRI 201
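
    To make the setting concrete, here is a minimal simulation sketch of an assistive bandit, assuming a Bernoulli bandit, an epsilon-greedy human, and a robot that simply imitates the human's most-pulled arm; the names and update rules are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy instance of the assistive bandit (illustrative only).
K = 5                              # number of arms
true_means = rng.uniform(size=K)   # unknown Bernoulli reward means

# Human: epsilon-greedy learner who privately observes her own rewards.
human_counts = np.zeros(K)
human_sums = np.zeros(K)
eps = 0.2

# Robot: sees only which arm the human pulled, never the reward, so it
# can only infer preferences from empirical pull frequencies.
pull_counts = np.zeros(K)

total_reward = 0.0
for t in range(1000):
    # Human pulls an arm and observes its reward.
    if rng.random() < eps or human_counts.min() == 0:
        arm = int(rng.integers(K))
    else:
        arm = int(np.argmax(human_sums / human_counts))
    r = float(rng.random() < true_means[arm])
    human_counts[arm] += 1
    human_sums[arm] += r

    # Robot observes only the choice and plays the modal arm so far.
    pull_counts[arm] += 1
    robot_arm = int(np.argmax(pull_counts))
    total_reward += float(rng.random() < true_means[robot_arm])

print("robot's cumulative reward:", total_reward)
print("best arm:", int(np.argmax(true_means)), "robot's pick:", robot_arm)
```

    Even this naive robot exhibits the key information asymmetry from the abstract: it must infer arm quality purely from pull frequencies, so a human policy that pulls more "communicatively" can help it more than a privately optimal one.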

    A biologically inspired meta-control navigation system for the Psikharpax rat robot

    A biologically inspired navigation system for the mobile rat-like robot Psikharpax is presented, allowing for self-localization and autonomous navigation in an initially unknown environment. The ability of parts of the model (e.g., the strategy selection mechanism) to reproduce rat behavioral data in various maze tasks has been validated before in simulation, but the model's capacity to work on a real robot platform had not been tested. This paper presents our work on implementing, on the Psikharpax robot, two independent navigation strategies (a place-based planning strategy and a cue-guided taxon strategy) together with a strategy-selection meta-controller. We show how the robot can memorize which strategy was optimal in each situation by means of a reinforcement learning algorithm. Moreover, a context detector enables the controller to quickly adapt to changes in the environment, recognized as new contexts, and to restore previously acquired strategy preferences when a previously experienced context is recognized. This produces adaptivity closer to rat behavioral performance and constitutes a computational proposition for the role of the rat prefrontal cortex in strategy shifting. Such a brain-inspired meta-controller may also provide an advancement for learning architectures in robotics.
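
    As a rough illustration of the strategy-selection idea, the sketch below keeps one Q-value per (context, strategy) pair, so preferences learned in one context persist and are restored when that context recurs; the strategy names, reward signal, and bandit-style update rule are assumptions for illustration, not the paper's model.

```python
import random
from collections import defaultdict

STRATEGIES = ["planning", "taxon"]   # two independent navigation strategies
ALPHA, EPS = 0.1, 0.1

# One Q-value per (context, strategy): preferences learned in one
# context are kept and restored when that context is seen again.
Q = defaultdict(float)

def select_strategy(context):
    """Epsilon-greedy choice among strategies for the current context."""
    if random.random() < EPS:
        return random.choice(STRATEGIES)
    return max(STRATEGIES, key=lambda s: Q[(context, s)])

def update(context, strategy, reward):
    """Stateless (bandit-style) Q update for the chosen strategy."""
    key = (context, strategy)
    Q[key] += ALPHA * (reward - Q[key])

# Toy usage: context 0 rewards planning, context 1 rewards taxon.
for trial in range(2000):
    ctx = trial % 2
    s = select_strategy(ctx)
    r = 1.0 if (ctx == 0) == (s == "planning") else 0.0
    update(ctx, s, r)

print({k: round(v, 2) for k, v in Q.items()})
```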

    Learning-Based Synthesis of Safety Controllers

    We propose a machine learning framework to synthesize reactive controllers for systems whose interactions with their adversarial environment are modeled by infinite-duration, two-player games over (potentially) infinite graphs. Our framework targets safety games with infinitely many vertices, but it also applies to safety games over finite graphs whose size is prohibitive for conventional synthesis techniques. Learning takes place in a feedback loop between a teacher component, which can reason symbolically about the safety game, and a learning algorithm, which successively learns an overapproximation of the winning region from the various kinds of examples the teacher provides. We develop a novel decision tree learning algorithm for this setting and show that it is guaranteed to converge to a reactive safety controller whenever a suitable overapproximation of the winning region can be expressed as a decision tree. Finally, we empirically compare a prototype implementation against existing approaches based on constraint solving and automata learning, respectively.
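
    The sketch below shows the shape of the teacher/learner loop on a one-dimensional toy safety game; the brute-force teacher and the bare set standing in for the learner's decision trees are illustrative assumptions, not the paper's implementation.

```python
# Toy safety game on integer vertices: each step the controller picks
# a move, then the environment adds an adversarial disturbance.
SAFE = {0, 1, 2, 3, 5}        # safe vertices; vertex 4 is unsafe
ACTIONS = (-1, 0, +1)         # controller moves
DISTURB = (-1, +1)            # adversarial environment moves

def controllable(x, region):
    # Some controller action keeps x + a + e inside `region`
    # for every environment disturbance e.
    return any(all(x + a + e in region for e in DISTURB) for a in ACTIONS)

def teacher_check(region):
    # Stands in for the symbolic teacher; here a brute-force scan.
    for x in sorted(region):
        if not controllable(x, region):
            return x          # counterexample: region is not inductive
    return None

region = set(SAFE)            # initial overapproximation: all safe vertices
while (cex := teacher_check(region)) is not None:
    region.discard(cex)       # learner refines from the counterexample

print("safety-controllable region:", sorted(region))
```

    With these toy dynamics the loop prunes vertex 5 (from which every move risks landing on the unsafe vertex 4 or leaving the safe set) and converges to the inductive region {0, 1, 2, 3}.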

    Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

    Learning robot controllers instead of designing them can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware remains challenging, in part because of the need to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.
    Comment: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2018, 6 pages, 6 figures
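
    As a toy illustration of empirically mapping a reward landscape over controller parameters, the sketch below sweeps a two-parameter grid against a synthetic reward surface; the parameter names and the surface itself are assumptions, since the paper evaluates real hardware episodes rather than a function call.

```python
import numpy as np

def episode_reward(k_spring, t_thrust):
    """Hypothetical stand-in for one trial's reward (e.g. mean hopping
    speed); a synthetic surface with a single stable basin."""
    return -((k_spring - 0.6) ** 2 + (t_thrust - 0.3) ** 2)

# Sweep a grid of the two (assumed) controller parameters.
k_vals = np.linspace(0.0, 1.0, 21)
t_vals = np.linspace(0.0, 1.0, 21)
landscape = np.array([[episode_reward(k, t) for t in t_vals]
                      for k in k_vals])

best = np.unravel_index(np.argmax(landscape), landscape.shape)
print("best (k_spring, t_thrust):", k_vals[best[0]], t_vals[best[1]])
```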

    Multiform Adaptive Robot Skill Learning from Humans

    Object manipulation is a basic element of everyday human life. Robotic manipulation has progressed from maneuvering single-rigid-body objects with firm grasps to maneuvering soft objects and handling contact-rich actions. Meanwhile, technologies such as robot learning from demonstration have enabled humans to train robots intuitively. This paper discusses a new level of learning-based robotic manipulation. In contrast to the single form of learning from demonstration, we propose a multiform learning approach that integrates additional forms of skill acquisition, including adaptive learning from definition and evaluation. Moreover, going beyond state-of-the-art techniques for handling purely rigid or soft objects in a pseudo-static manner, our work allows robots to learn to handle partly rigid, partly soft objects with time-critical skills and sophisticated contact control. Such manipulation capability opens up a variety of new possibilities in human-robot interaction.
    Comment: Accepted to 2017 Dynamic Systems and Control Conference (DSCC), Tysons Corner, VA, October 11-1

    Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

    This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust to parameter estimation, i.e., the parameter values yielded by the method do not have a decisive impact on performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step towards the autonomous learning of socially acceptable gaze behavior.
    Comment: Paper submitted to Pattern Recognition Letters
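
    A minimal sketch of the recurrent Q-learning ingredient is given below, assuming PyTorch, a GRU summarizing a history of audio-visual feature vectors, and a small discrete set of gaze actions; the dimensions, action set, and reward are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

OBS_DIM, HIDDEN, N_ACTIONS = 16, 32, 4   # e.g. look left/right/up/down

class RecurrentQNet(nn.Module):
    """GRU over observation history, linear head scoring gaze actions."""
    def __init__(self):
        super().__init__()
        self.gru = nn.GRU(OBS_DIM, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, N_ACTIONS)

    def forward(self, obs_seq, h=None):
        out, h = self.gru(obs_seq, h)    # out: (B, T, HIDDEN)
        return self.head(out), h         # Q-values per time step

net = RecurrentQNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
gamma = 0.9

# One TD(0) update on a synthetic transition sequence.
obs = torch.randn(1, 5, OBS_DIM)         # 5 steps of audio-visual features
actions = torch.randint(N_ACTIONS, (1, 5))
rewards = torch.randn(1, 5)              # e.g. reward for faces kept in view

q, _ = net(obs)                          # (1, 5, N_ACTIONS)
q_taken = q.gather(2, actions.unsqueeze(2)).squeeze(2)
with torch.no_grad():
    target = rewards[:, :-1] + gamma * q[:, 1:].max(dim=2).values
loss = nn.functional.mse_loss(q_taken[:, :-1], target)
opt.zero_grad(); loss.backward(); opt.step()
print("TD loss:", float(loss))
```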

    Mobile Robot Lab Project to Introduce Engineering Students to Fault Diagnosis in Mechatronic Systems

    This document is a self-archiving copy of the accepted version of the paper; the final published version is available in IEEE Xplore: http://dx.doi.org/10.1109/TE.2014.2358551
    This paper proposes lab work for learning fault detection and diagnosis (FDD) in mechatronic systems. These skills are important in engineering education because FDD is a key capability of competitive processes and products. The intended outcome of the lab work is that students become aware of the importance of faulty conditions and learn to design FDD strategies for a real system. To this end, the paper proposes a lab project in which students are asked to develop a discrete event dynamic system (DEDS) diagnosis to cope with two faulty conditions in an autonomous mobile robot task. A sample solution is discussed for LEGO Mindstorms NXT robots with LabVIEW. This innovative practice is relevant to higher education engineering courses related to mechatronics, robotics, or DEDS. Results are also given of applying this strategy as part of a postgraduate course on fault-tolerant mechatronic systems. This work was supported in part by the Spanish CICYT under Project DPI2011-22443.
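
    As a small illustration of DEDS-style diagnosis, the sketch below replays an observed event trace against a nominal discrete-event model and flags any transition the model does not allow; the event names and fault conditions are hypothetical, not the lab project's actual LabVIEW solution.

```python
# Nominal DEDS model: allowed (state, event) -> next-state transitions.
NORMAL_NEXT = {
    ("idle", "start"): "driving",
    ("driving", "line_detected"): "driving",
    ("driving", "obstacle"): "avoiding",
    ("avoiding", "clear"): "driving",
}

def diagnose(events):
    """Replay an observed event trace against the nominal model;
    any transition outside the model is reported as a fault."""
    state = "idle"
    for t, ev in enumerate(events):
        nxt = NORMAL_NEXT.get((state, ev))
        if nxt is None:
            return f"fault at step {t}: event '{ev}' illegal in state '{state}'"
        state = nxt
    return "no fault detected"

# Nominal trace vs. a faulty trace ('obstacle' recurs before 'clear').
print(diagnose(["start", "line_detected", "obstacle", "clear"]))
print(diagnose(["start", "obstacle", "obstacle"]))
```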