Learning to Represent Haptic Feedback for Partially-Observable Tasks
The sense of touch, being the earliest sensory system to develop in a human
body [1], plays a critical part in our daily interaction with the environment.
In order to successfully complete a task, many manipulation interactions
require incorporating haptic feedback. However, manually designing a feedback
mechanism can be extremely challenging. In this work, we consider manipulation
tasks that need to incorporate tactile sensor feedback in order to modify a
provided nominal plan. To incorporate partial observation, we present a new
framework that models the task as a partially observable Markov decision
process (POMDP) and learns an appropriate representation of haptic feedback
which can serve as the state for a POMDP model. The model, which is parametrized
by deep recurrent neural networks, uses variational Bayes methods to
optimize the approximate posterior. Finally, we build on deep Q-learning to
select the optimal action in each state without access to a simulator.
We test our model on a PR2 robot for multiple tasks of turning a knob until it
clicks.
Comment: IEEE International Conference on Robotics and Automation (ICRA), 201
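The Q-learning step mentioned in this abstract can be illustrated with a minimal tabular sketch. This is not the paper's method: the paper uses deep recurrent networks over a learned haptic representation, whereas the code below assumes a small discrete state space, and the state labels, action names, and the `alpha` and `gamma` values are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update and return the new Q-value.

    q maps (state, action) pairs to value estimates. The update uses only
    an observed transition, so no simulator of the dynamics is required.
    """
    # Value of the best action in the next state (zero if the episode ended).
    best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


def greedy_action(q, state, actions):
    """Return the action with the highest current Q-value in this state."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical example: one rewarded transition that ends the episode.
ACTIONS = ["turn", "stop"]
q_table = defaultdict(float)
new_value = q_update(q_table, "s0", "turn", 1.0, "s1", ACTIONS, done=True)
chosen = greedy_action(q_table, "s0", ACTIONS)
```

In a deep Q-learning setting the table would be replaced by a network that maps the state representation to one value per action; the update target has the same form.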
Robot eye-hand coordination learning by watching human demonstrations: a task function approximation approach
We present a robot eye-hand coordination learning method that can directly
learn visual task specification by watching human demonstrations. Task
specification is represented as a task function, which is learned using inverse
reinforcement learning (IRL) by inferring differential rewards between state
changes. The learned task function is then used as continuous feedback in an
uncalibrated visual servoing (UVS) controller designed for the execution phase.
Our proposed method can directly learn from raw videos, which removes the need
for hand-engineered task specification. It can also provide task
interpretability by directly approximating the task function. Moreover,
benefiting from the use of a traditional UVS controller, our training process
is efficient and the learned policy is independent from a particular robot
platform. Various experiments were designed to show that, for a certain DOF
task, our method can adapt to task/environment variances in target positions,
backgrounds, illuminations, and occlusions without prior retraining.
Comment: Accepted in ICRA 201
Learning Dynamic Robot-to-Human Object Handover from Human Feedback
Object handover is a basic but essential capability for robots interacting
with humans in many applications, e.g., caring for the elderly and assisting
workers in manufacturing workshops. It appears deceptively simple, as humans
perform object handover almost flawlessly. The success of humans, however,
belies the complexity of object handover as collaborative physical interaction
between two agents with limited communication. This paper presents a learning
algorithm for dynamic object handover, for example, when a robot hands over
water bottles to marathon runners passing by the water station. We formulate
the problem as contextual policy search, in which the robot learns object
handover by interacting with the human. A key challenge here is to learn the
latent reward of the handover task under noisy human feedback. Preliminary
experiments show that the robot learns to hand over a water bottle naturally
and that it adapts to the dynamics of human motion. One challenge for the
future is to combine the model-free learning algorithm with a model-based
planning approach and enable the robot to adapt to human preferences and
object characteristics, such as shape, weight, and surface texture.
Comment: Appears in the Proceedings of the International Symposium on Robotics Research (ISRR) 201
Mutual Alignment Transfer Learning
Training robots for operation in the real world is a complex, time-consuming
and potentially expensive task. Despite the significant success of reinforcement
learning in games and simulations, research in real robot applications has not
matched this progress. While sample complexity can be reduced by
training policies in simulation, such policies can perform sub-optimally on the
real platform given imperfect calibration of model dynamics. We present an
approach -- supplemental to fine tuning on the real robot -- to further benefit
from parallel access to a simulator during training and reduce sample
requirements on the real robot. The developed approach harnesses auxiliary
rewards to guide the exploration for the real world agent based on the
proficiency of the agent in simulation and vice versa. In this context, we
demonstrate empirically that the reciprocal alignment for both agents provides
further benefit as the agent in simulation can adjust to optimize its behaviour
for states commonly visited by the real-world agent.
- …