Data-Efficient Multirobot, Multitask Transfer Learning for Trajectory Tracking
Transfer learning has the potential to reduce the burden of data collection
and to decrease the unavoidable risks of the training phase. In this letter, we
introduce a multirobot, multitask transfer learning framework that allows a
system to complete a task by learning from a few demonstrations of another task
executed on another system. We focus on the trajectory tracking problem where
each trajectory represents a different task, since many robotic tasks can be
described as a trajectory tracking problem. The proposed multirobot transfer
learning framework combines adaptive control with
iterative learning control. The key idea is that the adaptive
controller forces dynamically different systems to behave as a specified
reference model. The proposed multitask transfer learning framework uses
theoretical control results (e.g., the concept of vector relative degree) to
learn a map from desired trajectories to the inputs that make the system track
these trajectories with high accuracy. This map is used to calculate the inputs
for a new, unseen trajectory. Experimental results using two different
quadrotor platforms and six different trajectories show that, on average, the
proposed framework reduces the first-iteration tracking error by 74% when
information from tracking a different single trajectory on a different
quadrotor is utilized.
Comment: 9 pages, 6 figures, submitted to RA-L 201
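The multitask idea above (learn a map from desired trajectories to inputs, then reuse it on an unseen trajectory) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' setup: a hypothetical first-order plant `x[k+1] = a*x[k] + b*u[k]` with relative degree 1, a linear map fitted by least squares, and source-task inputs generated analytically as a stand-in for inputs refined by iterative learning control.

```python
import numpy as np

# Hypothetical first-order plant x[k+1] = a*x[k] + b*u[k].
# The learner never reads a or b; it only sees (trajectory, input) data.
A_TRUE, B_TRUE = 0.9, 0.5

def simulate(u, x0):
    """Roll the plant forward under the input sequence u, starting at x0."""
    x = np.empty(len(u) + 1)
    x[0] = x0
    for k in range(len(u)):
        x[k + 1] = A_TRUE * x[k] + B_TRUE * u[k]
    return x

def features(d):
    """Relative degree 1: the input at step k depends on d[k] and d[k+1]."""
    return np.column_stack([d[:-1], d[1:]])

t = np.linspace(0.0, 2.0 * np.pi, 101)

# Source task: a trajectory plus the inputs that track it. They are generated
# analytically here as a stand-in for inputs refined over learning iterations.
d_train = np.sin(t)
u_train = (d_train[1:] - A_TRUE * d_train[:-1]) / B_TRUE

# Learn the linear map from desired-trajectory windows to inputs.
theta, *_ = np.linalg.lstsq(features(d_train), u_train, rcond=None)

# Target task: an unseen trajectory, attempted for the first time.
d_new = 0.5 * (1.0 - np.cos(2.0 * t))
x_learned = simulate(features(d_new) @ theta, x0=d_new[0])
x_naive = simulate(d_new[:-1], x0=d_new[0])  # baseline: feed the reference in directly

err_learned = np.max(np.abs(x_learned - d_new))
err_naive = np.max(np.abs(x_naive - d_new))
print(err_learned, err_naive)
```

In this toy setting the fitted map recovers the plant inverse, so the first attempt at the new trajectory tracks it closely, while the naive baseline does not. A real system would need a richer feature window and would not track exactly.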
Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning
People can learn a wide range of tasks from their own experience, but can
also learn from observing other creatures. This can accelerate acquisition of
new skills even when the observed agent differs substantially from the learning
agent in terms of morphology. In this paper, we examine how reinforcement
learning algorithms can transfer knowledge between morphologically different
agents (e.g., different robots). We introduce a problem formulation where two
agents are tasked with learning multiple skills by sharing information. Our
method uses the skills that were learned by both agents to train invariant
feature spaces that can then be used to transfer other skills from one agent to
another. The process of learning these invariant feature spaces can be viewed
as a kind of "analogy making", or implicit learning of partial correspondences
between two distinct domains. We evaluate our transfer learning algorithm on
two simulated robotic manipulation skills and show that we can transfer
knowledge between simulated robotic arms with different numbers of links, as
well as simulated arms with different actuation mechanisms, where one robot is
torque-driven while the other is tendon-driven.
Comment: Published as a conference paper at ICLR 201
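The idea of an invariant feature space trained on skills that both agents already share can be sketched in a linear setting. This is only an illustration under stated assumptions: it swaps in classical canonical correlation analysis (CCA) for the learned embeddings described above, and the observation matrices `OBS_A` and `OBS_B`, the 2-D shared latent, and the helper names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: both agents perform the same shared skill, described by
# a 2-D latent z, but observe it through different state spaces (3-D and 4-D).
OBS_A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]])
OBS_B = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 1.0]])

def paired_states(n):
    """Time-aligned state pairs from the two agents, with small sensor noise."""
    z = rng.normal(size=(n, 2))
    xa = z @ OBS_A + 0.01 * rng.normal(size=(n, 3))
    xb = z @ OBS_B + 0.01 * rng.normal(size=(n, 4))
    return xa, xb

def inv_sqrt(c):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(c)
    return (v / np.sqrt(w)) @ v.T

def fit_cca(xa, xb, dim, reg=1e-6):
    """Fit linear embeddings f, g so that f(xa) is close to g(xb) on paired data."""
    ma, mb = xa.mean(axis=0), xb.mean(axis=0)
    a, b = xa - ma, xb - mb
    n = len(a)
    wa = inv_sqrt(a.T @ a / n + reg * np.eye(a.shape[1]))  # whiten agent A
    wb = inv_sqrt(b.T @ b / n + reg * np.eye(b.shape[1]))  # whiten agent B
    u, s, vt = np.linalg.svd(wa @ (a.T @ b / n) @ wb)
    f = lambda x: (x - ma) @ wa @ u[:, :dim]
    g = lambda x: (x - mb) @ wb @ vt[:dim].T
    return f, g, s[:dim]

# Train the shared space on paired data from the skill both agents know.
f, g, corr = fit_cca(*paired_states(500), dim=2)

# Held-out pairs should land close together in the shared space.
xa_test, xb_test = paired_states(200)
gap = np.mean((f(xa_test) - g(xb_test)) ** 2)
print(corr, gap)
```

Once two agents' states map to nearby points in the shared space, the distance between their embeddings could serve as a reward term when one agent learns a new skill from the other. The whitening step plays the role of preventing the trivial solution in which both embeddings collapse to zero.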
Skill Transfer in Deep Reinforcement Learning under Morphological Heterogeneity
Transfer learning methods for reinforcement learning (RL) domains facilitate
the acquisition of new skills using previously acquired knowledge. The vast
majority of existing approaches assume that the agents have the same design,
e.g., the same shape and action spaces. In this paper, we address the problem of
transferring previously acquired skills amongst morphologically different
agents (MDAs). For instance, assuming that a bipedal agent has been trained to
move forward, could this skill be transferred to a one-legged hopper to
make its training on the same task more sample-efficient? We frame
this problem as one of subspace learning whereby we aim to infer latent factors
representing the control mechanism that is common between MDAs. We propose a
novel paired variational encoder-decoder model, PVED, that disentangles the
control of MDAs into shared and agent-specific factors. The shared factors are
then leveraged for skill transfer using RL. Theoretically, we derive a theorem
indicating how the performance of PVED depends on the shared factors and agent
morphologies. Experimentally, PVED has been extensively validated on four
MuJoCo environments. We demonstrate its performance compared to a
state-of-the-art approach and several ablation cases, visualize and interpret
the hidden factors, and identify avenues for future improvements.