Multi-task learning with Gaussian processes
Multi-task learning refers to learning multiple tasks simultaneously, in order to avoid tabula rasa learning
and to share information between similar tasks during learning. We consider a multi-task Gaussian
process regression model that learns related functions by inducing correlations between tasks directly.
Using this model as a reference for three other multi-task models, we provide a broad unifying view of
multi-task learning. This is possible because, unlike the other models, the multi-task Gaussian process
model encodes task relatedness explicitly.
Each multi-task learning model generally assumes that learning multiple tasks together is beneficial. We
analyze how, and to what extent, multi-task learning improves the generalization of supervised
learning. Our analysis is conducted for the average-case on the multi-task Gaussian process model, and
we concentrate mainly on the case of two tasks, called the primary task and the secondary task. The
main parameters are the degree of relatedness ρ between the two tasks, and πS, the fraction of the total
training observations from the secondary task. Among other results, we show that asymmetric multi-task
learning, where the secondary task is to help the learning of the primary task, can decrease a lower
bound on the average generalization error by a factor of up to ρ²πS. When there are no observations
for the primary task, there is also an intrinsic limit to which observations for the secondary task can
help the primary task. For symmetric multi-task learning, where the two tasks are to help each other to
learn, we find the learning to be characterized by the term πS(1 − πS)(1 − ρ²). As far as we are aware,
our analysis contributes to an understanding of multi-task learning that is orthogonal to the existing
PAC-based results on multi-task learning. For more than two tasks, we provide an understanding of
the multi-task Gaussian process model through structures in the predictive means and variances given
certain configurations of training observations. These results generalize existing ones in the geostatistics
literature, and may have practical applications in that domain.
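The explicit task-relatedness the model encodes can be made concrete with a small sketch. Below is a hypothetical two-task GP regression example in plain NumPy (our naming, not the thesis's implementation), where the full covariance factorizes into a task covariance Kf, whose off-diagonal entry is the relatedness ρ, times an input kernel k_x:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # Squared-exponential input kernel k_x(x, x') on 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def multitask_gp_mean(X, t, y, Xs, ts, rho, noise=1e-2):
    # Full covariance: Kf[t, t'] * k_x(x, x'), with task covariance
    # Kf = [[1, rho], [rho, 1]] encoding the degree of relatedness rho.
    Kf = np.array([[1.0, rho], [rho, 1.0]])
    K = Kf[np.ix_(t, t)] * rbf(X, X) + noise * np.eye(len(X))
    Ks = Kf[np.ix_(ts, t)] * rbf(Xs, X)
    return Ks @ np.linalg.solve(K, y)   # posterior mean at test inputs (Xs, ts)
```

With ρ = 0, secondary-task observations carry no information about the primary task (the posterior mean for the primary task falls back to the prior); as ρ approaches 1 they become as informative as primary-task observations, consistent with the ρ²πS factor discussed above.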
We evaluate the multi-task Gaussian process model on the inverse dynamics problem for a robot manipulator.
The inverse dynamics problem is to compute the torques needed at the joints to drive the
manipulator along a given trajectory, and there are advantages to learning this function for adaptive
control. A robot manipulator will often need to be controlled while holding different loads in its end
effector, giving rise to a multi-context or multi-load learning problem, and we treat predicting the inverse
dynamics for a context/load as a task. We view the learning of the inverse dynamics as a function
approximation problem and place Gaussian process priors over the space of functions. We first show
that this is effective for learning the inverse dynamics for a single context. Then, by placing independent
Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task
Gaussian process prior for handling multiple loads, where the inter-context similarity depends on the
underlying inertial parameters of the manipulator. Experiments demonstrate that this multi-task formulation
is effective in sharing information among the various loads, and generally improves performance
over either learning only on single contexts or pooling the data over all contexts. In addition to the experimental
results, one of the contributions of this study is showing that the multi-task Gaussian process
model follows naturally from the physics of the inverse dynamics.
Multi-task Gaussian Process Learning of Robot Inverse Dynamics
The inverse dynamics problem for a robotic manipulator is to compute the torques
needed at the joints to drive it along a given trajectory; it is beneficial to be able
to learn this function for adaptive control. A robotic manipulator will often need
to be controlled while holding different loads in its end effector, giving rise to a
multi-task learning problem. By placing independent Gaussian process priors over
the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process
prior for handling multiple loads, where the inter-task similarity depends on
the underlying inertial parameters. Experiments demonstrate that this multi-task
formulation is effective in sharing information among the various loads, and generally
improves performance over either learning only on single tasks or pooling
the data over all tasks.
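The physical grounding claimed above can be sketched briefly: rigid-body dynamics make the torque linear in the load's inertial parameters, τ_m(q) = Σ_j π_j^(m) h_j(q), so independent GP priors on the latent functions h_j induce an inter-task covariance proportional to the inner product of inertial-parameter vectors. A minimal NumPy illustration (a sketch under this linearity assumption; names are ours, not the paper's):

```python
import numpy as np

def inter_task_cov(Pi, kx):
    # tau_m(q) = sum_j Pi[m, j] * h_j(q), with independent priors h_j ~ GP(0, k),
    # gives Cov(tau_m(q), tau_m'(q')) = (Pi @ Pi.T)[m, m'] * k(q, q').
    return (Pi @ Pi.T) * kx

# Two loads whose (hypothetical) inertial-parameter vectors overlap partially:
Pi = np.array([[1.0, 2.0],
               [3.0, 4.0]])
C = inter_task_cov(Pi, 0.5)   # inter-task covariance at an input-kernel value of 0.5
```

Loads with similar inertial parameters thus get strongly correlated torque functions, which is exactly the mechanism by which observations for one load inform predictions for another.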
Learning the dynamics of articulated tracked vehicles
In this work, we present a Bayesian non-parametric approach to model the motion control of articulated tracked vehicles (ATVs). The motion control model is based on a Dirichlet Process-Gaussian Process (DP-GP) mixture model. The DP-GP mixture model provides a flexible representation of patterns of control manoeuvres along trajectories of different lengths and discretizations. The model also estimates the number of patterns, sufficient for modeling the dynamics of the ATV.
Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics
The most data-efficient algorithms for reinforcement learning in robotics are
model-based policy search algorithms, which alternate between learning a
dynamical model of the robot and optimizing a policy to maximize the expected
return given the model and its uncertainties. Among the few proposed
approaches, the recently introduced Black-DROPS algorithm exploits a black-box
optimization algorithm to achieve both high data-efficiency and good
computation times when several cores are used; nevertheless, like all
model-based policy search approaches, Black-DROPS does not scale to high
dimensional state/action spaces. In this paper, we introduce a new model
learning procedure in Black-DROPS that leverages parameterized black-box priors
to (1) scale up to high-dimensional systems, and (2) be robust to large
inaccuracies of the prior information. We demonstrate the effectiveness of our
approach with the "pendubot" swing-up task in simulation and with a physical
hexapod robot (48D state space, 18D action space) that has to walk forward as
fast as possible. The results show that our new algorithm is more
data-efficient than previous model-based policy search algorithms (with and
without priors) and that it can allow a physical 6-legged robot to learn new
gaits in only 16 to 30 seconds of interaction time.Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table;
Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at
https://youtu.be/_MZYDhfWeL
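The alternation described above (fit a dynamical model, then optimize a policy against it with a black-box optimizer) can be illustrated on a toy linear system. This is a deliberately simplified stand-in: a least-squares model and plain random search replace the Gaussian process models and the optimizer that Black-DROPS actually uses.

```python
import numpy as np

def rollout(step, policy, s0=1.0, horizon=20):
    # Simulate a trajectory; return the transitions and the total reward.
    s, transitions, ret = s0, [], 0.0
    for _ in range(horizon):
        a = policy(s)
        s_next = step(s, a)
        transitions.append((s, a, s_next))
        ret += -s_next**2          # reward: drive the state to zero
        s = s_next
    return transitions, ret

def model_based_policy_search(true_step, iters=6, n_candidates=200, seed=0):
    """Alternate between (1) fitting a dynamics model to observed transitions
    and (2) black-box optimization of a linear policy a = theta * s on
    rollouts of the learned model -- a toy sketch of the model-based
    policy search loop, not the Black-DROPS algorithm itself."""
    rng = np.random.default_rng(seed)
    theta = 0.0
    data = rollout(true_step, lambda s: theta * s)[0]   # initial real rollout
    for _ in range(iters):
        # (1) Fit a linear dynamics model s' ~ [s, a] @ w by least squares.
        X = np.array([(s, a) for s, a, _ in data])
        y = np.array([sn for _, _, sn in data])
        w, *_ = np.linalg.lstsq(X, y, rcond=None)
        model_step = lambda s, a: w[0] * s + w[1] * a
        # (2) Black-box search: score candidate policies on the *model* only.
        cands = rng.uniform(-3, 3, n_candidates)
        theta = max(cands, key=lambda t: rollout(model_step, lambda s: t * s)[1])
        # (3) Run the best policy on the real system and grow the dataset.
        data += rollout(true_step, lambda s: theta * s)[0]
    return theta
```

Because candidate policies are evaluated only on the learned model, the real system is queried just once per iteration, which is the source of the data efficiency the abstract emphasizes.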
From virtual demonstration to real-world manipulation using LSTM and MDN
Robots assisting the disabled or elderly must perform complex manipulation
tasks and must adapt to the home environment and preferences of their user.
Learning from demonstration is a promising approach that would allow the
non-technical user to teach the robot different tasks. However, collecting
demonstrations in the home environment of a disabled user is time consuming,
disruptive to the comfort of the user, and presents safety challenges. It would
be desirable to perform the demonstrations in a virtual environment. In this
paper we describe a solution to the challenging problem of behavior transfer
from virtual demonstration to a physical robot. The virtual demonstrations are
used to train a deep neural network based controller, which uses a Long
Short-Term Memory (LSTM) recurrent neural network to generate trajectories. The
training process uses a Mixture Density Network (MDN) to calculate an error
signal suitable for the multimodal nature of demonstrations. The controller
learned in the virtual environment is transferred to a physical robot (a
Rethink Robotics Baxter). An off-the-shelf vision component is used to
substitute for geometric knowledge available in the simulation and an inverse
kinematics module is used to allow the Baxter to enact the trajectory. Our
experimental studies validate the three contributions of the paper: (1) the
controller learned from virtual demonstrations can be used to successfully
perform the manipulation tasks on a physical robot, (2) the LSTM+MDN
architectural choice outperforms other choices, such as the use of feedforward
networks and mean-squared error based training signals, and (3) including
imperfect demonstrations in the training set allows the controller to
learn how to correct its manipulation mistakes.
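The role of the MDN error signal can be made concrete: the network's output head produces mixture weights, means, and spreads, and training minimizes the negative log-likelihood of the demonstrated points under that mixture, so conflicting demonstrations are not averaged into a single invalid trajectory the way a mean-squared error would. A minimal NumPy version of such a loss (our naming, not the paper's code):

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, y):
    # Mixture log-likelihood log sum_k pi_k * N(y | mu_k, sigma_k^2),
    # computed stably in log space; returns the mean negative log-likelihood.
    log_pi = pi_logits - np.logaddexp.reduce(pi_logits, axis=-1, keepdims=True)
    z = (y[..., None] - mu) / np.exp(log_sigma)
    log_comp = -0.5 * np.log(2 * np.pi) - log_sigma - 0.5 * z**2
    return -np.logaddexp.reduce(log_pi + log_comp, axis=-1).mean()
```

A two-component mixture placed on two conflicting demonstrated targets scores a lower loss than a single Gaussian forced to straddle them, which is precisely why the MDN signal suits multimodal demonstrations.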
Online quantum mixture regression for trajectory learning by demonstration
In this work, we present the online Quantum Mixture Model (oQMM), which combines the merits of quantum mechanics and stochastic optimization. More specifically, it allows for quantum effects on the mixture states, which in turn become a superposition of conventional mixture states. We propose an efficient stochastic online learning algorithm based on online Expectation Maximization (EM), as well as a generation and decay scheme for model components. Our method is suitable for complex robotic applications, where data is abundant or where we wish to iteratively refine our model and conduct predictions during the course of learning. With a synthetic example, we show that the algorithm can achieve higher numerical stability. We also empirically demonstrate the efficacy of our method on well-known regression benchmark datasets. In a trajectory Learning by Demonstration setting, we employ a multi-shot learning application in joint angle space, where we observe higher quality of learning and reproduction. We compare against popular and well-established methods, widely adopted across the robotics community.
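For readers unfamiliar with the online EM baseline the oQMM builds on, here is a minimal classical version for a 1-D Gaussian mixture, in which each incoming observation takes a small stochastic step toward its E-step statistics. This sketch omits the quantum superposition of mixture states and the component generation/decay scheme; names and the fixed learning rate are our simplifications.

```python
import numpy as np

class OnlineGMM:
    """Minimal online (stochastic) EM for a 1-D Gaussian mixture."""
    def __init__(self, means, lr=0.05, var=1.0):
        self.mu = np.asarray(means, float)
        self.var = np.full(len(means), var)
        self.w = np.full(len(means), 1.0 / len(means))
        self.lr = lr

    def update(self, x):
        # E-step: responsibility of each component for the new point x.
        logp = -0.5 * np.log(2 * np.pi * self.var) - 0.5 * (x - self.mu) ** 2 / self.var
        r = np.exp(logp) * self.w
        r /= r.sum()
        # M-step: stochastic step toward the per-point sufficient statistics.
        self.w += self.lr * (r - self.w)
        self.mu += self.lr * r * (x - self.mu)
        self.var += self.lr * r * ((x - self.mu) ** 2 - self.var)
```

Because each update touches only one observation, the model can be refined while data streams in and used for prediction at any point during learning, which is the usage pattern the abstract targets for robotic applications.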