    Multi-task learning with Gaussian processes

    Multi-task learning refers to learning multiple tasks simultaneously, in order to avoid tabula rasa learning and to share information between similar tasks during learning. We consider a multi-task Gaussian process regression model that learns related functions by inducing correlations between tasks directly. Using this model as a reference for three other multi-task models, we provide a broad unifying view of multi-task learning. This is possible because, unlike the other models, the multi-task Gaussian process model encodes task relatedness explicitly. Each multi-task learning model generally assumes that learning multiple tasks together is beneficial. We analyze how, and to what extent, multi-task learning helps improve the generalization of supervised learning. Our analysis is conducted for the average case on the multi-task Gaussian process model, and we concentrate mainly on the case of two tasks, called the primary task and the secondary task. The main parameters are the degree of relatedness ρ between the two tasks, and π_S, the fraction of the total training observations from the secondary task. Among other results, we show that asymmetric multi-task learning, where the secondary task is to help the learning of the primary task, can decrease a lower bound on the average generalization error by a factor of up to ρ²π_S. When there are no observations for the primary task, there is also an intrinsic limit to which observations for the secondary task can help the primary task. For symmetric multi-task learning, where the two tasks are to help each other to learn, we find the learning to be characterized by the term π_S(1 − π_S)(1 − ρ²). As far as we are aware, our analysis contributes to an understanding of multi-task learning that is orthogonal to the existing PAC-based results on multi-task learning.
For more than two tasks, we provide an understanding of the multi-task Gaussian process model through structures in the predictive means and variances given certain configurations of training observations. These results generalize existing ones in the geostatistics literature, and may have practical applications in that domain. We evaluate the multi-task Gaussian process model on the inverse dynamics problem for a robot manipulator. The inverse dynamics problem is to compute the torques needed at the joints to drive the manipulator along a given trajectory, and there are advantages to learning this function for adaptive control. A robot manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-context or multi-load learning problem, and we treat predicting the inverse dynamics for a context/load as a task. We view the learning of the inverse dynamics as a function approximation problem and place Gaussian process priors over the space of functions. We first show that this is effective for learning the inverse dynamics for a single context. Then, by placing independent Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process prior for handling multiple loads, where the inter-context similarity depends on the underlying inertial parameters of the manipulator. Experiments demonstrate that this multi-task formulation is effective in sharing information among the various loads, and generally improves performance over either learning only on single contexts or pooling the data over all contexts. In addition to the experimental results, one of the contributions of this study is showing that the multi-task Gaussian process model follows naturally from the physics of the inverse dynamics.
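
The role of the relatedness ρ in the two-task setting can be illustrated with a minimal sketch. It is not the authors' code: it assumes a unit-variance squared-exponential input kernel shared by both tasks and a task covariance matrix [[1, ρ], [ρ, 1]], and the function names `rbf` and `two_task_posterior_var` are hypothetical. Under these assumptions the primary function can be written as ρ times the secondary function plus an independent part of variance 1 − ρ², so with no primary observations the posterior variance for the primary task cannot fall below 1 − ρ², however many secondary observations are available.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Unit-variance squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def two_task_posterior_var(x_primary, x_secondary, x_star, rho, noise=0.1):
    """Posterior variance of the primary task at x_star (illustrative only).

    Assumed prior: cov(f_i(x), f_j(x')) = K_f[i, j] * k(x, x'),
    with K_f = [[1, rho], [rho, 1]]; task 0 is primary, task 1 is secondary.
    """
    x = np.concatenate([x_primary, x_secondary])
    task = np.concatenate([np.zeros(len(x_primary), dtype=int),
                           np.ones(len(x_secondary), dtype=int)])
    K_f = np.array([[1.0, rho], [rho, 1.0]])

    # Covariance of all noisy training observations, across both tasks.
    K = K_f[task[:, None], task[None, :]] * rbf(x, x)
    K += noise ** 2 * np.eye(len(x))

    # Cross-covariance between primary-task test points and training points.
    k_star = K_f[0, task][None, :] * rbf(x_star, x)

    # Prior variance of the primary task is K_f[0, 0] * k(x, x) = 1.
    reduction = np.einsum("ij,ji->i", k_star, np.linalg.solve(K, k_star.T))
    return 1.0 - reduction
```

Calling `two_task_posterior_var(np.array([]), xs, x_star, rho)` for increasing `rho` shows the variance shrinking toward, but never below, 1 − ρ² in this sketch.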

    Multi-task Gaussian Process Learning of Robot Inverse Dynamics

    The inverse dynamics problem for a robotic manipulator is to compute the torques needed at the joints to drive it along a given trajectory; it is beneficial to be able to learn this function for adaptive control. A robotic manipulator will often need to be controlled while holding different loads in its end effector, giving rise to a multi-task learning problem. By placing independent Gaussian process priors over the latent functions of the inverse dynamics, we obtain a multi-task Gaussian process prior for handling multiple loads, where the inter-task similarity depends on the underlying inertial parameters. Experiments demonstrate that this multi-task formulation is effective in sharing information among the various loads, and generally improves performance over either learning only on single tasks or pooling the data over all tasks.
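
One way to see how independent priors over latent functions yield a multi-task prior is sketched below. It relies on the standard rigid-body fact that joint torques are linear in the inertial parameters; the notation (latent functions y_j, load parameters θ^m_j, kernels k_j) is introduced here for illustration and is an assumption, not taken from the abstract.

```latex
% Torque for load m at input x (joint angles, velocities, accelerations),
% written as a linear combination of latent functions y_j:
\tau^{m}(x) = \sum_{j} \theta^{m}_{j}\, y_{j}(x),
\qquad y_{j} \sim \mathcal{GP}\bigl(0,\, k_{j}\bigr) \ \text{independently.}

% Independence of the y_j removes all cross terms, so the covariance
% between loads m and l is
\operatorname{cov}\bigl(\tau^{m}(x),\, \tau^{l}(x')\bigr)
  = \sum_{j} \theta^{m}_{j}\, \theta^{l}_{j}\, k_{j}(x, x').
```

In this form the similarity between two loads enters only through the products θ^m_j θ^l_j, which is the sense in which inter-task similarity depends on the inertial parameters.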

    Learning the dynamics of articulated tracked vehicles

    In this work, we present a Bayesian non-parametric approach to model the motion control of articulated tracked vehicles (ATVs). The motion control model is based on a Dirichlet Process-Gaussian Process (DP-GP) mixture model. The DP-GP mixture model provides a flexible representation of patterns of control manoeuvres along trajectories of different lengths and discretizations. The model also estimates the number of patterns sufficient for modeling the dynamics of the ATV.

    Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

    The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time. Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL

    From virtual demonstration to real-world manipulation using LSTM and MDN

    Robots assisting the disabled or elderly must perform complex manipulation tasks and must adapt to the home environment and preferences of their user. Learning from demonstration is a promising choice that would allow a non-technical user to teach the robot different tasks. However, collecting demonstrations in the home environment of a disabled user is time-consuming, disruptive to the comfort of the user, and presents safety challenges. It would be desirable to perform the demonstrations in a virtual environment. In this paper we describe a solution to the challenging problem of behavior transfer from virtual demonstration to a physical robot. The virtual demonstrations are used to train a deep neural network based controller, which uses a Long Short Term Memory (LSTM) recurrent neural network to generate trajectories. The training process uses a Mixture Density Network (MDN) to calculate an error signal suitable for the multimodal nature of demonstrations. The controller learned in the virtual environment is transferred to a physical robot (a Rethink Robotics Baxter). An off-the-shelf vision component is used to substitute for geometric knowledge available in the simulation, and an inverse kinematics module is used to allow the Baxter to enact the trajectory. Our experimental studies validate the three contributions of the paper: (1) the controller learned from virtual demonstrations can be used to successfully perform the manipulation tasks on a physical robot, (2) the LSTM+MDN architectural choice outperforms other choices, such as the use of feedforward networks and mean-squared error based training signals, and (3) allowing imperfect demonstrations in the training set also allows the controller to learn how to correct its manipulation mistakes.
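
The reason a mixture density error signal suits multimodal demonstrations can be shown with a small sketch. It is not the paper's architecture: it assumes scalar targets and a one-dimensional Gaussian mixture whose parameters would, in practice, come from a network's output layer, and the names `mdn_nll` and `_logsumexp` are hypothetical.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def mdn_nll(logits, means, log_sigmas, target):
    """Mean negative log-likelihood of scalar targets under a Gaussian mixture.

    logits, means, log_sigmas: arrays of shape (batch, n_components)
    target: array of shape (batch,)
    """
    # Mixing weights via a log-softmax over components.
    log_pi = logits - _logsumexp(logits, axis=1)[:, None]

    # Log-density of each target under each Gaussian component.
    z = (target[:, None] - means) / np.exp(log_sigmas)
    log_norm = -0.5 * np.log(2.0 * np.pi) - log_sigmas - 0.5 * z ** 2

    # Log-likelihood of the mixture, averaged over the batch.
    return -np.mean(_logsumexp(log_pi + log_norm, axis=1))
```

When demonstrations of the same situation split between two distinct targets, a two-component mixture can place one mode on each, whereas a single Gaussian (equivalent to a mean-squared error signal with fixed variance) is pulled toward the average, which may match neither demonstration.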

    Online quantum mixture regression for trajectory learning by demonstration

    In this work, we present the online Quantum Mixture Model (oQMM), which combines the merits of quantum mechanics and stochastic optimization. More specifically, it allows for quantum effects on the mixture states, which in turn become a superposition of conventional mixture states. We propose an efficient stochastic online learning algorithm based on the online Expectation Maximization (EM), as well as a generation and decay scheme for model components. Our method is suitable for complex robotic applications, where data is abundant or where we wish to iteratively refine our model and conduct predictions during the course of learning. With a synthetic example, we show that the algorithm can achieve higher numerical stability. We also empirically demonstrate the efficacy of our method on well-known regression benchmark datasets. Under a trajectory Learning by Demonstration setting we employ a multi-shot learning application in joint angle space, where we observe higher quality of learning and reproduction. We compare against popular and well-established methods, widely adopted across the robotics community.