39 research outputs found

    Online semi-parametric learning for inverse dynamics modeling

    This paper presents a semi-parametric algorithm for online learning of a robot inverse dynamics model. It combines the strengths of parametric and non-parametric modeling. The former exploits the rigid body dynamics equation, while the latter exploits a suitable kernel function. We provide an extensive comparison with other methods from the literature using real data from the iCub humanoid robot. In doing so, we also compare two different techniques, namely cross validation and marginal likelihood optimization, for estimating the hyperparameters of the kernel function.

    Stable Gaussian Process based Tracking Control of Lagrangian Systems

    High-performance tracking control can only be achieved if a good model of the dynamics is available. However, such a model is often difficult to obtain from first-order physics only. In this paper, we develop a data-driven control law that ensures closed-loop stability of Lagrangian systems. For this purpose, we use Gaussian Process regression for the feed-forward compensation of the unknown dynamics of the system. The gains of the feedback part are adapted based on the uncertainty of the learned model. Thus, the feedback gains are kept low as long as the learned model describes the true system sufficiently precisely. We show how to select a suitable gain adaptation law that incorporates the uncertainty of the model to guarantee a globally bounded tracking error. A simulation with a robot manipulator demonstrates the efficacy of the proposed control law. Comment: Please cite the conference paper. arXiv admin note: text overlap with arXiv:1806.0719

    Fast Model Identification via Physics Engines for Data-Efficient Policy Search

    This paper presents a method for identifying mechanical parameters of robots or objects, such as their mass and friction coefficients. Key features are the use of off-the-shelf physics engines and the adaptation of a Bayesian optimization technique towards minimizing the number of real-world experiments needed for model-based reinforcement learning. The proposed framework reproduces, in a physics engine, experiments performed on a real robot and optimizes the model's mechanical parameters so as to match real-world trajectories. The optimized model is then used for learning a policy in simulation, before real-world deployment. It is well understood, however, that it is hard to exactly reproduce real trajectories in simulation. Moreover, a near-optimal policy can frequently be found with an imperfect model. Therefore, this work proposes a strategy for identifying a model that is just good enough to approximate the value of a locally optimal policy with a certain confidence, instead of wasting effort on identifying the most accurate model. Evaluations, performed both in simulation and on a real robotic manipulation task, indicate that the proposed strategy results in an overall time-efficient, integrated model identification and learning solution, which significantly improves the data-efficiency of existing policy search algorithms. Comment: IJCAI 1

    Pseudospectral Model Predictive Control under Partially Learned Dynamics

    Trajectory optimization of a controlled dynamical system is an essential part of autonomy; however, many trajectory optimization techniques are limited by the fidelity of the underlying parametric model. In the field of robotics, a lack of model knowledge can be overcome with machine learning techniques, utilizing measurements to build a dynamical model from the data. This paper aims to take the middle ground between these two approaches by introducing a semi-parametric representation of the underlying system dynamics. Our goal is to leverage the considerable information contained in a traditional physics-based model and combine it with a data-driven, non-parametric regression technique known as a Gaussian Process. Integrating this semi-parametric model with model predictive pseudospectral control, we demonstrate this technique on both a cart pole and a quadrotor simulation with unmodeled damping and parametric error. In order to manage parametric uncertainty, we introduce an algorithm that utilizes Sparse Spectrum Gaussian Processes (SSGP) for online learning after each rollout. We implement this online learning technique on a cart pole and a quadrotor, then demonstrate the use of online learning and obstacle avoidance for the Dubins vehicle dynamics. Comment: Accepted but withdrawn from AIAA Scitech 201

    A New Data Source for Inverse Dynamics Learning

    Modern robotics is gravitating toward increasingly collaborative human-robot interaction. Tools such as acceleration policies can naturally support the realization of reactive, adaptive, and compliant robots. These tools require us to model the system dynamics accurately, which is a difficult task. The fundamental problem remains that simulation and reality diverge: we do not know how to accurately change a robot's state. Thus, recent research on improving inverse dynamics models has focused on making use of machine learning techniques. Traditional learning techniques train on the actual realized accelerations, instead of the policy's desired accelerations, which is an indirect data source. Here we show how an additional training signal, measured at the desired accelerations, can be derived from a feedback control signal. This effectively creates a second data source for learning inverse dynamics models. Furthermore, we show how both the traditional and this new data source can be used to train task-specific models of the inverse dynamics, when used independently or combined. We analyze the use of both data sources in simulation and demonstrate their effectiveness on a real-world robotic platform. We show that our system incrementally improves the learned inverse dynamics model and, when using both data sources combined, converges more consistently and faster. Comment: IROS 201

    Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics

    The most data-efficient algorithms for reinforcement learning in robotics are model-based policy search algorithms, which alternate between learning a dynamical model of the robot and optimizing a policy to maximize the expected return given the model and its uncertainties. Among the few proposed approaches, the recently introduced Black-DROPS algorithm exploits a black-box optimization algorithm to achieve both high data-efficiency and good computation times when several cores are used; nevertheless, like all model-based policy search approaches, Black-DROPS does not scale to high-dimensional state/action spaces. In this paper, we introduce a new model learning procedure in Black-DROPS that leverages parameterized black-box priors to (1) scale up to high-dimensional systems, and (2) be robust to large inaccuracies of the prior information. We demonstrate the effectiveness of our approach with the "pendubot" swing-up task in simulation and with a physical hexapod robot (48D state space, 18D action space) that has to walk forward as fast as possible. The results show that our new algorithm is more data-efficient than previous model-based policy search algorithms (with and without priors) and that it can allow a physical 6-legged robot to learn new gaits in only 16 to 30 seconds of interaction time. Comment: Accepted at ICRA 2018; 8 pages, 4 figures, 2 algorithms, 1 table; Video at https://youtu.be/HFkZkhGGzTo ; Spotlight ICRA presentation at https://youtu.be/_MZYDhfWeL

    Accelerating Nearest Neighbor Search on Manycore Systems

    We develop methods for accelerating metric similarity search that are effective on modern hardware. Our algorithms factor into easily parallelizable components, making them simple to deploy and efficient on multicore CPUs and GPUs. Despite the simple structure of our algorithms, their search performance is provably sublinear in the size of the database, with a factor dependent only on its intrinsic dimensionality. We demonstrate that our methods provide substantial speedups on a range of datasets and hardware platforms. In particular, we present results on a 48-core server machine, on graphics hardware, and on a multicore desktop.