53 research outputs found

Grounding action in visuo-haptic space using experience networks

Author: Glover Arren
Schulz Ruth
Wiles Janet
Wyeth Gordon
Publication venue: Australian Robotics & Automation Association
Publication date: 01/01/2010

Traditional approaches to the use of machine learning algorithms do not provide a method to learn multiple tasks in one shot on an embodied robot. It is proposed that grounding actions within the sensory space leads to the development of action-state relationships which can be re-used despite a change in task. A novel approach called an Experience Network is developed and assessed on a real-world robot required to perform three separate tasks. After grounded representations were developed in the initial task, only minimal further learning was required to perform the second and third tasks.

CiteSeerX

Queensland University of Technology ePrints Archive

University of Queensland eSpace

Sequential Gaussian Processes for Online Learning of Nonstationary Functions

Author: Dumitrascu Bianca
Engelhardt Barbara E.
Williamson Sinead A.
Zhang Michael Minyi
Publication venue
Publication date: 16/10/2019

Many machine learning problems can be framed in the context of estimating functions, and often these are time-dependent functions that are estimated in real-time as observations arrive. Gaussian processes (GPs) are an attractive choice for modeling real-valued nonlinear functions due to their flexibility and uncertainty quantification. However, the typical GP regression model suffers from several drawbacks: i) conventional GP inference scales as O(N^3) with respect to the number of observations; ii) updating a GP model sequentially is not trivial; and iii) covariance kernels often enforce stationarity constraints on the function, while GPs with non-stationary covariance kernels are often intractable to use in practice. To overcome these issues, we propose an online sequential Monte Carlo algorithm to fit mixtures of GPs that capture non-stationary behavior while allowing for fast, distributed inference. By formulating hyperparameter optimization as a multi-armed bandit problem, we accelerate mixing for real time inference. Our approach empirically improves performance over state-of-the-art methods for online GP estimation in the context of prediction for simulated non-stationary data and hospital time series data.
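
To make the cubic scaling in drawback (i) concrete, here is a minimal sketch of conventional (batch) GP regression in plain Python. It is not the sequential Monte Carlo method the abstract proposes. The function names, the squared-exponential kernel, and the noise value are illustrative assumptions. The three nested loops in the Cholesky factorization are where the cubic cost in the number of observations comes from.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs (a stationary kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Return lower-triangular L with L L^T = A.

    The loops over i, j and the inner sum over k give the O(N^3) cost.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior_mean(xs, ys, x_star, noise=1e-6):
    """Posterior mean at x_star for a zero-mean GP (assumed noise variance)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    alpha = solve_cholesky(cholesky(K), ys)  # alpha = (K + noise * I)^{-1} y
    return sum(rbf(x_star, x) * a for x, a in zip(xs, alpha))
```

Every new observation enlarges `K`, so a naive sequential update would redo the whole factorization, which is the difficulty noted in drawback (ii).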

arXiv.org e-Print Archive

Deep Forward and Inverse Perceptual Models for Tracking and Prediction

Author: Boots Byron
Lambert Alexander
Liu Zhen
Raj Amit
Shaban Amirreza
Publication venue
Publication date: 19/05/2018

We consider the problems of learning forward models that map state to high-dimensional images and inverse models that map high-dimensional images to state in robotics. Specifically, we present a perceptual model for generating video frames from state with deep networks, and provide a framework for its use in tracking and prediction tasks. We show that our proposed model greatly outperforms standard deconvolutional methods and GANs for image generation, producing clear, photo-realistic images. We also develop a convolutional neural network model for state estimation and compare the result to an Extended Kalman Filter to estimate robot trajectories. We validate all models on a real robotic system. Comment: 8 pages, International Conference on Robotics and Automation (ICRA) 201

arXiv.org e-Print Archive

Crossref

Active Inference for Integrated State-Estimation, Control, and Learning

Author: Baioumy Mohamed
Duckworth Paul
Hawes Nick
Lacerda Bruno
Publication venue
Publication date: 30/03/2021

This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators. It is based on the active inference framework, prominent in computational neuroscience as a theory of the brain, where behaviour arises from minimizing variational free-energy. The robotic manipulator shows adaptive and robust behaviour compared to state-of-the-art methods. Additionally, we show the exact relationship to classic methods such as PID control. Finally, we show that by learning a temporal parameter and model variances, our approach can deal with unmodelled dynamics, damp oscillations, and remain robust against disturbances and poor initial parameters. The approach is validated on the 'Franka Emika Panda' 7 DoF manipulator. Comment: 7 pages, 6 figures, accepted for presentation at the International Conference on Robotics and Automation (ICRA) 202

arXiv.org e-Print Archive

Oxford University Research Archive

Robust Filtering and Smoothing with Gaussian Processes

Author: Deisenroth MP
Hanebeck UD
Huber MF
Rasmussen CE
Turner RD
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2012

We propose a principled algorithm for robust Bayesian filtering and smoothing in nonlinear stochastic dynamic systems when both the transition function and the measurement function are described by non-parametric Gaussian process (GP) models. GPs are gaining increasing importance in signal processing, machine learning, robotics, and control for representing unknown system functions by posterior probability distributions. This modern way of "system identification" is more robust than finding point estimates of a parametric function representation. In this article, we present a principled algorithm for robust analytic smoothing in GP dynamic systems, which are increasingly used in robotics and control. Our numerical evaluations demonstrate the robustness of the proposed approach in situations where other state-of-the-art Gaussian filters and smoothers can fail. Comment: 7 pages, 1 figure, draft version of paper accepted at IEEE Transactions on Automatic Contro

arXiv.org e-Print Archive

TUbiblio

Spiral - Imperial College Digital Repository

CUED - Cambridge University Engineering Department

Safe Learning of Quadrotor Dynamics Using Barrier Certificates

Author: Egerstedt Magnus
Theodorou Evangelos A.
Wang Li
Publication venue
Publication date: 15/10/2017

To effectively control complex dynamical systems, accurate nonlinear models are typically needed. However, these models are not always known. In this paper, we present a data-driven approach based on Gaussian processes that learns models of quadrotors operating in partially unknown environments. What makes this challenging is that if the learning process is not carefully controlled, the system will become unstable, i.e., the quadcopter will crash. To this end, barrier certificates are employed for safe learning. The barrier certificates establish a non-conservative forward invariant safe region, in which high-probability safety guarantees are provided based on the statistics of the Gaussian process. A learning controller is designed to efficiently explore those uncertain states and expand the barrier-certified safe region based on an adaptive sampling scheme. In addition, a recursive Gaussian process prediction method is developed to learn the complex quadrotor dynamics in real-time. Simulation results are provided to demonstrate the effectiveness of the proposed approach. Comment: Submitted to ICRA 2018, 8 page

arXiv.org e-Print Archive

Crossref

A New Data Source for Inverse Dynamics Learning

Author: Kappler Daniel
Meier Franziska
Ratliff Nathan
Schaal Stefan
Publication venue
Publication date: 01/01/2017

Modern robotics is gravitating toward increasingly collaborative human-robot interaction. Tools such as acceleration policies can naturally support the realization of reactive, adaptive, and compliant robots. These tools require us to model the system dynamics accurately -- a difficult task. The fundamental problem remains that simulation and reality diverge -- we do not know how to accurately change a robot's state. Thus, recent research on improving inverse dynamics models has been focused on making use of machine learning techniques. Traditional learning techniques train on the actual realized accelerations, instead of the policy's desired accelerations, which is an indirect data source. Here we show how an additional training signal -- measured at the desired accelerations -- can be derived from a feedback control signal. This effectively creates a second data source for learning inverse dynamics models. Furthermore, we show how both the traditional and this new data source can be used to train task-specific models of the inverse dynamics, when used independently or combined. We analyze the use of both data sources in simulation and demonstrate its effectiveness on a real-world robotic platform. We show that our system incrementally improves the learned inverse dynamics model and, when using both data sources combined, converges more consistently and faster. Comment: IROS 201

arXiv.org e-Print Archive

Crossref

MPG.PuRe

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Author: Burdick Joel W.
Cheng Richard
Murray Richard M.
Orosz Gabor
Publication venue
Publication date: 01/02/2019

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process. Comment: Published in AAAI 201
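
The following is a minimal sketch of how a control barrier function can constrain a learned controller's actions. It is not the RL-CBF algorithm itself, which also learns the dynamics with GPs. The sketch assumes fully known toy dynamics `x_dot = u`, a hypothetical barrier `h(x) = x_max - x`, and a constant stand-in for the RL policy. All names and parameter values are illustrative.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Return the action closest to u_nominal that satisfies the barrier condition.

    Assumed toy dynamics: x_dot = u.
    Barrier function: h(x) = x_max - x, with the safe set defined by h >= 0.
    The CBF condition h_dot >= -alpha * h becomes u <= alpha * (x_max - x).
    With a single scalar bound, the usual quadratic program reduces to a clip.
    """
    return min(u_nominal, alpha * (x_max - x))

def rollout(x0, policy, steps=100, dt=0.1, x_max=1.0, alpha=2.0):
    """Euler-integrate x_dot = u, applying the safety filter at every step."""
    xs = [x0]
    for _ in range(steps):
        u = cbf_filter(xs[-1], policy(xs[-1]), x_max, alpha)
        xs.append(xs[-1] + dt * u)
    return xs

# Stand-in for an exploring RL policy: always pushes toward the unsafe boundary.
trajectory = rollout(0.0, policy=lambda x: 5.0)
```

In this discretized toy the state stays at or below `x_max` provided `alpha * dt <= 1`. The filter leaves the nominal action unchanged whenever it already satisfies the constraint, so only unsafe actions are modified.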

arXiv.org e-Print Archive

Caltech Authors

Association for the Advancement of Artificial Intelligence: AAAI Publications