30,614 research outputs found

Learning to Prevent Monocular SLAM Failure using Reinforcement Learning

Authors: Bhowmick Brojeshwar; Daga Swapnil; Krishna K. Madhava; Pareekutty Nahas; Prasad Vignesh; Ravindran Balaraman; Saurabh Rohitashva Singh; Yadav Karmesh
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/01/2020

Monocular SLAM refers to using a single camera to estimate robot ego-motion while building a map of the environment. While Monocular SLAM is a well-studied problem, automating Monocular SLAM by integrating it with trajectory planning frameworks is particularly challenging. This paper presents a novel formulation based on Reinforcement Learning (RL) that generates fail-safe trajectories wherein the SLAM-generated outputs do not deviate significantly from their true values. In essence, the RL framework successfully learns the otherwise complex relation between perceptual inputs and motor actions and uses this knowledge to generate trajectories that do not cause failure of SLAM. We show systematically in simulations how the quality of the SLAM dramatically improves when trajectories are computed using RL. Our method scales effectively across Monocular SLAM frameworks in both simulation and real-world experiments with a mobile robot.

Comment: Accepted at the 11th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 2018. More info can be found at the project page at https://robotics.iiit.ac.in/people/vignesh.prasad/SLAMSafePlanner.html and the supplementary video can be found at https://www.youtube.com/watch?v=420QmM_Z8v

arXiv.org e-Print Archive

Crossref

Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems

Authors: Ames Aaron D.; Dorobantu Victor D.; Le Hoang M.; Taylor Andrew J.; Yue Yisong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/03/2019

Many modern nonlinear control methods aim to endow systems with guaranteed properties, such as stability or safety, and have been successfully applied to the domain of robotics. However, model uncertainty remains a persistent challenge, weakening theoretical guarantees and causing implementation failures on physical systems. This paper develops a machine learning framework centered around Control Lyapunov Functions (CLFs) to adapt to parametric uncertainty and unmodeled dynamics in general robotic systems. Our proposed method proceeds by iteratively updating estimates of Lyapunov function derivatives and improving controllers, ultimately yielding a stabilizing quadratic program model-based controller. We validate our approach on a planar Segway simulation, demonstrating substantial performance improvements by iteratively refining on a base model-free controller.

arXiv.org e-Print Archive

Crossref

Caltech Authors

Discovering Blind Spots in Reinforcement Learning

Authors: Dey Debadeepta; Horvitz Eric; Kamar Ece; Ramakrishnan Ramya; Shah Julie
Publication date: 23/05/2018

Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots.

Comment: To appear at AAMAS 201

arXiv.org e-Print Archive

DSpace@MIT