
Risk-aware navigation for UAV digital data collection

Author: Xing Zhi
Publication venue: SURFACE at Syracuse University
Publication date: 25/08/2017

This thesis studies the navigation task for autonomous UAVs collecting digital data in a risky environment. Three problem formulations are proposed, each matching a different real-world situation. First, we focus on uniform probabilistic risk and assume the UAV has an unlimited amount of energy. Under these assumptions, we provide the graph-based Data-collecting Robot Problem (DRP) model and propose heuristic planning solutions that consist of a clustering step and a tour-building step. Experiments show that our methods provide high-quality solutions with high expected reward. Second, we investigate non-uniform probabilistic risk and a limited UAV energy capacity. We present the Data-collection Problem (DCP) to model the task. DCP is a grid-based Markov decision process, and we use reinforcement learning with a deep Ensemble Navigation Network (ENN) to tackle it. Given four simple navigation algorithms and some additional heuristic information, ENN is able to find improved solutions. Finally, we consider risk in the form of an opponent, again with limited UAV energy capacity, for which we resort to the Data-collection Game (DCG) model. DCG is a grid-based two-player stochastic game in which the opponent may follow different strategies. We propose opponent modeling to improve data-collection efficiency, design four deep neural networks that model the opponent's behavior at different levels, and show empirically that explicit opponent modeling with a dedicated network provides superior performance.

Syracuse University Research Facility and Collaborative Environment


Action selection in modular reinforcement learning

Author: Zhang Ruohan
Publication date: 16/09/2014

Modular reinforcement learning is an approach to resolving the curse of dimensionality in traditional reinforcement learning. We design and implement a modular reinforcement learning algorithm based on three major components: Markov decision process decomposition, module training, and global action selection. We define and formalize the concepts of module class and module instance in the decomposition step. Under our decomposition framework, we train each module efficiently using the SARSA(λ) algorithm. We then design, implement, test, and compare three action selection algorithms based on different heuristics: Module Combination, Module Selection, and Module Voting. For the last two algorithms, we propose a method to calculate module weights efficiently, using the standard deviation of each module's Q-values. We show that the Module Combination and Module Voting algorithms produce satisfactory performance in our test domain.
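The abstract above mentions weighting modules by the standard deviation of their Q-values but does not give the formula. The following is only a rough sketch of one way such a scheme could look, not the author's method. The function names `module_weights` and `module_voting`, the normalization to weights that sum to 1, and the rule that each module casts its weight for its own greedy action are all assumptions made for illustration.

```python
import statistics


def module_weights(q_values_per_module):
    """Hypothetical weighting: each module's weight is the population
    standard deviation of its Q-values, normalized so weights sum to 1.
    A module whose Q-values barely differ is treated as indifferent."""
    spreads = [statistics.pstdev(q) for q in q_values_per_module]
    total = sum(spreads)
    if total == 0:
        # All modules are indifferent: fall back to uniform weights.
        return [1.0 / len(spreads)] * len(spreads)
    return [s / total for s in spreads]


def module_voting(q_values_per_module):
    """Hypothetical voting rule: each module votes for its greedy action
    with its weight; the action with the largest total vote is returned."""
    weights = module_weights(q_values_per_module)
    n_actions = len(q_values_per_module[0])
    votes = [0.0] * n_actions
    for weight, q in zip(weights, q_values_per_module):
        greedy_action = max(range(n_actions), key=lambda a: q[a])
        votes[greedy_action] += weight
    return max(range(n_actions), key=lambda a: votes[a])


# Two nearly indifferent modules prefer action 0; one module with a
# strong preference picks action 1 and carries most of the weight.
q_table = [[1.0, 0.9], [1.0, 0.9], [0.0, 10.0]]
chosen = module_voting(q_table)
```

In this sketch a module with a wide spread of Q-values is assumed to care more about the choice, so it outweighs modules that rate all actions almost equally.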

Texas ScholarWorks

Reinforcement Learning: A Survey

Authors: Kaelbling L. P., Littman M. L., Moore A. W.
Publication date: 01/01/1996

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX

Active Reward Learning for Co-Robotic Vision Based Exploration in Bandwidth Limited Environments

Authors: Girdhar Yogesh, How Jonathan P., Jamieson Stewart
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 10/03/2020

We present a novel POMDP problem formulation for a robot that must autonomously decide where to go to collect new and scientifically relevant images given a limited ability to communicate with its human operator. From this formulation we derive constraints and design principles for the observation model, reward model, and communication strategy of such a robot, exploring techniques to deal with the very high-dimensional observation space and scarcity of relevant training data. We introduce a novel active reward learning strategy based on making queries to help the robot minimize path "regret" online, and evaluate it for suitability in autonomous visual exploration through simulations. We demonstrate that, in some bandwidth-limited environments, this novel regret-based criterion enables the robotic explorer to collect up to 17% more reward per mission than the next-best criterion.

Comment: 7 pages, 4 figures; accepted for presentation in IEEE Int. Conf. on Robotics and Automation, ICRA '20, Paris, France, June 202

arXiv.org e-Print Archive

Crossref

DSpace@MIT