Perseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attractive
and principled framework for agent planning under uncertainty. Point-based
approximate techniques for POMDPs compute a policy based on a finite set of
points collected in advance from the agent's belief space. We present a
randomized point-based value iteration algorithm called Perseus. The algorithm
performs approximate value backup stages, ensuring that in each backup stage
the value of each point in the belief set is improved; the key observation is
that a single backup may improve the value of many belief points. Contrary to
other point-based methods, Perseus backs up only a (randomly selected) subset
of points in the belief set, sufficient for improving the value of each belief
point in the set. We show how the same idea can be extended to deal with
continuous action spaces. Experimental results show the potential of Perseus in
large-scale POMDP problems.
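The randomized backup stage described above can be sketched for a discrete POMDP as follows. This is an illustrative reconstruction from the abstract, not the authors' code: the data layout (`T[a]` transition matrices, `O[a]` observation matrices, `R[a]` reward vectors, alpha vectors as NumPy arrays) is an assumption.

```python
import numpy as np

def backup(b, V, T, O, R, gamma):
    """Point-based backup of belief b against alpha-vector set V.
    T[a][s, s'] transition, O[a][s', o] observation, R[a][s] reward.
    Returns the alpha vector maximizing the value at b."""
    best_alpha, best_val = None, -np.inf
    for a in range(len(T)):
        alpha_a = R[a].astype(float)
        for o in range(O[a].shape[1]):
            # g[s, i] = sum_{s'} T[a][s, s'] * O[a][s', o] * alpha_i(s')
            g = T[a] @ (O[a][:, o:o + 1] * np.array(V).T)
            # Pick, per observation, the back-projected alpha best at b.
            alpha_a = alpha_a + gamma * g[:, np.argmax(b @ g)]
        if b @ alpha_a > best_val:
            best_alpha, best_val = alpha_a, b @ alpha_a
    return best_alpha

def perseus_stage(B, V, T, O, R, gamma, rng):
    """One randomized Perseus backup stage: back up randomly chosen
    beliefs until every point in B has improved (or kept) its value.
    A single backup often improves many points at once, so far fewer
    than |B| backups are typically needed."""
    old_vals = {i: max(b @ a for a in V) for i, b in enumerate(B)}
    todo = set(range(len(B)))
    V_new = []
    while todo:
        i = int(rng.choice(sorted(todo)))
        alpha = backup(B[i], V, T, O, R, gamma)
        if B[i] @ alpha >= old_vals[i]:
            V_new.append(alpha)
        else:  # keep the best old vector for this point instead
            V_new.append(max(V, key=lambda a: B[i] @ a))
        # Drop every belief whose value the new set already improves.
        todo = {j for j in todo
                if max(B[j] @ a for a in V_new) < old_vals[j]}
    return V_new
```

On a toy problem with one "stay put" action whose reward favors state 0, a single backed-up vector improves every belief point in one pass, illustrating the key observation from the abstract.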
PLGRIM: Hierarchical Value Learning for Large-scale Exploration in Unknown Environments
In order for an autonomous robot to efficiently explore an unknown
environment, it must account for uncertainty in sensor measurements, hazard
assessment, localization, and motion execution. Making decisions for maximal
reward in a stochastic setting requires value learning and policy construction
over a belief space, i.e., probability distribution over all possible
robot-world states. However, belief space planning in a large spatial
environment over long temporal horizons suffers from severe computational
challenges. Moreover, constructed policies must safely adapt to unexpected
changes in the belief at runtime. This work proposes a scalable value learning
framework, PLGRIM (Probabilistic Local and Global Reasoning on Information
roadMaps), that bridges the gap between (i) local, risk-aware resiliency and
(ii) global, reward-seeking mission objectives. Leveraging hierarchical belief
space planners with information-rich graph structures, PLGRIM addresses
large-scale exploration problems while providing locally near-optimal coverage
plans. We validate our proposed framework with high-fidelity dynamic
simulations in diverse environments and on physical robots in Martian-analog
lava tubes.
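The local/global hierarchy described above can be sketched as two cooperating layers; every name here is a hypothetical illustration, not the authors' API. The global layer reasons over a sparse roadmap graph to pick a reward-seeking subgoal, and the local layer refines it into a risk-aware decision over the robot's immediate belief.

```python
import heapq

def global_plan(roadmap, start, info_gain, travel_cost):
    """Global layer: Dijkstra over a sparse information roadmap,
    then trade off expected information gain against travel cost.
    roadmap: {node: [(neighbor, edge_cost), ...]}."""
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in roadmap[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    # Reward-seeking objective: information gain minus weighted distance.
    return max(dist, key=lambda n: info_gain.get(n, 0.0) - travel_cost * dist[n])

def local_policy(subgoal, hazard_prob, risk_tol=0.2):
    """Local layer: risk-aware resiliency check on the believed hazard
    probability; fall back to a safe behavior if risk is too high."""
    return ('go_to', subgoal) if hazard_prob <= risk_tol else ('hold', None)
```

The split mirrors the abstract's bridge between (i) local risk-aware resiliency and (ii) global reward-seeking objectives: the global choice is cheap because it ignores fine-grained hazards, which the local layer then vets at runtime.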
KR: An Architecture for Knowledge Representation and Reasoning in Robotics
This paper describes an architecture that combines the complementary
strengths of declarative programming and probabilistic graphical models to
enable robots to represent, reason with, and learn from, qualitative and
quantitative descriptions of uncertainty and knowledge. An action language is
used for the low-level (LL) and high-level (HL) system descriptions in the
architecture, and the definition of recorded histories in the HL is expanded to
allow prioritized defaults. For any given goal, tentative plans created in the
HL using default knowledge and commonsense reasoning are implemented in the LL
using probabilistic algorithms, with the corresponding observations used to
update the HL history. Tight coupling between the two levels enables automatic
selection of relevant variables and generation of suitable action policies in
the LL for each HL action, and supports reasoning with violation of defaults,
noisy observations and unreliable actions in large and complex domains. The
architecture is evaluated in simulation and on physical robots transporting
objects in indoor domains; the benefit on robots is a reduction in task
execution time of 39% compared with a purely probabilistic, but still
hierarchical, approach.
Comment: The paper appears in the Proceedings of the 15th International Workshop on Non-Monotonic Reasoning (NMR 2014).
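The HL/LL coupling described above can be summarized in a short control loop; the function names are illustrative placeholders, not the paper's interface.

```python
def hl_ll_loop(goal, hl_plan, ll_execute, history, max_replans=10):
    """Sketch of the tight HL/LL coupling: the HL produces a tentative
    plan from default knowledge and commonsense reasoning; each HL
    action is grounded by a probabilistic LL controller; LL
    observations are appended to the recorded HL history and, when a
    default turns out to be violated, trigger replanning."""
    for _ in range(max_replans):
        plan = hl_plan(goal, history)      # defaults + commonsense reasoning
        if plan is None:
            return False                   # no plan achieves the goal
        for action in plan:
            obs = ll_execute(action)       # probabilistic LL execution
            history.append((action, obs))  # observations update HL history
            if obs == 'failed':            # a default was violated: replan
                break
        else:
            return True                    # every action succeeded
    return False
```

The replan bound stands in for whatever termination criterion a real architecture would use; the essential point is the feedback edge from LL observations back into the HL history.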
Analysis of methods for playing human robot hide-and-seek in a simple real world urban environment
The hide-and-seek game has many interesting aspects for studying cognitive functions in robots and the interactions between mobile robots and humans. Several MOMDP (Mixed Observability Markov Decision Process) models and a heuristic-based method are proposed and evaluated as automated seekers. MOMDPs are used because the hider's position is not always known (partially observable), while the seeker's position is fully observable. The MOMDP model is used in an off-line method for which two reward functions are tried. Because the time complexity of this model grows exponentially with the number of (partially observable) states, an on-line hierarchical MOMDP model is proposed to handle bigger maps. To reduce the number of states in the on-line method, a robot-centered segmentation is used. In addition to extensive simulations, games with a human hider and a real mobile robot as a seeker have been carried out in a simple urban environment.
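The mixed-observability factoring that makes MOMDPs attractive here can be shown in a minimal belief update: the belief is maintained only over the hidden factor (the hider's position), while the seeker's position enters as a known variable. This is a generic sketch of that idea, with illustrative names, not the paper's model.

```python
def momdp_belief_update(belief, seeker_pos, obs, trans, obs_model):
    """Bayes filter over the hidden factor only (the hider's cell).
    belief: {cell: prob}; trans: {cell: {next_cell: prob}};
    obs_model(seeker_pos, next_cell, obs) -> likelihood.
    The fully observable seeker position conditions the observation
    model but never needs to be tracked in the belief."""
    new = {}
    for h, p in belief.items():
        for h2, t in trans[h].items():
            new[h2] = new.get(h2, 0.0) + p * t * obs_model(seeker_pos, h2, obs)
    z = sum(new.values())
    return {h2: p / z for h2, p in new.items()} if z > 0 else new
```

Keeping the distribution over only the hider's position is what keeps the state space (and hence planning cost) far smaller than a full POMDP over the joint seeker-hider state.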
Autonomous surveillance robots: a decision-making framework for networked multiagent systems
This article proposes an architecture for an intelligent surveillance system, where the aim is to mitigate the burden on humans in conventional surveillance systems by incorporating intelligent interfaces, computer vision, and autonomous mobile robots. Central to the intelligent surveillance system is the application of research into planning and decision making in this novel context. In this article, we describe the robot surveillance decision problem and explain how the integration of components in our system supports fully automated decision making. Several concrete scenarios deployed in real surveillance environments exemplify both the flexibility of our system to experiment with different representations and algorithms and the portability of our system into a variety of problem contexts. Moreover, these scenarios demonstrate how planning enables robots to effectively balance surveillance objectives, autonomously performing the job of human patrols and responders.
This work was partially supported by the Portuguese Fundação para a Ciência e a Tecnologia (FCT), through strategic funding for the Institute for Systems and Robotics/Laboratory for Robotics and Engineering Systems (ISR/LARSyS) under grant PEst-OE/EEI/LA0021/2013 and through the Carnegie Mellon Portugal Program under grant CMU-PT/SIA/0023/2009. This study also received national funds through the FCT, with reference UID/CEC/S0021/2013, and through grant FCT UID/EEA/50009/2013 of ISR/LARSyS.
Partially Observable Monte Carlo Planning with state variable constraints for mobile robot navigation
Autonomous mobile robots employed in industrial applications often operate in complex and uncertain environments. In this paper we propose an approach based on an extension of Partially Observable Monte Carlo Planning (POMCP) for robot velocity regulation in industrial-like environments characterized by uncertain motion difficulties. The velocity selected by POMCP is used by a standard engine controller which deals with path planning. This two-layer approach allows POMCP to exploit prior knowledge about similarities between tasks to improve performance in terms of the time needed to traverse a path with obstacles. We also propose three measures that support human understanding of the strategy POMCP uses to improve performance. The overall architecture is tested on a Turtlebot3 in two environments, a rectangular path and a realistic production line in a research lab. Tests performed on a C++ simulator confirm the capability of the proposed approach to profitably use prior knowledge, achieving a performance improvement from 0.7% to 3.1% depending on the complexity of the path. Experiments on a Unity simulator show that the proposed two-layer approach also outperforms single-layer approaches based only on the engine controller (i.e., without the POMCP layer). In this case the performance improvement is up to 37% compared to a state-of-the-art deep reinforcement learning engine controller, and up to 51% compared to the standard ROS engine controller. Finally, experiments in a real-world testing arena confirm that the approach can run on real robots.
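The root of a POMCP-style search for velocity regulation can be sketched as follows. This is a deliberately flattened illustration under stated assumptions: the hidden state (the motion difficulty of the upcoming segment) is a particle set, candidate velocities are evaluated by simulated rollouts, and UCB1 balances exploration and exploitation. A real POMCP grows a full search tree and reuses it across steps; all names here are illustrative.

```python
import math
import random

def pomcp_select_velocity(particles, velocities, simulate, n_sims=500, c=1.0):
    """Root-level Monte Carlo action selection with a particle belief.
    particles: sampled hidden states (e.g. segment difficulties);
    simulate(state, velocity) -> rollout return."""
    n = {v: 0 for v in velocities}      # visit counts
    q = {v: 0.0 for v in velocities}    # running mean returns
    for t in range(1, n_sims + 1):
        s = random.choice(particles)    # sample a hidden state from the belief
        # UCB1: unvisited actions first, then mean return + exploration bonus.
        v = max(velocities,
                key=lambda a: float('inf') if n[a] == 0
                else q[a] + c * math.sqrt(math.log(t) / n[a]))
        r = simulate(s, v)              # one simulated rollout
        n[v] += 1
        q[v] += (r - q[v]) / n[v]       # incremental mean update
    return max(velocities, key=lambda a: q[a])
```

With a toy reward that penalizes the gap between chosen velocity and the (hidden) sustainable velocity for the segment, the search converges on the matching candidate.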