Perseus: Randomized Point-based Value Iteration for POMDPs
Partially observable Markov decision processes (POMDPs) form an attractive
and principled framework for agent planning under uncertainty. Point-based
approximate techniques for POMDPs compute a policy based on a finite set of
points collected in advance from the agent's belief space. We present a
randomized point-based value iteration algorithm called Perseus. The algorithm
performs approximate value backup stages, ensuring that in each backup stage
the value of each point in the belief set is improved; the key observation is
that a single backup may improve the value of many belief points. Contrary to
other point-based methods, Perseus backs up only a (randomly selected) subset
of points in the belief set, sufficient for improving the value of each belief
point in the set. We show how the same idea can be extended to deal with
continuous action spaces. Experimental results show the potential of Perseus in
large-scale POMDP problems.
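The randomized backup stage described above can be sketched for a discrete POMDP as follows. This is an illustrative reconstruction from the abstract, not the authors' code: the data layout (`T[a]` transition matrices, `O[a]` observation matrices, `R[a]` reward vectors, alpha vectors as NumPy arrays) is an assumption.

```python
import numpy as np

def backup(b, V, T, O, R, gamma):
    """Point-based backup of belief b against alpha-vector set V.
    T[a][s, s'] transition, O[a][s', o] observation, R[a][s] reward.
    Returns the alpha vector maximizing the value at b."""
    best_alpha, best_val = None, -np.inf
    for a in range(len(T)):
        alpha_a = R[a].astype(float)
        for o in range(O[a].shape[1]):
            # g[s, i] = sum_{s'} T[a][s, s'] * O[a][s', o] * alpha_i(s')
            g = T[a] @ (O[a][:, o:o + 1] * np.array(V).T)
            # Pick, per observation, the back-projected alpha best at b.
            alpha_a = alpha_a + gamma * g[:, np.argmax(b @ g)]
        if b @ alpha_a > best_val:
            best_alpha, best_val = alpha_a, b @ alpha_a
    return best_alpha

def perseus_stage(B, V, T, O, R, gamma, rng):
    """One randomized Perseus backup stage: back up randomly chosen
    beliefs until every point in B has improved (or kept) its value.
    A single backup often improves many points at once, so far fewer
    than |B| backups are typically needed."""
    old_vals = {i: max(b @ a for a in V) for i, b in enumerate(B)}
    todo = set(range(len(B)))
    V_new = []
    while todo:
        i = int(rng.choice(sorted(todo)))
        alpha = backup(B[i], V, T, O, R, gamma)
        if B[i] @ alpha >= old_vals[i]:
            V_new.append(alpha)
        else:  # keep the best old vector for this point instead
            V_new.append(max(V, key=lambda a: B[i] @ a))
        # Drop every belief whose value the new set already improves.
        todo = {j for j in todo
                if max(B[j] @ a for a in V_new) < old_vals[j]}
    return V_new
```

On a toy problem with one "stay put" action whose reward favors state 0, a single backed-up vector improves every belief point in one pass, illustrating the key observation from the abstract.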
PLGRIM: Hierarchical Value Learning for Large-scale Exploration in Unknown Environments
In order for an autonomous robot to efficiently explore an unknown
environment, it must account for uncertainty in sensor measurements, hazard
assessment, localization, and motion execution. Making decisions for maximal
reward in a stochastic setting requires value learning and policy construction
over a belief space, i.e., probability distribution over all possible
robot-world states. However, belief space planning in a large spatial
environment over long temporal horizons suffers from severe computational
challenges. Moreover, constructed policies must safely adapt to unexpected
changes in the belief at runtime. This work proposes a scalable value learning
framework, PLGRIM (Probabilistic Local and Global Reasoning on Information
roadMaps), that bridges the gap between (i) local, risk-aware resiliency and
(ii) global, reward-seeking mission objectives. Leveraging hierarchical belief
space planners with information-rich graph structures, PLGRIM addresses
large-scale exploration problems while providing locally near-optimal coverage
plans. We validate our proposed framework with high-fidelity dynamic
simulations in diverse environments and on physical robots in Martian-analog
lava tubes.
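The local/global hierarchy described above can be sketched as two cooperating layers; every name here is a hypothetical illustration, not the authors' API. The global layer reasons over a sparse roadmap graph to pick a reward-seeking subgoal, and the local layer refines it into a risk-aware decision over the robot's immediate belief.

```python
import heapq

def global_plan(roadmap, start, info_gain, travel_cost):
    """Global layer: Dijkstra over a sparse information roadmap,
    then trade off expected information gain against travel cost.
    roadmap: {node: [(neighbor, edge_cost), ...]}."""
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in roadmap[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    # Reward-seeking objective: information gain minus weighted distance.
    return max(dist, key=lambda n: info_gain.get(n, 0.0) - travel_cost * dist[n])

def local_policy(subgoal, hazard_prob, risk_tol=0.2):
    """Local layer: risk-aware resiliency check on the believed hazard
    probability; fall back to a safe behavior if risk is too high."""
    return ('go_to', subgoal) if hazard_prob <= risk_tol else ('hold', None)
```

The split mirrors the abstract's bridge between (i) local risk-aware resiliency and (ii) global reward-seeking objectives: the global choice is cheap because it ignores fine-grained hazards, which the local layer then vets at runtime.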
KR: An Architecture for Knowledge Representation and Reasoning in Robotics
This paper describes an architecture that combines the complementary
strengths of declarative programming and probabilistic graphical models to
enable robots to represent, reason with, and learn from, qualitative and
quantitative descriptions of uncertainty and knowledge. An action language is
used for the low-level (LL) and high-level (HL) system descriptions in the
architecture, and the definition of recorded histories in the HL is expanded to
allow prioritized defaults. For any given goal, tentative plans created in the
HL using default knowledge and commonsense reasoning are implemented in the LL
using probabilistic algorithms, with the corresponding observations used to
update the HL history. Tight coupling between the two levels enables automatic
selection of relevant variables and generation of suitable action policies in
the LL for each HL action, and supports reasoning with violation of defaults,
noisy observations and unreliable actions in large and complex domains. The
architecture is evaluated in simulation and on physical robots transporting
objects in indoor domains; the benefit on robots is a reduction in task
execution time of 39% compared with a purely probabilistic, but still
hierarchical, approach.
Comment: The paper appears in the Proceedings of the 15th International Workshop on Non-Monotonic Reasoning (NMR 2014).
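The HL/LL coupling described above can be summarized in a short control loop; the function names are illustrative placeholders, not the paper's interface.

```python
def hl_ll_loop(goal, hl_plan, ll_execute, history, max_replans=10):
    """Sketch of the tight HL/LL coupling: the HL produces a tentative
    plan from default knowledge and commonsense reasoning; each HL
    action is grounded by a probabilistic LL controller; LL
    observations are appended to the recorded HL history and, when a
    default turns out to be violated, trigger replanning."""
    for _ in range(max_replans):
        plan = hl_plan(goal, history)      # defaults + commonsense reasoning
        if plan is None:
            return False                   # no plan achieves the goal
        for action in plan:
            obs = ll_execute(action)       # probabilistic LL execution
            history.append((action, obs))  # observations update HL history
            if obs == 'failed':            # a default was violated: replan
                break
        else:
            return True                    # every action succeeded
    return False
```

The replan bound stands in for whatever termination criterion a real architecture would use; the essential point is the feedback edge from LL observations back into the HL history.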
Analysis of methods for playing human robot hide-and-seek in a simple real world urban environment
The hide-and-seek game has many interesting aspects for studying cognitive functions in robots and the interactions between mobile robots and humans. Several MOMDP (Mixed Observability Markov Decision Process) models and a heuristic-based method are proposed and evaluated as automated seekers. MOMDPs are used because the hider's position is not always known (partially observable), while the seeker's position is fully observable. The MOMDP model is used in an off-line method for which two reward functions are tried. Because the time complexity of this model grows exponentially with the number of (partially observable) states, an on-line hierarchical MOMDP model is proposed to handle bigger maps. To reduce the number of states in the on-line method, a robot-centered segmentation is used. In addition to extensive simulations, games with a human hider and a real mobile robot as a seeker have been carried out in a simple urban environment.
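The mixed-observability factoring that makes MOMDPs attractive here can be shown in a minimal belief update: the belief is maintained only over the hidden factor (the hider's position), while the seeker's position enters as a known variable. This is a generic sketch of that idea, with illustrative names, not the paper's model.

```python
def momdp_belief_update(belief, seeker_pos, obs, trans, obs_model):
    """Bayes filter over the hidden factor only (the hider's cell).
    belief: {cell: prob}; trans: {cell: {next_cell: prob}};
    obs_model(seeker_pos, next_cell, obs) -> likelihood.
    The fully observable seeker position conditions the observation
    model but never needs to be tracked in the belief."""
    new = {}
    for h, p in belief.items():
        for h2, t in trans[h].items():
            new[h2] = new.get(h2, 0.0) + p * t * obs_model(seeker_pos, h2, obs)
    z = sum(new.values())
    return {h2: p / z for h2, p in new.items()} if z > 0 else new
```

Keeping the distribution over only the hider's position is what keeps the state space (and hence planning cost) far smaller than a full POMDP over the joint seeker-hider state.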
Autonomous surveillance robots: a decision-making framework for networked multiagent systems
This article proposes an architecture for an intelligent surveillance system, where the aim is to mitigate the burden on humans in conventional surveillance systems by incorporating intelligent interfaces, computer vision, and autonomous mobile robots. Central to the intelligent surveillance system is the application of research into planning and decision making in this novel context. In this article, we describe the robot surveillance decision problem and explain how the integration of components in our system supports fully automated decision making. Several concrete scenarios deployed in real surveillance environments exemplify both the flexibility of our system to experiment with different representations and algorithms and the portability of our system into a variety of problem contexts. Moreover, these scenarios demonstrate how planning enables robots to effectively balance surveillance objectives, autonomously performing the job of human patrols and responders.
This work was partially supported by the Portuguese Fundação para a Ciência e a Tecnologia (FCT), through strategic funding for the Institute for Systems and Robotics/Laboratory for Robotics and Engineering Systems (ISR/LARSyS) under grant PEst-OE/EEI/LA0021/2013 and through the Carnegie Mellon Portugal Program under grant CMU-PT/SIA/0023/2009. This study also received national funds through the FCT, with reference UID/CEC/S0021/2013, and through grant FCT UID/EEA/50009/2013 of ISR/LARSyS.
Partially Observable Monte Carlo Planning with state variable constraints for mobile robot navigation
Autonomous mobile robots employed in industrial applications often operate in complex and uncertain environments. In this paper we propose an approach based on an extension of Partially Observable Monte Carlo Planning (POMCP) for robot velocity regulation in industrial-like environments characterized by uncertain motion difficulties. The velocity selected by POMCP is used by a standard engine controller which deals with path planning. This two-layer approach allows POMCP to exploit prior knowledge about similarities between tasks to improve performance in terms of the time needed to traverse a path with obstacles. We also propose three measures that support human understanding of the strategy POMCP uses to improve performance. The overall architecture is tested on a Turtlebot3 in two environments, a rectangular path and a realistic production line in a research lab. Tests performed on a C++ simulator confirm the capability of the proposed approach to profitably use prior knowledge, achieving a performance improvement from 0.7% to 3.1% depending on the complexity of the path. Experiments on a Unity simulator show that the proposed two-layer approach also outperforms single-layer approaches based only on the engine controller (i.e., without the POMCP layer). In this case the performance improvement is up to 37% compared to a state-of-the-art deep reinforcement learning engine controller, and up to 51% compared to the standard ROS engine controller. Finally, experiments in a real-world testing arena confirm that the approach can run on real robots.
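The root of a POMCP-style search for velocity regulation can be sketched as follows. This is a deliberately flattened illustration under stated assumptions: the hidden state (the motion difficulty of the upcoming segment) is a particle set, candidate velocities are evaluated by simulated rollouts, and UCB1 balances exploration and exploitation. A real POMCP grows a full search tree and reuses it across steps; all names here are illustrative.

```python
import math
import random

def pomcp_select_velocity(particles, velocities, simulate, n_sims=500, c=1.0):
    """Root-level Monte Carlo action selection with a particle belief.
    particles: sampled hidden states (e.g. segment difficulties);
    simulate(state, velocity) -> rollout return."""
    n = {v: 0 for v in velocities}      # visit counts
    q = {v: 0.0 for v in velocities}    # running mean returns
    for t in range(1, n_sims + 1):
        s = random.choice(particles)    # sample a hidden state from the belief
        # UCB1: unvisited actions first, then mean return + exploration bonus.
        v = max(velocities,
                key=lambda a: float('inf') if n[a] == 0
                else q[a] + c * math.sqrt(math.log(t) / n[a]))
        r = simulate(s, v)              # one simulated rollout
        n[v] += 1
        q[v] += (r - q[v]) / n[v]       # incremental mean update
    return max(velocities, key=lambda a: q[a])
```

With a toy reward that penalizes the gap between chosen velocity and the (hidden) sustainable velocity for the segment, the search converges on the matching candidate.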